Saravana Kumar wrote:
> Hi,

Hello,

> I am new to the list and a newbie in Perl.
> 
> I have a big flat file (100G). The file was supposed to be a single line
> but is broken into many records (as it has ^M). There are also ^@ and tabs
> in between.
> 
> I want to first replace the control characters and tabs with space.
> 
> I tried this s/[[:cntrl:]\t]/ /g.

The [:cntrl:] character class includes the "\t" character.
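A quick way to check this for yourself is sketched below. The sub name
blank_controls is only something I made up for the illustration; it applies
the substitution without the extra "\t" to a string containing a tab, a ^M
and a ^@.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: replace every control
# character in a string with a single space.
sub blank_controls {
    my ($text) = @_;
    $text =~ s/[[:cntrl:]]/ /g;
    return $text;
}

# Tab, carriage return (^M) and NUL (^@) are all matched by [:cntrl:].
print blank_controls("a\tb\rc\0d"), "\n";
```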


> After replacing the above said characters
> with a space I have to insert \n after each 1000th character.
> 
> But the program hangs after reading about 24G (1/4th of the file).
> 
> I thought of reading the file character by character and checking whether
> the character is ^M||^@||\t. If true, replace it with a space and write the
> output, else simply write the output. I have to keep track of the count of
> characters so as to insert \n after each 1000th character.
> 
> Will the above work or is there any other (simple) way to do this? (Or
> should I just move on to C?)
> 
> I am not sure why my first program hung (I ran the program on a machine
> with 2G RAM).

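You can do what you want if you set the Input Record Separator ($/) to a
reference to an integer. Each read then returns a fixed-size record of 1000
bytes instead of a line, so memory use stays small no matter how large the
file is: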

$/ = \1000;
while ( <FILE> ) {
    s/[[:cntrl:]]/ /g;
    print "$_\n";
    }
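If it helps, here is a minimal sketch of the same idea as a complete script.
The sub name rewrap, the two command-line arguments and the use of tr///
are my own choices for the illustration, not part of your program. I am
assuming the data is plain bytes, so the explicit range \x00-\x1F plus \x7F
stands in for [:cntrl:].

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: copy $in to $out in records of at most $width
# bytes, turning ASCII control characters into spaces and ending each
# record with a newline.
sub rewrap {
    my ( $in, $out, $width ) = @_;
    binmode $in;     # treat the data as raw bytes
    binmode $out;
    local $/ = \$width;    # reference to an integer: fixed-size reads
    while ( my $chunk = <$in> ) {
        $chunk =~ tr/\x00-\x1F\x7F/ /;    # assumed equivalent of [:cntrl:]
        print {$out} $chunk, "\n";
    }
    return;
}

# Only runs when called as: script.pl INPUT OUTPUT
if ( @ARGV == 2 ) {
    my ( $src, $dst ) = @ARGV;
    open my $in,  '<', $src or die "Cannot open '$src': $!";
    open my $out, '>', $dst or die "Cannot open '$dst': $!";
    rewrap( $in, $out, 1000 );
    close $out or die "Cannot close '$dst': $!";
}
```

Only one record is held in memory at a time, and the last record is simply
shorter if the file size is not a multiple of 1000.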



John
-- 
use Perl;
program
fulfillment
