Saravana Kumar wrote:
> Hi,

Hello,

> I am new to the list and newbie in perl.
>
> I have a big flat file(100G). The file was supposed to be in a single line
> but many of records(as it has ^M). There are also ^@ and tabs in between.
>
> I want to first replace the control characters and tabs with space.
>
> I tried this s/[[:cntrl:]\t]/ /g.

The [:cntrl:] character class already includes the "\t" character, so
listing it separately is redundant.

> After replacing the above said characters
> with space i have to insert \n after each 1000th character.
>
> But the program hangs after reading about 24G( 1/4th of the file).
>
> I thought of reading the file character by character, check if the character
> is ^M||^@||\t. If true replace with the space and write the ouput else
> simply write the output. I have to keep track of the count of characters
> so as to insert \n after each 1000th character.
>
> Will the above work or is there any other(simple) way to do this?( or should
> i just move on to C?)
>
> I am not sure why my first program hang(i ran the program in a machine with
> 2G RAM).

You can do what you want if you set the Input Record Separator to read 1000
bytes at a time:

    # A reference to an integer makes <FILE> return fixed-size records
    # of (at most) that many bytes instead of lines.
    $/ = \1000;

    while ( <FILE> ) {
        # Each control character becomes exactly one space, so the
        # record length is unchanged.
        s/[[:cntrl:]]/ /g;
        print "$_\n";
    }


John
-- 
use Perl;
program
fulfillment
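The record-reading approach in the reply above can be spelled out as a complete script. This is only a sketch under a few assumptions: the sub name reflow and the command-line calling convention are made up for illustration, the data is treated as raw single-byte characters, and tr/// is swapped in for the s///g from the reply (for byte strings without a locale in effect it covers the same ASCII control characters, 0x00-0x1F and 0x7F). Because only one record is held at a time, memory use stays bounded no matter how large the file is.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): copy $in to $out in
# records of at most $size bytes, turning every ASCII control character
# into a space and ending each record with a newline.
sub reflow {
    my ( $in, $out, $size ) = @_;

    # Treat both handles as raw bytes so CR and LF are never translated.
    binmode $in;
    binmode $out;

    # A reference to an integer-valued scalar makes readline return
    # fixed-size records instead of searching for a line terminator.
    local $/ = \$size;

    while ( my $record = <$in> ) {
        # One-for-one replacement, so the record length is unchanged.
        $record =~ tr/\x00-\x1F\x7F/ /;
        print {$out} $record, "\n" or die "write failed: $!";
    }
    return;
}

# Assumed usage: perl reflow.pl INPUT OUTPUT
if ( @ARGV == 2 ) {
    my ( $src, $dst ) = @ARGV;
    open my $in,  '<', $src or die "Cannot open $src: $!";
    open my $out, '>', $dst or die "Cannot open $dst: $!";
    reflow( $in, $out, 1000 );
    close $out or die "Cannot close $dst: $!";
    close $in;
}
```

As in the reply, the final record also gets a trailing newline even when it is shorter than 1000 bytes.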