--thanks for the reply
--you are correct; i did not have a chance to
--remove the new line (\n) at the end of
--the command string. The previous version
--of this script was to replace the control-M
--with new line (\n) ... i got carried away
--after that (*thinking out loud* : i need some sort
--of version control when i mess with scripts).
--what *should* happen is that it should
--remove all of those things in my substitute
--line. actually, the control-M is very minor
--so it can be dropped or commented out for
--another time.
--i will try the 'tr' and let you folks know how it
--goes.
--thanks again!
-X

-----Original Message-----
From: Jenda Krynicky [mailto:[EMAIL PROTECTED]
Sent: Monday, June 09, 2003 2:06 PM
To: [EMAIL PROTECTED]
Subject: Re: faster way to remove characters?

From: "Johnson, Shaunn" <[EMAIL PROTECTED]>
> I have this bit of code (below) and I'm wondering
> if there is a quicker way to remove
> some odd-ball characters from very
> large text files (large would be about the
> 200M or so).
>
> [snip code]
>
> #!/usr/bin/perl
>
> #$_ =~ s/\cM\n/\n/g;
>
> while (<>) {
>     $_ =~ s/(\cM\n|\\|\~|\!|\@|\#|\$|\%|\^|\&|\*|\(|\))/\n/g;
>     print $_;
> }
>
> [/snip code]
>
> I want to add some variable to pass (and rename INPUT file)
> but before I do, I'd like to know if doing something like open()
> would be any faster than this.

tr/// is quicker than s/// so if that is possible you should use that.

Also your s/// looks a bit strange. Do you really want to replace any of
those characters with a newline?

Also, is there any reason to leave a \cM that's not followed by \n in
the file? If there is not, you could use

    $_ =~ [EMAIL PROTECTED]&*()}{\n};

Also ... if replacing any \cM is fine, it would be more efficient to
read the file in chunks instead of line by line:

    open my $IN, "< $filename"
        or die "Can't open $filename: $!\n";
    open my $OUT, "> $outfilename"
        or die "Can't create $outfilename: $!\n";
    while (read $IN, $buff, 10*1024) {
        $buff =~ [EMAIL PROTECTED]&*()}{\n};
        print $OUT $buff;
    }
    close $IN;
    close $OUT;

HTH, Jenda
===== [EMAIL PROTECTED] === http://Jenda.Krynicky.cz =====
When it comes to wine, women and song, wizards are allowed
to get drunk and croon as much as they like.
    -- Terry Pratchett in Sourcery
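
For reference, here is a minimal sketch of the tr approach discussed in this thread. It assumes the goal is to delete the odd-ball characters outright rather than turn them into newlines. The names strip_chars and clean_file and the 64 KB chunk size are made up for illustration and do not come from the thread. The character list is copied from the s/// in the quoted script, with the control-M written as \015. Treat it as one possible starting point, not a tested fix for the 200M files.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return a copy of the
# string with the unwanted characters deleted.
# tr/// never interpolates variables, so $ and @ are literal here.
# The /d modifier deletes matched characters instead of replacing them.
# \015 is control-M (carriage return); \\ is a literal backslash.
sub strip_chars {
    my ($text) = @_;
    $text =~ tr{\015\\~!@#$%^&*()}{}d;
    return $text;
}

# Hypothetical driver: copy $in_name to $out_name in fixed-size chunks,
# cleaning each chunk. Chunking is safe here because every deleted item
# is a single character, so nothing can straddle a chunk boundary.
sub clean_file {
    my ($in_name, $out_name) = @_;
    open my $in,  '<', $in_name  or die "Can't open $in_name: $!\n";
    open my $out, '>', $out_name or die "Can't create $out_name: $!\n";
    my $buff;
    while (read $in, $buff, 64 * 1024) {
        print {$out} strip_chars($buff);
    }
    close $in;
    close $out or die "Can't close $out_name: $!\n";
    return;
}

# Usage: perl clean.pl infile outfile
clean_file(@ARGV) if @ARGV == 2;
```

If only the control-M followed by a newline should go, tr cannot express that, since it works on single characters; that case would still need s/\015\n/\n/g, and with chunked reads a pair split across two chunks would have to be handled separately.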