Re: Newline independence?

Peter N Lewis Mon, 18 Feb 2002 00:36:55 -0800

>>while (<>) {
>>   tr/\015\012//d;
>>   ...
>>}
>>
>>And that would work with CR or CRLF (or under unix with LF or CRLF).
>
>   you should instead
>
>while(<>){
>       chomp;
>}
>
>   Takes care of any combination of CRLF, CR and LF.


No it doesn't, at least not according to the docs at

http://www.perldoc.com/perl5.6/pod/func/chomp.html

chomp only removes a trailing copy of $/, which is typically \015 for 
MacPerl and \012 for unix perl.  The tr code, while definitely 
slower, will work with CRLF as well in either unix or MacPerl.
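
To make the difference concrete, here is a minimal sketch.  It assumes $/ is set by hand to \012 to mimic unix perl, and the sample string is made up for illustration:

```perl
use strict;
use warnings;

# Assumption: mimic unix perl, where the input record separator is LF.
local $/ = "\012";

my $with_chomp = "hello\015\012";   # a CRLF-terminated line (sample data)
my $with_tr    = $with_chomp;

chomp $with_chomp;                  # removes only the trailing \012
$with_tr =~ tr/\015\012//d;         # deletes every CR and LF in the string

printf "chomp leaves %d chars, tr leaves %d chars\n",
    length $with_chomp, length $with_tr;
```

With these settings chomp leaves 6 characters ("hello" plus the stray \015), while tr leaves the 5 characters of "hello".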

>   Unfortunately $/ trick doesn't work on all combination of CRLF, CR 
>and LF.  One possible solution would be as follows;
>
>undef $/;           # enable "slurp" mode
>my $content = <FH>; # whole file now here
>chomp $content;     # trailing CRLF would be gone.
>for my $line (split /\015\012|\015|\012/, $content){
>       # now process $line one by one.
>}

Yes, something like that would work, and as you say, memory is 
usually not that relevant.
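
For reference, the split step on its own can be sketched as below.  The helper name split_any_newline and the sample string are hypothetical, only there to show the regex handling mixed endings:

```perl
use strict;
use warnings;

# Hypothetical helper: split an already-slurped string into lines on
# CRLF, CR or LF.  CRLF comes first in the alternation so that it is
# consumed as a single terminator rather than as CR followed by LF.
sub split_any_newline {
    my ($content) = @_;
    return split /\015\012|\015|\012/, $content;
}

my $content = "one\015\012two\015three\012";   # mixed endings (sample data)
my @lines   = split_any_newline($content);
print scalar(@lines), " lines: @lines\n";
```

Note that split drops trailing empty fields, so a final line terminator does not produce an extra empty line (and blank lines at the very end of the file are lost).  The chomp docs also say that chomp removes nothing in slurp mode, so the split is what actually takes care of the last terminator.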

>   I think the best approach toward this is to write a module that 
>does that.  IO::Handle::AnyNewline?

Yeah, something like that was what I was thinking.  Although again, 
it is a lot more mess than just <>.  It seems to me it's worth 
another special case for $/, although that is running out of special 
case possibilities!  It's never been a problem before because I never 
had to deal with Mac & Unix files, it was always one or the other, 
possibly plus DOS/Network CRLF files.  Oh well.  It's a shame $/ is 
not a regex, but I guess the runtime cost of that would be a bit 
much.

>P.S.  Thank you for great products like anarchy and Interarchy. 
>Though I am not a user myself my customers have benefit a lot.

Glad to hear it!

Thanks for your help,
    Peter.

-- 
<http://www.interarchy.com/>  <ftp://ftp.interarchy.com/interarchy.hqx>
