This and other RFCs are available on the web at http://dev.perl.org/rfc/ =head1 TITLE Standardize input record separator (for portability) =head1 VERSION Maintainer: N. Hao Ching <[EMAIL PROTECTED]> Date: 10 Aug 2000 Version: 3 Mailing List: [EMAIL PROTECTED] Number: 69 =head1 ABSTRACT The default input record separator is not safe for all input files on all platforms. There should also be support for Unicode line separator (U+2028) and paragraph separator (U+2029). =head1 DESCRIPTION The input record separator should match the platform's C compiler mappings of "\r\n" (CRLF), "\n" (LF) and "\r" (CR), which are often (but not always, e.g., EBCDIC-based platforms [Peter Prymmer]): 000D 000A 000A 000D For Unicode-capable platforms, the input record separator should also match: 2028 2029 Given this input file: D O S CR LF 0044 004F 0053 000D 000A U n i x LF 0055 006E 0069 0078 000A M a c CR 004D 0061 0063 000D l i n e LS 006C 0069 006E 0065 2028 p a r a PS 0070 0061 0072 0061 2029 l i n e 006C 0069 006E 0065 This should work as expected on as many platforms as possible: my @lines = <FH>; The @lines array should contain six elements. Bart Lateur has suggested differentiating between ASCII-compatible and UTF-16. Perhaps a flag? { local $utf16 = 1; my @lines = <FH>; } The binmode function should treat data as binary and not translate line disciplines. (No one objects to this so far?) Whether $/ will remain in Perl 6 is uncertain, so this is not necessarily about $/. =head1 IMPLEMENTATION Bart Lateur suggested using a dedicated DFA regex engine. =head1 REFERENCES perlport: http://www.pudge.net/macperl/perlport.html perlunicode http://www.activestate.com/Products/ActivePerl/docs/lib/Pod/perlunicode.html RFC 58: http://tmtowtdi.perl.org/rfc/58.pod Larry Wall's response to RFC 58 on 8 Aug 2000 http://www.mail-archive.com/perl6-language%40perl.org/msg01421.html http://www.mail-archive.com/perl6-language%40perl.org/msg01423.html Simon Cozens' work on line disciplines in 5.6 binmode http://www.xray.mpe.mpg.de/mailing-lists/perl5-porters/2000-04/msg00807.html
