Synopsis 5 says that "C<\n> now matches a logical (platform independent) newline not just C<\012>". But the devil is in the details, and I'd like confirmation (or discussion) of the details of \n so that I can implement it in PGE.

Quick summary: I'm thinking that \n should be defined as the equivalent of

    rule nl { [ \015\012 | <[\015\012\f\x85\x{2028}\x{2029}]> ]: }

Note the colon (:) at the end of the pattern. It prevents backtracking into the group, which means that the CRLF sequence (\x0d\x0a) will always be treated as a single newline for purposes of matching C<\n>.

Discussion: The common newline sequences in use today are LF (\x0a), CRLF (\x0d\x0a), and CR (\x0d), depending on the operating system involved. CRLF is the tricky one when it comes to quantification. In particular, consider the following:

    "\012\012\012\012" ~~ / \n**{4} /    # matches (4 LFs)
    "\015\015\015\015" ~~ / \n**{4} /    # matches (4 CRs)
    "\015\012\015\012" ~~ / \n**{4} /    # ???

I'm of the opinion that the sequence "\015\012" should always be treated as a single newline, in which case the last expression above would not match, because the target string contains only two newlines. But I want to check whether others' interpretations square with mine on this point (and if there's no consensus on it, we may need to pose the question to p6l for an official ruling).

The other characters in the definition of C<\n> above come from Unicode, which gives the following as line terminators:

    LF    - line feed           - U+000A
    CR    - carriage return     - U+000D
    CR+LF - CR followed by LF
    FF    - form feed           - U+000C
    NL    - next line           - U+0085
    LS    - line separator      - U+2028
    PS    - paragraph separator - U+2029

With this, the definition of \N is simply any character that is not in the set [\012\015\x0c\x85\x{2028}\x{2029}].

Comments and feedback welcomed.

Pm
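
P.S. For anyone who wants to poke at the proposed semantics outside of PGE, here is a rough Python sketch of the same idea. It is only an approximation under stated assumptions, not how PGE would implement it: the names NL, NOT_NL and has_newline_run are made up for illustration, and since not every Python version has a no-backtrack construct, the trailing colon is emulated with a negative lookahead so that a CR followed by LF can never match on its own.

```python
import re

# Hypothetical name; mirrors the proposed rule for \n. The \r(?!\n)
# alternative stands in for the trailing colon: a CR that is followed
# by an LF can only ever match as part of the CRLF pair.
NL = r"(?:\r\n|\r(?!\n)|[\n\f\x85\u2028\u2029])"

# Hypothetical name; mirrors \N, i.e. any single character that is not
# one of the line terminator characters.
NOT_NL = r"[^\n\r\f\x85\u2028\u2029]"


def has_newline_run(text, count):
    """Return True if text contains `count` consecutive logical newlines."""
    return re.search("%s{%d}" % (NL, count), text) is not None


if __name__ == "__main__":
    print(has_newline_run("\n\n\n\n", 4))   # True  (4 LFs)
    print(has_newline_run("\r\r\r\r", 4))   # True  (4 CRs)
    print(has_newline_run("\r\n\r\n", 4))   # False (only 2 newlines)
    print(has_newline_run("\r\n\r\n", 2))   # True
```

Under this reading, the third example from the discussion does not match, which is the interpretation I'm arguing for. Dropping the (?!\n) lookahead gives the other interpretation, where the engine may backtrack and count the CR and the LF separately.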