On 2009-11-13, Hans Aberg wrote:
> On 13 Nov 2009, at 12:25, Bertalan Fodor (LilyPondTool) wrote:
>
> > <INITIAL,chords,lyrics,figures,notes>{BOM_UTF8}/.* {
> >   if (this->lexloc->line_number () != 1 ||
> >       this->lexloc->column_number () != 0)
> >     {
> >       LexerError (_ ("stray UTF-8 BOM encountered").c_str ());
> >       exit (1);
>
> Also, the conditions and the stuff following might possibly be
> removed. Something like:
>   {BOM_UTF8} {}
> or
>   {BOM_UTF8}+ {}
>
> If a language always zips out spaces, one can have rules:
>   [ \f\r\t\v]+ {}
>   \n+ { /* Maybe action here counting lines */ }
> Then the BOM should just be treated as the other spaces and tabs.

This follows the behavior recommended by RFC 3629, so this is the
behavior LilyPond should follow, too.  More specifically, RFC 3629
states that "It is important to understand that the character U+FEFF
appearing at any position other than the beginning of a stream MUST be
interpreted with the semantics for the zero-width non-breaking
space..."

I'll try to address this issue eventually, but I can't promise
anything right away.

Thanks,
Patrick

_______________________________________________
lilypond-devel mailing list
lilypond-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/lilypond-devel
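
For illustration only, here is a rough sketch of the RFC 3629 rule quoted
above, written as standalone C++17 rather than as a lexer rule.  The names
bom_utf8, strip_leading_bom and count_interior_bom are made up for this
example and do not exist in LilyPond.  The sketch assumes the whole input is
available as one byte string: a BOM at the very start is dropped as a
signature, and any later U+FEFF is left in place and merely counted.

```cpp
#include <cstddef>
#include <string_view>

// UTF-8 encoding of U+FEFF (byte order mark / zero-width no-break space).
constexpr std::string_view bom_utf8 = "\xEF\xBB\xBF";

// Hypothetical helper, not part of LilyPond: drop a single BOM at the very
// start of the stream.  Any later U+FEFF is kept, because RFC 3629 says it
// carries zero-width no-break space semantics there, not signature semantics.
std::string_view
strip_leading_bom (std::string_view input)
{
  if (input.substr (0, bom_utf8.size ()) == bom_utf8)
    input.remove_prefix (bom_utf8.size ());
  return input;
}

// Hypothetical helper: count the U+FEFF occurrences that remain once the
// leading signature is gone, e.g. to decide whether a warning is worthwhile.
std::size_t
count_interior_bom (std::string_view input)
{
  std::string_view rest = strip_leading_bom (input);
  std::size_t count = 0;
  for (std::size_t pos = rest.find (bom_utf8);
       pos != std::string_view::npos;
       pos = rest.find (bom_utf8, pos + bom_utf8.size ()))
    ++count;
  return count;
}
```

Whether an interior U+FEFF should then be skipped like whitespace, as Hans
suggests, or reported is a separate decision; the sketch only separates the
two cases.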