On 2009-11-13, Hans Aberg wrote:
> On 13 Nov 2009, at 12:25, Bertalan Fodor (LilyPondTool) wrote:
> 
> ><INITIAL,chords,lyrics,figures,notes>{BOM_UTF8}/.* {
> >if (this->lexloc->line_number () != 1 ||
> >this->lexloc->column_number () != 0)
> > {
> >   LexerError (_ ("stray UTF-8 BOM encountered").c_str ());
> >   exit (1);
> 
> Also, the conditions and the stuff following might possibly be
> removed. Something like:
>   {BOM_UTF8} {}
> or
>   {BOM_UTF8}+ {}
> 
> If a language always zips out spaces, one can have rules:
>   [ \f\r\t\v]+ {}
>   \n+          { /* Maybe action here counting lines */ }
> Then the BOM should just be treated as the other spaces and tabs.

Treating the BOM like other whitespace matches the behavior recommended
by RFC 3629, so LilyPond should do the same.

More specifically, RFC 3629 states that

  "It is important to understand that the character U+FEFF appearing
  at any position other than the beginning of a stream MUST be
  interpreted with the semantics for the zero-width non-breaking
  space..."
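
To make that rule concrete, here is a rough sketch in plain C++. It is
only my illustration, not LilyPond's lexer code: the names BOM_UTF8 (as a
string constant) and strip_leading_bom are hypothetical, and I am assuming
the whole input is available as one UTF-8 byte string.

```cpp
#include <string>

// Hypothetical constant, not taken from LilyPond: the bytes EF BB BF are
// the UTF-8 encoding of U+FEFF.
static const std::string BOM_UTF8 = "\xEF\xBB\xBF";

// Hypothetical helper: drop one BOM only when it is the very first thing
// in the stream. A U+FEFF anywhere else is left untouched, so later
// stages can read it as a zero-width non-breaking space rather than as
// a signature.
std::string strip_leading_bom(const std::string &input)
{
  if (input.compare(0, BOM_UTF8.size(), BOM_UTF8) == 0)
    return input.substr(BOM_UTF8.size());
  return input;
}
```

With this, a file that starts with a BOM loses just that signature, while
a BOM in the middle of the input passes through unchanged and is neither
an error nor a second signature.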

I'll try to address this issue eventually, but I can't promise
anything right away.

Thanks,
Patrick


_______________________________________________
lilypond-devel mailing list
lilypond-devel@gnu.org
http://lists.gnu.org/mailman/listinfo/lilypond-devel
