On 1 Jan 2012, at 21:06, David Kastrup wrote:

>>> Updates:
>>> Labels: Patch-new
>>>
>>> Comment #2 on issue 2159 by [email protected]: Patch: lexer.ll: Warn
>>> about non-UTF-8 characters
>>> http://code.google.com/p/lilypond/issues/detail?id=2159#c2
>>>
>>> lexer.ll: Warn about non-UTF-8 characters
>>>
>>> Making the warnings point to the exact bad byte rather than the
>>> enclosing construct would be nice.
>>
>> One way to implement this might be to use the Haskell program for Flex
>> like UTF-8 regular expressions I made:
>> http://xcybercloud.blogspot.com/2009/04/unicode-support-in-flex.html
>>
>> First make rules for the Unicode characters you want admit, followed
>> by a '.' rule which picks up single excluded bytes.
>
> The "unicode characters we want admit" are not single characters, but
> part of things like identifiers, strings and other stuff. Cf.
> <URL:http://codereview.appspot.com/5505090#msg5>
> for a reasoning about the current approach for this patch.

I translate Unicode character classes into Flex UTF-8 regular
expressions, so you can apply the other Flex regex operators to them
to build identifiers, strings and so on.

Hans

_______________________________________________
bug-lilypond mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/bug-lilypond
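
To illustrate what translating a Unicode character class into a UTF-8 byte-level regular expression can look like, here is a minimal Python sketch. It is not the Haskell program referred to above; the names utf8_ranges and flex_regex are made up for this example, and the range-splitting scheme is one common way to do it, assumed here rather than taken from that program. It turns one code point range into alternatives of byte ranges written with \xHH escapes, which Flex accepts.

```python
def utf8_ranges(lo, hi):
    """Yield byte-range sequences that together cover code points lo..hi.

    Each yielded item is a list of (low_byte, high_byte) pairs, one pair
    per byte of the UTF-8 encoding. Hypothetical helper for illustration.
    """
    if lo > hi:
        return
    # Surrogates (U+D800..U+DFFF) have no UTF-8 encoding: skip them.
    if lo <= 0xDFFF and hi >= 0xD800:
        if lo < 0xD800:
            yield from utf8_ranges(lo, 0xD7FF)
        if hi > 0xDFFF:
            yield from utf8_ranges(0xE000, hi)
        return
    # Pure ASCII maps to a single byte range.
    if hi <= 0x7F:
        yield [(lo, hi)]
        return
    # Split where the encoded length changes (1, 2, 3, 4 bytes).
    for limit in (0x7F, 0x7FF, 0xFFFF):
        if lo <= limit < hi:
            yield from utf8_ranges(lo, limit)
            yield from utf8_ranges(limit + 1, hi)
            return
    # Split until every trailing continuation byte spans a full
    # contiguous range, so the result is a plain product of byte ranges.
    for i in range(1, 4):
        mask = (1 << (6 * i)) - 1
        if (lo & ~mask) != (hi & ~mask):
            if lo & mask != 0:
                yield from utf8_ranges(lo, lo | mask)
                yield from utf8_ranges((lo | mask) + 1, hi)
                return
            if hi & mask != mask:
                yield from utf8_ranges(lo, (hi & ~mask) - 1)
                yield from utf8_ranges(hi & ~mask, hi)
                return
    first = chr(lo).encode("utf-8")
    last = chr(hi).encode("utf-8")
    yield list(zip(first, last))


def flex_regex(lo, hi):
    """Render code points lo..hi as a byte-level regex using \\xHH escapes."""
    alternatives = []
    for seq in utf8_ranges(lo, hi):
        alternatives.append("".join(
            f"\\x{a:02X}" if a == b else f"[\\x{a:02X}-\\x{b:02X}]"
            for a, b in seq))
    return "|".join(alternatives)


# Latin-1 letters U+00C0..U+00FF all share the lead byte 0xC3.
print(flex_regex(0xC0, 0xFF))   # \xC3[\x80-\xBF]
```

A pattern produced this way is an ordinary regular expression over bytes, so it can be combined with the usual operators (repetition, alternation, concatenation) inside larger rules for identifiers or strings, while a trailing '.' rule still catches any single byte that none of the rules accept.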
