Josiah Carlson:
From what I have noticed in my 3-4 years of working with Scintilla, its
underlying native representation is UTF-8 in GTK and Windows,
The underlying native representation is defined by the code page
and character set properties. It can be multi-byte (UTF-8, Shift-JIS,
Big5) or single-byte (Latin-1, KOI8-R). The regular expression code
doesn't really understand these encodings, so it matches multi-byte
text as byte strings: it doesn't try to ensure that matches align
with character boundaries or that character ranges above ASCII like
"[Γ-Ξ]" work.
It would be asking too much to require new regular expression code to
handle all the encodings correctly, but it shouldn't behave worse than
the current code.
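
To make the byte-matching problem concrete, here is a minimal sketch in Python. It is not Scintilla's code; it uses Python's `re` module on encoded bytes as a stand-in for a byte-oriented engine, and the helper name `byte_level_search` is made up for illustration.

```python
import re

def byte_level_search(pattern: bytes, text: str, encoding: str):
    """Search the encoded bytes of `text`, as a byte-oriented engine would."""
    return re.search(pattern, text.encode(encoding))

# Case 1: a character range above ASCII.
# Encoded as UTF-8, "[Γ-Ξ]" becomes the byte class [\xce\x93-\xce\x9e],
# i.e. "byte 0xCE, or any byte in 0x93..0xCE, or byte 0x9E".
range_as_bytes = "[Γ-Ξ]".encode("utf-8")

# Character-aware matching: Ω (U+03A9) is outside Γ..Ξ, so there is no match.
print(re.search("[Γ-Ξ]", "Ω"))

# Byte-level matching: the lead byte 0xCE of Ω falls inside the byte class,
# so the engine reports a one-byte "match" that is half a character.
print(byte_level_search(range_as_bytes, "Ω", "utf-8"))

# Case 2: alignment. In Shift-JIS the katakana ソ is the bytes 0x83 0x5C,
# and 0x5C is also the ASCII backslash. A byte search for a backslash
# therefore matches in the middle of the character.
print(byte_level_search(rb"\\", "ソ", "shift_jis"))
```

The second and third searches both succeed even though no character in the text satisfies the pattern, which is the kind of misbehaviour a byte-oriented engine can show on multi-byte text.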
I'm also facing a similar problem. Would it be sensible to have a way
in Scintilla to delegate this RE matching and syntax handling to the
container application?
I use wxWidgets (and also ECMAScript-based stuff), and having the RE
matching in my app would clearly be useful, since it would let me share
the same engines and syntaxes.
Armel
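
One way to picture the delegation idea is a search hook that the container can install. The sketch below is purely illustrative: `Editor`, `set_search_delegate` and `container_regex_search` are hypothetical names, not part of Scintilla's API, and the built-in fallback is reduced to a plain substring search.

```python
import re
from typing import Callable, Optional, Tuple

# Hypothetical callback type: (pattern, text, start) -> (match_start, match_end) or None.
SearchDelegate = Callable[[str, str, int], Optional[Tuple[int, int]]]

class Editor:
    """Stand-in for an editing component; not Scintilla's actual interface."""

    def __init__(self, text: str) -> None:
        self.text = text
        self._delegate: Optional[SearchDelegate] = None

    def set_search_delegate(self, delegate: Optional[SearchDelegate]) -> None:
        # The container installs its own matcher here (hypothetical hook).
        self._delegate = delegate

    def find(self, pattern: str, start: int = 0) -> Optional[Tuple[int, int]]:
        if self._delegate is not None:
            return self._delegate(pattern, self.text, start)
        # Built-in fallback for this sketch: literal substring search only.
        pos = self.text.find(pattern, start)
        return None if pos < 0 else (pos, pos + len(pattern))

def container_regex_search(pattern: str, text: str, start: int) -> Optional[Tuple[int, int]]:
    """Container-side matcher using the application's own regex engine."""
    match = re.compile(pattern).search(text, start)
    return match.span() if match else None

editor = Editor("alpha Δ beta")
print(editor.find("[Γ-Ξ]"))                 # fallback treats the pattern literally
editor.set_search_delegate(container_regex_search)
print(editor.find("[Γ-Ξ]"))                 # container engine interprets the range
```

With a hook of this shape, the application decides which regex syntax and which encoding handling apply, and the editing component only needs to report match positions.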