Josiah Carlson:

From what I have noticed in my 3-4 years of working with Scintilla, its
underlying native representation is UTF-8 in GTK and Windows,

  The underlying native representation is defined with the code page
and character set properties and can be multi-byte including UTF-8,
Shift-JIS and Big5 or single byte such as Latin-1 or KOI8-R. The regular
expression code doesn't really understand these encodings so is matching
multi-byte encodings by matching byte strings: it doesn't try to ensure
that characters align or that character ranges above ASCII like "[Γ-Ξ]"
work.
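
  To illustrate the byte-matching behaviour described above, here is a
minimal sketch in Python. It does not use Scintilla's regular expression
code; Python's `re` module stands in for both a byte-oriented and a
character-aware matcher. The helper names (`char_hit`, `byte_hit`,
`backslash_in_sjis`) are made up for the example.

```python
import re

# Character-aware matcher: the class covers code points U+0393..U+039E.
char_class = re.compile("[Γ-Ξ]")

# Byte-oriented matcher: the same class text, but built from UTF-8 bytes.
# "Γ" is CE 93 and "Ξ" is CE 9E, so the class is read as the single
# bytes CE and 9E plus the byte range 93..CE, not a range of characters.
byte_class = re.compile(
    b"[" + "Γ".encode("utf-8") + b"-" + "Ξ".encode("utf-8") + b"]"
)


def char_hit(text):
    """True if the character-aware class matches somewhere in text."""
    return char_class.search(text) is not None


def byte_hit(text, encoding="utf-8"):
    """Return the bytes matched by the byte-oriented class, or None."""
    m = byte_class.search(text.encode(encoding))
    return m.group(0) if m else None


def backslash_in_sjis(text):
    """True if a byte search for a backslash hits inside Shift-JIS text."""
    return re.search(rb"\\", text.encode("shift_jis")) is not None


if __name__ == "__main__":
    print(char_hit("Δ"), char_hit("é"))    # character-level results
    print(byte_hit("Δ"), byte_hit("é"))    # byte-level results
    print(backslash_in_sjis("表"))          # trail byte of this character is 0x5C
```

  The byte-oriented class matches one byte of "é" (which is outside the
intended range) and only the lead byte of "Δ", and a search for a
backslash hits the second byte of a Shift-JIS character. These are the
alignment and range problems mentioned above.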

  It would be asking too much to require new regular expression code to
handle all the encodings correctly but it shouldn't behave worse.

I'm facing a similar problem. Would it be sensible to have a way in
Scintilla to delegate the RE matching and syntax handling to the
container application? I use wxWidgets (and also ECMAScript-based
components), so having the RE matching in my app would let me share
the same engines and syntaxes.
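
A rough sketch of what such delegation could look like, written in
Python for brevity. Everything here is hypothetical: `EditorBuffer`,
`set_search_provider` and `container_search` are invented names and are
not part of Scintilla's API. The container receives the raw buffer and
its encoding, runs its own engine on decoded text, and returns byte
offsets. It assumes the start offset falls on a character boundary.

```python
import re
from typing import Callable, Optional, Tuple

# Hypothetical provider signature: (buffer, encoding, pattern, start_byte)
# -> (match_start_byte, match_end_byte), or None when nothing matches.
SearchProvider = Callable[[bytes, str, str, int], Optional[Tuple[int, int]]]


class EditorBuffer:
    """Stand-in for an editing component; NOT Scintilla's real interface."""

    def __init__(self, data: bytes, encoding: str):
        self.data = data
        self.encoding = encoding
        self._provider: Optional[SearchProvider] = None

    def set_search_provider(self, provider: SearchProvider) -> None:
        # The container application registers its own regex engine here.
        self._provider = provider

    def find(self, pattern: str, start: int = 0) -> Optional[Tuple[int, int]]:
        if self._provider is None:
            raise RuntimeError("no search provider registered")
        return self._provider(self.data, self.encoding, pattern, start)


def container_search(buffer: bytes, encoding: str, pattern: str,
                     start: int) -> Optional[Tuple[int, int]]:
    """Container-side search: match on characters, report byte offsets."""
    text = buffer.decode(encoding)
    # Convert the byte offset to a character offset
    # (assumes start is on a character boundary).
    char_start = len(buffer[:start].decode(encoding))
    m = re.compile(pattern).search(text, char_start)
    if m is None:
        return None
    # Convert the character offsets of the match back to byte offsets.
    byte_start = len(text[:m.start()].encode(encoding))
    byte_end = len(text[:m.end()].encode(encoding))
    return byte_start, byte_end


if __name__ == "__main__":
    buf = EditorBuffer("é Δ x".encode("utf-8"), "utf-8")
    buf.set_search_provider(container_search)
    print(buf.find("[Γ-Ξ]"))
```

Decoding the whole buffer on every search is wasteful for large
documents; the point is only to show where the engine boundary might sit.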

Armel


_______________________________________________
Scintilla-interest mailing list
[email protected]
http://mailman.lyra.org/mailman/listinfo/scintilla-interest
