"Mike Roberts" <[EMAIL PROTECTED]> wrote:
> What would the interest level be in an expanded regular expression
> parser/matcher?
I don't know; I don't allow regular expression searches in Scintilla
anyway (it's far too easy to write a regular expression that takes
exponential time to execute, which results in a killed process).

[snip]
> The third drawback is relatively minor: my RE syntax has a few small
> differences from the canonical Unix-style RE syntax - e.g., the quoting
> character is "%" rather than "\". It would probably be desirable to fix
> that; this isn't a big job.

If you were going to fix the syntax, I would suggest (you don't have to
listen to me) going with Perl- or Python-syntax regular expressions. From
what I understand, they are more or less the same, so if you were going
to fix it, your syntax would match at least two existing languages.

> The fourth problem is that the code only handles SBCS. My original version
> actually does all its work in UTF-8 (a multibyte Unicode encoding), so the
> infrastructure is there for MBCS handling - but for the Scintilla conversion
> I only accounted for single-byte characters. For proper MBCS support, it
> would be necessary to retrofit whatever Scintilla's standard mechanism is.
> This wouldn't be too hard, as all string access is already encapsulated as a
> class; but it's obviously work, and as with the reformatting I probably
> wouldn't be able to volunteer.

From what I have noticed in my 3-4 years of working with Scintilla, its
underlying native representation is UTF-8 in GTK and Windows, and the
style/character buffers have recently been split, so you should be able
to get at the characters in an unadulterated form.

 - Josiah

_______________________________________________
Scintilla-interest mailing list
[email protected]
http://mailman.lyra.org/mailman/listinfo/scintilla-interest
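
To make the exponential-time concern above concrete, here is a minimal sketch in Python. It is not Scintilla's matcher or the parser under discussion; it is a hypothetical, hand-written backtracker for the single pattern (a+)+$ that simply counts how many alternatives it tries. The function name count_backtracking_steps and the choice of pattern are assumptions made for illustration.

```python
def count_backtracking_steps(text: str) -> int:
    """Count the alternatives a naive backtracker tries for (a+)+$ on text.

    Illustrative model only: it hard-codes one pattern and is not the
    algorithm used by any particular regex library.
    """
    steps = 0

    def match_groups(pos: int) -> bool:
        nonlocal steps
        # Find how far a single greedy 'a+' group could extend from pos.
        end = pos
        while end < len(text) and text[end] == "a":
            end += 1
        # Try every possible group length, longest first (greedy order).
        for stop in range(end, pos, -1):
            steps += 1
            if stop == len(text):
                return True  # '$' matches: the whole input was consumed
            if match_groups(stop):
                return True  # another repetition of the outer group matched
        return False

    match_groups(0)
    return steps


if __name__ == "__main__":
    # A trailing 'b' forces failure, so every split of the 'a' run is tried.
    for n in (5, 10, 15, 20):
        print(n, count_backtracking_steps("a" * n + "b"))
```

In this model, an input of n "a" characters followed by a "b" costs 2**n - 1 attempts, so each extra character doubles the work, while an input that matches succeeds on the first attempt. That asymmetry is why such patterns can look harmless in testing and still hang on the wrong input.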
