"Mike Roberts" <[EMAIL PROTECTED]> wrote:
> What would the interest level be in an expanded regular expression 
> parser/matcher?

I don't know. I don't allow regular expression searches in Scintilla
anyway (it's far too easy to write a regular expression that takes
exponential time to execute, which results in a killed process).
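
To illustrate the exponential-time concern, here is a minimal sketch in
Python (chosen only for illustration; the pattern and the helper name
time_match are my own assumptions, nothing from Scintilla). With nested
quantifiers and a subject that almost matches, a backtracking engine
has to try every way of splitting the run of 'a's before giving up:

```python
import re
import time

# Nested quantifiers: a classic trigger for catastrophic backtracking
# in backtracking engines such as Python's re module.
PATTERN = re.compile(r"(a+)+$")


def time_match(n):
    """Return (matched, seconds) for PATTERN against 'a' * n + 'b'."""
    subject = "a" * n + "b"  # the trailing 'b' makes the overall match fail
    start = time.perf_counter()
    matched = PATTERN.match(subject) is not None
    return matched, time.perf_counter() - start


if __name__ == "__main__":
    # Kept small on purpose; larger n quickly becomes impractical.
    for n in (12, 14, 16, 18):
        matched, seconds = time_match(n)
        print(f"n={n:2d} matched={matched} seconds={seconds:.4f}")
```

On a backtracking engine each additional 'a' should roughly double the
running time; the actual timings depend on the machine and the Python
version.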


[snip]
> The third drawback is relatively minor: my RE syntax has a few small 
> differences from the canonical Unix-style RE syntax - e.g., the quoting 
> character is "%" rather than "\".  It would probably be desirable to fix 
> that; this isn't a big job.

If you are going to fix the syntax, I would suggest (you don't have to
listen to me) going with Perl or Python regular expression syntax. From
what I understand, the two are more or less the same, so you would end
up matching the syntax of at least two existing languages.


> The fourth problem is that the code only handles SBCS.  My original version 
> actually does all its work in UTF-8 (a multibyte Unicode encoding), so the 
> infrastructure is there for MBCS handling - but for the Scintilla conversion 
> I only accounted for single-byte characters.  For proper MBCS support, it 
> would be necessary to retrofit whatever Scintilla's standard mechanism is.  
> This wouldn't be too hard, as all string access is already encapsulated as a 
> class; but it's obviously work, and as with the reformatting I probably 
> wouldn't be able to volunteer.

From what I have noticed in my 3-4 years of working with Scintilla, its
underlying native representation is UTF-8 in GTK and Windows, and the
style/character buffers have recently been split, so you should be able
to get at the characters in an unadulterated form.
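
As a rough sketch of what character-wise access over a UTF-8 buffer
involves (Python for illustration only; utf8_char_len and
iter_utf8_chars are hypothetical names, not Scintilla APIs), the length
of each character can be read from its lead byte:

```python
def utf8_char_len(lead):
    """Return the byte length of a UTF-8 sequence from its lead byte."""
    if lead < 0x80:
        return 1  # 0xxxxxxx: ASCII
    if 0xC0 <= lead < 0xE0:
        return 2  # 110xxxxx
    if 0xE0 <= lead < 0xF0:
        return 3  # 1110xxxx
    if 0xF0 <= lead < 0xF8:
        return 4  # 11110xxx
    # 10xxxxxx is a continuation byte and cannot start a character.
    raise ValueError(f"invalid UTF-8 lead byte: {lead:#x}")


def iter_utf8_chars(buf):
    """Yield each character of a UTF-8 bytes buffer as a bytes slice.

    Sketch only: continuation bytes are not validated and overlong
    encodings are not rejected.
    """
    i = 0
    while i < len(buf):
        n = utf8_char_len(buf[i])
        yield buf[i:i + n]
        i += n
```

A matcher that steps through the buffer this way works on whole
characters rather than individual bytes, which is the part that a
single-byte-only implementation skips.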

 - Josiah


_______________________________________________
Scintilla-interest mailing list
[email protected]
http://mailman.lyra.org/mailman/listinfo/scintilla-interest
