--- Larry Wall <[EMAIL PROTECTED]> wrote:
> On Tue, Jun 29, 2004 at 10:52:34AM -0500, Jonathan Scott Duff wrote:
> 
> : Or was that to imply that a literal "a" in the RE would be
> : interpreted as a "grapheme a" when :u2 is active?
> 
> I don't know what you mean by "grapheme a" there.  If you mean, "Does
> it match any grapheme that happens to be exactly U+0061?", then the
> answer is yes.  

In my original question, I meant to differentiate between 'grapheme'
and 'possible component of a multibyte expression'.

> If you mean "Does it wildcard to any grapheme that uses
> U+0061 as the base character?", then the answer is probably no.  We
> have not yet come up with a syntax for that kind of wildcarding,
> other than dropping down to codepoints [:u1 a \pM+] or some such. 
> That may or may not be sufficient.  It'd be pretty easy to define a 
> <like a> assertion in any case.
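
To make the "base character plus marks" idea concrete outside of Perl 6 syntax, here is a rough Python sketch of what a <like a> test might check at the codepoint level. The function name has_base is hypothetical, and it accepts zero or more marks after the base (so a plain "a" passes too), which is looser than the \pM+ in the quoted pattern.

```python
import unicodedata

def has_base(grapheme: str, base: str = "a") -> bool:
    """Return True if `grapheme` decomposes to `base` followed only by marks.

    Assumption: "mark" means Unicode General Category M (Mn, Mc, Me),
    which is what \\pM denotes in regex syntaxes that support it.
    """
    # Canonical decomposition splits a precomposed letter into base + marks.
    decomposed = unicodedata.normalize("NFD", grapheme)
    if not decomposed.startswith(base):
        return False
    # Everything after the base must be a combining mark.
    rest = decomposed[len(base):]
    return all(unicodedata.category(ch).startswith("M") for ch in rest)

print(has_base("\u00e1"))   # a with acute, precomposed
print(has_base("a\u0308"))  # a followed by combining diaeresis
print(has_base("ab"))       # second codepoint is a letter, not a mark
```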

I think this is something that we'll want as a "mode", a la
case-insensitivity. Think of it as "mark insensitivity."

I'm not sure whether this should be language- or locale-dependent, but
matching "fre'd" against "fred" is a basic text-search feature.

Maybe it can just roll into :i?
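
As a sketch of what "mark insensitivity" could mean in practice, the Python below strips combining marks before comparing, with case folding as a separate switch. It reads "fre'd" as an e carrying an acute accent, which is my assumption; the names strip_marks and mark_insensitive_contains are hypothetical, and nothing here is locale-aware.

```python
import unicodedata

def strip_marks(text: str) -> str:
    """Decompose `text` and drop every combining mark (General Category M)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(
        ch for ch in decomposed
        if not unicodedata.category(ch).startswith("M")
    )

def mark_insensitive_contains(needle: str, haystack: str,
                              ignore_case: bool = False) -> bool:
    """Substring test that ignores marks, and optionally case as well."""
    n, h = strip_marks(needle), strip_marks(haystack)
    if ignore_case:
        # Kept as a separate flag: whether mark folding belongs inside
        # case-insensitivity is exactly the open question.
        n, h = n.casefold(), h.casefold()
    return n in h

print(strip_marks("fr\u00e9d"))
print(mark_insensitive_contains("fred", "Alfr\u00e9d"))
print(mark_insensitive_contains("FRED", "fr\u00e9d", ignore_case=True))
```

Keeping the two switches apart in the sketch shows the cost of rolling them together: a caller who wants "FRED" to match "fred" would no longer be able to keep the accent significant.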

=Austin
