--- In [email protected], "Sheri" <sheri...@...> wrote:
>
> > > My thought was perhaps user could use unicode to set and unset a global
> > > variable that regex could read. If regex sets the variable, it doesn't
> > > mean unicode is loaded.
> >
> > Ok, how about this: your worry seems to be that regex plugin always drags in
> > unicode plugin, even though it's seldom needed.
> >
> > So I've compiled the few lines of code from unicode source that's needed to
> > recognise a unicode handle. Unicode plugin isn't loaded unless a unicode
> > handle is identified within regex. Just-in-time loading of unicode dll.
> > That do?
> >
> > If so no need for regex.allow_unicode_handles, I'll take it out.
>
> As far as I can tell, it isn't currently in.
> In addition to my worry that regex was unnecessarily loading the unicode
> plugin, it concerns me that regex is doing extra work. It sounds like it will
> be searching for unicode handles all the time now (despite that it will never
> find any). So I would still like a way to opt-out.
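
For anyone following along, here is a rough sketch of the just-in-time idea described above, written in Python rather than the plugin's own source. Everything named here is an assumption made for illustration: the handle prefix, the `state` dictionary, and the use of the standard `unicodedata` module as a stand-in for the unicode dll. The real handle format and loading code are not shown in this thread.

```python
import importlib

# Hypothetical marker: the real unicode handle format is not given in the thread.
HANDLE_PREFIX = "\x01u:"

# "backend" stays None until a handle is actually seen; "loads" counts real loads.
state = {"backend": None, "loads": 0}


def looks_like_handle(arg: str) -> bool:
    """Cheap check: a single prefix comparison, no backend required."""
    return arg.startswith(HANDLE_PREFIX)


def get_backend():
    """Load the backend on first use only (stand-in for loading the dll)."""
    if state["backend"] is None:
        state["backend"] = importlib.import_module("unicodedata")
        state["loads"] += 1
    return state["backend"]


def prepare_subject(arg: str) -> str:
    """Return plain strings untouched; resolve handles through the backend."""
    if not looks_like_handle(arg):
        return arg  # common path: the backend is never loaded
    backend = get_backend()
    return backend.normalize("NFC", arg[len(HANDLE_PREFIX):])
```

In this sketch the cost on the ordinary path is one `startswith` call per argument, and an opt-out switch would simply skip that call.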
The amount of "work" done is undetectable.
However, how about I only check for unicode if
> Format/replacement string doesn't work any more. For example:
>
> win.debug(regex.version)
> ;regex.allow_unicode_handles(0) ;;doesn't work
> local subjectu=unicode.from_nums(0x00BC,0x0020,0x2153,0x00A0,0x2154)
> win.debug(?"2/3 char as utf8 from backref",regex.pcrematchall(?"\x{2154}",
> unicode.to_utf8(subjectu), "$0", "utf8"))
> win.debug(?"2/3 char as utf8 from backref",regex.pcrematchall(?"\x{2154}",
> unicode.to_utf8(subjectu), "$0", "utf8"))
> win.debug(?"2/3 char as utf8 from unicode", unicode.from_num(0x2154).to_utf8)
>
> Output using regex 206
>
> 2060 2009-01-20
> 2/3 char as utf8 from backref â…”
> 2/3 char as utf8 from backref â…”
> 2/3 char as utf8 from unicode â…”
>
> Output using regex 207
>
> 2070 2009-08-11
> 2/3 char as utf8 from backref â…”
> 2/3 char as utf8 from backref $0
> 2/3 char as utf8 from unicode â…”
>
> I got the above output once (i.e., backref worked the first time in a boot),
> since then it's been:
>
> 2070 2009-08-11
> 2/3 char as utf8 from backref $0
> 2/3 char as utf8 from backref $0
> 2/3 char as utf8 from unicode â…”
>
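
A note on reading the logs above: the odd-looking characters after "backref" and "unicode" are most likely the three UTF-8 bytes of U+2154 shown one byte per character. The Python sketch below reproduces that, assuming the debug window renders bytes as Windows-1252 (the thread does not say which code page is in use). A literal `$0` in the log means the format string was not expanded at all.

```python
# U+2154 is VULGAR FRACTION TWO THIRDS, the character matched by \x{2154}.
TWO_THIRDS = "\u2154"


def utf8_seen_as_cp1252(text: str) -> str:
    """Encode to UTF-8, then misread each byte as a Windows-1252 character."""
    return text.encode("utf-8").decode("cp1252")


utf8_bytes = TWO_THIRDS.encode("utf-8")
shown = utf8_seen_as_cp1252(TWO_THIRDS)

print(utf8_bytes.hex(" "))  # the three bytes of the UTF-8 encoding
print(ascii(shown))         # code points a cp1252 viewer would display

# Reversing the misreading recovers the original character.
recovered = shown.encode("cp1252").decode("utf-8")
```

So a correct backreference and a correct `to_utf8` call should print the same three-character sequence, which is what the regex 206 log shows.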
> Also, just tried regexplugintest, and it looks like format string is
> currently broken everywhere, not just with utf8 option. Over 50 conflicts
> comparing output logs. Looks like all the pcrereplace and pcrematchall tests
> failed.
Try regexPlugin207_090812.zip
http://tech.groups.yahoo.com/group/power-pro/files/0_TEMP_/AlansPluginProvisional/
> > BTW I seem to be parsing config ini file for
> >
> > defaultmatchseparator
> > defaultutf8matchseparator
> >
> > But not using them for anything, or remembering result. Redundant code?
>
> Sounds like they can be safely removed, since you say they are unused anyway.
> No such ini keys are documented.
Okay, out.
> Looks like a handle if you debug it. I think the table in the doc is in
> error. Elsewhere it says:
Will check