--- In [email protected], "entropyreduction" <alancampbelllists+ya...@...> wrote:
>
> --- In [email protected], "Sheri" <sherip99@> wrote:
> >
> > --- In [email protected], "entropyreduction"
> > <alancampbelllists+yahoo@> wrote:
> >
> > > Another that checks for unicode handles, but only if the utf
> > > option has been specified (I'm right in thinking a unicode string
> > > can only work if "utf8" option specified?)
> >
> > Great, thanks. Yes, only valid with the utf8 option.
>
> Okay, try regexPluginVariants207_090813.zip, which contains
>
> regexNoUnicodeDllLoaded.dll
> regexUnicodeOnlyLoadedOnUTFoption.dll
>
> but _serious_ health warning: I changed code and compiled, but
> have no time at all to test. The changes were tiny, but that
> doesn't preclude me screwing things up, as you know.
>
> I didn't realise (remember?) that you were keen on very efficient
> code. At some time in the autumn I could have a look at the code
> to see if it could be speeded up: making very fast code wasn't my
> first object when I wrote the thing to begin with.

Well, in this case it seemed possible that a bottleneck might be removed. However, it doesn't seem to make much difference. Thanks for letting me see that.

Seems like driving it off the "utf8" option is the most correct approach. However, utf8 is (for pcre) a compile-time-only option. Unicode handles do not currently work as arguments with compiled regex pattern handles, because it's erroneous to set the utf8 option on those services. If it is included on one, you get an error, e.g.:

ERROR: regex.pcreReplace: Option incomprehensible: utf8
Error occurred near line 103 of script badunicoderegex:
local test=rxpat.pcrereplace(subj2u, repl2u, "utf8")

Maybe the plugin could be changed to observe but discard the option in the handle form of a pcre service?

This currently works fine for that precompiled utf8 pattern:

local test=rxpat.pcrereplace(subj2u.to_utf8, repl2u.to_utf8)

Another place they don't currently work is if the newish internal utf8 option is used. But I don't see a problem with that. If the user wants to specify unicode handles, the external "utf8" option should be used.

Regards,
Sheri

Here were some test runs. I revised the test to remove most use of the debug window (which cut the time significantly).
206 NO unicode.dll regexPluginTest elapsed time: 9.83164
206 NO unicode.dll regexPluginTest elapsed time: 9.93363
206 NO unicode.dll regexPluginTest elapsed time: 10.0021

207 All Unicode = yesterday's build, no unicode.dll available
207 All Unicode regexPluginTest elapsed time: 10.0402
207 All Unicode regexPluginTest elapsed time: 10.2889
207 All Unicode regexPluginTest elapsed time: 9.80878

207 NO unicode regexPluginTest elapsed time: 10.2943
207 NO unicode regexPluginTest elapsed time: 9.8729
207 NO unicode regexPluginTest elapsed time: 9.91535

207 NO unicode (but plugin available) regexPluginTest elapsed time: 9.88163
207 NO unicode (but plugin available) regexPluginTest elapsed time: 9.91519

207 U on UTF OPT regexPluginTest elapsed time: 9.83704
207 U on UTF OPT regexPluginTest elapsed time: 10.0283
207 U on UTF OPT regexPluginTest elapsed time: 10.2531
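
For anyone who wants to check the "not much difference" impression against the numbers, here is a rough sketch that averages the runs per build. It is written in Python rather than PowerPro script purely for convenience. The dictionary labels are just the run labels from the list above, and the helper name `summarize` is made up for this illustration. With only two or three runs per build this is an eyeball check, not a proper benchmark.

```python
from statistics import mean

# Elapsed times in seconds, copied from the test runs listed above.
runs = {
    "206 NO unicode.dll": [9.83164, 9.93363, 10.0021],
    "207 All Unicode": [10.0402, 10.2889, 9.80878],
    "207 NO unicode": [10.2943, 9.8729, 9.91535],
    "207 NO unicode (but plugin available)": [9.88163, 9.91519],
    "207 U on UTF OPT": [9.83704, 10.0283, 10.2531],
}


def summarize(times):
    """Return (mean, max - min) for one build's elapsed times."""
    return mean(times), max(times) - min(times)


# Per-build (mean, run-to-run range), keyed by the run label.
summary = {label: summarize(times) for label, times in runs.items()}

# How far apart the build averages are from each other.
means = [avg for avg, _ in summary.values()]
spread_of_means = max(means) - min(means)

if __name__ == "__main__":
    for label, (avg, rng) in summary.items():
        print(f"{label:40s} mean {avg:.3f}s  run-to-run range {rng:.3f}s")
    print(f"spread between build means: {spread_of_means:.3f}s")
```

On these figures the gap between the build averages is smaller than the run-to-run variation within most of the individual builds, which fits the reading that the variants are indistinguishable at this sample size.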
