Glenn, I think one detail of the issue is that all modern Linux distributions use the UTF-8 multibyte encoding by default.
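
A quick way to see which encoding a given machine actually selects is to ask `locale charmap`. This is only a sketch: the helper name `charmap_kind` is made up for illustration and is not part of ksh or ast, and the exact name reported for the C locale varies between systems.

```shell
# charmap_kind is a hypothetical helper, not part of ksh or ast.
# It classifies a charmap name as printed by `locale charmap`.
charmap_kind() {
    case $1 in
    UTF-8|utf-8|UTF8|utf8) echo utf8 ;;   # multibyte UTF-8 locale
    *) echo other ;;                      # C/POSIX or any other encoding
    esac
}

# What the current environment selects, and what LC_ALL=C would select.
printf 'current:  %s\n' "$(charmap_kind "$(locale charmap 2>/dev/null)")"
printf 'LC_ALL=C: %s\n' "$(charmap_kind "$(LC_ALL=C locale charmap 2>/dev/null)")"
```

If the first line reports utf8, rerunning the slow test with LC_ALL=C set in the environment should show how much of the elapsed time, if any, comes from multibyte decoding rather than from backtracking itself.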

Olga

On Tue, Oct 29, 2013 at 6:24 PM, Glenn Fowler <[email protected]> wrote:
> my guess is its an n^2 or worse problem that would eventually cause problems
> even with *s++
> besides for LC_ALL=C ast defaults back to *s++ any way modulo one ?: test
> for each *s++
>
> On Tue, Oct 29, 2013 at 1:09 PM, ольга крыжановская
> <[email protected]> wrote:
>>
>> Glenn, a possible optimization is to run the regex patterns on a
>> wchar_t string and not a byte string. It would eliminate all the mb*()
>> calls which are often called during backtracking, and represent a
>> major hit at run time.
>>
>> Olga
>>
>> On Tue, Oct 29, 2013 at 3:24 PM, Glenn Fowler <[email protected]>
>> wrote:
>> > its a performance problem with the underlying regex
>> > whenever (...) groups are involved it has to work harder
>> > if you only care about *any* match vs the longest of the leftmost
>> > matches
>> > then prefix the pattern with ~(-g)
>> > which means "not greedy" or "minimal"
>> > this loop shows the time deterioration
>> >
>> > x=
>> > for ((i = 1; i <= 20; i++))
>> > do x=x$x
>> > time -f %E $SHELL -c "[[ x__${x}__x == *@(__+(+(x)?(_))__)* ]];
>> > printf '%d %2d ' $? $i"
>> > done
>> >
>> > On Tue, Oct 29, 2013 at 8:40 AM, Dan Rickhoff <[email protected]>
>> > wrote:
>> >>
>> >> If this is a ksh bug, what ksh version should I upgrade to?
>> >>
>> >> On:
>> >> OS: Red Hat Enterprise Linux Server release 6.1 (Santiago)
>> >> ksh: version sh (AT&T Research) 93t+ 2010-06-21
>> >>
>> >> Elapsed time less than 2 tenths of a second:
>> >>
>> >> $ time -f '%E\n' ksh -e '[[ A__BBBBBBBB_CCCCC_Z_EEEE__F ==
>> >> *@(__+(+([A-Z0-9])?(_))__)* ]]'
>> >> 0:00.14
>> >>
>> >> However, if that string is extended by adding, say, seven more "Z"s,
>> >> then
>> >> the elapsed mushrooms to almost 10 seconds.
>> >>
>> >> $ time -f '%E\n' ksh -e '[[ A__BBBBBBBB_CCCCC_ZZZZZZZZ_EEEE__F ==
>> >> *@(__+(+([A-Z0-9])?(_))__)* ]]'
>> >> 0:09.96
>> >>
>> >> This appears to be a ksh bug (a memory leak?), what ksh version must I
>> >> upgrade to to get past it?
>> >>
>> >> Please let me know if I should provide further information.
>> >>
>> >> Thanks,
>> >> Dan

-- 
      ,   _                                    _   ,
     { \/`o;====-    Olga Kryzhanovska   -====;o`\/ }
.----'-/`-/           [email protected]          \-`\-'----.
 `'-..-| /       http://twitter.com/fleyta     \ |-..-'`
      /\/\     Solaris/BSD//C/C++ programmer   /\/\
      `--`                                      `--`
_______________________________________________
ast-users mailing list
[email protected]
http://lists.research.att.com/mailman/listinfo/ast-users
