Re: [ast-users] If this is a bug, what ksh version must I upgrade to?

dan . rickhoff Wed, 30 Oct 2013 07:02:02 -0700

Glenn, 

Thank you very much. Your suggestion that I use ~(-g) did the trick.


Not only is the code blazing fast, but, since I needed the left-most match, 
it also (serendipitously) fixed the problem that my previous code was finding 
the right-most match. Here are the before and after versions. 

x=YY:__AB_CD_EF____12_34_56____PQ_RS_TU__:ZZ 

[[ $x == @(*__+(+([A-Z0-9])?(_))__*) ]] 
print "${.sh.match[2]}" 
PQ_RS_TU 

[[ $x == ~(-g:@(*__+(+([A-Z0-9])?(_))__*)) ]] 
print "${.sh.match[2]}" 
AB_CD_EF 
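
The difference between the two results is presumably down to how much the leading `*` consumes: taking as much as possible leaves the group on the last token, taking as little as possible leaves it on the first. The snippet below is only an analogy built on POSIX parameter expansion (shortest versus longest removal). It is not the ~(-g) mechanism itself, and the variable names `leftmost` and `rightmost` are made up for illustration.

```shell
x=YY:__AB_CD_EF____12_34_56____PQ_RS_TU__:ZZ

# Shortest removal up to the first "__", then cut at the next "__":
# this lands on the left-most token.
leftmost=${x#*__}
leftmost=${leftmost%%__*}

# Cut at the last "__", then remove the longest prefix ending in "__":
# this lands on the right-most token.
rightmost=${x%__*}
rightmost=${rightmost##*__}

printf '%s\n' "$leftmost" "$rightmost"
```

This prints AB_CD_EF followed by PQ_RS_TU, mirroring the after and before results above.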

BTW, this is used in some code that marches along such lines, from left to 
right, finding and replacing tokens bounded by double underscores. Each token 
consists of one or more digits and/or uppercase letters and/or underscores, 
where the token's embedded underscores must not be adjacent to another 
underscore or appear at the token's ends. I also wanted to always be able to 
recover the match from the same index-numbered element of ${.sh.match}, in 
this case index 2. 
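
A rough sketch of that kind of left-to-right scan follows. It is not the code described above: it swaps the ~(-g) pattern for plain POSIX parameter expansion so that it runs in shells other than ksh93. The function name `replace_tokens` and the `<TOKEN>` replacement are hypothetical stand-ins for whatever the real code substitutes.

```shell
# Byte-wise collation, so that [A-Z] means ASCII uppercase only.
LC_ALL=C

# Hypothetical helper (not from the thread): rewrite every __TOKEN__ as
# <TOKEN>, scanning the string from left to right.
replace_tokens() {
    rest=$1
    out=
    while :; do
        case $rest in
            *__*) ;;                # at least one "__" remains
            *) break ;;
        esac
        out=$out${rest%%__*}        # copy the text before the left-most "__"
        rest=${rest#*__}            # drop that text and the opening "__"
        case $rest in
            *__*) token=${rest%%__*} ;;   # candidate runs to the next "__"
            *) token= ;;                  # no closing "__" left
        esac
        case $token in
            ''|_*|*_|*[!A-Z0-9_]*)
                out=${out}__        # not a valid token: keep the "__" as is
                ;;
            *)
                out="$out<$token>"  # stand-in for the real replacement
                rest=${rest#*__}    # skip the token and its closing "__"
                ;;
        esac
    done
    printf '%s\n' "$out$rest"
}

replace_tokens 'YY:__AB_CD_EF____12_34_56____PQ_RS_TU__:ZZ'
```

For the sample line this prints YY:<AB_CD_EF><12_34_56><PQ_RS_TU>:ZZ. A candidate that is empty, starts or ends with an underscore, or contains any other character is left untouched, and its closing "__" is then tried as the opener of the next candidate.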

Thanks, 
Dan 

----- Original Message -----

From: "Glenn Fowler" <[email protected]> 
To: "Dan Rickhoff" <[email protected]> 
Cc: [email protected] 
Sent: Tuesday, October 29, 2013 7:24:26 AM 
Subject: Re: [ast-users] If this is a bug, what ksh version must I upgrade to? 

it's a performance problem with the underlying regex 
whenever (...) groups are involved it has to work harder 
if you only care about *any* match vs the longest of the leftmost matches then 
prefix the pattern with ~(-g) 
which means "not greedy" or "minimal" 
this loop shows the time deterioration 




x= 
for ((i = 1; i <= 20; i++)) 
do x=x$x 
time -f %E $SHELL -c "[[ x__${x}__x == *@(__+(+(x)?(_))__)* ]]; printf '%d %2d\n' $? $i" 
done 








On Tue, Oct 29, 2013 at 8:40 AM, Dan Rickhoff < [email protected] > 
wrote: 

<blockquote>


If this is a ksh bug, what ksh version should I upgrade to? 

On: 
OS: Red Hat Enterprise Linux Server release 6.1 (Santiago) 
ksh: version sh (AT&T Research) 93t+ 2010-06-21 

Elapsed time less than 2 tenths of a second: 

$ time -f '%E\n' ksh -e '[[ A__BBBBBBBB_CCCCC_Z_EEEE__F == *@(__+(+([A-Z0-9])?(_))__)* ]]' 
0:00.14 

However, if that string is extended by adding, say, seven more "Z"s, then the 
elapsed time mushrooms to almost 10 seconds. 

$ time -f '%E\n' ksh -e '[[ A__BBBBBBBB_CCCCC_ZZZZZZZZ_EEEE__F == *@(__+(+([A-Z0-9])?(_))__)* ]]' 
0:09.96 

This appears to be a ksh bug (a memory leak?). What ksh version must I upgrade 
to in order to get past it? 

Please let me know if I should provide further information. 

Thanks, 
Dan 

_______________________________________________ 
ast-users mailing list 
[email protected] 
http://lists.research.att.com/mailman/listinfo/ast-users 


</blockquote>

