Thanks for your reply.

> > s=ababxyabab
> > echo ${s/%+-(ab)/AB}
> > ababxyAB <-- Why not the shortest match just like ${s/#+-(ab)/AB} ?
>
> No bug (unless the ksh spec is out to surprise me :)

The ksh93 docs say:

    ${parameter/pattern/string}
    ${parameter//pattern/string}
    ${parameter/#pattern/string}
    ${parameter/%pattern/string}

    Expands parameter and replaces the longest match of pattern with
    the given string. Each occurrence of \n in string is replaced by the
    portion of parameter that matches the n-th sub-pattern. In the first
    form, only the first occurrence of pattern is replaced. In the second
    form, each match for pattern is replaced by the given string. The
    third form restricts the pattern match to the beginning of the string
    while the fourth form restricts the pattern match to the end of the
    string. When string is null, the pattern will be deleted and the / in
    front of string may be omitted. When parameter is @, *, or an array
    variable with subscript @ or *, the substitution operation is applied
    to each element in turn. In this case, the string portion of word will
    be re-evaluated for each element.

But maybe I am misunderstanding what "[...] restricts the pattern match
to the end of the string." means?
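
For reference, here is a small sketch of the four forms using the
ordinary (longest-match) +(ab) pattern, where I think nobody disagrees
about the results. The variable names are only for illustration, and the
shopt line is an assumption for people trying this in bash, which needs
extglob for +(...); ksh93 ignores it. My question is only about the
shortest-match variant +-(ab), which this sketch deliberately leaves out.

```shell
# bash needs extglob for +(...); in ksh93 this line is a silent no-op.
shopt -s extglob 2>/dev/null || true

s=ababxyabab

first=${s/+(ab)/AB}       # first form: first (longest) match only
all=${s//+(ab)/AB}        # second form: every match
at_start=${s/#+(ab)/AB}   # third form: match must begin at the start
at_end=${s/%+(ab)/AB}     # fourth form: match must finish at the end

printf '%s\n' "$first" "$all" "$at_start" "$at_end"
# prints:
# ABxyabab
# ABxyAB
# ABxyabab
# ababxyAB
```

So with the longest match, both anchored forms replace the whole "abab"
run, as the text above describes.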

>
> Translated to Perl, it's the same output:

I'm not sure what Perl has to do with this.

>
> perl -e '$_="ababxyabab"; s/(ab)+$/AB/; print' ==
> perl -e '$_="ababxyabab"; s/(ab)+?$/AB/; print' ==
> ababxyAB
>
>
> My take would be:
>
> Consider a simple-minded forward automaton for matching: anchor match
> at first ab, extend match non-greedy (2nd line) to EOL. Backtrack on
> failure and go forward to next ab match. Same results as above.
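
If I read this forward-scan argument correctly, it can be sketched in
plain shell roughly as below. This is only my own illustration of the
idea, not how ksh is actually implemented: is_ab_run, prefix, rest and
result are made-up names, and the "pattern" is hard-wired to one or more
"ab" reaching the end of the string.

```shell
# Hypothetical helper: succeeds if $1 consists of one or more "ab" only.
is_ab_run() {
    t=$1
    [ -n "$t" ] || return 1
    while [ -n "$t" ]; do
        case $t in
            ab*) t=${t#ab} ;;   # consume one "ab" and keep going
            *)   return 1 ;;    # anything else: cannot reach the end
        esac
    done
    return 0
}

s=ababxyabab
prefix=        # text to the left of the current start position
rest=$s        # candidate match: from the start position to the end
result=$s      # stays unchanged if no start position works

# Try start positions from left to right; the first one from which
# the pattern reaches the end of the string wins.
while [ -n "$rest" ]; do
    if is_ab_run "$rest"; then
        result=${prefix}AB
        break
    fi
    first=${rest%"${rest#?}"}   # first character of rest
    prefix=$prefix$first
    rest=${rest#?}              # move the start one character right
done

echo "$result"   # prints ababxyAB
```

The start position before "abab" is tried before the one before the last
"ab", so a scanner of this kind never gets to consider the shorter
match. If ksh scans this way, the shortest-match modifier would only
limit how far a match extends to the right, and the end anchor already
fixes that.
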
>
> Now if we start the match earlier with an additional greedy/non-greedy
> match, things will differ (note the greed "inversion" :) ):
>
> perl -e '$_="ababxyabab"; s/(.*)(ab)+$/${1}AB/; print' == ababxyabAB
> perl -e '$_="ababxyabab"; s/(.*?)(ab)+$/${1}AB/; print' == ababxyAB
>
> Thus:
>
> The behaviour is what I'd expect, though the EOL-anchoring syntax of
> ksh (% being first) makes the expression harder to read and the wrong
> expectation seem 'natural' in comparison to #, hiding the obvious
> differences between ^ and $ aka # and % in variable substitution.
>
> --
> cu
> Peter l Jakobi
> [email protected]
                                          
_______________________________________________
ast-users mailing list
[email protected]
https://mailman.research.att.com/mailman/listinfo/ast-users
