On Mon, Jun 17, 2013 at 6:29 PM, Paul Isambert <[email protected]> wrote:
> luigi scarso <[email protected]> a écrit:
> > On Mon, Jun 17, 2013 at 4:54 PM, Paul Isambert <[email protected]> wrote:
> > > luigi scarso <[email protected]> a écrit:
> > > > On Mon, Jun 17, 2013 at 2:43 PM, Paul Isambert <[email protected]> wrote:
> > > > > Hello all,
> > > > >
> > > > > This is not really a LuaTeX question, but I ask it here anyway since a
> > > > > lot of knowledgeable people read this list.
> > > > >
> > > > > I’ve been surprised to discover that
> > > > >
> > > > > print(string.gsub('abc', '.*', '(%0)'))
> > > > >
> > > > > returns
> > > > >
> > > > > (abc)()
> > > > >
> > > > > (similarly, “string.gmatch('abc', '.*')” returns two matches). I’d
> > > > > expect
> > > > >
> > > > > (abc)
> > > > >
> > > >
> > > > maybe this can help
> > > >
> > > > > print(string.gsub("abc","%s*","(%0)"))
> > > > ()a()b()c() 4
> > > >
> > > > > print(string.gsub("abc","%S*","(%0)"))
> > > > (abc)() 2
> > > >
> > > > """
> > > > A pattern item can be
> > > >
> > > > a single character class followed by '*', which matches 0 or more
> > > > repetitions of characters in the class. These repetition items will always
> > > > match the longest possible sequence;
> > > > """
> > >
> > > Thank you Luigi, but “*” has the same definition in other languages,
> > > including those where there is no match on a final empty string.
> > >
> > > As for your first example, all languages behave the same as far as I
> > > can tell, as expected.
> > >
> > > Best,
> > > Paul
> >
> > $ perl -e '$x="abc"; @w=($x=~ /(.*)/g); print "tot. matches:", scalar(@w), " matches:($w[0])($w[1])\n"'
> > tot. matches:2 matches:(abc)()
> >
> > $ perl -e '$x="abc"; @w=($x=~ /(.*)/); print "tot. matches:", scalar(@w), " matches:($w[0])($w[1])\n"'
> > tot. matches:1 matches:(abc)()
> >
> > in perl
> > "the modifier //g stands for global matching and allows the matching
> > operator to match within a string as many times as possible"
> > and I think it corresponds to
> > "These repetition items will always match the longest possible sequence;"
> > of pattern.
> >
> > Thanks again, Luigi... but again, that doesn’t explain away the
> > problem. Actually, I don’t think “g” corresponds to matching the
> > longest possible sequence (simply matching as many times as possible
> > instead of only once),
>

ok, better:
in Lua, string.gsub always matches as many times as possible, and .* is greedy.
Together they behave like g (as many times as possible) and .* (greedy, the
default) in perl. So we have the same result.

The non-greedy version of * is -:

print(string.gsub("abc",".-","(%0)"))
()a()b()c() 4

perl -e '$x="abc"; @w=($x=~ /(.{0}?)/g); print "tot. matches:", scalar(@w)," matches:($w[0])($w[1])($w[2])($w[3])\n"'

g: as many times as possible
{0}?: non-greedy (the ? is redundant)

--
luigi
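
PS: to make "as many times as possible" concrete, here is a minimal sketch in Lua of a scanning loop that reproduces the results quoted in this thread. It is only a model, not the actual implementation in the Lua sources: gsub_sketch and paren are hypothetical names, the sketch ignores anchors, captures and the optional repetition limit, and it reflects the Lua 5.1/5.2-era behaviour discussed here (later Lua releases may differ). The point it illustrates is that after the greedy match has consumed "abc", the loop makes one more attempt at the end of the string, where .* still matches the empty string.

```lua
-- Hypothetical helper, not part of Lua's standard library: a simplified
-- model of a "match as many times as possible" substitution loop.
local function gsub_sketch(s, pattern, wrap)
  local out, count, pos = {}, 0, 1
  -- Note the "+ 1": one more attempt is made just past the last character.
  while pos <= #s + 1 do
    local first, last = string.find(s, pattern, pos)
    if not first then
      out[#out + 1] = string.sub(s, pos)          -- no more matches: copy the rest
      break
    end
    out[#out + 1] = string.sub(s, pos, first - 1) -- text skipped before the match
    out[#out + 1] = wrap(string.sub(s, first, last))
    count = count + 1
    if last >= first then
      pos = last + 1                              -- non-empty match: resume after it
    else
      -- Empty match: copy one character unchanged and step past it,
      -- otherwise the loop would never advance.
      out[#out + 1] = string.sub(s, first, first)
      pos = first + 1
    end
  end
  return table.concat(out), count
end

local function paren(m) return "(" .. m .. ")" end

print(gsub_sketch("abc", ".*", paren))   --> (abc)()      2
print(gsub_sketch("abc", "%s*", paren))  --> ()a()b()c()  4
print(gsub_sketch("abc", ".-", paren))   --> ()a()b()c()  4
```

In this model the second "()" for .* comes from the attempt at position 4 (just past "c"): the greedy match took everything, but an empty match at the very end is still a match.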
