On Monday, October 8, 2012 1:39:46 PM UTC-5, Chris Jones wrote:
>
> I researched it a little further over the weekend, and eventually, I ran
> into this via a perl forum:
>
> | % echo 'ascii string: "string1", unicode string: "κορδόνι"' | perl -wnE 'say for /"[^"]*"/g'
> | "string1"
> | "κορδόνι"
>
> I don't know perl, but it looks like the match on the two sample strings
> includes the quotes.
>
> > Now, if you add a capturing group¹ around the [^"]* negated character
> > class that matches the actual strings, this is what you get:
>
> | % echo 'ascii string: "string1", unicode string: "κορδόνι"' | perl -wnE 'say for /"([^"]*)"/g'
> | string1
> | κορδόνι
>
> This time the match does _not_ include the quotes.
>

Yes it does. The captured group, now accessible with $1, does not
include the quotes. The match does include the quotes. The full match
(the equivalent of which gets highlighted in Vim) is accessible in $&
(in Vim: \0). But the Perl snippet given only prints out the captured
group because of how the /g flag behaves in list context. See below.

> Or, with our sample text:
>
> | % echo 'xxx==aaa==bbbccc==ddd==yyy' | perl -wnE 'say for /==[^=]*==/g'
> | ==aaa==
> | ==ddd==
> |
> | % echo 'xxx==aaa==bbbccc==ddd==yyy' | perl -wnE 'say for /==([^=]*)==/g'
> | aaa
> | ddd
>

In list context, if there are capturing groups, the match operator
/.../g returns a list of the strings captured by those groups, one set
per match.

"The /g modifier specifies global pattern matching--that is, matching as
many times as possible within the string. How it behaves depends on the
context. In list context, it returns a list of the substrings matched by
any capturing parentheses in the regular expression. If there are no
parentheses, it returns a list of all the matched strings, as if there
were parentheses around the whole pattern."

http://perldoc.perl.org/perlop.html#Regexp-Quote-Like-Operators

This Perl is really saying:

  for each place where ==([^=]*)== matches, print the captured match

Unlike Perl, Vim's search and substitute commands do not leave captured
groups behind in variables; the groups are only usable inside the search
or substitute command itself. In other words, Perl can do stuff like
this:

  $mystr =~ /==(.*)==/;
  print "$1\n";

A Vim search cannot. This is not a regex pattern thing. It's a language
thing.

> So, I tried the same approach with Vim:
>
> | xxx==aaa==bbbccc==ddd==yyy
> |
> | /==[^=]*==
> | /==\([^=]*\)==
>
>
> But it doesn't make any difference..
>
> Both regexes match '==aaa==' and '==ddd==' including the quotes.
>

Yes, they both MATCH the quotes (here, the == delimiters). But the
capturing group only CAPTURES the text without them. The same is true in
Perl.

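To see the MATCH versus CAPTURE distinction from inside Vim, here is a
minimal sketch using the built-in matchlist() function, which returns
the whole match as item 0 and the first capturing group as item 1. Note
that this goes through a script function, not the / search itself, and
that the variable names `sample` and `m` are made up for illustration.

```vim
" Sample text from the thread.
let sample = 'xxx==aaa==bbbccc==ddd==yyy'

" matchlist() returns a List: item 0 is the whole match,
" item 1 is what the first \( \) group captured.
let m = matchlist(sample, '==\([^=]*\)==')

" The whole match still includes the == delimiters.
echo m[0]

" The captured group does not.
echo m[1]
```

The first :echo should show ==aaa== and the second aaa: same pattern,
used in two different ways.
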
Vim HIGHLIGHTS a match as if you're doing this in Perl:

  print "$mystr\n" if ($mystr =~ /"([^"]*)"/);

Vim CAPTURES a group as if you're doing this in Perl:

  $mystr =~ /"([^"]*)"/;
  print "$1\n";

Note that the SAME pattern is used both times, but in a different way.

> Isn't Vim supposed to mimic perl regexes..?
>

Not really. Vim has its own dialect. Although Vim regex can do a lot of
what Perl's can, it's not a 1-1 match.

> Or is there something in Vim's regex syntax that would make it work?
>

Not in the regex syntax. But as discussed, it's not Perl's regex syntax
that allows it to work in Perl either; it's how the regex is applied.

Using the /g flag on a match operator in Perl gives you all matching
substrings. Vim can do something similar with the matchlist() function
if you pass a count to it in a loop until the match fails. I'm not sure
if there's a more efficient way to extract all matches or not.

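For what it's worth, here is a rough sketch of such a loop. It is only
one way to do it, and it deviates from the count idea: as far as I can
tell from the documentation for match(), a {count} search resumes one
character after the start of the previous match, so it could also
report the overlapping ==bbbccc==. This version advances the {start}
offset past the end of each match instead, which should mimic Perl's
non-overlapping /g. The function name AllCaptures() is made up for
illustration; matchlist(), matchend(), add() and empty() are Vim
built-ins.

```vim
" Hypothetical helper: collect the first capturing group of every
" non-overlapping match of a:pat in a:str.
function! AllCaptures(str, pat)
  let captures = []
  let start = 0
  while 1
    " Search from the current offset; an empty List means no more matches.
    let m = matchlist(a:str, a:pat, start)
    if empty(m)
      break
    endif
    " m[0] is the whole match, m[1] the first \( \) group.
    call add(captures, m[1])
    " Continue after the end of this match; step one byte forward on
    " an empty match so the loop always terminates.
    let end = matchend(a:str, a:pat, start)
    let start = end > start ? end : start + 1
  endwhile
  return captures
endfunction

echo AllCaptures('xxx==aaa==bbbccc==ddd==yyy', '==\([^=]*\)==')
```

With the sample text this should echo ['aaa', 'ddd'], the same two
strings the Perl one-liner prints. One caveat: because each search
restarts at {start}, a pattern anchored with ^ can match again at every
offset, so this is best kept to unanchored patterns.
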
