Tim Chase wrote:
print ("small string");
print (
"This is a very long string");
and I need to format it as so:
print ("small string\n");
print (
"This is a very long string\n");
Ideally, I would like to do this in one command and I would also like
to understand the regex itself. So, given the above, here is what I
understand of the regex pattern:
%s/print\s*(\s*"[^"]*\(\\n\)\@<!\ze"/&\\n/g
% - the range: every line in the file
s - substitute
/ - delimiter
print\s*(\s*" - my phrase to match including zero or more matching
spaces at the end of print, then a literal paren, then zero or more
spaces up until the quote
[^"]* - then everything that is not a quote (zero or more)
Doing well up through here...
( - The beginning of the group ???
\\n - literal \n
) - End group ????
\@<! - Nothing, requires no match behind ???
You've got the understanding right (though those parens are "\(" and
"\)" with backslashes). Those four lines in concert assert that a
literal "\n" doesn't come before the current point. Without the
grouping, it would only assure that the previous atom (in this case,
the "n") didn't appear here, so you'd have problems with things like
print("terminal n")
because it sees the terminal "n" so it doesn't do the substitution.
By grouping them, you assert "and when you get to this point [before
the closing quote] and there isn't a literal backslash-en here, then
we match"
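
For anyone who wants to poke at the grouping outside of Vim, here is a rough sketch using Python's re module as a stand-in. This is an analogue, not Vim syntax: (?<!...) plays the part of \@<! and (?=") plays the part of \ze". The "ungrouped" pattern only models the behaviour described above, where just the "n" is checked, and the sample strings are made up for illustration.

```python
import re

# Grouped form, analogous to Vim's \(\\n\)\@<! : the two-character
# sequence backslash + n must not sit right before the closing quote.
grouped = re.compile(r'print\s*\(\s*"[^"]*(?<!\\n)(?=")')

# Models the ungrouped case described above: only the single atom "n"
# is checked, so any string ending in the letter n is wrongly skipped.
ungrouped = re.compile(r'print\s*\(\s*"[^"]*(?<!n)(?=")')

needs_newline = 'print("terminal n")'   # ends in a plain letter n
already_done = 'print("done\\n")'       # already ends in backslash + n

grouped_hits = [bool(grouped.search(s)) for s in (needs_newline, already_done)]
ungrouped_hits = [bool(ungrouped.search(s)) for s in (needs_newline, already_done)]

print(grouped_hits)    # [True, False]
print(ungrouped_hits)  # [False, False]
```

The grouped version matches the line that still needs its \n and leaves the finished one alone; the ungrouped version skips both.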
In here, you're missing the "\ze" which means "when doing the
replacement, treat it as though the thing we're substituting ended
here, even though there's more stuff we're looking for (namely, the
double-quote that's next)"
" - my ending quote to match in the pattern print ("")
correct
/& - ???
This is standard substitution...the slash is the break between the
search and its replacement. The ampersand is "the whole previous
match". In this case, it's slightly tweaked because of the "\ze" that
we used...the thing we replace goes up through (but not including) the
second double-quote. So it drops in everything from "print" through
the end of the internal string (sans-closing-quote)
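
Here is a small sketch of the same idea in Python's re module, in case it helps to see what "&" actually holds. It is an approximation of the Vim command, not a translation: the lookahead (?=") stands in for \ze", and \g<0> stands in for &.

```python
import re

# (?=") requires the closing quote to follow but keeps it out of the
# match, which is roughly what \ze" does in the Vim pattern.
pattern = re.compile(r'print\s*\(\s*"[^"]*(?<!\\n)(?=")')

line = 'print ("small string");'

# The text that Vim's & would stand for: everything from "print"
# through the end of the string contents, without the closing quote.
matched = pattern.search(line).group(0)

# \g<0> re-inserts the whole match; the doubled backslash in the
# replacement appends a literal backslash followed by n.
result = pattern.sub(r'\g<0>\\n', line)

print(matched)  # print ("small string
print(result)   # print ("small string\n");
```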
\\n - literal \n
correct...appending the literal \n you want.
/ - delimiter
g - each occurrence on the line
Then we have the spanning multiple lines option:
\_ [^"]*
that's
\_[
not
\_ [
\_ - match text over multiple lines (Is this like another
regex engine, like the one sed uses?)
It's a vim thing:
:help /\_
should drop you in the fray. It prefixes (infixes?) a number of atoms
so that they also match a line break, so for your change you'd likely
want to do something like change the \s atoms to \_s to include newlines.
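
As a cross-check of the multi-line case, here is a sketch in Python's re module rather than Vim. One difference to keep in mind: in Python, \s and [^"] already match newlines, so no \_ equivalent is needed there, whereas in Vim you have to ask for it. The input is the two print statements from the original question.

```python
import re

# Same pattern as the single-line case; in Python, \s and [^"] can
# already span a line break, which is what \_s and \_[^"] buy in Vim.
pattern = re.compile(r'print\s*\(\s*"[^"]*(?<!\\n)(?=")')

before = 'print ("small string");\nprint (\n"This is a very long string");\n'

# Append a literal backslash + n inside each string that lacks one.
after = pattern.sub(r'\g<0>\\n', before)

# Running it a second time changes nothing, because the negative
# lookbehind rejects strings that already end in backslash + n.
again = pattern.sub(r'\g<0>\\n', after)

print(after)
```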
Does this make sense? The area I am having difficulty with is /& and
how the grouping is working.
Hopefully this sheds some light on matters and helps you tweak your
own regexps in the future. If you have any questions, feel free to ask.
-tim
Yes, this helps greatly. Thanks again Tim.
Sean