On Wed, Feb 21, 2007 at 02:20:46PM -0800, Ralph Shumaker wrote:
> Chris Grau wrote:
> >Fixing the line wrap is easy enough:
> >
> > perl -lp0e 's{(?<!\n)\n}{ }xmsg' in.txt > out.txt
> >
> >This says, "replace any newline that doesn't immediately follow a
> >newline with a space." It does have a drawback. Each line in the
> >output file has a single space at the end.
> >
> >
>
> How would you say "replace any newline that is not preceded nor
> followed by a newline"? Of course, I suppose you could just run a
> second command to replace all " \n" with "\n".

Untested, but I'd start with:

    s{(?<!\n)\n(?!\n)}{ }xmsg

That's a negative look-behind plus a negative look-ahead. I'm not sure
it would work, but it's where I'd start my tinkering.

> In the command you give, what is "-lp0e"? Yes, yes, I know they are
> switches, but what do they do? How does the whole line read in standard
> SN (Stremler Notation®)? e.g.:
>
> perl the command
> - begin the switches
> l does this
> p does that
> 0 does yet
> e does another

The switches are documented in the perlrun(1) man page.

    -l  chomp each input record and append a newline to everything
        printed
    -p  assume a while/print loop around the code
    -0  specify the input record separator as an octal number; normally
        the separator is \n, but with no digits it is the nul
        character, so input that contains no nul bytes is read as a
        single record (-0777 is the explicit "slurp the file" value)
    -e  the code to execute

> ' begin the command string
> s the substitution command
> { begin the search set
> ... etc.

This part is a bit more fun. I won't break it down character by
character.

    s        perform a substitution (in this case, on $_)
    {        start the match (the default delimiter is /, but I like
             braces)
    (?<!\n)  negative look-behind; make sure a newline does not precede
             what is being matched; this is not part of the final match
    \n       the newline we want to match
    }        the end of the match
    { }      the replacement, in this case a single space
    x        re flag: extended; whitespace and comments in the pattern
             are not significant
    m        re flag: multiline; ^ and $ match at line boundaries, not
             only at the beginning and end of the string
    s        re flag: single line; . also matches \n
    g        re flag: global; replace every match

Strictly speaking, neither the x nor the s was required (nor the m,
since the pattern has no ^ or $), but I'm in the habit of always using
them these days.

> >Does it matter if, after replacing "line that is," that the existing
> >position of the newline is preserved? If not:
> >
> > perl -p0e 's{line\sthat\sis}{line which is}xmsg' in.txt
> >
> >
>
> I don't know what "line that is," is referring to.

The literal string from your original example. I've snipped it from
this response, but I originally took it from your message on 11 Feb with
the subject "vi, regex, and line wrapping." (Message ID:
[EMAIL PROTECTED]).

> And I don't know
> what you mean by "that the existing position of the newline is
> preserved", although my guess is that you're asking if I want to be
> able to later put back the newlines that have been stripped out. If
> so, then no, I don't care about remembering where they were.

Yes, that's what I meant.

> >Or, you could do stuff with capturing the whitespace that was there,
> >if you really wanted:
> >
> > perl -p0e 's{line(\s)that(\s)is}{line${1}which${2}is}xmsg' in.txt
> >
> >
>
> Wow, this is really good stuff about perl one liners. I know just
> enough perl to understand it (or to be able to deduce some of the
> parts I don't know).

Or get yourself into a whole lot of trouble. :)

> (I'm assuming that \s stands for any whitespace
> (" ", \t, or otherwise).)

Correct.

> I've already tackled this particular problem though, using sed and tr.
> I used sed to replace all "$" (EOL) with "`" (since the document did
> not contain any of that character and it didn't seem to be a special
> character needing a "\"). Then I used tr to strip out all "\n". Then
> I used sed to replace the last "`" with "", all "``" with "\n\n", and
> subsequently, all "`" with " ". I've used sed for everything so far,
> except for the onetime use of tr.

Yes, that's a good solution, too. I simply threw the Perl solution out
there for fun.

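For comparison, one possible reading of the sed and tr steps described
above. This is a guess at the shape of the pipeline, not the actual
script. It assumes GNU sed (which accepts \n in the replacement text)
and that the input contains no backticks; sample.txt and joined.txt are
made-up file names.

```shell
# sample.txt is a hypothetical input: two wrapped paragraphs.
printf 'first line\nsecond line\n\nnew paragraph\nwraps here\n' > sample.txt

# Step 1 (sed): mark every end of line with a backtick.
# Step 2 (tr):  strip all newlines, leaving one long line.
# Step 3 (sed): drop the last mark, turn "``" into a paragraph break,
#               and turn each remaining mark into a space.
sed 's/$/`/' sample.txt |
    tr -d '\n' |
    sed 's/`$//; s/``/\n\n/g; s/`/ /g' > joined.txt

cat joined.txt
```

Unlike the Perl one-liner, this leaves no trailing space, because the
final mark is removed before the others become spaces.
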
> It's nice to have a script doing everything from beginning to end
> because when the script is done, it will show everything that was
> done. No fading memory forgetting what all I've done and how I did
> it.

Be sure to document! :)

--
Chris Grau
