On Wed, Feb 14, 2007 at 08:25:49AM -0800, John H. Robinson, IV wrote:
> Ralph Shumaker wrote:
> > I've been compiling my script, using sed to do everything I was
> > previously doing in vim. However, I've hit a snag. One thing that
> > works in vim does *not* in sed.
> >
> > vim would strip out all unwanted line feeds with:
> > ":%s/\([ a-zA-Z0-9,\.:;?!)?-]\)\n\([A-Z^a-z(]\)/\1 \2/cg"
> >
> > In my script,
> > "sed -e 's/\([ a-zA-Z0-9,\.:;?!)?-]\)\n\([A-Z^a-z(]\)/\1 \2/g' 0035 >0036"
> > doesn't change anything, so (as a test) I reduced it down to match
> > one line in particular:
> > "sed -e 's/e\nu/e u/g' 0035"
> > and still no go. But reducing it to:
> > "sed -e 's/e$/eeeeeee/g' 0035"
> > or
> > "sed -e 's/^u/uuuuuuu/g' 0035"
> > works (except that it does nothing to the newline).
> >
> > Any suggestions?
>
> Almost sounds like a job for perl. I will have to go back to the
> original problem to see if a nice, clean perl one-liner can tend to
> this.
Going back to the original:
^1 This is a line
that is broken by
the super-imposed
word wrapping.
^2 Short line.
^3 Another line.
^4 Yet another.
^5 Some lines:
they wrap; Some
lines: they don't.
Fixing the line wrap is easy enough:
perl -lp0e 's{(?<!\n)\n}{ }xmsg' in.txt > out.txt
This says, "replace any newline that doesn't immediately follow a
newline with a space," so only a blank line keeps two paragraphs apart.
It does have a drawback: each line in the output file ends with a
single space.
Does it matter whether, after replacing "line that is," the existing
position of the newline is preserved? If not:
perl -p0e 's{line\sthat\sis}{line which is}xmsg' in.txt
This turns the first "line" into:
^1 This is a line which is broken by
the super-imposed
word wrapping.
Or you could capture the whitespace that was there and put it back, if
you really wanted:
perl -p0e 's{line(\s)that(\s)is}{line${1}which${2}is}xmsg' in.txt
--
http://xkcd.com/c208.html
Chris Grau
