On 4/17/06, Scott Granneman <[EMAIL PROTECTED]> wrote:
> ---------- Forwarded message ----------
> From: Alan German <[EMAIL PROTECTED]>
> Date: Apr 17, 2006 10:48 AM
> Subject: SED (or other) command help
> To: Scott Granneman <[EMAIL PROTECTED]>
>
> Given a file, similar to
>
> <html>
> <head>
> stuff
> </head>
> <body>
> <pre>
> content of interest
> </pre>
> </body>
> </html>
>
> I'd like a command line to rewrite that file, keeping the <pre></pre>
> tag pair and the content in between, so that the result looks like:
>
> <pre>
> content of interest
> </pre>

This should work as long as the file has exactly one pre block, its
tags carry no attributes, and it does not contain another pre block:

  $ awk 'BEGIN {RS="\a"; FS="pre>"} { print "<pre>" $2 "pre>" }' input.html

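The record separator (RS) should be some character that is not in the
file (above I use the audible bell), so that awk reads the whole file
as a single record and the field separator (FS) splits it at each
"pre>".  It is possible to do in sed, but a version that also copes
with tags sharing a line with other text would be a very long command
line.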
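
When the <pre> and </pre> tags each sit on their own line, as in the
file from the question, a sed range address may already be enough.  The
following is only a sketch under that assumption; the temporary
directory and the variable names workdir and result are made up for
illustration, and the sample file is rebuilt here so the snippet runs
on its own:

```shell
# Hypothetical scratch area holding a copy of the sample file from the question.
workdir=$(mktemp -d)
cat > "$workdir/input.html" <<'EOF'
<html>
<head>
stuff
</head>
<body>
<pre>
content of interest
</pre>
</body>
</html>
EOF

# -n turns off sed's default printing; the range address selects every
# line from the first one matching <pre> through the next one matching
# </pre>, and p prints just those lines.
result=$(sed -n '/<pre>/,/<\/pre>/p' "$workdir/input.html")
printf '%s\n' "$result"

# Remove the scratch area again.
rm -r "$workdir"
```

To rewrite a real file, redirect the output to a new file and rename it
afterwards; redirecting straight onto the input file would truncate it
before sed reads it.  Because the range is matched line by line, any
other text sharing a line with either tag is printed along with it.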

As pointed out by others, the real solution is to use an HTML or XML
parser rather than a one-liner.

--
David Dooling
 
_______________________________________________
CWE-LUG mailing list
[email protected]
http://www.cwelug.org/
http://www.cwelug.org/archives/
http://www.cwelug.org/mailinglist/
