On 31-Jan-2008, at 15:56, Ken Loomis wrote:
I have a couple hundred HTML files that start off like this:
<html><head>
<title>Unique Page Title</title>
<meta http-equiv="Content-Type" content="text/html;
charset=ISO-8859-1">
<meta name="description" content="blah blah blah">
<meta name="keywords" content="long list of keywords, keyword 1,
keyword 2">
.
.
.
I need to reduce a copy of these files from the first example above
to this:
<title>Unique Page Title</title>
<meta name="keywords" content="long list of keywords, keyword 1,
keyword 2">
I don't know exactly what you mean here. Are you simply trying to
strip the title and keywords out of the files into a new file? Or do
you want to remove everything in the file except those lines?
Either way, I think what you want here is not a grep find and replace,
but "Process Lines containing..." with grep matching turned on and a
simple alternation (OR) pattern:
(<title>|<meta name="keywords")
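If you would rather script this than use the dialog, here is a rough Python sketch of the same alternation. It assumes the tags look like the sample above; the function name extract_title_and_keywords is made up for illustration. Unlike a line-based search, it also picks up a keywords tag that wraps onto a second line.

```python
import re

# Same alternation as the pattern above, extended to capture each whole tag.
# re.DOTALL lets the keywords tag span a line break; re.IGNORECASE is an
# assumption that some files might use <TITLE> or <META ...>.
TAG_RE = re.compile(
    r'<title>.*?</title>|<meta name="keywords".*?>',
    re.DOTALL | re.IGNORECASE,
)

def extract_title_and_keywords(html):
    """Return the title and keywords tags found in html, in document order."""
    return TAG_RE.findall(html)

# Sample head block copied from the question.
sample = '''<html><head>
<title>Unique Page Title</title>
<meta http-equiv="Content-Type" content="text/html;
charset=ISO-8859-1">
<meta name="description" content="blah blah blah">
<meta name="keywords" content="long list of keywords, keyword 1,
keyword 2">
'''

for tag in extract_title_and_keywords(sample):
    print(tag)
```

This only reads the text, so it can be pointed at copies of the files without touching the originals.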
I will then also need to replace the block of text in that first
example, with something like this:
<html><head>
<meta http-equiv="Content-Type" content="text/html;
charset=ISO-8859-1">
<!-- Start of Title, Description & Keywords -->
<title>Unique Page Title</title>
<meta name="description" content="blah blah blah">
<meta name="keywords" content="long list of keywords, keyword 1,
keyword 2">
<!-- End of Title, Description & Keywords -->
Well, I have no suggestions for rearranging the lines. Adding the
start and end comments should be straightforward enough:
find <title>
replace <!-- Start... <title>
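Outside of the Find dialog, the rearranging can be scripted as well. The following is only a sketch in Python, assuming every file has a title, a description tag and a keywords tag shaped like the sample in the question; wrap_head_block and the constant names are invented for this example. It lifts the title out, puts it back in front of the description with the start comment, and adds the end comment after the keywords tag.

```python
import re

START = "<!-- Start of Title, Description & Keywords -->"
END = "<!-- End of Title, Description & Keywords -->"

TITLE_RE = re.compile(r'<title>.*?</title>\n?', re.DOTALL | re.IGNORECASE)
DESC_RE = re.compile(r'<meta name="description"', re.IGNORECASE)
KEYWORDS_RE = re.compile(r'<meta name="keywords".*?>', re.DOTALL | re.IGNORECASE)

def wrap_head_block(html):
    """Move the title next to the description and add the marker comments."""
    if START in html:
        return html  # already processed, do not add the comments twice
    title = TITLE_RE.search(html)
    if title is None or not DESC_RE.search(html) or not KEYWORDS_RE.search(html):
        return html  # file does not match the assumed shape, leave it alone
    title_tag = title.group(0).rstrip("\n")
    # 1. Cut the title (and its line break) out of its original position.
    html = html[:title.start()] + html[title.end():]
    # 2. Put the start comment and the title just before the description tag.
    html = DESC_RE.sub(
        lambda m: START + "\n" + title_tag + "\n" + m.group(0), html, count=1
    )
    # 3. Put the end comment right after the keywords tag.
    html = KEYWORDS_RE.sub(lambda m: m.group(0) + "\n" + END, html, count=1)
    return html

# Sample head block copied from the question.
sample = '''<html><head>
<title>Unique Page Title</title>
<meta http-equiv="Content-Type" content="text/html;
charset=ISO-8859-1">
<meta name="description" content="blah blah blah">
<meta name="keywords" content="long list of keywords, keyword 1,
keyword 2">
'''

print(wrap_head_block(sample))
```

Files that do not fit the assumed layout come back unchanged, which seems safer for a batch of a couple hundred pages than guessing.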
Later, I will ultimately need a single tab delimited file that looks
like this:
Page Titles Keywords
Unique Page Title1 keywords from page 1
Unique Page Title2 keywords from page 2
Unique Page Title3 keywords from page 3
That sounds like something well beyond grep, but it should be doable
with Process Lines containing... and then some manipulation of the
extracted lines.
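As one possible form of that manipulation, here is a small Python sketch that builds the tab-delimited table directly from the page text. The helper names (clean, tsv_row, build_table) are hypothetical, and it assumes the keywords tag is written as name="keywords" content="..." the way it is in the sample.

```python
import re

TITLE_RE = re.compile(r'<title>(.*?)</title>', re.DOTALL | re.IGNORECASE)
KEYWORDS_RE = re.compile(
    r'<meta name="keywords" content="(.*?)"', re.DOTALL | re.IGNORECASE
)

def clean(text):
    """Collapse line breaks, tabs and runs of spaces into single spaces."""
    return " ".join(text.split())

def tsv_row(html):
    """Return 'title<TAB>keywords' for one page; missing parts become empty."""
    title = TITLE_RE.search(html)
    keywords = KEYWORDS_RE.search(html)
    return "\t".join([
        clean(title.group(1)) if title else "",
        clean(keywords.group(1)) if keywords else "",
    ])

def build_table(pages):
    """Build the whole table, header first, one row per page."""
    rows = ["Page Titles\tKeywords"]
    rows.extend(tsv_row(html) for html in pages)
    return "\n".join(rows) + "\n"

# Two made-up pages standing in for the real files.
page1 = '''<html><head>
<title>Unique Page Title1</title>
<meta name="keywords" content="keywords from
page 1">
'''
page2 = '''<html><head>
<title>Unique Page Title2</title>
<meta name="keywords" content="keywords from page 2">
'''

print(build_table([page1, page2]), end="")
```

To run it over real files, the page text could be read with something like pathlib's Path.read_text(encoding="iso-8859-1") for each .html file in a folder; the folder name is up to you. Cleaning the whitespace matters because a wrapped keywords tag would otherwise put a line break in the middle of a row.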
--
Your letters they all say that you're beside me now.
Then why do I feel alone?
I'm standing on a ledge and your fine spider web
is fastening my ankle to a stone.
--