Re: modify a text file

Tim Chase Fri, 28 Apr 2006 05:22:37 -0700

I am struggling with sed and gawk but I guess that it'd be possible to
employ vim in the command line (it's to make a script that will be
automatically launched every 24 hours) but I don't have any idea of
how to do it...


How could I select the blocks (see file ahead) of a text file (say
SSSS.txt) in which some particular words appear?
Imagine that I want to keep the blocks containing words like "black",
"supermassive", "red", "intermediate", "relativistic"...
 and delete the rest of blocks (and also the header and bottom of the file)

Well, my first thought would be to have a destroyable copy of the text:


   cat file | vim -

Then, clean up the stuff we don't want

   1,/received/d
   $?^\s*For subscribe options?,$d

to strip off the header and footer.
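
Since you asked about other possibilities: the same trimming can be sketched outside Vim. This is a rough Python equivalent, not the Vim method itself. It assumes, as the two patterns above do, that the header ends at the first line containing "received" and the footer starts at the last line beginning with "For subscribe options". The function name and the sample lines are made up for illustration.

```python
import re

# Same footer pattern as in the Vim range above.
FOOTER_START = re.compile(r"^\s*For subscribe options")

def strip_header_and_footer(lines):
    """Return only the lines between the header and the footer."""
    # Header: everything up to and including the first "received" line.
    start = 0
    for i, line in enumerate(lines):
        if "received" in line:
            start = i + 1
            break
    # Footer: everything from the last "For subscribe options" line on.
    end = len(lines)
    for i in range(len(lines) - 1, start - 1, -1):
        if FOOTER_START.match(lines[i]):
            end = i
            break
    return lines[start:end]

# Hypothetical sample input; the real file layout may differ.
sample = [
    "Some header line",
    "Papers received today",
    "astro-ph/0000001",
    "A title about black holes",
    "  For subscribe options, see the list page",
    "trailing footer text",
]
body = strip_header_and_footer(sample)
```

If neither marker is found, the sketch leaves the lines untouched rather than guessing.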

My first-pass solution will end up with duplicate results if more than one of your keywords appear in the same "block" but on different lines:


   :let @a=''
   :g/red\|relativistic/?^\s*astro-ph?,/^\s*astro-ph/-y A
   :%d
   :put a
   :1d
   :wq name_of_output.txt


You can alter that 2nd line for whatever keywords you want:

   red\|relativistic\|black\|supermassive\|intermediate
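
If the duplicates are a problem, one way around them outside Vim is to test each block once as a whole instead of once per matching line. The following is a minimal Python sketch of that idea, not the :g approach above. It assumes each block starts at a line matching ^\s*astro-ph, as the Vim range does; the function names and sample lines are invented for illustration.

```python
import re

# A block starts at a line like "astro-ph/0604565" (same anchor as the Vim range).
BLOCK_START = re.compile(r"^\s*astro-ph")

def split_blocks(lines):
    """Group lines into blocks, each beginning at an astro-ph line."""
    blocks, current = [], None
    for line in lines:
        if BLOCK_START.match(line):
            current = [line]
            blocks.append(current)
        elif current is not None:
            current.append(line)
    return blocks

def select_blocks(lines, keywords, ignore_case=False):
    """Keep each block that contains at least one keyword, exactly once."""
    flags = re.IGNORECASE if ignore_case else 0
    pattern = re.compile("|".join(re.escape(k) for k in keywords), flags)
    return [
        block for block in split_blocks(lines)
        if any(pattern.search(line) for line in block)
    ]

# Hypothetical sample input.
lines = [
    "astro-ph/0000001", "A red giant in a", "relativistic jet",
    "astro-ph/0000002", "Dust in spirals",
    "astro-ph/0000003", "Supermassive black holes",
]
kept = select_blocks(lines, ["red", "relativistic", "black"])
```

As with the Vim pattern, keywords match as substrings, so "red" also hits words such as "discovered".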

If case doesn't matter, you can tack "\c" onto your search pattern to ignore case:


   red\|black\|supermassive\c

I don't know how it behaves with branching, so you might have to wrap the whole thing in parens first to make them all case-insensitive (maybe not):


   \(red\|black\|supermassive\)\c

If you want to highlight your hits as well, you can tweak it like


:g/red\|relativistic/s!!<b>&</b>!g|?^\s*astro-ph...

which, given that you seem to want to HTMLize your results (as hinted at below), will bold each hit.
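
For comparison, the same bolding can be sketched in Python. This is an illustration of the idea rather than a translation of the full :g command, and the function name is hypothetical.

```python
import re

def bold_hits(line, keywords):
    """Wrap every keyword occurrence in <b>...</b>, like s!!<b>&</b>!g."""
    pattern = re.compile("|".join(re.escape(k) for k in keywords))
    # m.group(0) plays the role of "&": the whole matched text.
    return pattern.sub(lambda m: "<b>" + m.group(0) + "</b>", line)

highlighted = bold_hits("a red and relativistic jet", ["red", "relativistic"])
```

Matches are substrings here too, so a word like "discovered" would come out with its last three letters in bold.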

What would be the command line with vim? (or are there other possibilities?)

While you could hack all that into a command line, it might be easier to put those lines in a script, say "foo.vim", and then just source that script on the command line:


   cat input.txt | vim -s foo.vim -

I would also like to know how to replace the

astro-ph/0604565 with <a href=" http://xxx.lanl.gov/pdf/astro-ph/0604565</a>

for all numbers, not only for 0604565 ...

after the ":1d" (that's "one dee", not "ell dee") line, you could put something like

:%s!^\s*astro-ph/\(\d\+\)!<a href="http://xxx.lanl.gov/pdf/astro-ph/\1">&</a>

(all on one line in case my mailer bungs it). Your HTML was a little funky there, so I made some assumptions and cleaned it up a little: The "\1" in the replacement is the number, and the "&" in the replacement is the whole original text (the "astro-ph/#######" bit), so you'll have an HTML link with the original text as the clickable bit.
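
To make the roles of "\1" and "&" concrete, here is the same substitution sketched in Python, where \1 is the captured number and \g<0> stands in for Vim's "&". It is only an illustration under the same assumptions about the link format; the function name and the sample title are made up.

```python
import re

# Same pattern as the Vim substitute: optional indent, "astro-ph/", then digits.
ID_AT_LINE_START = re.compile(r"^\s*astro-ph/(\d+)")

def linkify(line):
    """Turn a leading astro-ph/NNNNNNN into a link to the PDF."""
    return ID_AT_LINE_START.sub(
        r'<a href="http://xxx.lanl.gov/pdf/astro-ph/\1">\g<0></a>',
        line,
    )

linked = linkify("astro-ph/0604565 Some title")
```

Lines that do not start with an astro-ph identifier pass through unchanged.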

I'm sorry I couldn't come up with a clean way to snag just the unique paragraphs easily without having an instance show up as its own result-block.


Anyways, it's at least one sorta-solution to what you describe.

-tim
