magnet wrote:

Hi all,
I require some pointers for the following problem please:

I have 100+ files of HTML code, e.g. page0001.htm, page0002.htm, page0003.htm, etc., and I need to remove the first 20 lines and the last 17 lines from each file, then save each with a ".txt" suffix. Where do I start with this?

I used to do this sort of thing on much larger numbers of pages on my trusty Amiga, using a DOS script and an editor with an ARexx port running overnight, but I haven't a clue how to even start on my Mandrake 9.2 box.

As always, any help is much appreciated :)

magnet




------------------------------------------------------------------------



You may want to use sed for the actual editing. You can also use head and tail to chop off x number of lines. Sed tends to be more powerful, and you can put all the editing options in one command file. As far as feeding the file to sed, consider something like:

# edit.sed is only a placeholder name for your sed command file
for i in *.htm ; do
 NAME=`basename "$i" .htm`
 sed -f edit.sed "$i" > "$NAME.txt"
done

(I may have some of the syntax wrong - it is that kind of day... But you should get the idea.)
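
For the specific "drop the first 20 and last 17 lines" job, here is a rough sketch of one way to do it with wc and sed. It assumes the files match page*.htm as in the question and that every file ends with a newline (otherwise wc -l counts one line short). The name strip_lines is made up for illustration, and this is only one possible approach, not the only one.

```shell
#!/bin/sh
# strip_lines FILE HEAD TAIL
# Print FILE without its first HEAD lines and its last TAIL lines.
# (strip_lines is a hypothetical helper name, not a standard tool.)
strip_lines() {
    file=$1
    head_n=$2
    tail_n=$3
    total=$(wc -l < "$file")      # number of lines in the file
    first=$((head_n + 1))         # first line to keep
    last=$((total - tail_n))      # last line to keep
    # If the file is too short, there is nothing left to print.
    [ "$last" -ge "$first" ] || return 0
    sed -n "${first},${last}p" "$file"
}

# Write a trimmed .txt copy of each page; the originals are left alone.
for f in page*.htm; do
    [ -e "$f" ] || continue       # skip if the pattern matched nothing
    strip_lines "$f" 20 17 > "${f%.htm}.txt"
done
```

Try it on a copy of one or two files first and check the resulting .txt by eye before letting it loose on the whole directory.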
Mikkel


--
   Do not meddle in the affairs of dragons,
for you are crunchy and taste good with ketchup.

