Gary Kline wrote:
On Tue, May 15, 2007 at 03:34:14PM +1000, Ian Smith wrote:
On Sat, 12 May 2007 14:34:52 -0700 Gary Kline <[EMAIL PROTECTED]> wrote:
> On Mon, May 14, 2007 at 12:09:07PM -0700, Chuck Swiger wrote:
> > On May 12, 2007, at 12:54 PM, Gary Kline wrote:
> > > This is for those of us who appreciate ASCII or straight
> > > ISO_8859-15 rather than marked up files. I have slapped together
> > > a crude C program that does scotch (or *cleanse*) text of
> > > <B></B> and so on. Still... is there some standalone converter
> > > that gets rid of markup more elegantly? Something where I
> > > can say
> > >
> > > % cmd file_1.html ... file_N.html and output file_1.text ...
> > > file_N.text?
> > Perhaps:
> > lynx -dump file1.html ... > file.text
> > ...?
> Hm, maybe I need Bill Campbell's -force_html switch.
> Yes, seems that way. Using just -dump got most of them, but
> adding -force_html caught them all. Need to script something to
> reformat, but the worst of it's done!
Also, if using Mozilla (so, I would assume, Firefox) the 'Save Page As'
dialog offers a picklist for 'Files of Type' that includes 'Text Files'.
This does a pretty decent job of producing text from HTML files, and is
quicker than firing up lynx (or links) if you're already viewing a page.
Oh sure; I've been saving html in text, ascii/8859-1 for years.
But what I've got, and there are more saved **somewhere**, are
files that are saved by default in markup. I have a slew of
these on different boxen and have been moving them to one place.
Problem is: how to de-html the bunch.
I'm too lazy to write something that would automate what can be
automated--entities like "&foo;" are problematic. So probably the
easiest way would be to create a dehtml.sh script that is just a
wrapper around lynx.
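
A wrapper along those lines might look roughly like the sketch below. It is written in Python rather than sh, and it assumes lynx is installed and on the PATH. The only lynx switches used are the two named in this thread (-dump and -force_html); the function names output_path, lynx_command and dehtml are made up for illustration.

```python
import subprocess
import sys
from pathlib import Path


def output_path(html_file):
    """Map file_N.html to file_N.text, as in the original request."""
    return Path(html_file).with_suffix(".text")


def lynx_command(html_file):
    """Build the lynx invocation using the -dump and -force_html switches."""
    return ["lynx", "-dump", "-force_html", str(html_file)]


def dehtml(html_file):
    """Run lynx on one file and write the rendered text next to the input."""
    # check=True raises CalledProcessError if lynx exits with an error.
    result = subprocess.run(lynx_command(html_file), capture_output=True, check=True)
    target = output_path(html_file)
    # lynx output is written as raw bytes so no charset is guessed here.
    target.write_bytes(result.stdout)
    return target


def main(argv):
    """Convert every file named on the command line."""
    for name in argv:
        print(dehtml(name))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Saved under a hypothetical name such as dehtml.py, it would be run as "python dehtml.py file_1.html ... file_N.html" and leave file_1.text ... file_N.text beside the inputs.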
I don't think I'm the only hacker who wants just-plain-ascii, so
this might make a good project for somebody who's new to C or
perl. That's my two pennies' worth!
If you don't want formatting and the number of tags is trivial, the
solution is fairly simple in Perl (less than 150 lines, if even that).
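
For comparison, here is a minimal sketch of the same no-formatting approach in Python instead of Perl, using only the standard library's html.parser module. It is not the Perl solution referred to above, just an illustration of the idea: throw away the tags, keep the text, and let the parser decode entities such as "&amp;". The names TextExtractor and strip_html are hypothetical, and skipping the contents of script and style elements is an added assumption.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text content, dropping tags and script/style bodies."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        # convert_charrefs=True makes the parser decode entities for us.
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if not self.skip_depth:
            self.parts.append(data)


def strip_html(markup):
    """Return the text of an HTML string with all markup removed."""
    parser = TextExtractor()
    parser.feed(markup)
    parser.close()  # flushes any text still buffered by the parser
    return "".join(parser.parts)
```

For example, strip_html("<B>bold</B> &amp; plain") returns "bold & plain". No layout is attempted, so paragraphs, lists and tables come out as whatever whitespace the source file happened to contain.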