Volker,

A couple of days late, but ....

First, less is rather clever at viewing (and saving) HTML files these days. Have you looked at it or tried it for doing this?

Secondly, there are html2text and w3m. I have used both of these tools and found them to be alright, for my purposes anyway.

html2text is a command line utility, written in C++, that converts HTML documents into plain text.

homepage: http://userpage.fu-berlin.de/~mbayer/tools/html2text.html
bug page: http://userpage.fu-berlin.de/~mbayer/tools/problems_html2text.txt
NB: please read the caveat re gcc versions on this page.
download of most recent: html2text version 1.3.1 (stable, released 2002-09-02)
http://userpage.fu-berlin.de/~mbayer/tools/html2text.html#download

w3m is like links & lynx and *has* a dump option (with controllable column width) which is of high repute. Well, it has been quite highly regarded within the LDP[1], for instance, though current opinion may differ.

English homepage: http://w3m.sourceforge.net/

Thirdly, other beasts of similar function, though of less recent vintage, may be found at:
http://www.ibiblio.org/pub/Linux/apps/www/converters/!INDEX.html

Last off is Vilistextum, which I have never used and only mention because I know that a lot of European mutt users are fond of it.

homepage: http://www.mysunrise.ch/users/bhaak/vilistextum/

Good luck & cheers,
peter

[1] the Linux Documentation Project

=================================================

On Sat, 30 Aug 2003 23:27:45 +1200
Volker Kuhlmann <[EMAIL PROTECTED]> wrote:

> What do people use to convert html pages to a legible formatted text
> representation? I find that netscape 4.[78] is by far the best (save as
> text), I can't recall it having ever let me down. Occasionally, lynx
> -dump and html2text produce better results, but frequently both of them
> also produce downright rubbish (it all depends on the particular page).
> Mozilla unfortunately didn't copy the function from netscape 4, and
> produces output no better than that of html2text/lynx, but the latter 2
> are considerably less bloated (understatement).
>
> I expect netscape 4 to be dropped any time, and I'd prefer a command
> line solution. Is there anything better than netscape 4?
>
> Thanks,
>
> Volker
>
> --
> Volker Kuhlmann is possibly list0570 with the domain in header
> http://volker.dnsalias.net/ Please do not CC list postings to me.
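
For anyone wanting to script the command line converters discussed in this thread, here is a minimal sketch. It assumes Python 3 and that at least one of w3m, lynx or the C++ html2text is on the PATH. The function names, the order in which tools are tried and the 80-column default are hypothetical choices made for illustration; only the dump and width flags are taken from the tools' own documentation, so check them against the versions you have installed.

```python
import shutil
import subprocess


def dump_command(tool, html_path, width=80):
    """Return the argv list asking `tool` to render html_path as plain text.

    Hypothetical helper: the name and the default width are illustrative.
    """
    if tool == "w3m":
        # -dump writes the rendered page to stdout, -cols sets the column width
        return ["w3m", "-dump", "-cols", str(width), "-T", "text/html", html_path]
    if tool == "lynx":
        # -nolist suppresses the numbered list of links after the text
        return ["lynx", "-dump", "-nolist", f"-width={width}", html_path]
    if tool == "html2text":
        # assumes the C++ html2text, not a different program of the same name
        return ["html2text", "-width", str(width), html_path]
    raise ValueError(f"unknown tool: {tool}")


def html_to_text(html_path, width=80, tools=("w3m", "lynx", "html2text")):
    """Try each converter in order; return the text from the first one installed."""
    for tool in tools:
        if shutil.which(tool) is None:
            continue  # not installed, try the next one
        result = subprocess.run(
            dump_command(tool, html_path, width),
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout
    raise RuntimeError("none of the converters are installed")
```

Since results depend on the particular page, reordering the `tools` tuple is an easy way to compare the converters on a page that one of them renders badly.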
