> Steve Palincsar wrote:
>
> > I'm using Netscape 4.x's command line interface at work as part of a
> > processing system that loads HTML pages and saves them as text under
> > program control. I haven't seen anything discussing whether Mozilla
> > also supports a command line interface, and as far as I've been able
> > to discover, Mozilla doesn't appear to support save as text at all at
> > this point.
Ben Bucksch writes:
> I don't think Mozilla supports this the way you describe (never heard of
> that 4.x feature).
Right, I don't know of a way to do this with the Mozilla application
itself (file a bug/RFE!).
> However, if you have retrieved the HTML page anyhow, you might be able
> to use one of the test apps to convert it to plaintext. Maybe they
> are available in binary form in the zipfile/tarball builds from
> mozilla.org; maybe you need to compile them yourself.
>
> The HTML->TXT converter in Mozilla is nsPlainTextSerializer.cpp,
> formerly nsHTMLToTXTSinkStream.cpp.
Specifically, you might want to look at the TestOutput program, which is
built in the Mozilla tree if you enable tests. It can function as a
standalone HTML-to-text converter using the same code Mozilla uses
for converting mail messages to plaintext. The code for it is in
Convert.cpp, and sample usage is in TestOutSinks (see
http://lxr.mozilla.org/seamonkey/ to find or view these files
if you don't already have a source tree).
The TestOutput program is small, and it or something like it could
easily be distributed if there were a need for it.
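
To give a rough idea of what "something like it" could look like as a
standalone filter, here is a minimal sketch in Python. It does not use
any Mozilla code and is not how TestOutput or nsPlainTextSerializer
work; it just uses the standard html.parser module, and the names
TextExtractor and html_to_text are made up for this illustration. The
handling of block tags and whitespace is a simplifying assumption, so
the output will differ from what Mozilla's serializer produces.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Hypothetical helper: collects visible text from an HTML document."""

    # Tags treated as line breaks (an assumption, not Mozilla's rules).
    BLOCK_TAGS = {"p", "div", "br", "li", "tr", "pre", "blockquote",
                  "h1", "h2", "h3", "h4", "h5", "h6"}
    # Tags whose contents are never shown to the reader.
    SKIP_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1
        elif tag in self.BLOCK_TAGS:
            self._chunks.append("\n")

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in self.BLOCK_TAGS:
            self._chunks.append("\n")

    def handle_data(self, data):
        # Keep text only when we are outside <script>/<style>.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        raw = "".join(self._chunks)
        # Collapse runs of whitespace within each line and drop empty lines.
        lines = [" ".join(line.split()) for line in raw.splitlines()]
        return "\n".join(line for line in lines if line)


def html_to_text(html):
    """Convert an HTML string to plain text (rough sketch only)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()
```

A program that has already fetched a page could pass the markup to
html_to_text() and write the result to a file, which covers the "save
as text under program control" case without involving the browser.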
...Akkana