Just an idea...

Another "format" that would be useful for the website
to support is a tarballed collection of HTML.
Many documents are distributed as a tarball (.tar.gz) of
HTML files; the HTML 4.01 specification at the W3C is
one example.

It'd be nice to be able to point the parser or the website
at that file and say "Pluck it", and get a .pdb file with
exactly that content.  Ideally it wouldn't include the
local file URLs, since links to local files would be
worthless in the finished document.
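
To make the tarball idea concrete, here is a rough sketch of how the
HTML pages could be pulled out of a .tar.gz using Python's standard
tarfile module.  This is only an illustration: the function name
html_members and the suffix list are my own assumptions, not anything
the parser actually provides.

```python
import io
import tarfile

# Assumption: only these suffixes count as HTML pages.
HTML_SUFFIXES = (".html", ".htm")


def html_members(tar_bytes):
    """Return {archive path: raw bytes} for each HTML file in a tarball."""
    pages = {}
    # Mode "r:*" lets tarfile detect the compression (gzip, bzip2, none).
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as archive:
        for member in archive.getmembers():
            # Skip directories, symlinks, and non-HTML files such as images.
            if member.isfile() and member.name.lower().endswith(HTML_SUFFIXES):
                pages[member.name] = archive.extractfile(member).read()
    return pages
```

The keys of the returned dictionary are the paths inside the archive,
which is exactly the "what was plucked" list that any link handling
would need later on.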

Indeed, it'd be nice if the parser had the following options:
* Don't include hypertext links to local URLs that weren't
   plucked.  If the local file wasn't plucked (e.g., because
   it's the wrong filetype, isn't there, or wasn't included in
   the selection criteria), don't link to it.  I think someone
   else mentioned this idea.
* Define a "base URL".  If a file isn't included, strip the
   local prefix off its URL and replace it with the given
   "base URL".  That would make it easier to download the
   tarball first, THEN generate an electronic book.
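
The two options above could be sketched roughly as follows.  This is
a minimal illustration under my own assumptions: resolve_link, its
parameters, and the example.org URL are hypothetical, and "included"
is assumed to be the set of archive paths that actually got plucked.

```python
import posixpath
from urllib.parse import urljoin, urlsplit


def resolve_link(href, current_path, included, base_url=None):
    """Decide what to do with one href found in a plucked page.

    Returns the archive path (plus fragment) if the target was plucked,
    a base_url-based absolute URL if it wasn't and base_url is given,
    or None to mean "drop this link".  Hrefs that are already absolute,
    or that only point within the same page, come back unchanged.
    """
    parts = urlsplit(href)
    if parts.scheme or parts.netloc:
        return href  # already a remote URL (http:, mailto:, ...)
    if not parts.path:
        return href  # same-page link such as "#top"

    # Resolve the relative link against the directory of the current page.
    target = posixpath.normpath(
        posixpath.join(posixpath.dirname(current_path), parts.path))
    fragment = "#" + parts.fragment if parts.fragment else ""

    if target in included:
        return target + fragment
    if base_url is not None:
        # base_url should end in "/", or urljoin replaces its last segment.
        return urljoin(base_url, target) + fragment
    return None  # option 1: the file wasn't plucked, so don't link to it
```

For example, with included = {"index.html", "struct/tables.html"}, a
link to "../images/logo.png" from "struct/tables.html" is dropped when
no base URL is set, and becomes
"http://example.org/docs/images/logo.png" when base_url is
"http://example.org/docs/".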

Anyway, just thinking out loud.  I hope someone finds these
ideas useful...


--- David A. Wheeler
     [EMAIL PROTECTED]
