Thomas Lord <l...@emf.net> writes:

Hi Thomas,

> I am trying to piece together a simple
> literate programming system that takes
> HTML as input and spews out source files.

Are you aware of pandoc (http://johnmacfarlane.net/pandoc/)? Pandoc can
import HTML files and export them as Org-mode files.

,------------------------------------------------------------------
| About pandoc
| 
| If you need to convert files from one markup format into another,
| pandoc is your swiss-army knife. Pandoc can convert documents in 
| markdown, reStructuredText, textile, HTML, or LaTeX to
| 
|   * HTML formats: XHTML, HTML5, and HTML slide shows using Slidy,
|     S5, or DZSlides.
|   * Word processor formats: Microsoft Word docx, OpenOffice/
|     LibreOffice ODT, OpenDocument XML
|   * Ebooks: EPUB
|   * Documentation formats: DocBook, GNU TexInfo, Groff man pages
|   * TeX formats: LaTeX, ConTeXt, LaTeX Beamer slides
|   * PDF via LaTeX
|   * Lightweight markup formats: Markdown, reStructuredText, 
|     AsciiDoc, MediaWiki markup, Emacs Org-Mode, Textile
`------------------------------------------------------------------

Maybe it could take care of the HTML, leaving only the postprocessing to
you?
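
A rough sketch of how that split might look, in case it helps. It
assumes pandoc is installed and on your PATH, and that the code in your
HTML ends up in #+BEGIN_SRC blocks in the Org output (that depends on
how the HTML marks up its code, so check the output of your pandoc
version first). The function names html_to_org and extract_src_blocks
are made up for illustration.

```python
import re
import subprocess
import textwrap


def html_to_org(html_path):
    """Convert an HTML file to Org text by calling the pandoc CLI.

    Assumes a `pandoc` executable is available on PATH.
    """
    result = subprocess.run(
        ["pandoc", "-f", "html", "-t", "org", html_path],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


# Matches "#+BEGIN_SRC <lang> ... #+END_SRC" in either upper or lower case.
SRC_BLOCK = re.compile(
    r"^[ \t]*#\+begin_src[ \t]+(\S+)[^\n]*\n(.*?)^[ \t]*#\+end_src[ \t]*$",
    re.IGNORECASE | re.MULTILINE | re.DOTALL,
)


def extract_src_blocks(org_text):
    """Return a list of (language, code) pairs, one per source block.

    Common leading indentation inside each block is removed.
    """
    return [
        (m.group(1), textwrap.dedent(m.group(2)))
        for m in SRC_BLOCK.finditer(org_text)
    ]
```

The postprocessing would then be something like
extract_src_blocks(html_to_org("doc.html")), followed by writing each
block to whatever source file it belongs to.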

-- 
cheers,
Thorsten

