I'm evaluating elinks as a candidate component for a toolchain in
which I have to translate html to plain text, obeying some
house-style rules (namely, in the production of ebooks for Project
Gutenberg). For instance, I have to render all
<strong>tags</strong> as *tags*, add four blank lines before any
<h2> and two after, insert white-space indentation in some places,
replace <span dir="rtl"></span> with unicode directionals, and so on.

Elinks gives me a much better solution than the other html-to-text
tools I've assessed (w3m, links2, lynx, html2text.py, netrik),
because it honors the css to some extent. However, I have some
questions about what is supported and what is not, and to what
extent the rendering can be customized, if possible without hacking
the source code.

Some of the transformations I need for the required text rendering
could in fact be done by preliminary regexp substitutions in the
source html, and/or by subsequent substitutions in the result.
However, the toolchain would be a lot more streamlined if I could
add some lines to my css and let elinks do the work. For instance,

    strong:before { content: '*'; }
    strong:after { content: '*'; }

would take care of the <strong> example above.

I'm typically using elinks 0.12pre5 with

    elinks -dump-width 80 -no-numbering -no-references -dump 1 $1.html > $1.txt

The first batch of questions that comes up is:

- are the :before and :after css selectors honored? [not as far as
  I can see; could they be?]
- are the margin, padding and text-indent properties honored in some
  way, by adding an appropriate number of blanks? [ditto]
- why does -dump produce text with 4 blanks at the beginning of each
  line, and can this be changed?
- how can I change the number of blank lines before and after <h1>,
  <h2>, ...?
- when elinks dumps, is it using a particular @media selector?
- is the unstable snapshot more advanced in any respect relevant to
  me?

Thanks in advance for any hint, and for providing this nice piece of
software.

Enrico
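
P.S. To make the regexp fallback mentioned above concrete, here is a
minimal sketch of the kind of pre- and post-processing I would
otherwise have to wrap around elinks. The function names
preprocess_html and postprocess_text are made up for illustration,
and the sketch assumes simple, non-nested <strong> elements and the
4-blank left margin I observe in the dump; it is not a general html
parser.

```python
import re


def preprocess_html(html: str) -> str:
    """Hypothetical pre-pass: turn <strong>x</strong> into *x* in the source html."""
    # Non-greedy match so each <strong>...</strong> pair is handled on its own.
    # Assumption: <strong> elements are not nested inside one another.
    return re.sub(
        r"<strong\b[^>]*>(.*?)</strong>",
        r"*\1*",
        html,
        flags=re.IGNORECASE | re.DOTALL,
    )


def postprocess_text(text: str) -> str:
    """Hypothetical post-pass: drop the 4 leading blanks from each dumped line."""
    # Only lines that really start with 4 blanks are touched; empty lines
    # and shorter indents are left alone.
    return re.sub(r"^ {4}", "", text, flags=re.MULTILINE)
```

The html would go through preprocess_html before being handed to
elinks -dump, and the resulting text through postprocess_text
afterwards. This is exactly the extra machinery I would like to
avoid if the css route can be made to work.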