Hello Stefano,
after having read your mail, I think I can keep a lot of the code I wrote.
I didn't know of a better tool than tidy to generate the XHTML, and it
often failed, so I resigned myself to rendering HTML.
In fact, if I can replace tidy with validator.nu to transform
the HTML and then let iText render the XHTML output to a PDF,
I have mostly solved my problem: I don't need two different processes
to render a mail to PDF, and I can stay in Java.
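
To make sure I understood the idea, here is a rough sketch of how I picture
the two stages living in one Java process. All names here are hypothetical
and mine alone; the two functions are only placeholders for whatever
validator.nu and iText actually require, which I have not wired in yet.

```java
import java.util.function.Function;

// Hypothetical sketch: none of these names come from validator.nu or iText.
final class MailToPdfPipeline {
    // Stage 1: tolerant HTML in, well-formed XHTML out (validator.nu would sit behind this).
    private final Function<String, String> htmlToXhtml;
    // Stage 2: XHTML in, PDF bytes out (iText would sit behind this).
    private final Function<String, byte[]> xhtmlToPdf;

    MailToPdfPipeline(Function<String, String> htmlToXhtml,
                      Function<String, byte[]> xhtmlToPdf) {
        this.htmlToXhtml = htmlToXhtml;
        this.xhtmlToPdf = xhtmlToPdf;
    }

    // Runs both stages in the same JVM: normalise first, then render.
    byte[] convert(String mailHtml) {
        return xhtmlToPdf.apply(htmlToXhtml.apply(mailHtml));
    }
}
```

Keeping the stages behind plain functions should let me swap tidy for
validator.nu without touching the rendering side.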
Many thanks,
I understand quickly when you explain slowly ;-)
Benoît
On 25.01.2011 10:30, Stefano Bagnara wrote:
2011/1/25 Noss Benoit<benoit.n...@secu.lu>:
Hi, after your comments, I now think I have to split my project into two
parts:
1/ The first part has to parse the message and write an html or xhtml page
representing the output I want for the message
2/ The second part has to render the previously generated html to PDF
I do that in a single step because of the Content-ID ("cid:") image references.
BTW, logically you need two separate components: a parser and a renderer.
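
One way to keep the two components separate despite the "cid:" references is
to rewrite them between the two steps. The following is only a minimal
sketch under assumptions: the class name, the method, and the map from
Content-ID to a URL the renderer can load are all hypothetical, and a regex
is used here for brevity where a real parser would be more robust.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: the names are illustrative and not taken from any library.
final class CidRewriter {
    // Matches src="cid:..." or src='cid:...'; captures the quote and the Content-ID.
    private static final Pattern CID_SRC =
            Pattern.compile("(?i)\\bsrc\\s*=\\s*([\"'])cid:([^\"']+)\\1");

    // Replaces each cid: reference that has a known URL in the map.
    // Keys are Content-IDs without the surrounding angle brackets.
    // References with an unknown Content-ID are left untouched.
    static String rewrite(String html, Map<String, String> urlByContentId) {
        Matcher m = CID_SRC.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String url = urlByContentId.get(m.group(2));
            String replacement = (url == null)
                    ? m.group(0)
                    : "src=" + m.group(1) + url + m.group(1);
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

The parser side would save each inline attachment somewhere and fill the map;
the renderer then only ever sees ordinary URLs.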
I tried Flying Saucer in the past; it can generate PDF, but it needs strict
XHTML as input, and lots of mails are not strict XHTML
I've had very good results parsing the html with validator.nu parser:
http://about.validator.nu/htmlparser/
I parsed thousands of HTML emails and tested most HTML parsers out there,
and validator.nu was the only one that parsed them all.
On the one hand, I think I can improve my parser to get the html I want for
most of the mails I have to transform.
On the other hand, I don't know the OpenOffice SDK, WebKit or Mozilla, and
html rendering will be the hardest part...
If you used Flying Saucer in the past, then go ahead with that.
Stefano