Hi Stefano,
Thanks for your answer. In the past, I already tried to do this with the javax.mail.Message class. It was not a big success: the variety of incoming mails caused lots of issues, so it never got into production.
For each parsed Message, I built in parallel an XHTML page representing its content (From:, To:, Subject:, Date: and body content).
When the attachment was a message, I recursively went into it and appended the information found to the XHTML page I had previously created.
When I found HTML, I tried to transform it to XHTML with tidy, then to PDF with iText.
When the XHTML transformation failed and the mail had a multipart/alternative, I rendered the text part to PDF instead.
When I found attached images, I rendered them to PDF
When I found office documents I didn't transform them
After that I merged all the created PDFs into one big PDF and checked it in to the Documentum DB (one PDF per message).
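
To make the steps above concrete, here is a minimal sketch of the part-selection logic only (no PDF rendering). MailPart and RenderPlan are hypothetical names invented for this illustration; they are not javax.mail or mime4j types. The htmlOk predicate stands in for "the tidy/XHTML conversion succeeded", which is an assumption about how the failure would be detected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.Predicate;

/** Hypothetical in-memory stand-in for a parsed MIME part. */
final class MailPart {
    final String contentType;  // bare type without parameters, e.g. "text/html"
    final String text;         // body for text/* leaves, otherwise null
    final List<MailPart> children = new ArrayList<>();

    MailPart(String contentType, String text, MailPart... kids) {
        this.contentType = contentType.toLowerCase(Locale.ROOT);
        this.text = text;
        for (MailPart k : kids) children.add(k);
    }
}

final class RenderPlan {
    /** Collects, in document order, the leaf parts that would be rendered to PDF. */
    static List<MailPart> collect(MailPart root, Predicate<MailPart> htmlOk) {
        List<MailPart> out = new ArrayList<>();
        walk(root, htmlOk, out);
        return out;
    }

    private static void walk(MailPart p, Predicate<MailPart> htmlOk, List<MailPart> out) {
        if (p.contentType.equals("multipart/alternative")) {
            // Prefer HTML; fall back to plain text when HTML is missing or not convertible.
            // Simplification: only direct children are inspected.
            MailPart html = firstOfType(p, "text/html");
            MailPart chosen = (html != null && htmlOk.test(html)) ? html : firstOfType(p, "text/plain");
            if (chosen != null) out.add(chosen);
        } else if (p.contentType.startsWith("multipart/") || p.contentType.equals("message/rfc822")) {
            // Containers and attached messages: recurse into their children.
            for (MailPart child : p.children) walk(child, htmlOk, out);
        } else if (p.contentType.startsWith("text/") || p.contentType.startsWith("image/")) {
            out.add(p);
        }
        // Everything else (e.g. office documents) is skipped, as described above.
    }

    private static MailPart firstOfType(MailPart parent, String type) {
        for (MailPart child : parent.children) {
            if (child.contentType.equals(type)) return child;
        }
        return null;
    }
}
```

Calling RenderPlan.collect(root, part -> false) simulates the case where every HTML conversion fails, so each multipart/alternative contributes its text/plain part.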

The aim of the project is not to have a pretty rendering of all mail, it's just to keep track of messages our client sent.

I faced three big issues:
**************************
0/ multipart/mixed with inline image content referenced as "cid:...."
1/ Like you said, HTML to PDF rendering is difficult, and (tidy+iText or multipart/alternative) did not always work. If only I could use the Mozilla components to render it, but I don't understand them well enough.
2/ Special characters and encoding problems in headers and attached file names
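
For issues 0/ and 2/, a stdlib-only sketch of the two fix-ups might look like the following. MailFixups, resolveCids and decodeHeader are hypothetical names for this illustration. The first method assumes the inline images have already been extracted and that a map from Content-ID (without angle brackets) to a file path or URL is available. The second decodes RFC 2047 encoded-words in a simplified way; javax.mail also ships MimeUtility.decodeText for this purpose.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.Charset;
import java.util.Base64;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class MailFixups {
    private static final Pattern CID =
            Pattern.compile("cid:([^\"'\\s>)]+)", Pattern.CASE_INSENSITIVE);
    private static final Pattern ENCODED_WORD =
            Pattern.compile("=\\?([^?\\s]+)\\?([bBqQ])\\?([^?\\s]*)\\?=");

    /** Replaces cid: references in HTML with the mapped location; unknown ids are left as-is. */
    static String resolveCids(String html, Map<String, String> locations) {
        Matcher m = CID.matcher(html);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String target = locations.get(m.group(1));
            m.appendReplacement(sb, Matcher.quoteReplacement(target != null ? target : m.group()));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    /** Decodes =?charset?B|Q?text?= words in a header value; undecodable words are left as-is. */
    static String decodeHeader(String raw) {
        // Whitespace between two adjacent encoded-words is not part of the text.
        String joined = raw.replaceAll("(\\?=)\\s+(=\\?)", "$1$2");
        Matcher m = ENCODED_WORD.matcher(joined);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String decoded;
            try {
                Charset cs = Charset.forName(m.group(1));
                byte[] bytes = m.group(2).equalsIgnoreCase("B")
                        ? Base64.getDecoder().decode(m.group(3))
                        : decodeQ(m.group(3));
                decoded = new String(bytes, cs);
            } catch (IllegalArgumentException e) {
                decoded = m.group(); // unknown charset or malformed payload
            }
            m.appendReplacement(sb, Matcher.quoteReplacement(decoded));
        }
        m.appendTail(sb);
        return sb.toString();
    }

    /** "Q" encoding: '_' means space, "=XX" is a hex-encoded byte. */
    private static byte[] decodeQ(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '_') {
                out.write(' ');
            } else if (c == '=' && i + 2 < s.length()) {
                out.write(Integer.parseInt(s.substring(i + 1, i + 3), 16));
                i += 2;
            } else {
                out.write(c);
            }
        }
        return out.toByteArray();
    }
}
```

This does not cover percent-encoded cid: URLs or RFC 2231 encoded file name parameters, which would need separate handling.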

Benoît.

On 24.01.2011 09:34, Stefano Bagnara wrote:
2011/1/24 Noss Benoit<benoit.n...@secu.lu>:
I don't want to spam you with this question, but I would like to make a
headless PDF mail renderer.
In my project, I want to batch process incoming mails and inject them in a
content management DB as PDF.
Am I on the right way if I use your MimeStreamParser combined with a custom
handler to make rendering?
Can I access the content? Do you suggest something else for this?
You can use mime4j but you will have to manually deal with
attachments, multipart/alternative to decide what part to render,
multipart/mixed to get inline images streams to be placed in the html.
And how do you plan to do headless html to pdf rendering? I think this
is the difficult task. Parsing with mime4j is easy: just look at some
example.

Stefano




