Re: Issue Decoding PDF Attachments

Norman Maurer Mon, 27 Dec 2010 12:44:13 -0800

Maybe you could provide a test case? An email which reproduces it
would help too.


Bye,
Norman


2010/12/27 Max Gravitt <[email protected]>:
> Hi,
>
> Yes, I am using version 0.6.
>
> thanks
> Max
>
> On Dec 27, 2010, at 2:17 PM, Norman Maurer wrote:
>
>> Hi there,
>>
>> What version of mime4j?
>>
>> Bye
>> Norman
>>
>>
>> 2010/12/27 Max Gravitt <[email protected]>:
>>> Hi,
>>>
>>> I have an application (running on Google App Engine) that strips 
>>> attachments from inbound emails and saves them as a byte[] in the JDO data 
>>> store.  I think I'm running into a decoding issue, but I'm unsure of the 
>>> true issue or the resolution.  I'm finding that for some files, it embeds 
>>> equal signs in places where the original document doesn't have any equal 
>>> signs.  I've found that MS documents and HTML are rather tolerant of this 
>>> behavior, but PDFs tend to get corrupted when this happens.  Also, it doesn't 
>>> happen with all PDFs and it seems that it only happens when the attachment 
>>> has a transfer encoding of "quoted-printable".
>>>
>>> I'm using MimeStreamParser and I extended SimpleContentHandler (bodyDecoded 
>>> method).  Then, I use IOUtils.toByteArray(InputStream) to get the bytes 
>>> that I save.  Any idea of what I may be missing?
>>>
>>> Below is an example of the contents of a PDF as shown by the "more" command.  You 
>>> can see the equal signs in the second representation of the file.
>>>
>>> Original file (Good):
>>> 1 0 obj
>>> <<
>>> /CreationDate (D:20101203120005)
>>> /Producer (SCS2PDF v1.0 (\251 BeppeCosta, 2005))
>>> /Title (PRINT1)
>>> >>
>>> endobj
>>> 2 0 obj
>>> <<
>>> /Type /Catalog
>>> /Pages 3 0 R
>>> >>
>>> endobj
>>>
>>> File Snippet After Parsing, Saving, and Retrieving (Bad):
>>> 1 0 obj
>>> <<
>>> /CreationDate =
>>> (D:20101203120005)
>>> /Producer (SCS2PDF v1.0 (\251 BeppeCosta, =
>>> 2005))
>>> /Title (PRINT1)
>>> >>
>>> endobj
>>> 2 0 obj
>>> <<
>>> /Type =
>>> /Catalog
>>> /Pages 3 0 R
>>> >>
>>> endobj
>>>
>>> Any thoughts?
>>> thanks!
>>> MG
>>>
>>>
>
>
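
For reference: the "=" at the end of each broken line in the bad snippet
looks like a quoted-printable soft line break (RFC 2045, section 6.7), where
"=" followed by a line break means "continue on the next line" and both are
dropped on decoding. That would suggest the bytes were saved without being
decoded, but this is a guess from the snippet, not a confirmed diagnosis. A
minimal sketch of what decoding should do, using only the JDK (the class
QuotedPrintableSketch and its decode method are hypothetical and not part of
mime4j):

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, NOT mime4j code: a simplified quoted-printable decoder.
// It handles soft line breaks and =XX escapes; it does not strip trailing
// whitespace before a soft line break, which a full decoder would also do.
final class QuotedPrintableSketch {

    static byte[] decode(byte[] in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(in.length);
        int i = 0;
        while (i < in.length) {
            byte b = in[i];
            if (b != '=') {
                out.write(b);                       // literal byte, copy as-is
                i++;
            } else if (i + 2 < in.length && in[i + 1] == '\r' && in[i + 2] == '\n') {
                i += 3;                             // soft line break "=CRLF": drop it
            } else if (i + 1 < in.length && in[i + 1] == '\n') {
                i += 2;                             // tolerate "=LF" as a soft break too
            } else if (i + 2 < in.length && hex(in[i + 1]) >= 0 && hex(in[i + 2]) >= 0) {
                out.write((hex(in[i + 1]) << 4) | hex(in[i + 2])); // "=XX" escape
                i += 3;
            } else {
                out.write(b);                       // malformed escape: pass through
                i++;
            }
        }
        return out.toByteArray();
    }

    // Returns the value of an ASCII hex digit, or -1 if the byte is not one.
    private static int hex(byte b) {
        return Character.digit(b & 0xFF, 16);
    }

    public static void main(String[] args) {
        // Sample modeled on the "Bad" snippet above, with CRLF line endings.
        String encoded = "/CreationDate =\r\n(D:20101203120005)\r\n/Type =\r\n/Catalog\r\n";
        byte[] decoded = decode(encoded.getBytes(StandardCharsets.US_ASCII));
        System.out.print(new String(decoded, StandardCharsets.US_ASCII));
    }
}
```

If the saved attachment still contains "=" followed by a line break, comparing
it against the output of a decoder like this one may help confirm whether the
quoted-printable body was stored in encoded form.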
