[ 
https://issues.apache.org/jira/browse/PDFBOX-866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12922232#action_12922232
 ] 

Adam Nichols commented on PDFBOX-866:
-------------------------------------

The original PDF has two nodes which both refer to object 21 (the colorspace, 
complete with the Index map).  When PDFBox writes out these two nodes, it does 
not write out a reference to the colorspace; instead it writes out the entire 
colorspace twice.  This would be fine except that the colorspace is encrypted 
inline, so the first time it's written out correctly, but the second time 
it's written out double encrypted (i.e. corrupted).

We can't solve this in the parser because if objects A and B reference C, you 
can't make a copy of C when parsing objects A and B because C hasn't been read 
yet.  So the solution must be to make a copy before encrypting and writing out.  
I implemented this for COSStrings in visitFromArray() of COSWriter and it 
solved the issue for this PDF and did not cause any regression issues with the 
PDF from PDFBOX-99.
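
To illustrate the idea, here is a minimal, self-contained sketch.  It is not 
the actual COSWriter patch: SharedString, encryptInPlace(), writeWithoutCopy() 
and writeWithCopy() are hypothetical names made up for this example, and the 
"cipher" is a toy that just adds 1 to each byte.  It only shows why encrypting 
a shared object in place corrupts the second write, and why encrypting a local 
copy avoids that.

```java
import java.util.Arrays;

// Hypothetical sketch; none of these names come from PDFBox.
final class SharedObjectEncryptionSketch {

    /** Stand-in for a mutable object (like C) referenced by two parents (A and B). */
    static final class SharedString {
        byte[] bytes;

        SharedString(byte[] bytes) {
            this.bytes = bytes;
        }
    }

    /** Toy in-place "cipher": adds 1 to every byte. Not real PDF encryption. */
    static void encryptInPlace(SharedString s) {
        for (int i = 0; i < s.bytes.length; i++) {
            s.bytes[i] = (byte) (s.bytes[i] + 1);
        }
    }

    /** Problem case: encrypts the shared instance itself every time it is written. */
    static byte[] writeWithoutCopy(SharedString s) {
        encryptInPlace(s);
        return s.bytes.clone();
    }

    /** Proposed approach: encrypt a local copy so the shared instance stays untouched. */
    static byte[] writeWithCopy(SharedString s) {
        SharedString copy = new SharedString(s.bytes.clone());
        encryptInPlace(copy);
        return copy.bytes;
    }

    public static void main(String[] args) {
        // Two parents writing the same shared object without copying.
        SharedString shared = new SharedString(new byte[] {10, 20, 30});
        byte[] first = writeWithoutCopy(shared);
        byte[] second = writeWithoutCopy(shared);
        System.out.println(Arrays.equals(first, second)); // prints "false"

        // The same two writes, each made from a fresh local copy.
        SharedString sharedAgain = new SharedString(new byte[] {10, 20, 30});
        byte[] firstCopy = writeWithCopy(sharedAgain);
        byte[] secondCopy = writeWithCopy(sharedAgain);
        System.out.println(Arrays.equals(firstCopy, secondCopy)); // prints "true"
    }
}
```

In the sketch the copy is a local variable that goes out of scope as soon as 
the write returns, so the extra memory is only held for the duration of a 
single write.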

I was wondering how others felt about this solution.  It's in the COSWriter, 
which is only used for saving, so it should have a minimal impact (i.e. the 
parser, text extraction, and so forth are not affected).  My only concern is 
that we'll run into this same issue in other places and need to make copies 
there as well.  
The memory usage increase will be small as I'm only adding a local variable 
which is destroyed after executing the accept() method.  I will be doing a 
large number of tests to make sure this patch actually solves the problem and 
does not cause any undesired side effects.  If my tests come out okay and there 
are no objections, I will commit the patch.

> Indexed images are sometimes corrupted when encrypting the PDF
> --------------------------------------------------------------
>
>                 Key: PDFBOX-866
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-866
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 0.7.0, 0.7.1, 0.7.2, 0.7.3, 0.8.0-incubator, 1.0.0, 
> 1.1.0, 1.2.0, 1.2.1, 1.3.0
>         Environment: Windows Vista (32-bit)
> Java 1.5
> Head tag of PDFBox
>            Reporter: Adam Nichols
>            Assignee: Adam Nichols
>         Attachments: bitmaptest.pdf, HelpOrderingAppraisal_page1.unc.pdf
>
>
> While PDFBOX-99 did fix this problem with some images, it did not solve the 
> problem in 100% of the cases.  I'll be attaching a file which demonstrates 
> the problem and I plan on fixing this once I figure out what's going awry.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
