[ 
https://issues.apache.org/jira/browse/PDFBOX-2772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14507437#comment-14507437
 ] 

ASF subversion and git services commented on PDFBOX-2772:
---------------------------------------------------------

Commit 1675423 from [~tilman] in branch 'pdfbox/branches/1.8'
[ https://svn.apache.org/r1675423 ]

PDFBOX-2772: append EI after inline image

> EI token lost for rewrite
> -------------------------
>
>                 Key: PDFBOX-2772
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2772
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing, Writing
>    Affects Versions: 1.8.9, 1.8.10, 2.0.0
>            Reporter: Tilman Hausherr
>            Assignee: Tilman Hausherr
>              Labels: regression
>             Fix For: 1.8.10, 2.0.0
>
>
> From Lukas S. in the dev mailing list:
> {quote}
> A co-worker and I are currently developing a service, based on PDFBox, for 
> searching and replacing content in PDF documents. We started our project 
> with PDFBox 1.8.2 and recently tried to migrate to 1.8.8.
> After changing to version 1.8.8 we ran into trouble with PDF content 
> containing inline images. Studying the differences between those versions 
> of PDFBox led us to the handling of the EI operator as the cause of our 
> trouble.
> In version 1.8.2 the method parseNextToken() of the 
> org.apache.pdfbox.pdfparser.PDFStreamParser does an unread of the EI token on 
> inline images. In newer versions this unread of the EI token doesn't exist 
> anymore with the following comment "// the EI operator isn't unread, as it 
> won't be processed anyway".
> As a consequence the token sets of a document containing an inline image 
> delivered by the PDFStreamParser can't be used to (re)render a valid pdf 
> document by the ContentStreamWriter.
> The reason is the missing token for the EI operator. The EI token may not 
> trigger any further processing, but it is still needed to represent the 
> delimiter in the token sequence.
> Conversely, if an inline image is to be part of a PDF page and is inserted 
> manually as a token set, the EI token must also be present in the token 
> set so that the ContentStreamWriter can create a correct PDF document.
> From our point of view there are two simple approaches to get a more 
> consistent internal representation of inline images in PDFBox: either 
> represent the EI operator explicitly as a token (revert to the handling in 
> version 1.8.2), or extend the writeObject method in the 
> ContentStreamWriter to append the EI operator implicitly.
> {quote}
> THAT is what I call an excellent bug report :-) I think that the 2nd solution 
> you suggested is the better one.
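
To illustrate the second approach, here is a minimal, self-contained sketch of
a writer that appends EI implicitly after the inline image data. It does not
use the PDFBox API: the class InlineImageWriterSketch, the Op token type and
the write method are hypothetical stand-ins, and the assumption that the BI
token carries both the image parameters and the raw image bytes is made only
for this example.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch, not PDFBox code: models a token list and a writer
// that emits the EI delimiter itself instead of expecting an EI token.
final class InlineImageWriterSketch {

    /** Hypothetical operator token. For "BI", params and imageData describe the inline image. */
    static final class Op {
        final String name;
        final String params;     // image parameters as text, null for other operators
        final byte[] imageData;  // raw image bytes, null for other operators

        Op(String name) {
            this(name, null, null);
        }

        Op(String name, String params, byte[] imageData) {
            this.name = name;
            this.params = params;
            this.imageData = imageData;
        }
    }

    /** Serializes operands and operators; EI is appended after inline image data. */
    static String write(List<Object> tokens) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Object token : tokens) {
            if (token instanceof Op && "BI".equals(((Op) token).name)) {
                Op op = (Op) token;
                ascii(out, "BI " + op.params + " ID\n");
                out.write(op.imageData, 0, op.imageData.length);
                // The token list contains no EI token, so the writer adds the delimiter.
                ascii(out, "\nEI\n");
            } else if (token instanceof Op) {
                ascii(out, ((Op) token).name + "\n");
            } else {
                // Anything else is treated as an operand and written before its operator.
                ascii(out, token + " ");
            }
        }
        return new String(out.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    private static void ascii(ByteArrayOutputStream out, String s) {
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        out.write(bytes, 0, bytes.length);
    }

    public static void main(String[] args) {
        List<Object> tokens = Arrays.asList(
                new Op("q"),
                new Op("BI", "/W 1 /H 1 /BPC 8 /CS /G", new byte[] {0x41}),
                new Op("Q"));
        System.out.print(write(tokens));
    }
}
```

With this design a caller who builds a token list by hand does not have to
remember the EI token. The trade-off is that a token list which does contain an
explicit EI token (the first approach) would need extra handling to avoid
writing the delimiter twice.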



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
