Hi,


> On 05.02.2018 at 15:43, Esteban R <[email protected]> wrote:
> 
> Hello. I need to rewrite a PDPage with many streams, one by one (making some 
> transformations, and there is a special need to do it one stream at a time). 
> Parsing (and pdfdebug) returns "wrong" tokens if one command begins at the 
> end of the first stream and ends at the beginning of the next one. I'm using 
> pdfbox-2.0.8.
> 
> Rewriting the stream with those tokens produces a corrupted page.
> How could we re-write the page without getting a corrupted page?
> Or, at least, how can we detect this kind of failure (or this one)?
> 
> Please find a simplified example here:
> http://www.filedropper.com/out3unc
> 
> The first stream is:
> /F1 10 Tf
> BT
> 40 764.138 Td
> 0 -12.138 Td
> [
> 
> and the second one is:
> (CD) ] TJ
> ET
> 
> In this case, running the following code:
>        Iterator<PDStream> itStreams = pdPage.getContentStreams();
>        while (itStreams.hasNext()) {
>            PDStream pdstream = itStreams.next();
>            PDFStreamParser parser = new PDFStreamParser(pdstream.toByteArray());
>            parser.parse();
>            List<Object> tokens = parser.getTokens();
>            for (Object token: tokens){
>                System.out.println("Token: "+token);
>            }
>        }
> 

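Instead of calling pdPage.getContentStreams() and parsing each stream 
individually, use pdPage.getContents() and read all of the content into a 
byte[]. You can then pass that array to PDFStreamParser. getContents() returns 
the page's content streams as a single input stream, so a command that starts 
in one stream and ends in the next is parsed as a whole.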

That will give you this output:

Token: COSName{F1}
Token: COSInt{10}
Token: PDFOperator{Tf}
Token: PDFOperator{BT}
Token: COSInt{40}
Token: COSFloat{764.138}
Token: PDFOperator{Td}
Token: COSInt{0}
Token: COSFloat{-12.138}
Token: PDFOperator{Td}
Token: COSArray{[COSString{CD}]}
Token: PDFOperator{TJ}
Token: PDFOperator{ET}
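
As a rough, PDFBox-independent illustration of why the combined content parses
cleanly: the PDF specification treats the streams in a page's Contents array as
one logical stream that may be split at any token boundary, so an array can
open in one stream and close in the next. The sketch below joins the two
streams from the example and counts unbalanced "[" and "]". The names
JoinContentStreams, join and arrayBalance are made up for this example and are
not PDFBox API. The scanner is deliberately simplified: it skips literal
strings only, not hex strings, comments or inline image data.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class JoinContentStreams {

    // Hypothetical helper: appends the raw streams in order, adding a newline
    // after each one so tokens from neighbouring streams cannot run together.
    public static byte[] join(List<byte[]> streams) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] s : streams) {
            out.write(s, 0, s.length);
            out.write('\n');
        }
        return out.toByteArray();
    }

    // Hypothetical helper: number of '[' minus number of ']' found outside
    // literal strings. Positive: the data ends inside an array. Negative: it
    // closes an array that was opened earlier. Zero: arrays are balanced.
    // Assumption: only "( ... )" strings are skipped; hex strings, comments
    // and inline images are ignored for brevity.
    public static int arrayBalance(byte[] data) {
        int depth = 0;   // unclosed '[' seen so far
        int parens = 0;  // nesting level inside a literal string
        for (int i = 0; i < data.length; i++) {
            char c = (char) (data[i] & 0xFF);
            if (parens > 0) {
                if (c == '\\') {
                    i++;             // skip the escaped character
                } else if (c == '(') {
                    parens++;
                } else if (c == ')') {
                    parens--;
                }
            } else if (c == '(') {
                parens = 1;
            } else if (c == '[') {
                depth++;
            } else if (c == ']') {
                depth--;
            }
        }
        return depth;
    }

    public static void main(String[] args) {
        // The two streams from the example in this thread.
        byte[] first = "/F1 10 Tf\nBT\n40 764.138 Td\n0 -12.138 Td\n["
                .getBytes(StandardCharsets.US_ASCII);
        byte[] second = "(CD) ] TJ\nET".getBytes(StandardCharsets.US_ASCII);

        System.out.println(arrayBalance(first));   // prints 1
        System.out.println(arrayBalance(second));  // prints -1
        System.out.println(arrayBalance(join(Arrays.asList(first, second)))); // prints 0
    }
}
```

A non-zero balance for a single stream is a hint that a construct continues
across the stream boundary, which is the situation behind the empty COSArray
and the COSNull in the per-stream output quoted in this thread. With PDFBox
itself the joined bytes would come from pdPage.getContents(), as described
above.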

BR
Maruan


> shows:
> Token: COSName{F1}
> Token: COSInt{10}
> Token: PDFOperator{Tf}
> Token: PDFOperator{BT}
> Token: COSInt{40}
> Token: COSFloat{764.138}
> Token: PDFOperator{Td}
> Token: COSInt{0}
> Token: COSFloat{-12.138}
> Token: PDFOperator{Td}
> Token: COSArray{[]}                    !!!!! empty array detected, end of first stream
> Token: COSString{CD}                 !!!!! beginning of second stream
> Token: COSNull{}                         !!!!! closing "]"
> Token: PDFOperator{TJ}
> Token: PDFOperator{ET}
> 
> 
> Esteban

