Hi,
> On 05.02.2018 at 15:43, Esteban R <[email protected]> wrote:
>
> Hello. I need to rewrite a PDPage with many streams, one by one (making some
> transformations, and there is a special need to do it one stream at a time).
> Parsing (and pdfdebug) returns "wrong" tokens if one command begins at the
> end of the first stream and ends at the beginning of the next one. I'm using
> pdfbox-2.0.8.
>
> Rewriting the stream with those tokens produces a corrupted page.
> How could we rewrite the page without getting a corrupted page?
> Or, at least, how can we detect this kind of failure (or this one)?
>
> Please find a simplified example here:
> http://www.filedropper.com/out3unc
>
> The first stream is:
> /F1 10 Tf
> BT
> 40 764.138 Td
> 0 -12.138 Td
> [
>
> and the second one is:
> (CD) ] TJ
> ET
>
> In this case, running the following code:
>     Iterator<PDStream> itStreams = pdPage.getContentStreams();
>     while (itStreams.hasNext()) {
>         PDStream pdstream = itStreams.next();
>         PDFStreamParser parser = new PDFStreamParser(pdstream.toByteArray());
>         parser.parse();
>         List<Object> tokens = parser.getTokens();
>         for (Object token : tokens) {
>             System.out.println("Token: " + token);
>         }
>     }
>

Instead of using pdPage.getContentStreams() and parsing each stream
individually, use pdPage.getContents() and read all of the content into a
byte[]. You can then pass that to PDFStreamParser. That will give you this
output:

Token: COSName{F1}
Token: COSInt{10}
Token: PDFOperator{Tf}
Token: PDFOperator{BT}
Token: COSInt{40}
Token: COSFloat{764.138}
Token: PDFOperator{Td}
Token: COSInt{0}
Token: COSFloat{-12.138}
Token: PDFOperator{Td}
Token: COSArray{[COSString{CD}]}
Token: PDFOperator{TJ}
Token: PDFOperator{ET}

BR
Maruan

> shows:
> Token: COSName{F1}
> Token: COSInt{10}
> Token: PDFOperator{Tf}
> Token: PDFOperator{BT}
> Token: COSInt{40}
> Token: COSFloat{764.138}
> Token: PDFOperator{Td}
> Token: COSInt{0}
> Token: COSFloat{-12.138}
> Token: PDFOperator{Td}
> Token: COSArray{[]}     !!!!! empty array detected, end of first stream
> Token: COSString{CD}    !!!!! beginning of second stream
> Token: COSNull{}        !!!!! closing "]"
> Token: PDFOperator{TJ}
> Token: PDFOperator{ET}
>
>
> Esteban
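
On the second question (how to detect this kind of failure), here is a rough sketch rather than anything PDFBox provides. It uses only the JDK and checks whether every "[" and "(" opened in a stream's bytes is also closed in the same stream. The class ContentStreamCheck and its methods isSelfContained and concatenate are hypothetical names made up for this illustration. It deliberately ignores hex strings, dictionaries and inline image data, so it can misreport streams containing those. The newline separator in concatenate is likewise just an assumption of this sketch.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of PDFBox: a coarse check for content
// streams that end in the middle of an array or a literal string.
final class ContentStreamCheck {

    // True if every '[' and '(' opened in this stream is closed in it,
    // and no ']' appears without a matching '[' in the same stream.
    static boolean isSelfContained(byte[] stream) {
        int arrayDepth = 0;   // nesting of [ ... ]
        int stringDepth = 0;  // nesting of ( ... ); literal strings may nest
        for (int i = 0; i < stream.length; i++) {
            char c = (char) (stream[i] & 0xFF);
            if (stringDepth > 0) {
                if (c == '\\') {
                    i++;                 // skip the escaped character
                } else if (c == '(') {
                    stringDepth++;
                } else if (c == ')') {
                    stringDepth--;
                }
            } else if (c == '%') {
                // a comment runs to the end of the line
                while (i + 1 < stream.length
                        && stream[i + 1] != '\n' && stream[i + 1] != '\r') {
                    i++;
                }
            } else if (c == '(') {
                stringDepth++;
            } else if (c == '[') {
                arrayDepth++;
            } else if (c == ']') {
                arrayDepth--;
                if (arrayDepth < 0) {
                    return false;        // ']' closes an array opened elsewhere
                }
            }
        }
        return arrayDepth == 0 && stringDepth == 0;
    }

    // Joins the raw stream bytes, adding a newline after each stream
    // (an assumption of this sketch) so the result can be parsed as one.
    static byte[] concatenate(List<byte[]> streams) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] s : streams) {
            out.write(s, 0, s.length);
            out.write('\n');
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // The two streams from the example in this thread.
        byte[] first = "/F1 10 Tf\nBT\n40 764.138 Td\n0 -12.138 Td\n[\n"
                .getBytes(StandardCharsets.ISO_8859_1);
        byte[] second = "(CD) ] TJ\nET\n".getBytes(StandardCharsets.ISO_8859_1);

        System.out.println("first:  " + isSelfContained(first));
        System.out.println("second: " + isSelfContained(second));
        System.out.println("joined: "
                + isSelfContained(concatenate(Arrays.asList(first, second))));
    }
}
```

If a stream fails this check, it is probably safer to fall back to parsing the combined content than to rewrite that stream on its own. Operators whose operands are split across streams without any bracket involved (for example "40 764.138" in one stream and "Td" in the next) would not be caught by this.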

