[
https://issues.apache.org/jira/browse/PDFBOX-5433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17537224#comment-17537224
]
Michael Klink commented on PDFBOX-5433:
---------------------------------------
Beware, *TD* is not the only operator that is implemented in terms of other
operators; *"* and *'* are further examples.
An alternative to replacing each such {{OperatorProcessor}} would be to
determine in {{processOperator}} whether it has been called recursively (e.g.
by using a counter which you increase on entry and decrease on exit)
and to apply the special processing only if it has not (e.g. if such a counter is 0).
I did something like that in my
[{{PdfContentStreamEditor}}|https://github.com/mkl-public/testarea-pdfbox2/blob/master/src/main/java/mkl/testarea/pdfbox2/content/PdfContentStreamEditor.java]
using a {{boolean}}, but I just realized that a {{boolean}} doesn't suffice in
the case of *"*.
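To illustrate the counter idea, here is a minimal, self-contained sketch. It is a toy model and not PDFBox code: the class {{DepthCountingEngine}}, its string-based operands, and the {{dispatch}} and {{negate}} helpers are hypothetical stand-ins for {{PDFStreamEngine}} and its operator processors. Only the operator decompositions (*TD* as *TL* plus *Td*, *"* as *Tw*, *Tc* and *'*, and *'* as *T** plus *Tj*) are taken from the PDF specification.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a content stream engine (hypothetical, not the PDFBox API). */
class DepthCountingEngine {
    /** Operators seen at depth 0, i.e. those that really are in the stream. */
    final List<String> topLevel = new ArrayList<>();
    /** Every operator seen, including the ones synthesized by other operators. */
    final List<String> all = new ArrayList<>();
    /** Number of processOperator calls currently on the call stack. */
    private int depth = 0;

    void processOperator(String name, List<String> operands) {
        all.add(name);
        if (depth == 0) {
            // Special processing (recording, filtering, rewriting) happens
            // only for operators that were not triggered by another operator.
            topLevel.add(name);
        }
        depth++;
        try {
            dispatch(name, operands);
        } finally {
            depth--; // always undo the increment, even if processing throws
        }
    }

    /** Mimics operator processors that are implemented via other operators. */
    private void dispatch(String name, List<String> operands) {
        switch (name) {
            case "TD": // tx ty TD  is  -ty TL  followed by  tx ty Td
                processOperator("TL", List.of(negate(operands.get(1))));
                processOperator("Td", operands);
                break;
            case "\"": // aw ac string "  is  aw Tw, ac Tc, string '
                processOperator("Tw", List.of(operands.get(0)));
                processOperator("Tc", List.of(operands.get(1)));
                processOperator("'", List.of(operands.get(2)));
                break;
            case "'": // string '  is  T*  followed by  string Tj
                processOperator("T*", List.of());
                processOperator("Tj", operands);
                break;
            default:
                break; // all other operators do nothing in this toy model
        }
    }

    /** Flips the sign of a number given as text. */
    private static String negate(String number) {
        return number.startsWith("-") ? number.substring(1) : "-" + number;
    }

    public static void main(String[] args) {
        DepthCountingEngine engine = new DepthCountingEngine();
        engine.processOperator("TD", List.of("0", "-15.791"));
        engine.processOperator("\"", List.of("1", "2", "(text)"));
        System.out.println("top level: " + engine.topLevel);
        System.out.println("all:       " + engine.all);
    }
}
```

Because the counter tracks the actual nesting depth, it keeps working when a synthesized operator synthesizes further operators itself, as *"* does via *'*.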
> PDFStreamEngine creating new operators that do not exist in document
> --------------------------------------------------------------------
>
> Key: PDFBOX-5433
> URL: https://issues.apache.org/jira/browse/PDFBOX-5433
> Project: PDFBox
> Issue Type: Bug
> Reporter: Mike Cantrell
> Priority: Major
> Attachments: pdfbox-stream-engine-operators.zip, screenshot-1.png
>
>
> We're using PDFStreamEngine to do some analysis and filtering (optimizations)
> on the document's content streams. I've found an odd case where a form is giving
> us extra (unwanted) operators that don't exist in the original stream.
> According to the PDFDebugger, the form's stream has the following contents:
>
> {code:java}
> 0 TL
> q
> BT
> 1 0 0 rg
> 0 i
> /TT0 20 Tf
> 0 Tc
> 0 Tw
> 0 Ts
> 100 Tz
> 0 Tr
> 0 -15.791 TD
> (HOODHD035236) Tj
> ET
> Q{code}
> I created a debug utility to output the operators given by the PDFStreamEngine:
> {code:java}
> @Getter
> static class StreamDebugger extends PDFStreamEngine {
>     String formName;
>     Operator operator;
>     List<COSBase> operands;
>     int operatorCount;
>
>     public StreamDebugger() {
>         addOperator(new BeginText());
>         addOperator(new Concatenate());
>         addOperator(new DrawObject()); // special text version
>         addOperator(new EndText());
>         addOperator(new SetGraphicsStateParameters());
>         addOperator(new Save());
>         addOperator(new Restore());
>         addOperator(new NextLine());
>         addOperator(new SetCharSpacing());
>         addOperator(new MoveText());
>         addOperator(new MoveTextSetLeading());
>         addOperator(new SetFontAndSize());
>         addOperator(new ShowText());
>         addOperator(new ShowTextAdjusted());
>         addOperator(new SetTextLeading());
>         addOperator(new SetMatrix());
>         addOperator(new SetTextRenderingMode());
>         addOperator(new SetTextRise());
>         addOperator(new SetWordSpacing());
>         addOperator(new SetTextHorizontalScaling());
>         addOperator(new ShowTextLine());
>         addOperator(new ShowTextLineAndSpace());
>     }
>
>     @Override
>     public void showForm(PDFormXObject form) throws IOException {
>         this.formName = ((COSName) operands.get(0)).getName();
>         super.showForm(form);
>         this.formName = null;
>     }
>
>     @Override
>     protected void processOperator(Operator operator, List<COSBase> operands)
>             throws IOException {
>         this.operator = operator;
>         this.operands = operands;
>         if (Objects.equals(this.formName, "Fm0")) {
>             this.operatorCount++;
>             System.out.printf("%s:%s%n", operator.getName(),
>                     operands.toString());
>         }
>         super.processOperator(operator, operands);
>     }
> }
> {code}
> The resulting output:
> {code:java}
> TL:[COSInt{0}]
> q:[]
> BT:[]
> rg:[COSInt{1}, COSInt{0}, COSInt{0}]
> i:[COSInt{0}]
> Tf:[COSName{TT0}, COSInt{20}]
> Tc:[COSInt{0}]
> Tw:[COSInt{0}]
> Ts:[COSInt{0}]
> Tz:[COSInt{100}]
> Tr:[COSInt{0}]
> TD:[COSInt{0}, COSFloat{-15.791}]
> TL:[COSFloat{15.791}]
> Td:[COSInt{0}, COSFloat{-15.791}]
> Tj:[COSString{HOODHD035236}]
> ET:[]
> Q:[] {code}
> These operators do not exist in the original stream:
> {code:java}
> TL:[COSFloat{15.791}]
> Td:[COSInt{0}, COSFloat{-15.791}]{code}
> If you re-write the stream from the operators reported by the engine, the
> resulting PDF has display issues.
> I'm attaching a test case which demonstrates the issue.
>