[ 
https://issues.apache.org/jira/browse/PDFBOX-5430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17533452#comment-17533452
 ] 

Michael Klink commented on PDFBOX-5430:
---------------------------------------

Indeed, this content stream simply is broken. As [~tilman] has shown, a number 
of instructions therein have - incorrectly! - been made the contents of the 
array argument of a *TJ* instruction.

Adobe Acrobat apparently ignores that the instructions are so enclosed and acts 
as if there was no *\[* or *\] TJ*.

Other viewers might simply ignore (or treat as strings) everything that is 
neither string nor number in the array.

In case of content that matters (as invoice content does), this might lead to 
completely different appearances when viewed with different viewers. Thus, 
PDFBox should definitely throw an exception here and not repair it one way or 
the other.
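
To illustrate the strict behaviour argued for above, here is a minimal sketch 
in plain Java. It is not PDFBox code: the class StrictTjArrayCheck, its Name 
type, and the use of String and Number as stand-ins for COSString and 
COSNumber are all assumptions made for illustration. It only shows the policy: 
anything in a TJ array that is neither string nor number is rejected instead 
of being skipped or reinterpreted.

```java
import java.io.IOException;
import java.util.List;

/** Hypothetical stand-in for a strict TJ array check; not a PDFBox class. */
final class StrictTjArrayCheck
{
    /** Hypothetical model of a name token such as /F3. */
    static final class Name
    {
        final String value;

        Name(String value)
        {
            this.value = value;
        }

        @Override
        public String toString()
        {
            return "/" + value;
        }
    }

    /**
     * Accepts only strings (modelled as String) and numbers (modelled as
     * Number). Any other element makes the whole array invalid.
     */
    static void check(List<?> tjArray) throws IOException
    {
        for (Object element : tjArray)
        {
            if (element instanceof String || element instanceof Number)
            {
                continue;
            }
            String type = element == null ? "null" : element.getClass().getSimpleName();
            // No repair attempt: the caller decides what to do with the failure.
            throw new IOException("Unknown type " + type
                    + " in array for TJ operation:" + element);
        }
    }

    /** Convenience wrapper that reports the result as a boolean. */
    static boolean isValid(List<?> tjArray)
    {
        try
        {
            check(tjArray);
            return true;
        }
        catch (IOException e)
        {
            return false;
        }
    }
}
```

With this, a well-formed array such as [ (Hel) -120 (lo) ] passes, while a 
hypothetical array that has swallowed a font switch, e.g. 
[ (Hel) /F3 12 (lo) ], is rejected at the /F3 name. These example arrays are 
made up and not taken from the attached file.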

> PDFStreamEngine.showTextStrings with font switch
> ------------------------------------------------
>
>                 Key: PDFBOX-5430
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5430
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 2.0.26
>            Reporter: Oliver Schmidtmer
>            Assignee: Tilman Hausherr
>            Priority: Major
>             Fix For: 2.0.27, 3.0.0 PDFBox
>
>         Attachments: keine Vorschau ELO-1228188_20220228_11462_HD_online.pdf
>
>
> The attached PDF fails to render in PDFStreamEngine.showTextStrings with 
> the following exception:
> "java.io.IOException: Unknown type COSName in array for TJ 
> operation:COSName\{F3}"
> This seems to be a font switch.
> {code:java}
> diff --git "a/pdfbox/src/main/java/org/apache/pdfbox/contentstream/PDFStreamEngine.java" "b/pdfbox/src/main/java/org/apache/pdfbox/contentstream/PDFStreamEngine.java"
> index e4f2259a5..12edadd2b 100644
> --- "a/pdfbox/src/main/java/org/apache/pdfbox/contentstream/PDFStreamEngine.java"
> +++ "b/pdfbox/src/main/java/org/apache/pdfbox/contentstream/PDFStreamEngine.java"
> @@ -680,6 +680,18 @@ public abstract class PDFStreamEngine
>                  byte[] string = ((COSString)obj).getBytes();
>                  showText(string);
>              }
> +            else if (obj instanceof COSName)
> +            {
> +                if(((COSName) obj).getName().startsWith("F"))
> +                {
> +                    textState.setFont(resources.getFont((COSName) obj));
> +                }
> +                else
> +                {
> +                    throw new IOException("Unknown type " + obj.getClass().getSimpleName()
> +                            + " in array for TJ operation:" + obj);
> +                }
> +            }
>              else if (obj instanceof COSArray)
>              {
>                  LOG.error("Nested arrays are not allowed in an array for TJ operation:" + obj);
> {code}


