[
https://issues.apache.org/jira/browse/FOP-3271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mark Gibson updated FOP-3271:
------------------------------
Description:
We have PDF images that are exported directly from Excel (with accessibility
enabled). When FOP renders accessible PDF output, these images are missing
from the final PDF.
The FOP logs show an index-out-of-bounds exception:
{code:java}
Caused by: java.lang.IndexOutOfBoundsException: Index -1 out of bounds for length 6
    at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:100)
    at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:106)
    at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:302)
    at java.base/java.util.Objects.checkIndex(Objects.java:385)
    at java.base/java.util.ArrayList.get(ArrayList.java:427)
    at org.apache.fop.pdf.PDFStructElem.addKidInSpecificOrder(PDFStructElem.java:208)
    at org.apache.fop.render.pdf.pdfbox.StructureTreeMerger.createParents(StructureTreeMerger.java:209)
    at org.apache.fop.render.pdf.pdfbox.StructureTreeMerger.createParents(StructureTreeMerger.java:154)
    at org.apache.fop.render.pdf.pdfbox.StructureTreeMerger.copyStructure(StructureTreeMerger.java:89)
    at org.apache.fop.render.pdf.pdfbox.TaggedPDFConductor.handleLogicalStructure(TaggedPDFConductor.java:68)
    at org.apache.fop.render.pdf.pdfbox.AbstractPDFBoxHandler.createStreamForPDF(AbstractPDFBoxHandler.java:114)
    at org.apache.fop.render.pdf.pdfbox.PDFBoxImageHandler.handleImage(PDFBoxImageHandler.java:77)
    ... 62 more
{code}
This happens because the following method returns -1:
{code:java}
public final class StructureTreeMergerUtil {
    public static int findObjectPositionInKidsArray(COSObject kidObj) {
        COSDictionary kid = (COSDictionary) kidObj.getObject();
        COSObject parentObj = (COSObject) kid.getItem(COSName.P);
        COSDictionary parent = (COSDictionary) parentObj.getObject();
        COSBase kids = parent.getItem(COSName.K);
        if (kids instanceof COSArray) {
            COSArray kidsArray = (COSArray) kids;
            // indexOfObject returns -1 when kid is not in the parent's /K array
            return kidsArray.indexOfObject(kid);
        } else {
            return 0;
        }
    }
    // (rest of class omitted)
}
{code}
It turns out that the Excel-exported PDF images contain records that are not
listed among their parent's children. This can be seen in the attached images.
The affected records are always "Artifact" records (although I'm not sure that
means all Artifact records are broken).
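As a rough illustration of the failure mode described above, the sketch below uses plain Java collections in place of the PDFBox and FOP types. The class {{OrphanKidDemo}}, its method names, and the tag strings are hypothetical stand-ins, not FOP code. It only shows that a lookup returning -1, passed unchecked to {{ArrayList.get}}, produces an {{IndexOutOfBoundsException}}.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in: plain lists replace the PDFBox/FOP structure types.
class OrphanKidDemo {
    // Mirrors the lookup: the kid's index in its parent's kids, or -1 if absent.
    static int positionInKids(List<String> parentKids, String kid) {
        return parentKids.indexOf(kid);
    }

    // Uses the position unchecked and returns the resulting exception message,
    // or null when the lookup succeeded.
    static String failureMessage(List<String> parentKids, String kid) {
        int position = positionInKids(parentKids, kid);
        try {
            parentKids.get(position);
            return null;
        } catch (IndexOutOfBoundsException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Six kids, none of which is the "Artifact" record that names this parent.
        List<String> kids = new ArrayList<>(
                List.of("P", "Figure", "Table", "P", "P", "Span"));
        System.out.println(positionInKids(kids, "Artifact"));
        System.out.println(failureMessage(kids, "Artifact"));
    }
}
```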
I've also attached a reproduction with a simple fo file and two pdf images
that exhibit this issue. The command to run it is:
{code:java}
fop.bat -a -fo test.fo -pdf test.pdf {code}
My knowledge of the PDF spec is limited, so I'm currently unsure whether Excel
is producing broken PDFs or whether there is a bug in FOP's pdf-image handling
when it copies the structure tree from externally imported PDF images.
Hoping someone can shed some light here.
Maybe the fix would be as simple as returning 0 instead of -1 from the above
method?
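A minimal sketch of what such a guard could look like, again with plain Java lists rather than the real PDFBox types. {{SafeKidPosition}} and {{positionOrFallback}} are hypothetical names, and whether 0, appending at the end, or skipping the orphaned record is the right behaviour for FOP remains an open question.

```java
import java.util.List;

// Hypothetical sketch of a guarded lookup; not the FOP implementation.
class SafeKidPosition {
    // Returns the kid's index, or `fallback` when the parent does not list the kid.
    static int positionOrFallback(List<String> parentKids, String kid, int fallback) {
        int position = parentKids.indexOf(kid);
        return position >= 0 ? position : fallback;
    }

    public static void main(String[] args) {
        List<String> kids = List.of("P", "Figure", "Span");
        // A kid the parent lists keeps its real position.
        System.out.println(positionOrFallback(kids, "Figure", 0));
        // An orphaned kid gets the fallback instead of -1.
        System.out.println(positionOrFallback(kids, "Artifact", 0));
    }
}
```

With a fallback of 0 this matches the suggestion above. A caller could pass the list size instead if appending the orphaned record turns out to be more appropriate.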
> pdf-images: Fail to render accessible pdf image in accessible PDF output when
> "Artifact" elements present in image
> ------------------------------------------------------------------------------------------------------------------
>
> Key: FOP-3271
> URL: https://issues.apache.org/jira/browse/FOP-3271
> Project: FOP
> Issue Type: Bug
> Components: renderer/pdf
> Affects Versions: 2.10, 2.11
> Reporter: Mark Gibson
> Priority: Major
> Attachments: image1-artifactNotInParentsChildren.png,
> image2-artifactNotInParentsChildren.png