[jira] [Created] (PDFBOX-5010) How to cancel/interrupt pdfbox 2 API call
Kostya Samarin created PDFBOX-5010:
--------------------------------------

Summary: How to cancel/interrupt pdfbox 2 API call
Key: PDFBOX-5010
URL: https://issues.apache.org/jira/browse/PDFBOX-5010
Project: PDFBox
Issue Type: Improvement
Reporter: Kostya Samarin

We use a FixedThreadPool to process PDFs using the PDFBox 2 APIs. It looks like there is no way to cancel or interrupt long-running calls gracefully. So:
* Would it be worthwhile to have a timeout for that, or some sort of progress callback?
* Could you please recommend how to cancel or interrupt long-running calls gracefully now?

--
This message was sent by Atlassian Jira (v8.3.4#803005)
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org
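For context on the thread-pool setup described above, here is a minimal sketch of a timeout wrapper built only on java.util.concurrent. The names `TimedCall` and `callWithTimeout` are hypothetical and not part of PDFBox. Note that `Future.cancel(true)` only interrupts the worker thread; whether a given PDFBox call actually notices the interrupt is not established anywhere in this thread, so this is not a confirmed way to stop a running call.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical helper, not a PDFBox class. */
public final class TimedCall {
    private TimedCall() {}

    /**
     * Submits the task to the pool and waits at most timeoutMillis for it.
     * Returns an empty Optional if the task timed out (or the caller was interrupted).
     */
    public static <T> Optional<T> callWithTimeout(ExecutorService pool,
                                                  Callable<T> task,
                                                  long timeoutMillis) {
        Future<T> future = pool.submit(task);
        try {
            return Optional.ofNullable(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            // Requests interruption of the worker thread; the task must cooperate to stop.
            future.cancel(true);
            return Optional.empty();
        } catch (InterruptedException e) {
            future.cancel(true);
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            return Optional.empty();
        } catch (ExecutionException e) {
            // The task itself failed; surface its cause unchecked.
            throw new IllegalStateException(e.getCause());
        }
    }
}
```

With this shape the caller gets control back after the timeout even if the worker keeps running, which is why a task that ignores interrupts can still occupy a pool thread.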
[jira] [Updated] (PDFBOX-5009) Corrupt PDF can lead to a StackOverflow
[ https://issues.apache.org/jira/browse/PDFBOX-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-5009: Fix Version/s: 3.0.0 PDFBox 2.0.22 > Corrupt PDF can lead to a StackOverflow > --- > > Key: PDFBOX-5009 > URL: https://issues.apache.org/jira/browse/PDFBOX-5009 > Project: PDFBox > Issue Type: Task > Components: Text extraction >Affects Versions: 2.0.21 >Reporter: Tim Allison >Priority: Minor > Fix For: 2.0.22, 3.0.0 PDFBox > > > See TIKA-3224. I confirmed this with 2.0.21 by calling the app's ExtractText > on the file posted on the Tika issue. > cc [~dadoonet] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Commented] (PDFBOX-5009) Corrupt PDF can lead to a StackOverflow
[ https://issues.apache.org/jira/browse/PDFBOX-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226501#comment-17226501 ]

Tilman Hausherr commented on PDFBOX-5009:
-----------------------------------------

I'm able to catch this by using a set to prevent a recursive call with the same parameter:
{code:java}
private final class PageIterator implements Iterator<PDPage>
{
    private final Queue<COSDictionary> queue = new ArrayDeque<>();
    private Set<COSDictionary> set = new HashSet<>();

    private PageIterator(COSDictionary node)
    {
        enqueueKids(node);
    }

    private void enqueueKids(COSDictionary node)
    {
        if (isPageTreeNode(node))
        {
            List<COSDictionary> kids = getKids(node);
            for (COSDictionary kid : kids)
            {
                // ** NEW **
                if (set.contains(kid))
                {
                    LOG.error("This node has already been visited");
                    continue;
                }
                else
                {
                    set.add(kid);
                }
                enqueueKids(kid);
            }
        }
        else
        {
            queue.add(node);
        }
    }
{code}

> Corrupt PDF can lead to a StackOverflow
> ---------------------------------------
>
> Key: PDFBOX-5009
> URL: https://issues.apache.org/jira/browse/PDFBOX-5009
> Project: PDFBox
> Issue Type: Task
> Components: Text extraction
> Affects Versions: 2.0.21
> Reporter: Tim Allison
> Priority: Minor
>
> See TIKA-3224. I confirmed this with 2.0.21 by calling the app's ExtractText
> on the file posted on the Tika issue.
> cc [~dadoonet]
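The visited-set idea in the comment above can be tried in isolation. The following is a minimal sketch on a hypothetical `Node` type (a stand-in for a page tree dictionary, not a PDFBox class); `CycleSafeWalk` and `leaves` are illustrative names as well. It shows that a child pointing back at an ancestor no longer causes unbounded recursion once every node is recorded before it is descended into.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration of cycle-safe tree traversal; not PDFBox code. */
public final class CycleSafeWalk {
    private CycleSafeWalk() {}

    /** Stand-in for a page tree node: inner nodes have kids, leaves have none. */
    public static final class Node {
        final String name;
        final List<Node> kids = new ArrayList<>();

        public Node(String name) { this.name = name; }

        public Node add(Node kid) { kids.add(kid); return this; }
    }

    /** Collects leaf names in document order, skipping any node seen before. */
    public static List<String> leaves(Node root) {
        List<String> out = new ArrayList<>();
        Set<Node> visited = new HashSet<>();
        visited.add(root); // the root must also count as visited
        collect(root, visited, out);
        return out;
    }

    private static void collect(Node node, Set<Node> visited, List<String> out) {
        if (node.kids.isEmpty()) {
            out.add(node.name);
            return;
        }
        for (Node kid : node.kids) {
            // Set.add returns false when the node is already present: skip it.
            if (!visited.add(kid)) {
                continue;
            }
            collect(kid, visited, out);
        }
    }
}
```

In this sketch the root is added to the set up front, so a kid that refers back to the root is skipped on first encounter rather than after one extra pass.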
[jira] [Commented] (PDFBOX-5009) Corrupt PDF can lead to a StackOverflow
[ https://issues.apache.org/jira/browse/PDFBOX-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226486#comment-17226486 ]

Tilman Hausherr commented on PDFBOX-5009:
-----------------------------------------

I added some logging and stack tracing to see when it starts:
{noformat}
2020-11-05 05:19:14 WARN PDPageTree:154 - i = 4, element is: COSObject{207, 0}
2020-11-05 05:19:14 WARN PDPageTree:155 - COSDictionary expected, but got null
java.lang.Exception
    at org.apache.pdfbox.pdmodel.PDPageTree.getKids(PDPageTree.java:157)
    at org.apache.pdfbox.pdmodel.PDPageTree.access$200(PDPageTree.java:41)
    at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:184)
    at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:187)
    at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:187)
    at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.<init>(PDPageTree.java:173)
    at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.<init>(PDPageTree.java:167)
    at org.apache.pdfbox.pdmodel.PDPageTree.iterator(PDPageTree.java:126)
    at org.apache.pdfbox.text.PDFTextStripper.processPages(PDFTextStripper.java:289)
    at org.apache.pdfbox.text.PDFTextStripper.writeText(PDFTextStripper.java:241)
    at org.apache.pdfbox.tools.ExtractText.extractPages(ExtractText.java:364)
    at org.apache.pdfbox.tools.ExtractText.startExtraction(ExtractText.java:267)
    at org.apache.pdfbox.tools.ExtractText.main(ExtractText.java:98)
    at org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:57)
2020-11-05 05:19:14 WARN PDPageTree:154 - i = 5, element is: COSObject{214, 0}
2020-11-05 05:19:14 WARN PDPageTree:155 - COSDictionary expected, but got null
java.lang.Exception
    at org.apache.pdfbox.pdmodel.PDPageTree.getKids(PDPageTree.java:157)
    at org.apache.pdfbox.pdmodel.PDPageTree.access$200(PDPageTree.java:41)
    at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:184)
    at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:187)
    at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:187)
    at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.<init>(PDPageTree.java:173)
    at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.<init>(PDPageTree.java:167)
    at org.apache.pdfbox.pdmodel.PDPageTree.iterator(PDPageTree.java:126)
    at org.apache.pdfbox.text.PDFTextStripper.processPages(PDFTextStripper.java:289)
    at org.apache.pdfbox.text.PDFTextStripper.writeText(PDFTextStripper.java:241)
    at org.apache.pdfbox.tools.ExtractText.extractPages(ExtractText.java:364)
    at org.apache.pdfbox.tools.ExtractText.startExtraction(ExtractText.java:267)
    at org.apache.pdfbox.tools.ExtractText.main(ExtractText.java:98)
    at org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:57)
{noformat}

> Corrupt PDF can lead to a StackOverflow
> ---------------------------------------
>
> Key: PDFBOX-5009
> URL: https://issues.apache.org/jira/browse/PDFBOX-5009
> Project: PDFBox
> Issue Type: Task
> Components: Text extraction
> Affects Versions: 2.0.21
> Reporter: Tim Allison
> Priority: Minor
>
> See TIKA-3224. I confirmed this with 2.0.21 by calling the app's ExtractText
> on the file posted on the Tika issue.
> cc [~dadoonet]
[jira] [Commented] (PDFBOX-3953) StackOverflowError in org.apache.pdfbox.pdmodel.PDPageTree.getKids
[ https://issues.apache.org/jira/browse/PDFBOX-3953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226417#comment-17226417 ] Tim Allison commented on PDFBOX-3953: - Related? > StackOverflowError in org.apache.pdfbox.pdmodel.PDPageTree.getKids > -- > > Key: PDFBOX-3953 > URL: https://issues.apache.org/jira/browse/PDFBOX-3953 > Project: PDFBox > Issue Type: Bug > Components: PDModel >Affects Versions: 2.0.7 >Reporter: Jorge Spinsanti >Priority: Major > > I got an StackOverflowError in > org.apache.pdfbox.pdmodel.PDPageTree.getKids(PDPageTree.java:135) > {code} > java.lang.StackOverflowError > at org.apache.pdfbox.pdmodel.PDPageTree.getKids(PDPageTree.java:135) > at org.apache.pdfbox.pdmodel.PDPageTree.access$200(PDPageTree.java:38) > at > org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:166) > at > org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169) > at > org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169) > at > org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169) > at > org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169) > at > org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169) > at > org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169) > at > org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169) > at > org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169) > at > org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169) > at > org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169) > ... > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Updated] (PDFBOX-5009) Corrupt PDF can lead to a StackOverflow
[ https://issues.apache.org/jira/browse/PDFBOX-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maruan Sahyoun updated PDFBOX-5009: --- Affects Version/s: 2.0.21 > Corrupt PDF can lead to a StackOverflow > --- > > Key: PDFBOX-5009 > URL: https://issues.apache.org/jira/browse/PDFBOX-5009 > Project: PDFBox > Issue Type: Task > Components: Text extraction >Affects Versions: 2.0.21 >Reporter: Tim Allison >Priority: Minor > > See TIKA-3224. I confirmed this with 2.0.21 by calling the app's ExtractText > on the file posted on the Tika issue. > cc [~dadoonet] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Updated] (PDFBOX-5009) Corrupt PDF can lead to a StackOverflow
[ https://issues.apache.org/jira/browse/PDFBOX-5009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maruan Sahyoun updated PDFBOX-5009: --- Component/s: Text extraction > Corrupt PDF can lead to a StackOverflow > --- > > Key: PDFBOX-5009 > URL: https://issues.apache.org/jira/browse/PDFBOX-5009 > Project: PDFBox > Issue Type: Task > Components: Text extraction >Reporter: Tim Allison >Priority: Minor > > See TIKA-3224. I confirmed this with 2.0.21 by calling the app's ExtractText > on the file posted on the Tika issue. > cc [~dadoonet] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Created] (PDFBOX-5009) Corrupt PDF can lead to a StackOverflow
Tim Allison created PDFBOX-5009: --- Summary: Corrupt PDF can lead to a StackOverflow Key: PDFBOX-5009 URL: https://issues.apache.org/jira/browse/PDFBOX-5009 Project: PDFBox Issue Type: Task Reporter: Tim Allison See TIKA-3224. I confirmed this with 2.0.21 by calling the app's ExtractText on the file posted on the Tika issue. cc [~dadoonet] -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Commented] (PDFBOX-3953) StackOverflowError in org.apache.pdfbox.pdmodel.PDPageTree.getKids
[ https://issues.apache.org/jira/browse/PDFBOX-3953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226357#comment-17226357 ]

Michael Klink commented on PDFBOX-3953:
---------------------------------------

The PDF file embedded in that docx file actually appears to originally have been a 4509210 bytes long PDF, the first 4496523 bytes of which have been overwritten with a different PDF (a linearized PDF-1.3 file with cross reference streams... ahem).

Thus, the cross reference table of the original file points to completely random locations in the slightly smaller file. This can result in arbitrary exceptions...

> StackOverflowError in org.apache.pdfbox.pdmodel.PDPageTree.getKids
> ------------------------------------------------------------------
>
> Key: PDFBOX-3953
> URL: https://issues.apache.org/jira/browse/PDFBOX-3953
> Project: PDFBox
> Issue Type: Bug
> Components: PDModel
> Affects Versions: 2.0.7
> Reporter: Jorge Spinsanti
> Priority: Major
>
> I got an StackOverflowError in
> org.apache.pdfbox.pdmodel.PDPageTree.getKids(PDPageTree.java:135)
> {code}
> java.lang.StackOverflowError
>     at org.apache.pdfbox.pdmodel.PDPageTree.getKids(PDPageTree.java:135)
>     at org.apache.pdfbox.pdmodel.PDPageTree.access$200(PDPageTree.java:38)
>     at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:166)
>     at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169)
>     at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169)
>     at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169)
>     at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169)
>     at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169)
>     at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169)
>     at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169)
>     at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169)
>     at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169)
>     at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169)
>     ...
> {code}
[jira] [Commented] (PDFBOX-1529) Exchange hard-coded values for variables and provide command-line options in TextToPDF component
[ https://issues.apache.org/jira/browse/PDFBOX-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226319#comment-17226319 ] Tilman Hausherr commented on PDFBOX-1529: - 4) has been done in PDFBOX-4025. > Exchange hard-coded values for variables and provide command-line options in > TextToPDF component > > > Key: PDFBOX-1529 > URL: https://issues.apache.org/jira/browse/PDFBOX-1529 > Project: PDFBox > Issue Type: Improvement > Components: Utilities >Affects Versions: 1.7.1 >Reporter: Dave Powell >Assignee: Andreas Lehmkühler >Priority: Minor > Labels: features, newbie, patch > Attachments: > patch-pdfbox-src-main-java-org-apache-pdfbox-TextToPDF.java.diff > > > Exchange hard-coded values for variables and provide command-line options in > TextToPDF component > 1) Enable the margins to be individually set from the command-line > 2) Enable the font size to be represented as a floating-point value, e.g. > 10.5 or 11.5 > 3) Allow the line-spacing to be changed from the command-line > 4) Allow the page size to be changed from the command-line, e.g. A4, A3, > US-Letter > I will provide a patch for review for this added functionality -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Commented] (PDFBOX-5008) Wrong page dimensions
[ https://issues.apache.org/jira/browse/PDFBOX-5008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226318#comment-17226318 ]

Tilman Hausherr commented on PDFBOX-5008:
-----------------------------------------

{code}
PDRectangle mediaBox = doc.getPage(0).getMediaBox();
System.out.println(doc.getPage(0).getMediaBox() + " " + mediaBox.getHeight() / mediaBox.getWidth());
{code}
output:
{noformat}
[0.0,0.0,595.0,842.0] 1.4151261
{noformat}
You won't find the MediaBox in the page's COSDictionary because it is defined higher up in the page tree and inherited by the page (this is unusual, but allowed).

> Wrong page dimensions
> ---------------------
>
> Key: PDFBOX-5008
> URL: https://issues.apache.org/jira/browse/PDFBOX-5008
> Project: PDFBox
> Issue Type: Bug
> Components: PDModel
> Affects Versions: 2.0.21
> Environment: Java 11, Windows 10
> Reporter: m m
> Priority: Major
>
> For certain PDF files the dimensions seem incorrect when read, in comparison
> to what other tools like Adobe Acrobat Reader gives (when inspecting document
> properties).
> I will attach a PDF file as an example. With Acrobat Reader i get the normal
> page dimensions (210mm/297mm = *1.41*), but with PDFBox for each page, with
> Crop box and Media box i get 612/792 = *1.29*.
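To make the "higher up" remark concrete, here is a minimal sketch of an inherited-attribute lookup on a hypothetical simplified node type (a dictionary plus a parent link). `InheritedBox`, `Node`, `lookupInherited` and `aspectRatio` are illustrative names, not PDFBox API. The idea shown: when a page's own dictionary has no MediaBox entry, walk up the parent chain until an ancestor supplies one.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical illustration of an inheritable page attribute; not PDFBox code. */
public final class InheritedBox {
    private InheritedBox() {}

    /** Simplified node: its own entries plus a link to the parent (null at the root). */
    public static final class Node {
        final Map<String, float[]> dict = new HashMap<>();
        final Node parent;

        public Node(Node parent) { this.parent = parent; }

        public Node put(String key, float... box) { dict.put(key, box); return this; }

        public boolean hasOwn(String key) { return dict.containsKey(key); }
    }

    /** Returns the nearest value for key on the node or its ancestors, or null if none. */
    public static float[] lookupInherited(Node node, String key) {
        for (Node n = node; n != null; n = n.parent) {
            float[] value = n.dict.get(key);
            if (value != null) {
                return value;
            }
        }
        return null;
    }

    /** Height divided by width for a box given as [llx, lly, urx, ury]. */
    public static float aspectRatio(float[] box) {
        return (box[3] - box[1]) / (box[2] - box[0]);
    }
}
```

For a 595 x 842 box this ratio comes out near 1.415, matching the A4 proportion of 297/210, whereas a 612 x 792 (US Letter) box gives about 1.29.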
[jira] [Closed] (PDFBOX-5007) Content not visible after merging pdf with another pdf
[ https://issues.apache.org/jira/browse/PDFBOX-5007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr closed PDFBOX-5007. --- Resolution: Not A Bug > Content not visible after merging pdf with another pdf > -- > > Key: PDFBOX-5007 > URL: https://issues.apache.org/jira/browse/PDFBOX-5007 > Project: PDFBox > Issue Type: Bug > Components: AcroForm >Reporter: Ashish Yadav >Priority: Major > Labels: XFA > Attachments: Accredo pdf blank .pdf, image-2020-11-04-11-11-50-629.png > > > Pdf content is not visible after merging the pdf with another pdf. Please > find the attached error message while viewing the pdf file. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Reopened] (PDFBOX-5007) Content not visible after merging pdf with another pdf
[ https://issues.apache.org/jira/browse/PDFBOX-5007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr reopened PDFBOX-5007: - > Content not visible after merging pdf with another pdf > -- > > Key: PDFBOX-5007 > URL: https://issues.apache.org/jira/browse/PDFBOX-5007 > Project: PDFBox > Issue Type: Bug > Components: AcroForm >Reporter: Ashish Yadav >Priority: Major > Labels: XFA > Attachments: Accredo pdf blank .pdf, image-2020-11-04-11-11-50-629.png > > > Pdf content is not visible after merging the pdf with another pdf. Please > find the attached error message while viewing the pdf file. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Commented] (PDFBOX-3953) StackOverflowError in org.apache.pdfbox.pdmodel.PDPageTree.getKids
[ https://issues.apache.org/jira/browse/PDFBOX-3953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226037#comment-17226037 ] Mathaus Erich Ulbrich commented on PDFBOX-3953: --- I catch the same problem when using Apache Tika in Elasticsearch to extract an embedded PDF in word file. https://discuss.elastic.co/t/stackoverflow-on-elasticsearch-file-indexation-with-ingest-attachment/253455/4 > StackOverflowError in org.apache.pdfbox.pdmodel.PDPageTree.getKids > -- > > Key: PDFBOX-3953 > URL: https://issues.apache.org/jira/browse/PDFBOX-3953 > Project: PDFBox > Issue Type: Bug > Components: PDModel >Affects Versions: 2.0.7 >Reporter: Jorge Spinsanti >Priority: Major > > I got an StackOverflowError in > org.apache.pdfbox.pdmodel.PDPageTree.getKids(PDPageTree.java:135) > {code} > java.lang.StackOverflowError > at org.apache.pdfbox.pdmodel.PDPageTree.getKids(PDPageTree.java:135) > at org.apache.pdfbox.pdmodel.PDPageTree.access$200(PDPageTree.java:38) > at > org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:166) > at > org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169) > at > org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169) > at > org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169) > at > org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169) > at > org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169) > at > org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169) > at > org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169) > at > org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169) > at > org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169) > at > org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.enqueueKids(PDPageTree.java:169) > ... 
> {code}
[jira] [Resolved] (PDFBOX-5007) Content not visible after merging pdf with another pdf
[ https://issues.apache.org/jira/browse/PDFBOX-5007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Maruan Sahyoun resolved PDFBOX-5007. Resolution: Not A Bug > Content not visible after merging pdf with another pdf > -- > > Key: PDFBOX-5007 > URL: https://issues.apache.org/jira/browse/PDFBOX-5007 > Project: PDFBox > Issue Type: Bug > Components: AcroForm >Reporter: Ashish Yadav >Priority: Major > Labels: XFA > Attachments: Accredo pdf blank .pdf, image-2020-11-04-11-11-50-629.png > > > Pdf content is not visible after merging the pdf with another pdf. Please > find the attached error message while viewing the pdf file. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Commented] (PDFBOX-5007) Content not visible after merging pdf with another pdf
[ https://issues.apache.org/jira/browse/PDFBOX-5007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226020#comment-17226020 ]

Maruan Sahyoun commented on PDFBOX-5007:
----------------------------------------

Well - a dynamic XFA consists basically of a template, data and scripting. An XFA processor takes these and renders them into what you see on screen (using a supporting viewer), so the resulting document is generated on the fly. E.g. a template might have a definition for a single data line of a purchase order. Now, when data for multiple lines is supplied (or entered), the XFA processor takes the data, creates a runtime model of the template, and binds the data into it, producing what might now be a purchase order with several hundred lines.

What changes are required:
- use an XFA processor to render the XFA together with the data being held into a PDF document, or
- save the form in Adobe Form Designer as a *static* PDF. You can then merge it.

Added note: I'm working a lot with XFA based forms and workflows for customers. Using the proper software this can be a good technology. But pdfbox is not (and likely will not be) an XFA processor. For Firefox or others you need to ask the appropriate projects.

I will be closing the ticket - for further questions please do use the users mailing list https://pdfbox.apache.org/mailinglists.html but be aware that - again - pdfbox can't help.

> Content not visible after merging pdf with another pdf
> ------------------------------------------------------
>
> Key: PDFBOX-5007
> URL: https://issues.apache.org/jira/browse/PDFBOX-5007
> Project: PDFBox
> Issue Type: Bug
> Components: AcroForm
> Reporter: Ashish Yadav
> Priority: Major
> Labels: XFA
> Attachments: Accredo pdf blank .pdf, image-2020-11-04-11-11-50-629.png
>
> Pdf content is not visible after merging the pdf with another pdf. Please
> find the attached error message while viewing the pdf file.
[jira] [Comment Edited] (PDFBOX-5007) Content not visible after merging pdf with another pdf
[ https://issues.apache.org/jira/browse/PDFBOX-5007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226010#comment-17226010 ]

Ashish Yadav edited comment on PDFBOX-5007 at 11/4/20, 12:01 PM:
-----------------------------------------------------------------

Can you explain what is meant by rendering the XFA first into a regular PDF? What changes are required to achieve this?

was (Author: 703251012):
Can you explain what is meant by rendering the XFA first into a regular PDF?

> Content not visible after merging pdf with another pdf
> ------------------------------------------------------
>
> Key: PDFBOX-5007
> URL: https://issues.apache.org/jira/browse/PDFBOX-5007
> Project: PDFBox
> Issue Type: Bug
> Components: AcroForm
> Reporter: Ashish Yadav
> Priority: Major
> Labels: XFA
> Attachments: Accredo pdf blank .pdf, image-2020-11-04-11-11-50-629.png
>
> Pdf content is not visible after merging the pdf with another pdf. Please
> find the attached error message while viewing the pdf file.
[jira] [Commented] (PDFBOX-5007) Content not visible after merging pdf with another pdf
[ https://issues.apache.org/jira/browse/PDFBOX-5007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17226010#comment-17226010 ]

Ashish Yadav commented on PDFBOX-5007:
--------------------------------------

Can you explain what is meant by rendering the XFA first into a regular PDF?

> Content not visible after merging pdf with another pdf
> ------------------------------------------------------
>
> Key: PDFBOX-5007
> URL: https://issues.apache.org/jira/browse/PDFBOX-5007
> Project: PDFBox
> Issue Type: Bug
> Components: AcroForm
> Reporter: Ashish Yadav
> Priority: Major
> Labels: XFA
> Attachments: Accredo pdf blank .pdf, image-2020-11-04-11-11-50-629.png
>
> Pdf content is not visible after merging the pdf with another pdf. Please
> find the attached error message while viewing the pdf file.
[jira] [Closed] (PDFBOX-5008) Wrong page dimensions
[ https://issues.apache.org/jira/browse/PDFBOX-5008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] m m closed PDFBOX-5008. --- Resolution: Not A Bug PDF COSDictionary was missing MediaBox attribute > Wrong page dimensions > - > > Key: PDFBOX-5008 > URL: https://issues.apache.org/jira/browse/PDFBOX-5008 > Project: PDFBox > Issue Type: Bug > Components: PDModel >Affects Versions: 2.0.21 > Environment: Java 11, Windows 10 >Reporter: m m >Priority: Major > > For certain PDF files the dimensions seem incorrect when read, in comparison > to what other tools like Adobe Acrobat Reader gives (when inspecting document > properties). > I will attach a PDF file as an example. With Acrobat Reader i get the normal > page dimensions (210mm/297mm = *1.41*), but with PDFBox for each page, with > Crop box and Media box i get 612/792 = *1.29*. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Updated] (PDFBOX-5008) Wrong page dimensions
[ https://issues.apache.org/jira/browse/PDFBOX-5008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] m m updated PDFBOX-5008: Attachment: (was: draw-report.pdf) > Wrong page dimensions > - > > Key: PDFBOX-5008 > URL: https://issues.apache.org/jira/browse/PDFBOX-5008 > Project: PDFBox > Issue Type: Bug > Components: PDModel >Affects Versions: 2.0.21 > Environment: Java 11, Windows 10 >Reporter: m m >Priority: Major > > For certain PDF files the dimensions seem incorrect when read, in comparison > to what other tools like Adobe Acrobat Reader gives (when inspecting document > properties). > I will attach a PDF file as an example. With Acrobat Reader i get the normal > page dimensions (210mm/297mm = *1.41*), but with PDFBox for each page, with > Crop box and Media box i get 612/792 = *1.29*. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Commented] (PDFBOX-5007) Content not visible after merging pdf with another pdf
[ https://issues.apache.org/jira/browse/PDFBOX-5007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17225949#comment-17225949 ]

Maruan Sahyoun commented on PDFBOX-5007:
----------------------------------------

We cannot merge a dynamic XFA with another PDF. That would require us to do an XFA rendering first, which would be several months of development and not something we have the resources for. You need to render the XFA first into a regular PDF. With dynamic XFA, the PDF is only the container that holds the XFA content.

> Content not visible after merging pdf with another pdf
> ------------------------------------------------------
>
> Key: PDFBOX-5007
> URL: https://issues.apache.org/jira/browse/PDFBOX-5007
> Project: PDFBox
> Issue Type: Bug
> Components: AcroForm
> Reporter: Ashish Yadav
> Priority: Major
> Labels: XFA
> Attachments: Accredo pdf blank .pdf, image-2020-11-04-11-11-50-629.png
>
> Pdf content is not visible after merging the pdf with another pdf. Please
> find the attached error message while viewing the pdf file.
[jira] [Commented] (PDFBOX-5007) Content not visible after merging pdf with another pdf
[ https://issues.apache.org/jira/browse/PDFBOX-5007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17225938#comment-17225938 ]

Ashish Yadav commented on PDFBOX-5007:
--------------------------------------

Can you resolve the merging issue? We were facing a merging issue with a fillable PDF earlier, and it was resolved after a version update.

> Content not visible after merging pdf with another pdf
> ------------------------------------------------------
>
> Key: PDFBOX-5007
> URL: https://issues.apache.org/jira/browse/PDFBOX-5007
> Project: PDFBox
> Issue Type: Bug
> Components: AcroForm
> Reporter: Ashish Yadav
> Priority: Major
> Labels: XFA
> Attachments: Accredo pdf blank .pdf, image-2020-11-04-11-11-50-629.png
>
> Pdf content is not visible after merging the pdf with another pdf. Please
> find the attached error message while viewing the pdf file.
[jira] [Created] (PDFBOX-5008) Wrong page dimensions
m m created PDFBOX-5008: --- Summary: Wrong page dimensions Key: PDFBOX-5008 URL: https://issues.apache.org/jira/browse/PDFBOX-5008 Project: PDFBox Issue Type: Bug Components: PDModel Affects Versions: 2.0.21 Environment: Java 11, Windows 10 Reporter: m m Attachments: draw-report.pdf For certain PDF files the dimensions seem incorrect when read, in comparison to what other tools like Adobe Acrobat Reader gives (when inspecting document properties). I will attach a PDF file as an example. With Acrobat Reader i get the normal page dimensions (210mm/297mm = *1.41*), but with PDFBox for each page, with Crop box and Media box i get 612/792 = *1.29*. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org