[jira] [Updated] (PDFBOX-1861) Line is incorrectly dashed
[ https://issues.apache.org/jira/browse/PDFBOX-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr updated PDFBOX-1861:
    Attachment: asy-gouraud.pdf-1-good.png

I had a look at other operators and noticed that they apply a transformation, but the dash pattern isn't transformed in SetLineDashPattern.java. Inserting this code
{code}
float[] dashArray = null;
if (!lineDashPattern.isDashPatternEmpty())
{
    dashArray = lineDashPattern.getCOSDashPattern().toFloatArray();
    Matrix ctm = context.getGraphicsState().getCurrentTransformationMatrix();
    if (ctm != null && ctm.getXScale() > 0)
    {
        // scale every dash length by the horizontal scale of the CTM
        for (int i = 0; i < dashArray.length; ++i)
        {
            dashArray[i] *= ctm.getXScale();
        }
    }
}
{code}
and using the resulting dashArray instead of lineDashPattern.getCOSDashPattern().toFloatArray() further down produces the attached image (which includes my change from PDFBOX-615).

> Line is incorrectly dashed
> --
>
> Key: PDFBOX-1861
> URL: https://issues.apache.org/jira/browse/PDFBOX-1861
> Project: PDFBox
> Issue Type: Bug
> Affects Versions: 2.0.0
> Environment: W7
> Reporter: Tilman Hausherr
> Priority: Minor
> Attachments: asy-gouraud.pdf, asy-gouraud.pdf-1-good.png, asy-gouraud.pdf-1-trunk.png
>
> The line in the attached page should be dashed differently than it is in the rendering.

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
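The idea in the comment above, scaling each dash length by the horizontal scale of the current transformation matrix, can be tried out without any PDFBox types. The following is only a minimal sketch under assumptions: `DashScaleSketch`, `scaleDashArray` and the plain `float xScale` parameter are hypothetical stand-ins for `lineDashPattern`, `Matrix` and `getXScale()`, and this is not the code that ended up in the repository.

```java
import java.util.Arrays;

// Hypothetical helper, not part of PDFBox: mirrors the loop proposed in the comment.
final class DashScaleSketch {

    // Returns a scaled copy of the dash array; the input array is not modified.
    // A null or empty pattern (solid line) is reported as null, like the snippet's default.
    static float[] scaleDashArray(float[] dashArray, float xScale) {
        if (dashArray == null || dashArray.length == 0) {
            return null;
        }
        float[] scaled = Arrays.copyOf(dashArray, dashArray.length);
        // Same guard as in the proposal: only scale when the x scale is positive.
        if (xScale > 0) {
            for (int i = 0; i < scaled.length; ++i) {
                scaled[i] *= xScale;
            }
        }
        return scaled;
    }

    public static void main(String[] args) {
        float[] pattern = {3f, 2f};
        System.out.println(Arrays.toString(scaleDashArray(pattern, 2f)));
    }
}
```

Working on a copy keeps the original pattern intact, so rendering the same graphics state at a different scale does not compound the factor.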
[jira] [Commented] (PDFBOX-1822) Signature byte range is Invalid
[ https://issues.apache.org/jira/browse/PDFBOX-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879493#comment-13879493 ]

Andreas Lehmkühler commented on PDFBOX-1822:

I'm still thinking about the limitation to the non-sequential parser. Is it allowed to mix xref tables and xref streams within a single PDF?

> Signature byte range is Invalid
> ---
>
> Key: PDFBOX-1822
> URL: https://issues.apache.org/jira/browse/PDFBOX-1822
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing, PDModel, PDModel.AcroForm, Signing
> Affects Versions: 1.8.3, 1.8.4, 2.0.0
> Reporter: vakhtang koroghlishvili
> Assignee: Andreas Lehmkühler
> Fix For: 1.8.4, 2.0.0
> Attachments: araxis-merge - compare two document.jpg, damaged-sig.jpg, unsigned-signed.pdf, unsigned.pdf, unsigned_signed_fix.pdf
>
> One person sent me an unsigned PDF document that he wanted to sign. When I try to sign it (using PDFBox), I run into a problem.
> After signing, Adobe Reader tells me "The signature byte range is invalid".
> I will attach the original and the signed document.
> I think it is a PDFBox parser error; other signature libraries sign the document without problems. I'm investigating the problem at the moment in order to fix it.
[jira] [Updated] (PDFBOX-1861) Line is incorrectly dashed
[ https://issues.apache.org/jira/browse/PDFBOX-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-1861: Attachment: asy-gouraud.pdf-1-trunk.png asy-gouraud.pdf PDF file taken from http://asymptote.sourceforge.net/gallery/ (clicking on the images brings the pdfs, clicking on the texts brings the source codes) > Line is incorrectly dashed > -- > > Key: PDFBOX-1861 > URL: https://issues.apache.org/jira/browse/PDFBOX-1861 > Project: PDFBox > Issue Type: Bug >Affects Versions: 2.0.0 > Environment: W7 >Reporter: Tilman Hausherr >Priority: Minor > Attachments: asy-gouraud.pdf, asy-gouraud.pdf-1-trunk.png > > > The line in the attached page should be dashed differently than it is in the > rendering. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (PDFBOX-1861) Line is incorrectly dashed
Tilman Hausherr created PDFBOX-1861: --- Summary: Line is incorrectly dashed Key: PDFBOX-1861 URL: https://issues.apache.org/jira/browse/PDFBOX-1861 Project: PDFBox Issue Type: Bug Affects Versions: 2.0.0 Environment: W7 Reporter: Tilman Hausherr Priority: Minor The line in the attached page should be dashed differently than it is in the rendering. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (PDFBOX-1860) HTML converter escapes formatting close tags
Cheng Leong created PDFBOX-1860: --- Summary: HTML converter escapes formatting close tags Key: PDFBOX-1860 URL: https://issues.apache.org/jira/browse/PDFBOX-1860 Project: PDFBox Issue Type: Bug Components: Text extraction Affects Versions: 1.8.3 Reporter: Cheng Leong Priority: Minor Attachments: pdftest.pdf Bug introduced by PDFBOX-1213 in 1.8.3 for HTML style information. Bold style tags are opened correctly, but the close tags are html-escaped. {noformat} ~/work/pdfbox ((1.8.3))$ java -jar app/target/pdfbox-app-1.8.3.jar ExtractText -html -nonSeq -console pdftest.pdf http://www.w3.org/TR/html4/loose.dtd";> 1725.PDF E:\M55\!\1725.fm 2003-01-01 18:15 P Tagg, IPM, University of Liverpool A VERY SMALL PDF FILE A VERY SMALL PDF FILE A VERY SMALL PDF FILE A VERY SMALL PDF FILE A VERY SMALL PDF FILE A VERY SMALL PDF FILE {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
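To make the reported symptom concrete, here is a small sketch. It is not the PDFBox source and makes no claim about where the bug actually sits; `HtmlTagSketch`, `escape`, `bold` and `boldWithEscapedCloseTag` are hypothetical names. It contrasts emitting the close tag raw with accidentally sending it through the HTML-escaping step.

```java
// Hypothetical illustration, not taken from PDFBox.
final class HtmlTagSketch {

    // Escapes the characters that are significant in HTML text content.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                case '"': sb.append("&quot;"); break;
                default: sb.append(c);
            }
        }
        return sb.toString();
    }

    // Expected behaviour: markup is written raw, only the content is escaped.
    static String bold(String text) {
        return "<b>" + escape(text) + "</b>";
    }

    // Reproduces the symptom: the close tag is escaped together with the content.
    static String boldWithEscapedCloseTag(String text) {
        return "<b>" + escape(text + "</b>");
    }

    public static void main(String[] args) {
        System.out.println(bold("A VERY SMALL PDF FILE"));
        System.out.println(boldWithEscapedCloseTag("A VERY SMALL PDF FILE"));
    }
}
```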
[jira] [Updated] (PDFBOX-1860) HTML converter escapes formatting close tags
[ https://issues.apache.org/jira/browse/PDFBOX-1860?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Leong updated PDFBOX-1860: Attachment: pdftest.pdf > HTML converter escapes formatting close tags > > > Key: PDFBOX-1860 > URL: https://issues.apache.org/jira/browse/PDFBOX-1860 > Project: PDFBox > Issue Type: Bug > Components: Text extraction >Affects Versions: 1.8.3 >Reporter: Cheng Leong >Priority: Minor > Attachments: pdftest.pdf > > > Bug introduced by PDFBOX-1213 in 1.8.3 for HTML style information. > Bold style tags are opened correctly, but the close tags are html-escaped. > {noformat} > ~/work/pdfbox ((1.8.3))$ java -jar app/target/pdfbox-app-1.8.3.jar > ExtractText -html -nonSeq -console pdftest.pdf > "http://www.w3.org/TR/html4/loose.dtd";> > 1725.PDF > > > > E:\M55\!\1725.fm 2003-01-01 18:15 P Tagg, > IPM, University of Liverpool > > A VERY SMALL PDF FILE > > A VERY SMALL PDF FILE > > A VERY SMALL PDF FILE > > A VERY SMALL PDF FILE > > A VERY SMALL PDF FILE > > A VERY SMALL PDF FILE > > > {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Resolved] (PDFBOX-1822) Signature byte range is Invalid
[ https://issues.apache.org/jira/browse/PDFBOX-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Lehmkühler resolved PDFBOX-1822. Resolution: Fixed Fix Version/s: 2.0.0 1.8.4 Assignee: Andreas Lehmkühler I merged the changes into the 1.8 branch in revision 1560498. The fix is limited to the non-sequential parser. > Signature byte range is Invalid > --- > > Key: PDFBOX-1822 > URL: https://issues.apache.org/jira/browse/PDFBOX-1822 > Project: PDFBox > Issue Type: Bug > Components: Parsing, PDModel, PDModel.AcroForm, Signing >Affects Versions: 1.8.3, 1.8.4, 2.0.0 >Reporter: vakhtang koroghlishvili >Assignee: Andreas Lehmkühler > Fix For: 1.8.4, 2.0.0 > > Attachments: araxis-merge - compare two document.jpg, > damaged-sig.jpg, unsigned-signed.pdf, unsigned.pdf, unsigned_signed_fix.pdf > > > On person send me a unsigned PDF document. He wanted to sign it. When I try > to sign it (using pad box), I have some problem. > After signing adobe reader tells me "The signature byre range is invalid". > I will attach original and signed document. > I think, it is PDF box parser error. another signature libraries sign > document very well. I'm searching the problem at the moment, in order to fix > it. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
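For readers unfamiliar with the Adobe Reader message quoted in this issue: a signature's ByteRange entry is an array of (offset, length) pairs that describe which bytes of the file are covered by the digest. The sketch below shows the kind of sanity check a validator might run. The exact rules Adobe Reader applies are not stated in this thread, so the checks here (ranges in order, first range starting at 0, last range ending at the end of the file) are assumptions, and `ByteRangeSketch` and `looksConsistent` are hypothetical names.

```java
// Hypothetical sanity check, not PDFBox or Adobe code.
final class ByteRangeSketch {

    // byteRange holds (offset, length) pairs, for example [0, a, b, c].
    static boolean looksConsistent(long[] byteRange, long fileLength) {
        if (byteRange == null || byteRange.length == 0 || byteRange.length % 2 != 0) {
            return false;
        }
        long previousEnd = 0;
        for (int i = 0; i < byteRange.length; i += 2) {
            long offset = byteRange[i];
            long length = byteRange[i + 1];
            // Reject negative values and ranges that overlap or are out of order.
            if (offset < previousEnd || length < 0) {
                return false;
            }
            previousEnd = offset + length;
        }
        // Assumed rule: coverage starts at byte 0 and the last range ends at EOF.
        return byteRange[0] == 0 && previousEnd == fileLength;
    }

    public static void main(String[] args) {
        System.out.println(looksConsistent(new long[] {0, 100, 200, 50}, 250));
    }
}
```

A parser that misjudges where objects or the cross-reference data end up in the saved file could plausibly produce offsets that fail such a check.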
[jira] [Comment Edited] (PDFBOX-1822) Signature byte range is Invalid
[ https://issues.apache.org/jira/browse/PDFBOX-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879140#comment-13879140 ] Andreas Lehmkühler edited comment on PDFBOX-1822 at 1/22/14 8:24 PM: - I merged the changes into the 1.8 branch in revision 1560498. The fix is limited to the non-sequential parser. Thanks for all the input and help! was (Author: lehmi): I merged the changes into the 1.8 branch in revision 1560498. The fix is limited to the non-sequential parser. > Signature byte range is Invalid > --- > > Key: PDFBOX-1822 > URL: https://issues.apache.org/jira/browse/PDFBOX-1822 > Project: PDFBox > Issue Type: Bug > Components: Parsing, PDModel, PDModel.AcroForm, Signing >Affects Versions: 1.8.3, 1.8.4, 2.0.0 >Reporter: vakhtang koroghlishvili >Assignee: Andreas Lehmkühler > Fix For: 1.8.4, 2.0.0 > > Attachments: araxis-merge - compare two document.jpg, > damaged-sig.jpg, unsigned-signed.pdf, unsigned.pdf, unsigned_signed_fix.pdf > > > On person send me a unsigned PDF document. He wanted to sign it. When I try > to sign it (using pad box), I have some problem. > After signing adobe reader tells me "The signature byre range is invalid". > I will attach original and signed document. > I think, it is PDF box parser error. another signature libraries sign > document very well. I'm searching the problem at the moment, in order to fix > it. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
Re: DeviceN/Separation JPEGs
Hi,

On 21.01.2014 23:29, Andreas Lehmkuehler wrote:
> Hi,
>
> On 21.01.2014 23:08, John Hewson wrote:
>> Does anyone have any PDF files with JPEGs that use DeviceN or Separation color spaces?
>
> I'm pretty sure that I have one, but I can't find it. I'll continue searching tomorrow.

I wrote a little search tool and finally found some PDFs. Attached is a list of PDFs using images with a Separation color space. Those PDFs should all be attached to the referenced JIRA issue.

>> I’m trying to test out some new code...
>>
>> Thanks
>>
>> -- John
>
> BR
> Andreas Lehmkühler

BR
Andreas Lehmkühler

PDFBOX1095-receipt.pdf: Found PDPixelmap using Separation: Im0 on page 1
PDFBOX1116-Flyer_A4_Zubehoer_10_Prozent_300dpi.pdf: Found PDJpeg using Separation: Im25 on page 2
PDFBOX1307-invertedImage.pdf: Found PDJpeg using Separation: Im1 on page 1
PDFBOX1307-invertedImage.pdf: Found PDJpeg using Separation: Im2 on page 1
PDFBOX1610-kitest3_2904_01.6710338.0.pdf: Found PDPixelmap using Separation: Im2 on page 1
PDFBOX1610-kitest3_2904_01.6710338.0.pdf: Found PDPixelmap using Separation: Im3 on page 1
PDFBOX1610-kitest3_2904_01.6710338.0.pdf: Found PDPixelmap using Separation: Im61 on page 1
PDFBOX1522-Traveler20120822.pdf: Found PDPixelmap using Separation: Im1 on page 1
PDFBOX1522-Traveler20120822.pdf: Found PDJpeg using Separation: Im17 on page 3
PDFBOX1522-Traveler20120822.pdf: Found PDJpeg using Separation: Im205 on page 8
PDFBOX1522-Traveler20120822.pdf: Found PDPixelmap using Separation: Im208 on page 8
PDFBOX1522-Traveler20120822.pdf: Found PDJpeg using Separation: Im209 on page 8
PDFBOX1522-Traveler20120822.pdf: Found PDPixelmap using Separation: Im210 on page 8
PDFBOX833-sample.pdf: Found PDPixelmap using Separation: Im6 on page 1
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im1 on page 6
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im2 on page 13
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im3 on page 15
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im4 on page 16
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im5 on page 17
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im6 on page 18
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im7 on page 19
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im8 on page 20
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im9 on page 21
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im10 on page 22
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im11 on page 23
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im12 on page 24
PDFBOX1437-AA.pdf: Found PDJpeg using Separation: Im13 on page 25
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im5 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im4 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im7 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im6 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDPixelmap using Separation: Im9 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDPixelmap using Separation: Im8 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im10 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im27 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im0 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDPixelmap using Separation: Im13 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDPixelmap using Separation: Im26 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im1 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDPixelmap using Separation: Im14 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDPixelmap using Separation: Im29 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im2 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDPixelmap using Separation: Im11 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im3 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im28 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im12 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im23 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im17 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im22 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im18 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDPixelmap using Separation: Im25 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDPixelmap using Separation: Im15 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im24 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im16 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im21 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDPixelmap using Separation: Im19 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im20 on page 2
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im5 on page 3
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im4 on page 3
PDFBOX1691-FORIS-HV.pdf: Found PDJpeg using Separation: Im7 on page 3
PDFBOX1691-FORIS-HV.pdf: F
[jira] [Commented] (PDFBOX-1822) Signature byte range is Invalid
[ https://issues.apache.org/jira/browse/PDFBOX-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879123#comment-13879123 ]

vakhtang koroghlishvili commented on PDFBOX-1822:

Álison Fernandes, I have tested your issue too. As far as I can see, everything works.

> Signature byte range is Invalid
> ---
>
> Key: PDFBOX-1822
> URL: https://issues.apache.org/jira/browse/PDFBOX-1822
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing, PDModel, PDModel.AcroForm, Signing
> Affects Versions: 1.8.3, 1.8.4, 2.0.0
> Reporter: vakhtang koroghlishvili
> Attachments: araxis-merge - compare two document.jpg, damaged-sig.jpg, unsigned-signed.pdf, unsigned.pdf, unsigned_signed_fix.pdf
>
> One person sent me an unsigned PDF document that he wanted to sign. When I try to sign it (using PDFBox), I run into a problem.
> After signing, Adobe Reader tells me "The signature byte range is invalid".
> I will attach the original and the signed document.
> I think it is a PDFBox parser error; other signature libraries sign the document without problems. I'm investigating the problem at the moment in order to fix it.
[jira] [Comment Edited] (PDFBOX-1822) Signature byte range is Invalid
[ https://issues.apache.org/jira/browse/PDFBOX-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879109#comment-13879109 ]

vakhtang koroghlishvili edited comment on PDFBOX-1822 at 1/22/14 8:13 PM:

I have attached the fixed version of the document.

was (Author: v.koroghlishvili):
I attach fixed version document.

> Signature byte range is Invalid
> ---
>
> Key: PDFBOX-1822
> URL: https://issues.apache.org/jira/browse/PDFBOX-1822
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing, PDModel, PDModel.AcroForm, Signing
> Affects Versions: 1.8.3, 1.8.4, 2.0.0
> Reporter: vakhtang koroghlishvili
> Attachments: araxis-merge - compare two document.jpg, damaged-sig.jpg, unsigned-signed.pdf, unsigned.pdf, unsigned_signed_fix.pdf
>
> One person sent me an unsigned PDF document that he wanted to sign. When I try to sign it (using PDFBox), I run into a problem.
> After signing, Adobe Reader tells me "The signature byte range is invalid".
> I will attach the original and the signed document.
> I think it is a PDFBox parser error; other signature libraries sign the document without problems. I'm investigating the problem at the moment in order to fix it.
[jira] [Issue Comment Deleted] (PDFBOX-1857) Attachment damages singature
[ https://issues.apache.org/jira/browse/PDFBOX-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] vakhtang koroghlishvili updated PDFBOX-1857: Comment: was deleted (was: that is result of file. everything is well.) > Attachment damages singature > > > Key: PDFBOX-1857 > URL: https://issues.apache.org/jira/browse/PDFBOX-1857 > Project: PDFBox > Issue Type: Bug > Components: Parsing, PDFReader, Signing, Utilities, Writing >Affects Versions: 1.8.3, 1.8.4, 2.0.0 >Reporter: jack >Assignee: Thomas Chojecki >Priority: Blocker > Attachments: attach.txt, original.pdf, original[signed].pdf, > original[with-attachment].pdf, original[with-attachment][signed].pdf, > original[with-attachment]_signed.pdf > > > I have PDF document. > 1) Adobe reader reads document well. > 2) I sign document (using pdfbox-examples) and everything is well > 3) Then I try to attach file to original PDF (Code is written in the pdfbox > web page - in the cookBook). > 4) Adobe reader reads attached document well. everything is well. > 5) Now I have document with attachment. > 6) I try to sign that document (I mean document with attachment). And I have > 2 problem: > First: > when I open document, Adobe reader tells me that signature byte range is > invalid. > Second: > when I try to close document (I mean to close adobe reader), Adobe reader > tells me that: > Do you want to save changes to "original[with-attachment][signed]" before > closing? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (PDFBOX-1857) Attachment damages singature
[ https://issues.apache.org/jira/browse/PDFBOX-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vakhtang koroghlishvili updated PDFBOX-1857:
    Attachment: original[with-attachment]_signed.pdf

This is the resulting file. Everything works.

> Attachment damages singature
> ---
>
> Key: PDFBOX-1857
> URL: https://issues.apache.org/jira/browse/PDFBOX-1857
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing, PDFReader, Signing, Utilities, Writing
> Affects Versions: 1.8.3, 1.8.4, 2.0.0
> Reporter: jack
> Assignee: Thomas Chojecki
> Priority: Blocker
> Attachments: attach.txt, original.pdf, original[signed].pdf, original[with-attachment].pdf, original[with-attachment][signed].pdf, original[with-attachment]_signed.pdf
>
> I have a PDF document.
> 1) Adobe Reader reads the document well.
> 2) I sign the document (using pdfbox-examples) and everything is fine.
> 3) Then I attach a file to the original PDF (the code is from the cookbook on the PDFBox web page).
> 4) Adobe Reader reads the document with the attachment well.
> 5) Now I have a document with an attachment.
> 6) I try to sign that document (the one with the attachment), and I have two problems:
> First: when I open the document, Adobe Reader tells me that the signature byte range is invalid.
> Second: when I try to close the document (that is, close Adobe Reader), Adobe Reader asks me: Do you want to save changes to "original[with-attachment][signed]" before closing?
[jira] [Commented] (PDFBOX-1857) Attachment damages singature
[ https://issues.apache.org/jira/browse/PDFBOX-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879110#comment-13879110 ]

vakhtang koroghlishvili commented on PDFBOX-1857:

The patch for PDFBOX-1822 also solves this issue. I have tested it, and this issue is already fixed.

> Attachment damages singature
> ---
>
> Key: PDFBOX-1857
> URL: https://issues.apache.org/jira/browse/PDFBOX-1857
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing, PDFReader, Signing, Utilities, Writing
> Affects Versions: 1.8.3, 1.8.4, 2.0.0
> Reporter: jack
> Assignee: Thomas Chojecki
> Priority: Blocker
> Attachments: attach.txt, original.pdf, original[signed].pdf, original[with-attachment].pdf, original[with-attachment][signed].pdf, original[with-attachment]_signed.pdf
>
> I have a PDF document.
> 1) Adobe Reader reads the document well.
> 2) I sign the document (using pdfbox-examples) and everything is fine.
> 3) Then I attach a file to the original PDF (the code is from the cookbook on the PDFBox web page).
> 4) Adobe Reader reads the document with the attachment well.
> 5) Now I have a document with an attachment.
> 6) I try to sign that document (the one with the attachment), and I have two problems:
> First: when I open the document, Adobe Reader tells me that the signature byte range is invalid.
> Second: when I try to close the document (that is, close Adobe Reader), Adobe Reader asks me: Do you want to save changes to "original[with-attachment][signed]" before closing?
[jira] [Updated] (PDFBOX-1822) Signature byte range is Invalid
[ https://issues.apache.org/jira/browse/PDFBOX-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vakhtang koroghlishvili updated PDFBOX-1822:
    Attachment: unsigned_signed_fix.pdf

I am attaching the fixed version of the document.

> Signature byte range is Invalid
> ---
>
> Key: PDFBOX-1822
> URL: https://issues.apache.org/jira/browse/PDFBOX-1822
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing, PDModel, PDModel.AcroForm, Signing
> Affects Versions: 1.8.3, 1.8.4, 2.0.0
> Reporter: vakhtang koroghlishvili
> Attachments: araxis-merge - compare two document.jpg, damaged-sig.jpg, unsigned-signed.pdf, unsigned.pdf, unsigned_signed_fix.pdf
>
> One person sent me an unsigned PDF document that he wanted to sign. When I try to sign it (using PDFBox), I run into a problem.
> After signing, Adobe Reader tells me "The signature byte range is invalid".
> I will attach the original and the signed document.
> I think it is a PDFBox parser error; other signature libraries sign the document without problems. I'm investigating the problem at the moment in order to fix it.
[jira] [Commented] (PDFBOX-1822) Signature byte range is Invalid
[ https://issues.apache.org/jira/browse/PDFBOX-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879107#comment-13879107 ]

vakhtang koroghlishvili commented on PDFBOX-1822:

I have tested it and everything works. Andreas, thank you for this patch. :)

> Signature byte range is Invalid
> ---
>
> Key: PDFBOX-1822
> URL: https://issues.apache.org/jira/browse/PDFBOX-1822
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing, PDModel, PDModel.AcroForm, Signing
> Affects Versions: 1.8.3, 1.8.4, 2.0.0
> Reporter: vakhtang koroghlishvili
> Attachments: araxis-merge - compare two document.jpg, damaged-sig.jpg, unsigned-signed.pdf, unsigned.pdf
>
> One person sent me an unsigned PDF document that he wanted to sign. When I try to sign it (using PDFBox), I run into a problem.
> After signing, Adobe Reader tells me "The signature byte range is invalid".
> I will attach the original and the signed document.
> I think it is a PDFBox parser error; other signature libraries sign the document without problems. I'm investigating the problem at the moment in order to fix it.
Re: Error printing...
Yep, you’ll need to create a JavaFX bundle in the usual manner:
http://docs.oracle.com/javafx/2/deployment/packaging.htm

-- John

On 22 Jan 2014, at 11:36, Alin Mazilu wrote:

> Thank you for your quick responses, but the application is a JavaFX self contained application packaged with the JRE and is independent of the JRE installed on the OS. So I think I need to package the JAI libraries but I have no idea how :D Any thoughts?
>
> Thank you,
>
> Alin
>
> On Wed, Jan 22, 2014 at 1:48 PM, John Hewson wrote:
>
>> Yes, there is. Simply Google "JBIG2 plugin" and follow the first link, it will be called "jbig2-imageio".
>>
>> -- John
>>
>> On 22 Jan 2014, at 09:16, Alin Mazilu wrote:
>>
>>> Hello all,
>>>
>>> I am printing some PDFs and I am getting this:
>>>
>>> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
>>> SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded datastream.
>>> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
>>> SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
>>> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke process
>>> WARNING: getRGBImage returned NULL
>>> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
>>> SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded datastream.
>>> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
>>> SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
>>> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke process
>>> WARNING: getRGBImage returned NULL
>>> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
>>> SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded datastream.
>>> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
>>> SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
>>> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke process
>>> WARNING: getRGBImage returned NULL
>>>
>>> Is there a quick way to fix this? Is there a JBIG2 plugin? I really need to fix it today or I'm in trouble. :)
>>>
>>> Thank you,
>>>
>>> Alin
Re: Error printing...
Thank you for your quick responses, but the application is a JavaFX self contained application packaged with the JRE and is independent of the JRE installed on the OS. So I think I need to package the JAI libraries but I have no idea how :D Any thoughts?

Thank you,

Alin

On Wed, Jan 22, 2014 at 1:48 PM, John Hewson wrote:

> Yes, there is. Simply Google "JBIG2 plugin" and follow the first link, it will be called "jbig2-imageio".
>
> -- John
>
> On 22 Jan 2014, at 09:16, Alin Mazilu wrote:
>
>> Hello all,
>>
>> I am printing some PDFs and I am getting this:
>>
>> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
>> SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded datastream.
>> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
>> SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
>> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke process
>> WARNING: getRGBImage returned NULL
>> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
>> SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded datastream.
>> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
>> SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
>> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke process
>> WARNING: getRGBImage returned NULL
>> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
>> SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded datastream.
>> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
>> SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
>> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke process
>> WARNING: getRGBImage returned NULL
>>
>> Is there a quick way to fix this? Is there a JBIG2 plugin? I really need to fix it today or I'm in trouble. :)
>>
>> Thank you,
>>
>> Alin
[jira] [Commented] (PDFBOX-1822) Signature byte range is Invalid
[ https://issues.apache.org/jira/browse/PDFBOX-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879023#comment-13879023 ]

Andreas Lehmkühler commented on PDFBOX-1822:

I changed the implementation of COSDocument#isXRefStream in revision 1560474. It no longer relies on the trailer dictionary to determine whether the PDF uses an xref table or an xref stream. Can someone please check whether signing now works? I'm not familiar with all that signing stuff.

> Signature byte range is Invalid
> ---
>
> Key: PDFBOX-1822
> URL: https://issues.apache.org/jira/browse/PDFBOX-1822
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing, PDModel, PDModel.AcroForm, Signing
> Affects Versions: 1.8.3, 1.8.4, 2.0.0
> Reporter: vakhtang koroghlishvili
> Attachments: araxis-merge - compare two document.jpg, damaged-sig.jpg, unsigned-signed.pdf, unsigned.pdf
>
> One person sent me an unsigned PDF document that he wanted to sign. When I try to sign it (using PDFBox), I run into a problem.
> After signing, Adobe Reader tells me "The signature byte range is invalid".
> I will attach the original and the signed document.
> I think it is a PDFBox parser error; other signature libraries sign the document without problems. I'm investigating the problem at the moment in order to fix it.
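As background to the comment above: in a PDF file the offset given after `startxref` points either at the keyword `xref` (a classic cross-reference table) or at an indirect object that holds a cross-reference stream. The following sketch shows one way to tell the two apart by looking at the bytes at that offset instead of at the trailer dictionary. It is an illustration only and does not claim to be how `COSDocument#isXRefStream` is implemented in revision 1560474; `XrefKindSketch` and `startsWithXrefKeyword` are hypothetical names.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration, not the PDFBox implementation.
final class XrefKindSketch {

    // Returns true if the bytes at startXrefOffset begin with the "xref" keyword,
    // which indicates a classic cross-reference table. Otherwise the offset is
    // assumed to point at an object such as "12 0 obj" holding an xref stream.
    static boolean startsWithXrefKeyword(byte[] pdf, int startXrefOffset) {
        byte[] keyword = "xref".getBytes(StandardCharsets.US_ASCII);
        if (startXrefOffset < 0 || startXrefOffset + keyword.length > pdf.length) {
            return false;
        }
        for (int i = 0; i < keyword.length; i++) {
            if (pdf[startXrefOffset + i] != keyword[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] table = "xref\n0 1\n".getBytes(StandardCharsets.US_ASCII);
        byte[] stream = "12 0 obj\n<< /Type /XRef >>".getBytes(StandardCharsets.US_ASCII);
        System.out.println(startsWithXrefKeyword(table, 0));
        System.out.println(startsWithXrefKeyword(stream, 0));
    }
}
```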
Re: Error printing...
Yes, there is. Simply Google "JBIG2 plugin" and follow the first link, it will be called "jbig2-imageio".

-- John

On 22 Jan 2014, at 09:16, Alin Mazilu wrote:

> Hello all,
>
> I am printing some PDFs and I am getting this:
>
> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
> SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded datastream.
> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
> SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke process
> WARNING: getRGBImage returned NULL
> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
> SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded datastream.
> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
> SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke process
> WARNING: getRGBImage returned NULL
> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
> SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded datastream.
> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
> SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
> Jan 22, 2014 12:07:47 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke process
> WARNING: getRGBImage returned NULL
>
> Is there a quick way to fix this? Is there a JBIG2 plugin? I really need to fix it today or I'm in trouble. :)
>
> Thank you,
>
> Alin
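The log quoted above says that no ImageIO plugin for JBIG2 was found. Once a plugin jar is on the classpath (or bundled with a self-contained application), a quick check like the following can confirm that ImageIO actually sees it. This is a minimal sketch: `Jbig2PluginCheck` and `hasReaderFor` are hypothetical names, and the format name "JBIG2" is an assumption about how the plugin registers itself.

```java
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;

// Hypothetical diagnostic helper, not part of PDFBox.
final class Jbig2PluginCheck {

    // Returns true if ImageIO knows at least one reader for the given format name.
    static boolean hasReaderFor(String formatName) {
        // Re-scan the classpath so plugins added after JVM start are picked up.
        ImageIO.scanForPlugins();
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(formatName);
        return readers.hasNext();
    }

    public static void main(String[] args) {
        // "JBIG2" is assumed to be the name the plugin registers under.
        System.out.println("JBIG2 reader available: " + hasReaderFor("JBIG2"));
    }
}
```

Running this inside the packaged application, rather than on the developer machine, shows whether the bundle really contains the plugin.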
RE: Error printing...
Hi,

Install the jar from https://code.google.com/p/jbig2-imageio/

Thanks

-----Original Message-----
From: Alin Mazilu [mailto:impet...@gmail.com]
Sent: 22 January 2014 17:16
To: dev@pdfbox.apache.org
Subject: Error printing...

Hello all,

I am printing some PDFs and I am getting this:

Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded datastream.
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke process
WARNING: getRGBImage returned NULL
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded datastream.
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke process
WARNING: getRGBImage returned NULL
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded datastream.
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke process
WARNING: getRGBImage returned NULL

Is there a quick way to fix this? Is there a JBIG2 plugin? I really need to fix it today or I'm in trouble. :)

Thank you,

Alin
Error printing...
Hello all,

I am printing some PDFs and I am getting this:

Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded datastream.
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke process
WARNING: getRGBImage returned NULL
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded datastream.
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke process
WARNING: getRGBImage returned NULL
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.filter.JBIG2Filter decode
SEVERE: Can't find an ImageIO plugin to decode the JBIG2 encoded datastream.
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.pdmodel.graphics.xobject.PDPixelMap getRGBImage
SEVERE: Something went wrong ... the pixelmap doesn't contain any data.
Jan 22, 2014 12:07:47 PM org.apache.pdfbox.util.operator.pagedrawer.Invoke process
WARNING: getRGBImage returned NULL

Is there a quick way to fix this? Is there a JBIG2 plugin? I really need to fix it today or I'm in trouble. :)

Thank you,

Alin
[jira] [Commented] (PDFBOX-1822) Signature byte range is Invalid
[ https://issues.apache.org/jira/browse/PDFBOX-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878594#comment-13878594 ]

Thomas Chojecki commented on PDFBOX-1822:
-

You are right. It also looks like a merged document; see the ID. So maybe the application merged a document that contains a classic xref table with a document that contains a xref stream.

> Signature byte range is Invalid
> ---
>
> Key: PDFBOX-1822
> URL: https://issues.apache.org/jira/browse/PDFBOX-1822
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing, PDModel, PDModel.AcroForm, Signing
> Affects Versions: 1.8.3, 1.8.4, 2.0.0
> Reporter: vakhtang koroghlishvili
> Attachments: araxis-merge - compare two document.jpg, damaged-sig.jpg, unsigned-signed.pdf, unsigned.pdf
>
> On person send me a unsigned PDF document. He wanted to sign it. When I try to sign it (using pad box), I have some problem.
> After signing adobe reader tells me "The signature byre range is invalid".
> I will attach original and signed document.
> I think, it is PDF box parser error. another signature libraries sign document very well. I'm searching the problem at the moment, in order to fix it.

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (PDFBOX-1822) Signature byte range is Invalid
[ https://issues.apache.org/jira/browse/PDFBOX-1822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878556#comment-13878556 ]

Andreas Lehmkühler commented on PDFBOX-1822:

I had a quick look at the attached pdf (unsigned.pdf) and IMO it's broken:

{code}
xref
0 20
00 65535 f
15 0 n
000497 0 n
004753 0 n
004810 0 n
004917 0 n
000200 0 n
004960 0 n
005185 0 n
005234 0 n
005275 0 n
005515 0 n
005537 0 n
005741 0 n
005776 0 n
006109 0 n
006161 0 n
006457 0 n
006724 0 n
007034 0 n
trailer
<<
/DecodeParms << /Columns 4 /Predictor 12 >>
/Filter /FlateDecode
/ID [ ]
/Info 6 0 R
/Length 58
/Root 1 0 R
/Size 20
/Type /XRef
/W [1 2 1]
/Index [14 13]
>>
startxref
24851
%%EOF
{code}

The pdf obviously uses a xref table. The trailer contains entries which are only expected if the pdf uses a xref stream, such as /Type, /Size and /W, but the stream itself is missing.

Subsequently, the COSWriter tries to determine whether a xref table or a xref stream should be written by calling COSDocument#isXRefStream. That method returns "true" because the broken trailer dictionary contains a "/Type /XRef" entry.

Maybe we should introduce a new internal boolean value within the trailer which is set when reading a xref stream, so that we can be sure that the pdf really uses a xref stream and does not just have a broken trailer dictionary which leads to false information.

I hope this makes sense ... ;-)

> Signature byte range is Invalid
> ---
>
> Key: PDFBOX-1822
> URL: https://issues.apache.org/jira/browse/PDFBOX-1822
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing, PDModel, PDModel.AcroForm, Signing
> Affects Versions: 1.8.3, 1.8.4, 2.0.0
> Reporter: vakhtang koroghlishvili
> Attachments: araxis-merge - compare two document.jpg, damaged-sig.jpg, unsigned-signed.pdf, unsigned.pdf
>
> On person send me a unsigned PDF document. He wanted to sign it. When I try to sign it (using pad box), I have some problem.
> After signing adobe reader tells me "The signature byre range is invalid".
> I will attach original and signed document.
> I think, it is PDF box parser error. another signature libraries sign document very well. I'm searching the problem at the moment, in order to fix it.

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
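To illustrate the idea behind the comment above (decide from what is actually read, not from trailer entries), here is a minimal, self-contained sketch. It is not the PDFBox fix and uses no PDFBox classes: it only checks whether the bytes at a given startxref offset begin with the keyword "xref", which is how a classic cross-reference table starts, whereas a cross-reference stream starts with an indirect object header. The class name, method names and sample bytes are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class XrefKindProbe {

    /** PDF whitespace characters: NUL, TAB, LF, FF, CR and SPACE. */
    static boolean isPdfWhitespace(byte b) {
        return b == 0 || b == 9 || b == 10 || b == 12 || b == 13 || b == 32;
    }

    /**
     * Returns true if, after skipping whitespace, the data at the given offset
     * starts with the keyword "xref" (classic table). Returns false otherwise,
     * for example when an object header such as "12 0 obj" is found instead.
     */
    static boolean startsWithXrefKeyword(byte[] pdf, int offset) {
        if (offset < 0) {
            return false;
        }
        int i = offset;
        while (i < pdf.length && isPdfWhitespace(pdf[i])) {
            i++;
        }
        byte[] keyword = "xref".getBytes(StandardCharsets.US_ASCII);
        if (i + keyword.length > pdf.length) {
            return false;
        }
        for (int k = 0; k < keyword.length; k++) {
            if (pdf[i + k] != keyword[k]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Made-up fragments; a real caller would pass the file bytes and the startxref value.
        byte[] table = "xref\n0 1\n0000000000 65535 f \ntrailer\n".getBytes(StandardCharsets.US_ASCII);
        byte[] stream = "12 0 obj\n<< /Type /XRef >>\nstream\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println("table fragment is a classic table:  " + startsWithXrefKeyword(table, 0));
        System.out.println("stream fragment is a classic table: " + startsWithXrefKeyword(stream, 0));
    }
}
```

A parser could record the result of such a check in the internal boolean proposed above, so that a stray "/Type /XRef" entry in a classic trailer no longer decides how the file is written.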
[jira] [Updated] (PDFBOX-1858) Extracted text does not have spaces
[ https://issues.apache.org/jira/browse/PDFBOX-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vitalie Bureanu updated PDFBOX-1858: Description: Extracted text does not have spaces between some words. Use to test please a string on line 74a... inside of attached test.pdf. It will be extracted as: "74a Amount of line73youwant refunded toyou . If Form isattached , checkhere" The result is not seems to be good, the words are "glued". I tried to use a class PDF Text Stripper but the result still remain the same. For us it is a big problem. Can it be resolved, please? With respect, Vitalie was: Extracted text does not have spaces between some words. Use to test please a string on line 74a... inside of attached test.pdf. It will be extracted as: "74a Amount of line73youwant refunded toyou . If Form isattached , checkhere" The result is not seems to be good, the words are "glued". I tried to use a class PDF Text Stripper but the result still remain the same. Can it be resolved, please? With respect, Vitalie > Extracted text does not have spaces > --- > > Key: PDFBOX-1858 > URL: https://issues.apache.org/jira/browse/PDFBOX-1858 > Project: PDFBox > Issue Type: Bug > Components: Parsing, Text extraction >Affects Versions: 1.8.3 > Environment: Linux 64bit, Java >Reporter: Vitalie Bureanu > Attachments: Screenshot.jpg, test.pdf > > Original Estimate: 3h > Remaining Estimate: 3h > > Extracted text does not have spaces between some words. > Use to test please a string on line 74a... inside of attached test.pdf. > It will be extracted as: "74a Amount of line73youwant refunded toyou . If > Form isattached , checkhere" > The result is not seems to be good, the words are "glued". > I tried to use a class PDF Text Stripper but the result still remain the same. > For us it is a big problem. Can it be resolved, please? > With respect, > Vitalie -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (PDFBOX-1858) Extracted text does not have spaces
[ https://issues.apache.org/jira/browse/PDFBOX-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vitalie Bureanu updated PDFBOX-1858: Description: Extracted text does not have spaces between some words. Use to test please a string on line 74a... inside of attached test.pdf. It will be extracted as: "74a Amount of line73youwant refunded toyou . If Form isattached , checkhere" The result is not seems to be good, the words are "glued". I tried to use a class PDF Text Stripper but the result still remain the same. Can it be resolved, please? With respect, Vitalie was: Extracted text does not have spaces between some words. Use to test please a string on line 74a... inside of attached test.pdf. It will be extracted as: "74a Amount of line73youwant refunded toyou . If Form isattached , checkhere" The result is not seems to be good, the words are "glued". I tried to use a class PDF Text Stripper but the result still remain the same. Can it be solved, please? With respect, Vitalie > Extracted text does not have spaces > --- > > Key: PDFBOX-1858 > URL: https://issues.apache.org/jira/browse/PDFBOX-1858 > Project: PDFBox > Issue Type: Bug > Components: Parsing, Text extraction >Affects Versions: 1.8.3 > Environment: Linux 64bit, Java >Reporter: Vitalie Bureanu > Attachments: Screenshot.jpg, test.pdf > > Original Estimate: 3h > Remaining Estimate: 3h > > Extracted text does not have spaces between some words. > Use to test please a string on line 74a... inside of attached test.pdf. > It will be extracted as: "74a Amount of line73youwant refunded toyou . If > Form isattached , checkhere" > The result is not seems to be good, the words are "glued". > I tried to use a class PDF Text Stripper but the result still remain the same. > Can it be resolved, please? > With respect, > Vitalie -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (PDFBOX-1859) ClassCastException for unknown destination type
[ https://issues.apache.org/jira/browse/PDFBOX-1859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hendrik Lescak updated PDFBOX-1859: --- Attachment: Speisepläne.pdf > ClassCastException for unknown destination type > --- > > Key: PDFBOX-1859 > URL: https://issues.apache.org/jira/browse/PDFBOX-1859 > Project: PDFBox > Issue Type: Bug > Components: PDModel >Affects Versions: 1.8.3, 2.0.0 >Reporter: Hendrik Lescak > Attachments: Speisepläne.pdf > > > Trying to read the outlines failed for the attached document. > {code:java} > import java.io.IOException; > import org.apache.pdfbox.pdmodel.PDDocument; > import > org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDDestination; > import > org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineItem; > import > org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineNode; > /** > * @author mailto:andre.kisch...@interface-projects.de";>André > Kischkel > * @since 22.01.2014 > * @version $Revision$ > */ > public class TestPDDestination { > public static void main(String[] args) throws IOException { > PDDocument doc = PDDocument.load("Speisepläne.pdf"); > traverse(doc.getDocumentCatalog().getDocumentOutline()); > doc.close(); > } > > static void traverse(PDOutlineNode node) throws IOException { > if (node instanceof PDOutlineItem) { > PDDestination dst = ((PDOutlineItem) > node).getDestination(); > /** >* throws java.lang.ClassCastException: > org.apache.pdfbox.cos.COSFloat cannot be cast to > org.apache.pdfbox.cos.COSName, >* but should be something like a PDPageXYZDestination! >*/ > System.out.println(dst); > } > for (PDOutlineItem child = node.getFirstChild(); child != null; > child = child.getNextSibling()) { > traverse(child); > } > } > } > {code} -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (PDFBOX-1858) Extracted text does not have spaces
[ https://issues.apache.org/jira/browse/PDFBOX-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vitalie Bureanu updated PDFBOX-1858: Description: Extracted text does not have spaces between some words. Use to test please a string on line 74a... inside of attached test.pdf. It will be extracted as: "74a Amount of line73youwant refunded toyou . If Form isattached , checkhere" The result is not seems to be good, the words are "glued". I tried to use a class PDF Text Stripper but the result still remain the same. Can it be solved, please? With respect, Vitalie was: Extracted text does not have spaces between some words. Use to test please a string on line 74a... inside of attached test.pdf. It will be extracted as: "74a Amount of line73youwant refunded toyou . If Form isattached , checkhere" The result is not seems to be good, the words are "glued". I tried to use a class PDF Text Stripper but the resultstill remain the same. Can it be solved, please? With respect, Vitalie > Extracted text does not have spaces > --- > > Key: PDFBOX-1858 > URL: https://issues.apache.org/jira/browse/PDFBOX-1858 > Project: PDFBox > Issue Type: Bug > Components: Parsing, Text extraction >Affects Versions: 1.8.3 > Environment: Linux 64bit, Java >Reporter: Vitalie Bureanu > Attachments: Screenshot.jpg, test.pdf > > Original Estimate: 3h > Remaining Estimate: 3h > > Extracted text does not have spaces between some words. > Use to test please a string on line 74a... inside of attached test.pdf. > It will be extracted as: "74a Amount of line73youwant refunded toyou . If > Form isattached , checkhere" > The result is not seems to be good, the words are "glued". > I tried to use a class PDF Text Stripper but the result still remain the same. > Can it be solved, please? > With respect, > Vitalie -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (PDFBOX-1859) ClassCastException for unknown destination type
[ https://issues.apache.org/jira/browse/PDFBOX-1859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hendrik Lescak updated PDFBOX-1859: --- Description: Trying to read the outlines failed for the attached document. {code} import java.io.IOException; import org.apache.pdfbox.pdmodel.PDDocument; import org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDDestination; import org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineItem; import org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineNode; /** * @author mailto:andre.kisch...@interface-projects.de";>André Kischkel * @since 22.01.2014 * @version $Revision$ */ public class TestPDDestination { public static void main(String[] args) throws IOException { PDDocument doc = PDDocument.load("Speisepläne.pdf"); traverse(doc.getDocumentCatalog().getDocumentOutline()); doc.close(); } static void traverse(PDOutlineNode node) throws IOException { if (node instanceof PDOutlineItem) { PDDestination dst = ((PDOutlineItem) node).getDestination(); /** * throws java.lang.ClassCastException: org.apache.pdfbox.cos.COSFloat cannot be cast to org.apache.pdfbox.cos.COSName, * but should be something like a PDPageXYZDestination! */ System.out.println(dst); } for (PDOutlineItem child = node.getFirstChild(); child != null; child = child.getNextSibling()) { traverse(child); } } } {code} was: Trying to read the outlines failed for the attached document. 
{code:java} import java.io.IOException; import org.apache.pdfbox.pdmodel.PDDocument; import org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDDestination; import org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineItem; import org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineNode; /** * @author mailto:andre.kisch...@interface-projects.de";>André Kischkel * @since 22.01.2014 * @version $Revision$ */ public class TestPDDestination { public static void main(String[] args) throws IOException { PDDocument doc = PDDocument.load("Speisepläne.pdf"); traverse(doc.getDocumentCatalog().getDocumentOutline()); doc.close(); } static void traverse(PDOutlineNode node) throws IOException { if (node instanceof PDOutlineItem) { PDDestination dst = ((PDOutlineItem) node).getDestination(); /** * throws java.lang.ClassCastException: org.apache.pdfbox.cos.COSFloat cannot be cast to org.apache.pdfbox.cos.COSName, * but should be something like a PDPageXYZDestination! */ System.out.println(dst); } for (PDOutlineItem child = node.getFirstChild(); child != null; child = child.getNextSibling()) { traverse(child); } } } {code} > ClassCastException for unknown destination type > --- > > Key: PDFBOX-1859 > URL: https://issues.apache.org/jira/browse/PDFBOX-1859 > Project: PDFBox > Issue Type: Bug > Components: PDModel >Affects Versions: 1.8.3, 2.0.0 >Reporter: Hendrik Lescak > Attachments: Speisepläne.pdf > > > Trying to read the outlines failed for the attached document. 
> {code} > import java.io.IOException; > import org.apache.pdfbox.pdmodel.PDDocument; > import > org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDDestination; > import > org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineItem; > import > org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineNode; > /** > * @author mailto:andre.kisch...@interface-projects.de";>André > Kischkel > * @since 22.01.2014 > * @version $Revision$ > */ > public class TestPDDestination { > public static void main(String[] args) throws IOException { > PDDocument doc = PDDocument.load("Speisepläne.pdf"); > traverse(doc.getDocumentCatalog().getDocumentOutline()); > doc.close(); > } > > static void traverse(PDOutlineNode node) throws IOException { > if (node instanceof PDOutlineItem) { > PDDestination dst = ((PDOutlineItem) > node).getDestination(); > /** >* throws java.lang.ClassCastException: > org.apache.pdfbox.cos.C
[jira] [Updated] (PDFBOX-1859) ClassCastException for unknown destination type
[ https://issues.apache.org/jira/browse/PDFBOX-1859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hendrik Lescak updated PDFBOX-1859: --- Description: Trying to read the outlines failed for the attached document. {code} import java.io.IOException; import org.apache.pdfbox.pdmodel.PDDocument; import org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDDestination; import org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineItem; import org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineNode; public class TestPDDestination { public static void main(String[] args) throws IOException { PDDocument doc = PDDocument.load("Speisepläne.pdf"); traverse(doc.getDocumentCatalog().getDocumentOutline()); doc.close(); } static void traverse(PDOutlineNode node) throws IOException { if (node instanceof PDOutlineItem) { PDDestination dst = ((PDOutlineItem) node).getDestination(); /** * throws java.lang.ClassCastException: org.apache.pdfbox.cos.COSFloat cannot be cast to org.apache.pdfbox.cos.COSName, * but should be something like a PDPageXYZDestination! */ System.out.println(dst); } for (PDOutlineItem child = node.getFirstChild(); child != null; child = child.getNextSibling()) { traverse(child); } } } {code} was: Trying to read the outlines failed for the attached document. 
{code} import java.io.IOException; import org.apache.pdfbox.pdmodel.PDDocument; import org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDDestination; import org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineItem; import org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineNode; /** * @author mailto:andre.kisch...@interface-projects.de";>André Kischkel * @since 22.01.2014 * @version $Revision$ */ public class TestPDDestination { public static void main(String[] args) throws IOException { PDDocument doc = PDDocument.load("Speisepläne.pdf"); traverse(doc.getDocumentCatalog().getDocumentOutline()); doc.close(); } static void traverse(PDOutlineNode node) throws IOException { if (node instanceof PDOutlineItem) { PDDestination dst = ((PDOutlineItem) node).getDestination(); /** * throws java.lang.ClassCastException: org.apache.pdfbox.cos.COSFloat cannot be cast to org.apache.pdfbox.cos.COSName, * but should be something like a PDPageXYZDestination! */ System.out.println(dst); } for (PDOutlineItem child = node.getFirstChild(); child != null; child = child.getNextSibling()) { traverse(child); } } } {code} > ClassCastException for unknown destination type > --- > > Key: PDFBOX-1859 > URL: https://issues.apache.org/jira/browse/PDFBOX-1859 > Project: PDFBox > Issue Type: Bug > Components: PDModel >Affects Versions: 1.8.3, 2.0.0 >Reporter: Hendrik Lescak > Attachments: Speisepläne.pdf > > > Trying to read the outlines failed for the attached document. 
> {code} > import java.io.IOException; > import org.apache.pdfbox.pdmodel.PDDocument; > import > org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDDestination; > import > org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineItem; > import > org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineNode; > public class TestPDDestination { > public static void main(String[] args) throws IOException { > PDDocument doc = PDDocument.load("Speisepläne.pdf"); > traverse(doc.getDocumentCatalog().getDocumentOutline()); > doc.close(); > } > > static void traverse(PDOutlineNode node) throws IOException { > if (node instanceof PDOutlineItem) { > PDDestination dst = ((PDOutlineItem) > node).getDestination(); > /** >* throws java.lang.ClassCastException: > org.apache.pdfbox.cos.COSFloat cannot be cast to > org.apache.pdfbox.cos.COSName, >* but should be something like a PDPageXYZDestination! >*/ > System.out.println(dst); > } > for (PDOutlineIt
[jira] [Updated] (PDFBOX-1858) Extracted text does not have spaces
[ https://issues.apache.org/jira/browse/PDFBOX-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vitalie Bureanu updated PDFBOX-1858: Attachment: Screenshot.jpg test.pdf > Extracted text does not have spaces > --- > > Key: PDFBOX-1858 > URL: https://issues.apache.org/jira/browse/PDFBOX-1858 > Project: PDFBox > Issue Type: Bug > Components: Parsing, Text extraction >Affects Versions: 1.8.3 > Environment: Linux 64bit, Java >Reporter: Vitalie Bureanu > Attachments: Screenshot.jpg, test.pdf > > Original Estimate: 3h > Remaining Estimate: 3h > > Extracted text does not have spaces between some words. > Use to test please a string on line 74a... inside of attached test.pdf. > It will be extracted as: "74a Amount of line73youwant refunded toyou . If > Form isattached , checkhere" > The result is not seems to be good, the words are "glued". > I tried to use a class PDF Text Stripper but the resultstill remain the same. > Can it be solved, please? > With respect, > Vitalie -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (PDFBOX-1858) Extracted text does not have spaces
[ https://issues.apache.org/jira/browse/PDFBOX-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vitalie Bureanu updated PDFBOX-1858: Attachment: (was: Untitled-1.jpg) > Extracted text does not have spaces > --- > > Key: PDFBOX-1858 > URL: https://issues.apache.org/jira/browse/PDFBOX-1858 > Project: PDFBox > Issue Type: Bug > Components: Parsing, Text extraction >Affects Versions: 1.8.3 > Environment: Linux 64bit, Java >Reporter: Vitalie Bureanu > Attachments: Screenshot.jpg, test.pdf > > Original Estimate: 3h > Remaining Estimate: 3h > > Extracted text does not have spaces between some words. > Use to test please a string on line 74a... inside of attached test.pdf. > It will be extracted as: "74a Amount of line73youwant refunded toyou . If > Form isattached , checkhere" > The result is not seems to be good, the words are "glued". > I tried to use a class PDF Text Stripper but the resultstill remain the same. > Can it be solved, please? > With respect, > Vitalie -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (PDFBOX-1858) Extracted text does not have spaces
[ https://issues.apache.org/jira/browse/PDFBOX-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vitalie Bureanu updated PDFBOX-1858: Attachment: Untitled-1.jpg > Extracted text does not have spaces > --- > > Key: PDFBOX-1858 > URL: https://issues.apache.org/jira/browse/PDFBOX-1858 > Project: PDFBox > Issue Type: Bug > Components: Parsing, Text extraction >Affects Versions: 1.8.3 > Environment: Linux 64bit, Java >Reporter: Vitalie Bureanu > Attachments: Screenshot.jpg, test.pdf > > Original Estimate: 3h > Remaining Estimate: 3h > > Extracted text does not have spaces between some words. > Use to test please a string on line 74a... inside of attached test.pdf. > It will be extracted as: "74a Amount of line73youwant refunded toyou . If > Form isattached , checkhere" > The result is not seems to be good, the words are "glued". > I tried to use a class PDF Text Stripper but the resultstill remain the same. > Can it be solved, please? > With respect, > Vitalie -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (PDFBOX-1859) ClassCastException for unknown destination type
Hendrik Lescak created PDFBOX-1859:
--

Summary: ClassCastException for unknown destination type
Key: PDFBOX-1859
URL: https://issues.apache.org/jira/browse/PDFBOX-1859
Project: PDFBox
Issue Type: Bug
Components: PDModel
Affects Versions: 1.8.3, 2.0.0
Reporter: Hendrik Lescak

Trying to read the outlines failed for the attached document.

{code:java}
import java.io.IOException;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDDestination;
import org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineItem;
import org.apache.pdfbox.pdmodel.interactive.documentnavigation.outline.PDOutlineNode;

/**
 * @author <a href="mailto:andre.kisch...@interface-projects.de">André Kischkel</a>
 * @since 22.01.2014
 * @version $Revision$
 */
public class TestPDDestination {

    public static void main(String[] args) throws IOException {
        PDDocument doc = PDDocument.load("Speisepläne.pdf");
        traverse(doc.getDocumentCatalog().getDocumentOutline());
        doc.close();
    }

    static void traverse(PDOutlineNode node) throws IOException {
        if (node instanceof PDOutlineItem) {
            PDDestination dst = ((PDOutlineItem) node).getDestination();
            /**
             * throws java.lang.ClassCastException: org.apache.pdfbox.cos.COSFloat cannot be cast to org.apache.pdfbox.cos.COSName,
             * but should be something like a PDPageXYZDestination!
             */
            System.out.println(dst);
        }
        for (PDOutlineItem child = node.getFirstChild(); child != null; child = child.getNextSibling()) {
            traverse(child);
        }
    }
}
{code}

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
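The exception text suggests that a value was cast to a name without checking its type first. The sketch below models that situation with plain Java objects only; it is not PDFBox code and does not claim to show where the cast happens in the library. It assumes the usual shape of an explicit destination array, [page /XYZ left top zoom], where the second entry is expected to be a name. The `Name` class, the method name and the sample arrays (including the malformed one) are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

public class DestinationTypeGuard {

    /** Hypothetical stand-in for a PDF name object such as /XYZ or /Fit. */
    static final class Name {
        final String value;

        Name(String value) {
            this.value = value;
        }
    }

    /**
     * Returns the destination type name from the second array entry,
     * or null when that entry is missing or is not a name.
     */
    static String destinationType(List<Object> array) {
        if (array.size() < 2) {
            return null;
        }
        Object second = array.get(1);
        // Check the type before casting instead of casting blindly.
        if (second instanceof Name) {
            return ((Name) second).value;
        }
        return null;
    }

    public static void main(String[] args) {
        List<Object> wellFormed = Arrays.<Object>asList("page-ref", new Name("XYZ"), 0f, 792f, 0f);
        // Made-up malformed array: a number sits where the type name should be.
        List<Object> malformed = Arrays.<Object>asList("page-ref", 792f, 0f);
        System.out.println(destinationType(wellFormed));
        System.out.println(destinationType(malformed));
    }
}
```

Returning null (or a fallback destination) for the unexpected case lets the caller keep traversing the outline instead of failing on the first odd entry.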
[jira] [Created] (PDFBOX-1858) Extracted text does not have spaces
Vitalie Bureanu created PDFBOX-1858: --- Summary: Extracted text does not have spaces Key: PDFBOX-1858 URL: https://issues.apache.org/jira/browse/PDFBOX-1858 Project: PDFBox Issue Type: Bug Components: Parsing, Text extraction Affects Versions: 1.8.3 Environment: Linux 64bit, Java Reporter: Vitalie Bureanu Extracted text does not have spaces between some words. Use to test please a string on line 74a... inside of attached test.pdf. It will be extracted as: "74a Amount of line73youwant refunded toyou . If Form isattached , checkhere" The result is not seems to be good, the words are "glued". I tried to use a class PDF Text Stripper but the resultstill remain the same. Can it be solved, please? With respect, Vitalie -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (PDFBOX-1857) Attachment damages singature
[ https://issues.apache.org/jira/browse/PDFBOX-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878492#comment-13878492 ]

Thomas Chojecki commented on PDFBOX-1857:
-

Sorry, I misinterpreted points 2 and 3 in your comment. I thought you were trying to attach a file to a signed document. So this is not a duplicate of PDFBOX-1837, but it does duplicate PDFBOX-1822.

If you have more information that could be helpful, feel free to add it to this issue.

I'm currently working on a similar issue. Maybe it will also solve this problem.

> Attachment damages singature
> 
>
> Key: PDFBOX-1857
> URL: https://issues.apache.org/jira/browse/PDFBOX-1857
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing, PDFReader, Signing, Utilities, Writing
> Affects Versions: 1.8.3, 1.8.4, 2.0.0
> Reporter: jack
> Assignee: Thomas Chojecki
> Priority: Blocker
> Attachments: attach.txt, original.pdf, original[signed].pdf, original[with-attachment].pdf, original[with-attachment][signed].pdf
>
> I have PDF document.
> 1) Adobe reader reads document well.
> 2) I sign document (using pdfbox-examples) and everything is well
> 3) Then I try to attach file to original PDF (Code is written in the pdfbox web page - in the cookBook).
> 4) Adobe reader reads attached document well. everything is well.
> 5) Now I have document with attachment.
> 6) I try to sign that document (I mean document with attachment). And I have 2 problem:
> First:
> when I open document, Adobe reader tells me that signature byte range is invalid.
> Second:
> when I try to close document (I mean to close adobe reader), Adobe reader tells me that:
> Do you want to save changes to "original[with-attachment][signed]" before closing?

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (PDFBOX-1857) Attachment damages singature
[ https://issues.apache.org/jira/browse/PDFBOX-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878483#comment-13878483 ] jack commented on PDFBOX-1857: -- The main problem is the broken signature, but this problem is not solved yet. I attach documents too. 1) attach file to original document 2) try to sign document result: Signature byte range is invalid. > Attachment damages singature > > > Key: PDFBOX-1857 > URL: https://issues.apache.org/jira/browse/PDFBOX-1857 > Project: PDFBox > Issue Type: Bug > Components: Parsing, PDFReader, Signing, Utilities, Writing >Affects Versions: 1.8.3, 1.8.4, 2.0.0 >Reporter: jack >Assignee: Thomas Chojecki >Priority: Blocker > Attachments: attach.txt, original.pdf, original[signed].pdf, > original[with-attachment].pdf, original[with-attachment][signed].pdf > > > I have PDF document. > 1) Adobe reader reads document well. > 2) I sign document (using pdfbox-examples) and everything is well > 3) Then I try to attach file to original PDF (Code is written in the pdfbox > web page - in the cookBook). > 4) Adobe reader reads attached document well. everything is well. > 5) Now I have document with attachment. > 6) I try to sign that document (I mean document with attachment). And I have > 2 problem: > First: > when I open document, Adobe reader tells me that signature byte range is > invalid. > Second: > when I try to close document (I mean to close adobe reader), Adobe reader > tells me that: > Do you want to save changes to "original[with-attachment][signed]" before > closing? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (PDFBOX-1857) Attachment damages singature
[ https://issues.apache.org/jira/browse/PDFBOX-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878477#comment-13878477 ] jack commented on PDFBOX-1857: -- No, I attach file to the document before signing. I attach file to the original document and then I try to sign. > Attachment damages singature > > > Key: PDFBOX-1857 > URL: https://issues.apache.org/jira/browse/PDFBOX-1857 > Project: PDFBox > Issue Type: Bug > Components: Parsing, PDFReader, Signing, Utilities, Writing >Affects Versions: 1.8.3, 1.8.4, 2.0.0 >Reporter: jack >Assignee: Thomas Chojecki >Priority: Blocker > Attachments: attach.txt, original.pdf, original[signed].pdf, > original[with-attachment].pdf, original[with-attachment][signed].pdf > > > I have PDF document. > 1) Adobe reader reads document well. > 2) I sign document (using pdfbox-examples) and everything is well > 3) Then I try to attach file to original PDF (Code is written in the pdfbox > web page - in the cookBook). > 4) Adobe reader reads attached document well. everything is well. > 5) Now I have document with attachment. > 6) I try to sign that document (I mean document with attachment). And I have > 2 problem: > First: > when I open document, Adobe reader tells me that signature byte range is > invalid. > Second: > when I try to close document (I mean to close adobe reader), Adobe reader > tells me that: > Do you want to save changes to "original[with-attachment][signed]" before > closing? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (PDFBOX-1808) PDFTextStripper.getText - hight memory usage
[ https://issues.apache.org/jira/browse/PDFBOX-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878466#comment-13878466 ] Timo Boehme commented on PDFBOX-1808: - One addition to my last comment: whether the JVM releases memory back to the operating system when a large part of the heap is free depends on the JVM implementation. Server VMs typically keep the allocated memory, regardless of whether the Java application still needs it. > PDFTextStripper.getText - hight memory usage > > > Key: PDFBOX-1808 > URL: https://issues.apache.org/jira/browse/PDFBOX-1808 > Project: PDFBox > Issue Type: Bug > Components: Text extraction >Affects Versions: 1.8.2, 1.8.3 > Environment: Windows 7 > Java jdk 1.7.0_45 >Reporter: Guyenot Jeremy >Assignee: Andreas Lehmkühler >Priority: Critical > Labels: performance > Attachments: 1808-java char copyof.jpg, 1808-java char > copyofrange.jpg, 1808-java usage.jpg, 1808-pdfbox usage.jpg, > 1808-snapshot.nps, DOSSIER DE CANDIDATURE_001.pdf, > Screenshot2014-01-21-19-51-24.png, netbeans_project.jpg, s5-1.png, s5-2.png, > s50-1.png, s50-2.png > > Original Estimate: 72h > Remaining Estimate: 72h > > Hello, > i'm trying to extract text from pdfs but i can find that the PDFTextStripper > use a lot of memory. > With a pdf that have 2676 pages (for a 4.6Mo size) it use 1.5Go memory. > I also constat that the memory is'nt free after the getText method is called. 
> You can see my code bellow:
>     double virgule = Math.pow(10, 2);
>     System.out.println("START - Total memory (Mo): " +
>         Math.round((Runtime.getRuntime().totalMemory()/100) * virgule) / virgule);
>     PDDocument cd = PDDocument.load(file);
>     System.out.println("PDDocument getNumberOfPages - Nombre de pages: " + cd.getNumberOfPages());
>     System.out.println("PDDocument load - Total memory (Mo): " +
>         Math.round((Runtime.getRuntime().totalMemory()/100) * virgule) / virgule);
>     String pdfText = "";
>     try{
>         PDFTextStripper stripper = new PDFTextStripper();
>         pdfText = stripper.getText(cd);
>         System.out.println("PDFTextStripper getText - Total memory (Mo): " +
>             Math.round((Runtime.getRuntime().totalMemory()/100) * virgule) / virgule);
>         stripper.resetEngine();
>         stripper = null;
>         System.out.println("PDFTextStripper resetEngine - Total memory (Mo): " +
>             Math.round((Runtime.getRuntime().totalMemory()/100) * virgule) / virgule);
>     }
>     finally{
>         if( cd!=null ){
>             cd.close();
>             cd = null;
>             System.out.println("PDDocument close - Total memory (Mo): " +
>                 Math.round((Runtime.getRuntime().totalMemory()/100) * virgule) / virgule);
>         }
>     }
>     retour = new TextField(fieldName, pdfText, Field.Store.NO);
>     System.out.println("TextField - Total memory (Mo): " +
>         Math.round((Runtime.getRuntime().totalMemory()/100) * virgule) / virgule);
> And the result into my output window:
> START - Total memory (Mo): 95.0
> PDDocument getNumberOfPages - Nombre de pages: 2676
> PDDocument load - Total memory (Mo): 121.0
> PDFTextStripper getText - Total memory (Mo): 757.0
> PDFTextStripper resetEngine - Total memory (Mo): 757.0
> PDDocument close - Total memory (Mo): 757.0
> TextField - Total memory (Mo): 757.0
> pdfText - Total memory (Mo): 757.0
> I also try to call System.gc() but the memory use is the same.
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (PDFBOX-1808) PDFTextStripper.getText - hight memory usage
[ https://issues.apache.org/jira/browse/PDFBOX-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13878461#comment-13878461 ] Timo Boehme commented on PDFBOX-1808: - [~jguyenot] please inform yourself about the meaning of the memory statistics provided by Java. *Total memory* is (as the name says) all the memory the VM uses. What you want is the used memory (by your application). This has to be calculated by totalMem - freeMem (see e.g. http://stackoverflow.com/questions/3571203/what-is-the-exact-meaning-of-runtime-getruntime-totalmemory-and-freememory) > PDFTextStripper.getText - hight memory usage > > > Key: PDFBOX-1808 > URL: https://issues.apache.org/jira/browse/PDFBOX-1808 > Project: PDFBox > Issue Type: Bug > Components: Text extraction >Affects Versions: 1.8.2, 1.8.3 > Environment: Windows 7 > Java jdk 1.7.0_45 >Reporter: Guyenot Jeremy >Assignee: Andreas Lehmkühler >Priority: Critical > Labels: performance > Attachments: 1808-java char copyof.jpg, 1808-java char > copyofrange.jpg, 1808-java usage.jpg, 1808-pdfbox usage.jpg, > 1808-snapshot.nps, DOSSIER DE CANDIDATURE_001.pdf, > Screenshot2014-01-21-19-51-24.png, netbeans_project.jpg, s5-1.png, s5-2.png, > s50-1.png, s50-2.png > > Original Estimate: 72h > Remaining Estimate: 72h > > Hello, > i'm trying to extract text from pdfs but i can find that the PDFTextStripper > use a lot of memory. > With a pdf that have 2676 pages (for a 4.6Mo size) it use 1.5Go memory. > I also constat that the memory is'nt free after the getText method is called. 
> You can see my code bellow:
>     double virgule = Math.pow(10, 2);
>     System.out.println("START - Total memory (Mo): " +
>         Math.round((Runtime.getRuntime().totalMemory()/100) * virgule) / virgule);
>     PDDocument cd = PDDocument.load(file);
>     System.out.println("PDDocument getNumberOfPages - Nombre de pages: " + cd.getNumberOfPages());
>     System.out.println("PDDocument load - Total memory (Mo): " +
>         Math.round((Runtime.getRuntime().totalMemory()/100) * virgule) / virgule);
>     String pdfText = "";
>     try{
>         PDFTextStripper stripper = new PDFTextStripper();
>         pdfText = stripper.getText(cd);
>         System.out.println("PDFTextStripper getText - Total memory (Mo): " +
>             Math.round((Runtime.getRuntime().totalMemory()/100) * virgule) / virgule);
>         stripper.resetEngine();
>         stripper = null;
>         System.out.println("PDFTextStripper resetEngine - Total memory (Mo): " +
>             Math.round((Runtime.getRuntime().totalMemory()/100) * virgule) / virgule);
>     }
>     finally{
>         if( cd!=null ){
>             cd.close();
>             cd = null;
>             System.out.println("PDDocument close - Total memory (Mo): " +
>                 Math.round((Runtime.getRuntime().totalMemory()/100) * virgule) / virgule);
>         }
>     }
>     retour = new TextField(fieldName, pdfText, Field.Store.NO);
>     System.out.println("TextField - Total memory (Mo): " +
>         Math.round((Runtime.getRuntime().totalMemory()/100) * virgule) / virgule);
> And the result into my output window:
> START - Total memory (Mo): 95.0
> PDDocument getNumberOfPages - Nombre de pages: 2676
> PDDocument load - Total memory (Mo): 121.0
> PDFTextStripper getText - Total memory (Mo): 757.0
> PDFTextStripper resetEngine - Total memory (Mo): 757.0
> PDDocument close - Total memory (Mo): 757.0
> TextField - Total memory (Mo): 757.0
> pdfText - Total memory (Mo): 757.0
> I also try to call System.gc() but the memory use is the same.
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
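The distinction drawn in the comment above (used memory is total memory minus free memory) can be sketched roughly as follows. This is only an illustration and not code from PDFBox or from the issue: the class name MemoryStats and the helpers usedBytes, toMiB and report are hypothetical, and only the standard java.lang.Runtime calls totalMemory(), freeMemory() and maxMemory() are assumed.

```java
import java.util.Locale;

// Hypothetical helper (not part of PDFBox): reports used heap instead of total heap.
public final class MemoryStats {
    private static final double BYTES_PER_MIB = 1024.0 * 1024.0;

    private MemoryStats() {}

    /** Heap occupied by objects: the heap the JVM has reserved minus the free part of it. */
    public static long usedBytes(long totalBytes, long freeBytes) {
        return totalBytes - freeBytes;
    }

    /** Converts a byte count to mebibytes. */
    public static double toMiB(long bytes) {
        return bytes / BYTES_PER_MIB;
    }

    /** Builds one report line with used, total (reserved) and max heap in MiB. */
    public static String report(String label) {
        Runtime rt = Runtime.getRuntime();
        long total = rt.totalMemory(); // heap currently reserved by the JVM
        long free = rt.freeMemory();   // unused part of that reserved heap
        return String.format(Locale.ROOT,
                "%s - used: %.1f MiB, total: %.1f MiB, max: %.1f MiB",
                label, toMiB(usedBytes(total, free)), toMiB(total), toMiB(rt.maxMemory()));
    }

    public static void main(String[] args) {
        System.out.println(report("START"));
        byte[][] blocks = new byte[32][];
        for (int i = 0; i < blocks.length; i++) {
            blocks[i] = new byte[1024 * 1024]; // allocate 32 MiB in total
        }
        System.out.println(report("after allocation"));
        blocks = null;
        System.gc(); // only a hint; the JVM may ignore it
        System.out.println(report("after System.gc()"));
    }
}
```

With such a report, the "used" figure would be expected to drop once the data becomes unreachable and is collected, while "total" may well stay at its high-water mark, depending on the JVM.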
[jira] [Resolved] (PDFBOX-1857) Attachment damages singature
[ https://issues.apache.org/jira/browse/PDFBOX-1857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Chojecki resolved PDFBOX-1857. - Resolution: Duplicate Assignee: Thomas Chojecki You have two problems here. 1.) Attaching content after signing. 2.) A problem with the ID management. 1.) This is not solvable right now (see PDFBOX-1837). We would need to change the object storage engine and rewrite the writer. So if you want to attach content, you should do it before the first signature, or remove the signature before attaching content and then sign again. 2.) I looked at chapter 14.4 of the specification PDF 32000-1:2008 and can confirm the solution that was mentioned on Stack Overflow. Nonetheless, I am closing this as a duplicate, because the main problem is the broken signature. > Attachment damages singature > > > Key: PDFBOX-1857 > URL: https://issues.apache.org/jira/browse/PDFBOX-1857 > Project: PDFBox > Issue Type: Bug > Components: Parsing, PDFReader, Signing, Utilities, Writing >Affects Versions: 1.8.3, 1.8.4, 2.0.0 >Reporter: jack >Assignee: Thomas Chojecki >Priority: Blocker > Attachments: attach.txt, original.pdf, original[signed].pdf, > original[with-attachment].pdf, original[with-attachment][signed].pdf > > > I have PDF document. > 1) Adobe reader reads document well. > 2) I sign document (using pdfbox-examples) and everything is well > 3) Then I try to attach file to original PDF (Code is written in the pdfbox > web page - in the cookBook). > 4) Adobe reader reads attached document well. everything is well. > 5) Now I have document with attachment. > 6) I try to sign that document (I mean document with attachment). And I have > 2 problem: > First: > when I open document, Adobe reader tells me that signature byte range is > invalid. > Second: > when I try to close document (I mean to close adobe reader), Adobe reader > tells me that: > Do you want to save changes to "original[with-attachment][signed]" before > closing? 