[jira] [Commented] (PDFBOX-1645) [PATCH] Improved the accuracy of the bounding box for each rendered CFF glyph
[ https://issues.apache.org/jira/browse/PDFBOX-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871851#comment-13871851 ]

Robert Meyer commented on PDFBOX-1645:

Hi John,

Thanks for taking the time to look at this and respond. I knew when I put this patch forward that it was computationally expensive, but I could not see any other way of retrieving accurate bounding box information for the characters. At the time I was having issues with character layout in documents generated by FOP, and this naturally led me to believe that it was an issue with the coordinates I was getting from the renderer. From further investigation, though, I now understand that this information is only used in SVG documents, so I am no longer convinced that it is vital or necessary for the OTF implementation in the project. I will therefore investigate further and submit a patch, should the need arise, once I have a better understanding of what is required.

[PATCH] Improved the accuracy of the bounding box for each rendered CFF glyph
Key: PDFBOX-1645
URL: https://issues.apache.org/jira/browse/PDFBOX-1645
Project: PDFBox
Issue Type: Improvement
Components: FontBox
Affects Versions: 1.8.2
Reporter: Robert Meyer
Assignee: Andreas Lehmkühler
Fix For: 2.0.0
Attachments: characterl.png, charactert.png, patch-20131202.diff, patch.diff

In a previous patch to the CharStringRenderer class, I resolved the rendering issues and added a method to retrieve the bounding box for a CFF glyph. This used the GeneralPath.getBounds() method to retrieve its bounding box. Unfortunately, it turned out that the method uses the control points of the Bézier curves rather than the curves themselves, so the result was not very accurate. I have therefore added several new methods to calculate the correct extents of the glyph, so that it now matches the measurements found in tools like FontForge.

As a side note, several checks on the number of arguments provided with an operator, which were originally added in my patch, were unfortunately removed. I have one Adobe font (Adobe Heiti Standard, a CID-keyed OTF) with one or more glyphs that trip up on this and cause an ArrayIndexOutOfBoundsException. Each glyph renders correctly even when this occurs, so I would be grateful if these checks could be left in. I have re-added them in the patch I am about to attach.

-- This message was sent by Atlassian JIRA (v6.1.5#6160)
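The difference between control-point bounds and the true extents of a curve, as described in the issue above, can be illustrated with a small sketch. This is not the code from the attached patch; the class name CubicBounds and its methods are hypothetical. It assumes a single cubic Bézier segment and finds its tight bounds by solving for the roots of the derivative of each coordinate, which is one common way to do this.

```java
/**
 * Illustrative sketch only (not the PDFBOX-1645 patch): tight bounds of one
 * cubic Bezier segment, computed from the roots of the curve's derivative.
 */
final class CubicBounds {

    /** Evaluates one coordinate of the cubic at parameter t in [0, 1]. */
    static double eval(double p0, double p1, double p2, double p3, double t) {
        double u = 1.0 - t;
        return u * u * u * p0 + 3 * u * u * t * p1 + 3 * u * t * t * p2 + t * t * t * p3;
    }

    /** Returns {min, max} of one coordinate over the whole segment. */
    static double[] extent(double p0, double p1, double p2, double p3) {
        // The end points always lie on the curve.
        double min = Math.min(p0, p3);
        double max = Math.max(p0, p3);
        // B'(t) / 3 = a*t^2 + b*t + c
        double a = -p0 + 3 * p1 - 3 * p2 + p3;
        double b = 2 * (p0 - 2 * p1 + p2);
        double c = p1 - p0;
        double[] roots = {Double.NaN, Double.NaN};
        if (Math.abs(a) < 1e-12) {
            // Derivative is linear (or constant): at most one root.
            if (Math.abs(b) >= 1e-12) {
                roots[0] = -c / b;
            }
        } else {
            double disc = b * b - 4 * a * c;
            if (disc >= 0) {
                double s = Math.sqrt(disc);
                roots[0] = (-b + s) / (2 * a);
                roots[1] = (-b - s) / (2 * a);
            }
        }
        for (double t : roots) {
            // Only interior extrema matter; NaN fails both comparisons.
            if (t > 0 && t < 1) {
                double v = eval(p0, p1, p2, p3, t);
                min = Math.min(min, v);
                max = Math.max(max, v);
            }
        }
        return new double[] {min, max};
    }

    /** Returns {minX, minY, maxX, maxY} for the segment P0..P3. */
    static double[] tightBounds(double x0, double y0, double x1, double y1,
                                double x2, double y2, double x3, double y3) {
        double[] x = extent(x0, x1, x2, x3);
        double[] y = extent(y0, y1, y2, y3);
        return new double[] {x[0], y[0], x[1], y[1]};
    }

    public static void main(String[] args) {
        // Arch-shaped curve: the control points reach y = 100, the curve does not.
        double[] b = tightBounds(0, 0, 0, 100, 100, 100, 100, 0);
        System.out.println("x: " + b[0] + " .. " + b[2] + ", y: " + b[1] + " .. " + b[3]);
    }
}
```

For the arch-shaped example in main, the control points span y = 0 to 100, whereas the curve itself peaks at y = 75 (at t = 0.5), which is the kind of overestimate the issue describes. A full glyph would need this applied to every curve segment and combined with the line segments.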
[jira] [Closed] (PDFBOX-1761) java.lang.StringIndexOutOfBoundsException: String index out of range: 2047
[ https://issues.apache.org/jira/browse/PDFBOX-1761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler closed PDFBOX-1761.

java.lang.StringIndexOutOfBoundsException: String index out of range: 2047
Key: PDFBOX-1761
URL: https://issues.apache.org/jira/browse/PDFBOX-1761
Project: PDFBox
Issue Type: Bug
Components: Parsing
Affects Versions: 1.8.2
Environment: JDK6
Reporter: William Palmer
Assignee: Andreas Lehmkühler
Priority: Minor

Using code samples provided in PDFBOX-1757 using load() and loadNonSeq() gives the following exception(s) for the test file:
- http://digitalcorpora.org/corp/nps/files/govdocs1/447/447403.pdf

java.lang.StringIndexOutOfBoundsException: String index out of range: 2047
    at java.lang.AbstractStringBuilder.deleteCharAt(AbstractStringBuilder.java:770)
    at java.lang.StringBuilder.deleteCharAt(StringBuilder.java:263)
    at org.apache.pdfbox.pdfparser.BaseParser.parseCOSHexString(BaseParser.java:1000)
    at org.apache.pdfbox.pdfparser.BaseParser.parseCOSString(BaseParser.java:808)

- using loadnonseq

java.lang.StringIndexOutOfBoundsException: String index out of range: 2047
    at java.lang.AbstractStringBuilder.deleteCharAt(AbstractStringBuilder.java:770)
    at java.lang.StringBuilder.deleteCharAt(StringBuilder.java:263)
    at org.apache.pdfbox.pdfparser.BaseParser.parseCOSHexString(BaseParser.java:1000)
    at org.apache.pdfbox.pdfparser.BaseParser.parseCOSString(BaseParser.java:808)
[jira] [Resolved] (PDFBOX-1761) java.lang.StringIndexOutOfBoundsException: String index out of range: 2047
[ https://issues.apache.org/jira/browse/PDFBOX-1761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler resolved PDFBOX-1761.
Resolution: Duplicate
Assignee: Andreas Lehmkühler

Closed as a duplicate of PDFBOX-1607. Please open a new issue if the text extraction issue still persists, and don't forget to attach a sample PDF.
[jira] [Commented] (PDFBOX-1844) [PATCH] Parser for Type 1 Fonts
[ https://issues.apache.org/jira/browse/PDFBOX-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871951#comment-13871951 ]

Tilman Hausherr commented on PDFBOX-1844:

The file of PDFBOX-1019 looks nice now! So does the redp file. But page 25 of the document of PDFBOX-1298 produces this:

15.01.2014 12:30:14.575 ERROR [main] org.apache.pdfbox.pdmodel.font.PDType1Font:251 - Can't read the embedded Type1 font Melior
java.io.IOException: Found Token[kind=INTEGER, text=-] but expected CHARSTRING
    at org.apache.fontbox.type1.Type1Parser.read(Type1Parser.java:637)
    at org.apache.fontbox.type1.Type1Parser.read(Type1Parser.java:637)
    at org.apache.fontbox.type1.Type1Parser.readSubrs(Type1Parser.java:551)
    at org.apache.fontbox.type1.Type1Parser.parseBinary(Type1Parser.java:421)
    at org.apache.fontbox.type1.Type1Parser.parse(Type1Parser.java:68)
    at org.apache.fontbox.type1.Type1Font.createWithSegments(Type1Font.java:70)
    at org.apache.pdfbox.pdmodel.font.PDType1Font.init(PDType1Font.java:247)
    at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:92)
    at org.apache.pdfbox.pdmodel.PDResources.getFonts(PDResources.java:204)
    at org.apache.pdfbox.util.PDFStreamEngine.getFonts(PDFStreamEngine.java:580)
    at org.apache.pdfbox.util.operator.SetTextFont.process(SetTextFont.java:54)
    at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:529)
    at pdfboxpageimageextraction.MyPDFStreamEngine.processOperator(MyPDFStreamEngine.java:167)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:258)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:225)
    at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:205)
    at pdfboxpageimageextraction.MyPDFStreamEngine.getMaxDpi(MyPDFStreamEngine.java:53)
    at pdfboxpageimageextraction.ExtractImages.doPdf(ExtractImages.java:221)
    at pdfboxpageimageextraction.ExtractImages.main(ExtractImages.java:80)

15.01.2014 12:30:14.622 ERROR [main] org.apache.pdfbox.pdmodel.font.PDType1Font:251 - Can't read the embedded Type1 font RotisSemiSans
java.io.IOException: Found Token[kind=INTEGER, text=-] but expected CHARSTRING
    at org.apache.fontbox.type1.Type1Parser.read(Type1Parser.java:637)
    at org.apache.fontbox.type1.Type1Parser.readSubrs(Type1Parser.java:551)
    at org.apache.fontbox.type1.Type1Parser.parseBinary(Type1Parser.java:421)
    at org.apache.fontbox.type1.Type1Parser.parse(Type1Parser.java:68)
    at org.apache.fontbox.type1.Type1Font.createWithSegments(Type1Font.java:70)
    at org.apache.pdfbox.pdmodel.font.PDType1Font.init(PDType1Font.java:247)
    at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:92)
    at org.apache.pdfbox.pdmodel.PDResources.getFonts(PDResources.java:204)
    at org.apache.pdfbox.util.PDFStreamEngine.getFonts(PDFStreamEngine.java:580)
    at org.apache.pdfbox.util.operator.SetTextFont.process(SetTextFont.java:54)
    at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:529)
    at pdfboxpageimageextraction.MyPDFStreamEngine.processOperator(MyPDFStreamEngine.java:167)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:258)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:225)
    at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:205)
    at pdfboxpageimageextraction.MyPDFStreamEngine.getMaxDpi(MyPDFStreamEngine.java:53)
    at pdfboxpageimageextraction.ExtractImages.doPdf(ExtractImages.java:221)
    at pdfboxpageimageextraction.ExtractImages.main(ExtractImages.java:80)

15.01.2014 12:30:14.637 ERROR [main] org.apache.pdfbox.pdmodel.font.PDType1Font:251 - Can't read the embedded Type1 font MT-Extra
java.io.IOException: Found Token[kind=NAME, text=def] but expected dup
    at org.apache.fontbox.type1.Type1Parser.read(Type1Parser.java:651)
    at org.apache.fontbox.type1.Type1Parser.parseASCII(Type1Parser.java:143)
    at org.apache.fontbox.type1.Type1Parser.parse(Type1Parser.java:65)
    at org.apache.fontbox.type1.Type1Font.createWithSegments(Type1Font.java:70)
    at org.apache.pdfbox.pdmodel.font.PDType1Font.init(PDType1Font.java:247)
    at org.apache.pdfbox.pdmodel.font.PDFontFactory.createFont(PDFontFactory.java:92)
    at org.apache.pdfbox.pdmodel.PDResources.getFonts(PDResources.java:204)
    at org.apache.pdfbox.util.PDFStreamEngine.getFonts(PDFStreamEngine.java:580)
    at org.apache.pdfbox.util.operator.SetTextFont.process(SetTextFont.java:54)
    at
[jira] [Updated] (PDFBOX-1847) TSA Time Signature
[ https://issues.apache.org/jira/browse/PDFBOX-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vakhtang koroghlishvili updated PDFBOX-1847:
Attachment: resultOfSigning.jpg

TSA Time Signature
Key: PDFBOX-1847
URL: https://issues.apache.org/jira/browse/PDFBOX-1847
Project: PDFBox
Issue Type: Improvement
Components: Signing
Affects Versions: 1.8.4
Reporter: vakhtang koroghlishvili
Attachments: TSATimeSignature.patch, resultOfSigning.jpg

Until now, when signing a document, we have been using our own local time. For more security we can use a time stamp server.

Trusted timestamping is the process of securely keeping track of the creation and modification time of a document. Security here means that no one, not even the owner of the document, should be able to change it once it has been recorded, provided that the timestamper's integrity is never compromised. (wiki)
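To make the idea of a trusted time stamp more concrete: in the RFC 3161 protocol, the client does not send the document itself to the time stamp server, only a hash of the data to be stamped (the "message imprint"). The sketch below shows just that first step using the JDK alone. It is not taken from TSATimeSignature.patch; the class name TimestampImprint and the choice of SHA-256 are assumptions, and building and sending the actual request would need an ASN.1/CMS library that is not shown here.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/**
 * Illustrative sketch only: computes the hash that a client would place in
 * the message imprint of a time stamp request. Names are hypothetical.
 */
final class TimestampImprint {

    /** Returns the SHA-256 digest of the data that should be time-stamped. */
    static byte[] messageImprint(byte[] data) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(data);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    /** Lower-case hex encoding, for logging or comparison. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Placeholder bytes standing in for the data to be time-stamped.
        byte[] data = "abc".getBytes(StandardCharsets.US_ASCII);
        System.out.println(toHex(messageImprint(data)));
    }
}
```

Because only the digest leaves the machine, the time stamp server never sees the document content; it signs the digest together with its own trusted time.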
[jira] [Updated] (PDFBOX-1847) TSA Time Signature
[ https://issues.apache.org/jira/browse/PDFBOX-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vakhtang koroghlishvili updated PDFBOX-1847:
Attachment: TSATimeSignature.patch
[jira] [Created] (PDFBOX-1847) TSA Time Signature
vakhtang koroghlishvili created PDFBOX-1847:

Summary: TSA Time Signature
Key: PDFBOX-1847
URL: https://issues.apache.org/jira/browse/PDFBOX-1847
Project: PDFBox
Issue Type: Improvement
Components: Signing
Affects Versions: 1.8.4
Reporter: vakhtang koroghlishvili
Attachments: TSATimeSignature.patch, resultOfSigning.jpg
[jira] [Updated] (PDFBOX-1847) TSA Time Signature
[ https://issues.apache.org/jira/browse/PDFBOX-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vakhtang koroghlishvili updated PDFBOX-1847:
Attachment: (was: TSATimeSignature.patch)
[jira] [Updated] (PDFBOX-1847) TSA Time Signature
[ https://issues.apache.org/jira/browse/PDFBOX-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vakhtang koroghlishvili updated PDFBOX-1847:
Attachment: TSATimeSignature.patch
[jira] [Commented] (PDFBOX-1847) TSA Time Signature
[ https://issues.apache.org/jira/browse/PDFBOX-1847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871961#comment-13871961 ]

vakhtang koroghlishvili commented on PDFBOX-1847:

I have tested with many timestamping authority servers, and it works well! :)
Re: TSA
Hello,

It's nice to meet you. :)

At the moment, I think the most important parts are:
- Part 3: PAdES Enhanced – PAdES-Basic Electronic Signatures and PAdES-Explicit Policy Electronic Signatures Profiles
- Part 4: PAdES Long Term – PAdES-Long Term Validation Profile

Also, the API must be very simple. For this, the signbox architecture will look like this: we just create a signature object, pass it the arguments, and call the sign method.

I have already implemented the time stamp signature (PDFBOX-1847, https://issues.apache.org/jira/browse/PDFBOX-1847). Now I will implement the TSA document-level signature. Then I will implement PAdES Long Term (this means that we must create a Document Secure Store) :)

Thanks,
Best regards

On Fri, Jan 10, 2014 at 4:57 PM, Thomas Chojecki i...@rayman2200.de wrote:

Bump this mail for Vakhtang. :)

On 2014-01-09 12:43, Maruan Sahyoun wrote:

Hi Vakhtang, hi Thomas,

signing a PDF seems to be one of the top use cases for PDFBox. Maybe you can outline what has to be done to a 'core' of PDFBox, i.e. what might be missing from a PDF perspective and what is part of a signing application. Having a reference implementation for signing using PDFBox would be very good. And as modularization is one of the themes of PDFBox 2, we could organize it accordingly.

BR
Maruan

On 09.01.2014 at 12:30, Thomas Chojecki i...@rayman2200.de wrote:

On 2014-01-09 11:45, Vakhtang koroghlishvili wrote:

> Hello,

Hi Vakhtang,

> I can implement this feature :)
> 2) After I implement the TSA time feature, which I have described above, I think it would be very good if we implemented (I can implement this too) the PAdES LTV profile, in other words the PAdES-Long Term Validation Profile. See the specification: http://www.etsi.org/deliver/etsi_ts/102700_102799/10277804/01.01.01_60/ts_10277804v010101p.pdf

This would be great. Maybe we can someday create a new subproject, signbox or something similar, that will be shipped within the pdfbox project, so that a custom solution isn't needed. Since the middle of December I have been without internet, waiting for the provider, so I can only do mail support at the moment.

> I think they are very important and useful features. :)

Of course. This would be an important step towards improving the signature functionality. The last ETSI Plugtest (interop test) showed that some companies are using pdfbox for their signature solutions. Maybe a general base would help it to be more interoperable with other libraries.

> So, what do you think about these features? :)

Great idea. And thanks for investing so much time to improve the signature base.

Best regards
Thomas
[jira] [Commented] (PDFBOX-1416) missing image in reader/image output
[ https://issues.apache.org/jira/browse/PDFBOX-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13872000#comment-13872000 ]

Alex Boyarintsev commented on PDFBOX-1416:

Hi! I have a similar issue when I try to convert a PDF with a gradient to JPG.

missing image in reader/image output
Key: PDFBOX-1416
URL: https://issues.apache.org/jira/browse/PDFBOX-1416
Project: PDFBox
Issue Type: Bug
Components: PDFReader
Affects Versions: 1.7.1
Reporter: Joseph Berglund
Assignee: Andreas Lehmkühler
Attachments: missing_image.pdf, missing_image1.jpg, missing_image2.jpg

See attached images and PDF. One image on the first page is missing. Two images have their backgrounds turned to blue.

Output from PDFToImage:

Sep 24, 2012 11:15:05 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: BDC
Sep 24, 2012 11:15:05 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: ri
Sep 24, 2012 11:15:05 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: EMC
Sep 24, 2012 11:15:05 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: BX
Sep 24, 2012 11:15:05 AM org.apache.pdfbox.util.operator.pagedrawer.SHFill process
WARNING: java.lang.NullPointerException
java.lang.NullPointerException
    at org.apache.pdfbox.pdmodel.graphics.shading.RadialShadingContext.getRaster(RadialShadingContext.java:282)
    at sun.java2d.pipe.AlphaPaintPipe.renderPathTile(Unknown Source)
    at sun.java2d.pipe.SpanShapeRenderer$Composite.renderBox(Unknown Source)
    at sun.java2d.pipe.SpanShapeRenderer.spanClipLoop(Unknown Source)
    at sun.java2d.pipe.SpanShapeRenderer.renderSpans(Unknown Source)
    at sun.java2d.pipe.SpanShapeRenderer.fill(Unknown Source)
    at sun.java2d.pipe.ValidatePipe.fill(Unknown Source)
    at sun.java2d.SunGraphics2D.fill(Unknown Source)
    at org.apache.pdfbox.pdfviewer.PageDrawer.shFill(PageDrawer.java:498)
    at org.apache.pdfbox.util.operator.pagedrawer.SHFill.process(SHFill.java:57)
    at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:556)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:270)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:237)
    at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:217)
    at org.apache.pdfbox.pdfviewer.PageDrawer.drawPage(PageDrawer.java:119)
    at org.apache.pdfbox.pdmodel.PDPage.convertToImage(PDPage.java:730)
    at org.apache.pdfbox.util.PDFImageWriter.writeImage(PDFImageWriter.java:115)
    at org.apache.pdfbox.PDFToImage.main(PDFToImage.java:244)
    at org.apache.pdfbox.PDFBox.main(PDFBox.java:58)
Sep 24, 2012 11:15:05 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: EX
Sep 24, 2012 11:15:05 AM org.apache.pdfbox.util.PDFStreamEngine processOperator
INFO: unsupported/disabled operation: i
Sep 24, 2012 11:15:05 AM org.apache.pdfbox.util.operator.pagedrawer.SHFill process
WARNING: java.lang.NullPointerException
java.lang.NullPointerException
    at org.apache.pdfbox.pdmodel.graphics.shading.AxialShadingContext.getRaster(AxialShadingContext.java:244)
    at sun.java2d.pipe.AlphaPaintPipe.renderPathTile(Unknown Source)
    at sun.java2d.pipe.SpanShapeRenderer$Composite.renderBox(Unknown Source)
    at sun.java2d.pipe.SpanShapeRenderer.spanClipLoop(Unknown Source)
    at sun.java2d.pipe.SpanShapeRenderer.renderSpans(Unknown Source)
    at sun.java2d.pipe.SpanShapeRenderer.fill(Unknown Source)
    at sun.java2d.pipe.ValidatePipe.fill(Unknown Source)
    at sun.java2d.SunGraphics2D.fill(Unknown Source)
    at org.apache.pdfbox.pdfviewer.PageDrawer.shFill(PageDrawer.java:498)
    at org.apache.pdfbox.util.operator.pagedrawer.SHFill.process(SHFill.java:57)
    at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:556)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:270)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:237)
    at org.apache.pdfbox.util.operator.pagedrawer.Invoke.process(Invoke.java:137)
    at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:556)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:270)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:237)
    at
[jira] [Updated] (PDFBOX-1848) Time Stamp Document Level Sigature
[ https://issues.apache.org/jira/browse/PDFBOX-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vakhtang koroghlishvili updated PDFBOX-1848:

Description:
We need a TSA document-level signature module too! At the moment we sign a document with our own certificate, but sometimes we need to sign the document with a TSA too. This is an important part of signing. Sometimes it is very important: for instance, when we implement the PAdES 4 profile this module will be essential, because without it the Document Secure Store will not work :)
I'm working on this improvement and will finish it soon; it's almost done. I only need to add some Javadoc, and I might change the architecture design, etc. So, please assign this to me :)
I will upload the patch as soon as possible :)

was:
We need a TSA document-level signature module too! At the moment we sign a document with our own certificate, but sometimes we need to sign the document with a TSA too. This is an important part of signing. Sometimes it is very important: for instance, when we implement the PAdES 4 profile this module will be essential, because without it the Document Secure Store will not work :)
I'm working on this improvement and will finish it soon; it's almost done. I only need to add some Javadoc, and I might change the architecture design, etc. So, please assign this to me :)

Time Stamp Document Level Sigature
Key: PDFBOX-1848
URL: https://issues.apache.org/jira/browse/PDFBOX-1848
Project: PDFBox
Issue Type: Improvement
Components: Signing
Reporter: vakhtang koroghlishvili
[jira] [Created] (PDFBOX-1848) Time Stamp Document Level Sigature
vakhtang koroghlishvili created PDFBOX-1848:

Summary: Time Stamp Document Level Sigature
Key: PDFBOX-1848
URL: https://issues.apache.org/jira/browse/PDFBOX-1848
Project: PDFBox
Issue Type: Improvement
Reporter: vakhtang koroghlishvili

We need a TSA document-level signature module too! At the moment we sign a document with our own certificate, but sometimes we need to sign the document with a TSA too; in that case the signer certificate is the TSA certificate. This is an important part of signing. Sometimes it is very important: for instance, when we implement the PAdES 4 profile this module will be essential.
I'm working on this improvement and will finish it soon; it's almost done. I only need to add some Javadoc and might change the architecture design, etc. So, please assign this to me :)
[jira] [Updated] (PDFBOX-1848) Time Stamp Document Level Sigature
[ https://issues.apache.org/jira/browse/PDFBOX-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vakhtang koroghlishvili updated PDFBOX-1848:
Component/s: Signing
[jira] [Updated] (PDFBOX-1848) Time Stamp Document Level Sigature
[ https://issues.apache.org/jira/browse/PDFBOX-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

vakhtang koroghlishvili updated PDFBOX-1848:

Description:
We need a TSA document-level signature module too! At the moment we sign a document with our own certificate, but sometimes we need to sign the document with a TSA too. This is an important part of signing. Sometimes it is very important: for instance, when we implement the PAdES 4 profile this module will be essential, because without it the Document Secure Store will not work :)
I'm working on this improvement and will finish it soon; it's almost done. I only need to add some Javadoc, and I might change the architecture design, etc. So, please assign this to me :)

was:
We need a TSA document-level signature module too! At the moment we sign a document with our own certificate, but sometimes we need to sign the document with a TSA too; in that case the signer certificate is the TSA certificate. This is an important part of signing. Sometimes it is very important: for instance, when we implement the PAdES 4 profile this module will be essential.
I'm working on this improvement and will finish it soon; it's almost done. I only need to add some Javadoc and might change the architecture design, etc. So, please assign this to me :)

Time Stamp Document Level Sigature
Key: PDFBOX-1848
URL: https://issues.apache.org/jira/browse/PDFBOX-1848
Project: PDFBox
Issue Type: Improvement
Components: Signing
Reporter: vakhtang koroghlishvili
[jira] [Commented] (PDFBOX-1848) Time Stamp Document Level Sigature
[ https://issues.apache.org/jira/browse/PDFBOX-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13872142#comment-13872142 ]

Thomas Chojecki commented on PDFBOX-1848:

Assigning tickets to you isn't possible; only committers can be assigned. But this is no problem: you can provide the patch, and someone from the pdfbox team will look at the code and commit it.
[jira] [Commented] (PDFBOX-1848) Time Stamp Document Level Sigature
[ https://issues.apache.org/jira/browse/PDFBOX-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13872145#comment-13872145 ]

vakhtang koroghlishvili commented on PDFBOX-1848:

No problem :) thank you very much :)
[jira] [Comment Edited] (PDFBOX-1848) Time Stamp Document Level Signature
[ https://issues.apache.org/jira/browse/PDFBOX-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872145#comment-13872145 ] vakhtang koroghlishvili edited comment on PDFBOX-1848 at 1/15/14 2:53 PM: -- No problem :) thank you :) was (Author: v.koroghlishvili): No problem :) thank you very much :) Time Stamp Document Level Signature -- Key: PDFBOX-1848 URL: https://issues.apache.org/jira/browse/PDFBOX-1848 Project: PDFBox Issue Type: Improvement Components: Signing Reporter: vakhtang koroghlishvili
[jira] [Commented] (PDFBOX-1416) missing image in reader/image output
[ https://issues.apache.org/jira/browse/PDFBOX-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872162#comment-13872162 ] Tilman Hausherr commented on PDFBOX-1416: - Alex - please attach the pdf, if it isn't confidential. missing image in reader/image output Key: PDFBOX-1416 URL: https://issues.apache.org/jira/browse/PDFBOX-1416 Project: PDFBox Issue Type: Bug Components: PDFReader Affects Versions: 1.7.1 Reporter: Joseph Berglund Assignee: Andreas Lehmkühler Attachments: missing_image.pdf, missing_image1.jpg, missing_image2.jpg See attached images and PDF. One image on the first page is missing. Two images have their backgrounds turned to blue. Output from PDFToImage Sep 24, 2012 11:15:05 AM org.apache.pdfbox.util.PDFStreamEngine processOperator INFO: unsupported/disabled operation: BDC Sep 24, 2012 11:15:05 AM org.apache.pdfbox.util.PDFStreamEngine processOperator INFO: unsupported/disabled operation: ri Sep 24, 2012 11:15:05 AM org.apache.pdfbox.util.PDFStreamEngine processOperator INFO: unsupported/disabled operation: EMC Sep 24, 2012 11:15:05 AM org.apache.pdfbox.util.PDFStreamEngine processOperator INFO: unsupported/disabled operation: BX Sep 24, 2012 11:15:05 AM org.apache.pdfbox.util.operator.pagedrawer.SHFill process WARNING: java.lang.NullPointerException java.lang.NullPointerException at org.apache.pdfbox.pdmodel.graphics.shading.RadialShadingContext.getRaster(RadialShadingContext.java:282) at sun.java2d.pipe.AlphaPaintPipe.renderPathTile(Unknown Source) at sun.java2d.pipe.SpanShapeRenderer$Composite.renderBox(Unknown Source) at sun.java2d.pipe.SpanShapeRenderer.spanClipLoop(Unknown Source) at sun.java2d.pipe.SpanShapeRenderer.renderSpans(Unknown Source) at sun.java2d.pipe.SpanShapeRenderer.fill(Unknown Source) at sun.java2d.pipe.ValidatePipe.fill(Unknown Source) at sun.java2d.SunGraphics2D.fill(Unknown Source) at org.apache.pdfbox.pdfviewer.PageDrawer.shFill(PageDrawer.java:498) at 
org.apache.pdfbox.util.operator.pagedrawer.SHFill.process(SHFill.java:57) at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:556) at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:270) at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:237) at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:217) at org.apache.pdfbox.pdfviewer.PageDrawer.drawPage(PageDrawer.java:119) at org.apache.pdfbox.pdmodel.PDPage.convertToImage(PDPage.java:730) at org.apache.pdfbox.util.PDFImageWriter.writeImage(PDFImageWriter.java:115) at org.apache.pdfbox.PDFToImage.main(PDFToImage.java:244) at org.apache.pdfbox.PDFBox.main(PDFBox.java:58) Sep 24, 2012 11:15:05 AM org.apache.pdfbox.util.PDFStreamEngine processOperator INFO: unsupported/disabled operation: EX Sep 24, 2012 11:15:05 AM org.apache.pdfbox.util.PDFStreamEngine processOperator INFO: unsupported/disabled operation: i Sep 24, 2012 11:15:05 AM org.apache.pdfbox.util.operator.pagedrawer.SHFill process WARNING: java.lang.NullPointerException java.lang.NullPointerException at org.apache.pdfbox.pdmodel.graphics.shading.AxialShadingContext.getRaster(AxialShadingContext.java:244) at sun.java2d.pipe.AlphaPaintPipe.renderPathTile(Unknown Source) at sun.java2d.pipe.SpanShapeRenderer$Composite.renderBox(Unknown Source) at sun.java2d.pipe.SpanShapeRenderer.spanClipLoop(Unknown Source) at sun.java2d.pipe.SpanShapeRenderer.renderSpans(Unknown Source) at sun.java2d.pipe.SpanShapeRenderer.fill(Unknown Source) at sun.java2d.pipe.ValidatePipe.fill(Unknown Source) at sun.java2d.SunGraphics2D.fill(Unknown Source) at org.apache.pdfbox.pdfviewer.PageDrawer.shFill(PageDrawer.java:498) at org.apache.pdfbox.util.operator.pagedrawer.SHFill.process(SHFill.java:57) at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:556) at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:270) at 
org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:237) at org.apache.pdfbox.util.operator.pagedrawer.Invoke.process(Invoke.java:137) at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:556) at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:270) at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:237) at
[jira] [Comment Edited] (PDFBOX-1848) Time Stamp Document Level Signature
[ https://issues.apache.org/jira/browse/PDFBOX-1848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872145#comment-13872145 ] vakhtang koroghlishvili edited comment on PDFBOX-1848 at 1/15/14 3:20 PM: -- ok ... :) was (Author: v.koroghlishvili): No problem :) thank you :) Time Stamp Document Level Signature -- Key: PDFBOX-1848 URL: https://issues.apache.org/jira/browse/PDFBOX-1848 Project: PDFBox Issue Type: Improvement Components: Signing Reporter: vakhtang koroghlishvili
[jira] [Commented] (PDFBOX-615) shfill operator needs implementation
[ https://issues.apache.org/jira/browse/PDFBOX-615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872242#comment-13872242 ] Tilman Hausherr commented on PDFBOX-615: My change from yesterday doesn't work for the ch14.pdf file, because 1) the color data for the triangles is partially incomplete, 2) my code works for pattern shading, not for shfill. I'll use a different strategy which will still use most of the new code. shfill operator needs implementation Key: PDFBOX-615 URL: https://issues.apache.org/jira/browse/PDFBOX-615 Project: PDFBox Issue Type: New Feature Components: PDModel Reporter: Daniel Wilson Assignee: Daniel Wilson Attachments: Centerplan.pdf, DECAHED.pdf, axial-input-after.png, axial-input-before.png, axial-input.pdf, bugzilla843488.pdf, bugzilla843488.pdf-1.png, color_gradient.pdf, color_gradient.pdf-1.png, decahed.pdf-1.png, input.pdf, input1.png, pdfbox-1.8.patch, pdfbox.patch, pslib-shading.pdf, radial-input-after.png, radial-input-before.png, radial-input.pdf, shading_pattern.pdf, shading_pattern.pdf-2.png I have a PDF file (for which I do not yet have release permission) that uses the sh operator, equivalent to PostScript's shfill (per PDF spec 1.7 page 987). Adobe provides implementation guidance in a 78-page document at http://www.adobe.com/devnet/postscript/pdfs/TN5600.SmoothShading.pdf#17 I will be trying to add this functionality this week, but if anyone has hints, suggestions, etc. they are most certainly welcome! -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (PDFBOX-1416) missing image in reader/image output
[ https://issues.apache.org/jira/browse/PDFBOX-1416?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872251#comment-13872251 ] Tilman Hausherr commented on PDFBOX-1416: - Page 1 of the file has type 2 and 3 shading. The background looks different. missing image in reader/image output Key: PDFBOX-1416 URL: https://issues.apache.org/jira/browse/PDFBOX-1416 Project: PDFBox Issue Type: Bug Components: PDFReader Affects Versions: 1.7.1 Reporter: Joseph Berglund Assignee: Andreas Lehmkühler Attachments: missing_image.pdf, missing_image1.jpg, missing_image2.jpg
Re: TSA
On 2014-01-15 13:00, Vakhtang koroghlishvili wrote: Hello, Hi, [SNIP] I've already implemented the time stamp signature (PDFBOX-1847 https://issues.apache.org/jira/browse/PDFBOX-1847 ). Now I will implement the TSA document-level signature. Then I will implement PAdES Long Term (this means that we must create the Document Secure Store) :) Thanks again for your effort. As you are planning to contribute some more code, it would be good to sign a Contributor License Agreement (CLA) [1]. If the code you contribute comes from you and not from your company, you can sign the Individual CLA [2]. If it is company work and more people worked on the code, you should sign a Corporate CLA [3]. If you have questions, don't hesitate to ask :-) thanks, Best regards Best regards Thomas [1] http://www.apache.org/licenses/ [2] http://www.apache.org/licenses/icla.pdf [3] http://www.apache.org/licenses/cla-corporate.txt
[jira] [Commented] (PDFBOX-1812) Illegal characters in XML output
[ https://issues.apache.org/jira/browse/PDFBOX-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872279#comment-13872279 ] Johan van der Knijff commented on PDFBOX-1812: -- Thanks again Andreas and Guillaume, I just did a quick test with the latest build and it all seems to work for me as well! Illegal characters in XML output Key: PDFBOX-1812 URL: https://issues.apache.org/jira/browse/PDFBOX-1812 Project: PDFBox Issue Type: Bug Components: Preflight Affects Versions: 2.0.0 Environment: Bug reproduced under Win 7, Ubuntu Reporter: Johan van der Knijff Assignee: Andreas Lehmkühler Labels: characters, utf-8, xml Fix For: 1.8.4, 2.0.0 Attachments: 013814.pdf, 013814.xml, 013814_old.xml, 1812-additionalPDFs09012014.zip, 598659.pdf, 598659.xml, 598659_old.xml, 600111.pdf, 600111.xml, 600111_old.xml, preflight-app.jar When running Preflight in XML mode, the latest Preflight version (I used the JAR from build #747) sometimes produces output that contains characters that are illegal in XML. This can cause unexpected behavior if such files are further processed with tools that expect well-formed XML. See attached PDFs, which all result in illegal characters in the description of a 1.0 Syntax error, Error: Expected a long type. Output of older versions of Preflight didn't contain these illegal characters; instead they would give something like *actual='/O'*, *actual='Pages'*. etc. So I suppose this must have been caused by a fairly recent change. See attachments below. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
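Editor's note on PDFBOX-1812: the report turns on characters that XML 1.0 does not allow in a document at all (most control characters, unpaired surrogates, U+FFFE and U+FFFF). The following is a minimal sketch of how such characters could be filtered out of a message before it is written to an XML report. The class and method names are hypothetical, and this is not claimed to be the change that was actually committed to Preflight.

```java
// Hypothetical helper, not part of PDFBox or Preflight.
class XmlCharFilter {

    // True if the code point matches the "Char" production of XML 1.0:
    // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
    static boolean isLegalXml10(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
                || (cp >= 0x20 && cp <= 0xD7FF)
                || (cp >= 0xE000 && cp <= 0xFFFD)
                || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    // Returns a copy of the input with every illegal code point dropped.
    static String stripIllegal(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (isLegalXml10(cp)) {
                sb.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The NUL and SOH characters are removed; the tab is kept.
        System.out.println(stripIllegal("actual='/O\u0000\u0001'\tend"));
    }
}
```

Dropping characters loses information; a report generator might prefer to replace them with a visible placeholder instead, which only requires changing the branch that skips the code point.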
Re: TSA
Hi, I guess I should update my patches in order to add the licenses. It's my own code, so I will add a Contributor License Agreement (CLA) [1]. :) Best regards, On Wed, Jan 15, 2014 at 8:49 PM, Thomas Chojecki i...@rayman2200.de wrote: On 2014-01-15 13:00, Vakhtang koroghlishvili wrote: Hello, Hi, [SNIP] I've already implemented the time stamp signature (PDFBOX-1847 https://issues.apache.org/jira/browse/PDFBOX-1847 ). Now I will implement the TSA document-level signature. Then I will implement PAdES Long Term (this means that we must create the Document Secure Store) :) Thanks again for your effort. As you are planning to contribute some more code, it would be good to sign a Contributor License Agreement (CLA) [1]. If the code you contribute comes from you and not from your company, you can sign the Individual CLA [2]. If it is company work and more people worked on the code, you should sign a Corporate CLA [3]. If you have questions, don't hesitate to ask :-) thanks, Best regards Best regards Thomas [1] http://www.apache.org/licenses/ [2] http://www.apache.org/licenses/icla.pdf [3] http://www.apache.org/licenses/cla-corporate.txt
[jira] [Commented] (PDFBOX-1646) [PATCH] Add method for retrieving CFF bounding box from CFFFont class with slight refactoring and optimization.
[ https://issues.apache.org/jira/browse/PDFBOX-1646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872407#comment-13872407 ] John Hewson commented on PDFBOX-1646: - The FOP project no longer requires the bounding box patch [1], which makes this issue redundant, can we close it? [1] https://issues.apache.org/jira/browse/PDFBOX-1645?focusedCommentId=13871851page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13871851 [PATCH] Add method for retrieving CFF bounding box from CFFFont class with slight refactoring and optimization. --- Key: PDFBOX-1646 URL: https://issues.apache.org/jira/browse/PDFBOX-1646 Project: PDFBox Issue Type: Improvement Components: FontBox Affects Versions: 1.8.2 Reporter: Robert Meyer Assignee: Andreas Lehmkühler Fix For: 2.0.0 Attachments: patch-optimize.diff, patch.diff I have added a method to the CFFFont class to retrieve the bounding box for a character determined by an SID as well as retrieving a name. I have also slightly modified the existing code so that each sid mapping can be retrieved now using the SID as the key from a map. From looking around there are several examples of where iterative loops are used using the original mapping array: CFFFontROS.java:165 CFFParser.java:876 I haven't changed those locations yet, but they can be made in a separate patch which should boost performance. There was a small bit of refactoring done as well just because I now retrieve a renderer from two locations. These patches are part of adding OTF CFF support to Apache FOP. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (PDFBOX-1645) [PATCH] Improved the accuracy of the bounding box for each rendered CFF glyph
[ https://issues.apache.org/jira/browse/PDFBOX-1645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872408#comment-13872408 ] John Hewson commented on PDFBOX-1645: - Ok, let's leave this issue open for now while I look into your operand checks. [PATCH] Improved the accuracy of the bounding box for each rendered CFF glyph - Key: PDFBOX-1645 URL: https://issues.apache.org/jira/browse/PDFBOX-1645 Project: PDFBox Issue Type: Improvement Components: FontBox Affects Versions: 1.8.2 Reporter: Robert Meyer Assignee: Andreas Lehmkühler Fix For: 2.0.0 Attachments: characterl.png, charactert.png, patch-20131202.diff, patch.diff In a previous patch to the CharStringRenderer class, I resolved the rendering issues and added a method to retrieve the bounding box for a CFF glyph. This used the GeneralPath.getBounds() method to retrieve its bounding box. Unfortunately it was found that the method uses the control points of the Bézier curves instead of the actual curves and was not very accurate. I have therefore added several new methods to calculate the correct extents of the glyph so that it now matches the measurements found in tools like FontForge. As a side note, several checks relating to the number of arguments provided with an operator were originally added in my patch and were unfortunately removed. I have one Adobe font (Adobe Heiti Standard - CID-Keyed OTF) with one or more glyphs that trip up on this and cause an array index out of bounds exception. Each glyph renders correctly even though this issue occurs, and I would therefore be grateful if these checks could be left in. I have re-added them in the patch I am about to attach. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
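Editor's note on PDFBOX-1645: the difference between a control-point box and the true extent of a curve can be shown with a single cubic Bézier segment. The sketch below is not the code from patch.diff; it is a minimal illustration with hypothetical names, assuming the usual approach of finding where the derivative of each coordinate is zero and evaluating the curve there.

```java
// Hypothetical helper, not taken from the FontBox patch.
class CubicBezierBounds {

    // Value of one coordinate of a cubic Bezier at parameter t.
    static double eval(double p0, double p1, double p2, double p3, double t) {
        double u = 1 - t;
        return u * u * u * p0 + 3 * u * u * t * p1 + 3 * u * t * t * p2 + t * t * t * p3;
    }

    // {min, max} of one coordinate over t in [0, 1].
    static double[] range(double p0, double p1, double p2, double p3) {
        double min = Math.min(p0, p3);
        double max = Math.max(p0, p3);
        // B'(t) / 3 = a t^2 + b t + c
        double d0 = p1 - p0, d1 = p2 - p1, d2 = p3 - p2;
        double a = d0 - 2 * d1 + d2;
        double b = 2 * (d1 - d0);
        double c = d0;
        double[] roots;
        if (Math.abs(a) < 1e-12) {
            // Derivative is linear (or constant).
            roots = Math.abs(b) < 1e-12 ? new double[0] : new double[] { -c / b };
        } else {
            double disc = b * b - 4 * a * c;
            if (disc < 0) {
                roots = new double[0];
            } else {
                double s = Math.sqrt(disc);
                roots = new double[] { (-b + s) / (2 * a), (-b - s) / (2 * a) };
            }
        }
        for (double t : roots) {
            if (t > 0 && t < 1) {
                double v = eval(p0, p1, p2, p3, t);
                min = Math.min(min, v);
                max = Math.max(max, v);
            }
        }
        return new double[] { min, max };
    }

    // {minX, minY, maxX, maxY} of the curve itself, ignoring where the control points lie.
    static double[] bounds(double x0, double y0, double x1, double y1,
                           double x2, double y2, double x3, double y3) {
        double[] rx = range(x0, x1, x2, x3);
        double[] ry = range(y0, y1, y2, y3);
        return new double[] { rx[0], ry[0], rx[1], ry[1] };
    }

    public static void main(String[] args) {
        // Control points reach y = 100, but the curve itself peaks at y = 75.
        double[] b = bounds(0, 0, 0, 100, 100, 100, 100, 0);
        System.out.println(b[0] + " " + b[1] + " " + b[2] + " " + b[3]);
    }
}
```

For the sample curve the control-point box is 100 units high while the curve only reaches 75, which is the kind of discrepancy the issue describes. The extra root finding per segment is also where the computational cost mentioned in the discussion comes from.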
[jira] [Commented] (PDFBOX-1796) Infiniteloop BaseParser.java:1010
[ https://issues.apache.org/jira/browse/PDFBOX-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872434#comment-13872434 ] Andreas Lehmkühler commented on PDFBOX-1796: I added the fix to the 1.8 branch in revision 1558517. Infiniteloop BaseParser.java:1010 - Key: PDFBOX-1796 URL: https://issues.apache.org/jira/browse/PDFBOX-1796 Project: PDFBox Issue Type: Bug Components: Parsing Affects Versions: 1.8.3 Reporter: Manfred Schauer Fix For: 1.8.4, 2.0.0 Attachments: dls.pdf, rsag.pdf Infinite loop at org.apache.pdfbox.pdfparser.BaseParser.parseCOSHexString(BaseParser.java:1010):
private final COSString parseCOSHexString() throws IOException {
    ...
    // read till the closing bracket was found
    do {
        c = pdfSource.read();
    } while ( c != '>' );
    ...
If pdfSource.read() returns EOF, the loop never terminates. Test case: PDDocument doc = PDDocument.load (new FileInputStream(...)); Two real-world PDF files that trigger the loop are attached; I do not know whether they are completely valid PDFs, but at least they are displayed by Preview on Mac OS X. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
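Editor's note on PDFBOX-1796: the loop in the report waits for the closing bracket `>` of a hex string and never checks for end of stream, so a truncated file keeps it spinning because read() returns -1 forever. The following is a minimal sketch of an EOF-aware version of such a loop. It uses a plain java.io.InputStream and a hypothetical class name; it is not the change committed in revision 1558517, whose content is not shown in this thread.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-alone illustration, not PDFBox code.
class HexStringSkipper {

    // Consumes bytes up to and including the closing '>' and returns how many were read.
    // Throws instead of looping forever when the stream ends first.
    static int skipToClosingBracket(InputStream in) throws IOException {
        int count = 0;
        int c;
        do {
            c = in.read();
            if (c == -1) {
                throw new IOException("Missing closing bracket for hex string, reached EOF");
            }
            count++;
        } while (c != '>');
        return count;
    }

    public static void main(String[] args) throws IOException {
        System.out.println(skipToClosingBracket(new ByteArrayInputStream("41 42>".getBytes("US-ASCII"))));
        try {
            skipToClosingBracket(new ByteArrayInputStream("4142".getBytes("US-ASCII")));
        } catch (IOException e) {
            System.out.println("truncated input detected: " + e.getMessage());
        }
    }
}
```

Whether a parser should throw here or recover leniently is a design choice; the point of the sketch is only that the loop condition has to account for -1.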
[jira] [Updated] (PDFBOX-1796) Infiniteloop BaseParser.java:1010
[ https://issues.apache.org/jira/browse/PDFBOX-1796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Lehmkühler updated PDFBOX-1796: --- Fix Version/s: 1.8.4 Infiniteloop BaseParser.java:1010 - Key: PDFBOX-1796 URL: https://issues.apache.org/jira/browse/PDFBOX-1796 Project: PDFBox Issue Type: Bug Components: Parsing Affects Versions: 1.8.3 Reporter: Manfred Schauer Fix For: 1.8.4, 2.0.0 Attachments: dls.pdf, rsag.pdf
[jira] [Updated] (PDFBOX-1585) org.apache.pdfbox.util.PDFTextStripper.getText() causes thread to block indefinitely
[ https://issues.apache.org/jira/browse/PDFBOX-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Lehmkühler updated PDFBOX-1585: --- Fix Version/s: 1.8.4 org.apache.pdfbox.util.PDFTextStripper.getText() causes thread to block indefinitely Key: PDFBOX-1585 URL: https://issues.apache.org/jira/browse/PDFBOX-1585 Project: PDFBox Issue Type: Bug Components: PDFReader, Text extraction Affects Versions: 1.8.1 Environment: Ubuntu Linux 10.04 Solaris 10 Java 1.6.0_34 Reporter: Sascha Szott Assignee: Andreas Lehmkühler Fix For: 1.8.4, 2.0.0 Attachments: PDFBOX-1585.patch The URL of the problematic PDF file is http://www.redalyc.org/pdf/540/54017220.pdf My program tries to extract the full text of the given PDF file in the following manner:
{code}
String fileName = "/home/sascha/testfile.pdf";       // 1
PDDocument pdDoc = PDDocument.load(fileName, true);  // 2
PDFTextStripper text = new PDFTextStripper();        // 3
String fullText = text.getText(pdDoc);               // 4
{code}
The call in line 4 causes the thread to block indefinitely (it has now been running for more than two days without making any progress). The file is stored in a local file system (no network interaction occurs).
jstack indicates that the thread is not deadlocked: {code} main prio=10 tid=0x4187d800 nid=0x6ed8 runnable [0x7f9e28e56000] java.lang.Thread.State: RUNNABLE at java.io.BufferedInputStream.fill(BufferedInputStream.java:218) at java.io.BufferedInputStream.read(BufferedInputStream.java:237) - locked 0x0007d73a84a0 (a java.io.BufferedInputStream) at java.io.FilterInputStream.read(FilterInputStream.java:66) at java.io.PushbackInputStream.read(PushbackInputStream.java:122) at org.apache.pdfbox.io.PushBackInputStream.read(PushBackInputStream.java:91) at org.apache.pdfbox.pdfparser.BaseParser.parseCOSHexString(BaseParser.java:1006) at org.apache.pdfbox.pdfparser.BaseParser.parseCOSString(BaseParser.java:808) at org.apache.pdfbox.pdfparser.PDFStreamParser.parseNextToken(PDFStreamParser.java:260) at org.apache.pdfbox.pdfparser.PDFStreamParser.access$000(PDFStreamParser.java:46) at org.apache.pdfbox.pdfparser.PDFStreamParser$1.tryNext(PDFStreamParser.java:182) at org.apache.pdfbox.pdfparser.PDFStreamParser$1.hasNext(PDFStreamParser.java:194) at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:255) at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:235) at org.apache.pdfbox.util.operator.Invoke.process(Invoke.java:67) at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:554) at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:268) at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:235) at org.apache.pdfbox.util.operator.Invoke.process(Invoke.java:67) at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:554) at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:268) at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:235) at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:215) at 
org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:455) at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:379) at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:335) at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:254) at de.kobv.ked.extraction.FulltextExtraction.getFulltext(FulltextExtraction.java:65) {code} Any idea or advice on how to fix that problem? Is it possible to set up a timeout for the extraction operation? -- This message was sent by Atlassian JIRA (v6.1.5#6160)
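Editor's note on the timeout question in PDFBOX-1585: nothing in this thread says that PDFTextStripper offers a timeout setting, so the sketch below uses only the standard java.util.concurrent classes to bound how long a caller waits for any blocking task. The class name and the way the extraction is wrapped in a Callable are assumptions made for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical wrapper, not part of PDFBox.
class TimedExtraction {

    // Runs the task on a daemon worker thread and waits at most the given time for its result.
    static String runWithTimeout(Callable<String> task, long timeout, TimeUnit unit) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "extraction-worker");
            t.setDaemon(true); // a stuck worker must not keep the JVM alive
            return t;
        });
        Future<String> future = executor.submit(task);
        try {
            return future.get(timeout, unit);
        } catch (TimeoutException e) {
            future.cancel(true); // only sets the interrupt flag on the worker
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runWithTimeout(() -> "fast result", 1, TimeUnit.SECONDS));
        try {
            runWithTimeout(() -> { Thread.sleep(5000); return "late"; }, 100, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            System.out.println("gave up waiting");
        }
    }
}
```

In the reporter's program the Callable would wrap the getText call. One caveat matters for this particular bug: cancel(true) only interrupts the worker, and a loop that never checks the interrupt flag (as the RUNNABLE stack trace above suggests) keeps consuming CPU in the background. The timeout protects the caller, not the machine, so fixing the parser loop is still the real remedy.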
[jira] [Commented] (PDFBOX-1585) org.apache.pdfbox.util.PDFTextStripper.getText() causes thread to block indefinitely
[ https://issues.apache.org/jira/browse/PDFBOX-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872432#comment-13872432 ] Andreas Lehmkühler commented on PDFBOX-1585: I added the fix to the 1.8 branch in revision 1558517. org.apache.pdfbox.util.PDFTextStripper.getText() causes thread to block indefinitely Key: PDFBOX-1585 URL: https://issues.apache.org/jira/browse/PDFBOX-1585 Project: PDFBox Issue Type: Bug Components: PDFReader, Text extraction Affects Versions: 1.8.1 Environment: Ubuntu Linux 10.04 Solaris 10 Java 1.6.0_34 Reporter: Sascha Szott Assignee: Andreas Lehmkühler Fix For: 1.8.4, 2.0.0 Attachments: PDFBOX-1585.patch
[jira] [Updated] (PDFBOX-1668) Loading a Russian PDF never finishes
[ https://issues.apache.org/jira/browse/PDFBOX-1668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Lehmkühler updated PDFBOX-1668: --- Fix Version/s: 1.8.4 Loading a Russian PDF never finishes - Key: PDFBOX-1668 URL: https://issues.apache.org/jira/browse/PDFBOX-1668 Project: PDFBox Issue Type: Bug Components: Parsing Affects Versions: 2.0.0 Reporter: Sergio Fernández Priority: Minor Fix For: 1.8.4, 2.0.0 Try to run this line: PDDocument.load(new URL("http://www.who.int/entity/foodsafety/publications/general/en/global_strategy_ru.pdf")); The loading never finishes and uses a lot of CPU. The document size (574K) should not be the problem. I guess something in that document triggers the issue in PDFBox, and I'd like to know whether this could be a more general problem. Thanks! -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Commented] (PDFBOX-1685) Verify interpretation of rdf:about for PDF/A
[ https://issues.apache.org/jira/browse/PDFBOX-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872440#comment-13872440 ] Andreas Lehmkühler commented on PDFBOX-1685: Added to the 1.8 branch in revision 1558521 Verify interpretation of rdf:about for PDF/A Key: PDFBOX-1685 URL: https://issues.apache.org/jira/browse/PDFBOX-1685 Project: PDFBox Issue Type: Task Components: Preflight Reporter: Maruan Sahyoun Assignee: Eric Leleu Priority: Minor Fix For: 1.8.4, 2.0.0 Attachments: test-bfo.pdf There was a discussion about handling rdf:about for PDF/A validation on the PDF Associations mailing list which I'm allowed to share: snip In this case we have a PDF with an XMP metadata stream containing two rdf:RDF entries, one with rdf:about set to a blank string, the other with it set to a UUID. The PDF/A specification (ISO-19005-1:2005(E) para 6.7.2) simply says that the stream must conform to the XMP specification 2004 revision which reads (p21): The rdf:about attribute on the rdf:Description element is a required attribute that identifies the resource whose metadata this XMP describes. The value of this attribute must follow URI syntax and may be either: ● an empty string (as in the example above), which means that the XMP is physically local to the resource being described. Applications must rely on knowledge of the file format to correctly associate the XMP with the resource. ● a unique instance ID that is generated every time a file is saved. The next section gives guidelines for creating instance IDs. The XMP packet must describe a single entity, and my reading of the above is a combination of empty-string and a unique UUID can meet this requirement - this is how both our software and Acrobat X and XI behave. 
However it's ambiguous, and this clause was revised in the 2012 revision (ISO 16684-1:2011(E) para 7.4) to this: If the XMP data model has an AboutURI (6.1, “XMP packets”), that same URI shall be the value of an rdf:about attribute in each top-level rdf:Description element. Otherwise, the rdf:about attributes for all top-level rdf:Description elements shall be present with an empty value. The rdf:about attribute shall not be used in more deeply nested rdf:Description elements. For compatibility with very early XMP usage, it is recommended that XMP readers tolerate a missing rdf:about attribute and treat it as present with an empty value. It is also recommended that XMP readers tolerate a mix of empty and non-empty rdf:about values, as long as all non-empty values are identical. Which means that an empty string and a unique UUID are technically incorrect, but it's recommended they be tolerated for compatibility purposes. /snip It might be good to check our interpretation as snip BFO and Acrobat X and XI think this is valid, PDFBox and pdf-tools.com online validator lean the other way and classify this document as invalid. /snip to see if we should change our interpretation. If there is new input on the pdfa.org mailing list I'll capture it here too. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
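The tolerance recommendation quoted from ISO 16684-1 (a mix of empty and non-empty rdf:about values is acceptable as long as all non-empty values are identical) can be sketched in plain Java. This is only an illustration under stated assumptions, not the code Preflight or XmpBox actually uses: the class name RdfAboutCheck and the method isTolerable are hypothetical, and a missing attribute is modelled as null and treated as empty, following the recommendation above.

```java
import java.util.List;

/** Hypothetical helper for illustration; not part of PDFBox, Preflight or XmpBox. */
final class RdfAboutCheck {
    private RdfAboutCheck() {
    }

    /**
     * Applies the lenient reading: empty (or missing) values are ignored,
     * and every non-empty rdf:about value must be identical.
     *
     * @param aboutValues the rdf:about value of each top-level rdf:Description;
     *                    null stands for a missing attribute
     * @return true if at most one distinct non-empty value is present
     */
    static boolean isTolerable(List<String> aboutValues) {
        String firstNonEmpty = null;
        for (String value : aboutValues) {
            if (value == null || value.isEmpty()) {
                continue; // missing is treated as present with an empty value
            }
            if (firstNonEmpty == null) {
                firstNonEmpty = value;
            } else if (!firstNonEmpty.equals(value)) {
                return false; // two different non-empty values
            }
        }
        return true;
    }
}
```

Under this reading, the document described in the issue (one empty string plus one UUID) would be tolerated, while two different UUIDs would not.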
[jira] [Updated] (PDFBOX-1685) Verify interpretation of rdf:about for PDF/A
[ https://issues.apache.org/jira/browse/PDFBOX-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Lehmkühler updated PDFBOX-1685: --- Fix Version/s: 1.8.4 Verify interpretation of rdf:about for PDF/A Key: PDFBOX-1685 URL: https://issues.apache.org/jira/browse/PDFBOX-1685 Project: PDFBox Issue Type: Task Components: Preflight Reporter: Maruan Sahyoun Assignee: Eric Leleu Priority: Minor Fix For: 1.8.4, 2.0.0 Attachments: test-bfo.pdf There was a discussion about handling rdf:about for PDF/A validation on the PDF Associations mailing list which I'm allowed to share: snip In this case we have a PDF with an XMP metadata stream containing two rdf:RDF entries, one with rdf:about set to a blank string, the other with it set to a UUID. The PDF/A specification (ISO-19005-1:2005(E) para 6.7.2) simply says that the stream must conform to the XMP specification 2004 revision which reads (p21): The rdf:about attribute on the rdf:Description element is a required attribute that identifies the resource whose metadata this XMP describes. The value of this attribute must follow URI syntax and may be either: ● an empty string (as in the example above), which means that the XMP is physically local to the resource being described. Applications must rely on knowledge of the file format to correctly associate the XMP with the resource. ● a unique instance ID that is generated every time a file is saved. The next section gives guidelines for creating instance IDs. The XMP packet must describe a single entity, and my reading of the above is a combination of empty-string and a unique UUID can meet this requirement - this is how both our software and Acrobat X and XI behave. However it's ambiguous, and this clause was revised in the 2012 revision (ISO 16684-1:2011(E) para 7.4) to this: If the XMP data model has an AboutURI (6.1, “XMP packets”), that same URI shall be the value of an rdf:about attribute in each top-level rdf:Description element. 
Otherwise, the rdf:about attributes for all top-level rdf:Description elements shall be present with an empty value. The rdf:about attribute shall not be used in more deeply nested rdf:Description elements. For compatibility with very early XMP usage, it is recommended that XMP readers tolerate a missing rdf:about attribute and treat it as present with an empty value. It is also recommended that XMP readers tolerate a mix of empty and non-empty rdf:about values, as long as all non-empty values are identical. Which means that an empty string and a unique UUID are technically incorrect, but it's recommended they be tolerated for compatibility purposes. /snip It might be good to check our interpretation as snip BFO and Acrobat X and XI think this is valid, PDFBox and pdf-tools.com online validator lean the other way and classify this document as invalid. /snip to see if we should change our interpretation. If there is new input on the pdfa.org mailing list I'll capture it here too. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (PDFBOX-1707) Add dispose() when done with graphics
[ https://issues.apache.org/jira/browse/PDFBOX-1707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Lehmkühler updated PDFBOX-1707: --- Fix Version/s: 1.8.4 Add dispose() when done with graphics - Key: PDFBOX-1707 URL: https://issues.apache.org/jira/browse/PDFBOX-1707 Project: PDFBox Issue Type: Improvement Affects Versions: 1.8.3, 2.0.0 Reporter: Tilman Hausherr Assignee: Andreas Lehmkühler Priority: Minor Fix For: 1.8.4, 2.0.0 Attachments: JBIG2Filter.patch, PDXObjectImage.patch Original Estimate: 5m Remaining Estimate: 5m Please add dispose() in pdfbox\filter\JBIG2Filter.java and pdfbox\pdmodel\graphics\xobject\PDXObjectImage.java as recommended by the javadoc. Patches are attached. I've also added @Override in some places. The problem is that omitting it produces an additional yellow bar at the right in NetBeans, which makes the more important bars harder to see. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
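For readers unfamiliar with the pattern the issue asks for, here is a minimal sketch of releasing a graphics context with dispose() once drawing is finished. It is not the content of the attached patches; the class DisposeSketch and the method filledImage are hypothetical names, and only standard java.awt calls are used.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.image.BufferedImage;

/** Hypothetical illustration; not taken from JBIG2Filter or PDXObjectImage. */
final class DisposeSketch {
    private DisposeSketch() {
    }

    /** Creates an image filled with one colour and releases the graphics context. */
    static BufferedImage filledImage(int width, int height, Color color) {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Graphics2D graphics = image.createGraphics();
        try {
            graphics.setColor(color);
            graphics.fillRect(0, 0, width, height);
        } finally {
            // Release the context even if drawing throws.
            graphics.dispose();
        }
        return image;
    }
}
```

The try/finally is a design choice of this sketch: it guarantees that dispose() runs on every path, which is the behaviour the javadoc recommendation is aiming at.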
[jira] [Commented] (PDFBOX-1725) Character rendered at wrong position
[ https://issues.apache.org/jira/browse/PDFBOX-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872458#comment-13872458 ] Andreas Lehmkühler commented on PDFBOX-1725: Added the changes to the 1.8 branch in revision 1558531 Character rendered at wrong position Key: PDFBOX-1725 URL: https://issues.apache.org/jira/browse/PDFBOX-1725 Project: PDFBox Issue Type: Bug Affects Versions: 2.0.0 Reporter: Tilman Hausherr Assignee: Andreas Lehmkühler Priority: Minor Labels: regression Fix For: 1.8.4, 2.0.0 Attachments: HoulihansVeggieMenu.pdf-1.png There is a regression produced by one of the (otherwise successful) changes of this weekend / Monday. When rendering the file of PDFBOX-1608, one character is at the wrong position. It is the star below the text "tuscan white bean salad", somewhat in the middle of the image, below the green text "soups and side salad" (the second vegan modification in that section). That star is too far to the left. This worked fine on Monday before the changes. To be sure that it isn't because of my own uncommitted changes, I checked out a clean 2.0 version, copied the PDF file into the pdfbox\src\test\resources\input\rendering directory and looked at the pdfbox\target\test-output directory. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (PDFBOX-1725) Character rendered at wrong position
[ https://issues.apache.org/jira/browse/PDFBOX-1725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Lehmkühler updated PDFBOX-1725: --- Fix Version/s: 1.8.4 Character rendered at wrong position Key: PDFBOX-1725 URL: https://issues.apache.org/jira/browse/PDFBOX-1725 Project: PDFBox Issue Type: Bug Affects Versions: 2.0.0 Reporter: Tilman Hausherr Assignee: Andreas Lehmkühler Priority: Minor Labels: regression Fix For: 1.8.4, 2.0.0 Attachments: HoulihansVeggieMenu.pdf-1.png There is a regression produced by one of the (otherwise successful) changes of this weekend / Monday. When rendering the file of PDFBOX-1608, one character is at the wrong position. It is the star below the text "tuscan white bean salad", somewhat in the middle of the image, below the green text "soups and side salad" (the second vegan modification in that section). That star is too far to the left. This worked fine on Monday before the changes. To be sure that it isn't because of my own uncommitted changes, I checked out a clean 2.0 version, copied the PDF file into the pdfbox\src\test\resources\input\rendering directory and looked at the pdfbox\target\test-output directory. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (PDFBOX-1763) Exception caused by Invalid ICC Profile Data
[ https://issues.apache.org/jira/browse/PDFBOX-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Lehmkühler updated PDFBOX-1763: --- Fix Version/s: 1.8.4 Exception caused by Invalid ICC Profile Data --- Key: PDFBOX-1763 URL: https://issues.apache.org/jira/browse/PDFBOX-1763 Project: PDFBox Issue Type: Bug Components: Preflight Affects Versions: 2.0.0 Environment: Win 7 Reporter: Johan van der Knijff Assignee: Eric Leleu Labels: icc Fix For: 1.8.4, 2.0.0 Sometimes Preflight raises the exception Invalid ICC Profile Data. Some example files that produce this problem are: http://acroeng.adobe.com/Test_Files/fonts//printtestfont_nonopt.pdf http://acroeng.adobe.com/Test_Files/fonts//printtestfont_opt.pdf http://www.math.uakron.edu/~dpstory/tutorial/pdfmarks/links.pdf I also checked these files with Acrobat's Preflight function, which reports ICC profiles that are either not valid or that follow the ICC profile 4.0 version or newer (which are only allowed in PDF 1.5 onward). It would be nice if Preflight would report these errors without raising an exception. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
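The request to report the problem without raising an exception could look roughly like the sketch below. It is not Preflight's implementation: the class IccProfileCheck, the method validate and the error texts are hypothetical, and the version rule (flagging ICC major version 4 or newer) is an assumption based only on the Acrobat observation in this report. Only the standard java.awt.color.ICC_Profile API is used.

```java
import java.awt.color.CMMException;
import java.awt.color.ICC_Profile;
import java.util.Optional;

/** Hypothetical illustration; not the Preflight validation code. */
final class IccProfileCheck {
    private IccProfileCheck() {
    }

    /**
     * Tries to parse the profile bytes and returns a description of the
     * problem instead of letting the exception propagate.
     *
     * @return an error message, or an empty Optional if nothing was flagged
     */
    static Optional<String> validate(byte[] profileBytes) {
        final ICC_Profile profile;
        try {
            profile = ICC_Profile.getInstance(profileBytes);
        } catch (IllegalArgumentException | CMMException e) {
            // The JDK rejects malformed data here; turn that into a finding.
            return Optional.of("ICC profile could not be parsed: " + e.getMessage());
        }
        // Assumption: profiles of version 4.0 or newer are reported as not allowed.
        if (profile.getMajorVersion() >= 4) {
            return Optional.of("ICC profile version " + profile.getMajorVersion()
                    + "." + profile.getMinorVersion() + " is newer than expected");
        }
        return Optional.empty();
    }
}
```

A caller could then add the returned message to its list of validation errors and continue with the rest of the document, rather than aborting the whole run.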
[jira] [Commented] (PDFBOX-1763) Exception caused by Invalid ICC Profile Data
[ https://issues.apache.org/jira/browse/PDFBOX-1763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872470#comment-13872470 ] Andreas Lehmkühler commented on PDFBOX-1763: Added the changes to the 1.8 branch in revision 1558537 Exception caused by Invalid ICC Profile Data --- Key: PDFBOX-1763 URL: https://issues.apache.org/jira/browse/PDFBOX-1763 Project: PDFBox Issue Type: Bug Components: Preflight Affects Versions: 2.0.0 Environment: Win 7 Reporter: Johan van der Knijff Assignee: Eric Leleu Labels: icc Fix For: 1.8.4, 2.0.0 Sometimes Preflight raises the exception Invalid ICC Profile Data. Some example files that produce this problem are: http://acroeng.adobe.com/Test_Files/fonts//printtestfont_nonopt.pdf http://acroeng.adobe.com/Test_Files/fonts//printtestfont_opt.pdf http://www.math.uakron.edu/~dpstory/tutorial/pdfmarks/links.pdf I also checked these files with Acrobat's Preflight function, which reports ICC profiles that are either not valid or that follow the ICC profile 4.0 version or newer (which are only allowed in PDF 1.5 onward). It would be nice if Preflight would report these errors without raising an exception. -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Assigned] (PDFBOX-1829) PDF Extract Image Pixelmap Issue
[ https://issues.apache.org/jira/browse/PDFBOX-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Lehmkühler reassigned PDFBOX-1829: -- Assignee: Andreas Lehmkühler PDF Extract Image Pixelmap Issue Key: PDFBOX-1829 URL: https://issues.apache.org/jira/browse/PDFBOX-1829 Project: PDFBox Issue Type: Bug Affects Versions: 1.6.0, 1.8.3 Reporter: Jonas Mende Assignee: Andreas Lehmkühler Fix For: 1.8.4, 2.0.0 Attachments: ausgabe109.pdf Hello everyone, In our current project we are using pdfbox version 1.6.0 as part of an integrated media management solution. When extracting the first page of PDFs, we encounter a certain error for some of the files. The error log looks as follows: 2013-12-20 10:09:14,471 WARN org.apache.pdfbox.util.operator.pagedrawer.SHFill : java.io.IOException: Not Implemented java.io.IOException: Not Implemented at org.apache.pdfbox.pdfviewer.PageDrawer.SHFill_Radial(PageDrawer.java:493) at org.apache.pdfbox.pdfviewer.PageDrawer.SHFill(PageDrawer.java:415) at org.apache.pdfbox.util.operator.pagedrawer.SHFill.process(SHFill.java:58) at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:551) at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:274) at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:251) at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:225) at org.apache.pdfbox.pdfviewer.PageDrawer.drawPage(PageDrawer.java:107) at org.apache.pdfbox.pdmodel.PDPage.convertToImage(PDPage.java:722) at net.sourceforge.openutils.mgnlmedia.media.types.impl.DocumentTypeHandler.createPdfPreview(DocumentTypeHandler.java:141) at net.sourceforge.openutils.mgnlmedia.media.types.impl.DocumentTypeHandler.onPostSave(DocumentTypeHandler.java:96) at net.sourceforge.openutils.mgnlmedia.media.dialog.LayerDialogMVC.onPostSave(LayerDialogMVC.java:152) at info.magnolia.module.admininterface.DialogMVCHandler.save(DialogMVCHandler.java:236) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at info.magnolia.cms.servlets.MVCServletHandlerImpl.execute(MVCServletHandlerImpl.java:121) at info.magnolia.cms.servlets.MVCServlet.doPost(MVCServlet.java:125) at javax.servlet.http.HttpServlet.service(HttpServlet.java:637) at javax.servlet.http.HttpServlet.service(HttpServlet.java:717) at info.magnolia.cms.filters.ServletDispatchingFilter.doFilter(ServletDispatchingFilter.java:123) at info.magnolia.cms.filters.AbstractMgnlFilter.doFilter(AbstractMgnlFilter.java:91) at info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:83) at info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:85) at info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:85) at info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:85) at info.magnolia.cms.filters.CompositeFilter.doFilter(CompositeFilter.java:67) at info.magnolia.cms.filters.AbstractMgnlFilter.doFilter(AbstractMgnlFilter.java:91) at info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:83) at info.magnolia.cms.filters.VirtualUriFilter.doFilter(VirtualUriFilter.java:70) at info.magnolia.cms.filters.AbstractMgnlFilter.doFilter(AbstractMgnlFilter.java:91) at info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:83) at info.magnolia.module.cache.executor.Bypass.processCacheRequest(Bypass.java:58) at info.magnolia.module.cache.executor.CompositeExecutor.processCacheRequest(CompositeExecutor.java:66) at info.magnolia.module.cache.filter.CacheFilter.doFilter(CacheFilter.java:153) at info.magnolia.cms.filters.OncePerRequestAbstractMgnlFilter.doFilter(OncePerRequestAbstractMgnlFilter.java:61) at 
info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:83) at info.magnolia.cms.i18n.I18nContentSupportFilter.doFilter(I18nContentSupportFilter.java:76) at info.magnolia.cms.filters.AbstractMgnlFilter.doFilter(AbstractMgnlFilter.java:91) at info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:83) at info.magnolia.cms.filters.RangeSupportFilter.doFilter(RangeSupportFilter.java:84) at info.magnolia.cms.filters.AbstractMgnlFilter.doFilter(AbstractMgnlFilter.java:91) at info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:83) at
[jira] [Updated] (PDFBOX-1829) PDF Extract Image Pixelmap Issue
[ https://issues.apache.org/jira/browse/PDFBOX-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Lehmkühler updated PDFBOX-1829: --- Fix Version/s: 2.0.0 1.8.4 PDF Extract Image Pixelmap Issue Key: PDFBOX-1829 URL: https://issues.apache.org/jira/browse/PDFBOX-1829 Project: PDFBox Issue Type: Bug Affects Versions: 1.6.0, 1.8.3 Reporter: Jonas Mende Assignee: Andreas Lehmkühler Fix For: 1.8.4, 2.0.0 Attachments: ausgabe109.pdf Hello everyone, In our current project we are using pdfbox version 1.6.0 as part of an integrated media management solution. When extracting the first page of PDFs, we encounter a certain error for some of the files. The error log looks as follows: 2013-12-20 10:09:14,471 WARN org.apache.pdfbox.util.operator.pagedrawer.SHFill : java.io.IOException: Not Implemented java.io.IOException: Not Implemented at org.apache.pdfbox.pdfviewer.PageDrawer.SHFill_Radial(PageDrawer.java:493) at org.apache.pdfbox.pdfviewer.PageDrawer.SHFill(PageDrawer.java:415) at org.apache.pdfbox.util.operator.pagedrawer.SHFill.process(SHFill.java:58) at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:551) at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:274) at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:251) at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:225) at org.apache.pdfbox.pdfviewer.PageDrawer.drawPage(PageDrawer.java:107) at org.apache.pdfbox.pdmodel.PDPage.convertToImage(PDPage.java:722) at net.sourceforge.openutils.mgnlmedia.media.types.impl.DocumentTypeHandler.createPdfPreview(DocumentTypeHandler.java:141) at net.sourceforge.openutils.mgnlmedia.media.types.impl.DocumentTypeHandler.onPostSave(DocumentTypeHandler.java:96) at net.sourceforge.openutils.mgnlmedia.media.dialog.LayerDialogMVC.onPostSave(LayerDialogMVC.java:152) at info.magnolia.module.admininterface.DialogMVCHandler.save(DialogMVCHandler.java:236) at 
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at info.magnolia.cms.servlets.MVCServletHandlerImpl.execute(MVCServletHandlerImpl.java:121) at info.magnolia.cms.servlets.MVCServlet.doPost(MVCServlet.java:125) at javax.servlet.http.HttpServlet.service(HttpServlet.java:637) at javax.servlet.http.HttpServlet.service(HttpServlet.java:717) at info.magnolia.cms.filters.ServletDispatchingFilter.doFilter(ServletDispatchingFilter.java:123) at info.magnolia.cms.filters.AbstractMgnlFilter.doFilter(AbstractMgnlFilter.java:91) at info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:83) at info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:85) at info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:85) at info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:85) at info.magnolia.cms.filters.CompositeFilter.doFilter(CompositeFilter.java:67) at info.magnolia.cms.filters.AbstractMgnlFilter.doFilter(AbstractMgnlFilter.java:91) at info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:83) at info.magnolia.cms.filters.VirtualUriFilter.doFilter(VirtualUriFilter.java:70) at info.magnolia.cms.filters.AbstractMgnlFilter.doFilter(AbstractMgnlFilter.java:91) at info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:83) at info.magnolia.module.cache.executor.Bypass.processCacheRequest(Bypass.java:58) at info.magnolia.module.cache.executor.CompositeExecutor.processCacheRequest(CompositeExecutor.java:66) at info.magnolia.module.cache.filter.CacheFilter.doFilter(CacheFilter.java:153) at info.magnolia.cms.filters.OncePerRequestAbstractMgnlFilter.doFilter(OncePerRequestAbstractMgnlFilter.java:61) at 
info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:83) at info.magnolia.cms.i18n.I18nContentSupportFilter.doFilter(I18nContentSupportFilter.java:76) at info.magnolia.cms.filters.AbstractMgnlFilter.doFilter(AbstractMgnlFilter.java:91) at info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:83) at info.magnolia.cms.filters.RangeSupportFilter.doFilter(RangeSupportFilter.java:84) at info.magnolia.cms.filters.AbstractMgnlFilter.doFilter(AbstractMgnlFilter.java:91) at info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:83) at
[jira] [Updated] (PDFBOX-1844) [PATCH] Parser for Type 1 Fonts
[ https://issues.apache.org/jira/browse/PDFBOX-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Hewson updated PDFBOX-1844: Attachment: (was: type1-v2.patch) [PATCH] Parser for Type 1 Fonts --- Key: PDFBOX-1844 URL: https://issues.apache.org/jira/browse/PDFBOX-1844 Project: PDFBox Issue Type: Improvement Components: FontBox Affects Versions: 2.0.0 Reporter: John Hewson Assignee: Andreas Lehmkühler Labels: patch, rendering Attachments: CustomEncoding.java, Token.java, Type1CharStringReader.java, Type1Font.java, Type1Glyph2D.java, Type1Mapping.java, latexdemo.pdf, redp4581.pdf, test.pdf This patch adds a parser for Type 1 fonts to FontBox and makes use of it in PDFBox for rendering Type 1 glyphs. This should fix various issues with the JVM crashing and rendering fonts incorrectly. It was necessary to modify Type1CharStringParser to handle the `callothersubr` command and correctly handle subroutines. Likewise, Type1CharString was modified to support flex. This patch does not remove the AWT fallback for non-embedded and standard 14 fonts because an entirely new fallback system is needed and suitable fonts will need to be shipped as part of PDFBox. This needs to be discussed on the mailing list and/or in follow-on issue. Note: To keep this patch small I have not replaced any of the existing ad-hoc Type 1 parsing code in PDType1Font and preflight. Those classes retain their original code which can be replaced in subsequent patches/refactoring. I can open follow-on issues for these. 
~~~ As well as the patch, these files were added:
+ /pdfbox/src/main/java/org/apache/pdfbox/pdfviewer/font/Type1Glyph2D.java
+ /fontbox/src/main/java/org/apache/fontbox/encoding/CustomEncoding.java
+ /fontbox/src/main/java/org/apache/fontbox/type1/Token.java
+ /fontbox/src/main/java/org/apache/fontbox/type1/Type1CharStringReader.java
+ /fontbox/src/main/java/org/apache/fontbox/type1/Type1Font.java
+ /fontbox/src/main/java/org/apache/fontbox/type1/Type1Lexer.java
+ /fontbox/src/main/java/org/apache/fontbox/type1/Type1Mapping.java
+ /fontbox/src/main/java/org/apache/fontbox/type1/Type1Parser.java
And this file was removed:
- /pdfbox/src/main/java/org/apache/pdfbox/pdfviewer/font/CFFGlyph2D.java
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (PDFBOX-1844) [PATCH] Parser for Type 1 Fonts
[ https://issues.apache.org/jira/browse/PDFBOX-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Hewson updated PDFBOX-1844: Attachment: (was: Type1Parser.java) [PATCH] Parser for Type 1 Fonts --- Key: PDFBOX-1844 URL: https://issues.apache.org/jira/browse/PDFBOX-1844 Project: PDFBox Issue Type: Improvement Components: FontBox Affects Versions: 2.0.0 Reporter: John Hewson Assignee: Andreas Lehmkühler Labels: patch, rendering Attachments: CustomEncoding.java, Token.java, Type1CharStringReader.java, Type1Font.java, Type1Glyph2D.java, Type1Mapping.java, latexdemo.pdf, redp4581.pdf, test.pdf This patch adds a parser for Type 1 fonts to FontBox and makes use of it in PDFBox for rendering Type 1 glyphs. This should fix various issues with the JVM crashing and rendering fonts incorrectly. It was necessary to modify Type1CharStringParser to handle the `callothersubr` command and correctly handle subroutines. Likewise, Type1CharString was modified to support flex. This patch does not remove the AWT fallback for non-embedded and standard 14 fonts because an entirely new fallback system is needed and suitable fonts will need to be shipped as part of PDFBox. This needs to be discussed on the mailing list and/or in follow-on issue. Note: To keep this patch small I have not replaced any of the existing ad-hoc Type 1 parsing code in PDType1Font and preflight. Those classes retain their original code which can be replaced in subsequent patches/refactoring. I can open follow-on issues for these. 
~~~ As well as the patch, these files were added:
+ /pdfbox/src/main/java/org/apache/pdfbox/pdfviewer/font/Type1Glyph2D.java
+ /fontbox/src/main/java/org/apache/fontbox/encoding/CustomEncoding.java
+ /fontbox/src/main/java/org/apache/fontbox/type1/Token.java
+ /fontbox/src/main/java/org/apache/fontbox/type1/Type1CharStringReader.java
+ /fontbox/src/main/java/org/apache/fontbox/type1/Type1Font.java
+ /fontbox/src/main/java/org/apache/fontbox/type1/Type1Lexer.java
+ /fontbox/src/main/java/org/apache/fontbox/type1/Type1Mapping.java
+ /fontbox/src/main/java/org/apache/fontbox/type1/Type1Parser.java
And this file was removed:
- /pdfbox/src/main/java/org/apache/pdfbox/pdfviewer/font/CFFGlyph2D.java
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (PDFBOX-1844) [PATCH] Parser for Type 1 Fonts
[ https://issues.apache.org/jira/browse/PDFBOX-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Hewson updated PDFBOX-1844: Attachment: (was: Type1Lexer.java) [PATCH] Parser for Type 1 Fonts --- Key: PDFBOX-1844 URL: https://issues.apache.org/jira/browse/PDFBOX-1844 Project: PDFBox Issue Type: Improvement Components: FontBox Affects Versions: 2.0.0 Reporter: John Hewson Assignee: Andreas Lehmkühler Labels: patch, rendering Attachments: CustomEncoding.java, Token.java, Type1CharStringReader.java, Type1Font.java, Type1Glyph2D.java, Type1Mapping.java, latexdemo.pdf, redp4581.pdf, test.pdf This patch adds a parser for Type 1 fonts to FontBox and makes use of it in PDFBox for rendering Type 1 glyphs. This should fix various issues with the JVM crashing and rendering fonts incorrectly. It was necessary to modify Type1CharStringParser to handle the `callothersubr` command and correctly handle subroutines. Likewise, Type1CharString was modified to support flex. This patch does not remove the AWT fallback for non-embedded and standard 14 fonts because an entirely new fallback system is needed and suitable fonts will need to be shipped as part of PDFBox. This needs to be discussed on the mailing list and/or in follow-on issue. Note: To keep this patch small I have not replaced any of the existing ad-hoc Type 1 parsing code in PDType1Font and preflight. Those classes retain their original code which can be replaced in subsequent patches/refactoring. I can open follow-on issues for these. 
~~~ As well as the patch, these files were added:
+ /pdfbox/src/main/java/org/apache/pdfbox/pdfviewer/font/Type1Glyph2D.java
+ /fontbox/src/main/java/org/apache/fontbox/encoding/CustomEncoding.java
+ /fontbox/src/main/java/org/apache/fontbox/type1/Token.java
+ /fontbox/src/main/java/org/apache/fontbox/type1/Type1CharStringReader.java
+ /fontbox/src/main/java/org/apache/fontbox/type1/Type1Font.java
+ /fontbox/src/main/java/org/apache/fontbox/type1/Type1Lexer.java
+ /fontbox/src/main/java/org/apache/fontbox/type1/Type1Mapping.java
+ /fontbox/src/main/java/org/apache/fontbox/type1/Type1Parser.java
And this file was removed:
- /pdfbox/src/main/java/org/apache/pdfbox/pdfviewer/font/CFFGlyph2D.java
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Updated] (PDFBOX-1844) [PATCH] Parser for Type 1 Fonts
[ https://issues.apache.org/jira/browse/PDFBOX-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] John Hewson updated PDFBOX-1844: Attachment: Type1Parser.java Type1Lexer.java type1-v3.patch I've fixed the issues with the PDFBOX-1298 file. There was a bug in the lexer as well as two malformed Type 1 fonts in the PDF file. I've updated the patch to v3, and modified the Type1Parser and Type1Lexer files. [PATCH] Parser for Type 1 Fonts --- Key: PDFBOX-1844 URL: https://issues.apache.org/jira/browse/PDFBOX-1844 Project: PDFBox Issue Type: Improvement Components: FontBox Affects Versions: 2.0.0 Reporter: John Hewson Assignee: Andreas Lehmkühler Labels: patch, rendering Attachments: CustomEncoding.java, Token.java, Type1CharStringReader.java, Type1Font.java, Type1Glyph2D.java, Type1Lexer.java, Type1Mapping.java, Type1Parser.java, latexdemo.pdf, redp4581.pdf, test.pdf, type1-v3.patch This patch adds a parser for Type 1 fonts to FontBox and makes use of it in PDFBox for rendering Type 1 glyphs. This should fix various issues with the JVM crashing and rendering fonts incorrectly. It was necessary to modify Type1CharStringParser to handle the `callothersubr` command and correctly handle subroutines. Likewise, Type1CharString was modified to support flex. This patch does not remove the AWT fallback for non-embedded and standard 14 fonts because an entirely new fallback system is needed and suitable fonts will need to be shipped as part of PDFBox. This needs to be discussed on the mailing list and/or in follow-on issue. Note: To keep this patch small I have not replaced any of the existing ad-hoc Type 1 parsing code in PDType1Font and preflight. Those classes retain their original code which can be replaced in subsequent patches/refactoring. I can open follow-on issues for these. 
~~~ As well as the patch, these files were added:
+ /pdfbox/src/main/java/org/apache/pdfbox/pdfviewer/font/Type1Glyph2D.java
+ /fontbox/src/main/java/org/apache/fontbox/encoding/CustomEncoding.java
+ /fontbox/src/main/java/org/apache/fontbox/type1/Token.java
+ /fontbox/src/main/java/org/apache/fontbox/type1/Type1CharStringReader.java
+ /fontbox/src/main/java/org/apache/fontbox/type1/Type1Font.java
+ /fontbox/src/main/java/org/apache/fontbox/type1/Type1Lexer.java
+ /fontbox/src/main/java/org/apache/fontbox/type1/Type1Mapping.java
+ /fontbox/src/main/java/org/apache/fontbox/type1/Type1Parser.java
And this file was removed:
- /pdfbox/src/main/java/org/apache/pdfbox/pdfviewer/font/CFFGlyph2D.java
-- This message was sent by Atlassian JIRA (v6.1.5#6160)
Jenkins build became unstable: PDFBox 1.8.x » Apache XmpBox #46
See https://builds.apache.org/job/PDFBox%201.8.x/org.apache.pdfbox$xmpbox/46/changes
Jenkins build became unstable: PDFBox 1.8.x #46
See https://builds.apache.org/job/PDFBox%201.8.x/46/changes
[jira] [Resolved] (PDFBOX-1625) java.lang.IndexOutOfBoundsException at writing PDF file
[ https://issues.apache.org/jira/browse/PDFBOX-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guillaume Bailleul resolved PDFBOX-1625. Resolution: Fixed Fix Version/s: 2.0.0 The patch has been applied on r1558570. java.lang.IndexOutOfBoundsException at writing PDF file --- Key: PDFBOX-1625 URL: https://issues.apache.org/jira/browse/PDFBOX-1625 Project: PDFBox Issue Type: Bug Components: Writing Affects Versions: 1.8.2 Environment: Linux, Java 7u21 Reporter: Jens Kapitza Assignee: Guillaume Bailleul Priority: Minor Fix For: 2.0.0 Attachments: clone.patch, pdftool.zip I got this error: i will just recreate a document with pages 1-6. Exception in thread main java.io.IOException: org.apache.pdfbox.exceptions.COSVisitorException: java.lang.IndexOutOfBoundsException: Index: 115, Size: 0 at de.back2heaven.pdf.model.TargetDocumuent.save(TargetDocumuent.java:56) at de.back2heaven.pdf.model.Document.prozess(Document.java:76) at de.back2heaven.pdf.model.Document.main(Document.java:56) Caused by: org.apache.pdfbox.exceptions.COSVisitorException: java.lang.IndexOutOfBoundsException: Index: 115, Size: 0 at org.apache.pdfbox.pdfwriter.COSWriter.visitFromStream(COSWriter.java:1354) at org.apache.pdfbox.cos.COSStream.accept(COSStream.java:217) at org.apache.pdfbox.cos.COSObject.accept(COSObject.java:206) at org.apache.pdfbox.pdfwriter.COSWriter.doWriteObject(COSWriter.java:525) at org.apache.pdfbox.pdfwriter.COSWriter.doWriteBody(COSWriter.java:435) at org.apache.pdfbox.pdfwriter.COSWriter.visitFromDocument(COSWriter.java:1122) at org.apache.pdfbox.cos.COSDocument.accept(COSDocument.java:552) at org.apache.pdfbox.pdfwriter.COSWriter.write(COSWriter.java:1501) at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1324) at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1305) at org.apache.pdfbox.pdmodel.PDDocument.save(PDDocument.java:1292) at de.back2heaven.pdf.model.TargetDocumuent.save(TargetDocumuent.java:54) ... 
2 more Caused by: java.lang.IndexOutOfBoundsException: Index: 115, Size: 0 at java.util.ArrayList.rangeCheck(ArrayList.java:604) at java.util.ArrayList.get(ArrayList.java:382) at org.apache.pdfbox.io.RandomAccessBuffer.seek(RandomAccessBuffer.java:84) at org.apache.pdfbox.io.RandomAccessFileInputStream.read(RandomAccessFileInputStream.java:96) at java.io.BufferedInputStream.fill(BufferedInputStream.java:235) at java.io.BufferedInputStream.read1(BufferedInputStream.java:275) at java.io.BufferedInputStream.read(BufferedInputStream.java:334) at org.apache.pdfbox.pdfwriter.COSWriter.visitFromStream(COSWriter.java:1337) ... 13 more -- This message was sent by Atlassian JIRA (v6.1.5#6160)
[jira] [Created] (PDFBOX-1849) Isartor test 6-3-5-t01-fail-a does not return the expected error code
Guillaume Bailleul created PDFBOX-1849: Summary: Isartor test 6-3-5-t01-fail-a does not return the expected error code Key: PDFBOX-1849 URL: https://issues.apache.org/jira/browse/PDFBOX-1849 Project: PDFBox Issue Type: Bug Components: FontBox, Preflight Reporter: Guillaume Bailleul Assignee: Guillaume Bailleul Fix For: 2.0.0 A change to font handling made in December 2013 or January 2014 causes an Isartor test to no longer detect the expected error. This test (6-3-5-t01-fail-a) should detect a missing glyph but now detects metrics issues instead.
[jira] [Comment Edited] (PDFBOX-1685) Verify interpretation of rdf:about for PDF/A
[ https://issues.apache.org/jira/browse/PDFBOX-1685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872440#comment-13872440 ] Andreas Lehmkühler edited comment on PDFBOX-1685 at 1/15/14 9:59 PM: - Added to the 1.8 branch in revisions 1558521 and 1558581 was (Author: lehmi): Added to the 1.8 branch in revision 1558521 Verify interpretation of rdf:about for PDF/A Key: PDFBOX-1685 URL: https://issues.apache.org/jira/browse/PDFBOX-1685 Project: PDFBox Issue Type: Task Components: Preflight Reporter: Maruan Sahyoun Assignee: Eric Leleu Priority: Minor Fix For: 1.8.4, 2.0.0 Attachments: test-bfo.pdf There was a discussion about handling rdf:about for PDF/A validation on the PDF Associations mailing list which I'm allowed to share: snip In this case we have a PDF with an XMP metadata stream containing two rdf:RDF entries, one with rdf:about set to a blank string, the other with it set to a UUID. The PDF/A specification (ISO-19005-1:2005(E) para 6.7.2) simply says that the stream must conform to the XMP specification 2004 revision which reads (p21): The rdf:about attribute on the rdf:Description element is a required attribute that identifies the resource whose metadata this XMP describes. The value of this attribute must follow URI syntax and may be either: ● an empty string (as in the example above), which means that the XMP is physically local to the resource being described. Applications must rely on knowledge of the file format to correctly associate the XMP with the resource. ● a unique instance ID that is generated every time a file is saved. The next section gives guidelines for creating instance IDs. The XMP packet must describe a single entity, and my reading of the above is a combination of empty-string and a unique UUID can meet this requirement - this is how both our software and Acrobat X and XI behave. 
However it's ambiguous, and this clause was revised in the 2012 revision (ISO 16684-1:2011(E) para 7.4) to this: If the XMP data model has an AboutURI (6.1, “XMP packets”), that same URI shall be the value of an rdf:about attribute in each top-level rdf:Description element. Otherwise, the rdf:about attributes for all top-level rdf:Description elements shall be present with an empty value. The rdf:about attribute shall not be used in more deeply nested rdf:Description elements. For compatibility with very early XMP usage, it is recommended that XMP readers tolerate a missing rdf:about attribute and treat it as present with an empty value. It is also recommended that XMP readers tolerate a mix of empty and non-empty rdf:about values, as long as all non-empty values are identical. Which means that a mix of an empty string and a unique UUID is technically incorrect, but it's recommended that it be tolerated for compatibility purposes. /snip It might be good to check our interpretation as snip BFO and Acrobat X and XI think this is valid, PDFBox and pdf-tools.com online validator lean the other way and classify this document as invalid. /snip to see if we should change our interpretation. If there is new input on the pdfa.org mailing list I'll capture it here too.
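The two readings discussed above can be sketched as a small check over the rdf:about values of the top-level rdf:Description elements. This is only an illustration of the quoted ISO 16684-1 wording, not the Preflight or XmpBox implementation: the class name `RdfAboutCheck` and both method names are hypothetical, and a missing attribute is modelled as `null`.

```java
import java.util.Arrays;
import java.util.List;

public class RdfAboutCheck {

    /**
     * Strict reading: every top-level rdf:about must be present and all
     * values must be identical (all empty, or all the same URI).
     */
    public static boolean isStrictlyValid(List<String> aboutValues) {
        String expected = null;
        for (String value : aboutValues) {
            if (value == null) {
                return false; // attribute missing
            }
            if (expected == null) {
                expected = value;
            } else if (!expected.equals(value)) {
                return false; // two different values
            }
        }
        return true;
    }

    /**
     * Tolerant reading recommended for readers: missing counts as empty,
     * and empty and non-empty values may be mixed as long as all
     * non-empty values are identical.
     */
    public static boolean isTolerated(List<String> aboutValues) {
        String nonEmpty = null;
        for (String value : aboutValues) {
            if (value == null || value.isEmpty()) {
                continue; // missing or empty is always acceptable here
            }
            if (nonEmpty == null) {
                nonEmpty = value;
            } else if (!nonEmpty.equals(value)) {
                return false; // two different non-empty values
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The case from the discussion: one empty value, one UUID (placeholder value).
        List<String> mixed = Arrays.asList("", "uuid:1234");
        System.out.println("strict:   " + isStrictlyValid(mixed));
        System.out.println("tolerant: " + isTolerated(mixed));
    }
}
```

Under these assumptions the document described in the issue fails the strict check and passes the tolerant one, which matches the split between the validators mentioned above.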
[jira] [Updated] (PDFBOX-1625) java.lang.IndexOutOfBoundsException at writing PDF file
[ https://issues.apache.org/jira/browse/PDFBOX-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Lehmkühler updated PDFBOX-1625: Fix Version/s: 1.8.4 java.lang.IndexOutOfBoundsException at writing PDF file Key: PDFBOX-1625 URL: https://issues.apache.org/jira/browse/PDFBOX-1625 Project: PDFBox Issue Type: Bug Components: Writing Affects Versions: 1.8.2 Environment: Linux, Java 7u21 Reporter: Jens Kapitza Assignee: Guillaume Bailleul Priority: Minor Fix For: 1.8.4, 2.0.0 Attachments: clone.patch, pdftool.zip I got this error: i will just recreate a document with pages 1-6. Exception in thread main java.io.IOException: org.apache.pdfbox.exceptions.COSVisitorException: java.lang.IndexOutOfBoundsException: Index: 115, Size: 0
[jira] [Commented] (PDFBOX-1625) java.lang.IndexOutOfBoundsException at writing PDF file
[ https://issues.apache.org/jira/browse/PDFBOX-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872658#comment-13872658 ] Andreas Lehmkühler commented on PDFBOX-1625: Added the changes to the 1.8 branch in revision 1558585 java.lang.IndexOutOfBoundsException at writing PDF file Key: PDFBOX-1625 URL: https://issues.apache.org/jira/browse/PDFBOX-1625 Project: PDFBox Issue Type: Bug Components: Writing Affects Versions: 1.8.2 Environment: Linux, Java 7u21 Reporter: Jens Kapitza Assignee: Guillaume Bailleul Priority: Minor Fix For: 1.8.4, 2.0.0 Attachments: clone.patch, pdftool.zip I got this error: i will just recreate a document with pages 1-6. Exception in thread main java.io.IOException: org.apache.pdfbox.exceptions.COSVisitorException: java.lang.IndexOutOfBoundsException: Index: 115, Size: 0
Jenkins build is back to stable : PDFBox 1.8.x » Apache XmpBox #47
See https://builds.apache.org/job/PDFBox%201.8.x/org.apache.pdfbox$xmpbox/47/changes
Jenkins build is back to stable : PDFBox 1.8.x #47
See https://builds.apache.org/job/PDFBox%201.8.x/47/changes
[jira] [Commented] (PDFBOX-1849) Isartor test 6-3-5-t01-fail-a does not return the expected error code
[ https://issues.apache.org/jira/browse/PDFBOX-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13872678#comment-13872678 ] Guillaume Bailleul commented on PDFBOX-1849: The issue appeared with r129 Isartor test 6-3-5-t01-fail-a does not return the expected error code Key: PDFBOX-1849 URL: https://issues.apache.org/jira/browse/PDFBOX-1849 Project: PDFBox Issue Type: Bug Components: FontBox, Preflight Reporter: Guillaume Bailleul Assignee: Guillaume Bailleul Fix For: 2.0.0 A change to font handling made in December 2013 or January 2014 causes an Isartor test to no longer detect the expected error. This test (6-3-5-t01-fail-a) should detect a missing glyph but now detects metrics issues instead.
[jira] [Commented] (PDFBOX-1849) Isartor test 6-3-5-t01-fail-a does not return the expected error code
[ https://issues.apache.org/jira/browse/PDFBOX-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13873002#comment-13873002 ] John Hewson commented on PDFBOX-1849: Yep, this was indeed caused by my patch PDFBOX-1831. The problem is in the getWidth() method on line 66 of Type2CharString, where "return nominalWidthX + width;" should be "return width;". We should wait until PDFBOX-1844 is closed before fixing this to avoid breaking my 1000+ line patch. Isartor test 6-3-5-t01-fail-a does not return the expected error code Key: PDFBOX-1849 URL: https://issues.apache.org/jira/browse/PDFBOX-1849 Project: PDFBox Issue Type: Bug Components: FontBox, Preflight Reporter: Guillaume Bailleul Assignee: Guillaume Bailleul Fix For: 2.0.0 A change to font handling made in December 2013 or January 2014 causes an Isartor test to no longer detect the expected error. This test (6-3-5-t01-fail-a) should detect a missing glyph but now detects metrics issues instead.
[jira] [Comment Edited] (PDFBOX-1849) Isartor test 6-3-5-t01-fail-a does not return the expected error code
[ https://issues.apache.org/jira/browse/PDFBOX-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13873002#comment-13873002 ] John Hewson edited comment on PDFBOX-1849 at 1/16/14 3:12 AM: Yep, this was indeed caused by my patch PDFBOX-1831, the problem is on line 66 of Type2CharString in the getWidth() method: {code:java} return nominalWidthX + width; {code} Should be: {code:java} return width; {code} We should wait until PDFBOX-1844 is closed before fixing this to avoid breaking my 1000+ line patch. was (Author: jahewson): Yep, this was indeed caused by my patch PDFBOX-1831, the problem on line 66 of Type2CharString in the getWidth() method: {code:java} return nominalWidthX + width; {code} Should be: {code:java} return width; {code} We should wait until PDFBOX-1844 is closed before fixing this to avoid breaking my 1000+ line patch. Isartor test 6-3-5-t01-fail-a does not return the expected error code Key: PDFBOX-1849 URL: https://issues.apache.org/jira/browse/PDFBOX-1849 Project: PDFBox Issue Type: Bug Components: FontBox, Preflight Reporter: Guillaume Bailleul Assignee: Guillaume Bailleul Fix For: 2.0.0 A change to font handling made in December 2013 or January 2014 causes an Isartor test to no longer detect the expected error. This test (6-3-5-t01-fail-a) should detect a missing glyph but now detects metrics issues instead.
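For context on the getWidth() fix: in CFF, a Type 2 charstring may encode its advance width as an offset from the font's nominalWidthX, and a glyph that omits the offset uses defaultWidthX. The sketch below shows how adding nominalWidthX a second time in the getter inflates the result. It is a hypothetical illustration, not the FontBox Type2CharString source: the class `WidthSketch`, its constructor, and the assumption that the stored width is already fully resolved at parse time are mine.

```java
public class WidthSketch {

    private final int nominalWidthX;
    private final int width; // assumed to be fully resolved when the charstring is parsed

    /**
     * @param defaultWidthX width used when the charstring encodes no width
     * @param nominalWidthX base value that an encoded width is relative to
     * @param encodedDelta  width operand from the charstring, or null if absent
     */
    public WidthSketch(int defaultWidthX, int nominalWidthX, Integer encodedDelta) {
        this.nominalWidthX = nominalWidthX;
        this.width = (encodedDelta == null)
                ? defaultWidthX
                : nominalWidthX + encodedDelta;
    }

    /** Mirrors the reported bug: nominalWidthX is added a second time. */
    public int getWidthBuggy() {
        return nominalWidthX + width;
    }

    /** Mirrors the proposed fix: the stored width is already final. */
    public int getWidth() {
        return width;
    }

    public static void main(String[] args) {
        // Made-up numbers, chosen only to make the difference visible.
        WidthSketch glyph = new WidthSketch(500, 600, -50);
        System.out.println("fixed: " + glyph.getWidth());
        System.out.println("buggy: " + glyph.getWidthBuggy());
    }
}
```

A width that is too large by nominalWidthX would plausibly trip a metrics consistency check before any missing-glyph check runs, which fits the changed Isartor result described in the issue.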
[jira] [Resolved] (PDFBOX-1829) PDF Extract Image Pixelmap Issue
[ https://issues.apache.org/jira/browse/PDFBOX-1829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Lehmkühler resolved PDFBOX-1829. Resolution: Fixed I added the changes to the 1.8 branch in revision 1558702. We will follow up on this issue in PDFBOX-1819, so I'm closing this one. Thanks for the report! PDF Extract Image Pixelmap Issue Key: PDFBOX-1829 URL: https://issues.apache.org/jira/browse/PDFBOX-1829 Project: PDFBox Issue Type: Bug Affects Versions: 1.6.0, 1.8.3 Reporter: Jonas Mende Assignee: Andreas Lehmkühler Fix For: 1.8.4, 2.0.0 Attachments: ausgabe109.pdf Hello everyone, In our current project we are using pdfbox version 1.6.0 as part of an integrated media management solution. When extracting the first page of PDFs, we encounter a certain error for some of the files. The error log looks as follows: 2013-12-20 10:09:14,471 WARN org.apache.pdfbox.util.operator.pagedrawer.SHFill : java.io.IOException: Not Implemented java.io.IOException: Not Implemented at org.apache.pdfbox.pdfviewer.PageDrawer.SHFill_Radial(PageDrawer.java:493) at org.apache.pdfbox.pdfviewer.PageDrawer.SHFill(PageDrawer.java:415) at org.apache.pdfbox.util.operator.pagedrawer.SHFill.process(SHFill.java:58) at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:551) at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:274) at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:251) at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:225) at org.apache.pdfbox.pdfviewer.PageDrawer.drawPage(PageDrawer.java:107) at org.apache.pdfbox.pdmodel.PDPage.convertToImage(PDPage.java:722) at net.sourceforge.openutils.mgnlmedia.media.types.impl.DocumentTypeHandler.createPdfPreview(DocumentTypeHandler.java:141) at net.sourceforge.openutils.mgnlmedia.media.types.impl.DocumentTypeHandler.onPostSave(DocumentTypeHandler.java:96) at
net.sourceforge.openutils.mgnlmedia.media.dialog.LayerDialogMVC.onPostSave(LayerDialogMVC.java:152) at info.magnolia.module.admininterface.DialogMVCHandler.save(DialogMVCHandler.java:236) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at info.magnolia.cms.servlets.MVCServletHandlerImpl.execute(MVCServletHandlerImpl.java:121) at info.magnolia.cms.servlets.MVCServlet.doPost(MVCServlet.java:125) at javax.servlet.http.HttpServlet.service(HttpServlet.java:637) at javax.servlet.http.HttpServlet.service(HttpServlet.java:717) at info.magnolia.cms.filters.ServletDispatchingFilter.doFilter(ServletDispatchingFilter.java:123) at info.magnolia.cms.filters.AbstractMgnlFilter.doFilter(AbstractMgnlFilter.java:91) at info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:83) at info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:85) at info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:85) at info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:85) at info.magnolia.cms.filters.CompositeFilter.doFilter(CompositeFilter.java:67) at info.magnolia.cms.filters.AbstractMgnlFilter.doFilter(AbstractMgnlFilter.java:91) at info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:83) at info.magnolia.cms.filters.VirtualUriFilter.doFilter(VirtualUriFilter.java:70) at info.magnolia.cms.filters.AbstractMgnlFilter.doFilter(AbstractMgnlFilter.java:91) at info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:83) at info.magnolia.module.cache.executor.Bypass.processCacheRequest(Bypass.java:58) at info.magnolia.module.cache.executor.CompositeExecutor.processCacheRequest(CompositeExecutor.java:66) at 
info.magnolia.module.cache.filter.CacheFilter.doFilter(CacheFilter.java:153) at info.magnolia.cms.filters.OncePerRequestAbstractMgnlFilter.doFilter(OncePerRequestAbstractMgnlFilter.java:61) at info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:83) at info.magnolia.cms.i18n.I18nContentSupportFilter.doFilter(I18nContentSupportFilter.java:76) at info.magnolia.cms.filters.AbstractMgnlFilter.doFilter(AbstractMgnlFilter.java:91) at info.magnolia.cms.filters.MgnlFilterChain.doFilter(MgnlFilterChain.java:83) at info.magnolia.cms.filters.RangeSupportFilter.doFilter(RangeSupportFilter.java:84) at
[jira] [Commented] (PDFBOX-1808) PDFTextStripper.getText - hight memory usage
[ https://issues.apache.org/jira/browse/PDFBOX-1808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13873111#comment-13873111 ] Andreas Lehmkühler commented on PDFBOX-1808: I added most of the changes (excluding 1553174, as it introduces an API incompatibility) to the 1.8 branch in revision 1558705. [~jguyenot] Any luck with your test? Do you need some additional help/information? PDFTextStripper.getText - hight memory usage Key: PDFBOX-1808 URL: https://issues.apache.org/jira/browse/PDFBOX-1808 Project: PDFBox Issue Type: Bug Components: Text extraction Affects Versions: 1.8.2, 1.8.3 Environment: Windows 7 Java jdk 1.7.0_45 Reporter: Guyenot Jeremy Assignee: Andreas Lehmkühler Priority: Critical Labels: performance Attachments: 1808-java char copyof.jpg, 1808-java char copyofrange.jpg, 1808-java usage.jpg, 1808-pdfbox usage.jpg, 1808-snapshot.nps, DOSSIER DE CANDIDATURE_001.pdf, s5-1.png, s5-2.png, s50-1.png, s50-2.png Original Estimate: 72h Remaining Estimate: 72h Hello, I'm trying to extract text from PDFs, but I find that PDFTextStripper uses a lot of memory. With a PDF that has 2676 pages (4.6 MB in size) it uses 1.5 GB of memory. I also observe that the memory isn't freed after the getText method is called.
You can see my code below:
double virgule = Math.pow(10, 2);
System.out.println("START - Total memory (Mo): " + Math.round((Runtime.getRuntime().totalMemory()/100) * virgule) / virgule);
PDDocument cd = PDDocument.load(file);
System.out.println("PDDocument getNumberOfPages - Nombre de pages: " + cd.getNumberOfPages());
System.out.println("PDDocument load - Total memory (Mo): " + Math.round((Runtime.getRuntime().totalMemory()/100) * virgule) / virgule);
String pdfText = "";
try {
    PDFTextStripper stripper = new PDFTextStripper();
    pdfText = stripper.getText(cd);
    System.out.println("PDFTextStripper getText - Total memory (Mo): " + Math.round((Runtime.getRuntime().totalMemory()/100) * virgule) / virgule);
    stripper.resetEngine();
    stripper = null;
    System.out.println("PDFTextStripper resetEngine - Total memory (Mo): " + Math.round((Runtime.getRuntime().totalMemory()/100) * virgule) / virgule);
} finally {
    if (cd != null) {
        cd.close();
        cd = null;
        System.out.println("PDDocument close - Total memory (Mo): " + Math.round((Runtime.getRuntime().totalMemory()/100) * virgule) / virgule);
    }
}
retour = new TextField(fieldName, pdfText, Field.Store.NO);
System.out.println("TextField - Total memory (Mo): " + Math.round((Runtime.getRuntime().totalMemory()/100) * virgule) / virgule);
And the result in my output window:
START - Total memory (Mo): 95.0
PDDocument getNumberOfPages - Nombre de pages: 2676
PDDocument load - Total memory (Mo): 121.0
PDFTextStripper getText - Total memory (Mo): 757.0
PDFTextStripper resetEngine - Total memory (Mo): 757.0
PDDocument close - Total memory (Mo): 757.0
TextField - Total memory (Mo): 757.0
pdfText - Total memory (Mo): 757.0
I also tried calling System.gc(), but the memory use stays the same.
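A note on reading these figures: Runtime.totalMemory() reports the heap the JVM has currently reserved, not the memory occupied by live objects, and the JVM is not required to shrink that reservation after a collection. A flat value after close() therefore does not by itself show that objects are still being retained. The sketch below measures used heap as totalMemory() minus freeMemory(). It uses only java.lang.Runtime; the class name `HeapUsage` and its helpers are hypothetical, and System.gc() is only a hint that the JVM may ignore.

```java
public class HeapUsage {

    /** Bytes occupied in the heap right now: reserved heap minus its free part. */
    public static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Converts a byte count to mebibytes (1 MiB = 1024 * 1024 bytes). */
    public static double toMebibytes(long bytes) {
        return bytes / (1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        long before = usedBytes();

        // Stand-in for the real workload (for example the text extraction above).
        byte[] block = new byte[16 * 1024 * 1024];
        block[0] = 1;
        long during = usedBytes();

        block = null;  // drop the only reference
        System.gc();   // request a collection; the JVM may ignore it
        long after = usedBytes();

        System.out.printf("used before: %.1f MiB%n", toMebibytes(before));
        System.out.printf("used during: %.1f MiB%n", toMebibytes(during));
        System.out.printf("used after:  %.1f MiB%n", toMebibytes(after));
        System.out.printf("reserved:    %.1f MiB%n",
                toMebibytes(Runtime.getRuntime().totalMemory()));
    }
}
```

Comparing the used figures before and after the call gives a better first indication of retained objects than the reserved size alone; a heap dump or profiler snapshot, like the ones attached to the issue, is still needed to see what is actually holding the memory.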