[jira] [Updated] (PDFBOX-2141) Shading not applied to text
[ https://issues.apache.org/jira/browse/PDFBOX-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-2141: Attachment: pattern-shading-2-4.ps pattern-shading-2-4-idMatrix1.jpg pattern-shading-2-4-idMatrix.pdf pattern-shading-2-4-idMatrix.pdf is a file that I created from PostScript (also attached). The rendering of the text brings a bad surprise :-( - every glyph is rendered as if it is a single page. Shading not applied to text --- Key: PDFBOX-2141 URL: https://issues.apache.org/jira/browse/PDFBOX-2141 Project: PDFBox Issue Type: Bug Components: Rendering Affects Versions: 1.8.5, 1.8.6, 2.0.0 Reporter: Petr Slaby Priority: Minor Attachments: 04_ShadingPatternTextPDF.pdf, PDFBOX-1917.pdf-1.png, PDFBOX-1917.pdf-1.png-diff.png, PDFBOX-1917.pdf-9.png, PDFBOX-1917.pdf-9.png-diff.png, PDFBOX-2135.pdf-2.png, PDFBOX-2135.pdf-2.png-diff.png, PageDrawer.writeFont.java.patch, pattern-shading-2-4-idMatrix.pdf, pattern-shading-2-4-idMatrix1.jpg, pattern-shading-2-4.ps The attached PDF draws a text filled with horizontal shading going from red to blue. When rendered via PDFBox, the text is completely filled with red. The problem is that AxialShadingContext#getRaster() gets called with positions that completely fell outside of the range stored in its coords[] field. The fix seems to be to set glyph transform rather than graphics2d transform in PageDrawer#writeText() as shown in the attached patch. -- This message was sent by Atlassian JIRA (v6.2#6252)
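The symptom described above (text filled entirely with red) is what one would expect when every sampled position lies before the start of the shading axis. The following is a minimal sketch of the axial (type 2) parameter computation, not the actual AxialShadingContext code: the class name, the axis endpoints standing in for coords[], and the assumption that the shading is extended at both ends are all hypothetical.

```java
public class AxialShadingSketch {
    // Hypothetical axis endpoints, standing in for the coords[] field.
    static final double X0 = 100, Y0 = 0, X1 = 300, Y1 = 0;

    /** Projects (x, y) onto the axis and clamps the parameter to [0, 1]. */
    static double parameter(double x, double y) {
        double dx = X1 - X0;
        double dy = Y1 - Y0;
        double t = ((x - X0) * dx + (y - Y0) * dy) / (dx * dx + dy * dy);
        // Clamping models a shading that is extended beyond both ends (assumption).
        return Math.max(0.0, Math.min(1.0, t));
    }

    /** Linear blend from red (t = 0) to blue (t = 1), packed as 0xRRGGBB. */
    static int color(double t) {
        int red = (int) Math.round(255 * (1 - t));
        int blue = (int) Math.round(255 * t);
        return (red << 16) | blue;
    }

    public static void main(String[] args) {
        // A position inside the axis range yields a blended color.
        System.out.printf("%06X%n", color(parameter(200, 50)));
        // Positions expressed in the wrong coordinate space fall before the
        // axis start, so all of them clamp to the start color.
        System.out.printf("%06X%n", color(parameter(-40, 50)));
        System.out.printf("%06X%n", color(parameter(10, 50)));
    }
}
```

If getRaster() is handed positions that were never mapped into the space of the axis coordinates, every pixel behaves like the last two calls, which would match the all-red text.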
[jira] [Commented] (PDFBOX-2149) Font Refactoring
[ https://issues.apache.org/jira/browse/PDFBOX-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040054#comment-14040054 ]

Tilman Hausherr commented on PDFBOX-2149:

Thanks, yes, that one is fixed.

Font Refactoring
Key: PDFBOX-2149
URL: https://issues.apache.org/jira/browse/PDFBOX-2149
Project: PDFBox
Issue Type: Improvement
Components: FontBox, PDModel
Affects Versions: 2.0.0
Reporter: John Hewson
Assignee: John Hewson
Attachments: 39.pdf, 000467.pdf

To fix bugs such as PDFBOX-2140 and to enable Unicode TTF embedding we need to sort out long-standing font/text encoding issues. The main issue is that encoding is done in an ad-hoc manner, sometimes in the PDFont subclasses, sometimes elsewhere. For example, TTFGlyph2D does its own decoding, and this code is copy-pasted into PDTrueTypeFont. Likewise, PDFont handles CMaps and Encodings despite the fact that these two encoding methods are mutually exclusive. The end result is that the process of reading Encodings/CMaps often follows rules which are completely invalid for that font type but mostly work by luck.

Phase 1
- Refactor PDFont subclasses to remove setXXX methods which allow the object to be corrupted. Proper use of inheritance can remove all cases where public setXXX methods are used during font loading.
- Clean up TTF loading and the loadTTF in anticipation of Unicode TTF embedding. FontBox's TrueTypeFont class is externally mutable via setXXX methods used only by TTFParser: these can be made package-private.
- The Encoding class and EncodingManager could do with some cleaning up prior to further refactoring.
- PDSimpleFont does not do anything; its functionality should be moved into its superclass, PDFont.
- PDFont#determineEncoding() loads CMaps when only Encodings are applicable, and vice versa. Loading needs to be pushed down into the appropriate subclasses; as a starting point the relevant code should at least be copied into the relevant subclasses, ready for further refactoring.
- TTFGlyph2D does its own decoding of char codes rather than using the font's #encode method (fair enough, because #encode is broken), and there's a copy-pasted version of the same code in PDTrueTypeFont. We need to consolidate this code into PDTrueTypeFont where it belongs.

Phase 2
- Refactor loading of CMaps and Encodings from font dictionaries. This will involve changes to PDFont and its subclasses to delegate loading to subclasses where it can be properly encapsulated.
- May need to alter the class hierarchy w.r.t. CIDFont to facilitate this, as CIDFont isn't really a PDFont: its parent Type0 font is responsible for its CMap. We'll see.

Phase 3
- Refactor the decoding of character codes by PDFont and its subclasses. This will involve replacing the #getCodeFromArray, #encode and #encodeToCID methods.
- Fix decoding of content stream character codes in PDFStreamEngine, using the newly refactored PDFont and using the current font's CMap to determine the code width.

Phase 4
- Add support for generating embedded TTFs with Unicode
[jira] [Updated] (PDFBOX-2156) different shading patterns at different resolutions when ctm is null
[ https://issues.apache.org/jira/browse/PDFBOX-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-2156: Affects Version/s: 2.0.0 different shading patterns at different resolutions when ctm is null Key: PDFBOX-2156 URL: https://issues.apache.org/jira/browse/PDFBOX-2156 Project: PDFBox Issue Type: Bug Components: Rendering Affects Versions: 1.8.5, 1.8.6, 2.0.0 Reporter: Tilman Hausherr Assignee: Tilman Hausherr Labels: shading, shadingpattern Attachments: swftools-gradients.pdf, swftools-gradients1.jpg The attached file renders incorrectly except at 72dpi; at other resolutions, there are different results that get worse as the resolution gets higher. For shading types 2 and 3, this effect happens only in the 1.8 versions, not in the 2.0 version. The reason is that the transformation is incomplete if the ctm is null. I found the file here and it was created by Matthias Kramm: https://github.com/jdapena/swftools/tree/master/spec -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (PDFBOX-2156) different shading patterns at different resolutions when ctm is null
[ https://issues.apache.org/jira/browse/PDFBOX-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-2156: Description: The attached file renders incorrectly except at 72dpi; at other resolutions, there are different results that get worse as the resolution gets higher. For shading types 2 and 3, this effect happens only in the 1.8 versions, not in the 2.0 version. The reason is that the transformation is incomplete if the ctm is null. I found the file here and it was created by Matthias Kramm: https://github.com/jdapena/swftools/tree/master/spec was: The attached file renders incorrectly except at 72dpi; at other resolutions, there are different results that get worse as the resolution gets higher. This effect happens only in the 1.8 versions. The reason is that the transformation is incomplete if the ctm is null. The effect doesn't happen in the 2.0 version. I already have a fix but will make some tests and try another strategy too. I found the file here and it was created by Matthias Kramm: https://github.com/jdapena/swftools/tree/master/spec different shading patterns at different resolutions when ctm is null Key: PDFBOX-2156 URL: https://issues.apache.org/jira/browse/PDFBOX-2156 Project: PDFBox Issue Type: Bug Components: Rendering Affects Versions: 1.8.5, 1.8.6, 2.0.0 Reporter: Tilman Hausherr Assignee: Tilman Hausherr Labels: shading, shadingpattern Attachments: swftools-gradients.pdf, swftools-gradients1.jpg The attached file renders incorrectly except at 72dpi; at other resolutions, there are different results that get worse as the resolution gets higher. For shading types 2 and 3, this effect happens only in the 1.8 versions, not in the 2.0 version. The reason is that the transformation is incomplete if the ctm is null. I found the file here and it was created by Matthias Kramm: https://github.com/jdapena/swftools/tree/master/spec -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (PDFBOX-2156) different shading patterns at different resolutions when ctm is null
[ https://issues.apache.org/jira/browse/PDFBOX-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-2156: Description: The attached file renders incorrectly except at 72dpi; at other resolutions, there are different results that get worse as the resolution gets higher. For shading types 2 and 3, this effect happens only in the 1.8 versions, not in the 2.0 version. For shading types 4 and up, it happens in both 1.8 and 2.0. The reason is that the transformation is incomplete if the ctm is null. I found the file here and it was created by Matthias Kramm: https://github.com/jdapena/swftools/tree/master/spec was: The attached file renders incorrectly except at 72dpi; at other resolutions, there are different results that get worse as the resolution gets higher. For shading types 2 and 3, this effect happens only in the 1.8 versions, not in the 2.0 version. The reason is that the transformation is incomplete if the ctm is null. I found the file here and it was created by Matthias Kramm: https://github.com/jdapena/swftools/tree/master/spec different shading patterns at different resolutions when ctm is null Key: PDFBOX-2156 URL: https://issues.apache.org/jira/browse/PDFBOX-2156 Project: PDFBox Issue Type: Bug Components: Rendering Affects Versions: 1.8.5, 1.8.6, 2.0.0 Reporter: Tilman Hausherr Assignee: Tilman Hausherr Labels: shading, shadingpattern Attachments: swftools-gradients.pdf, swftools-gradients1.jpg The attached file renders incorrectly except at 72dpi; at other resolutions, there are different results that get worse as the resolution gets higher. For shading types 2 and 3, this effect happens only in the 1.8 versions, not in the 2.0 version. For shading types 4 and up, it happens in both 1.8 and 2.0. The reason is that the transformation is incomplete if the ctm is null. 
I found the file here and it was created by Matthias Kramm: https://github.com/jdapena/swftools/tree/master/spec -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (PDFBOX-2156) different shading patterns at different resolutions when ctm is null
[ https://issues.apache.org/jira/browse/PDFBOX-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-2156: Attachment: pattern-shading-2-4-noMatrix.pdf The attached file pattern-shading-2-4-noMatrix.pdf is the same as pattern-shading-2-4-idMatrix.pdf from PDFBOX-2141, except that I blanked the identity matrices in the PDF files with a hex editor. The problem with the not / badly shaded text will be handled there. different shading patterns at different resolutions when ctm is null Key: PDFBOX-2156 URL: https://issues.apache.org/jira/browse/PDFBOX-2156 Project: PDFBox Issue Type: Bug Components: Rendering Affects Versions: 1.8.5, 1.8.6, 2.0.0 Reporter: Tilman Hausherr Assignee: Tilman Hausherr Labels: shading, shadingpattern Attachments: pattern-shading-2-4-noMatrix.pdf, swftools-gradients.pdf, swftools-gradients1.jpg The attached file renders incorrectly except at 72dpi; at other resolutions, there are different results that get worse as the resolution gets higher. For shading types 2 and 3, this effect happens only in the 1.8 versions, not in the 2.0 version. For shading types 4 and up, it happens in both 1.8 and 2.0. The reason is that the transformation is incomplete if the ctm is null. I found the file here and it was created by Matthias Kramm: https://github.com/jdapena/swftools/tree/master/spec -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (PDFBOX-2156) different shading patterns at different resolutions when ctm is null
[ https://issues.apache.org/jira/browse/PDFBOX-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14040090#comment-14040090 ] Tilman Hausherr edited comment on PDFBOX-2156 at 6/22/14 9:54 AM: -- Fixed in rev 1604555 in the 2.0 version for shading types 4 and 5. was (Author: tilman): Fixed in the 2.0 version for shading types 4 and 5. different shading patterns at different resolutions when ctm is null Key: PDFBOX-2156 URL: https://issues.apache.org/jira/browse/PDFBOX-2156 Project: PDFBox Issue Type: Bug Components: Rendering Affects Versions: 1.8.5, 1.8.6, 2.0.0 Reporter: Tilman Hausherr Assignee: Tilman Hausherr Labels: shading, shadingpattern Attachments: pattern-shading-2-4-noMatrix.pdf, swftools-gradients.pdf, swftools-gradients1.jpg The attached file renders incorrectly except at 72dpi; at other resolutions, there are different results that get worse as the resolution gets higher. For shading types 2 and 3, this effect happens only in the 1.8 versions, not in the 2.0 version. For shading types 4 and up, it happens in both 1.8 and 2.0. The reason is that the transformation is incomplete if the ctm is null. I found the file here and it was created by Matthias Kramm: https://github.com/jdapena/swftools/tree/master/spec -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PDFBOX-2156) different shading patterns at different resolutions when ctm is null
[ https://issues.apache.org/jira/browse/PDFBOX-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14040090#comment-14040090 ] Tilman Hausherr commented on PDFBOX-2156: - Fixed in the 2.0 version for shading types 4 and 5. different shading patterns at different resolutions when ctm is null Key: PDFBOX-2156 URL: https://issues.apache.org/jira/browse/PDFBOX-2156 Project: PDFBox Issue Type: Bug Components: Rendering Affects Versions: 1.8.5, 1.8.6, 2.0.0 Reporter: Tilman Hausherr Assignee: Tilman Hausherr Labels: shading, shadingpattern Attachments: pattern-shading-2-4-noMatrix.pdf, swftools-gradients.pdf, swftools-gradients1.jpg The attached file renders incorrectly except at 72dpi; at other resolutions, there are different results that get worse as the resolution gets higher. For shading types 2 and 3, this effect happens only in the 1.8 versions, not in the 2.0 version. For shading types 4 and up, it happens in both 1.8 and 2.0. The reason is that the transformation is incomplete if the ctm is null. I found the file here and it was created by Matthias Kramm: https://github.com/jdapena/swftools/tree/master/spec -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (PDFBOX-2118) Remove ICU4J dependency
[ https://issues.apache.org/jira/browse/PDFBOX-2118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andreas Lehmkühler resolved PDFBOX-2118. Resolution: Fixed I've removed the last ICU4J dependency in revision http://svn.apache.org/r1604560 Remove ICU4J dependency --- Key: PDFBOX-2118 URL: https://issues.apache.org/jira/browse/PDFBOX-2118 Project: PDFBox Issue Type: Improvement Components: PDModel Affects Versions: 2.0.0 Reporter: Andreas Lehmkühler Assignee: Andreas Lehmkühler Labels: ICU4J Fix For: 2.0.0 The ICU4J lib is quite big and we are just using a small part of it. Both features are provided by the JDK (java.text.Normalizer and java.text.Bidi) since 1.6 so that it should be possible to remove the ICU4J dependency. -- This message was sent by Atlassian JIRA (v6.2#6252)
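For readers unfamiliar with the two JDK classes named above, here is a small sketch of how java.text.Normalizer and java.text.Bidi can be called. It is not the code from r1604560; the class name, the helper names and the choice of the NFC form are assumptions made for illustration.

```java
import java.text.Bidi;
import java.text.Normalizer;

public class JdkTextSketch {
    /** Composes base characters and combining marks (NFC is an assumed choice). */
    static String normalize(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFC);
    }

    /** Reports whether the text contains right-to-left runs. */
    static boolean hasRightToLeft(String text) {
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        return !bidi.isLeftToRight();
    }

    public static void main(String[] args) {
        // "e" followed by a combining acute accent becomes one precomposed character.
        System.out.println(normalize("e\u0301").length());
        // Latin-only text is purely left-to-right; adding Hebrew letters makes it mixed.
        System.out.println(hasRightToLeft("abc"));
        System.out.println(hasRightToLeft("abc \u05d0\u05d1"));
    }
}
```

Both classes have shipped with the JDK since Java 6, which is what makes dropping the external library possible.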
[jira] [Commented] (PDFBOX-1915) Implement shading with Coons and tensor-product patch meshes
[ https://issues.apache.org/jira/browse/PDFBOX-1915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040103#comment-14040103 ]

Tilman Hausherr commented on PDFBOX-1915:

There was a bug in the existing shadings (PDFBOX-2156) that is also in your code, although it doesn't apply to your test files. Please change transformPoint() so that it looks like this:

{code}
if (ctm != null)
{
    ctm.createAffineTransform().transform(p, p);
}
xform.transform(p, p);
{code}

Implement shading with Coons and tensor-product patch meshes
Key: PDFBOX-1915
URL: https://issues.apache.org/jira/browse/PDFBOX-1915
Project: PDFBox
Issue Type: Improvement
Components: Rendering
Affects Versions: 1.8.5, 1.8.6, 2.0.0
Reporter: Tilman Hausherr
Assignee: Shaola Ren
Labels: graphical, gsoc2014, java, math, shading
Fix For: 2.0.0
Attachments: CONICAL.pdf, GWG060_Shading_x1a.pdf, GWG060_Shading_x1a_1.png, HSBWHEEL.pdf, McAfee-ShadingType7.pdf, Shadingtype6week1.pdf, TENSOR.pdf, XYZsweep.pdf, _gwg060_shading_x1a.pdf-1.png, _mcafee-shadingtype7.pdf-1.png, asy-coons-but-really-tensor.pdf, asy-tensor-rainbow.pdf, asy-tensor.pdf, coons-function.pdf, coons-function.ps, coons-nofunction-CMYK.pdf, coons-nofunction-CMYK.ps, coons-nofunction-Duotone.pdf, coons-nofunction-Duotone.ps, coons-nofunction-Gray.pdf, coons-nofunction-Gray.ps, coons-nofunction-RGB.pdf, coons-nofunction-RGB.ps, coons2-function.pdf, coons2-function.ps, coons4-function.ps, crestron-p9.pdf, eci_altona-test-suite-v2_technical_H.pdf, failedTest.rar, lamp_cairo.pdf, lamp_cairo7_0.png, lamp_cairo7_1.png, lamp_cairo7_1.png, lineRasterization.jpg, mcafeeU5.pdf, mcafeeU5_1.png, mcafeeu5.pdf-1.png, pass4FlagTest.rar, patchCases.jpg, patchMap.jpg, shading6ContourTest.rar, shading6Done.rar, shading7.rar, tensor-nofunction-RGB.pdf, tensor-nofunction-RGB.ps, tensor-nofunction-RGB_1.png, tensor4-nofunction.pdf, tensor4-nofunction.ps, tensor4-nofunction_1.png, updateshading6ContourTest.rar

Of the seven shading methods described in the PDF specification, type 6 (Coons patch meshes) and type 7 (Tensor-product patch meshes) haven't been implemented. I have done type 1, 4 and 5, but I don't know the math for type 6 and 7. My math days are decades away.

Knowledge prerequisites:
- java, although you don't have to be a java ace, just feel comfortable
- math: you should know what cubic Bézier curves, degenerate Bézier curves, bilinear interpolation, tensor-product, affine transform matrix and Bernstein polynomials are, or be able to learn it
- maven (basic)
- svn (basic)
- an IDE like Netbeans or Eclipse or IntelliJ (basic)
- ideally, you are either a math student who likes to program, or a computer science student who is specializing in graphics.

A first look at PDFBOX: try the command utility here: https://pdfbox.apache.org/commandline/#pdfToImage and use your favorite PDF, or the PDFs mentioned in PDFBOX-615; these have the shading types that are already implemented.

Some simple source code to convert to images:

    String filename = "blah.pdf";
    PDDocument document = PDDocument.loadNonSeq(new File(filename), null);
    List<PDPage> pdPages = document.getDocumentCatalog().getAllPages();
    int page = 0;
    for (PDPage pdPage : pdPages)
    {
        ++page;
        // render each page at 300 dpi and write it as <filename><page>.png
        BufferedImage bim = RenderUtil.convertToImage(pdPage, BufferedImage.TYPE_BYTE_BINARY, 300);
        ImageIO.write(bim, "png", new File(filename + page + ".png"));
    }
    document.close();

You are not starting from scratch. The implementation of type 4 and 5 shows you how to read parameters from the PDF and set the graphics. You don't have to learn the complete PDF spec, only 15 pages related to the two shading types, and 6 pages about shading in general. The PDF specification is here: http://www.adobe.com/devnet/pdf/pdf_reference.html

The tricky parts are:
- decide whether a point(x,y) is inside or outside a patch
- decide the color of a point within the patch

To get an idea about the code, look at the classes GouraudTriangle, GouraudShadingContext, Type4ShadingContext and Vertex here https://svn.apache.org/viewvc/pdfbox/trunk/pdfbox/src/main/java/org/apache/pdfbox/pdmodel/graphics/shading/ or download the whole project from the repository: https://pdfbox.apache.org/downloads.html#scm

If you want to see the existing code in the debugger with a Gouraud shading, try this file: http://asymptote.sourceforge.net/gallery/Gouraud.pdf

Testing: I have attached several example PDFs. To see which one has which shading, open them with an editor like NOTEPAD++, and search for "/ShadingType" (without the quotes). If your images are rendering like the example PDFs, then you were
[jira] [Commented] (PDFBOX-2156) different shading patterns at different resolutions when ctm is null
[ https://issues.apache.org/jira/browse/PDFBOX-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14040109#comment-14040109 ] Tilman Hausherr commented on PDFBOX-2156: - Fixed in rev 1604562 for the 1.8 version for shading types 4 and 5. Surprisingly, the method of shading 2 and 3 (which involves the pageHeight and the y-translation) can't be used here. different shading patterns at different resolutions when ctm is null Key: PDFBOX-2156 URL: https://issues.apache.org/jira/browse/PDFBOX-2156 Project: PDFBox Issue Type: Bug Components: Rendering Affects Versions: 1.8.5, 1.8.6, 2.0.0 Reporter: Tilman Hausherr Assignee: Tilman Hausherr Labels: shading, shadingpattern Attachments: pattern-shading-2-4-noMatrix.pdf, swftools-gradients.pdf, swftools-gradients1.jpg The attached file renders incorrectly except at 72dpi; at other resolutions, there are different results that get worse as the resolution gets higher. For shading types 2 and 3, this effect happens only in the 1.8 versions, not in the 2.0 version. For shading types 4 and up, it happens in both 1.8 and 2.0. The reason is that the transformation is incomplete if the ctm is null. I found the file here and it was created by Matthias Kramm: https://github.com/jdapena/swftools/tree/master/spec -- This message was sent by Atlassian JIRA (v6.2#6252)
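The resolution dependence can be illustrated with a self-contained sketch of the transformPoint() shape requested in PDFBOX-1915. This is a simplified stand-in and not the committed fix: here ctm is a plain, possibly null java.awt.geom.AffineTransform rather than a PDFBox matrix, and xform is assumed to be the device transform that carries the resolution scale.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

public class TransformPointSketch {
    /**
     * Maps a shading point to device space. The ctm is optional (may be null),
     * but the device transform xform is always applied.
     */
    static Point2D transformPoint(Point2D p, AffineTransform ctm, AffineTransform xform) {
        Point2D result = new Point2D.Double(p.getX(), p.getY());
        if (ctm != null) {
            ctm.transform(result, result);
        }
        // Applied unconditionally, so a missing ctm no longer drops the resolution scale.
        xform.transform(result, result);
        return result;
    }

    public static void main(String[] args) {
        // 144 dpi relative to the 72 dpi default gives a scale factor of 2.
        AffineTransform dpi144 = AffineTransform.getScaleInstance(2, 2);
        System.out.println(transformPoint(new Point2D.Double(10, 20), null, dpi144));
        AffineTransform ctm = AffineTransform.getTranslateInstance(5, 0);
        System.out.println(transformPoint(new Point2D.Double(10, 20), ctm, dpi144));
    }
}
```

At 72 dpi the scale factor is 1, so skipping the device transform changes nothing there; that would be consistent with the file rendering correctly only at 72 dpi, although the report does not spell this out.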
[jira] [Commented] (PDFBOX-2155) Fix JavaDocs warnings
[ https://issues.apache.org/jira/browse/PDFBOX-2155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14040113#comment-14040113 ] Maruan Sahyoun commented on PDFBOX-2155: fixed warnings for xmpbox in rev. 1604565 for 2.0 Fix JavaDocs warnings - Key: PDFBOX-2155 URL: https://issues.apache.org/jira/browse/PDFBOX-2155 Project: PDFBox Issue Type: Bug Components: Documentation, FontBox, Preflight, XmpBox Affects Versions: 1.8.6, 2.0.0 Reporter: Maruan Sahyoun Assignee: Maruan Sahyoun Priority: Trivial after fixing PDFBOX-1897 with additional changes some new warnings were introduced. In addition warnings in sub projects should be fixed. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (PDFBOX-2157) Remove AFMFormatter
Andreas Lehmkühler created PDFBOX-2157:

Summary: Remove AFMFormatter
Key: PDFBOX-2157
URL: https://issues.apache.org/jira/browse/PDFBOX-2157
Project: PDFBox
Issue Type: Improvement
Components: FontBox
Affects Versions: 2.0.0
Reporter: Andreas Lehmkühler
Assignee: Andreas Lehmkühler

The AFMFormatter class is used to create the font metrics of a CFF font. It makes a detour by creating an AFM file first and then parsing it to create the font metrics. That isn't needed, as we can create the font metrics directly when parsing the CFF font.
[jira] [Commented] (PDFBOX-2157) Remove AFMFormatter
[ https://issues.apache.org/jira/browse/PDFBOX-2157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040129#comment-14040129 ]

Andreas Lehmkühler commented on PDFBOX-2157:

I've removed the AFMFormatter class from the trunk in revision http://svn.apache.org/r1604572. The font metric is now created when parsing the CFFFont.

Remove AFMFormatter
Key: PDFBOX-2157
URL: https://issues.apache.org/jira/browse/PDFBOX-2157
Project: PDFBox
Issue Type: Improvement
Components: FontBox
Affects Versions: 2.0.0
Reporter: Andreas Lehmkühler
Assignee: Andreas Lehmkühler

The AFMFormatter class is used to create the font metrics of a CFF font. It makes a detour by creating an AFM file first and then parsing it to create the font metrics. That isn't needed, as we can create the font metrics directly when parsing the CFF font.
[RESULT][VOTE] Release Apache PDFBox 1.8.6
Hi,

On 19.06.2014 14:28, Andreas Lehmkuehler wrote:
Please vote on releasing this package as Apache PDFBox 1.8.6.

The vote passes as follows:

+1 Rey Malahay
+1 Tilman Hausherr (*)
+1 John Hewson (*)
+1 Maruan Sahyoun (*)
+1 Timo Boehme (*)
+1 Guillaume Bailleul (*)
+1 Andreas Lehmkühler (*)

(*) binding votes

Thanks for your help and support!! I'll push the release out.

BR
Andreas Lehmkühler
[jira] [Resolved] (PDFBOX-2157) Remove AFMFormatter
[ https://issues.apache.org/jira/browse/PDFBOX-2157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler resolved PDFBOX-2157.
Resolution: Fixed
Fix Version/s: 2.0.0

In the 1.8 branch I've deprecated the now-removed method BoundingBox#contains(Point point) and the now-removed class AFMFormatter in revisions http://svn.apache.org/r1604573 and http://svn.apache.org/r1604589.

Remove AFMFormatter
Key: PDFBOX-2157
URL: https://issues.apache.org/jira/browse/PDFBOX-2157
Project: PDFBox
Issue Type: Improvement
Components: FontBox
Affects Versions: 2.0.0
Reporter: Andreas Lehmkühler
Assignee: Andreas Lehmkühler
Fix For: 2.0.0

The AFMFormatter class is used to create the font metrics of a CFF font. It makes a detour by creating an AFM file first and then parsing it to create the font metrics. That isn't needed, as we can create the font metrics directly when parsing the CFF font.
1.8.6 and JIRA
Hi,

following the new PDFBox 1.8.6 release, I've closed all resolved 1.8.6-related issues in a bulk operation. I disabled the email notification to avoid an email flood. I've also added the brand-new version 1.8.7 for our next bugfix release ...

I'll update the download page once the mirrors have copied the version from our repository.

BR
Andreas Lehmkühler
wikipedia
I've started a wikipedia article about PDFBox. It has already existed for 10 minutes, so there is hope :-) https://en.wikipedia.org/wiki/Apache_PDFBox
[jira] [Updated] (PDFBOX-2156) different shading patterns at different resolutions when ctm is null
[ https://issues.apache.org/jira/browse/PDFBOX-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-2156: Affects Version/s: 1.8.7 different shading patterns at different resolutions when ctm is null Key: PDFBOX-2156 URL: https://issues.apache.org/jira/browse/PDFBOX-2156 Project: PDFBox Issue Type: Bug Components: Rendering Affects Versions: 1.8.5, 1.8.6, 1.8.7, 2.0.0 Reporter: Tilman Hausherr Assignee: Tilman Hausherr Labels: shading, shadingpattern Fix For: 1.8.7, 2.0.0 Attachments: pattern-shading-2-4-noMatrix.pdf, swftools-gradients.pdf, swftools-gradients1.jpg The attached file renders incorrectly except at 72dpi; at other resolutions, there are different results that get worse as the resolution gets higher. For shading types 2 and 3, this effect happens only in the 1.8 versions, not in the 2.0 version. For shading types 4 and up, it happens in both 1.8 and 2.0. The reason is that the transformation is incomplete if the ctm is null. I found the file here and it was created by Matthias Kramm: https://github.com/jdapena/swftools/tree/master/spec -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (PDFBOX-2156) different shading patterns at different resolutions when ctm is null
[ https://issues.apache.org/jira/browse/PDFBOX-2156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr resolved PDFBOX-2156. - Resolution: Fixed Fix Version/s: 2.0.0 1.8.7 different shading patterns at different resolutions when ctm is null Key: PDFBOX-2156 URL: https://issues.apache.org/jira/browse/PDFBOX-2156 Project: PDFBox Issue Type: Bug Components: Rendering Affects Versions: 1.8.5, 1.8.6, 1.8.7, 2.0.0 Reporter: Tilman Hausherr Assignee: Tilman Hausherr Labels: shading, shadingpattern Fix For: 1.8.7, 2.0.0 Attachments: pattern-shading-2-4-noMatrix.pdf, swftools-gradients.pdf, swftools-gradients1.jpg The attached file renders incorrectly except at 72dpi; at other resolutions, there are different results that get worse as the resolution gets higher. For shading types 2 and 3, this effect happens only in the 1.8 versions, not in the 2.0 version. For shading types 4 and up, it happens in both 1.8 and 2.0. The reason is that the transformation is incomplete if the ctm is null. I found the file here and it was created by Matthias Kramm: https://github.com/jdapena/swftools/tree/master/spec -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (PDFBOX-2153) Setting the correct clipping path for shading
[ https://issues.apache.org/jira/browse/PDFBOX-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr updated PDFBOX-2153:
Affects Version/s: 1.8.7

Setting the correct clipping path for shading
Key: PDFBOX-2153
URL: https://issues.apache.org/jira/browse/PDFBOX-2153
Project: PDFBox
Issue Type: Bug
Components: Rendering
Affects Versions: 1.8.5, 1.8.6, 1.8.7, 2.0.0
Reporter: Tilman Hausherr
Labels: shading, shadingpattern
Fix For: 1.8.7, 2.0.0

While doing tests with the file eci_altona-test-suite-v2_technical_H.pdf (uncompressed) of PDFBOX-1915 I noticed that by removing a W (modifies the clipping region) operator of a type 7 shading I got a lot more correct shadings (type 6 and lower). It looked like PDFBox had been using the clipping of the type 7 when drawing the type 6, which is just a rectangle above in that rendering. This resulted in a blank.

By adding

{code}
graphics.setClip(getGraphicsState().getCurrentClippingPath());
{code}

in PageDrawer.shfill() just before the graphics.fill() I get several files to render correctly that I hadn't before. (Setting null will probably do the same, didn't test that yet).

The following PDFs are rendered correctly with the change:
- McAfee-ShadingType7.pdf
- eci_altona-test-suite-v2_technical_H.pdf
- crestron-p9.pdf
  (these three found in PDFBOX-1915)
- PDFBOX-1451.pdf (alfresco)
- PDFBOX-1940.pdf (chart)
- PDFBOX-1861-tracemonkey.pdf p.11

Not solved by the change:
- PDFBOX-2098-asyTUG.pdf p.6 (this one doesn't use shfill)
- PDFBOX-1861-tracemonkey.pdf p.6 (not shading)
- PDFBOX-1416.pdf (not shading)
- texample-rgb-triangle.pdf (John has an explanation about that one)

WDYT? Is there any reason NOT to set the clipping path in PageDrawer.shFill() ?
[jira] [Updated] (PDFBOX-2141) Shading not applied to text
[ https://issues.apache.org/jira/browse/PDFBOX-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-2141: Affects Version/s: 1.8.7 Shading not applied to text --- Key: PDFBOX-2141 URL: https://issues.apache.org/jira/browse/PDFBOX-2141 Project: PDFBox Issue Type: Bug Components: Rendering Affects Versions: 1.8.5, 1.8.6, 1.8.7, 2.0.0 Reporter: Petr Slaby Priority: Minor Attachments: 04_ShadingPatternTextPDF.pdf, PDFBOX-1917.pdf-1.png, PDFBOX-1917.pdf-1.png-diff.png, PDFBOX-1917.pdf-9.png, PDFBOX-1917.pdf-9.png-diff.png, PDFBOX-2135.pdf-2.png, PDFBOX-2135.pdf-2.png-diff.png, PageDrawer.writeFont.java.patch, pattern-shading-2-4-idMatrix.pdf, pattern-shading-2-4-idMatrix1.jpg, pattern-shading-2-4.ps The attached PDF draws a text filled with horizontal shading going from red to blue. When rendered via PDFBox, the text is completely filled with red. The problem is that AxialShadingContext#getRaster() gets called with positions that completely fell outside of the range stored in its coords[] field. The fix seems to be to set glyph transform rather than graphics2d transform in PageDrawer#writeText() as shown in the attached patch. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (PDFBOX-2153) Setting the correct clipping path for shading
[ https://issues.apache.org/jira/browse/PDFBOX-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr resolved PDFBOX-2153.
Resolution: Fixed
Fix Version/s: 2.0.0, 1.8.7
Assignee: Tilman Hausherr

Setting the correct clipping path for shading
Key: PDFBOX-2153
URL: https://issues.apache.org/jira/browse/PDFBOX-2153
Project: PDFBox
Issue Type: Bug
Components: Rendering
Affects Versions: 1.8.5, 1.8.6, 1.8.7, 2.0.0
Reporter: Tilman Hausherr
Assignee: Tilman Hausherr
Labels: shading, shadingpattern
Fix For: 1.8.7, 2.0.0

While doing tests with the file eci_altona-test-suite-v2_technical_H.pdf (uncompressed) of PDFBOX-1915 I noticed that by removing a W (modifies the clipping region) operator of a type 7 shading I got a lot more correct shadings (type 6 and lower). It looked like PDFBox had been using the clipping of the type 7 when drawing the type 6, which is just a rectangle above in that rendering. This resulted in a blank.

By adding

{code}
graphics.setClip(getGraphicsState().getCurrentClippingPath());
{code}

in PageDrawer.shfill() just before the graphics.fill() I get several files to render correctly that I hadn't before. (Setting null will probably do the same, didn't test that yet).

The following PDFs are rendered correctly with the change:
- McAfee-ShadingType7.pdf
- eci_altona-test-suite-v2_technical_H.pdf
- crestron-p9.pdf
  (these three found in PDFBOX-1915)
- PDFBOX-1451.pdf (alfresco)
- PDFBOX-1940.pdf (chart)
- PDFBOX-1861-tracemonkey.pdf p.11

Not solved by the change:
- PDFBOX-2098-asyTUG.pdf p.6 (this one doesn't use shfill)
- PDFBOX-1861-tracemonkey.pdf p.6 (not shading)
- PDFBOX-1416.pdf (not shading)
- texample-rgb-triangle.pdf (John has an explanation about that one)

WDYT? Is there any reason NOT to set the clipping path in PageDrawer.shFill() ?
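The effect of a stale clip on a later fill can be reproduced with plain Java2D. The sketch below is only an illustration under assumed shapes and sizes: it uses a BufferedImage and rectangles in place of PageDrawer and real clipping paths, and the class and method names are hypothetical.

```java
import java.awt.Color;
import java.awt.Graphics2D;
import java.awt.Rectangle;
import java.awt.Shape;
import java.awt.image.BufferedImage;

public class ClipBeforeFillSketch {
    /** Fills target in blue, optionally resetting the clip first, and returns the image. */
    static BufferedImage fill(Shape staleClip, Shape currentClip, Shape target, boolean resetClip) {
        BufferedImage image = new BufferedImage(100, 100, BufferedImage.TYPE_INT_RGB);
        Graphics2D graphics = image.createGraphics();
        // Clip left over from an earlier drawing operation.
        graphics.setClip(staleClip);
        if (resetClip) {
            // Counterpart of setting the current clipping path before the fill.
            graphics.setClip(currentClip);
        }
        graphics.setColor(Color.BLUE);
        graphics.fill(target);
        graphics.dispose();
        return image;
    }

    public static void main(String[] args) {
        Shape stale = new Rectangle(0, 0, 10, 10);
        Shape current = new Rectangle(0, 0, 100, 100);
        Shape target = new Rectangle(40, 40, 20, 20);
        // Without the reset the fill is clipped away and the pixel stays black.
        System.out.printf("%06X%n", fill(stale, current, target, false).getRGB(50, 50) & 0xFFFFFF);
        // With the reset the fill reaches the target area.
        System.out.printf("%06X%n", fill(stale, current, target, true).getRGB(50, 50) & 0xFFFFFF);
    }
}
```

The first case corresponds to the blank area described in the issue: the fill itself is correct but lands entirely outside the clip that happens to be active.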
[jira] [Updated] (PDFBOX-2151) Replace log4j with commons logging
[ https://issues.apache.org/jira/browse/PDFBOX-2151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-2151: Affects Version/s: 1.8.7 Replace log4j with commons logging -- Key: PDFBOX-2151 URL: https://issues.apache.org/jira/browse/PDFBOX-2151 Project: PDFBox Issue Type: Improvement Components: Preflight Affects Versions: 1.8.6, 1.8.7, 2.0.0 Reporter: Tilman Hausherr Assignee: Tilman Hausherr Priority: Minor Suggested by Simon Steiner on the dev list: Should pdfbox move few bits of log4j to commons logging? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (PDFBOX-2154) NPE while rendering files with type3 fonts
[ https://issues.apache.org/jira/browse/PDFBOX-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-2154: Affects Version/s: 1.8.7 NPE while rendering files with type3 fonts -- Key: PDFBOX-2154 URL: https://issues.apache.org/jira/browse/PDFBOX-2154 Project: PDFBox Issue Type: Bug Affects Versions: 1.8.3, 1.8.4, 1.8.5, 1.8.6, 1.8.7 Reporter: Tilman Hausherr Labels: type3 I get this NPE with the files of PDFBOX-1145, PDFBOX-1794, PDFBOX-2023 in 1.8 only: java.lang.NullPointerException at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:210) at org.apache.pdfbox.pdmodel.font.Type3StreamParser.createImage(Type3StreamParser.java:59) at org.apache.pdfbox.pdmodel.font.PDType3Font.createImageIfNecessary(PDType3Font.java:80) at org.apache.pdfbox.pdmodel.font.PDType3Font.drawString(PDType3Font.java:102) at org.apache.pdfbox.pdfviewer.PageDrawer.processTextPosition(PageDrawer.java:256) at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:499) at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:45) at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:557) at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:268) at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:235) at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:215) at org.apache.pdfbox.pdfviewer.PageDrawer.drawPage(PageDrawer.java:135) at org.apache.pdfbox.pdmodel.PDPage.convertToImage(PDPage.java:801) at org.apache.pdfbox.util.TestPDFToImage.doTestFile(TestPDFToImage.java:232) at org.apache.pdfbox.util.TestPDFToImage.testRenderImage(TestPDFToImage.java:344) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at 
java.lang.reflect.Method.invoke(Method.java:606) at junit.framework.TestCase.runTest(TestCase.java:168) at junit.framework.TestCase.runBare(TestCase.java:134) at junit.framework.TestResult$1.protect(TestResult.java:110) at junit.framework.TestResult.runProtected(TestResult.java:128) at junit.framework.TestResult.run(TestResult.java:113) at junit.framework.TestCase.run(TestCase.java:124) at junit.framework.TestSuite.runTest(TestSuite.java:232) at junit.framework.TestSuite.run(TestSuite.java:227) at junit.textui.TestRunner.doRun(TestRunner.java:116) at junit.textui.TestRunner.start(TestRunner.java:180) at junit.textui.TestRunner.main(TestRunner.java:138) at org.apache.pdfbox.util.TestPDFToImage.main(TestPDFToImage.java:394) After fixing PDFStreamEngine.processStream() like this {code} if (aPage == null) { graphicsState = new PDGraphicsState(); } else { graphicsState = new PDGraphicsState(aPage.findCropBox()); } {code} I get another NPE: java.lang.NullPointerException at org.apache.pdfbox.pdmodel.font.Type3StreamParser.createImage(Type3StreamParser.java:60) at org.apache.pdfbox.pdmodel.font.PDType3Font.createImageIfNecessary(PDType3Font.java:80) at org.apache.pdfbox.pdmodel.font.PDType3Font.drawString(PDType3Font.java:102) at org.apache.pdfbox.pdfviewer.PageDrawer.processTextPosition(PageDrawer.java:256) at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:506) at org.apache.pdfbox.util.operator.ShowText.process(ShowText.java:45) at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:564) at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:275) at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:242) at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:222) at org.apache.pdfbox.pdfviewer.PageDrawer.drawPage(PageDrawer.java:135) at org.apache.pdfbox.pdmodel.PDPage.convertToImage(PDPage.java:801) at 
org.apache.pdfbox.util.TestPDFToImage.doTestFile(TestPDFToImage.java:232) at org.apache.pdfbox.util.TestPDFToImage.testRenderImage(TestPDFToImage.java:344) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at
[jira] [Updated] (PDFBOX-1940) Faulty pdf-image rendering
[ https://issues.apache.org/jira/browse/PDFBOX-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-1940: Affects Version/s: 1.8.7 Faulty pdf-image rendering --- Key: PDFBOX-1940 URL: https://issues.apache.org/jira/browse/PDFBOX-1940 Project: PDFBox Issue Type: Bug Components: Rendering Affects Versions: 1.8.6, 1.8.7, 2.0.0 Reporter: Daniel Kozimor Assignee: Tilman Hausherr Fix For: 1.8.7, 2.0.0 Attachments: PDFBOX-1940-v1.8.jpg, input.pdf, output.jpg A particular PDF is producing improper output jpg. The pdf in question, as well as the produced jpg can be found attached to this issue. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (PDFBOX-1940) Faulty pdf-image rendering
[ https://issues.apache.org/jira/browse/PDFBOX-1940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr resolved PDFBOX-1940. - Resolution: Fixed Fix Version/s: 1.8.7 Faulty pdf-image rendering --- Key: PDFBOX-1940 URL: https://issues.apache.org/jira/browse/PDFBOX-1940 Project: PDFBox Issue Type: Bug Components: Rendering Affects Versions: 1.8.6, 1.8.7, 2.0.0 Reporter: Daniel Kozimor Assignee: Tilman Hausherr Fix For: 1.8.7, 2.0.0 Attachments: PDFBOX-1940-v1.8.jpg, input.pdf, output.jpg A particular PDF is producing improper output jpg. The pdf in question, as well as the produced jpg can be found attached to this issue. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Resolved] (PDFBOX-2151) Replace log4j with commons logging
[ https://issues.apache.org/jira/browse/PDFBOX-2151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr resolved PDFBOX-2151. - Resolution: Fixed Fix Version/s: 2.0.0 1.8.7 Replace log4j with commons logging -- Key: PDFBOX-2151 URL: https://issues.apache.org/jira/browse/PDFBOX-2151 Project: PDFBox Issue Type: Improvement Components: Preflight Affects Versions: 1.8.6, 1.8.7, 2.0.0 Reporter: Tilman Hausherr Assignee: Tilman Hausherr Priority: Minor Fix For: 1.8.7, 2.0.0 Suggested by Simon Steiner on the dev list: Should pdfbox move few bits of log4j to commons logging? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PDFBOX-2125) Solr throws exception error
[ https://issues.apache.org/jira/browse/PDFBOX-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14040165#comment-14040165 ] Tilman Hausherr commented on PDFBOX-2125: - Sometimes PDF files have an empty password. To check whether this is the case here, the best would be to test with a PDF that has an empty password. Solr throws exception error --- Key: PDFBOX-2125 URL: https://issues.apache.org/jira/browse/PDFBOX-2125 Project: PDFBox Issue Type: Bug Components: PDModel Affects Versions: 1.8.4 Environment: Java; Windows 7; Tika; Solr; Reporter: Zack Honig Labels: Solr, Tika Original Estimate: 168h Remaining Estimate: 168h When uploading a PDF document and it renders through SOLR, this error is thrown:
54440 [qtp1311760211-12] ERROR org.apache.solr.core.SolrCore – org.apache.solr.common.SolrException: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.pdf.PDFParser@4d711a77
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:225)
    at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:241)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1952)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:774)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:418)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:207)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:522)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.eclipse.jetty.server.Server.handle(Server.java:368)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
    at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
    at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:647)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
    at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
    at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.tika.exception.TikaException: Unexpected RuntimeException from org.apache.tika.parser.pdf.PDFParser@4d711a77
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:242)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:124)
    at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:219)
    ... 32 more
Caused by: java.lang.ArrayIndexOutOfBoundsException
[jira] [Closed] (PDFBOX-2143) Unable to add png file in to pdf
[ https://issues.apache.org/jira/browse/PDFBOX-2143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr closed PDFBOX-2143. --- Resolution: Cannot Reproduce Closing as we can't help you without getting more info. You can still reopen. Some more hints: - try with an ordinary PNG, just to be sure that something works - make sure you have the latest version (1.8.6) and preferably a recent JDK - look at the source code of PDFBox to see how it is done, just to be sure that you are doing it properly (search for new PDPixelMap) - try one of the examples (AddImageToPDF) Unable to add png file in to pdf Key: PDFBOX-2143 URL: https://issues.apache.org/jira/browse/PDFBOX-2143 Project: PDFBox Issue Type: Bug Reporter: Suresh Dhanapal I am trying to draw png file in to pdf, I am getting out of memory error. Could you suggest me how to resolve? -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (PDFBOX-2141) Shading not applied to text
[ https://issues.apache.org/jira/browse/PDFBOX-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14040233#comment-14040233 ] Petr Slaby commented on PDFBOX-2141: {quote} pattern-shading-2-4-idMatrix.pdf ... {quote} But the problem does not seem to be related to this issue. At least I get an identical rendering before and after the change made in revision 1604282. Shading not applied to text --- Key: PDFBOX-2141 URL: https://issues.apache.org/jira/browse/PDFBOX-2141 Project: PDFBox Issue Type: Bug Components: Rendering Affects Versions: 1.8.5, 1.8.6, 1.8.7, 2.0.0 Reporter: Petr Slaby Priority: Minor Attachments: 04_ShadingPatternTextPDF.pdf, PDFBOX-1917.pdf-1.png, PDFBOX-1917.pdf-1.png-diff.png, PDFBOX-1917.pdf-9.png, PDFBOX-1917.pdf-9.png-diff.png, PDFBOX-2135.pdf-2.png, PDFBOX-2135.pdf-2.png-diff.png, PageDrawer.writeFont.java.patch, pattern-shading-2-4-idMatrix.pdf, pattern-shading-2-4-idMatrix1.jpg, pattern-shading-2-4.ps The attached PDF draws a text filled with horizontal shading going from red to blue. When rendered via PDFBox, the text is completely filled with red. The problem is that AxialShadingContext#getRaster() gets called with positions that completely fell outside of the range stored in its coords[] field. The fix seems to be to set glyph transform rather than graphics2d transform in PageDrawer#writeText() as shown in the attached patch. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Created] (PDFBOX-2158) ExtractText missing most of text in this PDF file, due to font bounding box with minus infinity
Joel Hirsh created PDFBOX-2158: -- Summary: ExtractText missing most of text in this PDF file, due to font bounding box with minus infinity Key: PDFBOX-2158 URL: https://issues.apache.org/jira/browse/PDFBOX-2158 Project: PDFBox Issue Type: Bug Components: Text extraction Affects Versions: 1.8.5 Environment: Windows x64 Reporter: Joel Hirsh Attached PDF file is missing most of the text when processed by the ExtractText example program. I traced it down to PDFontDescriptorDictionary.getFontBoundingBox() getting a rectangle for COSName.FONT_BBOX that contained a ymin value of minus infinity. That method then creates a PDRectangle which calculates a bounding box with a ymin value of -65,329, and results in an enormous text size, and things go downhill from there. The text cannot be matched up, and most of it ends up being discarded. I was able to hack a fix by doing a check in the constructor PDRectangle.PDRectangle( COSArray array ) for big negative numbers and setting them to 0. With that change, all the text came through as expected. However, I don't have enough familiarity with the code to understand what a real fix ought to look like. The PDF file renders fine in other programs such as Acrobat and NitroPDF. -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (PDFBOX-2158) ExtractText missing most of text in this PDF file, due to font bounding box with minus infinity
[ https://issues.apache.org/jira/browse/PDFBOX-2158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joel Hirsh updated PDFBOX-2158: --- Attachment: negative.text.box.pdf File that exhibits this problem ExtractText missing most of text in this PDF file, due to font bounding box with minus infinity -- Key: PDFBOX-2158 URL: https://issues.apache.org/jira/browse/PDFBOX-2158 Project: PDFBox Issue Type: Bug Components: Text extraction Affects Versions: 1.8.5 Environment: Windows x64 Reporter: Joel Hirsh Attachments: negative.text.box.pdf Attached PDF file is missing most of the text when processed by the ExtractText example program. I traced it down to PDFontDescriptorDictionary.getFontBoundingBox() getting a rectangle for COSName.FONT_BBOX that contained a ymin value of minus infinity. That method then creates a PDRectangle which calculates a bounding box with a ymin value of -65,329, and results in an enormous text size, and things go downhill from there. The text cannot be matched up, and most of it ends up being discarded. I was able to hack a fix by doing a check in the constructor PDRectangle.PDRectangle( COSArray array ) for big negative numbers and setting them to 0. With that change, all the text came through as expected. However, I don't have enough familiarity with the code to understand what a real fix ought to look like. The PDF file renders fine in other programs such as Acrobat and NitroPDF. -- This message was sent by Atlassian JIRA (v6.2#6252)
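The workaround described in PDFBOX-2158 (replace implausible bounding box entries with 0) could look roughly like the sketch below. It is not the reporter's patch and not PDFBox code: the class FontBBoxSanitizer, the LIMIT threshold and the use of a plain float[] instead of a COSArray are assumptions made for illustration. It also goes slightly beyond the report by treating NaN, infinite and very large positive values the same way as big negative ones.

```java
// Hypothetical helper; PDFBox itself works on COSArray/PDRectangle, not float[].
final class FontBBoxSanitizer {

    // Assumed cut-off: font bounding boxes are normally a few thousand glyph units at most.
    static final float LIMIT = 32768f;

    /** Returns a copy of bbox with NaN, infinite or implausibly large entries replaced by 0. */
    static float[] sanitize(float[] bbox) {
        float[] result = new float[bbox.length];
        for (int i = 0; i < bbox.length; i++) {
            float value = bbox[i];
            boolean implausible = Float.isNaN(value)
                    || Float.isInfinite(value)
                    || Math.abs(value) > LIMIT;
            result[i] = implausible ? 0f : value;
        }
        return result;
    }
}
```

For example, sanitize(new float[] {-10f, Float.NEGATIVE_INFINITY, 1000f, 900f}) keeps -10, 1000 and 900 and replaces the ymin entry with 0. Whether 0 is the right substitute, or whether the box should instead be derived from the font program, is left open here as it is in the report.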
[jira] [Commented] (PDFBOX-2141) Shading not applied to text
[ https://issues.apache.org/jira/browse/PDFBOX-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14040251#comment-14040251 ] Tilman Hausherr commented on PDFBOX-2141: - Yes... sorry, I didn't make myself clear, it is not a regression, I forgot to mention that I also tested with earlier versions. What really puzzles me is that both the http://pslib.sourceforge.net/shading.ps file and my file use a pattern-based approach and one works and the other doesn't. But I haven't had the time to analyze what really happens. Shading not applied to text --- Key: PDFBOX-2141 URL: https://issues.apache.org/jira/browse/PDFBOX-2141 Project: PDFBox Issue Type: Bug Components: Rendering Affects Versions: 1.8.5, 1.8.6, 1.8.7, 2.0.0 Reporter: Petr Slaby Priority: Minor Attachments: 04_ShadingPatternTextPDF.pdf, PDFBOX-1917.pdf-1.png, PDFBOX-1917.pdf-1.png-diff.png, PDFBOX-1917.pdf-9.png, PDFBOX-1917.pdf-9.png-diff.png, PDFBOX-2135.pdf-2.png, PDFBOX-2135.pdf-2.png-diff.png, PageDrawer.writeFont.java.patch, pattern-shading-2-4-idMatrix.pdf, pattern-shading-2-4-idMatrix1.jpg, pattern-shading-2-4.ps The attached PDF draws a text filled with horizontal shading going from red to blue. When rendered via PDFBox, the text is completely filled with red. The problem is that AxialShadingContext#getRaster() gets called with positions that completely fell outside of the range stored in its coords[] field. The fix seems to be to set glyph transform rather than graphics2d transform in PageDrawer#writeText() as shown in the attached patch. -- This message was sent by Atlassian JIRA (v6.2#6252)
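The difference between setting the glyph transform and setting the graphics2d transform, as proposed in the PDFBOX-2141 description, can be illustrated with plain Java2D. This is only an analogy under stated assumptions: it uses java.awt.GradientPaint rather than PDFBox's AxialShadingContext, a 10x10 rectangle stands in for a glyph outline, and the class GlyphTransformSketch is hypothetical. In Java2D a Paint lives in user space, so changing the Graphics2D transform moves the gradient together with the glyph, while transforming only the glyph shape leaves the gradient anchored to the page.

```java
import java.awt.Color;
import java.awt.GradientPaint;
import java.awt.Graphics2D;
import java.awt.Shape;
import java.awt.geom.AffineTransform;
import java.awt.geom.Rectangle2D;
import java.awt.image.BufferedImage;

// Hypothetical illustration; not taken from PageDrawer or the attached patch.
final class GlyphTransformSketch {

    /** Returns the RGB value painted at x=55 for a "glyph" positioned at x=50. */
    static int paintedColor(boolean transformGlyph) {
        BufferedImage image = new BufferedImage(100, 20, BufferedImage.TYPE_INT_RGB);
        Graphics2D g = image.createGraphics();
        try {
            // Horizontal shading over the whole page width: red at x=0, blue at x=100.
            g.setPaint(new GradientPaint(0f, 0f, Color.RED, 100f, 0f, Color.BLUE));
            Shape glyph = new Rectangle2D.Float(0f, 0f, 10f, 10f);
            AffineTransform textPosition = AffineTransform.getTranslateInstance(50, 0);
            if (transformGlyph) {
                // Move only the outline; the shading stays in page coordinates.
                g.fill(textPosition.createTransformedShape(glyph));
            } else {
                // Move the whole user space; the shading moves with the glyph.
                g.transform(textPosition);
                g.fill(glyph);
            }
        } finally {
            g.dispose();
        }
        return image.getRGB(55, 5) & 0xFFFFFF;
    }
}
```

With transformGlyph set to false the sampled pixel is close to pure red, much like the all-red text in the report; with transformGlyph set to true it is a red/blue mix, as expected in the middle of the shading.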
[jira] [Commented] (PDFBOX-2149) Font Refactoring
[ https://issues.apache.org/jira/browse/PDFBOX-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14040349#comment-14040349 ] John Hewson commented on PDFBOX-2149: - Ok, the underlying problem which caused an NPE in getFontDescriptor() is solved in [r1604679|http://svn.apache.org/r1604679]. System TrueType fonts now synthesise a FontDescriptor in the same manner as Type1 fonts. Font Refactoring Key: PDFBOX-2149 URL: https://issues.apache.org/jira/browse/PDFBOX-2149 Project: PDFBox Issue Type: Improvement Components: FontBox, PDModel Affects Versions: 2.0.0 Reporter: John Hewson Assignee: John Hewson Attachments: 39.pdf, 000467.pdf To fix bugs such as PDFBOX-2140 and to enable Unicode TTF embedding we need to sort out long-standing font/text encoding issues. The main issue is that encoding is done in an ad-hoc manner, sometimes in the PDFont subclasses, sometimes elsewhere. For example TTFGlyph2D does its own decoding, and this code is copy pasted into PDTrueTypeFont. Likewise, PDFont handles CMaps and Encodings despite the fact that these two encoding methods are mutually exclusive. The end result is that the process of reading Encodings/CMaps is often following rules which are completely invalid for that font type but mostly work by luck. Phase 1 - Refactor PDFont subclasses to remove setXXX methods which allow the object to be corrupted. Proper use of inheritance can remove all cases where public setXXX methods are used during font loading. - Clean up TTF loading and the loadTTF in anticipation of Unicode TTF embedding, FontBox's TrueTypeFont class is externally mutable via setXXX methods used only by TTFParser: these can be made package-private. - the Encoding class and EncodingManager could do with some cleaning up prior to further refactoring. - PDSimpleFont does not do anything, its functionality should be moved into its superclass, PDFont.
- PDFont#determineEncoding() loads CMaps when only Encodings are applicable, and vice versa. Loading needs to be pushed down into the appropriate subclasses, as a starting point the relevant code should at least be copied into the relevant subclasses ready for further refactoring. - TTFGlyph2D does its own decoding of char codes, rather than using the font's #encode method (fair enough because #encode is broken) and there's a copy and pasted version of the same code in PDTrueTypeFont - we need to consolidate this code into PDTrueTypeFont where it belongs. Phase 2 - Refactor loading of CMaps and Encodings from font dictionaries, this will involve changes to PDFont and its subclasses to delegate loading to subclasses where it can be properly encapsulated - May need to alter the class hierarchy w.r.t. CIDFont to facilitate this, as CIDFont isn't really a PDFont - its parent Type0 font is responsible for its CMap. We'll see. Phase 3 - Refactor the decoding of character codes by PDFont and its subclasses, this will involve replacing the #getCodeFromArray, #encode and #encodeToCID methods. - Fix decoding of content stream character codes in PDFStreamEngine, using the newly refactored PDFont and using the current font's CMap to determine the code width. Phase 4 - Add support for generating embedded TTFs with Unicode -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Comment Edited] (PDFBOX-2149) Font Refactoring
[ https://issues.apache.org/jira/browse/PDFBOX-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14040349#comment-14040349 ] John Hewson edited comment on PDFBOX-2149 at 6/23/14 2:20 AM: -- Ok, the underlying problem which caused an NPE in getFontDescriptor() is solved in [r1604679|http://svn.apache.org/r1604679] and [r1604681|http://svn.apache.org/r1604681]. System TrueType fonts now synthesise a FontDescriptor in the same manner as Type1 fonts. was (Author: jahewson): Ok, the underlying problem which caused an NPE in getFontDescriptor() is solved in [r1604679|http://svn.apache.org/r1604679]. System TrueType fonts now synthesise a FontDescriptor in the same manner as Type1 fonts. Font Refactoring Key: PDFBOX-2149 URL: https://issues.apache.org/jira/browse/PDFBOX-2149 Project: PDFBox Issue Type: Improvement Components: FontBox, PDModel Affects Versions: 2.0.0 Reporter: John Hewson Assignee: John Hewson Attachments: 39.pdf, 000467.pdf To fix bugs such as PDFBOX-2140 and to enable Unicode TTF embedding we need to sort out long-standing font/text encoding issues. The main issue is that encoding is done in an ad-hoc manner, sometimes in the PDFont subclasses, sometimes elsewhere. For example TTFGlyph2D does its own decoding, and this code is copy pasted into PDTrueTypeFont. Likewise, PDFont handles CMaps and Encodings despite the fact that these two encoding methods are mutually exclusive. The end result is that the process of reading Encodings/CMaps is often following rules which are completely invalid for that font type but mostly work by luck. Phase 1 - Refactor PDFont subclasses to remove setXXX methods which allow the object to be corrupted. Proper use of inheritance can remove all cases where public setXXX methods are used during font loading.
- Clean up TTF loading and the loadTTF in anticipation of Unicode TTF embedding, FontBox's TrueTypeFont class is externally mutable via setXXX methods used only by TTFParser: these can be made package-private. - the Encoding class and EncodingManager could do with some cleaning up prior to further refactoring. - PDSimpleFont does not do anything, its functionality should be moved into its superclass, PDFont. - PDFont#determineEncoding() loads CMaps when only Encodings are applicable, and vice versa. Loading needs to be pushed down into the appropriate subclasses, as a starting point the relevant code should at least be copied into the relevant subclasses ready for further refactoring. - TTFGlyph2D does its own decoding of char codes, rather than using the font's #encode method (fair enough because #encode is broken) and there's a copy and pasted version of the same code in PDTrueTypeFont - we need to consolidate this code into PDTrueTypeFont where it belongs. Phase 2 - Refactor loading of CMaps and Encodings from font dictionaries, this will involve changes to PDFont and its subclasses to delegate loading to subclasses where it can be properly encapsulated - May need to alter the class hierarchy w.r.t. CIDFont to facilitate this, as CIDFont isn't really a PDFont - its parent Type0 font is responsible for its CMap. We'll see. Phase 3 - Refactor the decoding of character codes by PDFont and its subclasses, this will involve replacing the #getCodeFromArray, #encode and #encodeToCID methods. - Fix decoding of content stream character codes in PDFStreamEngine, using the newly refactored PDFont and using the current font's CMap to determine the code width. Phase 4 - Add support for generating embedded TTFs with Unicode -- This message was sent by Atlassian JIRA (v6.2#6252)
[ANNOUNCE] Apache PDFBox 1.8.6 released
The Apache PDFBox community is pleased to announce the release of Apache PDFBox version 1.8.6. The release is available for download at:

http://pdfbox.apache.org/downloads.html

See the full release notes below for details about this release.

Release Notes -- Apache PDFBox -- Version 1.8.6

Introduction

The Apache PDFBox library is an open source Java tool for working with PDF documents. This is an incremental bugfix release based on the earlier 1.8.5 release. It contains a couple of fixes and small improvements. For more details on all fixes included in this release, please refer to the following issues on the PDFBox issue tracker at https://issues.apache.org/jira/browse/PDFBOX.

Bug

[PDFBOX-54] - please correct the SetField example
[PDFBOX-62] - Incorrect (zero) character widths returned in some docs
[PDFBOX-239] - PDFToImage prints every word at the start of the line
[PDFBOX-934] - ImageToPDF.createPDFFromImage causes problems for certain TIFF inputs
[PDFBOX-1474] - PDDocument.decrypt does not throws InvalidPasswordException
[PDFBOX-1689] - Partial failure to render PDF
[PDFBOX-1713] - [PATCH] Bullet character not rendered
[PDFBOX-1756] - ClassCastException CosString cannot be cast to COSName
[PDFBOX-1845] - PDDocument.load() give Error: Expected a long type at offset 1633
[PDFBOX-1895] - Type0 settings /Registry and /Ordering are not decrypted when writing document
[PDFBOX-1922] - NonSequentialParser not reading version in header and trailer
[PDFBOX-2047] - read operations alter PDLab object
[PDFBOX-2050] - Add predictor to LZW filter
[PDFBOX-2054] - Remove System.out.println()
[PDFBOX-2056] - incomplete build tests
[PDFBOX-2057] - Importing BufferedImage into PDPixelMap is broken in 1.8.5
[PDFBOX-2058] - The text of pdfs using Type1C can't be extracted correct
[PDFBOX-2063] - Incomplete EOF detection in ASCIIHexFilter
[PDFBOX-2064] - ArrayIndexOutOfBoundsException in CompositeImage.createMaskedImage
[PDFBOX-2072] - Wrong calculation of space char width in PDFStreamEngine
[PDFBOX-2073] - PDF files with unusual Japanese font can not be rewrite correctly
[PDFBOX-2074] - 4-bytes CMap entry causes exception
[PDFBOX-2079] - Extra new line characters extracted in 1.8.5 for embedded files leading to ZipFile exception in Java 1.6
[PDFBOX-2080] - Barcode getting color inverted in pdf to image conversion
[PDFBOX-2082] - signing corrupts PDF when signature exactly fits allocated space
[PDFBOX-2095] - Useless memory allocation in GlyfDescript
[PDFBOX-2096] - ICC profile ignored if number of components is 1
[PDFBOX-2100] - Gouraud shading doesn't work with function
[PDFBOX-2101] - Surprising memory consumption when extracting images
[PDFBOX-2102] - Characters swallowed on COSString.getString()
[PDFBOX-2109] - CFFParser uses String constructor without encoding
[PDFBOX-2110] - Font not found: CourierNew
[PDFBOX-2111] - Cast error in Gouraud shadings
[PDFBOX-2114] - ObjStm is being processed to late
[PDFBOX-2115] - Use unfiltered stream in gouraud shadings
[PDFBOX-2120] - Regression: Type 1 font corrupted
[PDFBOX-2122] - FontBox's TTFDataStream doesn't set timezone in readInternationalDate

Improvement

[PDFBOX-712] - SecurityHandlersManager May stop the application Server when running PDFParser in a Servlet.
[PDFBOX-1596] - OverlayPDF logic should be moved into a library class
[PDFBOX-1739] - Load document error for two RegisSTAR documents
[PDFBOX-2034] - TestFilters is non-deterministic
[PDFBOX-2052] - PDFCloneUtility does not handle COSStreamArray
[PDFBOX-2066] - RubberStampWithImage should support more image types
[PDFBOX-2084] - Make TestImageIOUtils optional in 1.8 for Fedora packaging
[PDFBOX-2105] - Support for multipage TIFFs in CCITTFactory, makes PDFBox capable of doing tiff2pdf
[PDFBOX-2129] - Add PDFBox version to the title
[PDFBOX-1600] - COSDocument and PDDocument declare throws IOException when they don't
[PDFBOX-1584] - Add unit test for RandomAccessFileOutputStream

Release Contents

This release consists of a single source archive packaged as a zip file. The archive can be unpacked with the jar tool from your JDK installation. See the README.txt file for instructions on how to build this release. The source archive is accompanied by SHA1 and MD5 checksums and a PGP signature that you can use to verify the authenticity of your download. The public key used for the PGP signature can be found at https://svn.apache.org/repos/asf/pdfbox/KEYS.

About Apache PDFBox

Apache PDFBox is an open source Java library for working with PDF documents. This project allows creation of new PDF documents, manipulation of existing documents and the ability to extract content from documents. Apache PDFBox also includes several command line utilities. Apache PDFBox is published under the Apache License, Version 2.0. For more information, visit http://pdfbox.apache.org/

About The