[jira] [Commented] (PDFBOX-4396) Memory leak due to soft reference caching
[ https://issues.apache.org/jira/browse/PDFBOX-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711099#comment-16711099 ] Ben Manes commented on PDFBOX-4396: --- This is probably covered in your referenced tickets. In ScratchFileBuffer it states, {code:java} /** * While calling finalize is normally discouraged we will have to * use it here as long as closing a scratch file buffer is not * done in every case. Currently {@link COSStream} creates new * buffers without closing the old one - which might still be * used. * * Enabling debugging one will see if there are still cases * where the buffer is not closed. */{code} I wasn't able to reproduce the problem in isolation on the PDF document that failed (450 MB, 999 pages). I could process it locally in ~12 minutes, the same as Ghostscript. It may be due to the additional load put on the machine: the processing is CPU heavy, I process multiple PDFs and pages in parallel, and there is other incoming work. As G1 is quota driven, the CPU thrashing likely prevents it from finishing its work within the desired timeframes. When it exhausts its quota and is unable to keep up, that would eventually lead to an OOME. Since Java lacks functioning thread priorities, we can't de-emphasize application threads in favor of the collector. If G1 has moved away from stop-the-world to failing, then it cannot recover in this scenario. Since G1 has constantly changed, it's hard to pinpoint, as descriptions from years ago are no longer accurate, and likely they optimized against handling this case, preferring that the application be fixed to be better behaved. So far my fixes do seem to be chugging along and past the failure point, but there is still more work before it's in the clear. I disabled caching (no obvious perf hit), discard a PDDocument every 25 pages, and call GC each time a PDDocument is closed.
I may look into using Ghostscript and lambda functions instead, to distribute the work and offload from application servers. In regards to JDK10, there are some build tools not yet JDK11 compatible that I am waiting on. It takes some work to be JDK9 compatible, though 9=>10 was effortless. The 11 transition is more work due to additional module removals. I have 11 prototyped, but it's stuck on an infinite compilation bug, due to using Gradle 4.x (incompatible) and a plugin not yet released with Gradle 5 support. > Memory leak due to soft reference caching > - > > Key: PDFBOX-4396 > URL: https://issues.apache.org/jira/browse/PDFBOX-4396 > Project: PDFBox > Issue Type: Bug >Affects Versions: 2.0.12 > Environment: JDK10; G1 >Reporter: Ben Manes >Priority: Major > Attachments: #2 - memory leak 2.png, #2 - memory leak.png, memory > leak 2.png, memory leak.png > > > In a heap dump, it appears that DefaultResourceCache is retaining 5.3 GB of > memory due to buffered images (via PDImageXObject). I suspect that G1 is not > collecting soft references across all regions before it throws an out-of-memory error. > In PDFBOX-4389, I discovered very slow PDDocument#load times due to a JDK10 > I/O bug. Previously I was loading the document to render each page, but this > took 1.5 minutes. To work around that bug I reused the document instance > across pages. This seems to have failed because the pages were cached and not > cleared by the GC. > The DefaultResourceCache does not prune its cache entries when the soft > references are collected. Like WeakHashMap, it should use a ReferenceQueue, > poll it on every access, and prune accordingly. > Thankfully PDDocument#setResourceCache exists. For now I am going to reset > the cache to a new instance after a page has been rendered. The entries > should no longer be reachable and be GC'd more aggressively. If that doesn't > work, I'll either replace the cache (e.g. with Caffeine) or disable it by > setting the instance to null.
> I think the desired fix is to prune the DefaultResourceCache and, ideally, > reconsider usage of soft references (as they tend to be poor in practice). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
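The pruning approach described in the report (use a ReferenceQueue, poll it on every access, and drop stale entries, as WeakHashMap does) could look roughly like the sketch below. This is not PDFBox's DefaultResourceCache; the class name PruningSoftCache and its methods are hypothetical, and the sketch is single-threaded for brevity.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical soft-reference cache that removes map entries once the GC clears them. */
final class PruningSoftCache<K, V> {

    /** Soft reference that remembers its key so the stale map entry can be found again. */
    private static final class KeyedRef<K, V> extends SoftReference<V> {
        final K key;

        KeyedRef(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private final Map<K, KeyedRef<K, V>> entries = new HashMap<>();
    private final ReferenceQueue<V> collected = new ReferenceQueue<>();

    V get(K key) {
        prune();
        KeyedRef<K, V> ref = entries.get(key);
        return ref == null ? null : ref.get();
    }

    void put(K key, V value) {
        prune();
        entries.put(key, new KeyedRef<>(key, value, collected));
    }

    int size() {
        prune();
        return entries.size();
    }

    /** Drains the queue of references the GC has cleared and removes their entries. */
    @SuppressWarnings("unchecked")
    private void prune() {
        Reference<? extends V> ref;
        while ((ref = collected.poll()) != null) {
            KeyedRef<K, V> stale = (KeyedRef<K, V>) ref;
            // Two-argument remove: only drop the entry if it still holds this exact reference,
            // so a newer value stored under the same key is left alone.
            entries.remove(stale.key, stale);
        }
    }
}
```

Without the prune step the map would keep one cleared SoftReference per key indefinitely, which is the retention pattern the report attributes to the cache. A concurrent variant would need a ConcurrentHashMap or external locking.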
[jira] [Closed] (PDFBOX-3388) PDFTextStripper - ScratchFileBuffer not closed!
[ https://issues.apache.org/jira/browse/PDFBOX-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr closed PDFBOX-3388. --- Resolution: Duplicate Closing as duplicate of PDFBOX-3359. I've added you as a watcher there. > PDFTextStripper - ScratchFileBuffer not closed! > --- > > Key: PDFBOX-3388 > URL: https://issues.apache.org/jira/browse/PDFBOX-3388 > Project: PDFBox > Issue Type: Bug >Reporter: Roman Pichlik >Priority: Major > Attachments: CloseablePDFParser.java, PDFStripperTest.java, test.pdf > > > _PDFTextStripper_ or inherently used classes probably do not close all opened > streams under all circumstances. You can reproduce that by the following > snippet of code and the attached PDF file. > {code} > try (RandomAccessBuffer rab = new RandomAccessBuffer(is)) { > PDFParser parser = new PDFParser(rab); > parser.parse(); > try (COSDocument cosDoc = parser.getDocument();PDDocument pdDoc = > new PDDocument(cosDoc);){ > PDFTextStripper pdfStripper = new PDFTextStripper(); > pdfStripper.getText(pdDoc); > } > } catch (IOException e) { > throw new RuntimeException(e); > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Created] (PDFBOX-4398) getLastSignatureDictionary modifies internal structure of PDDocument
beat weisskopf created PDFBOX-4398: -- Summary: getLastSignatureDictionary modifies internal structure of PDDocument Key: PDFBOX-4398 URL: https://issues.apache.org/jira/browse/PDFBOX-4398 Project: PDFBox Issue Type: Bug Components: AcroForm Affects Versions: 2.0.12 Reporter: beat weisskopf If one calls PDDocument#getLastSignatureDictionary, the AcroForm is populated with the defaults even if not needed. This modifies the internals of the PDDocument and therefore there are changes to be saved, even if the file is not modified by "real" changes. For example: {code} PDDocument pdfDocument = PDDocument.load(pdfBytes); pdfDocument.getLastSignatureDictionary(); {code} This calls the verifyOrCreateDefaults() method, which initializes the DR-Dictionary if not yet done. This is done even if getLastSignatureDictionary returns null. Why this bothers me: it is very unexpected behaviour that a getter modifies an object's state. This is no big deal for our use case; the other issue (PDFBOX-4303) was a bigger problem, as we are diffing objects between revisions (current vs last signed revision).
[jira] [Commented] (PDFBOX-4393) PDF signature invalid after second interactive field signed
[ https://issues.apache.org/jira/browse/PDFBOX-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711308#comment-16711308 ] Tilman Hausherr commented on PDFBOX-4393: - [~Pooky] please give feedback on whether the workaround helped; also try the snapshot at https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/2.0.14-SNAPSHOT/ which would make the workaround obsolete. > PDF signature invalid after second interactive field signed > --- > > Key: PDFBOX-4393 > URL: https://issues.apache.org/jira/browse/PDFBOX-4393 > Project: PDFBox > Issue Type: Bug > Components: Signing >Affects Versions: 2.0.12, 2.0.13 > Environment: Windows >Reporter: Martin Klíma >Priority: Major > Fix For: 2.0.14, 3.0.0 PDFBox > > Attachments: image-2018-12-04-11-14-01-586.png, > image-2018-12-04-11-14-32-676.png, streamserve_test_sig0.pdf, > streamserve_test_sig0_saved.pdf, streamserve_test_sig0_saved_signed by > PDFBox.pdf, streamserve_test_sig0_saved_signed by PDFBox_then_signed by > AR.pdf, streamserve_test_sig0_signed by PDFBoxNEW.pdf, > streamserve_test_sig0_signed by PDFBoxNEW_signed by AR.pdf, > streamserve_test_sig2.pdf, streamserve_test_sig3.pdf, > streamserve_test_sig3.signed.pdf > > > Hi guys, > I stumbled on a problem with PDFBox and interactive field signing. I have a > PDF generated with OpenText StreamServe with two interactive fields for > signing. See example 1 (streamserve_test_sig0.pdf) in the attachments. > When I use Adobe Reader I can sign both of the visual fields just fine, but > when I use PDFBox to sign one of these fields, the following signature is marked > as invalid. It doesn't matter if I use PDFBox or sign it manually with Adobe > Reader. See example 3: > * streamserve_test_sig3.pdf (signed by PDFBox) - valid > * streamserve_test_sig3.signed.pdf (signed 2nd by Adobe Reader) - invalid > !image-2018-12-04-11-14-01-586.png|width=582,height=269!
> > Also, a last example: when it's signed first by Adobe Reader and then with > PDFBox, the signature seems valid but it says the document was "certified". > See example 2 (streamserve_test_sig2.pdf) > !image-2018-12-04-11-14-32-676.png|width=591,height=283! > > What could be wrong? When the whole document is signed with PDFBox it works just > fine. > Thanks for your response, > Martin
[jira] [Commented] (PDFBOX-4398) getLastSignatureDictionary modifies internal structure of PDDocument
[ https://issues.apache.org/jira/browse/PDFBOX-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711279#comment-16711279 ] Tilman Hausherr commented on PDFBOX-4398: - This was introduced in PDFBOX-3732 and also badly needed by PDFBOX-4393. Yes, such side effects are usually a no-no, but we replicated the behavior of Adobe. See also the final comment by Maruan in PDFBOX-3732. From your text I understand that it bothers you, but it doesn't break anything. So I'd just target this to 3.0, set it to minor and keep 2.x as is. I won't do anything for now. > getLastSignatureDictionary modifies internal structure of PDDocument > > > Key: PDFBOX-4398 > URL: https://issues.apache.org/jira/browse/PDFBOX-4398 > Project: PDFBox > Issue Type: Bug > Components: AcroForm >Affects Versions: 2.0.12 >Reporter: beat weisskopf >Priority: Minor > Fix For: 3.0.0 PDFBox > > > If one calls PDDocument#getLastSignatureDictionary, the AcroForm is populated > with the defaults even if not needed. This modifies the internals of the > PDDocument and therefore there are changes to be saved, even if the file is > not modified by "real" changes. > For example: > {code} > PDDocument pdfDocument = PDDocument.load(pdfBytes); > pdfDocument.getLastSignatureDictionary(); > {code} > This calls the verifyOrCreateDefaults() method, which initializes the > DR-Dictionary if not yet done. This is done even if > getLastSignatureDictionary returns null. > Why this bothers me: it is very unexpected behaviour that a getter modifies > an object's state. This is no big deal for our use case; the other issue > (PDFBOX-4303) was a bigger problem, as we are diffing objects between > revisions (current vs last signed revision).
[jira] [Updated] (PDFBOX-4398) getLastSignatureDictionary modifies internal structure of PDDocument
[ https://issues.apache.org/jira/browse/PDFBOX-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr updated PDFBOX-4398: Fix Version/s: 3.0.0 PDFBox > getLastSignatureDictionary modifies internal structure of PDDocument > > > Key: PDFBOX-4398 > URL: https://issues.apache.org/jira/browse/PDFBOX-4398 > Project: PDFBox > Issue Type: Bug > Components: AcroForm >Affects Versions: 2.0.12 >Reporter: beat weisskopf >Priority: Minor > Fix For: 3.0.0 PDFBox > > > If one calls PDDocument#getLastSignatureDictionary, the AcroForm is populated > with the defaults even if not needed. This modifies the internals of the > PDDocument and therefore there are changes to be saved, even if the file is > not modified by "real" changes. > For example: > {code} > PDDocument pdfDocument = PDDocument.load(pdfBytes); > pdfDocument.getLastSignatureDictionary(); > {code} > This calls the verifyOrCreateDefaults() method, which initializes the > DR-Dictionary if not yet done. This is done even if > getLastSignatureDictionary returns null. > Why this bothers me: it is very unexpected behaviour that a getter modifies > an object's state. This is no big deal for our use case; the other issue > (PDFBOX-4303) was a bigger problem, as we are diffing objects between > revisions (current vs last signed revision).
[jira] [Comment Edited] (PDFBOX-4393) PDF signature invalid after second interactive field signed
[ https://issues.apache.org/jira/browse/PDFBOX-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710060#comment-16710060 ] Tilman Hausherr edited comment on PDFBOX-4393 at 12/6/18 11:14 AM: --- Commit 1848213 from til...@apache.org in branch 'pdfbox/branches/2.0' [ https://svn.apache.org/r1848213 ] PDFBOX-4303, PDFBOX-4393: mark updated when applicable; check whether font dictionary exists; avoid bug that wrong dictionary was checked for font was (Author: jira-bot): Commit 1848213 from til...@apache.org in branch 'pdfbox/branches/2.0' [ https://svn.apache.org/r1848213 ] PDFBOX-4393: mark updated when applicable; check whether font dictionary exists; avoid bug that wrong dictionary was checked for font > PDF signature invalid after second interactive field signed > --- > > Key: PDFBOX-4393 > URL: https://issues.apache.org/jira/browse/PDFBOX-4393 > Project: PDFBox > Issue Type: Bug > Components: Signing >Affects Versions: 2.0.12, 2.0.13 > Environment: Windows >Reporter: Martin Klíma >Priority: Major > Fix For: 2.0.14, 3.0.0 PDFBox > > Attachments: image-2018-12-04-11-14-01-586.png, > image-2018-12-04-11-14-32-676.png, streamserve_test_sig0.pdf, > streamserve_test_sig0_saved.pdf, streamserve_test_sig0_saved_signed by > PDFBox.pdf, streamserve_test_sig0_saved_signed by PDFBox_then_signed by > AR.pdf, streamserve_test_sig0_signed by PDFBoxNEW.pdf, > streamserve_test_sig0_signed by PDFBoxNEW_signed by AR.pdf, > streamserve_test_sig2.pdf, streamserve_test_sig3.pdf, > streamserve_test_sig3.signed.pdf > > > Hi guys, > I stumbled on a problem with PDFBox and interactive field signing. I have a > PDF generated with OpenText StreamServe with two interactive fields for > signing. See example 1 (streamserve_test_sig0.pdf) in the attachments. > When I use Adobe Reader I can sign both of the visual fields just fine, but > when I use PDFBox to sign one of these fields, the following signature is marked > as invalid.
It doesn't matter if I use PDFBox or sign it manually with Adobe > Reader. See example 3: > * streamserve_test_sig3.pdf (signed by PDFBox) - valid > * streamserve_test_sig3.signed.pdf (signed 2nd by Adobe Reader) - invalid > !image-2018-12-04-11-14-01-586.png|width=582,height=269! > > Also, a last example: when it's signed first by Adobe Reader and then with > PDFBox, the signature seems valid but it says the document was "certified". > See example 2 (streamserve_test_sig2.pdf) > !image-2018-12-04-11-14-32-676.png|width=591,height=283! > > What could be wrong? When the whole document is signed with PDFBox it works just > fine. > Thanks for your response, > Martin
[jira] [Comment Edited] (PDFBOX-4393) PDF signature invalid after second interactive field signed
[ https://issues.apache.org/jira/browse/PDFBOX-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710059#comment-16710059 ] Tilman Hausherr edited comment on PDFBOX-4393 at 12/6/18 11:14 AM: --- Commit 1848212 from til...@apache.org in branch 'pdfbox/trunk' [ https://svn.apache.org/r1848212 ] PDFBOX-4303, PDFBOX-4393: mark updated when applicable; check whether font dictionary exists; avoid bug that wrong dictionary was checked for font was (Author: jira-bot): Commit 1848212 from til...@apache.org in branch 'pdfbox/trunk' [ https://svn.apache.org/r1848212 ] PDFBOX-4393: mark updated when applicable; check whether font dictionary exists; avoid bug that wrong dictionary was checked for font > PDF signature invalid after second interactive field signed > --- > > Key: PDFBOX-4393 > URL: https://issues.apache.org/jira/browse/PDFBOX-4393 > Project: PDFBox > Issue Type: Bug > Components: Signing >Affects Versions: 2.0.12, 2.0.13 > Environment: Windows >Reporter: Martin Klíma >Priority: Major > Fix For: 2.0.14, 3.0.0 PDFBox > > Attachments: image-2018-12-04-11-14-01-586.png, > image-2018-12-04-11-14-32-676.png, streamserve_test_sig0.pdf, > streamserve_test_sig0_saved.pdf, streamserve_test_sig0_saved_signed by > PDFBox.pdf, streamserve_test_sig0_saved_signed by PDFBox_then_signed by > AR.pdf, streamserve_test_sig0_signed by PDFBoxNEW.pdf, > streamserve_test_sig0_signed by PDFBoxNEW_signed by AR.pdf, > streamserve_test_sig2.pdf, streamserve_test_sig3.pdf, > streamserve_test_sig3.signed.pdf > > > Hi guys, > I stumbled on a problem with PDFBox and interactive field signing. I have a > PDF generated with OpenText StreamServe with two interactive fields for > signing. See example 1 (streamserve_test_sig0.pdf) in the attachments. > When I use Adobe Reader I can sign both of the visual fields just fine, but > when I use PDFBox to sign one of these fields, the following signature is marked > as invalid.
It doesn't matter if I use PDFBox or sign it manually with Adobe > Reader. See example 3: > * streamserve_test_sig3.pdf (signed by PDFBox) - valid > * streamserve_test_sig3.signed.pdf (signed 2nd by Adobe Reader) - invalid > !image-2018-12-04-11-14-01-586.png|width=582,height=269! > > Also, a last example: when it's signed first by Adobe Reader and then with > PDFBox, the signature seems valid but it says the document was "certified". > See example 2 (streamserve_test_sig2.pdf) > !image-2018-12-04-11-14-32-676.png|width=591,height=283! > > What could be wrong? When the whole document is signed with PDFBox it works just > fine. > Thanks for your response, > Martin
[jira] [Commented] (PDFBOX-4396) Memory leak due to soft reference caching
[ https://issues.apache.org/jira/browse/PDFBOX-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711189#comment-16711189 ] Tilman Hausherr commented on PDFBOX-4396: - Can you build from source? If yes, open COSStream.java, and add {code:java}IOUtils.closeQuietly(randomAccess);{code} above {code:java}randomAccess = scratchFile.createBuffer();{code} at two places. > Memory leak due to soft reference caching > - > > Key: PDFBOX-4396 > URL: https://issues.apache.org/jira/browse/PDFBOX-4396 > Project: PDFBox > Issue Type: Bug >Affects Versions: 2.0.12 > Environment: JDK10; G1 >Reporter: Ben Manes >Priority: Major > Attachments: #2 - memory leak 2.png, #2 - memory leak.png, memory > leak 2.png, memory leak.png > > > In a heap dump, it appears that DefaultResourceCache is retaining 5.3 GB of > memory due to buffered images (via PDImageXObject). I suspect that G1 is not > collecting soft references across all regions before it out-of-memory errors. > In PDFBOX-4389, I discovered very slow PDDocument#load times due to a JDK10 > I/O bug. Previously I was loading the document to render each page, but this > took 1.5 minutes. To work around that bug I reused the document instance > across pages. This seems to have fail because the pages were cached and not > cleared by the GC. > The DefaultResourceCache does not prune its cache entries when the soft > references are collected. Like WeakHashMap, it should use a ReferenceQueue, > poll it on every access, and prune accordingly. > Thankfully PDDocument#setResourceCache exists. For now I am going to reset > the cache to a new instance after a page has been rendered. The entries > should no longer be reachable and be GC'd more aggressively. If that doesn't > work, I'll either replace the cache (e.g. with Caffeine) or disable it by > setting the instance to null. 
> I think the desired fix is to prune the DefaultResourceCache and, ideally, > reconsider usage of soft references (as they tend to be poor in practice). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
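The change suggested in the comment above (close the old buffer quietly before assigning a newly created one) follows a general pattern that can be sketched without PDFBox. The BufferSlot class below is hypothetical and only illustrates the ordering; it is not COSStream, and its closeQuietly helper is a local stand-in rather than PDFBox's IOUtils.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical holder that never overwrites an open resource without closing it first. */
final class BufferSlot {
    private Closeable current;

    /** Closes the buffer currently held (if any), then stores the new one. */
    void replace(Closeable next) {
        closeQuietly(current);
        current = next;
    }

    /** Local stand-in for a close-quietly helper: ignores null and swallows IOException. */
    static void closeQuietly(Closeable resource) {
        if (resource == null) {
            return;
        }
        try {
            resource.close();
        } catch (IOException ignored) {
            // Deliberately ignored: the old buffer is being discarded anyway.
        }
    }
}
```

With this ordering the old buffer is released deterministically instead of waiting for a finalizer, which is the situation the ScratchFileBuffer comment quoted earlier in the thread describes.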
[jira] [Commented] (PDFBOX-4303) Helv and ZaDb overridden
[ https://issues.apache.org/jira/browse/PDFBOX-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711295#comment-16711295 ] Tilman Hausherr commented on PDFBOX-4303: - Commit 1848212 from til...@apache.org in branch 'pdfbox/trunk' [ https://svn.apache.org/r1848212 ] PDFBOX-4303, PDFBOX-4393: mark updated when applicable; check whether font dictionary exists; avoid bug that wrong dictionary was checked for font Commit 1848213 from til...@apache.org in branch 'pdfbox/branches/2.0' [ https://svn.apache.org/r1848213 ] PDFBOX-4303, PDFBOX-4393: mark updated when applicable; check whether font dictionary exists; avoid bug that wrong dictionary was checked for font > Helv and ZaDb overridden > > > Key: PDFBOX-4303 > URL: https://issues.apache.org/jira/browse/PDFBOX-4303 > Project: PDFBox > Issue Type: Bug > Components: AcroForm >Affects Versions: 2.0.11 >Reporter: simon steiner >Assignee: Maruan Sahyoun >Priority: Major > Labels: Appearance > Fix For: 2.0.14, 3.0.0 PDFBox > > Attachments: PDFBOX-4303-2.0.12.diff, ReaderModifiedForm.pdf > > > Due to change: > PDFBOX-3943: create /Helv and /ZaDb entries if they don't exist, regardless > if /DR existed or not > > This was working OK in 2.0.7. In the 2.0 branch, > PDAcroForm > verifyOrCreateDefaults() currently has: > if (!defaultResources.getCOSObject().containsKey("Helv")) > It should be checking the key in the font dictionary before calling > defaultResources.put
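To illustrate the kind of check the report asks for, here is a minimal sketch that models the resources dictionary as nested maps. It is not the PDFBox code or the committed fix; DefaultFontCheck, ensureFont and the map layout are assumptions made for illustration. The point is that the lookup happens in the "Font" sub-dictionary, not in the top-level dictionary, so an existing /Helv entry is not overwritten.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of a default-resources dictionary with a "Font" sub-dictionary. */
final class DefaultFontCheck {

    /**
     * Adds a fallback font under the given name only if the "Font" sub-dictionary
     * does not already define it. Returns true when the dictionary was modified,
     * so a caller could mark the document as updated only when applicable.
     */
    static boolean ensureFont(Map<String, Map<String, String>> defaultResources,
                              String name, String fallback) {
        // Create the font sub-dictionary if it does not exist yet.
        Map<String, String> fonts =
                defaultResources.computeIfAbsent("Font", key -> new HashMap<>());
        if (fonts.containsKey(name)) {
            return false; // keep the font the document already defines
        }
        fonts.put(name, fallback);
        return true;
    }
}
```

Checking the top-level map for "Helv" would always miss a font stored one level down, so the default would be written every time, which matches the "overridden" symptom in the issue title.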
[jira] [Commented] (PDFBOX-4393) PDF signature invalid after second interactive field signed
[ https://issues.apache.org/jira/browse/PDFBOX-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711297#comment-16711297 ] ASF subversion and git services commented on PDFBOX-4393: - Commit 1848285 from til...@apache.org in branch 'pdfbox/branches/2.0' [ https://svn.apache.org/r1848285 ] PDFBOX-4303, PDFBOX-4393: add comments > PDF signature invalid after second interactive field signed > --- > > Key: PDFBOX-4393 > URL: https://issues.apache.org/jira/browse/PDFBOX-4393 > Project: PDFBox > Issue Type: Bug > Components: Signing >Affects Versions: 2.0.12, 2.0.13 > Environment: Windows >Reporter: Martin Klíma >Priority: Major > Fix For: 2.0.14, 3.0.0 PDFBox > > Attachments: image-2018-12-04-11-14-01-586.png, > image-2018-12-04-11-14-32-676.png, streamserve_test_sig0.pdf, > streamserve_test_sig0_saved.pdf, streamserve_test_sig0_saved_signed by > PDFBox.pdf, streamserve_test_sig0_saved_signed by PDFBox_then_signed by > AR.pdf, streamserve_test_sig0_signed by PDFBoxNEW.pdf, > streamserve_test_sig0_signed by PDFBoxNEW_signed by AR.pdf, > streamserve_test_sig2.pdf, streamserve_test_sig3.pdf, > streamserve_test_sig3.signed.pdf > > > Hi guys, > I stumbled on a problem with PDFBox and interactive field signing. I have a > PDF generated with OpenText StreamServe with two interactive fields for > signing. See example 1 (streamserve_test_sig0.pdf) in the attachments. > When I use Adobe Reader I can sign both of the visual fields just fine, but > when I use PDFBox to sign one of these fields, the following signature is marked > as invalid. It doesn't matter if I use PDFBox or sign it manually with Adobe > Reader. See example 3: > * streamserve_test_sig3.pdf (signed by PDFBox) - valid > * streamserve_test_sig3.signed.pdf (signed 2nd by Adobe Reader) - invalid > !image-2018-12-04-11-14-01-586.png|width=582,height=269! > > Also, a last example: when it's signed first by Adobe Reader and then with > PDFBox, the signature seems valid but it says the document was "certified".
> See example 2 (streamserve_test_sig2.pdf) > !image-2018-12-04-11-14-32-676.png|width=591,height=283! > > What could be wrong? When the whole document is signed with PDFBox it works just > fine. > Thanks for your response, > Martin
[jira] [Commented] (PDFBOX-4303) Helv and ZaDb overridden
[ https://issues.apache.org/jira/browse/PDFBOX-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711296#comment-16711296 ] ASF subversion and git services commented on PDFBOX-4303: - Commit 1848285 from til...@apache.org in branch 'pdfbox/branches/2.0' [ https://svn.apache.org/r1848285 ] PDFBOX-4303, PDFBOX-4393: add comments > Helv and ZaDb overridden > > > Key: PDFBOX-4303 > URL: https://issues.apache.org/jira/browse/PDFBOX-4303 > Project: PDFBox > Issue Type: Bug > Components: AcroForm >Affects Versions: 2.0.11 >Reporter: simon steiner >Assignee: Maruan Sahyoun >Priority: Major > Labels: Appearance > Fix For: 2.0.14, 3.0.0 PDFBox > > Attachments: PDFBOX-4303-2.0.12.diff, ReaderModifiedForm.pdf > > > Due to change: > PDFBOX-3943: create /Helv and /ZaDb entries if they don't exist, regardless > if /DR existed or not > > This was working OK in 2.0.7. In the 2.0 branch, > PDAcroForm > verifyOrCreateDefaults() currently has: > if (!defaultResources.getCOSObject().containsKey("Helv")) > It should be checking the key in the font dictionary before calling > defaultResources.put
[jira] [Comment Edited] (PDFBOX-4398) getLastSignatureDictionary modifies internal structure of PDDocument
[ https://issues.apache.org/jira/browse/PDFBOX-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711279#comment-16711279 ] Tilman Hausherr edited comment on PDFBOX-4398 at 12/6/18 11:00 AM: --- This was introduced in PDFBOX-3732 and also badly needed by PDFBOX-4393. Yes, such side effects are usually a no-no, but we replicated the behavior of Adobe. See also the final comment by Maruan in PDFBOX-3732. From your text I understand that it bothers you, but it doesn't break anything. So I'd just target this to 3.0 and keep 2.x as is. I won't do anything for now. was (Author: tilman): This was introduced in PDFBOX-3732 and also badly needed by PDFBOX-4393. Yes, such side effects are usually a no-no, but we replicated the behavior of Adobe. See also the final comment by Maruan in PDFBOX-3732. From your text I understand that it bothers you, but it doesn't break anything. So I'd just target this to 3.0, set it to minor and keep 2.x as is. I won't do anything for now. > getLastSignatureDictionary modifies internal structure of PDDocument > > > Key: PDFBOX-4398 > URL: https://issues.apache.org/jira/browse/PDFBOX-4398 > Project: PDFBox > Issue Type: Bug > Components: AcroForm >Affects Versions: 2.0.12 >Reporter: beat weisskopf >Priority: Minor > Fix For: 3.0.0 PDFBox > > > If one calls PDDocument#getLastSignatureDictionary, the AcroForm is populated > with the defaults even if not needed. This modifies the internals of the > PDDocument and therefore there are changes to be saved, even if the file is > not modified by "real" changes. > For example: > {code} > PDDocument pdfDocument = PDDocument.load(pdfBytes); > pdfDocument.getLastSignatureDictionary(); > {code} > This calls the verifyOrCreateDefaults() method, which initializes the > DR-Dictionary if not yet done. This is done even if > getLastSignatureDictionary returns null. > Why this bothers me: it is very unexpected behaviour that a getter modifies > an object's state.
This is no big deal for our use case; the other issue > (PDFBOX-4303) was a bigger problem, as we are diffing objects between > revisions (current vs last signed revision).
[jira] [Comment Edited] (PDFBOX-4396) Memory leak due to soft reference caching
[ https://issues.apache.org/jira/browse/PDFBOX-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711189#comment-16711189 ] Tilman Hausherr edited comment on PDFBOX-4396 at 12/6/18 9:31 AM: -- Can you build from source? If yes, open COSStream.java, and add {code:java} IOUtils.closeQuietly(randomAccess);{code} above {code:java} randomAccess = scratchFile.createBuffer();{code} at two places. My observation is that randomAccess was not null when an encrypted file is opened, because the same COSStream is rewritten, and it can't be closed before that because this would result in an exception. was (Author: tilman): Can you build from source? If yes, open COSStream.java, and add {code:java}IOUtils.closeQuietly(randomAccess);{code} above {code:java}randomAccess = scratchFile.createBuffer();{code} at two places. > Memory leak due to soft reference caching > - > > Key: PDFBOX-4396 > URL: https://issues.apache.org/jira/browse/PDFBOX-4396 > Project: PDFBox > Issue Type: Bug >Affects Versions: 2.0.12 > Environment: JDK10; G1 >Reporter: Ben Manes >Priority: Major > Attachments: #2 - memory leak 2.png, #2 - memory leak.png, memory > leak 2.png, memory leak.png > > > In a heap dump, it appears that DefaultResourceCache is retaining 5.3 GB of > memory due to buffered images (via PDImageXObject). I suspect that G1 is not > collecting soft references across all regions before it throws an out-of-memory error. > In PDFBOX-4389, I discovered very slow PDDocument#load times due to a JDK10 > I/O bug. Previously I was loading the document to render each page, but this > took 1.5 minutes. To work around that bug I reused the document instance > across pages. This seems to have failed because the pages were cached and not > cleared by the GC. > The DefaultResourceCache does not prune its cache entries when the soft > references are collected. Like WeakHashMap, it should use a ReferenceQueue, > poll it on every access, and prune accordingly.
> Thankfully PDDocument#setResourceCache exists. For now I am going to reset > the cache to a new instance after a page has been rendered. The entries > should no longer be reachable and be GC'd more aggressively. If that doesn't > work, I'll either replace the cache (e.g. with Caffeine) or disable it by > setting the instance to null. > I think the desired fix is to prune the DefaultResourceCache and, ideally, > reconsider usage of soft references (as they tend to be poor in practice). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Commented] (PDFBOX-4393) PDF signature invalid after second interactive field signed
[ https://issues.apache.org/jira/browse/PDFBOX-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711300#comment-16711300 ] ASF subversion and git services commented on PDFBOX-4393: - Commit 1848286 from til...@apache.org in branch 'pdfbox/trunk' [ https://svn.apache.org/r1848286 ] PDFBOX-4303, PDFBOX-4393: add comments > PDF signature invalid after second interactive field signed > --- > > Key: PDFBOX-4393 > URL: https://issues.apache.org/jira/browse/PDFBOX-4393 > Project: PDFBox > Issue Type: Bug > Components: Signing >Affects Versions: 2.0.12, 2.0.13 > Environment: Windows >Reporter: Martin Klíma >Priority: Major > Fix For: 2.0.14, 3.0.0 PDFBox > > Attachments: image-2018-12-04-11-14-01-586.png, > image-2018-12-04-11-14-32-676.png, streamserve_test_sig0.pdf, > streamserve_test_sig0_saved.pdf, streamserve_test_sig0_saved_signed by > PDFBox.pdf, streamserve_test_sig0_saved_signed by PDFBox_then_signed by > AR.pdf, streamserve_test_sig0_signed by PDFBoxNEW.pdf, > streamserve_test_sig0_signed by PDFBoxNEW_signed by AR.pdf, > streamserve_test_sig2.pdf, streamserve_test_sig3.pdf, > streamserve_test_sig3.signed.pdf > > > Hi guys, > I stumbled on a problem with PDFBox and interactive field signing. I have a > PDF generated with OpenText StreamServe with two interactive fields for > signing. See example 1 (streamserve_test_sig0.pdf) in the attachments. > When I use Adobe Reader I can sign both of the visual fields just fine, but > when I use PDFBox to sign one of these fields the following signature is marked > as invalid. It doesn't matter whether I use PDFBox or sign it manually with Adobe > Reader. See example 3: > * streamserve_test_sig3.pdf (signed by PDFBox) - valid > * streamserve_test_sig3.signed.pdf (signed 2nd by Adobe Reader) - invalid > !image-2018-12-04-11-14-01-586.png|width=582,height=269! > > Also a last example: when it's signed first by Adobe Reader and then with > PDFBox the signature seems valid but it says the document was "certified". > See example 2 (streamserve_test_sig2.pdf) > !image-2018-12-04-11-14-32-676.png|width=591,height=283! > > What could be wrong? When the whole document is signed with PDFBox it works just > fine. > Thanks for your response, > Martin
[jira] [Commented] (PDFBOX-4303) Helv and ZaDb overridden
[ https://issues.apache.org/jira/browse/PDFBOX-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711298#comment-16711298 ] ASF subversion and git services commented on PDFBOX-4303: - Commit 1848286 from til...@apache.org in branch 'pdfbox/trunk' [ https://svn.apache.org/r1848286 ] PDFBOX-4303, PDFBOX-4393: add comments > Helv and ZaDb overridden > > > Key: PDFBOX-4303 > URL: https://issues.apache.org/jira/browse/PDFBOX-4303 > Project: PDFBox > Issue Type: Bug > Components: AcroForm >Affects Versions: 2.0.11 >Reporter: simon steiner >Assignee: Maruan Sahyoun >Priority: Major > Labels: Appearance > Fix For: 2.0.14, 3.0.0 PDFBox > > Attachments: PDFBOX-4303-2.0.12.diff, ReaderModifiedForm.pdf > > > Due to change: > PDFBOX-3943: create /Helv and /ZaDb entries if they don't exist, regardless > if /DR existed or not > > was working ok in 2.0.7, in 2.0 branch > PDAcroForm > verifyOrCreateDefaults(): > is: > if (!defaultResources.getCOSObject().containsKey("Helv")) > should be checking key in the font dictionary before calling > defaultResources.put
[jira] [Closed] (PDFBOX-4396) Memory leak due to soft reference caching
[ https://issues.apache.org/jira/browse/PDFBOX-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Manes closed PDFBOX-4396.
[jira] [Resolved] (PDFBOX-4396) Memory leak due to soft reference caching
[ https://issues.apache.org/jira/browse/PDFBOX-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ben Manes resolved PDFBOX-4396. --- Resolution: Workaround
[jira] [Commented] (PDFBOX-3359) Drawing to Graphics2D / ScratchFileBuffer not closed
[ https://issues.apache.org/jira/browse/PDFBOX-3359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711978#comment-16711978 ] ASF subversion and git services commented on PDFBOX-3359: - Commit 1848359 from til...@apache.org in branch 'pdfbox/trunk' [ https://svn.apache.org/r1848359 ] PDFBOX-2999, PDFBOX-3359: close existing buffer to avoid "ScratchFileBuffer not closed" log message > Drawing to Graphics2D / ScratchFileBuffer not closed > > > Key: PDFBOX-3359 > URL: https://issues.apache.org/jira/browse/PDFBOX-3359 > Project: PDFBox > Issue Type: Bug > Components: Rendering >Affects Versions: 2.0.1 >Reporter: Ivan Ridao Freitas >Priority: Major > Fix For: 3.0.0 PDFBox > > > First, there is a little bug on PDFRenderer.renderPageToGraphics(int > pageIndex, Graphics2D graphics, float scale) when using scale != 1 the call > to clearRect() fills the original size with white background, but it should > fill the scaled size. > Second, I implemented a JPanel which is painted using that function and on > every paint this message goes to the console: > "DEBUG ScratchFileBuffer:516 - ScratchFileBuffer not closed!". 
Here is the code to test it, run it and *resize the JFrame*:
> {code:title=PanelTest.java|borderStyle=solid}
> import java.awt.Dimension;
> import java.awt.Graphics;
> import java.awt.Graphics2D;
> import java.io.File;
> import java.io.IOException;
> import javax.swing.JFrame;
> import javax.swing.JPanel;
> import javax.swing.SwingUtilities;
> import javax.swing.WindowConstants;
> import org.apache.pdfbox.pdmodel.PDDocument;
> import org.apache.pdfbox.rendering.PDFRenderer;
>
> public class PanelTest {
>
>     private static JPanel getTestPanel() {
>         PDDocument doc = null;
>         try {
>             doc = PDDocument.load(new File("anyfile.pdf"));
>         } catch (IOException e) {
>             e.printStackTrace();
>         }
>         final PDFRenderer renderer = new PDFRenderer(doc);
>         JPanel panel = new JPanel() {
>             @Override
>             protected void paintComponent(Graphics g) {
>                 try {
>                     renderer.renderPageToGraphics(0, (Graphics2D) g, 0.5f);
>                 } catch (IOException e) {
>                     e.printStackTrace();
>                 }
>             }
>         };
>         return panel;
>     }
>
>     public static void main(String[] args) throws Exception {
>         SwingUtilities.invokeLater(new Runnable() {
>             @Override
>             public void run() {
>                 JFrame frame = new JFrame();
>                 frame.setDefaultCloseOperation(WindowConstants.EXIT_ON_CLOSE);
>                 frame.add(getTestPanel());
>                 frame.pack();
>                 frame.setSize(600, 400);
>                 Dimension paneSize = frame.getSize();
>                 Dimension screenSize = frame.getToolkit().getScreenSize();
>                 frame.setLocation((screenSize.width - paneSize.width) / 2,
>                         (screenSize.height - paneSize.height) / 2);
>                 frame.setTitle("Test");
>                 frame.setVisible(true);
>             }
>         });
>     }
> }
> {code}
> Ivan
[jira] [Commented] (PDFBOX-2999) Optimize COSStream scratch file usage
[ https://issues.apache.org/jira/browse/PDFBOX-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711975#comment-16711975 ] ASF subversion and git services commented on PDFBOX-2999: - Commit 1848358 from til...@apache.org in branch 'pdfbox/branches/2.0' [ https://svn.apache.org/r1848358 ] PDFBOX-2999, PDFBOX-3359: close existing buffer to avoid "ScratchFileBuffer not closed" log message > Optimize COSStream scratch file usage > - > > Key: PDFBOX-2999 > URL: https://issues.apache.org/jira/browse/PDFBOX-2999 > Project: PDFBox > Issue Type: Improvement > Components: PDModel >Affects Versions: 2.0.0 >Reporter: Timo Boehme >Assignee: Timo Boehme >Priority: Major > > The usage of scratch file buffers in COSStreams is quite sloppy. A never > filled buffer is created in the beginning and existing buffers are discarded > without being closed when a variant of {{createOutputStream}} is called. > Furthermore it should be clarified if requesting an input stream without > having created an output stream before is ok and if a returned input stream > keeps valid after a new output stream is created (which is crucial for proper > buffer-closing). > This issue should resolve some of the shortcomings and document the expected > or even required usage of COSStream. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Commented] (PDFBOX-3359) Drawing to Graphics2D / ScratchFileBuffer not closed
[ https://issues.apache.org/jira/browse/PDFBOX-3359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711976#comment-16711976 ] ASF subversion and git services commented on PDFBOX-3359: - Commit 1848358 from til...@apache.org in branch 'pdfbox/branches/2.0' [ https://svn.apache.org/r1848358 ] PDFBOX-2999, PDFBOX-3359: close existing buffer to avoid "ScratchFileBuffer not closed" log message
[jira] [Commented] (PDFBOX-2999) Optimize COSStream scratch file usage
[ https://issues.apache.org/jira/browse/PDFBOX-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711977#comment-16711977 ] ASF subversion and git services commented on PDFBOX-2999: - Commit 1848359 from til...@apache.org in branch 'pdfbox/trunk' [ https://svn.apache.org/r1848359 ] PDFBOX-2999, PDFBOX-3359: close existing buffer to avoid "ScratchFileBuffer not closed" log message
[jira] [Commented] (PDFBOX-4396) Memory leak due to soft reference caching
[ https://issues.apache.org/jira/browse/PDFBOX-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711955#comment-16711955 ] Ben Manes commented on PDFBOX-4396: ---
The process completed for one of the large uploads and I had to disable the others because they were taking too long (hours). The cpu overhead on the machine caused bad user-facing latencies, since the scheduler doesn't take cpu into account and those jobs were being delayed.

I think since our use cases expanded from an expected 5-10 page documents to now many thousands of pages (monthly historicals), it's no longer a good fit to do the work in a single process, shared with other user-facing work. I think my next step should be to migrate this use-case to a lambda, distribute page ranges, and invoke in parallel. That could easily be distributed using pdfbox and work great, but it's probably easier / faster / cheaper to use ghostscript for such a simple lambda task.

The documents are not encrypted so I think that case may not apply. In my code I often pass around a Guava Closer to accumulate resources across methods, and then ensure all are closed if not done so otherwise. If everything is associated with a document, it would make sense for a closer to be propagated from it and then it can close all of the resources (if not closed already). That could be a custom utility, etc. of course rather than Guava's.

You might also consider using weak / phantom references instead of finalization. For my application's file I/O (local and s3), I give clients a session with their own tempdir and reference count downloaded files against a global cache. The session handles are proxies that clients should close, but held in a weak keyed cache where the actual implementation is the value. Then when the proxy is collected, the strong-ref value is explicitly closed. This acts as a safety net just in case, since we do a lot of I/O and this form of reference caching is cheap. The same can be done better with phantom references, but it is more work than spinning up a weak cache with a removal listener.

From reading the code, it looks like a lot of effort was made to close resources but it also got really complex with patches for the inevitable leaks. Of course, you might not be able to change much due to API compatibility needs.

I think at this point I'll close this, like the other, as not something trivially fixable. I do think better resource handling is warranted, but that requires a thoughtful refactor.
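The closer idea from the comment above could look roughly like the sketch below. It is a custom utility written only for illustration; ResourceCloser and register are hypothetical names, it is not Guava's Closer and nothing like it is claimed to exist in PDFBox. An owner (for example a document) would hand out one closer, collaborators register what they open, and closing the owner closes everything in reverse order.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper that accumulates resources and closes them together (not PDFBox or Guava code). */
public final class ResourceCloser implements Closeable {

    private final Deque<Closeable> resources = new ArrayDeque<>();
    private boolean closed;

    /** Registers a resource and returns it, so calls can be wrapped around a constructor. */
    public synchronized <C extends Closeable> C register(C resource) {
        if (closed) {
            throw new IllegalStateException("closer is already closed");
        }
        resources.push(resource);
        return resource;
    }

    /** Closes all registered resources in reverse registration order; later calls do nothing. */
    @Override
    public synchronized void close() throws IOException {
        closed = true;
        IOException first = null;
        while (!resources.isEmpty()) {
            try {
                resources.pop().close();
            } catch (IOException e) {
                if (first == null) {
                    first = e;             // remember the first failure
                } else {
                    first.addSuppressed(e); // keep later failures attached to it
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }
}
```

A failure while closing one resource does not stop the others from being closed; the first exception is rethrown at the end with the rest attached as suppressed exceptions.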
[jira] [Closed] (PDFBOX-4396) Memory leak due to soft reference caching
[ https://issues.apache.org/jira/browse/PDFBOX-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr closed PDFBOX-4396. --- Resolution: Workaround
[jira] [Reopened] (PDFBOX-4396) Memory leak due to soft reference caching
[ https://issues.apache.org/jira/browse/PDFBOX-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr reopened PDFBOX-4396:
[jira] [Comment Edited] (PDFBOX-4396) Memory leak due to soft reference caching
[ https://issues.apache.org/jira/browse/PDFBOX-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711189#comment-16711189 ] Tilman Hausherr edited comment on PDFBOX-4396 at 12/6/18 9:44 AM: -- Can you build from source? If yes, open COSStream.java, and add {code:java} IOUtils.closeQuietly(randomAccess);{code} above {code:java} randomAccess = scratchFile.createBuffer();{code} at two places. My observation was that randomAccess was not null when an encrypted file was opened, because the same COSStream is rewritten (in {{SecurityHandler.decryptStream()}}) and it can't be closed in that method because this would result in an exception thrown by COSStream.
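The close-before-replace change suggested in the comment above follows a general pattern that can be shown without PDFBox. This is only an illustration under assumptions: BufferHolder, recreateBuffer and the local closeQuietly helper are hypothetical stand-ins for the COSStream field, the scratchFile.createBuffer() call and IOUtils.closeQuietly, and the real COSStream code is not reproduced here.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.function.Supplier;

/** Hypothetical holder that closes its previous buffer before creating a new one. */
public class BufferHolder {

    private final Supplier<Closeable> bufferFactory; // stand-in for scratchFile.createBuffer()
    private Closeable buffer;                        // stand-in for the randomAccess field

    public BufferHolder(Supplier<Closeable> bufferFactory) {
        this.bufferFactory = bufferFactory;
    }

    /** Replaces the current buffer; the old one is closed first so it is not leaked. */
    public Closeable recreateBuffer() {
        closeQuietly(buffer);
        buffer = bufferFactory.get();
        return buffer;
    }

    /** Local stand-in for a "close quietly" helper: tolerates null and swallows IOException. */
    static void closeQuietly(Closeable closeable) {
        if (closeable == null) {
            return;
        }
        try {
            closeable.close();
        } catch (IOException ignored) {
            // deliberately ignored: the old buffer is being discarded anyway
        }
    }
}
```

The null check matters on the first call, when no buffer has been created yet. As the comment notes, closing is only safe at a point where nothing still reads from the old buffer.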
[jira] [Commented] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances
[ https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711505#comment-16711505 ] ASF subversion and git services commented on PDFBOX-4392: - Commit 1848319 from til...@apache.org in branch 'pdfbox/branches/2.0' [ https://svn.apache.org/r1848319 ] PDFBOX-4392: remove toRGB() which is contained in Color() construction, thanks Itai Shaked > PDF completely blow up the RAM on amazon instances > -- > > Key: PDFBOX-4392 > URL: https://issues.apache.org/jira/browse/PDFBOX-4392 > Project: PDFBox > Issue Type: Bug >Affects Versions: 2.0.12 >Reporter: Oleksandr Skoryi >Priority: Major > Fix For: 2.0.13 > > Attachments: 2f0f8f77-7a85-416d-b5d2-47a07d1416d4_3.pdf, > 4392-prereadICC.patch > > > Hi all > The issue is pretty straightforward. I receive a lot of PDFs every day and > render them. In most cases everything is OK, but PDFs which produce > WARN org.apache.pdfbox.pdmodel.graphics.color.PDICCBased - ICC profile is > Perceptual, ignoring, treating as Display class > take very long to render and consume a lot of memory. > It takes from 5 to 15 min on an m5.large Amazon instance. But the attached PDF > completely killed the instance. The Java process is just killed by Linux > during processing with no exception in the logs. > So could you please explain what is going on with files that produce the > WARN message above, and how I can improve the rendering. > > Here are my VM options > -Dorg.apache.pdfbox.rendering.UsePureJavaCMYKConversion=true -Xmx3G -Xms2G > -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider > Also don't hesitate to ask me for more PDFs, I have tons of them :D > > And also a question, does the GPU have any influence on rendering?
[jira] [Commented] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances
[ https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711354#comment-16711354 ] Itai Shaked commented on PDFBOX-4392: - So I looked into it more closely using a profiler (should have done this in the first place, as it revealed the call to `ensureDisplayProfile` has negligible performance impact), and it seems lots of time is wasted on the call to `toRGB` in line 170 of PDICCBased. According to the comment above the call, this is done to test for bad ICC profiles as described in PDFBOX-1295, PDFBOX-1740 and PDFBOX-3610. PDFBOX-1740 seems totally unrelated, so perhaps a typo in the comment? I have tried removing the call, and the files from PDFBOX-1295 and PDFBOX-3610 both render correctly (even with the call - no exception is thrown). Furthermore, removing this call cuts render time on the file in this issue by ~50%. Could this be an issue with a bug in an older JDK version, and so is no longer needed? If so - perhaps a test can be added so `toRGB` is only called on known bad JDK versions?
[jira] [Updated] (PDFBOX-4399) Printing invisible Content from PDF-Forms
[ https://issues.apache.org/jira/browse/PDFBOX-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Ziel updated PDFBOX-4399: Attachment: (was: printed.brf) > Printing invisible Content from PDF-Forms > -- > > Key: PDFBOX-4399 > URL: https://issues.apache.org/jira/browse/PDFBOX-4399 > Project: PDFBox > Issue Type: Bug > Components: AcroForm, Rendering >Affects Versions: 2.0.6 >Reporter: Stefan Ziel >Priority: Major > Attachments: original.pdf > > > Printing PDF-documents with macro renders hidden content. > {code:java} > InputStream sourceStream = new FileInputStream(pFile); > try { > PDDocument source = PDDocument.load(sourceStream); > job.setPageable(new PDFPageable(source)); > job.print(atts); > } finally { > sourceStream.close(); > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Updated] (PDFBOX-4399) Printing invisible Content from PDF-Forms
[ https://issues.apache.org/jira/browse/PDFBOX-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Ziel updated PDFBOX-4399: Attachment: printed.brf original.pdf > Printing invisible Content from PDF-Forms > -- > > Key: PDFBOX-4399 > URL: https://issues.apache.org/jira/browse/PDFBOX-4399 > Project: PDFBox > Issue Type: Bug > Components: AcroForm, Rendering >Affects Versions: 2.0.6 >Reporter: Stefan Ziel >Priority: Major > Attachments: original.pdf > > > Printing PDF-documents with macro renders hidden content. > {code:java} > InputStream sourceStream = new FileInputStream(pFile); > try { > PDDocument source = PDDocument.load(sourceStream); > job.setPageable(new PDFPageable(source)); > job.print(atts); > } finally { > sourceStream.close(); > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Commented] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances
[ https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711485#comment-16711485 ] Tilman Hausherr commented on PDFBOX-4392: - I looked into the Color source code, that explains it LOL. > PDF completely blow up the RAM on amazon instances > -- > > Key: PDFBOX-4392 > URL: https://issues.apache.org/jira/browse/PDFBOX-4392 > Project: PDFBox > Issue Type: Bug >Affects Versions: 2.0.12 >Reporter: Oleksandr Skoryi >Priority: Major > Fix For: 2.0.13 > > Attachments: 2f0f8f77-7a85-416d-b5d2-47a07d1416d4_3.pdf, > 4392-prereadICC.patch > > > Hi all > The issue is pretty straightforward. I receive a lot of pdfs every day and > render them. In most of the cases everything is OK, but PDFs which produces > WARN org.apache.pdfbox.pdmodel.graphics.color.PDICCBased - ICC profile is > Perceptual, ignoring, treating as Display class > working super long, and are super memory consumable. > It takes from 5 to 15 min on m5.large amazon instance. But attached PDF > completely killed the instance. The java process is just killed by linux > during processing with no exception in logs. > So could you please provide explanations what is going on with files with > WARN message above, and how can I improve the rendering. > > Here is my VM options > -Dorg.apache.pdfbox.rendering.UsePureJavaCMYKConversion=true -Xmx3G -Xms2G > -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider" > Also don't hesitate to ask me about more PDF, I have tones of them :D > > And also a question, does GPU have influence on rendering? -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Commented] (PDFBOX-4398) getLastSignatureDictionary modifies internal structure of PDDocument
[ https://issues.apache.org/jira/browse/PDFBOX-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711372#comment-16711372 ] beat weisskopf commented on PDFBOX-4398: {quote} I won't do anything for now. {quote} That's ok for me. Thanks for all the work! > getLastSignatureDictionary modifies internal structure of PDDocument > > > Key: PDFBOX-4398 > URL: https://issues.apache.org/jira/browse/PDFBOX-4398 > Project: PDFBox > Issue Type: Bug > Components: AcroForm >Affects Versions: 2.0.12 >Reporter: beat weisskopf >Priority: Minor > Fix For: 3.0.0 PDFBox > > > If one calls PDDocument#getLastSignatureDictionary, the AcroFrom is populated > with the defaults even if not needed. This modifies the internals of the > PDDocument and therefore there are changes to be saved, even if the file is > not modified by "real" changes. > For example: > {code} > PDDocument pdfDocument = PDDocument.load(pdfBytes); > pdfDocument.getLastSignatureDictionary(); > {code} > This calls the verifyOrCreateDefaults() method, which initializes the > DR-Dictionary if not yet done. This is even done if > getLastSignatureDictionary returns null. > Why this bothers me: it is very unexpected behaviour that a getter modifies > an objects state. This is no big deal for our usecase, the other issue > (PDFBOX-4303) was a bigger problem as we are diffing objects between > revisions (current vs last signed revision). -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
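The concern raised in this issue, a getter that changes object state, can be illustrated with a small stand-alone sketch. Everything below is hypothetical and is not PDFBox code: the class, its `dirty` flag and both method names are invented only to show how lazily creating defaults inside a getter makes an otherwise untouched document look modified.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical document model used only for illustration; not PDFBox code.
final class GetterSideEffectSketch {
    private final Map<String, String> form = new HashMap<>();
    private boolean dirty = false;

    // Getter that lazily creates defaults: it mutates state even when
    // there is no signature to return.
    String getSignatureCreatingDefaults() {
        if (!form.containsKey("DR")) {
            form.put("DR", "defaults");
            dirty = true; // the document now has changes to be saved
        }
        return form.get("Sig");
    }

    // Side-effect-free variant: only reads.
    String getSignatureReadOnly() {
        return form.get("Sig");
    }

    boolean isDirty() {
        return dirty;
    }

    public static void main(String[] args) {
        GetterSideEffectSketch doc = new GetterSideEffectSketch();
        doc.getSignatureReadOnly();
        System.out.println("dirty after read-only getter: " + doc.isDirty());
        doc.getSignatureCreatingDefaults();
        System.out.println("dirty after defaulting getter: " + doc.isDirty());
    }
}
```

In this sketch the second getter returns null and still flips the flag, which is the kind of surprise the reporter describes when diffing revisions.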
[jira] [Commented] (PDFBOX-4303) Helv and ZaDb overridden
[ https://issues.apache.org/jira/browse/PDFBOX-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711390#comment-16711390 ] ASF subversion and git services commented on PDFBOX-4303: - Commit 1848299 from til...@apache.org in branch 'pdfbox/branches/2.0' [ https://svn.apache.org/r1848299 ] PDFBOX-4303, PDFBOX-4393: add test > Helv and ZaDb overridden > > > Key: PDFBOX-4303 > URL: https://issues.apache.org/jira/browse/PDFBOX-4303 > Project: PDFBox > Issue Type: Bug > Components: AcroForm >Affects Versions: 2.0.11 >Reporter: simon steiner >Assignee: Maruan Sahyoun >Priority: Major > Labels: Appearance > Fix For: 2.0.14, 3.0.0 PDFBox > > Attachments: PDFBOX-4303-2.0.12.diff, ReaderModifiedForm.pdf > > > Due to change: > PDFBOX-3943: create /Helv and /ZaDb entries if they don't exist, regardless > if /DR existed or not > > was working ok in 2.0.7, in 2.0 branch > PDAcroForm > verifyOrCreateDefaults(): > is: > if (!defaultResources.getCOSObject().containsKey("Helv")) > should be checking key in the font dictionary before calling > defaultResources.put
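The check described in this issue can be sketched with plain `java.util.Map` objects standing in for the COS dictionaries. This is only an illustration under assumptions: the class and method names are hypothetical, the nested map merely mimics a /DR dictionary with a /Font sub-dictionary, and it is not the PDFBox implementation or the committed fix. It shows why testing the top level of /DR for "Helv" never finds a font stored under /Font, so an existing entry gets overridden.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for /DR handling; plain maps replace COSDictionary.
final class DefaultFontCheckSketch {

    // Looks for the key directly in /DR, like the check quoted in the issue.
    // A font stored under /DR /Font is never found here, so it is replaced.
    static void addDefaultCheckingTopLevel(Map<String, Map<String, String>> dr,
                                           String name, String defaultFont) {
        if (!dr.containsKey(name)) {
            dr.computeIfAbsent("Font", k -> new HashMap<>()).put(name, defaultFont);
        }
    }

    // Looks in the /Font sub-dictionary and only adds a missing entry.
    static void addDefaultCheckingFontDict(Map<String, Map<String, String>> dr,
                                           String name, String defaultFont) {
        Map<String, String> fonts = dr.computeIfAbsent("Font", k -> new HashMap<>());
        fonts.putIfAbsent(name, defaultFont);
    }

    // Builds resources that already define a custom /Helv font.
    static Map<String, Map<String, String>> resourcesWithCustomHelv() {
        Map<String, String> fonts = new HashMap<>();
        fonts.put("Helv", "CustomHelvetica");
        Map<String, Map<String, String>> dr = new HashMap<>();
        dr.put("Font", fonts);
        return dr;
    }

    public static void main(String[] args) {
        Map<String, Map<String, String>> a = resourcesWithCustomHelv();
        addDefaultCheckingTopLevel(a, "Helv", "Helvetica");
        System.out.println("top-level check:  " + a.get("Font").get("Helv"));

        Map<String, Map<String, String>> b = resourcesWithCustomHelv();
        addDefaultCheckingFontDict(b, "Helv", "Helvetica");
        System.out.println("font-dict check:  " + b.get("Font").get("Helv"));
    }
}
```

With the top-level check the custom entry is replaced by the default; with the font-dictionary check it is kept.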
[jira] [Commented] (PDFBOX-4303) Helv and ZaDb overridden
[ https://issues.apache.org/jira/browse/PDFBOX-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711392#comment-16711392 ] ASF subversion and git services commented on PDFBOX-4303: - Commit 1848300 from til...@apache.org in branch 'pdfbox/trunk' [ https://svn.apache.org/r1848300 ] PDFBOX-4303, PDFBOX-4393: add test > Helv and ZaDb overridden > > > Key: PDFBOX-4303 > URL: https://issues.apache.org/jira/browse/PDFBOX-4303 > Project: PDFBox > Issue Type: Bug > Components: AcroForm >Affects Versions: 2.0.11 >Reporter: simon steiner >Assignee: Maruan Sahyoun >Priority: Major > Labels: Appearance > Fix For: 2.0.14, 3.0.0 PDFBox > > Attachments: PDFBOX-4303-2.0.12.diff, ReaderModifiedForm.pdf > > > Due to change: > PDFBOX-3943: create /Helv and /ZaDb entries if they don't exist, regardless > if /DR existed or not > > was working ok in 2.0.7, in 2.0 branch > PDAcroForm > verifyOrCreateDefaults(): > is: > if (!defaultResources.getCOSObject().containsKey("Helv")) > should be checking key in the font dictionary before calling > defaultResources.put
[jira] [Commented] (PDFBOX-4393) PDF signature invalid after second interactive field signed
[ https://issues.apache.org/jira/browse/PDFBOX-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711393#comment-16711393 ] ASF subversion and git services commented on PDFBOX-4393: - Commit 1848300 from til...@apache.org in branch 'pdfbox/trunk' [ https://svn.apache.org/r1848300 ] PDFBOX-4303, PDFBOX-4393: add test > PDF signature invalid after second interactive field signed > --- > > Key: PDFBOX-4393 > URL: https://issues.apache.org/jira/browse/PDFBOX-4393 > Project: PDFBox > Issue Type: Bug > Components: Signing >Affects Versions: 2.0.12, 2.0.13 > Environment: Windows >Reporter: Martin Klíma >Priority: Major > Fix For: 2.0.14, 3.0.0 PDFBox > > Attachments: image-2018-12-04-11-14-01-586.png, > image-2018-12-04-11-14-32-676.png, streamserve_test_sig0.pdf, > streamserve_test_sig0_saved.pdf, streamserve_test_sig0_saved_signed by > PDFBox.pdf, streamserve_test_sig0_saved_signed by PDFBox_then_signed by > AR.pdf, streamserve_test_sig0_signed by PDFBoxNEW.pdf, > streamserve_test_sig0_signed by PDFBoxNEW_signed by AR.pdf, > streamserve_test_sig2.pdf, streamserve_test_sig3.pdf, > streamserve_test_sig3.signed.pdf > > > Hi guys, > I stumped on the problem with PDFBox and interactive field signing. I have > PDF generated with OpenText StreamServe with two interactive fields for > signing. See example 1 (streamserve_test_sig0.pdf) in attachement. > When I use Adobe Reader I can sign both of the visual fields just fine but > when I use PDFBox to sign one of this field the following signature is marked > as invalid. Doesn't matter if I use PDFbox or sign it manually with Adobe > Reader See example 3: > * streamserve_test_sig3.pdf (signed by PDFBox) - valid > * streamserve_test_sig3.signed.pdf (signed 2nd by Adobe Reader) - invalid > !image-2018-12-04-11-14-01-586.png|width=582,height=269! > > Also last example - when it´s signed first by Adobe Reader and then with > PDFBox the signature seems valid but it says the document was "certified". 
> See example 2 (streamserve_test_sig2.pdf) > !image-2018-12-04-11-14-32-676.png|width=591,height=283! > > What could be wrong? When is signed whole document with PDFBox it works just > fine. > Thanks for response, > Martin
[jira] [Commented] (PDFBOX-4393) PDF signature invalid after second interactive field signed
[ https://issues.apache.org/jira/browse/PDFBOX-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711391#comment-16711391 ] ASF subversion and git services commented on PDFBOX-4393: - Commit 1848299 from til...@apache.org in branch 'pdfbox/branches/2.0' [ https://svn.apache.org/r1848299 ] PDFBOX-4303, PDFBOX-4393: add test > PDF signature invalid after second interactive field signed > --- > > Key: PDFBOX-4393 > URL: https://issues.apache.org/jira/browse/PDFBOX-4393 > Project: PDFBox > Issue Type: Bug > Components: Signing >Affects Versions: 2.0.12, 2.0.13 > Environment: Windows >Reporter: Martin Klíma >Priority: Major > Fix For: 2.0.14, 3.0.0 PDFBox > > Attachments: image-2018-12-04-11-14-01-586.png, > image-2018-12-04-11-14-32-676.png, streamserve_test_sig0.pdf, > streamserve_test_sig0_saved.pdf, streamserve_test_sig0_saved_signed by > PDFBox.pdf, streamserve_test_sig0_saved_signed by PDFBox_then_signed by > AR.pdf, streamserve_test_sig0_signed by PDFBoxNEW.pdf, > streamserve_test_sig0_signed by PDFBoxNEW_signed by AR.pdf, > streamserve_test_sig2.pdf, streamserve_test_sig3.pdf, > streamserve_test_sig3.signed.pdf > > > Hi guys, > I stumped on the problem with PDFBox and interactive field signing. I have > PDF generated with OpenText StreamServe with two interactive fields for > signing. See example 1 (streamserve_test_sig0.pdf) in attachement. > When I use Adobe Reader I can sign both of the visual fields just fine but > when I use PDFBox to sign one of this field the following signature is marked > as invalid. Doesn't matter if I use PDFbox or sign it manually with Adobe > Reader See example 3: > * streamserve_test_sig3.pdf (signed by PDFBox) - valid > * streamserve_test_sig3.signed.pdf (signed 2nd by Adobe Reader) - invalid > !image-2018-12-04-11-14-01-586.png|width=582,height=269! > > Also last example - when it´s signed first by Adobe Reader and then with > PDFBox the signature seems valid but it says the document was "certified". 
> See example 2 (streamserve_test_sig2.pdf) > !image-2018-12-04-11-14-32-676.png|width=591,height=283! > > What could be wrong? When is signed whole document with PDFBox it works just > fine. > Thanks for response, > Martin
[jira] [Updated] (PDFBOX-4399) Printing invisible Content from PDF-Forms
[ https://issues.apache.org/jira/browse/PDFBOX-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Ziel updated PDFBOX-4399: Attachment: printed.png > Printing invisible Content from PDF-Forms > -- > > Key: PDFBOX-4399 > URL: https://issues.apache.org/jira/browse/PDFBOX-4399 > Project: PDFBox > Issue Type: Bug > Components: AcroForm, Rendering >Affects Versions: 2.0.6 >Reporter: Stefan Ziel >Priority: Major > Attachments: gs.png, original.pdf, printed.png > > > Printing PDF-documents with macro [^original.pdf] renders hidden content > [^printed.pdf] . Code used to print > {code:java} > InputStream sourceStream = new FileInputStream(pFile); > try { > PDDocument source = PDDocument.load(sourceStream); > job.setPageable(new PDFPageable(source)); > job.print(atts); > } finally { > sourceStream.close(); > } > {code} > This is not only a problem of PDFBox ;) but can be done right ... ghostscript > does it [^gs.png]. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Updated] (PDFBOX-4399) Printing invisible Content from PDF-Forms
[ https://issues.apache.org/jira/browse/PDFBOX-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Ziel updated PDFBOX-4399: Attachment: printed.pdf > Printing invisible Content from PDF-Forms > -- > > Key: PDFBOX-4399 > URL: https://issues.apache.org/jira/browse/PDFBOX-4399 > Project: PDFBox > Issue Type: Bug > Components: AcroForm, Rendering >Affects Versions: 2.0.6 >Reporter: Stefan Ziel >Priority: Major > Attachments: original.pdf, printed.pdf > > > Printing PDF-documents with macro renders hidden content. > {code:java} > InputStream sourceStream = new FileInputStream(pFile); > try { > PDDocument source = PDDocument.load(sourceStream); > job.setPageable(new PDFPageable(source)); > job.print(atts); > } finally { > sourceStream.close(); > } > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Updated] (PDFBOX-4399) Printing invisible Content from PDF-Forms
[ https://issues.apache.org/jira/browse/PDFBOX-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Ziel updated PDFBOX-4399: Attachment: gs.png > Printing invisible Content from PDF-Forms > -- > > Key: PDFBOX-4399 > URL: https://issues.apache.org/jira/browse/PDFBOX-4399 > Project: PDFBox > Issue Type: Bug > Components: AcroForm, Rendering >Affects Versions: 2.0.6 >Reporter: Stefan Ziel >Priority: Major > Attachments: gs.png, original.pdf, printed.pdf > > > Printing PDF-documents with macro [^original.pdf] renders hidden content > [^printed.pdf] . Code used to print > {code:java} > InputStream sourceStream = new FileInputStream(pFile); > try { > PDDocument source = PDDocument.load(sourceStream); > job.setPageable(new PDFPageable(source)); > job.print(atts); > } finally { > sourceStream.close(); > } > {code} > This is not only a problem of PDFBox ;) but can be done right ... ghostscript > does it. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Updated] (PDFBOX-4399) Printing invisible Content from PDF-Forms
[ https://issues.apache.org/jira/browse/PDFBOX-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Ziel updated PDFBOX-4399: Description: Printing PDF-documents with macro [^original.pdf] renders hidden content [^printed.pdf] . Code used to print {code:java} InputStream sourceStream = new FileInputStream(pFile); try { PDDocument source = PDDocument.load(sourceStream); job.setPageable(new PDFPageable(source)); job.print(atts); } finally { sourceStream.close(); } {code} This is not only a problem of PDFBox ;) but can be done right ... ghostscript does it [^gs.png]. was: Printing PDF-documents with macro [^original.pdf] renders hidden content [^printed.pdf] . Code used to print {code:java} InputStream sourceStream = new FileInputStream(pFile); try { PDDocument source = PDDocument.load(sourceStream); job.setPageable(new PDFPageable(source)); job.print(atts); } finally { sourceStream.close(); } {code} This is not only a problem of PDFBox ;) but can be done right ... ghostscript does it !gs.png! . > Printing invisible Content from PDF-Forms > -- > > Key: PDFBOX-4399 > URL: https://issues.apache.org/jira/browse/PDFBOX-4399 > Project: PDFBox > Issue Type: Bug > Components: AcroForm, Rendering >Affects Versions: 2.0.6 >Reporter: Stefan Ziel >Priority: Major > Attachments: gs.png, original.pdf, printed.pdf > > > Printing PDF-documents with macro [^original.pdf] renders hidden content > [^printed.pdf] . Code used to print > {code:java} > InputStream sourceStream = new FileInputStream(pFile); > try { > PDDocument source = PDDocument.load(sourceStream); > job.setPageable(new PDFPageable(source)); > job.print(atts); > } finally { > sourceStream.close(); > } > {code} > This is not only a problem of PDFBox ;) but can be done right ... ghostscript > does it [^gs.png]. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Updated] (PDFBOX-4399) Printing invisible Content from PDF-Forms
[ https://issues.apache.org/jira/browse/PDFBOX-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Ziel updated PDFBOX-4399: Description: Printing PDF-documents with macro [^original.pdf] renders hidden content [^printed.pdf] . Code used to print {code:java} InputStream sourceStream = new FileInputStream(pFile); try { PDDocument source = PDDocument.load(sourceStream); job.setPageable(new PDFPageable(source)); job.print(atts); } finally { sourceStream.close(); } {code} This is not only a problem of PDFBox ;) but can be done right ... ghostscript does it !gs.png! . was: Printing PDF-documents with macro [^original.pdf] renders hidden content [^printed.pdf] . Code used to print {code:java} InputStream sourceStream = new FileInputStream(pFile); try { PDDocument source = PDDocument.load(sourceStream); job.setPageable(new PDFPageable(source)); job.print(atts); } finally { sourceStream.close(); } {code} This is not only a problem of PDFBox ;) but can be done right ... ghostscript does it. > Printing invisible Content from PDF-Forms > -- > > Key: PDFBOX-4399 > URL: https://issues.apache.org/jira/browse/PDFBOX-4399 > Project: PDFBox > Issue Type: Bug > Components: AcroForm, Rendering >Affects Versions: 2.0.6 >Reporter: Stefan Ziel >Priority: Major > Attachments: gs.png, original.pdf, printed.pdf > > > Printing PDF-documents with macro [^original.pdf] renders hidden content > [^printed.pdf] . Code used to print > {code:java} > InputStream sourceStream = new FileInputStream(pFile); > try { > PDDocument source = PDDocument.load(sourceStream); > job.setPageable(new PDFPageable(source)); > job.print(atts); > } finally { > sourceStream.close(); > } > {code} > This is not only a problem of PDFBox ;) but can be done right ... ghostscript > does it !gs.png! . -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Commented] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances
[ https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711383#comment-16711383 ] Itai Shaked commented on PDFBOX-4392: - This may shed some more light on the issue - the next line (172) creates a new `Color` object, which itself calls `toRGB` on the given color space - this seems to have been done later as a fix for PDFBOX-3549. This explains why removing the first call to `toRGB` improves performance by 50% - it was being called twice! To me this seems like proof the first call is redundant - the exception will be thrown in the constructor of `Color`, if it passed the previous test. Since the call to `toRGB` seems to be extremely slow, perhaps it is wise to also reverse the order of the remaining two tests (creating a `Color` and a `ComponentColorModel`), so the slowest test is saved for last, and skipped if the profile is deemed problematic sooner. > PDF completely blow up the RAM on amazon instances > -- > > Key: PDFBOX-4392 > URL: https://issues.apache.org/jira/browse/PDFBOX-4392 > Project: PDFBox > Issue Type: Bug >Affects Versions: 2.0.12 >Reporter: Oleksandr Skoryi >Priority: Major > Fix For: 2.0.13 > > Attachments: 2f0f8f77-7a85-416d-b5d2-47a07d1416d4_3.pdf, > 4392-prereadICC.patch > > > Hi all > The issue is pretty straightforward. I receive a lot of pdfs every day and > render them. In most of the cases everything is OK, but PDFs which produces > WARN org.apache.pdfbox.pdmodel.graphics.color.PDICCBased - ICC profile is > Perceptual, ignoring, treating as Display class > working super long, and are super memory consumable. > It takes from 5 to 15 min on m5.large amazon instance. But attached PDF > completely killed the instance. The java process is just killed by linux > during processing with no exception in logs. > So could you please provide explanations what is going on with files with > WARN message above, and how can I improve the rendering.
> > Here is my VM options > -Dorg.apache.pdfbox.rendering.UsePureJavaCMYKConversion=true -Xmx3G -Xms2G > -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider" > Also don't hesitate to ask me about more PDF, I have tones of them :D > > And also a question, does GPU have influence on rendering?
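The double call described in the comment above can be checked outside PDFBox with a small sketch. It assumes only the JDK behaviour that the comment relies on, namely that the `java.awt.Color(ColorSpace, float[], float)` constructor converts its components via `toRGB`. The `CountingColorSpace` class and the `countCalls` helper are hypothetical names made up for this illustration; the color space is a trivial identity mapping, not an ICC profile, so it says nothing about how slow the real conversion is.

```java
import java.awt.Color;
import java.awt.color.ColorSpace;

// Hypothetical probe, not PDFBox code: counts how often toRGB is invoked.
final class ToRgbCallCount {

    // Trivial RGB color space that records every toRGB call.
    static final class CountingColorSpace extends ColorSpace {
        int toRgbCalls = 0;

        CountingColorSpace() {
            super(ColorSpace.TYPE_RGB, 3);
        }

        @Override
        public float[] toRGB(float[] colorvalue) {
            toRgbCalls++;
            return colorvalue.clone(); // identity mapping, for illustration only
        }

        @Override
        public float[] fromRGB(float[] rgbvalue) {
            return rgbvalue.clone();
        }

        @Override
        public float[] toCIEXYZ(float[] colorvalue) {
            return colorvalue.clone();
        }

        @Override
        public float[] fromCIEXYZ(float[] colorvalue) {
            return colorvalue.clone();
        }
    }

    // Mimics the two checks discussed in the thread: an optional explicit
    // toRGB probe, followed by constructing a Color from the color space.
    static int countCalls(boolean explicitProbe) {
        CountingColorSpace cs = new CountingColorSpace();
        float[] components = new float[cs.getNumComponents()];
        if (explicitProbe) {
            cs.toRGB(components);
        }
        new Color(cs, components, 1f);
        return cs.toRgbCalls;
    }

    public static void main(String[] args) {
        System.out.println("Color constructor only: " + countCalls(false));
        System.out.println("explicit probe + Color: " + countCalls(true));
    }
}
```

If the constructor alone already triggers one conversion, an explicit probe before it doubles the work, which matches the roughly 50% saving reported in the comment.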
[jira] [Commented] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances
[ https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711426#comment-16711426 ] Tilman Hausherr commented on PDFBOX-4392: - PDFBOX-1740 is not a typo - the file was mentioned by me and by John in PDFBOX-1893. I'll retest with KCMS and LCMS whether we can delete the call from line 170. I think the order doesn't matter because most files don't fail. > PDF completely blow up the RAM on amazon instances > -- > > Key: PDFBOX-4392 > URL: https://issues.apache.org/jira/browse/PDFBOX-4392 > Project: PDFBox > Issue Type: Bug >Affects Versions: 2.0.12 >Reporter: Oleksandr Skoryi >Priority: Major > Fix For: 2.0.13 > > Attachments: 2f0f8f77-7a85-416d-b5d2-47a07d1416d4_3.pdf, > 4392-prereadICC.patch > > > Hi all > The issue is pretty straightforward. I receive a lot of pdfs every day and > render them. In most of the cases everything is OK, but PDFs which produces > WARN org.apache.pdfbox.pdmodel.graphics.color.PDICCBased - ICC profile is > Perceptual, ignoring, treating as Display class > working super long, and are super memory consumable. > It takes from 5 to 15 min on m5.large amazon instance. But attached PDF > completely killed the instance. The java process is just killed by linux > during processing with no exception in logs. > So could you please provide explanations what is going on with files with > WARN message above, and how can I improve the rendering. > > Here is my VM options > -Dorg.apache.pdfbox.rendering.UsePureJavaCMYKConversion=true -Xmx3G -Xms2G > -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider" > Also don't hesitate to ask me about more PDF, I have tones of them :D > > And also a question, does GPU have influence on rendering? -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Created] (PDFBOX-4399) Printing invisible Content from PDF-Forms
Stefan Ziel created PDFBOX-4399: --- Summary: Printing invisible Content from PDF-Forms Key: PDFBOX-4399 URL: https://issues.apache.org/jira/browse/PDFBOX-4399 Project: PDFBox Issue Type: Bug Components: AcroForm, Rendering Affects Versions: 2.0.6 Reporter: Stefan Ziel Printing PDF-documents with macro renders hidden content. {code:java} InputStream sourceStream = new FileInputStream(pFile); try { PDDocument source = PDDocument.load(sourceStream); job.setPageable(new PDFPageable(source)); job.print(atts); } finally { sourceStream.close(); } {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Updated] (PDFBOX-4399) Printing invisible Content from PDF-Forms
[ https://issues.apache.org/jira/browse/PDFBOX-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Ziel updated PDFBOX-4399: Attachment: (was: printed.pdf) > Printing invisible Content from PDF-Forms > -- > > Key: PDFBOX-4399 > URL: https://issues.apache.org/jira/browse/PDFBOX-4399 > Project: PDFBox > Issue Type: Bug > Components: AcroForm, Rendering >Affects Versions: 2.0.6 >Reporter: Stefan Ziel >Priority: Major > Attachments: gs.png, original.pdf, printed.png > > > Printing PDF-documents with macro [^original.pdf] renders hidden content > [^printed.pdf] . Code used to print > {code:java} > InputStream sourceStream = new FileInputStream(pFile); > try { > PDDocument source = PDDocument.load(sourceStream); > job.setPageable(new PDFPageable(source)); > job.print(atts); > } finally { > sourceStream.close(); > } > {code} > This is not only a problem of PDFBox ;) but can be done right ... ghostscript > does it [^gs.png]. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Resolved] (PDFBOX-4303) Helv and ZaDb overridden
[ https://issues.apache.org/jira/browse/PDFBOX-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tilman Hausherr resolved PDFBOX-4303. - Resolution: Fixed Assignee: Tilman Hausherr (was: Maruan Sahyoun) I created a different test than the proposed one to work without a file. > Helv and ZaDb overridden > > > Key: PDFBOX-4303 > URL: https://issues.apache.org/jira/browse/PDFBOX-4303 > Project: PDFBox > Issue Type: Bug > Components: AcroForm >Affects Versions: 2.0.11 >Reporter: simon steiner >Assignee: Tilman Hausherr >Priority: Major > Labels: Appearance > Fix For: 2.0.14, 3.0.0 PDFBox > > Attachments: PDFBOX-4303-2.0.12.diff, ReaderModifiedForm.pdf > > > Due to change: > PDFBOX-3943: create /Helv and /ZaDb entries if they don't exist, regardless > if /DR existed or not > > was working ok in 2.0.7, in 2.0 branch > PDAcroForm > verifyOrCreateDefaults(): > is: > if (!defaultResources.getCOSObject().containsKey("Helv")) > should be checking key in the font dictionary before calling > defaultResources.put
[jira] [Updated] (PDFBOX-4399) Printing invisible Content from PDF-Forms
[ https://issues.apache.org/jira/browse/PDFBOX-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Ziel updated PDFBOX-4399: Description: Printing PDF-documents with macro [^original.pdf] renders hidden content [^printed.pdf] . Code used to print {code:java} InputStream sourceStream = new FileInputStream(pFile); try { PDDocument source = PDDocument.load(sourceStream); job.setPageable(new PDFPageable(source)); job.print(atts); } finally { sourceStream.close(); } {code} This is not only a problem of PDFBox ;) but can be done right ... ghostscript does it. was: Printing PDF-documents with macro renders hidden content. {code:java} InputStream sourceStream = new FileInputStream(pFile); try { PDDocument source = PDDocument.load(sourceStream); job.setPageable(new PDFPageable(source)); job.print(atts); } finally { sourceStream.close(); } {code} > Printing invisible Content from PDF-Forms > -- > > Key: PDFBOX-4399 > URL: https://issues.apache.org/jira/browse/PDFBOX-4399 > Project: PDFBox > Issue Type: Bug > Components: AcroForm, Rendering >Affects Versions: 2.0.6 >Reporter: Stefan Ziel >Priority: Major > Attachments: original.pdf, printed.pdf > > > Printing PDF-documents with macro [^original.pdf] renders hidden content > [^printed.pdf] . Code used to print > {code:java} > InputStream sourceStream = new FileInputStream(pFile); > try { > PDDocument source = PDDocument.load(sourceStream); > job.setPageable(new PDFPageable(source)); > job.print(atts); > } finally { > sourceStream.close(); > } > {code} > This is not only a problem of PDFBox ;) but can be done right ... ghostscript > does it. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Updated] (PDFBOX-4399) Printing invisible Content from PDF-Forms
[ https://issues.apache.org/jira/browse/PDFBOX-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Ziel updated PDFBOX-4399: Description: Printing PDF-documents with macro [^original.pdf] renders hidden content [^printed.png] . Code used to print {code:java} InputStream sourceStream = new FileInputStream(pFile); try { PDDocument source = PDDocument.load(sourceStream); job.setPageable(new PDFPageable(source)); job.print(atts); } finally { sourceStream.close(); } {code} This is not only a problem of PDFBox ;) but can be done right ... ghostscript does it [^gs.png]. was: Printing PDF-documents with macro [^original.pdf] renders hidden content [^printed.pdf] . Code used to print {code:java} InputStream sourceStream = new FileInputStream(pFile); try { PDDocument source = PDDocument.load(sourceStream); job.setPageable(new PDFPageable(source)); job.print(atts); } finally { sourceStream.close(); } {code} This is not only a problem of PDFBox ;) but can be done right ... ghostscript does it [^gs.png]. > Printing invisible Content from PDF-Forms > -- > > Key: PDFBOX-4399 > URL: https://issues.apache.org/jira/browse/PDFBOX-4399 > Project: PDFBox > Issue Type: Bug > Components: AcroForm, Rendering >Affects Versions: 2.0.6 >Reporter: Stefan Ziel >Priority: Major > Attachments: gs.png, original.pdf, printed.png > > > Printing PDF-documents with macro [^original.pdf] renders hidden content > [^printed.png] . Code used to print > {code:java} > InputStream sourceStream = new FileInputStream(pFile); > try { > PDDocument source = PDDocument.load(sourceStream); > job.setPageable(new PDFPageable(source)); > job.print(atts); > } finally { > sourceStream.close(); > } > {code} > This is not only a problem of PDFBox ;) but can be done right ... ghostscript > does it [^gs.png]. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Commented] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances
[ https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711613#comment-16711613 ]

Tilman Hausherr commented on PDFBOX-4392:
-----------------------------------------

ComponentColorModel construction is still needed when using LCMS. Everything else isn't needed when using LCMS.

> PDF completely blow up the RAM on amazon instances
> --------------------------------------------------
>
>                 Key: PDFBOX-4392
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4392
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 2.0.12
>            Reporter: Oleksandr Skoryi
>            Priority: Major
>             Fix For: 2.0.13
>
>         Attachments: 2f0f8f77-7a85-416d-b5d2-47a07d1416d4_3.pdf, 4392-prereadICC.patch
>
>
> Hi all
> The issue is pretty straightforward. I receive a lot of pdfs every day and render them. In most of the cases everything is OK, but PDFs which produces
> WARN org.apache.pdfbox.pdmodel.graphics.color.PDICCBased - ICC profile is Perceptual, ignoring, treating as Display class
> working super long, and are super memory consumable.
> It takes from 5 to 15 min on m5.large amazon instance. But attached PDF completely killed the instance. The java process is just killed by linux during processing with no exception in logs.
> So could you please provide explanations what is going on with files with WARN message above, and how can I improve the rendering.
>
> Here is my VM options
> -Dorg.apache.pdfbox.rendering.UsePureJavaCMYKConversion=true -Xmx3G -Xms2G -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider"
> Also don't hesitate to ask me about more PDF, I have tones of them :D
>
> And also a question, does GPU have influence on rendering?

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Updated] (PDFBOX-4399) Printing invisible Content from PDF-Forms
[ https://issues.apache.org/jira/browse/PDFBOX-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr updated PDFBOX-4399:
------------------------------------
    Affects Version/s: 2.0.13
[jira] [Commented] (PDFBOX-4399) Printing invisible Content from PDF-Forms
[ https://issues.apache.org/jira/browse/PDFBOX-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711813#comment-16711813 ]

Tilman Hausherr commented on PDFBOX-4399:
-----------------------------------------

Your PDF contains Javascript and optional content groups. (I didn't test it because of Javascript). Chrome can display it, Edge and PDF.js can't. Neither can we.
[jira] [Commented] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances
[ https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711704#comment-16711704 ]

ASF subversion and git services commented on PDFBOX-4392:
---------------------------------------------------------

Commit 1848341 from til...@apache.org in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1848341 ]

PDFBOX-4392: work around java CMS bugs depending of the jdk version / the CMS type
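The commit message above mentions a workaround that depends on the CMS type. As a rough illustration only (this is not the code from r1848341), the CMS requested on the command line can be read back from the sun.java2d.cmm system property that appears in the reporter's VM options. The class and method names below are hypothetical, and the sketch assumes that an unset property means the JDK default, which should be LCMS on OpenJDK 8 and later.

```java
import java.util.Locale;

/** Hypothetical helper, not part of PDFBox: guesses which CMS was requested. */
final class CmsTypeSketch {
    private CmsTypeSketch() {
    }

    /**
     * Returns true if the given value of the sun.java2d.cmm property names the
     * KCMS provider. A null or empty value is treated as "JDK default"
     * (assumed here to be LCMS).
     */
    static boolean isKcmsRequested(String cmmProperty) {
        return cmmProperty != null
                && cmmProperty.toLowerCase(Locale.ROOT).contains("kcms");
    }

    public static void main(String[] args) {
        String cmm = System.getProperty("sun.java2d.cmm");
        System.out.println("sun.java2d.cmm = " + cmm);
        System.out.println(isKcmsRequested(cmm)
                ? "KCMS requested"
                : "JDK default CMS (assumed LCMS)");
    }
}
```

A workaround could then be switched on or off by combining such a check with a JDK version test, so that code paths needed only for one CMS are skipped for the other.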
[jira] [Commented] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances
[ https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711705#comment-16711705 ]

ASF subversion and git services commented on PDFBOX-4392:
---------------------------------------------------------

Commit 1848342 from til...@apache.org in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1848342 ]

PDFBOX-4392: work around java CMS bugs depending of the jdk version / the CMS type
[jira] [Commented] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances
[ https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711722#comment-16711722 ]

Tilman Hausherr commented on PDFBOX-4392:
-----------------------------------------

{{isMinJdk8()}} is now duplicated (it is also in PDFRenderer). I don't have a good idea where to put it or how to name it.
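One possible way to avoid duplicating the check mentioned above would be a small shared utility. The following is a minimal sketch under stated assumptions, not the actual PDFBox code: the class name JavaVersionSketch and the methods majorVersion and isMinJdk are hypothetical. It relies on the java.specification.version system property, which is "1.8" on Java 8 and a plain number such as "9" or "11" on later releases.

```java
/** Hypothetical shared helper, not part of PDFBox. */
final class JavaVersionSketch {
    private JavaVersionSketch() {
    }

    /** Parses a java.specification.version value such as "1.7", "1.8", "9" or "11". */
    static int majorVersion(String specVersion) {
        String[] parts = specVersion.trim().split("\\.");
        int first = Integer.parseInt(parts[0]);
        // Up to Java 8 the value has the form "1.x"; from Java 9 on it is just "x".
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    /** Returns true if the running JVM is at least the given major version. */
    static boolean isMinJdk(int major) {
        return majorVersion(System.getProperty("java.specification.version")) >= major;
    }

    public static void main(String[] args) {
        System.out.println("Java 8 or later: " + isMinJdk(8));
    }
}
```

Taking the minimum version as a parameter, instead of hard-coding 8 in the method name, would also sidestep part of the naming question, since both callers could share one method.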
[jira] [Comment Edited] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances
[ https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707114#comment-16707114 ]

Tilman Hausherr edited comment on PDFBOX-4392 at 12/6/18 2:03 PM:
------------------------------------------------------------------

On my PC it takes 106 seconds to render in the -"ridiculous speed"- "ultimate speed" mode of Windows 10, I have set -Xmx2g. Yes GPU may be relevant. The warning is not really important.

was (Author: tilman):
On my PC it takes 106 seconds to render in the "ultimate speed" mode of Windows 10, I have set -Xmx2g. Yes GPU may be relevant. The warning is not really important.
[jira] [Commented] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances
[ https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711504#comment-16711504 ]

ASF subversion and git services commented on PDFBOX-4392:
---------------------------------------------------------

Commit 1848318 from til...@apache.org in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1848318 ]

PDFBOX-4392: remove toRGB() which is contained in Color() construction, thanks Itai Shaked
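The commit above drops an explicit toRGB() call because constructing a java.awt.Color from a color space already performs that conversion. The sketch below illustrates only that JDK behaviour and is not PDFBox code: CountingSpace is a hypothetical stand-in color space whose conversions are trivial placeholders, used so the number of toRGB() invocations can be observed without a real ICC profile.

```java
import java.awt.Color;
import java.awt.color.ColorSpace;

/** Hypothetical demo, not PDFBox code. */
final class ColorToRgbSketch {
    private ColorToRgbSketch() {
    }

    /** Trivial RGB color space that counts how often toRGB() is invoked. */
    static final class CountingSpace extends ColorSpace {
        private static final long serialVersionUID = 1L;
        int toRgbCalls = 0;

        CountingSpace() {
            super(ColorSpace.TYPE_RGB, 3);
        }

        @Override
        public float[] toRGB(float[] value) {
            toRgbCalls++;
            return value.clone();
        }

        @Override
        public float[] fromRGB(float[] value) {
            return value.clone();
        }

        // Placeholder conversions; not colorimetrically meaningful.
        @Override
        public float[] toCIEXYZ(float[] value) {
            return value.clone();
        }

        @Override
        public float[] fromCIEXYZ(float[] value) {
            return value.clone();
        }
    }

    /** Builds one Color from the counting space and reports the toRGB() call count. */
    static int countToRgbCalls() {
        // Avoid needing a display when the AWT classes are initialised.
        System.setProperty("java.awt.headless", "true");
        CountingSpace cs = new CountingSpace();
        Color color = new Color(cs, new float[] {1f, 0f, 0f}, 1f);
        color.getRGB(); // reads the packed value computed in the constructor
        return cs.toRgbCalls;
    }

    public static void main(String[] args) {
        System.out.println("toRGB() calls during Color construction: " + countToRgbCalls());
    }
}
```

With a real ICC-based color space each toRGB() call goes through the CMS, so converting once in the constructor rather than twice could matter for pages that set colors very often.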
[jira] [Commented] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances
[ https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711528#comment-16711528 ]

Tilman Hausherr commented on PDFBOX-4392:
-----------------------------------------

I'll retest with KCMS and LCMS whether the ComponentColorModel construction is still needed. And also whether the "new Color" is needed on LCMS.