[jira] [Commented] (PDFBOX-4396) Memory leak due to soft reference caching

2018-12-06 Thread Ben Manes (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711099#comment-16711099
 ] 

Ben Manes commented on PDFBOX-4396:
---

This is probably covered in your referenced tickets. In ScratchFileBuffer it 
states,
{code:java}
/**
 * While calling finalize is normally discouraged we will have to
 * use it here as long as closing a scratch file buffer is not 
 * done in every case. Currently {@link COSStream} creates new
 * buffers without closing the old one - which might still be
 * used.
 * 
 * Enabling debugging one will see if there are still cases
 * where the buffer is not closed.
 */{code}
I wasn't able to reproduce the problem in isolation on the PDF document that 
failed (450 MB, 999 pages). I could process it locally in ~12 minutes, the same 
as ghostscript. The failure may be due to the additional load on the machine: 
the processing is CPU heavy, I process multiple PDFs and pages in parallel, and 
there is other incoming work. As G1 is quota driven, the CPU thrashing likely 
keeps it from finishing its work within the desired timeframes. When it 
exhausts its quota and is unable to keep up, that would eventually lead to an 
OOME. Since Java lacks functioning thread priorities, we can't de-emphasize 
application threads in favor of the collector. If G1 has moved away from 
stop-the-world to failing, then it cannot recover in this scenario. Since G1 
has constantly changed, it's hard to pinpoint, as descriptions from years ago 
are no longer accurate, and likely they optimized against handling this case, 
preferring that the application be fixed to behave better.

So far my fixes do seem to be chugging along and are past the failure point, 
but the run still has more work before it's in the clear. I disabled caching 
(no obvious perf hit), discard the PDDocument every 25 pages, and call GC each 
time a PDDocument is closed. I may look into using ghostscript and lambda 
functions instead, to distribute the work and offload it from the application 
servers.
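
For anyone wanting to try the same batching, here is a minimal sketch of just 
the page-range arithmetic. PageBatches and split are hypothetical names, not 
PDFBox API. The idea is that each batch would get its own PDDocument, which is 
closed before the next batch starts; the PDFBox calls themselves are left out 
so the snippet stays self-contained.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of PDFBox: splits a page count into fixed-size batches. */
final class PageBatches {
    /** Returns {start, endExclusive} pairs covering pages 0..pageCount-1 in chunks of batchSize. */
    static List<int[]> split(int pageCount, int batchSize) {
        if (pageCount < 0 || batchSize <= 0) {
            throw new IllegalArgumentException("pageCount must be >= 0 and batchSize > 0");
        }
        List<int[]> batches = new ArrayList<>();
        for (int start = 0; start < pageCount; start += batchSize) {
            // The last batch may be shorter than batchSize.
            batches.add(new int[] {start, Math.min(start + batchSize, pageCount)});
        }
        return batches;
    }
}
```

With 999 pages and a batch size of 25 this gives 40 batches, the last one 
covering pages 975 to 998.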

With regard to JDK10, there are some build tools not yet JDK11 compatible that 
I am waiting on. It takes some work to be JDK9 compatible, though 9=>10 was 
effortless. The 11 transition is more work due to additional module removals. I 
have 11 prototyped, but it's stuck on an infinite compilation bug, due to using 
Gradle 4.x (incompatible) and a plugin not yet released with Gradle 5 support.

> Memory leak due to soft reference caching
> -
>
> Key: PDFBOX-4396
> URL: https://issues.apache.org/jira/browse/PDFBOX-4396
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 2.0.12
> Environment: JDK10; G1
>Reporter: Ben Manes
>Priority: Major
> Attachments: #2 - memory leak 2.png, #2 - memory leak.png, memory 
> leak 2.png, memory leak.png
>
>
> In a heap dump, it appears that DefaultResourceCache is retaining 5.3 GB of 
> memory due to buffered images (via PDImageXObject). I suspect that G1 is not 
> collecting soft references across all regions before it throws an 
> out-of-memory error.
> In PDFBOX-4389, I discovered very slow PDDocument#load times due to a JDK10 
> I/O bug. Previously I was loading the document to render each page, but this 
> took 1.5 minutes. To work around that bug I reused the document instance 
> across pages. This seems to have failed because the pages were cached and not 
> cleared by the GC.
> The DefaultResourceCache does not prune its cache entries when the soft 
> references are collected. Like WeakHashMap, it should use a ReferenceQueue, 
> poll it on every access, and prune accordingly.
> Thankfully PDDocument#setResourceCache exists. For now I am going to reset 
> the cache to a new instance after a page has been rendered. The entries 
> should no longer be reachable and be GC'd more aggressively. If that doesn't 
> work, I'll either replace the cache (e.g. with Caffeine) or disable it by 
> setting the instance to null.
> I think the desired fix is to prune the DefaultResourceCache and, ideally, 
> reconsider usage of soft references (as they tend to be poor in practice). 
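
The ReferenceQueue pruning suggested above could look roughly like the sketch 
below. This is not the DefaultResourceCache code; PruningSoftCache and KeyedRef 
are made-up names, and only java.lang.ref and java.util classes are used. Each 
soft reference remembers its key, and every access first drains the queue and 
removes map entries whose values the GC has already cleared.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch, not PDFBox code: a soft-value cache that prunes cleared entries. */
final class PruningSoftCache<K, V> {
    /** Soft reference that remembers its key so the stale map entry can be found later. */
    private static final class KeyedRef<K, V> extends SoftReference<V> {
        final K key;

        KeyedRef(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private final Map<K, KeyedRef<K, V>> map = new HashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    synchronized V get(K key) {
        prune();
        KeyedRef<K, V> ref = map.get(key);
        return ref == null ? null : ref.get();
    }

    synchronized void put(K key, V value) {
        prune();
        map.put(key, new KeyedRef<>(key, value, queue));
    }

    synchronized int size() {
        prune();
        return map.size();
    }

    /** Drains the queue and drops entries whose referent was collected. */
    private void prune() {
        Reference<? extends V> ref;
        while ((ref = queue.poll()) != null) {
            @SuppressWarnings("unchecked")
            KeyedRef<K, V> keyed = (KeyedRef<K, V>) ref;
            // Two-argument remove: only drop the entry if it still maps to this stale reference.
            map.remove(keyed.key, keyed);
        }
    }
}
```

Without the prune step the map keeps one empty SoftReference per key forever, 
which is the growth described in the report.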



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[jira] [Closed] (PDFBOX-3388) PDFTextStripper - ScratchFileBuffer not closed!

2018-12-06 Thread Tilman Hausherr (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr closed PDFBOX-3388.
---
Resolution: Duplicate

Closing as duplicate of PDFBOX-3359. I've added you as a watcher there.

> PDFTextStripper - ScratchFileBuffer not closed!
> ---
>
> Key: PDFBOX-3388
> URL: https://issues.apache.org/jira/browse/PDFBOX-3388
> Project: PDFBox
>  Issue Type: Bug
>Reporter: Roman Pichlik
>Priority: Major
> Attachments: CloseablePDFParser.java, PDFStripperTest.java, test.pdf
>
>
> _PDFTextStripper_ or inherently used classes probably do not close all opened 
> streams under all circumstances. You can reproduce that by the following 
> snippet of code and the attached PDF file.
> {code}
> try (RandomAccessBuffer rab = new RandomAccessBuffer(is)) {
>     PDFParser parser = new PDFParser(rab);
>     parser.parse();
>     try (COSDocument cosDoc = parser.getDocument();
>          PDDocument pdDoc = new PDDocument(cosDoc)) {
>         PDFTextStripper pdfStripper = new PDFTextStripper();
>         pdfStripper.getText(pdDoc);
>     }
> } catch (IOException e) {
>     throw new RuntimeException(e);
> }
> {code}






[jira] [Created] (PDFBOX-4398) getLastSignatureDictionary modifies internal structure of PDDocument

2018-12-06 Thread beat weisskopf (JIRA)
beat weisskopf created PDFBOX-4398:
--

 Summary: getLastSignatureDictionary modifies internal structure of 
PDDocument
 Key: PDFBOX-4398
 URL: https://issues.apache.org/jira/browse/PDFBOX-4398
 Project: PDFBox
  Issue Type: Bug
  Components: AcroForm
Affects Versions: 2.0.12
Reporter: beat weisskopf


If one calls PDDocument#getLastSignatureDictionary, the AcroForm is populated 
with the defaults even if not needed. This modifies the internals of the 
PDDocument, so there are changes to be saved even if the file has not been 
modified by "real" changes.

For example:
{code}
PDDocument pdfDocument = PDDocument.load(pdfBytes);
pdfDocument.getLastSignatureDictionary();
{code}
This calls the verifyOrCreateDefaults() method, which initializes the 
DR-Dictionary if not yet done. This is done even if getLastSignatureDictionary 
returns null.

Why this bothers me: it is very unexpected behaviour that a getter modifies an 
object's state. This is no big deal for our use case; the other issue 
(PDFBOX-4303) was a bigger problem as we are diffing objects between revisions 
(current vs last signed revision).
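
To make the complaint concrete, here is a toy model of the pattern. It is not 
PDFBox code: ToyDocument, its fields and isModified are invented for 
illustration, and only the method name verifyOrCreateDefaults is borrowed from 
the description above.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model, not PDFBox code: a getter that lazily creates defaults and so dirties the object. */
final class ToyDocument {
    private final Map<String, String> defaults = new HashMap<>();
    private boolean modified;

    /** Mimics the reported behaviour: defaults are created even though null is returned. */
    String getLastSignature() {
        verifyOrCreateDefaults();
        return null; // this toy document has no signature
    }

    private void verifyOrCreateDefaults() {
        if (!defaults.containsKey("DR")) {
            defaults.put("DR", "created");
            modified = true; // the side effect a caller of a getter would not expect
        }
    }

    boolean isModified() {
        return modified;
    }
}
```

A caller that only reads the signature ends up with an object that reports 
pending changes, which is what makes diffing between revisions harder.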






[jira] [Commented] (PDFBOX-4393) PDF signature invalid after second interactive field signed

2018-12-06 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711308#comment-16711308
 ] 

Tilman Hausherr commented on PDFBOX-4393:
-

[~Pooky] please give feedback on whether the workaround helped; try also the 
snapshot at
https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/2.0.14-SNAPSHOT/
that would make the workaround obsolete.

> PDF signature invalid after second interactive field signed
> ---
>
> Key: PDFBOX-4393
> URL: https://issues.apache.org/jira/browse/PDFBOX-4393
> Project: PDFBox
>  Issue Type: Bug
>  Components: Signing
>Affects Versions: 2.0.12, 2.0.13
> Environment: Windows
>Reporter: Martin Klíma
>Priority: Major
> Fix For: 2.0.14, 3.0.0 PDFBox
>
> Attachments: image-2018-12-04-11-14-01-586.png, 
> image-2018-12-04-11-14-32-676.png, streamserve_test_sig0.pdf, 
> streamserve_test_sig0_saved.pdf, streamserve_test_sig0_saved_signed by 
> PDFBox.pdf, streamserve_test_sig0_saved_signed by PDFBox_then_signed by 
> AR.pdf, streamserve_test_sig0_signed by PDFBoxNEW.pdf, 
> streamserve_test_sig0_signed by PDFBoxNEW_signed by AR.pdf, 
> streamserve_test_sig2.pdf, streamserve_test_sig3.pdf, 
> streamserve_test_sig3.signed.pdf
>
>
> Hi guys,
> I stumbled on a problem with PDFBox and interactive field signing. I have a 
> PDF generated with OpenText StreamServe with two interactive fields for 
> signing. See example 1 (streamserve_test_sig0.pdf) in the attachments.
> When I use Adobe Reader I can sign both of the visual fields just fine, but 
> when I use PDFBox to sign one of these fields, the following signature is 
> marked as invalid. It doesn't matter whether I use PDFBox or sign it manually 
> with Adobe Reader. See example 3: 
>  *  streamserve_test_sig3.pdf (signed by PDFBox) - valid
>  *  streamserve_test_sig3.signed.pdf (signed 2nd by Adobe Reader) - invalid
> !image-2018-12-04-11-14-01-586.png|width=582,height=269!
>  
> One last example: when it's signed first by Adobe Reader and then with 
> PDFBox, the signature seems valid but it says the document was "certified". 
> See example 2 (streamserve_test_sig2.pdf).
> !image-2018-12-04-11-14-32-676.png|width=591,height=283!
>  
> What could be wrong? When the whole document is signed with PDFBox, it works 
> just fine.
> Thanks for your response,
> Martin






[jira] [Commented] (PDFBOX-4398) getLastSignatureDictionary modifies internal structure of PDDocument

2018-12-06 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711279#comment-16711279
 ] 

Tilman Hausherr commented on PDFBOX-4398:
-

This was introduced in PDFBOX-3732 and also badly needed by PDFBOX-4393. Yes 
such side effects are usually a no-no but we replicated the behavior of Adobe. 
See also the ending comment by Maruan in PDFBOX-3732.

From your text I understand that it bothers you, but it doesn't break 
anything. So I'd just target this to 3.0, set it to minor and keep 2.x as is. 
I won't do anything for now.

> getLastSignatureDictionary modifies internal structure of PDDocument
> 
>
> Key: PDFBOX-4398
> URL: https://issues.apache.org/jira/browse/PDFBOX-4398
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm
>Affects Versions: 2.0.12
>Reporter: beat weisskopf
>Priority: Minor
> Fix For: 3.0.0 PDFBox
>
>
> If one calls PDDocument#getLastSignatureDictionary, the AcroForm is populated 
> with the defaults even if not needed. This modifies the internals of the 
> PDDocument, so there are changes to be saved even if the file has not been 
> modified by "real" changes.
> For example:
> {code}
> PDDocument pdfDocument = PDDocument.load(pdfBytes);
> pdfDocument.getLastSignatureDictionary();
> {code}
> This calls the verifyOrCreateDefaults() method, which initializes the 
> DR-Dictionary if not yet done. This is done even if 
> getLastSignatureDictionary returns null.
> Why this bothers me: it is very unexpected behaviour that a getter modifies 
> an object's state. This is no big deal for our use case; the other issue 
> (PDFBOX-4303) was a bigger problem as we are diffing objects between 
> revisions (current vs last signed revision).






[jira] [Updated] (PDFBOX-4398) getLastSignatureDictionary modifies internal structure of PDDocument

2018-12-06 Thread Tilman Hausherr (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-4398:

Fix Version/s: 3.0.0 PDFBox

> getLastSignatureDictionary modifies internal structure of PDDocument
> 
>
> Key: PDFBOX-4398
> URL: https://issues.apache.org/jira/browse/PDFBOX-4398
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm
>Affects Versions: 2.0.12
>Reporter: beat weisskopf
>Priority: Minor
> Fix For: 3.0.0 PDFBox
>
>
> If one calls PDDocument#getLastSignatureDictionary, the AcroForm is populated 
> with the defaults even if not needed. This modifies the internals of the 
> PDDocument, so there are changes to be saved even if the file has not been 
> modified by "real" changes.
> For example:
> {code}
> PDDocument pdfDocument = PDDocument.load(pdfBytes);
> pdfDocument.getLastSignatureDictionary();
> {code}
> This calls the verifyOrCreateDefaults() method, which initializes the 
> DR-Dictionary if not yet done. This is done even if 
> getLastSignatureDictionary returns null.
> Why this bothers me: it is very unexpected behaviour that a getter modifies 
> an object's state. This is no big deal for our use case; the other issue 
> (PDFBOX-4303) was a bigger problem as we are diffing objects between 
> revisions (current vs last signed revision).






[jira] [Comment Edited] (PDFBOX-4393) PDF signature invalid after second interactive field signed

2018-12-06 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710060#comment-16710060
 ] 

Tilman Hausherr edited comment on PDFBOX-4393 at 12/6/18 11:14 AM:
---

Commit 1848213 from til...@apache.org in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1848213 ]

PDFBOX-4303, PDFBOX-4393: mark updated when applicable; check whether font 
dictionary exists; avoid bug that wrong dictionary was checked for font


was (Author: jira-bot):
Commit 1848213 from til...@apache.org in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1848213 ]

PDFBOX-4393: mark updated when applicable; check whether font dictionary 
exists; avoid bug that wrong dictionary was checked for font

> PDF signature invalid after second interactive field signed
> ---
>
> Key: PDFBOX-4393
> URL: https://issues.apache.org/jira/browse/PDFBOX-4393
> Project: PDFBox
>  Issue Type: Bug
>  Components: Signing
>Affects Versions: 2.0.12, 2.0.13
> Environment: Windows
>Reporter: Martin Klíma
>Priority: Major
> Fix For: 2.0.14, 3.0.0 PDFBox
>
> Attachments: image-2018-12-04-11-14-01-586.png, 
> image-2018-12-04-11-14-32-676.png, streamserve_test_sig0.pdf, 
> streamserve_test_sig0_saved.pdf, streamserve_test_sig0_saved_signed by 
> PDFBox.pdf, streamserve_test_sig0_saved_signed by PDFBox_then_signed by 
> AR.pdf, streamserve_test_sig0_signed by PDFBoxNEW.pdf, 
> streamserve_test_sig0_signed by PDFBoxNEW_signed by AR.pdf, 
> streamserve_test_sig2.pdf, streamserve_test_sig3.pdf, 
> streamserve_test_sig3.signed.pdf
>
>
> Hi guys,
> I stumbled on a problem with PDFBox and interactive field signing. I have a 
> PDF generated with OpenText StreamServe with two interactive fields for 
> signing. See example 1 (streamserve_test_sig0.pdf) in the attachments.
> When I use Adobe Reader I can sign both of the visual fields just fine, but 
> when I use PDFBox to sign one of these fields, the following signature is 
> marked as invalid. It doesn't matter whether I use PDFBox or sign it manually 
> with Adobe Reader. See example 3: 
>  *  streamserve_test_sig3.pdf (signed by PDFBox) - valid
>  *  streamserve_test_sig3.signed.pdf (signed 2nd by Adobe Reader) - invalid
> !image-2018-12-04-11-14-01-586.png|width=582,height=269!
>  
> One last example: when it's signed first by Adobe Reader and then with 
> PDFBox, the signature seems valid but it says the document was "certified". 
> See example 2 (streamserve_test_sig2.pdf).
> !image-2018-12-04-11-14-32-676.png|width=591,height=283!
>  
> What could be wrong? When the whole document is signed with PDFBox, it works 
> just fine.
> Thanks for your response,
> Martin






[jira] [Comment Edited] (PDFBOX-4393) PDF signature invalid after second interactive field signed

2018-12-06 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710059#comment-16710059
 ] 

Tilman Hausherr edited comment on PDFBOX-4393 at 12/6/18 11:14 AM:
---

Commit 1848212 from til...@apache.org in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1848212 ]

PDFBOX-4303, PDFBOX-4393: mark updated when applicable; check whether font 
dictionary exists; avoid bug that wrong dictionary was checked for font


was (Author: jira-bot):
Commit 1848212 from til...@apache.org in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1848212 ]

PDFBOX-4393: mark updated when applicable; check whether font dictionary 
exists; avoid bug that wrong dictionary was checked for font

> PDF signature invalid after second interactive field signed
> ---
>
> Key: PDFBOX-4393
> URL: https://issues.apache.org/jira/browse/PDFBOX-4393
> Project: PDFBox
>  Issue Type: Bug
>  Components: Signing
>Affects Versions: 2.0.12, 2.0.13
> Environment: Windows
>Reporter: Martin Klíma
>Priority: Major
> Fix For: 2.0.14, 3.0.0 PDFBox
>
> Attachments: image-2018-12-04-11-14-01-586.png, 
> image-2018-12-04-11-14-32-676.png, streamserve_test_sig0.pdf, 
> streamserve_test_sig0_saved.pdf, streamserve_test_sig0_saved_signed by 
> PDFBox.pdf, streamserve_test_sig0_saved_signed by PDFBox_then_signed by 
> AR.pdf, streamserve_test_sig0_signed by PDFBoxNEW.pdf, 
> streamserve_test_sig0_signed by PDFBoxNEW_signed by AR.pdf, 
> streamserve_test_sig2.pdf, streamserve_test_sig3.pdf, 
> streamserve_test_sig3.signed.pdf
>
>
> Hi guys,
> I stumbled on a problem with PDFBox and interactive field signing. I have a 
> PDF generated with OpenText StreamServe with two interactive fields for 
> signing. See example 1 (streamserve_test_sig0.pdf) in the attachments.
> When I use Adobe Reader I can sign both of the visual fields just fine, but 
> when I use PDFBox to sign one of these fields, the following signature is 
> marked as invalid. It doesn't matter whether I use PDFBox or sign it manually 
> with Adobe Reader. See example 3: 
>  *  streamserve_test_sig3.pdf (signed by PDFBox) - valid
>  *  streamserve_test_sig3.signed.pdf (signed 2nd by Adobe Reader) - invalid
> !image-2018-12-04-11-14-01-586.png|width=582,height=269!
>  
> One last example: when it's signed first by Adobe Reader and then with 
> PDFBox, the signature seems valid but it says the document was "certified". 
> See example 2 (streamserve_test_sig2.pdf).
> !image-2018-12-04-11-14-32-676.png|width=591,height=283!
>  
> What could be wrong? When the whole document is signed with PDFBox, it works 
> just fine.
> Thanks for your response,
> Martin






[jira] [Commented] (PDFBOX-4396) Memory leak due to soft reference caching

2018-12-06 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711189#comment-16711189
 ] 

Tilman Hausherr commented on PDFBOX-4396:
-

Can you build from source? If yes, open COSStream.java, and add
{code:java}IOUtils.closeQuietly(randomAccess);{code}
above 
{code:java}randomAccess = scratchFile.createBuffer();{code}
at two places.
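
The general shape of that change is "close the old buffer before replacing 
it". A minimal sketch follows, using only java.io; BufferHolder and replace are 
hypothetical names and this is not the COSStream code. The local closeQuietly 
stands in for the IOUtils.closeQuietly call mentioned above.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical sketch, not COSStream: releases the previous buffer before installing a new one. */
final class BufferHolder {
    private Closeable buffer;

    /** Closes the currently held buffer (if any), then keeps a reference to the new one. */
    void replace(Closeable next) {
        closeQuietly(buffer);
        buffer = next;
    }

    /** Null-safe close that swallows IOException, in the spirit of a closeQuietly helper. */
    static void closeQuietly(Closeable closeable) {
        if (closeable == null) {
            return;
        }
        try {
            closeable.close();
        } catch (IOException ignored) {
            // Nothing useful can be done here; the buffer is being discarded anyway.
        }
    }
}
```

Without the close, the old buffer is only released when a finalizer runs, if 
at all.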

> Memory leak due to soft reference caching
> -
>
> Key: PDFBOX-4396
> URL: https://issues.apache.org/jira/browse/PDFBOX-4396
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 2.0.12
> Environment: JDK10; G1
>Reporter: Ben Manes
>Priority: Major
> Attachments: #2 - memory leak 2.png, #2 - memory leak.png, memory 
> leak 2.png, memory leak.png
>
>
> In a heap dump, it appears that DefaultResourceCache is retaining 5.3 GB of 
> memory due to buffered images (via PDImageXObject). I suspect that G1 is not 
> collecting soft references across all regions before it throws an 
> out-of-memory error.
> In PDFBOX-4389, I discovered very slow PDDocument#load times due to a JDK10 
> I/O bug. Previously I was loading the document to render each page, but this 
> took 1.5 minutes. To work around that bug I reused the document instance 
> across pages. This seems to have failed because the pages were cached and not 
> cleared by the GC.
> The DefaultResourceCache does not prune its cache entries when the soft 
> references are collected. Like WeakHashMap, it should use a ReferenceQueue, 
> poll it on every access, and prune accordingly.
> Thankfully PDDocument#setResourceCache exists. For now I am going to reset 
> the cache to a new instance after a page has been rendered. The entries 
> should no longer be reachable and be GC'd more aggressively. If that doesn't 
> work, I'll either replace the cache (e.g. with Caffeine) or disable it by 
> setting the instance to null.
> I think the desired fix is to prune the DefaultResourceCache and, ideally, 
> reconsider usage of soft references (as they tend to be poor in practice). 






[jira] [Commented] (PDFBOX-4303) Helv and ZaDb overridden

2018-12-06 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711295#comment-16711295
 ] 

Tilman Hausherr commented on PDFBOX-4303:
-

Commit 1848212 from til...@apache.org in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1848212 ]

PDFBOX-4303, PDFBOX-4393: mark updated when applicable; check whether font 
dictionary exists; avoid bug that wrong dictionary was checked for font

Commit 1848213 from til...@apache.org in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1848213 ]

PDFBOX-4303, PDFBOX-4393: mark updated when applicable; check whether font 
dictionary exists; avoid bug that wrong dictionary was checked for font



> Helv and ZaDb overridden
> 
>
> Key: PDFBOX-4303
> URL: https://issues.apache.org/jira/browse/PDFBOX-4303
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm
>Affects Versions: 2.0.11
>Reporter: simon steiner
>Assignee: Maruan Sahyoun
>Priority: Major
>  Labels: Appearance
> Fix For: 2.0.14, 3.0.0 PDFBox
>
> Attachments: PDFBOX-4303-2.0.12.diff, ReaderModifiedForm.pdf
>
>
> Due to change:
> PDFBOX-3943: create /Helv and /ZaDb entries if they don't exist, regardless 
> if /DR existed or not
>  
> was working ok in 2.0.7, in 2.0 branch
> PDAcroForm
> verifyOrCreateDefaults():
> is:
> if (!defaultResources.getCOSObject().containsKey("Helv"))
> should be checking key in the font dictionary before calling 
> defaultResources.put
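
The difference between checking the resources dictionary and checking the font 
dictionary inside it can be shown with plain maps. This is a toy model only: 
ToyDefaults, putFontIfAbsent and the nested-map layout are invented for 
illustration and do not reflect the real COSDictionary or PDResources classes.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model, not PDFBox code: default resources as a map with a nested "Font" map. */
final class ToyDefaults {
    /** Adds a font under the "Font" sub-dictionary only if that name is not already defined there. */
    static boolean putFontIfAbsent(Map<String, Map<String, String>> resources,
                                   String name, String font) {
        Map<String, String> fonts = resources.computeIfAbsent("Font", k -> new HashMap<>());
        if (fonts.containsKey(name)) {
            return false; // keep the entry the file already defines
        }
        fonts.put(name, font);
        return true;
    }
}
```

Testing resources.containsKey("Helv") instead would always come back false in 
this layout, so an existing Helv entry would be overwritten every time.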






[jira] [Commented] (PDFBOX-4393) PDF signature invalid after second interactive field signed

2018-12-06 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711297#comment-16711297
 ] 

ASF subversion and git services commented on PDFBOX-4393:
-

Commit 1848285 from til...@apache.org in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1848285 ]

PDFBOX-4303, PDFBOX-4393: add comments

> PDF signature invalid after second interactive field signed
> ---
>
> Key: PDFBOX-4393
> URL: https://issues.apache.org/jira/browse/PDFBOX-4393
> Project: PDFBox
>  Issue Type: Bug
>  Components: Signing
>Affects Versions: 2.0.12, 2.0.13
> Environment: Windows
>Reporter: Martin Klíma
>Priority: Major
> Fix For: 2.0.14, 3.0.0 PDFBox
>
> Attachments: image-2018-12-04-11-14-01-586.png, 
> image-2018-12-04-11-14-32-676.png, streamserve_test_sig0.pdf, 
> streamserve_test_sig0_saved.pdf, streamserve_test_sig0_saved_signed by 
> PDFBox.pdf, streamserve_test_sig0_saved_signed by PDFBox_then_signed by 
> AR.pdf, streamserve_test_sig0_signed by PDFBoxNEW.pdf, 
> streamserve_test_sig0_signed by PDFBoxNEW_signed by AR.pdf, 
> streamserve_test_sig2.pdf, streamserve_test_sig3.pdf, 
> streamserve_test_sig3.signed.pdf
>
>
> Hi guys,
> I stumbled on a problem with PDFBox and interactive field signing. I have a 
> PDF generated with OpenText StreamServe with two interactive fields for 
> signing. See example 1 (streamserve_test_sig0.pdf) in the attachments.
> When I use Adobe Reader I can sign both of the visual fields just fine, but 
> when I use PDFBox to sign one of these fields, the following signature is 
> marked as invalid. It doesn't matter whether I use PDFBox or sign it manually 
> with Adobe Reader. See example 3: 
>  *  streamserve_test_sig3.pdf (signed by PDFBox) - valid
>  *  streamserve_test_sig3.signed.pdf (signed 2nd by Adobe Reader) - invalid
> !image-2018-12-04-11-14-01-586.png|width=582,height=269!
>  
> One last example: when it's signed first by Adobe Reader and then with 
> PDFBox, the signature seems valid but it says the document was "certified". 
> See example 2 (streamserve_test_sig2.pdf).
> !image-2018-12-04-11-14-32-676.png|width=591,height=283!
>  
> What could be wrong? When the whole document is signed with PDFBox, it works 
> just fine.
> Thanks for your response,
> Martin






[jira] [Commented] (PDFBOX-4303) Helv and ZaDb overridden

2018-12-06 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711296#comment-16711296
 ] 

ASF subversion and git services commented on PDFBOX-4303:
-

Commit 1848285 from til...@apache.org in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1848285 ]

PDFBOX-4303, PDFBOX-4393: add comments

> Helv and ZaDb overridden
> 
>
> Key: PDFBOX-4303
> URL: https://issues.apache.org/jira/browse/PDFBOX-4303
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm
>Affects Versions: 2.0.11
>Reporter: simon steiner
>Assignee: Maruan Sahyoun
>Priority: Major
>  Labels: Appearance
> Fix For: 2.0.14, 3.0.0 PDFBox
>
> Attachments: PDFBOX-4303-2.0.12.diff, ReaderModifiedForm.pdf
>
>
> Due to change:
> PDFBOX-3943: create /Helv and /ZaDb entries if they don't exist, regardless 
> if /DR existed or not
>  
> was working ok in 2.0.7, in 2.0 branch
> PDAcroForm
> verifyOrCreateDefaults():
> is:
> if (!defaultResources.getCOSObject().containsKey("Helv"))
> should be checking key in the font dictionary before calling 
> defaultResources.put






[jira] [Comment Edited] (PDFBOX-4398) getLastSignatureDictionary modifies internal structure of PDDocument

2018-12-06 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711279#comment-16711279
 ] 

Tilman Hausherr edited comment on PDFBOX-4398 at 12/6/18 11:00 AM:
---

This was introduced in PDFBOX-3732 and also badly needed by PDFBOX-4393. Yes 
such side effects are usually a no-no but we replicated the behavior of Adobe. 
See also the ending comment by Maruan in PDFBOX-3732.

From your text I understand that it bothers you, but it doesn't break 
anything. So I'd just target this to 3.0 and keep 2.x as is. I won't do 
anything for now.


was (Author: tilman):
This was introduced in PDFBOX-3732 and also badly needed by PDFBOX-4393. Yes 
such side effects are usually a no-no but we replicated the behavior of Adobe. 
See also the ending comment by Maruan in PDFBOX-3732.

From your text I understand that it bothers you, but it doesn't break 
anything. So I'd just target this to 3.0, set it to minor and keep 2.x as is. 
I won't do anything for now.

> getLastSignatureDictionary modifies internal structure of PDDocument
> 
>
> Key: PDFBOX-4398
> URL: https://issues.apache.org/jira/browse/PDFBOX-4398
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm
>Affects Versions: 2.0.12
>Reporter: beat weisskopf
>Priority: Minor
> Fix For: 3.0.0 PDFBox
>
>
> If one calls PDDocument#getLastSignatureDictionary, the AcroForm is populated 
> with the defaults even if not needed. This modifies the internals of the 
> PDDocument, so there are changes to be saved even if the file has not been 
> modified by "real" changes.
> For example:
> {code}
> PDDocument pdfDocument = PDDocument.load(pdfBytes);
> pdfDocument.getLastSignatureDictionary();
> {code}
> This calls the verifyOrCreateDefaults() method, which initializes the 
> DR-Dictionary if not yet done. This is done even if 
> getLastSignatureDictionary returns null.
> Why this bothers me: it is very unexpected behaviour that a getter modifies 
> an object's state. This is no big deal for our use case; the other issue 
> (PDFBOX-4303) was a bigger problem as we are diffing objects between 
> revisions (current vs last signed revision).






[jira] [Comment Edited] (PDFBOX-4396) Memory leak due to soft reference caching

2018-12-06 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711189#comment-16711189
 ] 

Tilman Hausherr edited comment on PDFBOX-4396 at 12/6/18 9:31 AM:
--

Can you build from source? If yes, open COSStream.java, and add
{code:java}
IOUtils.closeQuietly(randomAccess);{code}
above
{code:java}
randomAccess = scratchFile.createBuffer();{code}
at two places.

My observation is that randomAccess was not null when an encrypted file is 
opened, because the same COSStream is rewritten, and it can't be closed before 
that because this would result in an exception.


was (Author: tilman):
Can you build from source? If yes, open COSStream.java, and add
{code:java}IOUtils.closeQuietly(randomAccess);{code}
above 
{code:java}randomAccess = scratchFile.createBuffer();{code}
at two places.

> Memory leak due to soft reference caching
> -
>
> Key: PDFBOX-4396
> URL: https://issues.apache.org/jira/browse/PDFBOX-4396
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 2.0.12
> Environment: JDK10; G1
>Reporter: Ben Manes
>Priority: Major
> Attachments: #2 - memory leak 2.png, #2 - memory leak.png, memory 
> leak 2.png, memory leak.png
>
>
> In a heap dump, it appears that DefaultResourceCache is retaining 5.3 GB of 
> memory due to buffered images (via PDImageXObject). I suspect that G1 is not 
> collecting soft references across all regions before it throws an 
> out-of-memory error.
> In PDFBOX-4389, I discovered very slow PDDocument#load times due to a JDK10 
> I/O bug. Previously I was loading the document to render each page, but this 
> took 1.5 minutes. To work around that bug I reused the document instance 
> across pages. This seems to have failed because the pages were cached and not 
> cleared by the GC.
> The DefaultResourceCache does not prune its cache entries when the soft 
> references are collected. Like WeakHashMap, it should use a ReferenceQueue, 
> poll it on every access, and prune accordingly.
> Thankfully PDDocument#setResourceCache exists. For now I am going to reset 
> the cache to a new instance after a page has been rendered. The entries 
> should no longer be reachable and be GC'd more aggressively. If that doesn't 
> work, I'll either replace the cache (e.g. with Caffeine) or disable it by 
> setting the instance to null.
> I think the desired fix is to prune the DefaultResourceCache and, ideally, 
> reconsider usage of soft references (as they tend to be poor in practice). 
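
The ReferenceQueue pruning suggested in the description can be sketched as follows. This is only an illustration under assumptions: PruningSoftCache and KeyedRef are hypothetical names, and the class is not PDFBox's DefaultResourceCache. It mirrors the WeakHashMap approach of polling the queue on every access.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical cache for illustration; not PDFBox's DefaultResourceCache. */
final class PruningSoftCache<K, V> {

    /** Soft reference that remembers its key so the stale map entry can be found. */
    private static final class KeyedRef<K, V> extends SoftReference<V> {
        final K key;

        KeyedRef(K key, V value, ReferenceQueue<V> queue) {
            super(value, queue);
            this.key = key;
        }
    }

    private final Map<K, KeyedRef<K, V>> map = new HashMap<>();
    private final ReferenceQueue<V> queue = new ReferenceQueue<>();

    synchronized V get(K key) {
        prune();
        KeyedRef<K, V> ref = map.get(key);
        return ref == null ? null : ref.get();
    }

    synchronized void put(K key, V value) {
        prune();
        map.put(key, new KeyedRef<>(key, value, queue));
    }

    synchronized int size() {
        prune();
        return map.size();
    }

    /** Drops entries whose referents the collector has already cleared. */
    private void prune() {
        Object polled;
        while ((polled = queue.poll()) != null) {
            @SuppressWarnings("unchecked")
            KeyedRef<K, V> ref = (KeyedRef<K, V>) polled;
            // Remove only if the key still maps to this stale reference,
            // so a newer value stored under the same key is left alone.
            map.remove(ref.key, ref);
        }
    }
}
```

Pruning on access keeps the map from accumulating cleared entries, but it does not change when the collector decides to clear soft references, which is the other concern raised in the description.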






[jira] [Commented] (PDFBOX-4393) PDF signature invalid after second interactive field signed

2018-12-06 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711300#comment-16711300
 ] 

ASF subversion and git services commented on PDFBOX-4393:
-

Commit 1848286 from til...@apache.org in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1848286 ]

PDFBOX-4303, PDFBOX-4393: add comments

> PDF signature invalid after second interactive field signed
> ---
>
> Key: PDFBOX-4393
> URL: https://issues.apache.org/jira/browse/PDFBOX-4393
> Project: PDFBox
>  Issue Type: Bug
>  Components: Signing
>Affects Versions: 2.0.12, 2.0.13
> Environment: Windows
>Reporter: Martin Klíma
>Priority: Major
> Fix For: 2.0.14, 3.0.0 PDFBox
>
> Attachments: image-2018-12-04-11-14-01-586.png, 
> image-2018-12-04-11-14-32-676.png, streamserve_test_sig0.pdf, 
> streamserve_test_sig0_saved.pdf, streamserve_test_sig0_saved_signed by 
> PDFBox.pdf, streamserve_test_sig0_saved_signed by PDFBox_then_signed by 
> AR.pdf, streamserve_test_sig0_signed by PDFBoxNEW.pdf, 
> streamserve_test_sig0_signed by PDFBoxNEW_signed by AR.pdf, 
> streamserve_test_sig2.pdf, streamserve_test_sig3.pdf, 
> streamserve_test_sig3.signed.pdf
>
>
> Hi guys,
> I stumbled on a problem with PDFBox and interactive field signing. I have a 
> PDF generated with OpenText StreamServe with two interactive fields for 
> signing. See example 1 (streamserve_test_sig0.pdf) in the attachments.
> When I use Adobe Reader I can sign both of the visual fields just fine, but 
> when I use PDFBox to sign one of these fields, the following signature is 
> marked as invalid. It doesn't matter whether I use PDFBox or sign it manually 
> with Adobe Reader. See example 3: 
>  *  streamserve_test_sig3.pdf (signed by PDFBox) - valid
>  *  streamserve_test_sig3.signed.pdf (signed 2nd by Adobe Reader) - invalid
> !image-2018-12-04-11-14-01-586.png|width=582,height=269!
>  
> Also, a last example: when it's signed first by Adobe Reader and then with 
> PDFBox, the signature seems valid but it says the document was "certified". 
> See example 2 (streamserve_test_sig2.pdf)
> !image-2018-12-04-11-14-32-676.png|width=591,height=283!
>  
> What could be wrong? When the whole document is signed with PDFBox, it works 
> just fine.
> Thanks for your response,
> Martin



[jira] [Commented] (PDFBOX-4303) Helv and ZaDb overridden

2018-12-06 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711298#comment-16711298
 ] 

ASF subversion and git services commented on PDFBOX-4303:
-

Commit 1848286 from til...@apache.org in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1848286 ]

PDFBOX-4303, PDFBOX-4393: add comments

> Helv and ZaDb overridden
> 
>
> Key: PDFBOX-4303
> URL: https://issues.apache.org/jira/browse/PDFBOX-4303
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm
>Affects Versions: 2.0.11
>Reporter: simon steiner
>Assignee: Maruan Sahyoun
>Priority: Major
>  Labels: Appearance
> Fix For: 2.0.14, 3.0.0 PDFBox
>
> Attachments: PDFBOX-4303-2.0.12.diff, ReaderModifiedForm.pdf
>
>
> Due to change:
> PDFBOX-3943: create /Helv and /ZaDb entries if they don't exist, regardless 
> if /DR existed or not
>  
> This was working OK in 2.0.7. In the 2.0 branch, 
> PDAcroForm.verifyOrCreateDefaults() is:
> if (!defaultResources.getCOSObject().containsKey("Helv"))
> It should check for the key in the font dictionary before calling 
> defaultResources.put






[jira] [Closed] (PDFBOX-4396) Memory leak due to soft reference caching

2018-12-06 Thread Ben Manes (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Manes closed PDFBOX-4396.
-




[jira] [Resolved] (PDFBOX-4396) Memory leak due to soft reference caching

2018-12-06 Thread Ben Manes (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ben Manes resolved PDFBOX-4396.
---
Resolution: Workaround




[jira] [Commented] (PDFBOX-3359) Drawing to Graphics2D / ScratchFileBuffer not closed

2018-12-06 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-3359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711978#comment-16711978
 ] 

ASF subversion and git services commented on PDFBOX-3359:
-

Commit 1848359 from til...@apache.org in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1848359 ]

PDFBOX-2999, PDFBOX-3359: close existing buffer to avoid "ScratchFileBuffer not 
closed" log message

> Drawing to Graphics2D / ScratchFileBuffer not closed
> 
>
> Key: PDFBOX-3359
> URL: https://issues.apache.org/jira/browse/PDFBOX-3359
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.1
>Reporter: Ivan Ridao Freitas
>Priority: Major
> Fix For: 3.0.0 PDFBox
>
>
> First, there is a little bug in PDFRenderer.renderPageToGraphics(int 
> pageIndex, Graphics2D graphics, float scale): when using scale != 1, the call 
> to clearRect() fills the original size with a white background, but it should 
> fill the scaled size.
> Second, I implemented a JPanel which is painted using that function, and on 
> every paint this message goes to the console:
> "DEBUG ScratchFileBuffer:516 - ScratchFileBuffer not closed!". Here is the 
> code to test it; run it and *resize the JFrame*:
> {code:title=PanelTest.java|borderStyle=solid}
> import java.awt.Dimension;
> import java.awt.Graphics;
> import java.awt.Graphics2D;
> import java.io.File;
> import java.io.IOException;
> import javax.swing.JFrame;
> import javax.swing.JPanel;
> import javax.swing.SwingUtilities;
> import javax.swing.WindowConstants;
> import org.apache.pdfbox.pdmodel.PDDocument;
> import org.apache.pdfbox.rendering.PDFRenderer;
>
> public class PanelTest {
>
>     private static JPanel getTestPanel() {
>         PDDocument doc = null;
>         try {
>             doc = PDDocument.load(new File("anyfile.pdf"));
>         } catch (IOException e) {
>             e.printStackTrace();
>         }
>         final PDFRenderer renderer = new PDFRenderer(doc);
>         JPanel panel = new JPanel() {
>             @Override
>             protected void paintComponent(Graphics g) {
>                 try {
>                     renderer.renderPageToGraphics(0, (Graphics2D) g, 0.5f);
>                 } catch (IOException e) {
>                     e.printStackTrace();
>                 }
>             }
>         };
>         return panel;
>     }
>
>     public static void main(String[] args) throws Exception {
>         SwingUtilities.invokeLater(new Runnable() {
>             @Override
>             public void run() {
>                 JFrame frame = new JFrame();
>                 frame.setDefaultCloseOperation(WindowConstants.EXIT_ON_CLOSE);
>                 frame.add(getTestPanel());
>                 frame.pack();
>                 frame.setSize(600, 400);
>                 Dimension paneSize = frame.getSize();
>                 Dimension screenSize = frame.getToolkit().getScreenSize();
>                 frame.setLocation((screenSize.width - paneSize.width) / 2,
>                         (screenSize.height - paneSize.height) / 2);
>                 frame.setTitle("Test");
>                 frame.setVisible(true);
>             }
>         });
>     }
> }
> {code}
> Ivan 






[jira] [Commented] (PDFBOX-2999) Optimize COSStream scratch file usage

2018-12-06 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711975#comment-16711975
 ] 

ASF subversion and git services commented on PDFBOX-2999:
-

Commit 1848358 from til...@apache.org in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1848358 ]

PDFBOX-2999, PDFBOX-3359: close existing buffer to avoid "ScratchFileBuffer not 
closed" log message

> Optimize COSStream scratch file usage
> -
>
> Key: PDFBOX-2999
> URL: https://issues.apache.org/jira/browse/PDFBOX-2999
> Project: PDFBox
>  Issue Type: Improvement
>  Components: PDModel
>Affects Versions: 2.0.0
>Reporter: Timo Boehme
>Assignee: Timo Boehme
>Priority: Major
>
> The usage of scratch file buffers in COSStreams is quite sloppy. A never 
> filled buffer is created in the beginning and existing buffers are discarded 
> without being closed when a variant of {{createOutputStream}} is called. 
> Furthermore it should be clarified if requesting an input stream without 
> having created an output stream before is ok and if a returned input stream 
> keeps valid after a new output stream is created (which is crucial for proper 
> buffer-closing). 
> This issue should resolve some of the shortcomings and document the expected 
> or even required usage of COSStream. 






[jira] [Commented] (PDFBOX-3359) Drawing to Graphics2D / ScratchFileBuffer not closed

2018-12-06 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-3359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711976#comment-16711976
 ] 

ASF subversion and git services commented on PDFBOX-3359:
-

Commit 1848358 from til...@apache.org in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1848358 ]

PDFBOX-2999, PDFBOX-3359: close existing buffer to avoid "ScratchFileBuffer not 
closed" log message




[jira] [Commented] (PDFBOX-2999) Optimize COSStream scratch file usage

2018-12-06 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-2999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711977#comment-16711977
 ] 

ASF subversion and git services commented on PDFBOX-2999:
-

Commit 1848359 from til...@apache.org in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1848359 ]

PDFBOX-2999, PDFBOX-3359: close existing buffer to avoid "ScratchFileBuffer not 
closed" log message




[jira] [Commented] (PDFBOX-4396) Memory leak due to soft reference caching

2018-12-06 Thread Ben Manes (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711955#comment-16711955
 ] 

Ben Manes commented on PDFBOX-4396:
---

The process completed for one of the large uploads, and I had to disable the 
others because they took too long (hours). The CPU overhead on the machine 
caused bad user-facing latencies, since the scheduler doesn't take CPU into 
account and those jobs were being delayed. Since our use cases have expanded 
from 5-10 page documents to many thousands of pages (monthly historicals), I 
think it's no longer a good fit to do the work in a single process shared with 
other user-facing work. I think my next step should be to migrate this use 
case to a lambda, distribute page ranges, and invoke in parallel. That could 
easily be distributed using pdfbox and work great, but it's probably 
easier / faster / cheaper to use ghostscript for such a simple lambda task.
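
As a rough sketch of the "distribute page ranges" step, the helper below splits a document into consecutive 1-based page ranges that separate workers could render independently. PageRanges and split are hypothetical names, and the chunk size of 25 in the usage note is only an example value.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of PDFBox. */
final class PageRanges {

    /**
     * Splits pages 1..pageCount into consecutive {first, last} ranges,
     * each covering at most chunkSize pages.
     */
    static List<int[]> split(int pageCount, int chunkSize) {
        if (pageCount < 0 || chunkSize <= 0) {
            throw new IllegalArgumentException("pageCount must be >= 0 and chunkSize > 0");
        }
        List<int[]> ranges = new ArrayList<>();
        for (int first = 1; first <= pageCount; first += chunkSize) {
            int last = Math.min(first + chunkSize - 1, pageCount);
            ranges.add(new int[] {first, last});
        }
        return ranges;
    }
}
```

For a 999-page document, PageRanges.split(999, 25) yields 40 ranges, the last one covering pages 976 to 999.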

The documents are not encrypted so I think that case may not apply. In my code 
I often pass around a Guava Closer to accumulate resources across methods, and 
then ensure all are closed if not done so otherwise. If everything is 
associated to a document, it would make sense for a closer to be propagated 
from it and then it can close all of the resources (if not closed already). 
That could be a custom utility, etc. of course rather than Guava's.
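
A custom utility of the kind mentioned above might look like the following sketch. ResourceCloser is a hypothetical JDK-only stand-in for Guava's Closer, not an existing PDFBox class: resources are registered as they are created and closed together in reverse order.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical stand-in for Guava's Closer, using only the JDK. */
final class ResourceCloser implements Closeable {
    private final Deque<Closeable> resources = new ArrayDeque<>();

    /** Registers a resource and returns it so the call can be inlined. */
    <C extends Closeable> C register(C resource) {
        resources.push(resource);
        return resource;
    }

    /** Closes everything in reverse registration order, keeping the first failure. */
    @Override
    public void close() throws IOException {
        IOException first = null;
        while (!resources.isEmpty()) {
            try {
                resources.pop().close();
            } catch (IOException e) {
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }
}
```

Because the closer itself is Closeable, it can be owned by a document-like object and closed with it, or used in a try-with-resources statement.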

You might also consider using weak / phantom references instead of 
finalization. For my application's file I/O (local and s3), I give clients a 
session with their own tempdir and reference count downloaded files against a 
global cache. The session handles are proxies that clients should close, but 
held in a weak keyed cache where the actual implementation is the value. Then 
when the proxy is collected, the strong-ref value is explicitly closed. This 
acts as a safety net just in case, since we do a lot of I/O and this form of 
reference caching is cheap. The same can be done better with phantom 
references, but that is more work than spinning up a weak cache with a removal 
listener. From reading the code, it looks like a lot of effort was made to 
close resources but it also got really complex with patches for the inevitable 
leaks. Of course, you might not be able to change much due to API compatibility 
needs.
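
On JDK 9 or later, one way to get a phantom-reference safety net without finalize() is java.lang.ref.Cleaner. The sketch below is illustrative only: TrackedBuffer and Release are hypothetical names, and the flag stands in for whatever a real buffer would free. It is a different mechanism from the weak keyed cache described above.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical buffer handle for illustration; not a PDFBox class. */
final class TrackedBuffer implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    /**
     * Cleanup action. It must not reference the TrackedBuffer itself,
     * otherwise the buffer could never become unreachable.
     */
    private static final class Release implements Runnable {
        private final AtomicBoolean released;

        Release(AtomicBoolean released) {
            this.released = released;
        }

        @Override
        public void run() {
            // A real implementation would free the underlying resource here.
            released.set(true);
        }
    }

    private final AtomicBoolean released = new AtomicBoolean();
    private final Cleaner.Cleanable cleanable;

    TrackedBuffer() {
        cleanable = CLEANER.register(this, new Release(released));
    }

    boolean isReleased() {
        return released.get();
    }

    /** Explicit close; the cleanup action runs at most once. */
    @Override
    public void close() {
        cleanable.clean();
    }
}
```

Callers are still expected to call close(); the Cleaner only runs the same action if the handle is garbage collected without having been closed.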

I think at this point I'll close this, like the other, as not something 
trivially fixable. I do think better resource handling is warranted, but that 
requires a thoughtful refactor.




[jira] [Closed] (PDFBOX-4396) Memory leak due to soft reference caching

2018-12-06 Thread Tilman Hausherr (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr closed PDFBOX-4396.
---
Resolution: Workaround




[jira] [Reopened] (PDFBOX-4396) Memory leak due to soft reference caching

2018-12-06 Thread Tilman Hausherr (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr reopened PDFBOX-4396:
-




[jira] [Comment Edited] (PDFBOX-4396) Memory leak due to soft reference caching

2018-12-06 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711189#comment-16711189
 ] 

Tilman Hausherr edited comment on PDFBOX-4396 at 12/6/18 9:44 AM:
--

Can you build from source? If yes, open COSStream.java, and add
{code:java}
IOUtils.closeQuietly(randomAccess);{code}
above
{code:java}
randomAccess = scratchFile.createBuffer();{code}
at two places.

My observation was that randomAccess was not null when an encrypted file was 
opened, because the same COSStream is rewritten (in 
{{SecurityHandler.decryptStream()}}) and it can't be closed in that method 
because this would result in an exception thrown by COSStream.
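
The close-before-replace pattern suggested above can be modelled in isolation like this. BufferHolder is a hypothetical class, the field current plays the role of randomAccess, and the local closeQuietly only imitates what IOUtils.closeQuietly is used for in the snippet; none of this is the actual COSStream code.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.function.Supplier;

/** Hypothetical holder modelling the close-before-replace pattern. */
final class BufferHolder {
    private final Supplier<Closeable> factory;
    private Closeable current; // plays the role of randomAccess

    BufferHolder(Supplier<Closeable> factory) {
        this.factory = factory;
    }

    /** Closes any existing buffer, ignoring failures, then creates a fresh one. */
    Closeable replaceBuffer() {
        closeQuietly(current);
        current = factory.get();
        return current;
    }

    /** Null-safe close that swallows IOException. */
    private static void closeQuietly(Closeable closeable) {
        if (closeable == null) {
            return;
        }
        try {
            closeable.close();
        } catch (IOException ignored) {
            // Deliberately ignored: the old buffer is being discarded anyway.
        }
    }
}
```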


was (Author: tilman):
Can you build from source? If yes, open COSStream.java, and add
{code:java}
IOUtils.closeQuietly(randomAccess);{code}
above
{code:java}
randomAccess = scratchFile.createBuffer();{code}
at two places.

My observation was that randomAccess was not null when an encrypted file is 
opened, because the same COSStream is rewritten and it can't be closed before 
that because this would result in an exception.




[jira] [Commented] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances

2018-12-06 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711505#comment-16711505
 ] 

ASF subversion and git services commented on PDFBOX-4392:
-

Commit 1848319 from til...@apache.org in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1848319 ]

PDFBOX-4392: remove toRGB() which is contained in Color() construction, thanks 
Itai Shaked

> PDF completely blow up the RAM on amazon instances
> --
>
> Key: PDFBOX-4392
> URL: https://issues.apache.org/jira/browse/PDFBOX-4392
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 2.0.12
>Reporter: Oleksandr Skoryi
>Priority: Major
> Fix For: 2.0.13
>
> Attachments: 2f0f8f77-7a85-416d-b5d2-47a07d1416d4_3.pdf, 
> 4392-prereadICC.patch
>
>
> Hi all
> The issue is pretty straightforward. I receive a lot of PDFs every day and 
> render them. In most cases everything is OK, but PDFs which produce 
> WARN org.apache.pdfbox.pdmodel.graphics.color.PDICCBased - ICC profile is 
> Perceptual, ignoring, treating as Display class
> take a very long time to render and consume a lot of memory. 
> It takes from 5 to 15 min on an m5.large Amazon instance. But the attached PDF 
> completely killed the instance. The java process is just killed by Linux 
> during processing with no exception in the logs. 
> So could you please explain what is going on with files that produce the 
> WARN message above, and how I can improve the rendering? 
>  
> Here are my VM options 
> -Dorg.apache.pdfbox.rendering.UsePureJavaCMYKConversion=true -Xmx3G -Xms2G 
> -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider"
> Also don't hesitate to ask me for more PDFs, I have tons of them :D
>  
> And also a question: does the GPU have any influence on rendering?






[jira] [Commented] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances

2018-12-06 Thread Itai Shaked (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711354#comment-16711354
 ] 

Itai Shaked commented on PDFBOX-4392:
-

So I looked into it more closely using a profiler (should have done this in the 
first place, as it revealed the call to `ensureDisplayProfile` has negligible 
performance impact), and it seems lots of time is wasted on the call to `toRGB` 
in line 170 of PDICCBased.  According to the comment above the call, this is 
done to test for bad ICC profiles as described in PDFBOX-1295, PDFBOX-1740 and 
PDFBOX-3610.  PDFBOX-1740 seems totally unrelated, so perhaps a typo in the 
comment? I have tried removing the call, and the files from PDFBOX-1295 and 
PDFBOX-3610 both render correctly (even with the call - no exception is 
thrown).  Furthermore, removing this call cuts render time on the file in this 
issue by ~50%. 

Could this be a workaround for a bug in an older JDK version that is no longer 
needed? If so, perhaps a check can be added so that `toRGB` is only called on 
known bad JDK versions? 

> PDF completely blow up the RAM on amazon instances
> --
>
> Key: PDFBOX-4392
> URL: https://issues.apache.org/jira/browse/PDFBOX-4392
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 2.0.12
>Reporter: Oleksandr Skoryi
>Priority: Major
> Fix For: 2.0.13
>
> Attachments: 2f0f8f77-7a85-416d-b5d2-47a07d1416d4_3.pdf, 
> 4392-prereadICC.patch
>
>
> Hi all,
> The issue is pretty straightforward. I receive a lot of PDFs every day and 
> render them. In most cases everything is OK, but PDFs that produce the warning
> WARN org.apache.pdfbox.pdmodel.graphics.color.PDICCBased - ICC profile is 
> Perceptual, ignoring, treating as Display class
> take a very long time to render and consume a lot of memory. 
> Rendering takes from 5 to 15 min on an m5.large Amazon instance, but the attached 
> PDF completely killed the instance: the Java process was killed by Linux 
> during processing, with no exception in the logs. 
> Could you please explain what is going on with files that produce the 
> WARN message above, and how I can improve the rendering? 
>  
> Here are my VM options: 
> -Dorg.apache.pdfbox.rendering.UsePureJavaCMYKConversion=true -Xmx3G -Xms2G 
> -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider
> Also, don't hesitate to ask me for more PDFs, I have tons of them :D
>  
> One more question: does the GPU have any influence on rendering?






[jira] [Updated] (PDFBOX-4399) Printing invisible Content from PDF-Forms

2018-12-06 Thread Stefan Ziel (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Ziel updated PDFBOX-4399:

Attachment: (was: printed.brf)

> Printing invisible Content from PDF-Forms 
> --
>
> Key: PDFBOX-4399
> URL: https://issues.apache.org/jira/browse/PDFBOX-4399
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm, Rendering
>Affects Versions: 2.0.6
>Reporter: Stefan Ziel
>Priority: Major
> Attachments: original.pdf
>
>
> Printing PDF-documents with macro renders hidden content.
> {code:java}
> InputStream sourceStream = new FileInputStream(pFile);
> try {
>   PDDocument source = PDDocument.load(sourceStream);
>   job.setPageable(new PDFPageable(source));
>   job.print(atts);
> } finally {
>   sourceStream.close();
> }
> {code}






[jira] [Updated] (PDFBOX-4399) Printing invisible Content from PDF-Forms

2018-12-06 Thread Stefan Ziel (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Ziel updated PDFBOX-4399:

Attachment: printed.brf
original.pdf

> Printing invisible Content from PDF-Forms 
> --
>
> Key: PDFBOX-4399
> URL: https://issues.apache.org/jira/browse/PDFBOX-4399
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm, Rendering
>Affects Versions: 2.0.6
>Reporter: Stefan Ziel
>Priority: Major
> Attachments: original.pdf
>
>
> Printing PDF-documents with macro renders hidden content.
> {code:java}
> InputStream sourceStream = new FileInputStream(pFile);
> try {
>   PDDocument source = PDDocument.load(sourceStream);
>   job.setPageable(new PDFPageable(source));
>   job.print(atts);
> } finally {
>   sourceStream.close();
> }
> {code}






[jira] [Commented] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances

2018-12-06 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711485#comment-16711485
 ] 

Tilman Hausherr commented on PDFBOX-4392:
-

I looked into the Color source code, that explains it LOL.

> PDF completely blow up the RAM on amazon instances
> --
>
> Key: PDFBOX-4392
> URL: https://issues.apache.org/jira/browse/PDFBOX-4392
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 2.0.12
>Reporter: Oleksandr Skoryi
>Priority: Major
> Fix For: 2.0.13
>
> Attachments: 2f0f8f77-7a85-416d-b5d2-47a07d1416d4_3.pdf, 
> 4392-prereadICC.patch
>
>
> Hi all,
> The issue is pretty straightforward. I receive a lot of PDFs every day and 
> render them. In most cases everything is OK, but PDFs that produce the warning
> WARN org.apache.pdfbox.pdmodel.graphics.color.PDICCBased - ICC profile is 
> Perceptual, ignoring, treating as Display class
> take a very long time to render and consume a lot of memory. 
> Rendering takes from 5 to 15 min on an m5.large Amazon instance, but the attached 
> PDF completely killed the instance: the Java process was killed by Linux 
> during processing, with no exception in the logs. 
> Could you please explain what is going on with files that produce the 
> WARN message above, and how I can improve the rendering? 
>  
> Here are my VM options: 
> -Dorg.apache.pdfbox.rendering.UsePureJavaCMYKConversion=true -Xmx3G -Xms2G 
> -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider
> Also, don't hesitate to ask me for more PDFs, I have tons of them :D
>  
> One more question: does the GPU have any influence on rendering?






[jira] [Commented] (PDFBOX-4398) getLastSignatureDictionary modifies internal structure of PDDocument

2018-12-06 Thread beat weisskopf (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711372#comment-16711372
 ] 

beat weisskopf commented on PDFBOX-4398:


{quote}
I won't do anything for now.
{quote}

That's ok for me. Thanks for all the work!

> getLastSignatureDictionary modifies internal structure of PDDocument
> 
>
> Key: PDFBOX-4398
> URL: https://issues.apache.org/jira/browse/PDFBOX-4398
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm
>Affects Versions: 2.0.12
>Reporter: beat weisskopf
>Priority: Minor
> Fix For: 3.0.0 PDFBox
>
>
> If one calls PDDocument#getLastSignatureDictionary, the AcroForm is populated 
> with the defaults even if not needed. This modifies the internals of the 
> PDDocument, so there are changes to be saved even if the file has 
> not been modified by "real" changes.
> For example:
> {code}
> PDDocument pdfDocument = PDDocument.load(pdfBytes);
> pdfDocument.getLastSignatureDictionary();
> {code}
> This calls the verifyOrCreateDefaults() method, which initializes the 
> DR dictionary if that has not been done yet. This happens even if 
> getLastSignatureDictionary returns null.
> Why this bothers me: it is very unexpected behaviour for a getter to modify 
> an object's state. This is no big deal for our use case; the other issue 
> (PDFBOX-4303) was a bigger problem, as we are diffing objects between 
> revisions (current vs. last signed revision).
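
The complaint above, a getter that changes the object it reads from, can be illustrated with a small self-contained sketch. Everything here is hypothetical: `Form`, `lastSignatureWithDefaults` and `lastSignatureReadOnly` are invented names for illustration and are not PDFBox classes or methods. The sketch only contrasts a lookup that fills in defaults as a side effect with one that leaves the object untouched.

```java
import java.util.HashMap;
import java.util.Map;

public class GetterSideEffectDemo {
    /** Hypothetical stand-in for a form object; not a PDFBox class. */
    public static class Form {
        private final Map<String, String> entries = new HashMap<>();
        private boolean modified = false;

        /** Mutating variant: creates a default entry as a side effect of reading. */
        public String lastSignatureWithDefaults() {
            if (!entries.containsKey("DR")) {
                entries.put("DR", "default resources");
                modified = true; // the object now has changes to be saved
            }
            return entries.get("Sig"); // null: this sketch holds no signature
        }

        /** Read-only variant: performs the lookup without touching any state. */
        public String lastSignatureReadOnly() {
            return entries.get("Sig");
        }

        public boolean isModified() {
            return modified;
        }
    }

    public static void main(String[] args) {
        Form a = new Form();
        a.lastSignatureWithDefaults();
        Form b = new Form();
        b.lastSignatureReadOnly();
        System.out.println(a.isModified() + " " + b.isModified());
    }
}
```

Both variants return null here, but only the first leaves the object flagged as modified, which is the kind of difference that shows up when diffing revisions.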






[jira] [Commented] (PDFBOX-4303) Helv and ZaDb overridden

2018-12-06 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711390#comment-16711390
 ] 

ASF subversion and git services commented on PDFBOX-4303:
-

Commit 1848299 from til...@apache.org in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1848299 ]

PDFBOX-4303, PDFBOX-4393: add test

> Helv and ZaDb overridden
> 
>
> Key: PDFBOX-4303
> URL: https://issues.apache.org/jira/browse/PDFBOX-4303
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm
>Affects Versions: 2.0.11
>Reporter: simon steiner
>Assignee: Maruan Sahyoun
>Priority: Major
>  Labels: Appearance
> Fix For: 2.0.14, 3.0.0 PDFBox
>
> Attachments: PDFBOX-4303-2.0.12.diff, ReaderModifiedForm.pdf
>
>
> Due to change:
> PDFBOX-3943: create /Helv and /ZaDb entries if they don't exist, regardless 
> if /DR existed or not
>  
> This was working OK in 2.0.7. In the 2.0 branch, 
> PDAcroForm.verifyOrCreateDefaults() currently has:
> if (!defaultResources.getCOSObject().containsKey("Helv"))
> It should be checking for the key in the font dictionary before calling 
> defaultResources.put
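
The difference between the two checks can be sketched with plain maps standing in for PDF dictionaries. This is a model under assumptions, not PDFBox code: `FontKeyCheck` and `hasFont` are hypothetical names, and nested `Map` instances stand in for the /DR resources dictionary and its /Font sub-dictionary. A font named Helv is stored one level down, so a top-level key lookup does not find it.

```java
import java.util.HashMap;
import java.util.Map;

public class FontKeyCheck {
    /** Returns true if the "Font" sub-dictionary of the resources contains fontName. */
    @SuppressWarnings("unchecked")
    public static boolean hasFont(Map<String, Object> resources, String fontName) {
        Object fonts = resources.get("Font");
        return fonts instanceof Map && ((Map<String, Object>) fonts).containsKey(fontName);
    }

    public static void main(String[] args) {
        Map<String, Object> fonts = new HashMap<>();
        fonts.put("Helv", "existing Helvetica font object");
        Map<String, Object> resources = new HashMap<>();
        resources.put("Font", fonts);

        // Top-level lookup misses the font, so a caller would wrongly re-create it
        System.out.println(resources.containsKey("Helv"));
        // Lookup inside the font dictionary finds it
        System.out.println(hasFont(resources, "Helv"));
    }
}
```

In this model, a check against the top-level dictionary always reports the font as missing, so an existing Helv entry would be overwritten on every call.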






[jira] [Commented] (PDFBOX-4303) Helv and ZaDb overridden

2018-12-06 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711392#comment-16711392
 ] 

ASF subversion and git services commented on PDFBOX-4303:
-

Commit 1848300 from til...@apache.org in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1848300 ]

PDFBOX-4303, PDFBOX-4393: add test

> Helv and ZaDb overridden
> 
>
> Key: PDFBOX-4303
> URL: https://issues.apache.org/jira/browse/PDFBOX-4303
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm
>Affects Versions: 2.0.11
>Reporter: simon steiner
>Assignee: Maruan Sahyoun
>Priority: Major
>  Labels: Appearance
> Fix For: 2.0.14, 3.0.0 PDFBox
>
> Attachments: PDFBOX-4303-2.0.12.diff, ReaderModifiedForm.pdf
>
>
> Due to change:
> PDFBOX-3943: create /Helv and /ZaDb entries if they don't exist, regardless 
> if /DR existed or not
>  
> This was working OK in 2.0.7. In the 2.0 branch, 
> PDAcroForm.verifyOrCreateDefaults() currently has:
> if (!defaultResources.getCOSObject().containsKey("Helv"))
> It should be checking for the key in the font dictionary before calling 
> defaultResources.put






[jira] [Commented] (PDFBOX-4393) PDF signature invalid after second interactive field signed

2018-12-06 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711393#comment-16711393
 ] 

ASF subversion and git services commented on PDFBOX-4393:
-

Commit 1848300 from til...@apache.org in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1848300 ]

PDFBOX-4303, PDFBOX-4393: add test

> PDF signature invalid after second interactive field signed
> ---
>
> Key: PDFBOX-4393
> URL: https://issues.apache.org/jira/browse/PDFBOX-4393
> Project: PDFBox
>  Issue Type: Bug
>  Components: Signing
>Affects Versions: 2.0.12, 2.0.13
> Environment: Windows
>Reporter: Martin Klíma
>Priority: Major
> Fix For: 2.0.14, 3.0.0 PDFBox
>
> Attachments: image-2018-12-04-11-14-01-586.png, 
> image-2018-12-04-11-14-32-676.png, streamserve_test_sig0.pdf, 
> streamserve_test_sig0_saved.pdf, streamserve_test_sig0_saved_signed by 
> PDFBox.pdf, streamserve_test_sig0_saved_signed by PDFBox_then_signed by 
> AR.pdf, streamserve_test_sig0_signed by PDFBoxNEW.pdf, 
> streamserve_test_sig0_signed by PDFBoxNEW_signed by AR.pdf, 
> streamserve_test_sig2.pdf, streamserve_test_sig3.pdf, 
> streamserve_test_sig3.signed.pdf
>
>
> Hi guys,
> I stumbled on a problem with PDFBox and interactive field signing. I have a 
> PDF generated with OpenText StreamServe with two interactive fields for 
> signing. See example 1 (streamserve_test_sig0.pdf) in the attachments.
> When I use Adobe Reader I can sign both of the visual fields just fine, but 
> when I use PDFBox to sign one of these fields, the next signature is marked 
> as invalid, no matter whether I add it with PDFBox or manually with Adobe 
> Reader. See example 3: 
>  *  streamserve_test_sig3.pdf (signed by PDFBox) - valid
>  *  streamserve_test_sig3.signed.pdf (signed 2nd by Adobe Reader) - invalid
> !image-2018-12-04-11-14-01-586.png|width=582,height=269!
>  
> One last example: when it's signed first by Adobe Reader and then with 
> PDFBox, the signature seems valid but it says the document was "certified". 
> See example 2 (streamserve_test_sig2.pdf)
> !image-2018-12-04-11-14-32-676.png|width=591,height=283!
>  
> What could be wrong? When the whole document is signed with PDFBox it works 
> just fine.
> Thanks for your response,
> Martin






[jira] [Commented] (PDFBOX-4393) PDF signature invalid after second interactive field signed

2018-12-06 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711391#comment-16711391
 ] 

ASF subversion and git services commented on PDFBOX-4393:
-

Commit 1848299 from til...@apache.org in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1848299 ]

PDFBOX-4303, PDFBOX-4393: add test

> PDF signature invalid after second interactive field signed
> ---
>
> Key: PDFBOX-4393
> URL: https://issues.apache.org/jira/browse/PDFBOX-4393
> Project: PDFBox
>  Issue Type: Bug
>  Components: Signing
>Affects Versions: 2.0.12, 2.0.13
> Environment: Windows
>Reporter: Martin Klíma
>Priority: Major
> Fix For: 2.0.14, 3.0.0 PDFBox
>
> Attachments: image-2018-12-04-11-14-01-586.png, 
> image-2018-12-04-11-14-32-676.png, streamserve_test_sig0.pdf, 
> streamserve_test_sig0_saved.pdf, streamserve_test_sig0_saved_signed by 
> PDFBox.pdf, streamserve_test_sig0_saved_signed by PDFBox_then_signed by 
> AR.pdf, streamserve_test_sig0_signed by PDFBoxNEW.pdf, 
> streamserve_test_sig0_signed by PDFBoxNEW_signed by AR.pdf, 
> streamserve_test_sig2.pdf, streamserve_test_sig3.pdf, 
> streamserve_test_sig3.signed.pdf
>
>
> Hi guys,
> I stumbled on a problem with PDFBox and interactive field signing. I have a 
> PDF generated with OpenText StreamServe with two interactive fields for 
> signing. See example 1 (streamserve_test_sig0.pdf) in the attachments.
> When I use Adobe Reader I can sign both of the visual fields just fine, but 
> when I use PDFBox to sign one of these fields, the next signature is marked 
> as invalid, no matter whether I add it with PDFBox or manually with Adobe 
> Reader. See example 3: 
>  *  streamserve_test_sig3.pdf (signed by PDFBox) - valid
>  *  streamserve_test_sig3.signed.pdf (signed 2nd by Adobe Reader) - invalid
> !image-2018-12-04-11-14-01-586.png|width=582,height=269!
>  
> One last example: when it's signed first by Adobe Reader and then with 
> PDFBox, the signature seems valid but it says the document was "certified". 
> See example 2 (streamserve_test_sig2.pdf)
> !image-2018-12-04-11-14-32-676.png|width=591,height=283!
>  
> What could be wrong? When the whole document is signed with PDFBox it works 
> just fine.
> Thanks for your response,
> Martin






[jira] [Updated] (PDFBOX-4399) Printing invisible Content from PDF-Forms

2018-12-06 Thread Stefan Ziel (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Ziel updated PDFBOX-4399:

Attachment: printed.png

> Printing invisible Content from PDF-Forms 
> --
>
> Key: PDFBOX-4399
> URL: https://issues.apache.org/jira/browse/PDFBOX-4399
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm, Rendering
>Affects Versions: 2.0.6
>Reporter: Stefan Ziel
>Priority: Major
> Attachments: gs.png, original.pdf, printed.png
>
>
> Printing PDF-documents with macro  [^original.pdf]  renders hidden content  
> [^printed.pdf] . Code used to print
> {code:java}
> InputStream sourceStream = new FileInputStream(pFile);
> try {
>   PDDocument source = PDDocument.load(sourceStream);
>   job.setPageable(new PDFPageable(source));
>   job.print(atts);
> } finally {
>   sourceStream.close();
> }
> {code}
> This is not only a problem of PDFBox ;) but can be done right ... ghostscript 
> does it  [^gs.png].






[jira] [Updated] (PDFBOX-4399) Printing invisible Content from PDF-Forms

2018-12-06 Thread Stefan Ziel (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Ziel updated PDFBOX-4399:

Attachment: printed.pdf

> Printing invisible Content from PDF-Forms 
> --
>
> Key: PDFBOX-4399
> URL: https://issues.apache.org/jira/browse/PDFBOX-4399
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm, Rendering
>Affects Versions: 2.0.6
>Reporter: Stefan Ziel
>Priority: Major
> Attachments: original.pdf, printed.pdf
>
>
> Printing PDF-documents with macro renders hidden content.
> {code:java}
> InputStream sourceStream = new FileInputStream(pFile);
> try {
>   PDDocument source = PDDocument.load(sourceStream);
>   job.setPageable(new PDFPageable(source));
>   job.print(atts);
> } finally {
>   sourceStream.close();
> }
> {code}






[jira] [Updated] (PDFBOX-4399) Printing invisible Content from PDF-Forms

2018-12-06 Thread Stefan Ziel (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Ziel updated PDFBOX-4399:

Attachment: gs.png

> Printing invisible Content from PDF-Forms 
> --
>
> Key: PDFBOX-4399
> URL: https://issues.apache.org/jira/browse/PDFBOX-4399
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm, Rendering
>Affects Versions: 2.0.6
>Reporter: Stefan Ziel
>Priority: Major
> Attachments: gs.png, original.pdf, printed.pdf
>
>
> Printing PDF-documents with macro  [^original.pdf]  renders hidden content  
> [^printed.pdf] . Code used to print
> {code:java}
> InputStream sourceStream = new FileInputStream(pFile);
> try {
>   PDDocument source = PDDocument.load(sourceStream);
>   job.setPageable(new PDFPageable(source));
>   job.print(atts);
> } finally {
>   sourceStream.close();
> }
> {code}
> This is not only a problem of PDFBox ;) but can be done right ... ghostscript 
> does it.






[jira] [Updated] (PDFBOX-4399) Printing invisible Content from PDF-Forms

2018-12-06 Thread Stefan Ziel (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Ziel updated PDFBOX-4399:

Description: 
Printing PDF-documents with macro  [^original.pdf]  renders hidden content  
[^printed.pdf] . Code used to print

{code:java}
InputStream sourceStream = new FileInputStream(pFile);
try {
  PDDocument source = PDDocument.load(sourceStream);
  job.setPageable(new PDFPageable(source));
  job.print(atts);
} finally {
  sourceStream.close();
}
{code}

This is not only a problem of PDFBox ;) but can be done right ... ghostscript 
does it  [^gs.png].

  was:
Printing PDF-documents with macro  [^original.pdf]  renders hidden content  
[^printed.pdf] . Code used to print

{code:java}
InputStream sourceStream = new FileInputStream(pFile);
try {
  PDDocument source = PDDocument.load(sourceStream);
  job.setPageable(new PDFPageable(source));
  job.print(atts);
} finally {
  sourceStream.close();
}
{code}

This is not only a problem of PDFBox ;) but can be done right ... ghostscript 
does it  !gs.png! .


> Printing invisible Content from PDF-Forms 
> --
>
> Key: PDFBOX-4399
> URL: https://issues.apache.org/jira/browse/PDFBOX-4399
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm, Rendering
>Affects Versions: 2.0.6
>Reporter: Stefan Ziel
>Priority: Major
> Attachments: gs.png, original.pdf, printed.pdf
>
>
> Printing PDF-documents with macro  [^original.pdf]  renders hidden content  
> [^printed.pdf] . Code used to print
> {code:java}
> InputStream sourceStream = new FileInputStream(pFile);
> try {
>   PDDocument source = PDDocument.load(sourceStream);
>   job.setPageable(new PDFPageable(source));
>   job.print(atts);
> } finally {
>   sourceStream.close();
> }
> {code}
> This is not only a problem of PDFBox ;) but can be done right ... ghostscript 
> does it  [^gs.png].






[jira] [Updated] (PDFBOX-4399) Printing invisible Content from PDF-Forms

2018-12-06 Thread Stefan Ziel (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Ziel updated PDFBOX-4399:

Description: 
Printing PDF-documents with macro  [^original.pdf]  renders hidden content  
[^printed.pdf] . Code used to print

{code:java}
InputStream sourceStream = new FileInputStream(pFile);
try {
  PDDocument source = PDDocument.load(sourceStream);
  job.setPageable(new PDFPageable(source));
  job.print(atts);
} finally {
  sourceStream.close();
}
{code}

This is not only a problem of PDFBox ;) but can be done right ... ghostscript 
does it  !gs.png! .

  was:
Printing PDF-documents with macro  [^original.pdf]  renders hidden content  
[^printed.pdf] . Code used to print

{code:java}
InputStream sourceStream = new FileInputStream(pFile);
try {
  PDDocument source = PDDocument.load(sourceStream);
  job.setPageable(new PDFPageable(source));
  job.print(atts);
} finally {
  sourceStream.close();
}
{code}

This is not only a problem of PDFBox ;) but can be done right ... ghostscript 
does it.


> Printing invisible Content from PDF-Forms 
> --
>
> Key: PDFBOX-4399
> URL: https://issues.apache.org/jira/browse/PDFBOX-4399
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm, Rendering
>Affects Versions: 2.0.6
>Reporter: Stefan Ziel
>Priority: Major
> Attachments: gs.png, original.pdf, printed.pdf
>
>
> Printing PDF-documents with macro  [^original.pdf]  renders hidden content  
> [^printed.pdf] . Code used to print
> {code:java}
> InputStream sourceStream = new FileInputStream(pFile);
> try {
>   PDDocument source = PDDocument.load(sourceStream);
>   job.setPageable(new PDFPageable(source));
>   job.print(atts);
> } finally {
>   sourceStream.close();
> }
> {code}
> This is not only a problem of PDFBox ;) but can be done right ... ghostscript 
> does it  !gs.png! .






[jira] [Commented] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances

2018-12-06 Thread Itai Shaked (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711383#comment-16711383
 ] 

Itai Shaked commented on PDFBOX-4392:
-

This may shed some more light on the issue - the next line (172) creates a new 
`Color` object, which itself calls `toRGB` on the given color space - this 
seems to have been done later as a fix for PDFBOX-3549. This explains why 
removing the first call to `toRGB` improves performance by 50% - it was being 
called 2 times! To me this seems like proof the first call is redundant - the 
exception will be thrown in the constructor of `Color`, if it passed the 
previous test. 

Since the call to `toRGB` seems to be extremely slow, perhaps it is wise to 
also reverse the order of the two remaining tests (creating a `Color` and a 
`ComponentColorModel`), so the slowest test is saved for last and skipped if 
the profile is already deemed problematic. 
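
The double call described above can be reproduced outside PDFBox with a small sketch. It assumes the standard `java.awt.Color(ColorSpace, float[], float)` constructor, which converts the components through the given color space. `CountingColorSpace` is a hypothetical helper written for this illustration, and its identity conversion is a placeholder rather than a real ICC transform.

```java
import java.awt.Color;
import java.awt.color.ColorSpace;

public class ToRgbCallCounter {
    /** Hypothetical 3-component color space that counts toRGB invocations. */
    static class CountingColorSpace extends ColorSpace {
        private static final long serialVersionUID = 1L;
        int toRgbCalls = 0;

        CountingColorSpace() {
            super(ColorSpace.TYPE_RGB, 3);
        }

        @Override
        public float[] toRGB(float[] colorvalue) {
            toRgbCalls++;
            return colorvalue.clone(); // placeholder identity conversion
        }

        @Override
        public float[] fromRGB(float[] rgbvalue) {
            return rgbvalue.clone(); // placeholder identity conversion
        }

        @Override
        public float[] toCIEXYZ(float[] colorvalue) {
            throw new UnsupportedOperationException("not needed for this sketch");
        }

        @Override
        public float[] fromCIEXYZ(float[] colorvalue) {
            throw new UnsupportedOperationException("not needed for this sketch");
        }
    }

    /** Explicit toRGB pre-check followed by Color construction. */
    public static int callsWithPrecheck() {
        CountingColorSpace cs = new CountingColorSpace();
        float[] components = {0.5f, 0.5f, 0.5f};
        cs.toRGB(components);
        new Color(cs, components, 1f);
        return cs.toRgbCalls;
    }

    /** Color construction only. */
    public static int callsWithoutPrecheck() {
        CountingColorSpace cs = new CountingColorSpace();
        new Color(cs, new float[] {0.5f, 0.5f, 0.5f}, 1f);
        return cs.toRgbCalls;
    }
}
```

Comparing `callsWithPrecheck()` with `callsWithoutPrecheck()` shows the extra conversion. With a real ICC profile each conversion goes through the color management module, which is where the cost reported in this issue would come from.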

> PDF completely blow up the RAM on amazon instances
> --
>
> Key: PDFBOX-4392
> URL: https://issues.apache.org/jira/browse/PDFBOX-4392
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 2.0.12
>Reporter: Oleksandr Skoryi
>Priority: Major
> Fix For: 2.0.13
>
> Attachments: 2f0f8f77-7a85-416d-b5d2-47a07d1416d4_3.pdf, 
> 4392-prereadICC.patch
>
>
> Hi all,
> The issue is pretty straightforward. I receive a lot of PDFs every day and 
> render them. In most cases everything is OK, but PDFs that produce the warning
> WARN org.apache.pdfbox.pdmodel.graphics.color.PDICCBased - ICC profile is 
> Perceptual, ignoring, treating as Display class
> take a very long time to render and consume a lot of memory. 
> Rendering takes from 5 to 15 min on an m5.large Amazon instance, but the attached 
> PDF completely killed the instance: the Java process was killed by Linux 
> during processing, with no exception in the logs. 
> Could you please explain what is going on with files that produce the 
> WARN message above, and how I can improve the rendering? 
>  
> Here are my VM options: 
> -Dorg.apache.pdfbox.rendering.UsePureJavaCMYKConversion=true -Xmx3G -Xms2G 
> -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider
> Also, don't hesitate to ask me for more PDFs, I have tons of them :D
>  
> One more question: does the GPU have any influence on rendering?






[jira] [Commented] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances

2018-12-06 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711426#comment-16711426
 ] 

Tilman Hausherr commented on PDFBOX-4392:
-

PDFBOX-1740 is not a typo - the file was mentioned by me and by John in 
PDFBOX-1893.

I'll retest with KCMS and LCMS whether we can delete the call from line 170.

I think the order doesn't matter because most files don't fail.

> PDF completely blow up the RAM on amazon instances
> --
>
> Key: PDFBOX-4392
> URL: https://issues.apache.org/jira/browse/PDFBOX-4392
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 2.0.12
>Reporter: Oleksandr Skoryi
>Priority: Major
> Fix For: 2.0.13
>
> Attachments: 2f0f8f77-7a85-416d-b5d2-47a07d1416d4_3.pdf, 
> 4392-prereadICC.patch
>
>
> Hi all,
> The issue is pretty straightforward. I receive a lot of PDFs every day and 
> render them. In most cases everything is OK, but PDFs that produce the warning
> WARN org.apache.pdfbox.pdmodel.graphics.color.PDICCBased - ICC profile is 
> Perceptual, ignoring, treating as Display class
> take a very long time to render and consume a lot of memory. 
> Rendering takes from 5 to 15 min on an m5.large Amazon instance, but the attached 
> PDF completely killed the instance: the Java process was killed by Linux 
> during processing, with no exception in the logs. 
> Could you please explain what is going on with files that produce the 
> WARN message above, and how I can improve the rendering? 
>  
> Here are my VM options: 
> -Dorg.apache.pdfbox.rendering.UsePureJavaCMYKConversion=true -Xmx3G -Xms2G 
> -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider
> Also, don't hesitate to ask me for more PDFs, I have tons of them :D
>  
> One more question: does the GPU have any influence on rendering?






[jira] [Created] (PDFBOX-4399) Printing invisible Content from PDF-Forms

2018-12-06 Thread Stefan Ziel (JIRA)
Stefan Ziel created PDFBOX-4399:
---

 Summary: Printing invisible Content from PDF-Forms 
 Key: PDFBOX-4399
 URL: https://issues.apache.org/jira/browse/PDFBOX-4399
 Project: PDFBox
  Issue Type: Bug
  Components: AcroForm, Rendering
Affects Versions: 2.0.6
Reporter: Stefan Ziel


Printing PDF-documents with macro renders hidden content.

{code:java}
InputStream sourceStream = new FileInputStream(pFile);
try {
  PDDocument source = PDDocument.load(sourceStream);
  job.setPageable(new PDFPageable(source));
  job.print(atts);
} finally {
  sourceStream.close();
}
{code}






[jira] [Updated] (PDFBOX-4399) Printing invisible Content from PDF-Forms

2018-12-06 Thread Stefan Ziel (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Ziel updated PDFBOX-4399:

Attachment: (was: printed.pdf)

> Printing invisible Content from PDF-Forms 
> --
>
> Key: PDFBOX-4399
> URL: https://issues.apache.org/jira/browse/PDFBOX-4399
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm, Rendering
>Affects Versions: 2.0.6
>Reporter: Stefan Ziel
>Priority: Major
> Attachments: gs.png, original.pdf, printed.png
>
>
> Printing PDF-documents with macro  [^original.pdf]  renders hidden content  
> [^printed.pdf] . Code used to print
> {code:java}
> InputStream sourceStream = new FileInputStream(pFile);
> try {
>   PDDocument source = PDDocument.load(sourceStream);
>   job.setPageable(new PDFPageable(source));
>   job.print(atts);
> } finally {
>   sourceStream.close();
> }
> {code}
> This is not only a problem of PDFBox ;) but can be done right ... ghostscript 
> does it  [^gs.png].






[jira] [Resolved] (PDFBOX-4303) Helv and ZaDb overridden

2018-12-06 Thread Tilman Hausherr (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-4303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr resolved PDFBOX-4303.
-
Resolution: Fixed
  Assignee: Tilman Hausherr  (was: Maruan Sahyoun)

I created a different test than the proposed one to work without a file.

> Helv and ZaDb overridden
> 
>
> Key: PDFBOX-4303
> URL: https://issues.apache.org/jira/browse/PDFBOX-4303
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm
>Affects Versions: 2.0.11
>Reporter: simon steiner
>Assignee: Tilman Hausherr
>Priority: Major
>  Labels: Appearance
> Fix For: 2.0.14, 3.0.0 PDFBox
>
> Attachments: PDFBOX-4303-2.0.12.diff, ReaderModifiedForm.pdf
>
>
> Due to the change in
> PDFBOX-3943: create /Helv and /ZaDb entries if they don't exist, regardless 
> if /DR existed or not
>  
> It was working OK in 2.0.7. In the 2.0 branch, the check in
> PDAcroForm.verifyOrCreateDefaults() is:
> if (!defaultResources.getCOSObject().containsKey("Helv"))
> It should be checking for the key in the font dictionary before calling 
> defaultResources.put
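
A minimal sketch of the check the report asks for, assuming fonts live in a /Font sub-dictionary of /DR, so that a lookup at the top level of /DR would not find "Helv". It uses plain java.util maps rather than PDFBox classes. DefaultResourcesSketch, putFontIfAbsent and ensureDefaults are hypothetical names chosen for illustration, not PDFBox API, and the font values are placeholders.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for /DR: resource category -> (resource name -> object).
final class DefaultResourcesSketch {
    private final Map<String, Map<String, String>> categories = new LinkedHashMap<>();

    /** Adds a font under /Font only if that name is not already present there. */
    boolean putFontIfAbsent(String name, String font) {
        Map<String, String> fonts =
                categories.computeIfAbsent("Font", k -> new LinkedHashMap<>());
        return fonts.putIfAbsent(name, font) == null;
    }

    /** Looks the name up in the /Font sub-dictionary, not at the top level. */
    String getFont(String name) {
        Map<String, String> fonts = categories.get("Font");
        return fonts == null ? null : fonts.get(name);
    }

    /** Creates the two default entries without overriding existing ones. */
    static void ensureDefaults(DefaultResourcesSketch dr) {
        dr.putFontIfAbsent("Helv", "Helvetica");
        dr.putFontIfAbsent("ZaDb", "ZapfDingbats");
    }

    public static void main(String[] args) {
        DefaultResourcesSketch dr = new DefaultResourcesSketch();
        dr.putFontIfAbsent("Helv", "CustomHelvetica"); // font shipped with the form
        ensureDefaults(dr);
        System.out.println(dr.getFont("Helv")); // prints "CustomHelvetica"
        System.out.println(dr.getFont("ZaDb")); // prints "ZapfDingbats"
    }
}
```

The point of the sketch is only the lookup location: the existence test and the put both address the same /Font sub-dictionary, so a font that the form already defines is left alone.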






[jira] [Updated] (PDFBOX-4399) Printing invisible Content from PDF-Forms

2018-12-06 Thread Stefan Ziel (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Ziel updated PDFBOX-4399:

Description: 
Printing PDF-documents with macro  [^original.pdf]  renders hidden content  
[^printed.pdf] . Code used to print

{code:java}
InputStream sourceStream = new FileInputStream(pFile);
try {
  PDDocument source = PDDocument.load(sourceStream);
  job.setPageable(new PDFPageable(source));
  job.print(atts);
} finally {
  sourceStream.close();
}
{code}

This is not only a problem of PDFBox ;) but can be done right ... ghostscript 
does it.

  was:
Printing PDF-documents with macro renders hidden content.

{code:java}
InputStream sourceStream = new FileInputStream(pFile);
try {
  PDDocument source = PDDocument.load(sourceStream);
  job.setPageable(new PDFPageable(source));
  job.print(atts);
} finally {
  sourceStream.close();
}
{code}


> Printing invisible Content from PDF-Forms 
> --
>
> Key: PDFBOX-4399
> URL: https://issues.apache.org/jira/browse/PDFBOX-4399
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm, Rendering
>Affects Versions: 2.0.6
>Reporter: Stefan Ziel
>Priority: Major
> Attachments: original.pdf, printed.pdf
>
>
> Printing PDF-documents with macro  [^original.pdf]  renders hidden content  
> [^printed.pdf] . Code used to print
> {code:java}
> InputStream sourceStream = new FileInputStream(pFile);
> try {
>   PDDocument source = PDDocument.load(sourceStream);
>   job.setPageable(new PDFPageable(source));
>   job.print(atts);
> } finally {
>   sourceStream.close();
> }
> {code}
> This is not only a problem of PDFBox ;) but can be done right ... ghostscript 
> does it.






[jira] [Updated] (PDFBOX-4399) Printing invisible Content from PDF-Forms

2018-12-06 Thread Stefan Ziel (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stefan Ziel updated PDFBOX-4399:

Description: 
Printing PDF-documents with macro [^original.pdf] renders hidden content  
[^printed.png]  . Code used to print
{code:java}
InputStream sourceStream = new FileInputStream(pFile);
try {
  PDDocument source = PDDocument.load(sourceStream);
  job.setPageable(new PDFPageable(source));
  job.print(atts);
} finally {
  sourceStream.close();
}
{code}
This is not only a problem of PDFBox ;) but can be done right ... ghostscript 
does it [^gs.png].

  was:
Printing PDF-documents with macro  [^original.pdf]  renders hidden content  
[^printed.pdf] . Code used to print

{code:java}
InputStream sourceStream = new FileInputStream(pFile);
try {
  PDDocument source = PDDocument.load(sourceStream);
  job.setPageable(new PDFPageable(source));
  job.print(atts);
} finally {
  sourceStream.close();
}
{code}

This is not only a problem of PDFBox ;) but can be done right ... ghostscript 
does it  [^gs.png].


> Printing invisible Content from PDF-Forms 
> --
>
> Key: PDFBOX-4399
> URL: https://issues.apache.org/jira/browse/PDFBOX-4399
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm, Rendering
>Affects Versions: 2.0.6
>Reporter: Stefan Ziel
>Priority: Major
> Attachments: gs.png, original.pdf, printed.png
>
>
> Printing PDF-documents with macro [^original.pdf] renders hidden content  
> [^printed.png]  . Code used to print
> {code:java}
> InputStream sourceStream = new FileInputStream(pFile);
> try {
>   PDDocument source = PDDocument.load(sourceStream);
>   job.setPageable(new PDFPageable(source));
>   job.print(atts);
> } finally {
>   sourceStream.close();
> }
> {code}
> This is not only a problem of PDFBox ;) but can be done right ... ghostscript 
> does it [^gs.png].






[jira] [Commented] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances

2018-12-06 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711613#comment-16711613
 ] 

Tilman Hausherr commented on PDFBOX-4392:
-

ComponentColorModel construction is still needed when using LCMS.

Everything else isn't needed when using LCMS.



> PDF completely blow up the RAM on amazon instances
> --
>
> Key: PDFBOX-4392
> URL: https://issues.apache.org/jira/browse/PDFBOX-4392
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 2.0.12
>Reporter: Oleksandr Skoryi
>Priority: Major
> Fix For: 2.0.13
>
> Attachments: 2f0f8f77-7a85-416d-b5d2-47a07d1416d4_3.pdf, 
> 4392-prereadICC.patch
>
>
> Hi all
> The issue is pretty straightforward. I receive a lot of PDFs every day and 
> render them. In most cases everything is OK, but PDFs which produce 
> WARN org.apache.pdfbox.pdmodel.graphics.color.PDICCBased - ICC profile is 
> Perceptual, ignoring, treating as Display class
> take very long to render and consume a lot of memory. 
> Rendering takes from 5 to 15 min on an m5.large Amazon instance, but the attached 
> PDF completely killed the instance: the Java process is simply killed by Linux 
> during processing, with no exception in the logs. 
> Could you please explain what is going on with files that produce the 
> WARN message above, and how I can improve the rendering? 
>  
> Here are my VM options: 
> -Dorg.apache.pdfbox.rendering.UsePureJavaCMYKConversion=true -Xmx3G -Xms2G 
> -Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider
> Also don't hesitate to ask me for more PDFs, I have tons of them :D
>  
> One more question: does the GPU have any influence on rendering?
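
When options like the ones above are passed on the command line, it can help to confirm from inside the JVM that they actually took effect. The following is a small sketch using only standard Java calls; VmOptionsReport, toMiB and describe are hypothetical names, and the two property names are copied from the report. The heap figure reported by Runtime.maxMemory() may be somewhat lower than the -Xmx value, depending on the collector.

```java
// Hypothetical diagnostic helper; not part of PDFBox.
final class VmOptionsReport {

    /** Converts a byte count to whole mebibytes, rounding down. */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Summarises the heap limit and the two system properties from the report. */
    static String describe() {
        long maxHeap = Runtime.getRuntime().maxMemory();
        String cmm = System.getProperty("sun.java2d.cmm", "(JDK default)");
        String pureJava = System.getProperty(
                "org.apache.pdfbox.rendering.UsePureJavaCMYKConversion", "(unset)");
        return "max heap: " + toMiB(maxHeap) + " MiB"
                + ", sun.java2d.cmm=" + cmm
                + ", UsePureJavaCMYKConversion=" + pureJava;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it with the same -D and -Xmx flags as the rendering job shows whether the flags reached the JVM. It only reports the Java heap limit, not the total memory of the process.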






[jira] [Updated] (PDFBOX-4399) Printing invisible Content from PDF-Forms

2018-12-06 Thread Tilman Hausherr (JIRA)


 [ 
https://issues.apache.org/jira/browse/PDFBOX-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-4399:

Affects Version/s: 2.0.13

> Printing invisible Content from PDF-Forms 
> --
>
> Key: PDFBOX-4399
> URL: https://issues.apache.org/jira/browse/PDFBOX-4399
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm, Rendering
>Affects Versions: 2.0.6, 2.0.13
>Reporter: Stefan Ziel
>Priority: Major
> Attachments: gs.png, original.pdf, printed.png
>
>
> Printing PDF-documents with macro [^original.pdf] renders hidden content  
> [^printed.png]  . Code used to print
> {code:java}
> InputStream sourceStream = new FileInputStream(pFile);
> try {
>   PDDocument source = PDDocument.load(sourceStream);
>   job.setPageable(new PDFPageable(source));
>   job.print(atts);
> } finally {
>   sourceStream.close();
> }
> {code}
> This is not only a problem of PDFBox ;) but can be done right ... ghostscript 
> does it [^gs.png].






[jira] [Commented] (PDFBOX-4399) Printing invisible Content from PDF-Forms

2018-12-06 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711813#comment-16711813
 ] 

Tilman Hausherr commented on PDFBOX-4399:
-

Your PDF contains Javascript and optional content groups. (I didn't test it 
because of Javascript). Chrome can display it, Edge and PDF.js can't. Neither 
can we.







[jira] [Commented] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances

2018-12-06 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711704#comment-16711704
 ] 

ASF subversion and git services commented on PDFBOX-4392:
-

Commit 1848341 from til...@apache.org in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1848341 ]

PDFBOX-4392: work around java CMS bugs depending of the jdk version / the CMS 
type







[jira] [Commented] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances

2018-12-06 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711705#comment-16711705
 ] 

ASF subversion and git services commented on PDFBOX-4392:
-

Commit 1848342 from til...@apache.org in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1848342 ]

PDFBOX-4392: work around java CMS bugs depending of the jdk version / the CMS 
type







[jira] [Commented] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances

2018-12-06 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711722#comment-16711722
 ] 

Tilman Hausherr commented on PDFBOX-4392:
-

{{isMinJdk8()}} is now duplicated (it also exists in PDFRenderer). I don't have 
a good idea where to put it or how to name it.
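
One possible shape for a shared helper, sketched under the assumption that the check can be based on the standard java.specification.version system property ("1.8" on Java 8, "9", "11", ... on later releases). JdkVersionSketch and majorVersion are hypothetical names; this is not the code that was committed.

```java
// Hypothetical utility class; the name and location are placeholders.
final class JdkVersionSketch {

    private JdkVersionSketch() {
    }

    /** Maps "1.7" -> 7, "1.8" -> 8, "9" -> 9, "11" -> 11. */
    static int majorVersion(String specVersion) {
        String[] parts = specVersion.split("\\.");
        int first = Integer.parseInt(parts[0]);
        // Before Java 9 the specification version had the form "1.x".
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    /** True when the running JVM implements Java 8 or later. */
    static boolean isMinJdk8() {
        return majorVersion(System.getProperty("java.specification.version")) >= 8;
    }

    public static void main(String[] args) {
        System.out.println(isMinJdk8());
    }
}
```

Keeping the string parsing in a separate method makes the version logic testable without having to run on several JDKs.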







[jira] [Comment Edited] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances

2018-12-06 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16707114#comment-16707114
 ] 

Tilman Hausherr edited comment on PDFBOX-4392 at 12/6/18 2:03 PM:
--

On my PC it takes 106 seconds to render in the -"ridiculous speed"- "ultimate 
speed" mode of Windows 10, I have set -Xmx2g. Yes GPU may be relevant. The 
warning is not really important.


was (Author: tilman):
On my PC it takes 106 seconds to render in the "ultimate speed" mode of Windows 
10, I have set -Xmx2g. Yes GPU may be relevant. The warning is not really 
important.







[jira] [Commented] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances

2018-12-06 Thread ASF subversion and git services (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711504#comment-16711504
 ] 

ASF subversion and git services commented on PDFBOX-4392:
-

Commit 1848318 from til...@apache.org in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1848318 ]

PDFBOX-4392: remove toRGB() which is contained in Color() construction, thanks 
Itai Shaked







[jira] [Commented] (PDFBOX-4392) PDF completely blow up the RAM on amazon instances

2018-12-06 Thread Tilman Hausherr (JIRA)


[ 
https://issues.apache.org/jira/browse/PDFBOX-4392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16711528#comment-16711528
 ] 

Tilman Hausherr commented on PDFBOX-4392:
-

I'll retest with KCMS and LCMS whether the ComponentColorModel construction is 
still needed. And also whether the "new Color" is needed on LCMS.



