Build failed in Jenkins: PDFBox-sonar #435

2018-04-07 Thread Apache Jenkins Server
See 


Changes:

[tilman] PDFBOX-4071: fix formatting

[tilman] PDFBOX-4186: restore and improve dpi usage

[tilman] PDFBOX-4186: add quality option to pdfbox-app, as suggested by Martin Hausner

[msahyoun] PDFBOX-4185: support COSString, COSArray mixed options entries

--
[...truncated 6.75 KB...]
[INFO] 
[INFO] 
[INFO] Building Apache XmpBox 3.0.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-enforcer-plugin:1.4.1:enforce (enforce-maven-version) @ xmpbox ---
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (process-resource-bundles) @ xmpbox ---
[INFO] 
[INFO] --- maven-resources-plugin:3.0.2:resources (default-resources) @ xmpbox ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 

[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.6.0:compile (default-compile) @ xmpbox ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 72 source files to 

[WARNING] bootstrap class path not set in conjunction with -source 1.7
[INFO] 
[INFO] 
[INFO] Building Apache PDFBox 3.0.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-enforcer-plugin:1.4.1:enforce (enforce-maven-version) @ pdfbox ---
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (process-resource-bundles) @ pdfbox ---
[INFO] 
[INFO] --- maven-resources-plugin:3.0.2:resources (default-resources) @ pdfbox ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] Copying 21 resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.6.0:compile (default-compile) @ pdfbox ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 592 source files to 

[WARNING] bootstrap class path not set in conjunction with -source 1.7
[WARNING] :[310,27] getHeight(int) in org.apache.pdfbox.pdmodel.font.PDFontLike has been deprecated
[WARNING] :[297,27] getHeight(int) in org.apache.pdfbox.pdmodel.font.PDFontLike has been deprecated
[WARNING] :[100,19] getUnicodeCmap() in org.apache.fontbox.ttf.TrueTypeFont has been deprecated
[INFO] : Some input files use unchecked or unsafe operations.
[INFO] : Recompile with -Xlint:unchecked for details.
[INFO] 
[INFO] 
[INFO] Building Apache Preflight 3.0.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-enforcer-plugin:1.4.1:enforce (enforce-maven-version) @ preflight ---
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (process-resource-bundles) @ preflight ---
[INFO] 
[INFO] --- maven-resources-plugin:3.0.2:resources (default-resources) @ preflight ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 0 resource
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.6.0:compile (default-compile) @ preflight ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 117 source files to 

[WARNING] bootstrap class path not set in conjunction with -source 1.7
[INFO] : uses unchecked or unsafe operations.
[INFO] 

Build failed in Jenkins: PDFBox-sonar » Apache PDFBox #435

2018-04-07 Thread Apache Jenkins Server
See 


--
[INFO] 
[INFO] 
[INFO] Building Apache PDFBox 3.0.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-enforcer-plugin:1.4.1:enforce (enforce-maven-version) @ pdfbox-reactor ---
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (process-resource-bundles) @ pdfbox-reactor ---
[INFO] 
[INFO] 
[INFO] Building Apache PDFBox 3.0.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- sonar-maven-plugin:3.4.0.905:sonar (default-cli) @ pdfbox-reactor ---
[INFO] User cache: /home/jenkins/.sonar/cache
[INFO] SonarQube version: 5.6.3
[INFO] Default locale: "en_US", source code encoding: "UTF-8"
[INFO] Load global repositories
[INFO] Load global repositories (done) | time=11587ms
[INFO] Server id: 17cad23492ff68b
[INFO] User cache: /home/jenkins/.sonar/cache
[INFO] Load plugins index
[INFO] Load plugins index (done) | time=342ms
[INFO] Process project properties
[INFO] Load project repositories
[INFO] Load project repositories (done) | time=1229ms
[INFO] Load quality profiles
[INFO] Load quality profiles (done) | time=355ms
[INFO] Load active rules
[INFO] Load active rules (done) | time=2055ms
[INFO] Publish mode
[INFO] -  Scan PDFBox parent
[INFO] Load server rules

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[jira] [Resolved] (PDFBOX-4186) Add quality option for compressed images to pdfbox-app

2018-04-07 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr resolved PDFBOX-4186.
-
Resolution: Fixed
  Assignee: Tilman Hausherr

Done, snapshot here:

[https://repository.apache.org/content/groups/snapshots/org/apache/pdfbox/pdfbox-app/2.0.10-SNAPSHOT/]

enjoy :)

> Add quality option for compressed images to pdfbox-app
> --
>
> Key: PDFBOX-4186
> URL: https://issues.apache.org/jira/browse/PDFBOX-4186
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Utilities
>Affects Versions: 2.0.9, 3.0.0 PDFBox
>Reporter: Martin Hausner
>Assignee: Tilman Hausherr
>Priority: Major
> Fix For: 2.0.10, 3.0.0 PDFBox
>
> Attachments: pdfbox-tool.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Add a command-line *quality* option for compressed images to pdfbox-app
> ex: -quality 0.75
>  see [^pdfbox-tool.patch]
>  
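
For readers unfamiliar with what a value like -quality 0.75 usually controls: below is a minimal sketch in plain Java (javax.imageio) of writing a JPEG with an explicit compression quality. It is not taken from pdfbox-tool.patch and does not claim to show how pdfbox-app handles the option; the names JpegQualitySketch, encodeJpeg and noiseImage are made up for illustration, and the assumption is that the quality is a float in the range (0, 1].

```java
import java.awt.image.BufferedImage;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Random;
import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.ImageOutputStream;

final class JpegQualitySketch
{
    // Encodes the image as JPEG with an explicit quality (assumed range: 0 < quality <= 1)
    // and returns the encoded bytes.
    static byte[] encodeJpeg(BufferedImage image, float quality)
    {
        ImageWriter writer = ImageIO.getImageWritersByFormatName("jpeg").next();
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ImageOutputStream out = ImageIO.createImageOutputStream(bytes))
        {
            ImageWriteParam param = writer.getDefaultWriteParam();
            // The quality setting is only honoured in explicit compression mode.
            param.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
            param.setCompressionQuality(quality);
            writer.setOutput(out);
            writer.write(null, new IIOImage(image, null, null), param);
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        finally
        {
            writer.dispose();
        }
        return bytes.toByteArray();
    }

    // Hypothetical helper: a deterministic noisy RGB image, so that the
    // quality setting has a visible effect on the encoded size.
    static BufferedImage noiseImage(int width, int height)
    {
        BufferedImage image = new BufferedImage(width, height, BufferedImage.TYPE_INT_RGB);
        Random random = new Random(42);
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                image.setRGB(x, y, random.nextInt(0x1000000));
            }
        }
        return image;
    }

    public static void main(String[] args)
    {
        BufferedImage image = noiseImage(64, 64);
        System.out.println("quality 0.25: " + encodeJpeg(image, 0.25f).length + " bytes");
        System.out.println("quality 0.75: " + encodeJpeg(image, 0.75f).length + " bytes");
    }
}
```

Lower quality values trade image fidelity for smaller output, which is why exposing the value on the command line is useful.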



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)




[jira] [Commented] (PDFBOX-4071) Improve code quality (3)

2018-04-07 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429516#comment-16429516
 ] 

ASF subversion and git services commented on PDFBOX-4071:
-

Commit 1828609 from [~tilman] in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1828609 ]

PDFBOX-4071: fix formatting

> Improve code quality (3)
> 
>
> Key: PDFBOX-4071
> URL: https://issues.apache.org/jira/browse/PDFBOX-4071
> Project: PDFBox
>  Issue Type: Task
>Affects Versions: 2.0.8
>Reporter: Tilman Hausherr
>Priority: Major
> Attachments: pdfbox-screenshot-bad.png, pdfbox-screenshot-good.png
>
>
> This is a long-term issue for the task of improving code quality, by using the
> [SonarQube 
> report|https://analysis.apache.org/dashboard/index/org.apache.pdfbox:pdfbox-reactor],
>  hints in different IDEs, the FindBugs tool and other code quality tools.
> This is a follow-up of PDFBOX-2852, which was getting too long.






[jira] [Commented] (PDFBOX-4071) Improve code quality (3)

2018-04-07 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429515#comment-16429515
 ] 

ASF subversion and git services commented on PDFBOX-4071:
-

Commit 1828608 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1828608 ]

PDFBOX-4071: fix formatting

> Improve code quality (3)
> 
>
> Key: PDFBOX-4071
> URL: https://issues.apache.org/jira/browse/PDFBOX-4071
> Project: PDFBox
>  Issue Type: Task
>Affects Versions: 2.0.8
>Reporter: Tilman Hausherr
>Priority: Major
> Attachments: pdfbox-screenshot-bad.png, pdfbox-screenshot-good.png
>
>
> This is a long-term issue for the task of improving code quality, by using the
> [SonarQube 
> report|https://analysis.apache.org/dashboard/index/org.apache.pdfbox:pdfbox-reactor],
>  hints in different IDEs, the FindBugs tool and other code quality tools.
> This is a follow-up of PDFBOX-2852, which was getting too long.






[jira] [Commented] (PDFBOX-4186) Add quality option for compressed images to pdfbox-app

2018-04-07 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429512#comment-16429512
 ] 

ASF subversion and git services commented on PDFBOX-4186:
-

Commit 1828606 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1828606 ]

PDFBOX-4186: restore and improve dpi usage

> Add quality option for compressed images to pdfbox-app
> --
>
> Key: PDFBOX-4186
> URL: https://issues.apache.org/jira/browse/PDFBOX-4186
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Utilities
>Affects Versions: 2.0.9, 3.0.0 PDFBox
>Reporter: Martin Hausner
>Priority: Major
> Fix For: 2.0.10, 3.0.0 PDFBox
>
> Attachments: pdfbox-tool.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Add a command-line *quality* option for compressed images to pdfbox-app
> ex: -quality 0.75
>  see [^pdfbox-tool.patch]
>  






[jira] [Commented] (PDFBOX-4186) Add quality option for compressed images to pdfbox-app

2018-04-07 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429513#comment-16429513
 ] 

ASF subversion and git services commented on PDFBOX-4186:
-

Commit 1828607 from [~tilman] in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1828607 ]

PDFBOX-4186: restore and improve dpi usage

> Add quality option for compressed images to pdfbox-app
> --
>
> Key: PDFBOX-4186
> URL: https://issues.apache.org/jira/browse/PDFBOX-4186
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Utilities
>Affects Versions: 2.0.9, 3.0.0 PDFBox
>Reporter: Martin Hausner
>Priority: Major
> Fix For: 2.0.10, 3.0.0 PDFBox
>
> Attachments: pdfbox-tool.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Add a command-line *quality* option for compressed images to pdfbox-app
> ex: -quality 0.75
>  see [^pdfbox-tool.patch]
>  






[jira] [Commented] (PDFBOX-4186) Add quality option for compressed images to pdfbox-app

2018-04-07 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429506#comment-16429506
 ] 

ASF subversion and git services commented on PDFBOX-4186:
-

Commit 1828604 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1828604 ]

PDFBOX-4186: add quality option to pdfbox-app, as suggested by Martin Hausner

> Add quality option for compressed images to pdfbox-app
> --
>
> Key: PDFBOX-4186
> URL: https://issues.apache.org/jira/browse/PDFBOX-4186
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Utilities
>Affects Versions: 2.0.9, 3.0.0 PDFBox
>Reporter: Martin Hausner
>Priority: Major
> Fix For: 2.0.10, 3.0.0 PDFBox
>
> Attachments: pdfbox-tool.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Add a command-line *quality* option for compressed images to pdfbox-app
> ex: -quality 0.75
>  see [^pdfbox-tool.patch]
>  






[jira] [Commented] (PDFBOX-4186) Add quality option for compressed images to pdfbox-app

2018-04-07 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429505#comment-16429505
 ] 

ASF subversion and git services commented on PDFBOX-4186:
-

Commit 1828603 from [~tilman] in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1828603 ]

PDFBOX-4186: add quality option to pdfbox-app, as suggested by Martin Hausner

> Add quality option for compressed images to pdfbox-app
> --
>
> Key: PDFBOX-4186
> URL: https://issues.apache.org/jira/browse/PDFBOX-4186
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Utilities
>Affects Versions: 2.0.9, 3.0.0 PDFBox
>Reporter: Martin Hausner
>Priority: Major
> Fix For: 2.0.10, 3.0.0 PDFBox
>
> Attachments: pdfbox-tool.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Add a command-line *quality* option for compressed images to pdfbox-app
> ex: -quality 0.75
>  see [^pdfbox-tool.patch]
>  






[jira] [Commented] (PDFBOX-4182) Improve memory usage of PDFMergerUtility

2018-04-07 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429472#comment-16429472
 ] 

Tilman Hausherr commented on PDFBOX-4182:
-

I found it:
https://stackoverflow.com/questions/47140209/files-flattened-and-merged-with-pdfbox-are-sharing-common-cosstream
which leads to PDFBOX-3999, PDFBOX-4003 and PDFBOX-4004.

> Improve memory usage of PDFMergerUtility
> 
>
> Key: PDFBOX-4182
> URL: https://issues.apache.org/jira/browse/PDFBOX-4182
> Project: PDFBox
>  Issue Type: Improvement
>Affects Versions: 2.0.9
>Reporter: Pas Filip
>Priority: Major
> Attachments: PDFMergerUtilityUsingSupplier.java, Supplier.java, 
> Suppliers.java, 
> failed-merge-utility-4gb-heap-out-of-memory-after-1800-pdfs.png, 
> merge-pdf-stats.xlsx, oom-2gb-heap-after-refactoring-leak-suspect-1.png, 
> oom-2gb-heap-after-refactoring-leak-suspect-2.png, successful - 
> refactored-merge-utility-4gb-heap-2618-files-merged.png, successful 
> -merge-utility-6gb-heap-2618-files-merged.png, 
> successful-merge-utility-6gb-heap-2618-files-merged-setupTempFileOnly.png, 
> successful-merge-utility-8gb-heap-2618-files-merged.png, 
> successful-refactored-merge-utility-4gb-heap-2618-files-merged-setupTempFileOnly.png
>
>
> I have been running some tests trying to merge large amounts (2618) of small 
> pdf documents, between 100kb and 130kb, into a single large pdf (288.433kb)
> Memory consumption seems to be the main limitation.
> ScratchFileBuffer seems to consume the majority of the memory usage.
> (see screenshot from mat in attachment)
> (I would include the hprof in attachment so you can analyze yourselves but 
> it's rather large)
> Note that it seems impossible to generate a large pdf using a small memory 
> footprint.
> I personally thought that using MemorySettings with temporary file only would 
> allow me to generate arbitrarily large pdf files but it doesn't seem to help.
> I've run the mergeDocuments with  memory settings:
>  * MemoryUsageSetting.setupMixed(1024L * 1024L, 1024L * 1024L * 1024L * 1024L 
> * 1024L)
>  * MemoryUsageSetting.setupTempFileOnly()
> Refactored version completes with *4GB* heap:
> with temp file only completes 2618 documents in 1.760 min
> *VS*
> *8GB* heap:
> with temp file only completes 2618 documents in 2.0 min
> Heaps of 6gb or less result in OOM. (Didn't try different sizes between 6GB 
> and 8GB)
>  It looks like the loop in the mergeDocuments accumulates PDDocument objects 
> in a list which are closed after the merge is completed.
> Refactoring the code to close these as they are used, instead of accumulating
> them and closing all at the end, improves memory usage considerably (although,
> based on the MAT analysis, the problem doesn't seem to be eliminated completely).
> Another change I've implemented is to only create the inputstream when the 
> file needs to be read and to close it alongside the PDDocument.
> (Some inputstreams contain buffers and depending on the size of the buffers 
> and or the stream type accumulating all the streams is a potential 
> memory-hog.)
> These changes seem to be beneficial in the sense that I can
> process the same number of PDFs with about half the memory.
>  I'd appreciate it if you could roll these changes into the main codebase.
> (I've respected java 6 compatibility.)
> I've included in attachment the java files of the new implementation:
>  * Suppliers
>  * Supplier
>  * PDFMergerUtilityUsingSupplier
> PDFMergerUtilityUsingSupplier can replace the previous version. There are no signature
> changes, only internal code changes. (Just rename the class to
> PDFMergerUtility if you decide to implement the changes.)
>  In attachment you can also find some screenshots from visualvm showing the 
> memory usage of the original version and the refactored version as well as 
> some info produced by mat after analysing the heap.
> If you know of any other means, without running into memory issues, to merge 
> large sets of pdf files into a large single pdf I'd love to hear about it!
> I'd also suggest further improvements to memory usage in general, as PDFBox
> seems to consume a lot of memory.
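
The idea described above (open each input only when it is needed and close it before moving on, instead of accumulating open inputs and closing them all at the end) can be sketched in plain Java as follows. This is not the attached PDFMergerUtilityUsingSupplier and it does not merge PDFs: it only concatenates bytes to show the resource lifecycle. It also uses java.util.function.Supplier and lambdas, so unlike the attachment it is not Java 6 compatible. All names (LazySourceMergeSketch, concatenate, trackedSource) are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

final class LazySourceMergeSketch
{
    // Concatenates all sources. Each stream is created only when its turn comes
    // and is closed (by try-with-resources) before the next one is opened.
    static byte[] concatenate(List<Supplier<InputStream>> sources)
    {
        ByteArrayOutputStream merged = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        for (Supplier<InputStream> source : sources)
        {
            try (InputStream in = source.get())
            {
                int n;
                while ((n = in.read(buffer)) != -1)
                {
                    merged.write(buffer, 0, n);
                }
            }
            catch (IOException e)
            {
                throw new UncheckedIOException(e);
            }
        }
        return merged.toByteArray();
    }

    // Hypothetical demo helper: a source that records how many of its streams
    // are open at the same time, and the maximum seen so far.
    static Supplier<InputStream> trackedSource(byte[] data, AtomicInteger open, AtomicInteger maxOpen)
    {
        return () ->
        {
            maxOpen.accumulateAndGet(open.incrementAndGet(), Math::max);
            return new ByteArrayInputStream(data)
            {
                @Override
                public void close() throws IOException
                {
                    open.decrementAndGet();
                    super.close();
                }
            };
        };
    }

    public static void main(String[] args)
    {
        AtomicInteger open = new AtomicInteger();
        AtomicInteger maxOpen = new AtomicInteger();
        List<Supplier<InputStream>> sources = new ArrayList<>();
        for (String part : new String[] { "first ", "second ", "third" })
        {
            sources.add(trackedSource(part.getBytes(StandardCharsets.US_ASCII), open, maxOpen));
        }
        byte[] merged = concatenate(sources);
        System.out.println(new String(merged, StandardCharsets.US_ASCII));
        System.out.println("max streams open at once: " + maxOpen.get());
    }
}
```

Holding suppliers instead of open streams means that the number of simultaneously open inputs stays constant regardless of how many files are merged.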






[jira] [Updated] (PDFBOX-4186) Add quality option for compressed images to pdfbox-app

2018-04-07 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-4186:

Fix Version/s: 3.0.0 PDFBox
   2.0.10

> Add quality option for compressed images to pdfbox-app
> --
>
> Key: PDFBOX-4186
> URL: https://issues.apache.org/jira/browse/PDFBOX-4186
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Utilities
>Affects Versions: 2.0.9, 3.0.0 PDFBox
>Reporter: Martin Hausner
>Priority: Major
> Fix For: 2.0.10, 3.0.0 PDFBox
>
> Attachments: pdfbox-tool.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Add a command-line *quality* option for compressed images to pdfbox-app
> ex: -quality 0.75
>  see [^pdfbox-tool.patch]
>  






[jira] [Updated] (PDFBOX-4186) Add quality option for compressed images to pdfbox-app

2018-04-07 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-4186:

Component/s: Utilities

> Add quality option for compressed images to pdfbox-app
> --
>
> Key: PDFBOX-4186
> URL: https://issues.apache.org/jira/browse/PDFBOX-4186
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Utilities
>Affects Versions: 2.0.9, 3.0.0 PDFBox
>Reporter: Martin Hausner
>Priority: Major
> Fix For: 2.0.10, 3.0.0 PDFBox
>
> Attachments: pdfbox-tool.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Add a command-line *quality* option for compressed images to pdfbox-app
> ex: -quality 0.75
>  see [^pdfbox-tool.patch]
>  






[jira] [Updated] (PDFBOX-4186) Add quality option for compressed images to pdfbox-app

2018-04-07 Thread Martin Hausner (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martin Hausner updated PDFBOX-4186:
---
Affects Version/s: (was: 3.0.0 JBIG2)
   3.0.0 PDFBox

> Add quality option for compressed images to pdfbox-app
> --
>
> Key: PDFBOX-4186
> URL: https://issues.apache.org/jira/browse/PDFBOX-4186
> Project: PDFBox
>  Issue Type: Improvement
>Affects Versions: 2.0.9, 3.0.0 PDFBox
>Reporter: Martin Hausner
>Priority: Major
> Attachments: pdfbox-tool.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Add a command-line *quality* option for compressed images to pdfbox-app
> ex: -quality 0.75
>  see [^pdfbox-tool.patch]
>  






[jira] [Updated] (PDFBOX-4186) Add quality option for compressed images to pdfbox-app

2018-04-07 Thread Martin Hausner (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martin Hausner updated PDFBOX-4186:
---
Attachment: pdfbox-tool.patch

> Add quality option for compressed images to pdfbox-app
> --
>
> Key: PDFBOX-4186
> URL: https://issues.apache.org/jira/browse/PDFBOX-4186
> Project: PDFBox
>  Issue Type: Improvement
>Affects Versions: 2.0.9, 3.0.0 JBIG2
>Reporter: Martin Hausner
>Priority: Major
> Attachments: pdfbox-tool.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Add a command-line *quality* option for compressed images to pdfbox-app
> ex: -quality 0.75
>  
>  






[jira] [Updated] (PDFBOX-4186) Add quality option for compressed images to pdfbox-app

2018-04-07 Thread Martin Hausner (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martin Hausner updated PDFBOX-4186:
---
Description: 
Add a command-line *quality* option for compressed images to pdfbox-app

ex: -quality 0.75

 see [^pdfbox-tool.patch]

 

  was:
Add a command-line *quality* option for compressed images to pdfbox-app

ex: -quality 0.75

 

 


> Add quality option for compressed images to pdfbox-app
> --
>
> Key: PDFBOX-4186
> URL: https://issues.apache.org/jira/browse/PDFBOX-4186
> Project: PDFBox
>  Issue Type: Improvement
>Affects Versions: 2.0.9, 3.0.0 JBIG2
>Reporter: Martin Hausner
>Priority: Major
> Attachments: pdfbox-tool.patch
>
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Add a command-line *quality* option for compressed images to pdfbox-app
> ex: -quality 0.75
>  see [^pdfbox-tool.patch]
>  






[jira] [Updated] (PDFBOX-4186) Add quality option for compressed images to pdfbox-app

2018-04-07 Thread Martin Hausner (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martin Hausner updated PDFBOX-4186:
---
Docs Text:   (was: .)

> Add quality option for compressed images to pdfbox-app
> --
>
> Key: PDFBOX-4186
> URL: https://issues.apache.org/jira/browse/PDFBOX-4186
> Project: PDFBox
>  Issue Type: Improvement
>Affects Versions: 2.0.9, 3.0.0 JBIG2
>Reporter: Martin Hausner
>Priority: Major
>   Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Add a command-line *quality* option for compressed images to pdfbox-app
> ex: -quality 0.75
>  
>  






[jira] [Created] (PDFBOX-4186) Add quality option for compressed images to pdfbox-app

2018-04-07 Thread Martin Hausner (JIRA)
Martin Hausner created PDFBOX-4186:
--

 Summary: Add quality option for compressed images to pdfbox-app
 Key: PDFBOX-4186
 URL: https://issues.apache.org/jira/browse/PDFBOX-4186
 Project: PDFBox
  Issue Type: Improvement
Affects Versions: 3.0.0 JBIG2, 2.0.9
Reporter: Martin Hausner


Add a command-line *quality* option for compressed images to pdfbox-app

ex: -quality 0.75

 

 






[jira] [Resolved] (PDFBOX-4134) Resolve floating point comparisons flagged by Sonar

2018-04-07 Thread Maruan Sahyoun (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maruan Sahyoun resolved PDFBOX-4134.

Resolution: Fixed

Closing, as there are no further instances of this warning in Sonar. Going forward,
we should watch for new issues of this type after commits.

> Resolve floating point comparisons flagged by Sonar
> ---
>
> Key: PDFBOX-4134
> URL: https://issues.apache.org/jira/browse/PDFBOX-4134
> Project: PDFBox
>  Issue Type: Sub-task
>Affects Versions: 3.0.0 PDFBox
>Reporter: Maruan Sahyoun
>Assignee: Maruan Sahyoun
>Priority: Major
> Fix For: 3.0.0 PDFBox
>
>
> Sonar is flagging floating-point comparisons that are done using comparison
> operators. Using {{Float.compare}}, this can be done in a standard way. In
> addition, some comparisons might be done using a distance between two values.
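
The two approaches mentioned in the description can be sketched as follows. This is a generic Java illustration, not the code from the PDFBox commits; the class and method names and the tolerance used in main are assumptions chosen for the example.

```java
final class FloatComparisonSketch
{
    // Exact comparison via Float.compare. Unlike the == operator, it treats
    // NaN as equal to NaN and distinguishes 0.0f from -0.0f.
    static boolean sameValue(float a, float b)
    {
        return Float.compare(a, b) == 0;
    }

    // Comparison by distance: the values count as equal if they differ by
    // no more than the given tolerance (the tolerance is the caller's choice).
    static boolean nearlyEqual(float a, float b, float epsilon)
    {
        return Math.abs(a - b) <= epsilon;
    }

    public static void main(String[] args)
    {
        float sum = 0f;
        for (int i = 0; i < 10; i++)
        {
            sum += 0.1f; // accumulates rounding error
        }
        System.out.println("sum = " + sum);
        System.out.println("sameValue(sum, 1.0f) = " + sameValue(sum, 1.0f));
        System.out.println("nearlyEqual(sum, 1.0f, 1e-5f) = " + nearlyEqual(sum, 1.0f, 1e-5f));
        System.out.println("sameValue(NaN, NaN) = " + sameValue(Float.NaN, Float.NaN));
    }
}
```

Which of the two fits depends on the call site: Float.compare suits checks against exact sentinel values, while the distance check suits results of arithmetic.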






[jira] [Commented] (PDFBOX-4158) COSDocument and PDFMerger may not close all IO resources if closing of one fails

2018-04-07 Thread Maruan Sahyoun (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429438#comment-16429438
 ] 

Maruan Sahyoun commented on PDFBOX-4158:


[~gary.potagal] did you find the time for further testing? Can we close the 
issue?

> COSDocument and PDFMerger may not close all IO resources if closing of one 
> fails
> 
>
> Key: PDFBOX-4158
> URL: https://issues.apache.org/jira/browse/PDFBOX-4158
> Project: PDFBox
>  Issue Type: Bug
>  Components: PDModel
>Affects Versions: 2.0.4, 2.0.9, 3.0.0 PDFBox
>Reporter: Maruan Sahyoun
>Assignee: Maruan Sahyoun
>Priority: Minor
> Fix For: 2.0.10, 3.0.0 PDFBox
>
> Attachments: BiggestObjectAllocationGraph.png, BiggestObjectList.png, 
> PDFBOX-4158.patch
>
>
> As observed on the users mailing list, {{COSDocument.close}} and
> {{PDFMergerUtility.mergeDocuments}} might not close all IO resources if
> closing one of the resources fails.
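
A common way to avoid this kind of problem is to attempt every close, remember the first failure, and rethrow it only after all resources have been tried. The sketch below shows that general pattern in plain Java; it is not the content of PDFBOX-4158.patch, and the names CloseAllSketch and closeAll are hypothetical.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

final class CloseAllSketch
{
    // Closes every resource in the list, even if some of them fail.
    // The first IOException is rethrown at the end; later ones are
    // attached to it as suppressed exceptions.
    static void closeAll(List<? extends Closeable> resources) throws IOException
    {
        IOException first = null;
        for (Closeable resource : resources)
        {
            try
            {
                resource.close();
            }
            catch (IOException e)
            {
                if (first == null)
                {
                    first = e;
                }
                else
                {
                    first.addSuppressed(e);
                }
            }
        }
        if (first != null)
        {
            throw first;
        }
    }

    public static void main(String[] args)
    {
        List<Closeable> resources = new ArrayList<>();
        resources.add(() -> System.out.println("closed A"));
        resources.add(() -> { throw new IOException("closing B failed"); });
        resources.add(() -> System.out.println("closed C"));
        try
        {
            closeAll(resources);
        }
        catch (IOException e)
        {
            System.out.println("reported after all closes were attempted: " + e.getMessage());
        }
    }
}
```

With a plain loop that lets the exception propagate, resource C in the example would never be closed.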






[jira] [Resolved] (PDFBOX-4172) Flatten fails on first form element only

2018-04-07 Thread Maruan Sahyoun (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-4172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maruan Sahyoun resolved PDFBOX-4172.

Resolution: Fixed

I'm closing this given the tests and the feedback in PDFBOX-4157. 

[~mbr] You can reopen if the issue persists.

> Flatten fails on first form element only
> 
>
> Key: PDFBOX-4172
> URL: https://issues.apache.org/jira/browse/PDFBOX-4172
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm
>Affects Versions: 2.0.9
>Reporter: michael-...@fami-braun.de
>Assignee: Maruan Sahyoun
>Priority: Major
>  Labels: flatten
> Fix For: 2.0.10, 3.0.0 PDFBox
>
> Attachments: example-filled-2.0.9.pdf, example-filled-fixed.pdf, 
> example.java, example.pdf
>
>
> I've created a PDF form using LibreOffice 5. For this document, the first
> form element refuses to turn up filled when filling + flattening using PDFBox
> 2.0.9 as well as trunk (512d016ad08a70dfb512f99d54092f8b586e8345).
> It turns out that resolveNeedsTranslation does not encounter any
> PDFormXObject for the first form element of this PDF form but still
> returns false, although translation is still needed.
> I've created a patch in 
> [https://github.com/michael-dev/pdfbox/tree/bugfix/flattenCorrectly] .
> I used evince 3.18.2 on ubuntu xenial as pdf viewer. Please see attached 
> example pdf form and the different results using pdfbox 2.0.9 and with the 
> above patch applied. The code used here is in example.java.






[jira] [Commented] (PDFBOX-4182) Improve memory usage of PDFMergerUtility

2018-04-07 Thread Maruan Sahyoun (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429434#comment-16429434
 ] 

Maruan Sahyoun commented on PDFBOX-4182:


[~tilman] I couldn't find the issue you mentioned. Would you mind checking
whether you are able to find it?

> Improve memory usage of PDFMergerUtility
> 
>
> Key: PDFBOX-4182
> URL: https://issues.apache.org/jira/browse/PDFBOX-4182
> Project: PDFBox
>  Issue Type: Improvement
>Affects Versions: 2.0.9
>Reporter: Pas Filip
>Priority: Major
> Attachments: PDFMergerUtilityUsingSupplier.java, Supplier.java, 
> Suppliers.java, 
> failed-merge-utility-4gb-heap-out-of-memory-after-1800-pdfs.png, 
> merge-pdf-stats.xlsx, oom-2gb-heap-after-refactoring-leak-suspect-1.png, 
> oom-2gb-heap-after-refactoring-leak-suspect-2.png, successful - 
> refactored-merge-utility-4gb-heap-2618-files-merged.png, successful 
> -merge-utility-6gb-heap-2618-files-merged.png, 
> successful-merge-utility-6gb-heap-2618-files-merged-setupTempFileOnly.png, 
> successful-merge-utility-8gb-heap-2618-files-merged.png, 
> successful-refactored-merge-utility-4gb-heap-2618-files-merged-setupTempFileOnly.png
>
>
> I have been running some tests trying to merge large amounts (2618) of small 
> pdf documents, between 100kb and 130kb, into a single large pdf (288.433kb)
> Memory consumption seems to be the main limitation.
> ScratchFileBuffer seems to consume the majority of the memory usage.
> (see screenshot from mat in attachment)
> (I would include the hprof in attachment so you can analyze yourselves but 
> it's rather large)
> Note that it seems impossible to generate a large pdf using a small memory 
> footprint.
> I personally thought that using MemorySettings with temporary file only would 
> allow me to generate arbitrarily large pdf files but it doesn't seem to help.
> I've run the mergeDocuments with  memory settings:
>  * MemoryUsageSetting.setupMixed(1024L * 1024L, 1024L * 1024L * 1024L * 1024L 
> * 1024L)
>  * MemoryUsageSetting.setupTempFileOnly()
> Refactored version completes with *4GB* heap:
> with temp file only completes 2618 documents in 1.760 min
> *VS*
> *8GB* heap:
> with temp file only completes 2618 documents in 2.0 min
> Heaps of 6gb or less result in OOM. (Didn't try different sizes between 6GB 
> and 8GB)
>  It looks like the loop in the mergeDocuments accumulates PDDocument objects 
> in a list which are closed after the merge is completed.
> Refactoring the code to close these as they are used, instead of accumulating
> them and closing all at the end, improves memory usage considerably (although,
> based on the MAT analysis, the problem doesn't seem to be eliminated completely).
> Another change I've implemented is to only create the inputstream when the 
> file needs to be read and to close it alongside the PDDocument.
> (Some inputstreams contain buffers and depending on the size of the buffers 
> and or the stream type accumulating all the streams is a potential 
> memory-hog.)
> These changes seem to be beneficial in the sense that I can
> process the same number of PDFs with about half the memory.
>  I'd appreciate it if you could roll these changes into the main codebase.
> (I've respected java 6 compatibility.)
> I've included in attachment the java files of the new implementation:
>  * Suppliers
>  * Supplier
>  * PDFMergerUtilityUsingSupplier
> PDFMergerUtilityUsingSupplier can replace the previous version. There are no signature 
> changes, only internal code changes. (Just rename the class to 
> PDFMergerUtility if you decide to implement the changes.)
>  In attachment you can also find some screenshots from visualvm showing the 
> memory usage of the original version and the refactored version as well as 
> some info produced by mat after analysing the heap.
> If you know of any other means, without running into memory issues, to merge 
> large sets of pdf files into a large single pdf I'd love to hear about it!
> I'd also suggest that further improvements be made to memory 
> usage in general, as pdfbox seems to consume a lot of memory.
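
The close-as-you-go idea described above can be sketched without PDFBox. This is an illustration only: `FakeDocument`, `sources` and `mergeClosingEagerly` are hypothetical names, and the `Supplier` interface here is a guess at the shape of the attached one, not its actual code. What it tries to show is that opening each input lazily and closing it before the next one is opened keeps at most one input open at a time.

```java
import java.io.Closeable;
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: none of these types are PDFBox classes. */
final class CloseAsYouGoSketch {

    /** Hypothetical stand-in for the attached Supplier: creates a source on demand. */
    interface Supplier<T> {
        T get();
    }

    /** Hypothetical stand-in for a loaded document; counts how many are open. */
    static final class FakeDocument implements Closeable {
        static int open = 0;
        static int peakOpen = 0;
        final int pages;

        FakeDocument(int pages) {
            this.pages = pages;
            open++;
            peakOpen = Math.max(peakOpen, open);
        }

        @Override
        public void close() {
            open--;
        }
    }

    /** Opens each source only when it is needed and closes it before the next one. */
    static int mergeClosingEagerly(List<Supplier<FakeDocument>> sources) {
        int totalPages = 0;
        for (Supplier<FakeDocument> source : sources) {
            FakeDocument doc = source.get();
            try {
                totalPages += doc.pages; // stands in for appending the document
            } finally {
                doc.close();
            }
        }
        return totalPages;
    }

    /** Builds n lazy sources of one page each (Java 6 style, no lambdas). */
    static List<Supplier<FakeDocument>> sources(int n) {
        List<Supplier<FakeDocument>> list = new ArrayList<Supplier<FakeDocument>>();
        for (int i = 0; i < n; i++) {
            list.add(new Supplier<FakeDocument>() {
                @Override
                public FakeDocument get() {
                    return new FakeDocument(1);
                }
            });
        }
        return list;
    }
}
```

Whether PDFMergerUtility can really release each source this early depends on how the merged pages reference the source document, which the report above does not settle.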



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[jira] [Comment Edited] (PDFBOX-4184) [PATCH]: Support simple lossless compression of 16 bit RGB images

2018-04-07 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429420#comment-16429420
 ] 

Tilman Hausherr edited comment on PDFBOX-4184 at 4/7/18 3:32 PM:
-

Thanks... I'll commit this within the next few days... I managed to create such 
an Image (IrfanView says it has "64 BitsPerPixel") so we can also have a local 
test but I didn't manage to have a failure, i.e. a bad PDF like with [the image 
from your 
issue|https://user-images.githubusercontent.com/29379074/36145630-f304cd0e-10d7-11e8-942c-66eb8040be70.png]:
{code}
ColorModel colorModel = new ComponentColorModel(ColorSpace.getInstance(ColorSpace.CS_LINEAR_RGB),
        true, false, Transparency.TRANSLUCENT, DataBuffer.TYPE_USHORT);
WritableRaster raster = Raster.createInterleavedRaster(DataBuffer.TYPE_USHORT, 256, 256, 4, null);
BufferedImage image = new BufferedImage(colorModel, raster, false, null);
for (int x = 0; x < image.getWidth(); ++x)
{
    for (int y = 0; y < image.getHeight(); ++y)
    {
        if (x == y)
        {
            switch (x % 4)
            {
                case 0:
                    image.setRGB(x, y, 0x);
                    break;
                case 1:
                    image.setRGB(x, y, 0xFF00FF00);
                    break;
                case 2:
                    image.setRGB(x, y, 0xFFFF);
                    break;
                case 3:
                    image.setRGB(x, y, 0x);
                    break;
            }
        }
    }
}

PDDocument doc = new PDDocument();
PDPage page = new PDPage();
doc.addPage(page);
try (PDPageContentStream cs = new PDPageContentStream(doc, page))
{
    cs.drawImage(LosslessFactory.createFromImage(doc, image), 0f,
            page.getMediaBox().getHeight() - image.getHeight());
}
{code}
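
As a quick check of the image built in the snippet above, the following sketch (plain java.awt, no PDFBox) constructs the same kind of image and prints its pixel layout. The class and method names are made up for illustration. Four components of 16 bits each would account for the "64 BitsPerPixel" reported by IrfanView.

```java
import java.awt.Transparency;
import java.awt.color.ColorSpace;
import java.awt.image.BufferedImage;
import java.awt.image.ColorModel;
import java.awt.image.ComponentColorModel;
import java.awt.image.DataBuffer;
import java.awt.image.Raster;
import java.awt.image.WritableRaster;

/** Illustrative helper; not part of PDFBox or its tests. */
final class SixteenBitImageSketch {

    /** Builds an RGBA image with 16 bits per component, as in the comment above. */
    static BufferedImage create(int width, int height) {
        ColorModel colorModel = new ComponentColorModel(
                ColorSpace.getInstance(ColorSpace.CS_LINEAR_RGB),
                true, false, Transparency.TRANSLUCENT, DataBuffer.TYPE_USHORT);
        WritableRaster raster = Raster.createInterleavedRaster(
                DataBuffer.TYPE_USHORT, width, height, 4, null);
        return new BufferedImage(colorModel, raster, false, null);
    }

    public static void main(String[] args) {
        BufferedImage image = create(256, 256);
        ColorModel cm = image.getColorModel();
        System.out.println("bits per pixel: " + cm.getPixelSize());
        System.out.println("components:     " + cm.getNumComponents());
        System.out.println("TYPE_CUSTOM:    " + (image.getType() == BufferedImage.TYPE_CUSTOM));
    }
}
```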





> [PATCH]: Support simple lossless compression of 16 bit RGB images
> -
>
> Key: PDFBOX-4184
> URL: https://issues.apache.org/jira/browse/PDFBOX-4184
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Writing
>Affects Versions: 2.0.9
>Reporter: Emmeran Seehuber
>Priority: Minor
> Fix For: 2.0.10, 3.0.0 PDFBox
>
> Attachments: pdfbox_support_16bit_image_write.patch
>
>
> The attached patch adds support for writing 16 bit per component images 
> correctly. I've integrated a test for this here: 
> [https://github.com/rototor/pdfbox-graphics2d/commit/8bf089cb74945bd4f0f15054754f51dd5b361fe9]
> It only supports 16-Bit TYPE_CUSTOM with DataType == USHORT images - but this 
> is what you usually get when you read a 16 bit PNG file.
> This would also fix 

[jira] [Comment Edited] (PDFBOX-4184) [PATCH]: Support simple lossless compression of 16 bit RGB images

2018-04-07 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429420#comment-16429420
 ] 

Tilman Hausherr edited comment on PDFBOX-4184 at 4/7/18 3:32 PM:
-

Thanks... I'll commit this within the next few days... I managed to create such 
an Image (IrfanView says it has "64 BitsPerPixel") so we can also have a local 
test but I didn't manage to have a failure, i.e. a bad PDF like with [the image 
from the github 
issue|https://user-images.githubusercontent.com/29379074/36145630-f304cd0e-10d7-11e8-942c-66eb8040be70.png]:
{code}
ColorModel colorModel = new ComponentColorModel(ColorSpace.getInstance(ColorSpace.CS_LINEAR_RGB),
        true, false, Transparency.TRANSLUCENT, DataBuffer.TYPE_USHORT);
WritableRaster raster = Raster.createInterleavedRaster(DataBuffer.TYPE_USHORT, 256, 256, 4, null);
BufferedImage image = new BufferedImage(colorModel, raster, false, null);
for (int x = 0; x < image.getWidth(); ++x)
{
    for (int y = 0; y < image.getHeight(); ++y)
    {
        if (x == y)
        {
            switch (x % 4)
            {
                case 0:
                    image.setRGB(x, y, 0x);
                    break;
                case 1:
                    image.setRGB(x, y, 0xFF00FF00);
                    break;
                case 2:
                    image.setRGB(x, y, 0xFFFF);
                    break;
                case 3:
                    image.setRGB(x, y, 0x);
                    break;
            }
        }
    }
}

PDDocument doc = new PDDocument();
PDPage page = new PDPage();
doc.addPage(page);
try (PDPageContentStream cs = new PDPageContentStream(doc, page))
{
    cs.drawImage(LosslessFactory.createFromImage(doc, image), 0f,
            page.getMediaBox().getHeight() - image.getHeight());
}
{code}






[jira] [Comment Edited] (PDFBOX-4184) [PATCH]: Support simple lossless compression of 16 bit RGB images

2018-04-07 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429420#comment-16429420
 ] 

Tilman Hausherr edited comment on PDFBOX-4184 at 4/7/18 3:31 PM:
-

Thanks... I'll commit this within the next few days... I managed to create such 
an Image (IrfanView says it has "64 BitsPerPixel") so we can also have a local 
test but I didn't manage to have a failure, i.e. a bad PDF like with the image 
from your issue:
{code}
ColorModel colorModel = new ComponentColorModel(ColorSpace.getInstance(ColorSpace.CS_LINEAR_RGB),
        true, false, Transparency.TRANSLUCENT, DataBuffer.TYPE_USHORT);
WritableRaster raster = Raster.createInterleavedRaster(DataBuffer.TYPE_USHORT, 256, 256, 4, null);
BufferedImage image = new BufferedImage(colorModel, raster, false, null);
for (int x = 0; x < image.getWidth(); ++x)
{
    for (int y = 0; y < image.getHeight(); ++y)
    {
        if (x == y)
        {
            switch (x % 4)
            {
                case 0:
                    image.setRGB(x, y, 0x);
                    break;
                case 1:
                    image.setRGB(x, y, 0xFF00FF00);
                    break;
                case 2:
                    image.setRGB(x, y, 0xFFFF);
                    break;
                case 3:
                    image.setRGB(x, y, 0x);
                    break;
            }
        }
    }
}

PDDocument doc = new PDDocument();
PDPage page = new PDPage();
doc.addPage(page);
try (PDPageContentStream cs = new PDPageContentStream(doc, page))
{
    cs.drawImage(LosslessFactory.createFromImage(doc, image), 0f,
            page.getMediaBox().getHeight() - image.getHeight());
}
{code}






[jira] [Commented] (PDFBOX-4184) [PATCH]: Support simple lossless compression of 16 bit RGB images

2018-04-07 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429420#comment-16429420
 ] 

Tilman Hausherr commented on PDFBOX-4184:
-

Thanks... I'll commit this within the next few days... I managed to create such 
an image so we can also have a local test but I didn't manage to have a 
failure, i.e. a bad PDF like with the image from your issue:
{code}
ColorModel colorModel = new ComponentColorModel(ColorSpace.getInstance(ColorSpace.CS_LINEAR_RGB),
        true, false, Transparency.TRANSLUCENT, DataBuffer.TYPE_USHORT);
WritableRaster raster = Raster.createInterleavedRaster(DataBuffer.TYPE_USHORT, 256, 256, 4, null);
BufferedImage image = new BufferedImage(colorModel, raster, false, null);
for (int x = 0; x < image.getWidth(); ++x)
{
    for (int y = 0; y < image.getHeight(); ++y)
    {
        if (x == y)
        {
            switch (x % 4)
            {
                case 0:
                    image.setRGB(x, y, 0x);
                    break;
                case 1:
                    image.setRGB(x, y, 0xFF00FF00);
                    break;
                case 2:
                    image.setRGB(x, y, 0xFFFF);
                    break;
                case 3:
                    image.setRGB(x, y, 0x);
                    break;
            }
        }
    }
}

PDDocument doc = new PDDocument();
PDPage page = new PDPage();
doc.addPage(page);
try (PDPageContentStream cs = new PDPageContentStream(doc, page))
{
    cs.drawImage(LosslessFactory.createFromImage(doc, image), 0f,
            page.getMediaBox().getHeight() - image.getHeight());
}
{code}


> [PATCH]: Support simple lossless compression of 16 bit RGB images
> -
>
> Key: PDFBOX-4184
> URL: https://issues.apache.org/jira/browse/PDFBOX-4184
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Writing
>Affects Versions: 2.0.9
>Reporter: Emmeran Seehuber
>Priority: Minor
> Fix For: 2.0.10, 3.0.0 PDFBox
>
> Attachments: pdfbox_support_16bit_image_write.patch
>
>
> The attached patch adds support for writing 16 bit per component images 
> correctly. I've integrated a test for this here: 
> [https://github.com/rototor/pdfbox-graphics2d/commit/8bf089cb74945bd4f0f15054754f51dd5b361fe9]
> It only supports 16-Bit TYPE_CUSTOM with DataType == USHORT images - but this 
> is what you usually get when you read a 16 bit PNG file.
> This would also fix [https://github.com/danfickle/openhtmltopdf/issues/173].
> The patch is against 2.0.9, but should apply to 3.0.0 too.
> There is still some room for improvement when writing lossless images, as 
> the images are currently not efficiently encoded. For example, you could use PNG 
> encoding to get better compression (by adding a COSName.DECODE_PARMS with 
> a COSName.PREDICTOR == 15 and encoding the images as PNG). But this is 
> something for a later patch. It would also need another API, as there is a 
> tradeoff between speed and compression ratio.
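
To illustrate the PNG predictor idea mentioned in the description: in PDF, a Flate stream with /Predictor 15 carries rows that are each prefixed with a PNG filter-type byte, and the encoder may pick the filter per row. The sketch below shows only one of those filters (Sub, type 1) and its inverse, with the Deflate step left out. It is not PDFBox code; the class and method names are hypothetical.

```java
/** Illustrative PNG "Sub" filter; class and method names are hypothetical. */
final class PngSubFilterSketch {

    /** Applies the PNG Sub filter (type 1) to one scanline of raw sample bytes. */
    static byte[] filterSub(byte[] row, int bytesPerPixel) {
        byte[] out = new byte[row.length + 1];
        out[0] = 1; // filter-type byte that precedes each PNG-filtered row
        for (int i = 0; i < row.length; i++) {
            // byte at the same position in the previous pixel, or 0 for the first pixel
            int left = i >= bytesPerPixel ? row[i - bytesPerPixel] & 0xFF : 0;
            out[i + 1] = (byte) ((row[i] & 0xFF) - left); // difference modulo 256
        }
        return out;
    }

    /** Reverses filterSub and restores the original scanline. */
    static byte[] unfilterSub(byte[] filtered, int bytesPerPixel) {
        byte[] row = new byte[filtered.length - 1];
        for (int i = 0; i < row.length; i++) {
            int left = i >= bytesPerPixel ? row[i - bytesPerPixel] & 0xFF : 0;
            row[i] = (byte) ((filtered[i + 1] & 0xFF) + left);
        }
        return row;
    }
}
```

For 16 bit RGB the bytesPerPixel argument would be 6. Trying several filters per row and keeping the best one costs time, which is the speed versus compression tradeoff the description mentions.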






[jira] [Commented] (PDFBOX-4185) Fetching options for PDChoice causes ClassCastException

2018-04-07 Thread Maruan Sahyoun (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429413#comment-16429413
 ] 

Maruan Sahyoun commented on PDFBOX-4185:


[~matthias.gall] please try with a snapshot which will be available later today 
or build from source and let me know if it works for you.

> Fetching options for PDChoice causes ClassCastException 
> 
>
> Key: PDFBOX-4185
> URL: https://issues.apache.org/jira/browse/PDFBOX-4185
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm
>Affects Versions: 2.0.4, 2.0.9, 3.0.0 PDFBox
>Reporter: Maruan Sahyoun
>Assignee: Maruan Sahyoun
>Priority: Major
> Fix For: 2.0.10, 3.0.0 PDFBox
>
>
> I am trying to fetch the options available for a PDChoice field in a form but 
> get a ClassCastException from the PDFBox internals.
> The problematic PDF is an Inheritance Tax form from the UK's Revenue and 
> Customs, specifically I am currently looking at IHT405:
> https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/697346/IHT405_online.pdf
> I use this code to iterate over the fields:
> {code}
>   PDDocument doc = PDDocument.load(resource.getFile());
>   PDDocumentCatalog catalog = doc.getDocumentCatalog();
>   PDAcroForm form = catalog.getAcroForm();
>   for (PDField field : form.getFields()) {
>       if ("Ch".equals(field.getFieldType())) {
>           PDChoice choice = (PDChoice) field;
>           // All these variants fail with a ClassCastException:
>           choice.getOptions();
>           choice.getOptionsDisplayValues();
>           choice.getOptionsExportValues(); // internally just delegates to getOptions()
>       }
>   }
> {code}
> This is a stacktrace for e.g. the getOptionsExportValues() call:
> {noformat}
>   java.lang.ClassCastException: org.apache.pdfbox.cos.COSArray cannot be cast to org.apache.pdfbox.cos.COSString
>   at org.apache.pdfbox.pdmodel.common.COSArrayList.convertCOSStringCOSArrayToList(COSArrayList.java:367)
>   at org.apache.pdfbox.pdmodel.interactive.form.FieldUtils.getPairableItems(FieldUtils.java:182)
>   at org.apache.pdfbox.pdmodel.interactive.form.PDChoice.getOptions(PDChoice.java:91)
>   at org.apache.pdfbox.pdmodel.interactive.form.PDChoice.getOptionsExportValues(PDChoice.java:210)
> {noformat}
> The problem is that the expected "stringArray" also contains COSArrays with 
> value and label for the options:
> {noformat}
>   COSArray{[COSString{ }, COSArray{[COSString{Mr}, COSString{MR}]}, 
> COSArray{[COSString{Mrs}, COSString{MRS}]}, COSArray{[COSString{Miss}, 
> COSString{MISS}]}, COSArray{[COSString{Ms}, COSString{MS}]}]}
> {noformat}
> This does not seem to be expected in FieldUtils.getPairableItems, which 
> introspects only the first item of the array and thus treats the array as an 
> array of strings.
> I found the bug with PDFBox 2.0.4 and upgraded to 2.0.9 which didn't help.
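
The failure mode described above, deciding the shape of the whole array from its first entry, can be avoided by checking each entry on its own. The sketch below models this with plain Java types (a String for a COSString, a two-element List for a COSArray of value and label). It is an illustration under those assumptions, not the change committed for this issue, and all names in it are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Models a mixed options array with plain Java types; not PDFBox code. */
final class MixedOptionsSketch {

    /** Returns the value of every entry, whether it is a string or a pair. */
    static List<String> exportValues(List<?> options) {
        return pick(options, 0);
    }

    /** Returns the label of every entry; a plain string is its own label. */
    static List<String> displayValues(List<?> options) {
        return pick(options, 1);
    }

    private static List<String> pick(List<?> options, int index) {
        List<String> result = new ArrayList<String>();
        for (Object entry : options) {
            if (entry instanceof String) {
                // single string: serves as both value and label
                result.add((String) entry);
            } else if (entry instanceof List) {
                // pair: index 0 is the value, index 1 the label
                result.add((String) ((List<?>) entry).get(index));
            }
            // entries of any other type are skipped in this sketch
        }
        return result;
    }

    public static void main(String[] args) {
        List<Object> options = new ArrayList<Object>();
        options.add(" ");
        options.add(Arrays.asList("Mr", "MR"));
        options.add(Arrays.asList("Mrs", "MRS"));
        System.out.println(exportValues(options));
        System.out.println(displayValues(options));
    }
}
```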






[jira] [Commented] (PDFBOX-4185) Fetching options for PDChoice causes ClassCastException

2018-04-07 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429410#comment-16429410
 ] 

ASF subversion and git services commented on PDFBOX-4185:
-

Commit 1828596 from [~msahyoun] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1828596 ]

PDFBOX-4185: support COSString, COSArray mixed options entries







[jira] [Commented] (PDFBOX-4185) Fetching options for PDChoice causes ClassCastException

2018-04-07 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429408#comment-16429408
 ] 

ASF subversion and git services commented on PDFBOX-4185:
-

Commit 1828595 from [~msahyoun] in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1828595 ]

PDFBOX-4185: support COSString, COSArray mixed options entries







[jira] [Assigned] (PDFBOX-4185) Fetching options for PDChoice causes ClassCastException

2018-04-07 Thread Maruan Sahyoun (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maruan Sahyoun reassigned PDFBOX-4185:
--

Assignee: Maruan Sahyoun







[jira] [Updated] (PDFBOX-4185) Fetching options for PDChoice causes ClassCastException

2018-04-07 Thread Maruan Sahyoun (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-4185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maruan Sahyoun updated PDFBOX-4185:
---
Fix Version/s: 3.0.0 PDFBox
   2.0.10







[jira] [Updated] (PDFBOX-4184) [PATCH]: Support simple lossless compression of 16 bit RGB images

2018-04-07 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-4184?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-4184:

Fix Version/s: 3.0.0 PDFBox
   2.0.10

> [PATCH]: Support simple lossless compression of 16 bit RGB images
> -
>
> Key: PDFBOX-4184
> URL: https://issues.apache.org/jira/browse/PDFBOX-4184
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Writing
>Affects Versions: 2.0.9
>Reporter: Emmeran Seehuber
>Priority: Minor
> Fix For: 2.0.10, 3.0.0 PDFBox
>
> Attachments: pdfbox_support_16bit_image_write.patch
>
>
> The attached patch add support to write 16 bit per component images 
> correctly. I've integrated a test for this here: 
> [https://github.com/rototor/pdfbox-graphics2d/commit/8bf089cb74945bd4f0f15054754f51dd5b361fe9]
> It only supports 16-Bit TYPE_CUSTOM with DataType == USHORT images - but this 
> is what you usually get when you read a 16 bit PNG file.
> This would also fix [https://github.com/danfickle/openhtmltopdf/issues/173].
> The patch is against 2.0.9, but should apply to 3.0.0 too.
> There is still room for improvement when writing lossless images, as the 
> images are currently not encoded efficiently. For example, PNG encoding 
> could give better compression (by adding a COSName.DECODE_PARMS with a 
> COSName.PREDICTOR == 15 and encoding the images as PNG), but this is 
> something for a later patch. It would also need another API, as there is a 
> trade-off between speed and compression ratio.






[jira] [Created] (PDFBOX-4185) Fetching options for PDChoice causes ClassCastException

2018-04-07 Thread Maruan Sahyoun (JIRA)
Maruan Sahyoun created PDFBOX-4185:
--

 Summary: Fetching options for PDChoice causes ClassCastException 
 Key: PDFBOX-4185
 URL: https://issues.apache.org/jira/browse/PDFBOX-4185
 Project: PDFBox
  Issue Type: Bug
  Components: AcroForm
Affects Versions: 2.0.9, 2.0.4, 3.0.0 PDFBox
Reporter: Maruan Sahyoun


I am trying to fetch the options available for a PDChoice field in a form but 
get a ClassCastException from the PDFBox internals.

The problematic PDF is an Inheritance Tax form from the UK's Revenue and 
Customs; specifically, I am currently looking at IHT405:

https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/697346/IHT405_online.pdf

I use this code to iterate over the fields:

{code}
PDDocument doc = PDDocument.load(resource.getFile());
PDDocumentCatalog catalog = doc.getDocumentCatalog();
PDAcroForm form = catalog.getAcroForm();
for (PDField field : form.getFields()) {
    if ("Ch".equals(field.getFieldType())) {
        PDChoice choice = (PDChoice) field;
        // All these variants fail with a ClassCastException:
        choice.getOptions();
        choice.getOptionsDisplayValues();
        choice.getOptionsExportValues(); // internally just delegates to getOptions()
    }
}
{code}
This is a stacktrace for e.g. the getOptionsExportValues() call:

{noformat}
java.lang.ClassCastException: org.apache.pdfbox.cos.COSArray cannot be cast to org.apache.pdfbox.cos.COSString
    at org.apache.pdfbox.pdmodel.common.COSArrayList.convertCOSStringCOSArrayToList(COSArrayList.java:367)
    at org.apache.pdfbox.pdmodel.interactive.form.FieldUtils.getPairableItems(FieldUtils.java:182)
    at org.apache.pdfbox.pdmodel.interactive.form.PDChoice.getOptions(PDChoice.java:91)
    at org.apache.pdfbox.pdmodel.interactive.form.PDChoice.getOptionsExportValues(PDChoice.java:210)

{noformat}


The problem is that the expected "stringArray" also contains COSArrays with 
value and label for the options:

{noformat}
COSArray{[COSString{ }, COSArray{[COSString{Mr}, COSString{MR}]}, 
COSArray{[COSString{Mrs}, COSString{MRS}]}, COSArray{[COSString{Miss}, 
COSString{MISS}]}, COSArray{[COSString{Ms}, COSString{MS}]}]}
{noformat}

This does not seem to be expected in FieldUtils.getPairableItems, which 
introspects only the first item of the array and thus treats the array as an 
array of strings.
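
A mixed array like the one shown can be handled by inspecting every entry rather than only the first one. The following is only a minimal sketch of that idea, not the PDFBox fix: to stay runnable without PDFBox, a plain String stands in for COSString and a List stands in for COSArray, and the names MixedOptionsSketch, toPair, exportValues and displayValues are hypothetical. It assumes that in a two-element entry the first string is the export value and the second is the display text.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** PDFBox-independent sketch: String stands in for COSString, List for COSArray. */
class MixedOptionsSketch {

    /** Returns {exportValue, displayValue} for a single option entry. */
    static String[] toPair(Object entry) {
        if (entry instanceof String) {
            // A bare string serves as both export value and display text.
            String s = (String) entry;
            return new String[] { s, s };
        }
        if (entry instanceof List) {
            List<?> pair = (List<?>) entry;
            if (pair.size() >= 2
                    && pair.get(0) instanceof String
                    && pair.get(1) instanceof String) {
                // Assumed layout: [exportValue, displayValue]
                return new String[] { (String) pair.get(0), (String) pair.get(1) };
            }
        }
        throw new IllegalArgumentException("Unsupported option entry: " + entry);
    }

    /** Collects the export values, checking the type of each entry individually. */
    static List<String> exportValues(List<?> options) {
        List<String> result = new ArrayList<>();
        for (Object entry : options) {
            result.add(toPair(entry)[0]);
        }
        return result;
    }

    /** Collects the display values, checking the type of each entry individually. */
    static List<String> displayValues(List<?> options) {
        List<String> result = new ArrayList<>();
        for (Object entry : options) {
            result.add(toPair(entry)[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        // Same shape as the dump above: one bare string, then value/label pairs.
        List<Object> options = Arrays.<Object>asList(
                " ",
                Arrays.asList("Mr", "MR"),
                Arrays.asList("Mrs", "MRS"),
                Arrays.asList("Miss", "MISS"),
                Arrays.asList("Ms", "MS"));
        System.out.println(exportValues(options));
        System.out.println(displayValues(options));
    }
}
```

Because each entry is checked on its own, a leading bare string followed by pairs no longer leads to a cast failure in this sketch.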

I found the bug with PDFBox 2.0.4 and upgraded to 2.0.9, which didn't help.







[jira] [Created] (PDFBOX-4184) [PATCH]: Support simple lossless compression of 16 bit RGB images

2018-04-07 Thread Emmeran Seehuber (JIRA)
Emmeran Seehuber created PDFBOX-4184:


 Summary: [PATCH]: Support simple lossless compression of 16 bit 
RGB images
 Key: PDFBOX-4184
 URL: https://issues.apache.org/jira/browse/PDFBOX-4184
 Project: PDFBox
  Issue Type: Improvement
  Components: Writing
Affects Versions: 2.0.9
Reporter: Emmeran Seehuber
 Attachments: pdfbox_support_16bit_image_write.patch

The attached patch adds support to write 16 bit per component images correctly. 
I've integrated a test for this here: 
[https://github.com/rototor/pdfbox-graphics2d/commit/8bf089cb74945bd4f0f15054754f51dd5b361fe9]

It only supports 16-Bit TYPE_CUSTOM with DataType == USHORT images - but this 
is what you usually get when you read a 16 bit PNG file.
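
To make the image type concrete, here is a small sketch using only the standard java.awt.image classes. It builds a 16 bit per component RGB image in memory and checks the two properties mentioned above. The class and method names (SixteenBitImageSketch, create16BitRgb, isSupported16Bit) are hypothetical, and the check is an illustration of the described condition, not the code from the patch.

```java
import java.awt.Transparency;
import java.awt.color.ColorSpace;
import java.awt.image.BufferedImage;
import java.awt.image.ComponentColorModel;
import java.awt.image.DataBuffer;
import java.awt.image.WritableRaster;

class SixteenBitImageSketch {

    /** Builds an opaque sRGB image with 16 bits per component, backed by unsigned shorts. */
    static BufferedImage create16BitRgb(int width, int height) {
        ColorSpace srgb = ColorSpace.getInstance(ColorSpace.CS_sRGB);
        ComponentColorModel colorModel = new ComponentColorModel(
                srgb, new int[] { 16, 16, 16 },
                false, false, Transparency.OPAQUE, DataBuffer.TYPE_USHORT);
        WritableRaster raster = colorModel.createCompatibleWritableRaster(width, height);
        return new BufferedImage(colorModel, raster, false, null);
    }

    /** Illustrative check: TYPE_CUSTOM image whose data buffer holds unsigned shorts. */
    static boolean isSupported16Bit(BufferedImage image) {
        return image.getType() == BufferedImage.TYPE_CUSTOM
                && image.getRaster().getDataBuffer().getDataType() == DataBuffer.TYPE_USHORT;
    }

    public static void main(String[] args) {
        BufferedImage sixteenBit = create16BitRgb(4, 4);
        BufferedImage eightBit = new BufferedImage(4, 4, BufferedImage.TYPE_INT_RGB);
        System.out.println("16 bit image matches: " + isSupported16Bit(sixteenBit));
        System.out.println("8 bit image matches: " + isSupported16Bit(eightBit));
    }
}
```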

This would also fix [https://github.com/danfickle/openhtmltopdf/issues/173].

The patch is against 2.0.9, but should apply to 3.0.0 too.

There is still room for improvement when writing lossless images, as the 
images are currently not encoded efficiently. For example, PNG encoding could 
give better compression (by adding a COSName.DECODE_PARMS with a 
COSName.PREDICTOR == 15 and encoding the images as PNG), but this is something 
for a later patch. It would also need another API, as there is a trade-off 
between speed and compression ratio.
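
As a rough illustration of what PNG-style prediction does to the data before it is Flate-compressed, here is a sketch of PNG filter type 1 (Sub) for a single row. With the PNG predictors each row carries a leading filter-type byte, and an encoder may pick a different filter per row; this sketch always uses Sub for simplicity. The names PngSubFilterSketch, filterSub and unfilterSub are hypothetical and not part of the patch or of PDFBox.

```java
import java.util.Arrays;

class PngSubFilterSketch {

    /** Applies the PNG Sub filter to one row; the result starts with the filter-type byte 1. */
    static byte[] filterSub(byte[] row, int bytesPerPixel) {
        byte[] out = new byte[row.length + 1];
        out[0] = 1; // PNG filter type 1 = Sub
        for (int i = 0; i < row.length; i++) {
            // Predict each byte from the corresponding byte of the pixel to the left.
            int left = i >= bytesPerPixel ? row[i - bytesPerPixel] & 0xFF : 0;
            out[i + 1] = (byte) ((row[i] & 0xFF) - left);
        }
        return out;
    }

    /** Reverses filterSub; expects the leading filter-type byte to be present. */
    static byte[] unfilterSub(byte[] filtered, int bytesPerPixel) {
        byte[] row = new byte[filtered.length - 1];
        for (int i = 0; i < row.length; i++) {
            int left = i >= bytesPerPixel ? row[i - bytesPerPixel] & 0xFF : 0;
            row[i] = (byte) ((filtered[i + 1] & 0xFF) + left);
        }
        return row;
    }

    public static void main(String[] args) {
        int bytesPerPixel = 6; // 16 bit RGB: 3 components x 2 bytes
        // Two identical pixels: the second one turns into zeros after filtering.
        byte[] row = { 0x12, 0x34, 0x56, 0x78, (byte) 0x9A, (byte) 0xBC,
                       0x12, 0x34, 0x56, 0x78, (byte) 0x9A, (byte) 0xBC };
        byte[] filtered = filterSub(row, bytesPerPixel);
        System.out.println(Arrays.toString(filtered));
        System.out.println(Arrays.equals(row, unfilterSub(filtered, bytesPerPixel)));
    }
}
```

Rows with smooth content produce long runs of small values after filtering, which is where the better compression would come from. The extra pass per row is part of the speed cost mentioned above.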


