[jira] [Commented] (PDFBOX-3131) Reduce amount of intermediate data and objects to reduce memory footprint/complexity

2015-12-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034486#comment-15034486
 ] 

ASF subversion and git services commented on PDFBOX-3131:
-

Commit 1717510 from [~jahewson] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1717510 ]

PDFBOX-3131: Expose CFF font data without storing it in memory

> Reduce amount of intermediate data and objects to reduce memory 
> footprint/complexity
> 
>
> Key: PDFBOX-3131
> URL: https://issues.apache.org/jira/browse/PDFBOX-3131
> Project: PDFBox
> Issue Type: Improvement
> Components: FontBox
> Affects Versions: 2.0.0
> Reporter: Andreas Lehmkühler
> Assignee: Andreas Lehmkühler
> Fix For: 2.0.0
>
>
> The CFFParser holds a lot of intermediate data and produces a lot of objects 
> to do so. The idea is to reduce the number of such objects and the amount of 
> data, to reduce the memory footprint and the complexity.
> - the class IndexData holds intermediate data and creates a byte array every 
> time getBytes is called. I'm going to replace the class with a simple list to 
> reduce the memory footprint and the complexity
> - remove unused members of private classes
> - create a list of strings instead of a list of byte arrays which is used to 
> create those strings
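
For illustration, the last point of the description (a list of strings instead of a list of byte arrays) could look roughly like the sketch below. It is not the FontBox code: the class name CffStringIndexSketch and the method readStringIndex are made up for this example, and it assumes the standard CFF INDEX layout (a 2-byte count, a 1-byte offset size, count+1 offsets, then the data) with Latin-1 string bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the FontBox implementation.
final class CffStringIndexSketch {

    // Reads a CFF INDEX whose entries are strings and returns the strings
    // directly, so no holder object full of byte arrays is kept around.
    static List<String> readStringIndex(ByteBuffer in) {
        int count = in.getShort() & 0xFFFF;      // Card16: number of entries
        List<String> strings = new ArrayList<>(count);
        if (count == 0) {
            return strings;                      // an empty INDEX is only the count
        }
        int offSize = in.get() & 0xFF;           // Card8: bytes per offset
        int[] offsets = new int[count + 1];
        for (int i = 0; i <= count; i++) {
            int value = 0;
            for (int b = 0; b < offSize; b++) {
                value = (value << 8) | (in.get() & 0xFF);
            }
            offsets[i] = value;
        }
        for (int i = 0; i < count; i++) {
            // The temporary array lives only until the string is created.
            byte[] raw = new byte[offsets[i + 1] - offsets[i]];
            in.get(raw);
            strings.add(new String(raw, StandardCharsets.ISO_8859_1));
        }
        return strings;
    }

    public static void main(String[] args) {
        // Two entries ("ab" and "cde"), 1-byte offsets 1, 3, 6.
        byte[] index = {0, 2, 1, 1, 3, 6, 'a', 'b', 'c', 'd', 'e'};
        System.out.println(readStringIndex(ByteBuffer.wrap(index)));
    }
}
```

Each entry still needs a short-lived byte array while it is decoded, but nothing retains those arrays once the strings exist.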



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[jira] [Comment Edited] (PDFBOX-3062) Text extraction and height different in 2.0

2015-12-01 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034507#comment-15034507
 ] 

John Hewson edited comment on PDFBOX-3062 at 12/1/15 8:22 PM:
--

BBox + CapHeight isn't reliable either. Either we should be using a reliable 
metric, or using some logical metric, but combining two metrics doesn't make 
sense. The BBox is fine, it's just not what you want it to be, i.e. a 
meaningful proxy for a glyph's visual bounds.


was (Author: jahewson):
BBox + CapHeight isn't reliable either. Either we should be using a reliable 
metric, or using some logical metric, but combining two metrics doesn't make 
sense.

> Text extraction and height different in 2.0
> ---
>
> Key: PDFBOX-3062
> URL: https://issues.apache.org/jira/browse/PDFBOX-3062
> Project: PDFBox
> Issue Type: Sub-task
> Components: Text extraction
> Affects Versions: 2.0.0
> Reporter: Tilman Hausherr
> Assignee: Tilman Hausherr
> Fix For: 2.0.0
>
> Attachments: 005021-reduced.pdf, 
> PDFBOX-3062-H6NIYQXHLPGD3GI6SNIYINRAZBCDHUCB-reduced-marked-1.png, 
> PDFBOX-3062-H6NIYQXHLPGD3GI6SNIYINRAZBCDHUCB-reduced.pdf, 
> PDFBOX-3062-H6NIYQXHLPGD3GI6SNIYINRAZBCDHUCB.pdf, 
> PDFBOX-3062-N2MOQ7YZICIYGTPLQJAWJ4HLN6CCEMHZ-reduced.pdf, garbled text 2.pdf
>
>
> AR:
> {code}
> WITH THE increasing complexity of optical modules,
> {code}
> 1.8:
> {code}
> WITH THE increasing complexity of optical modules,
> String[39.6,399.6 fs=1.0 xscale=29.888 height=20.114626 space=7.472 
> width=28.214272]W
> String[69.488,386.16 fs=1.0 xscale=9.963 height=6.5955067 space=2.49075 
> width=3.3176804]I
> String[72.80568,386.16 fs=1.0 xscale=9.963 height=6.5955067 space=2.49075 
> width=6.0873947]T
> String[78.893074,386.16 fs=1.0 xscale=9.963 height=6.5955067 space=2.49075 
> width=7.1932907]H
> String[90.71916,386.16 fs=1.0 xscale=9.963 height=6.5955067 space=2.49075 
> width=6.0873947]T
> String[96.80656,386.16 fs=1.0 xscale=9.963 height=6.5955067 space=2.49075 
> width=7.1932907]H
> {code}
> 2.0:
> {code}
> W
> ITH THE increasing complexity of optical modules,
> String[39.6,399.6 fs=1.0 xscale=29.888 height=9.584274 space=7.472 
> width=28.209717]W
> String[69.488,386.16 fs=1.0 xscale=9.963 height=3.194865 space=2.49075 
> width=3.3177567]I
> String[72.805756,386.16 fs=1.0 xscale=9.963 height=3.194865 space=2.49075 
> width=6.0858]T
> String[78.891556,386.16 fs=1.0 xscale=9.963 height=3.194865 space=2.49075 
> width=7.1949615]H
> String[90.719315,386.16 fs=1.0 xscale=9.963 height=3.194865 space=2.49075 
> width=6.0858]T
> String[96.805115,386.16 fs=1.0 xscale=9.963 height=3.194865 space=2.49075 
> width=7.1949615]H
> {code}






[jira] [Commented] (PDFBOX-3138) PDTextField doesn't accept any Hebrew characters as new value

2015-12-01 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034535#comment-15034535
 ] 

John Hewson commented on PDFBOX-3138:
-

If you're creating the PDFs with Acrobat then change the font embedding 
settings there.

> PDTextField doesn't accept any Hebrew characters as new value
> -
>
> Key: PDFBOX-3138
> URL: https://issues.apache.org/jira/browse/PDFBOX-3138
> Project: PDFBox
> Issue Type: Bug
> Components: AcroForm, FontBox
> Affects Versions: 2.0.0
> Environment: Eclipse 4.2.2, Windows 7 Pro, JRE 1.8.0_05
> Reporter: Gilad Denneboom
> Fix For: 2.1.0
>
> Attachments: SetHebrewFieldValueTest.java, Test-3-filled.pdf, 
> Test.pdf, Test.txt
>
>
> Trying to set a UTF-8 encoded Hebrew string as the value of a PDTextField 
> fails with the following exception:
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: No glyph for 
> U+05D7 in font AdobeHebrew-Regular
>   at 
> org.apache.pdfbox.pdmodel.font.PDType1CFont.encode(PDType1CFont.java:300)
>   at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:283)
>   at 
> org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:341)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PlainTextFormatter.format(PlainTextFormatter.java:213)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.insertGeneratedAppearance(AppearanceGeneratorHelper.java:373)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceContent(AppearanceGeneratorHelper.java:237)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceValue(AppearanceGeneratorHelper.java:144)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTextField.constructAppearances(PDTextField.java:263)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTerminalField.applyChange(PDTerminalField.java:221)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTextField.setValue(PDTextField.java:218)
>   at SetHebrewFieldValueTest.main(SetHebrewFieldValueTest.java:22)
> {code}
> I've tried using multiple fonts for the field, all of which can handle Hebrew 
> characters just fine, and got the same results in all of them.
> See attached files for a demonstration of the issue.






[jira] [Commented] (PDFBOX-3062) Text extraction and height different in 2.0

2015-12-01 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034552#comment-15034552
 ] 

Tilman Hausherr commented on PDFBOX-3062:
-

{quote}
BBox + CapHeight isn't reliable either. 
{quote}
How is it not reliable? Do you know of any files that get fewer tokens extracted?

{quote}
The BBox is fine, it's just not what you want it to be, i.e. a meaningful proxy 
for a glyph's visual bounds. {quote}
That's why CapHeight is used when the BBox isn't helpful.

{quote}
So are we going to solve that problem or not?{quote}
Don't know. The alternatives are:
- use BBox only: some files won't be extracted nicely
- use BBox + CapHeight: more files will be extracted nicely, but the code will 
have 9 extra lines you don't like
- calculate a new BBox from actual glyphs: will make software slower, will 
delay release, may or may not be more reliable. (If a font subset has only 
non-capital glyphs, then half of the "real bbox" would be too small.)
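
To make the second alternative concrete, here is a minimal sketch of what a "BBox + CapHeight" height estimate might look like. The exact rule is an assumption made for illustration (half the font BBox height by default, CapHeight when it is set and either smaller than that or the BBox height is zero); the class and method names are hypothetical and this is not the PDFBox code.

```java
// Hypothetical sketch of a combined BBox/CapHeight heuristic.
final class GlyphHeightSketch {

    // bboxHeight and capHeight are in glyph space units (usually 1/1000 em).
    static float glyphHeight(float bboxHeight, float capHeight) {
        // Default: half of the font bounding box height.
        float height = bboxHeight / 2f;
        // Assumed fallback rule: prefer CapHeight when it is present and the
        // BBox-based value is missing or larger than CapHeight.
        if (capHeight > 0 && (capHeight < height || height == 0)) {
            height = capHeight;
        }
        return height;
    }

    public static void main(String[] args) {
        System.out.println(glyphHeight(1000f, 0f));    // BBox only
        System.out.println(glyphHeight(2000f, 700f));  // oversized BBox, CapHeight wins
        System.out.println(glyphHeight(0f, 700f));     // no usable BBox
    }
}
```

The trade-off discussed in the thread shows up directly: the method mixes two metrics, which is pragmatic for extraction but not a single well-defined measure of a glyph's visual bounds.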




[jira] [Commented] (PDFBOX-3131) Reduce amount of intermediate data and objects to reduce memory footprint/complexity

2015-12-01 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034494#comment-15034494
 ] 

John Hewson commented on PDFBOX-3131:
-

I was using the getData() API for an internal project. I've replaced its 
implementation with a new ByteSource class which allows a deferred byte[] 
source to be passed to CFFParser so that it can re-read the original byte[] 
source on demand, instead of keeping it in memory.

I'd have done something simpler but CFF fonts may be nested inside both TTCs 
and OTFs, so having a getBytes() API is really useful (at least to me).
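
A rough sketch of the deferred-source idea, for readers who have not seen the commit: the parser receives the bytes to parse plus a callback that can produce them again later, and the parsed font keeps only the callback. The names below echo the comment (ByteSource, getBytes, getData), but the signatures and the ParsedFont stand-in are assumptions for this example, not the FontBox API.

```java
import java.util.Arrays;

// Hypothetical sketch; not the FontBox classes.
final class DeferredBytesSketch {

    /** Supplies the original font bytes on demand. */
    interface ByteSource {
        byte[] getBytes();
    }

    /** Stand-in for a parsed font: it keeps the source, not the bytes. */
    static final class ParsedFont {
        private final int length;
        private final ByteSource source;

        ParsedFont(int length, ByteSource source) {
            this.length = length;
            this.source = source;
        }

        int getLength() {
            return length;
        }

        // Re-reads the original data only when a caller asks for it.
        byte[] getData() {
            return source.getBytes();
        }
    }

    static ParsedFont parse(byte[] bytes, ByteSource source) {
        // A real parser would decode 'bytes' here; only derived values are kept.
        return new ParsedFont(bytes.length, source);
    }

    public static void main(String[] args) {
        // Pretend the CFF table sits at offset 2, length 3, inside a container.
        // A real source would re-read from the file instead of a held array.
        byte[] container = {9, 9, 1, 2, 3, 9};
        ByteSource source = () -> Arrays.copyOfRange(container, 2, 5);
        ParsedFont font = parse(source.getBytes(), source);
        System.out.println(Arrays.toString(font.getData()));
    }
}
```

Because the source knows where the CFF data lives, the same approach works whether the font is standalone or nested in a TTC or OTF container.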




[jira] [Commented] (PDFBOX-3062) Text extraction and height different in 2.0

2015-12-01 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034507#comment-15034507
 ] 

John Hewson commented on PDFBOX-3062:
-

BBox + CapHeight isn't reliable either. Either we should be using a reliable 
metric, or using some logical metric, but combining two metrics doesn't make 
sense.




[jira] [Comment Edited] (PDFBOX-3062) Text extraction and height different in 2.0

2015-12-01 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034507#comment-15034507
 ] 

John Hewson edited comment on PDFBOX-3062 at 12/1/15 8:22 PM:
--

BBox + CapHeight isn't reliable either. Either we should be using a reliable 
metric, or using some logical metric, but combining two metrics doesn't make 
sense. The BBox is fine, it's just not what you want it to be, i.e. a 
meaningful proxy for a glyph's visual bounds. So are we going to solve that 
problem or not?


was (Author: jahewson):
BBox + CapHeight isn't reliable either. Either we should be using a reliable 
metric, or using some logical metric, but combining two metrics doesn't make 
sense. The BBox is fine, it's just not what you want it to be, i.e. a 
meaningful proxy for a glyph's visual bounds.




[jira] [Resolved] (PDFBOX-3145) Security manager fails for .pdfbox.cache

2015-12-01 Thread John Hewson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson resolved PDFBOX-3145.
-
Resolution: Fixed

> Security manager fails for .pdfbox.cache
> 
>
> Key: PDFBOX-3145
> URL: https://issues.apache.org/jira/browse/PDFBOX-3145
> Project: PDFBox
> Issue Type: Bug
> Components: Rendering
> Affects Versions: 2.0.0
> Reporter: simon steiner
> Assignee: John Hewson
> Fix For: 2.0.0
>
>
> Caused by: java.security.AccessControlException: access denied 
> ("java.io.FilePermission" "/home/simon/.pdfbox.cache" "read")
>   at 
> java.security.AccessControlContext.checkPermission(AccessControlContext.java:457)
>   at 
> java.security.AccessController.checkPermission(AccessController.java:884)
>   at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
>   at java.lang.SecurityManager.checkRead(SecurityManager.java:888)
>   at java.io.File.exists(File.java:814)
>   at 
> org.apache.pdfbox.pdmodel.font.FileSystemFontProvider.loadDiskCache(FileSystemFontProvider.java:357)






[jira] [Commented] (PDFBOX-3136) False negative on PDF/A-1A with wrongly given causes " Invalid graphics object, DestOutputProfile isn't a valid ICCProfile: Invalid ICC Profile Data" and "Invalid Colo

2015-12-01 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034686#comment-15034686
 ] 

Tilman Hausherr commented on PDFBOX-3136:
-

I also tested with the current veraPDF software (v 0.7.45). It also fails, 
although for an apparently different reason that is still related to output 
intents:
{quote}
DeviceRGB may be used only if the file has a PDF/A-1 OutputIntent that uses an 
RGB colour space
{quote}

As a paying Adobe Acrobat DC customer, I suggest you contact their support and 
ask why the file fails with three different validators and succeeds with theirs.

> False negative on PDF/A-1A with wrongly given causes " Invalid graphics 
> object, DestOutputProfile isn't a valid ICCProfile: Invalid ICC Profile Data" 
> and "Invalid Color space, The operator "rg" can't be used with CMYK Profile"
> --
>
> Key: PDFBOX-3136
> URL: https://issues.apache.org/jira/browse/PDFBOX-3136
> Project: PDFBox
> Issue Type: Bug
> Components: Preflight
> Affects Versions: 2.0.0
> Reporter: Antoine Ribes
> Attachments: test_little-A1a.pdf
>
>
> Using the code from the Cookbook for PDF/A validation (given for 1.8.10):
> - with the test_little-A1a.pdf file (Adobe preflight and pdfbox 1.8.10 both 
> tell me it's a valid PDF/A-1A)
> - and only replacing the code "parser.parse()" with 
> "parser.parse(Format.PDF_A1A)",
> result.isValid() is false with version 2.0.0-RC2. The reported errors 
> are:
> - 2.1.4 - Invalid graphics object, DestOutputProfile isn't a valid 
> ICCProfile: Invalid ICC Profile Data
> - 2.1.4 - Invalid graphics object, DestOutputProfile isn't a valid 
> ICCProfile. Caused by : Invalid ICC Profile Data
> - 2.4.1 - Invalid Color space, The operator "rg" can't be used with CMYK 
> Profile
> The following log output is displayed:
> WARN [org.apache.pdfbox.filter.FlateFilter] - FlateFilter: premature end of 
> stream due to a DataFormatException
> DEBUG [org.apache.pdfbox.io.ScratchFileBuffer] - ScratchFileBuffer not closed!
> WARN [org.apache.pdfbox.filter.FlateFilter] - FlateFilter: premature end of 
> stream due to a DataFormatException
> Note: running the same code with pdfbox and preflight version 2.0.0-RC1 on 
> the same file, I get this exception:
> org.apache.pdfbox.preflight.exception.ValidationException: Unable to parse 
> the ICC Profile.
>   at 
> org.apache.pdfbox.preflight.process.CatalogValidationProcess.validateICCProfile(CatalogValidationProcess.java:383)
>   at 
> org.apache.pdfbox.preflight.process.CatalogValidationProcess.validateOutputIntent(CatalogValidationProcess.java:285)
>   at 
> org.apache.pdfbox.preflight.process.CatalogValidationProcess.validate(CatalogValidationProcess.java:148)
>   at 
> org.apache.pdfbox.preflight.utils.ContextHelper.callValidation(ContextHelper.java:84)
>   at 
> org.apache.pdfbox.preflight.utils.ContextHelper.validateElement(ContextHelper.java:122)
>   at 
> org.apache.pdfbox.preflight.PreflightDocument.validate(PreflightDocument.java:163)
> [...]
> Caused by: java.io.IOException: java.util.zip.DataFormatException: incorrect 
> data check
>   at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:83)
>   at org.apache.pdfbox.cos.COSInputStream.create(COSInputStream.java:69)
>   at org.apache.pdfbox.cos.COSStream.createInputStream(COSStream.java:163)
>   at 
> org.apache.pdfbox.preflight.process.CatalogValidationProcess.validateICCProfile(CatalogValidationProcess.java:360)
>   ... 29 more
> Caused by: java.util.zip.DataFormatException: incorrect data check
>   at java.util.zip.Inflater.inflateBytes(Native Method)
> A result similar to the one with 2.0.0-RC2 is obtained with 1.8.8.
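
As background on the trace above: "incorrect data check" is the message java.util.zip.Inflater typically reports when a zlib stream decompresses but its trailing Adler-32 checksum does not match, which would be consistent with a damaged or mis-written ICC profile stream. The sketch below reproduces that condition with the JDK only; the class name ZlibCheckSketch and its helper methods are made up for this example, and nothing here is taken from PDFBox.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical demonstration using only java.util.zip.
final class ZlibCheckSketch {

    // Compresses data into a zlib stream (header + deflate data + Adler-32).
    static byte[] deflate(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    // Returns "ok" if the stream inflates cleanly, "truncated" if the input
    // ends early, otherwise the message of the DataFormatException.
    static String tryInflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        byte[] buffer = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    return "truncated";
                }
            }
            return "ok";
        } catch (DataFormatException e) {
            return String.valueOf(e.getMessage());
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] good = deflate("ICC profile bytes".getBytes(StandardCharsets.US_ASCII));
        byte[] bad = good.clone();
        bad[bad.length - 1] ^= 0x01;  // damage the checksum trailer
        System.out.println(tryInflate(good));
        System.out.println(tryInflate(bad));
    }
}
```

A validator that treats this exception as fatal and one that tolerates it will disagree about the same file, which may be part of why different tools give different verdicts here.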






[jira] [Commented] (PDFBOX-3145) Security manager fails for .pdfbox.cache

2015-12-01 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034504#comment-15034504
 ] 

John Hewson commented on PDFBOX-3145:
-

If you can't read from the filesystem, why are you using FileSystemFontProvider?




[jira] [Commented] (PDFBOX-3145) Security manager fails for .pdfbox.cache

2015-12-01 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034532#comment-15034532
 ] 

John Hewson commented on PDFBOX-3145:
-

Ok, we can do that. Obviously you won't get any font substitutions.




[jira] [Commented] (PDFBOX-3145) Security manager fails for .pdfbox.cache

2015-12-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034533#comment-15034533
 ] 

ASF subversion and git services commented on PDFBOX-3145:
-

Commit 1717521 from [~jahewson] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1717521 ]

PDFBOX-3145: Log error if file system cannot be read/written
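
The behaviour named in the commit message (log an error and carry on when the file system cannot be accessed) can be sketched as follows. This is only an illustration of the pattern, assuming that a denied permission surfaces as a SecurityException (AccessControlException is a subclass); the class DiskCacheGuardSketch and the method canUseDiskCache are hypothetical and are not the code from r1717521.

```java
import java.io.File;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical sketch of "skip the disk cache instead of failing".
final class DiskCacheGuardSketch {

    private static final Logger LOG =
            Logger.getLogger(DiskCacheGuardSketch.class.getName());

    // Returns true only if the cache file can be checked and read; a security
    // restriction is logged and treated as "no cache available".
    static boolean canUseDiskCache(File cacheFile) {
        try {
            return cacheFile.exists() && cacheFile.canRead();
        } catch (SecurityException e) {
            LOG.log(Level.SEVERE,
                    "Cannot access font cache " + cacheFile + ", skipping it", e);
            return false;
        }
    }

    public static void main(String[] args) {
        File cache = new File(System.getProperty("user.home"), ".pdfbox.cache");
        System.out.println("disk cache usable: " + canUseDiskCache(cache));
    }
}
```

With a guard like this the caller can fall back to scanning fonts without a cache, or to no system fonts at all, rather than aborting rendering.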




[jira] [Commented] (PDFBOX-3145) Security manager fails for .pdfbox.cache

2015-12-01 Thread simon steiner (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034509#comment-15034509
 ] 

simon steiner commented on PDFBOX-3145:
---

I'm using the default. Normally things are skipped rather than failing when 
they can't access the disk.




[jira] [Commented] (PDFBOX-3138) PDTextField doesn't accept any Hebrew characters as new value

2015-12-01 Thread Gilad Denneboom (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034524#comment-15034524
 ] 

Gilad Denneboom commented on PDFBOX-3138:
-

Embedding the font: Should I do that with PDFBox or via Acrobat? And how?

Setting NeedAppearances to true before applying the values: That worked!
Of course, the file always opens as "dirty", but that's a minor issue. Thanks 
for that!

It would be nice if this issue could be solved natively within PDFBox, though.

> PDTextField doesn't accept any Hebrew characters as new value
> -
>
> Key: PDFBOX-3138
> URL: https://issues.apache.org/jira/browse/PDFBOX-3138
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm, FontBox
>Affects Versions: 2.0.0
> Environment: Eclipse 4.2.2, Windows 7 Pro, JRE 1.8.0_05
>Reporter: Gilad Denneboom
> Fix For: 2.1.0
>
> Attachments: SetHebrewFieldValueTest.java, Test-3-filled.pdf, 
> Test.pdf, Test.txt
>
>
> Trying to set a UTF-8 encoded Hebrew string as the value of a PDTextField 
> fails with the following exception:
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: No glyph for 
> U+05D7 in font AdobeHebrew-Regular
>   at 
> org.apache.pdfbox.pdmodel.font.PDType1CFont.encode(PDType1CFont.java:300)
>   at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:283)
>   at 
> org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:341)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PlainTextFormatter.format(PlainTextFormatter.java:213)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.insertGeneratedAppearance(AppearanceGeneratorHelper.java:373)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceContent(AppearanceGeneratorHelper.java:237)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceValue(AppearanceGeneratorHelper.java:144)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTextField.constructAppearances(PDTextField.java:263)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTerminalField.applyChange(PDTerminalField.java:221)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTextField.setValue(PDTextField.java:218)
>   at SetHebrewFieldValueTest.main(SetHebrewFieldValueTest.java:22)
> {code}
> I've tried using multiple fonts for the field, all of which can handle Hebrew 
> characters just fine, and got the same results in all of them.
> See attached files for a demonstration of the issue.
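The code point named in the exception, U+05D7, is HEBREW LETTER HET. As a rough aid when checking by hand which characters of a field value a font would need to cover, the following sketch lists the Hebrew-block code points in a string. It is independent of PDFBox; the class and method names are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox: lists the Hebrew code points in a string.
final class HebrewCodePoints {

    // Returns every code point of the Hebrew Unicode block found in value,
    // formatted as U+XXXX, in order of appearance.
    static List<String> find(String value) {
        List<String> found = new ArrayList<>();
        value.codePoints()
             .filter(cp -> Character.UnicodeBlock.of(cp) == Character.UnicodeBlock.HEBREW)
             .forEach(cp -> found.add(String.format("U+%04X", cp)));
        return found;
    }

    public static void main(String[] args) {
        // "\u05D7" is the character reported in the exception above.
        System.out.println(find("abc \u05D7\u05D9"));
    }
}
```

Each reported code point can then be compared against the glyphs the field's font actually contains.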



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[jira] [Assigned] (PDFBOX-3145) Security manager fails for .pdfbox.cache

2015-12-01 Thread John Hewson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson reassigned PDFBOX-3145:
---

Assignee: John Hewson

> Security manager fails for .pdfbox.cache
> 
>
> Key: PDFBOX-3145
> URL: https://issues.apache.org/jira/browse/PDFBOX-3145
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: simon steiner
>Assignee: John Hewson
> Fix For: 2.0.0
>
>
> Caused by: java.security.AccessControlException: access denied 
> ("java.io.FilePermission" "/home/simon/.pdfbox.cache" "read")
>   at 
> java.security.AccessControlContext.checkPermission(AccessControlContext.java:457)
>   at 
> java.security.AccessController.checkPermission(AccessController.java:884)
>   at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
>   at java.lang.SecurityManager.checkRead(SecurityManager.java:888)
>   at java.io.File.exists(File.java:814)
>   at 
> org.apache.pdfbox.pdmodel.font.FileSystemFontProvider.loadDiskCache(FileSystemFontProvider.java:357)
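The trace shows File.exists() being rejected by the security manager while the disk cache is loaded. One defensive pattern, sketched here under the assumption that an inaccessible cache can simply be treated as absent, is to catch SecurityException (AccessControlException is a subclass) around the check. CacheProbe and isReadableCache are hypothetical names; this is not the change made in PDFBox.

```java
import java.io.File;

// Hypothetical helper, not PDFBox code: treats a cache file that may not be
// accessed as if it did not exist.
final class CacheProbe {

    // Returns true only if the file exists and is readable. A SecurityException
    // (for example AccessControlException under a security manager) counts as "no cache".
    static boolean isReadableCache(File cacheFile) {
        try {
            return cacheFile.exists() && cacheFile.canRead();
        } catch (SecurityException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        File cache = new File(System.getProperty("user.home"), ".pdfbox.cache");
        System.out.println("cache usable: " + isReadableCache(cache));
    }
}
```

A caller that gets false would rebuild the font information in memory instead of failing.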



[jira] [Updated] (PDFBOX-3145) Security manager fails for .pdfbox.cache

2015-12-01 Thread John Hewson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson updated PDFBOX-3145:

Fix Version/s: 2.0.0

> Security manager fails for .pdfbox.cache
> 
>
> Key: PDFBOX-3145
> URL: https://issues.apache.org/jira/browse/PDFBOX-3145
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: simon steiner
> Fix For: 2.0.0
>
>
> Caused by: java.security.AccessControlException: access denied 
> ("java.io.FilePermission" "/home/simon/.pdfbox.cache" "read")
>   at 
> java.security.AccessControlContext.checkPermission(AccessControlContext.java:457)
>   at 
> java.security.AccessController.checkPermission(AccessController.java:884)
>   at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
>   at java.lang.SecurityManager.checkRead(SecurityManager.java:888)
>   at java.io.File.exists(File.java:814)
>   at 
> org.apache.pdfbox.pdmodel.font.FileSystemFontProvider.loadDiskCache(FileSystemFontProvider.java:357)



[jira] [Commented] (PDFBOX-3145) Security manager fails for .pdfbox.cache

2015-12-01 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034049#comment-15034049
 ] 

John Hewson commented on PDFBOX-3145:
-

I can't reproduce this. Are you doing something unusual?

> Security manager fails for .pdfbox.cache
> 
>
> Key: PDFBOX-3145
> URL: https://issues.apache.org/jira/browse/PDFBOX-3145
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: simon steiner
>Assignee: John Hewson
> Fix For: 2.0.0
>
>
> Caused by: java.security.AccessControlException: access denied 
> ("java.io.FilePermission" "/home/simon/.pdfbox.cache" "read")
>   at 
> java.security.AccessControlContext.checkPermission(AccessControlContext.java:457)
>   at 
> java.security.AccessController.checkPermission(AccessController.java:884)
>   at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
>   at java.lang.SecurityManager.checkRead(SecurityManager.java:888)
>   at java.io.File.exists(File.java:814)
>   at 
> org.apache.pdfbox.pdmodel.font.FileSystemFontProvider.loadDiskCache(FileSystemFontProvider.java:357)



Build failed in Jenkins: PDFBox-trunk #2642

2015-12-01 Thread Apache Jenkins Server
See 

Changes:

[tilman] PDFBOX-3141: render border of link annotations

[tilman] PDFBOX-3141: add getter/setter /Border

--
[...truncated 284 lines...]
Running org.apache.xmpbox.schema.AdobePDFTest
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.033 sec - in 
org.apache.xmpbox.schema.AdobePDFTest
Running org.apache.xmpbox.schema.AdobePDFErrorsTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.009 sec - in 
org.apache.xmpbox.schema.AdobePDFErrorsTest
Running org.apache.xmpbox.schema.PhotoshopSchemaTest
Tests run: 162, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.074 sec - 
in org.apache.xmpbox.schema.PhotoshopSchemaTest
Running org.apache.xmpbox.schema.XmpRightsSchemaTest
Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.027 sec - in 
org.apache.xmpbox.schema.XmpRightsSchemaTest
Running org.apache.xmpbox.schema.XMPSchemaTest
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.037 sec - in 
org.apache.xmpbox.schema.XMPSchemaTest
Running org.apache.xmpbox.schema.BasicJobTicketSchemaTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.024 sec - in 
org.apache.xmpbox.schema.BasicJobTicketSchemaTest
Running org.apache.xmpbox.schema.PDFAIdentificationTest
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.017 sec - in 
org.apache.xmpbox.schema.PDFAIdentificationTest
Running org.apache.xmpbox.schema.PDFAIdentificationOthersTest
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.007 sec - in 
org.apache.xmpbox.schema.PDFAIdentificationOthersTest
Running org.apache.xmpbox.TestXMPWithDefinedSchemas
Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.053 sec - in 
org.apache.xmpbox.TestXMPWithDefinedSchemas
Running org.apache.xmpbox.SaveMetadataHelperTest
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.004 sec - in 
org.apache.xmpbox.SaveMetadataHelperTest

Results :

Tests run: 809, Failures: 0, Errors: 0, Skipped: 0

[JENKINS] Recording test results
[INFO] 
[INFO] --- animal-sniffer-maven-plugin:1.13:check (check-java-version) @ xmpbox 
---
[INFO] Checking unresolved references to org.codehaus.mojo.signature:java16:1.0
[INFO] 
[INFO] --- maven-bundle-plugin:2.5.3:bundle (default-bundle) @ xmpbox ---
[INFO] 
[INFO] --- maven-site-plugin:3.4:attach-descriptor (attach-descriptor) @ xmpbox 
---
[WARNING] Failed to getClass for org.apache.maven.plugin.source.SourceJarMojo
[INFO] 
[INFO] --- maven-source-plugin:2.3:jar (attach-sources) @ xmpbox ---
[INFO] Building jar: 

[INFO] 
[INFO] --- apache-rat-plugin:0.11:check (default) @ xmpbox ---
[INFO] 51 implicit excludes (use -debug for more details).
[INFO] Exclude: release.properties
[INFO] 136 resources included (use -debug for more details)
Warning:  org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser: Property 
'http://www.oracle.com/xml/jaxp/properties/entityExpansionLimit' is not 
recognized.
Compiler warnings:
  WARNING:  'org.apache.xerces.jaxp.SAXParserImpl: Property 
'http://javax.xml.XMLConstants/property/accessExternalDTD' is not recognized.'
Warning:  org.apache.xerces.parsers.SAXParser: Feature 
'http://javax.xml.XMLConstants/feature/secure-processing' is not recognized.
Warning:  org.apache.xerces.parsers.SAXParser: Property 
'http://javax.xml.XMLConstants/property/accessExternalDTD' is not recognized.
Warning:  org.apache.xerces.parsers.SAXParser: Property 
'http://www.oracle.com/xml/jaxp/properties/entityExpansionLimit' is not 
recognized.
[INFO] Rat check: Summary of files. Unapproved: 0 unknown: 0 generated: 0 
approved: 133 licence.
[INFO] 
[INFO] --- maven-install-plugin:2.5.2:install (default-install) @ xmpbox ---
[INFO] Installing 

 to 
/home/jenkins/jenkins-slave/maven-repositories/2/org/apache/pdfbox/xmpbox/2.0.0-SNAPSHOT/xmpbox-2.0.0-SNAPSHOT.jar
[INFO] Installing 
 to 
/home/jenkins/jenkins-slave/maven-repositories/2/org/apache/pdfbox/xmpbox/2.0.0-SNAPSHOT/xmpbox-2.0.0-SNAPSHOT.pom
[INFO] Installing 

 to 
/home/jenkins/jenkins-slave/maven-repositories/2/org/apache/pdfbox/xmpbox/2.0.0-SNAPSHOT/xmpbox-2.0.0-SNAPSHOT-sources.jar
[INFO] 
[INFO] --- maven-bundle-plugin:2.5.3:install (default-install) @ xmpbox ---
[INFO] Installing 
org/apache/pdfbox/xmpbox/2.0.0-SNAPSHOT/xmpbox-2.0.0-SNAPSHOT.jar
[INFO] Writing OBR metadata
[INFO] 
[INFO] --- maven-deploy-plugin:2.8.2:deploy (default-deploy) @ xmpbox ---
Downloading: 
https://repository.apache.org/content/repositories/snapshots/org/apache/pdfbox/xmpbox/2.0.0-SNAPSHOT/maven-metadata.xml

[jira] [Updated] (PDFBOX-3141) Link annotation borders not rendered

2015-12-01 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-3141:

Summary: Link annotation borders not rendered  (was: Annotation borders not 
rendered)

> Link annotation borders not rendered
> 
>
> Key: PDFBOX-3141
> URL: https://issues.apache.org/jira/browse/PDFBOX-3141
> Project: PDFBox
>  Issue Type: Bug
>  Components: PDModel, Rendering
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
>Assignee: Tilman Hausherr
> Attachments: annots.pdf, annots.pdf-1.png
>
>
> Borders in annotations are not rendered.



[jira] [Commented] (PDFBOX-3030) Enhance documentation for PDFBox 2.0.0

2015-12-01 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034084#comment-15034084
 ] 

John Hewson commented on PDFBOX-3030:
-

Argh, we should try to remove it I think.

> Enhance documentation for PDFBox 2.0.0
> --
>
> Key: PDFBOX-3030
> URL: https://issues.apache.org/jira/browse/PDFBOX-3030
> Project: PDFBox
>  Issue Type: Task
>  Components: Documentation
>Affects Versions: 2.0.0
>Reporter: Maruan Sahyoun
>Assignee: Maruan Sahyoun
> Attachments: TGH-16862c48-6b0b-410e-8fc6-b1d9f4418ecc.htm
>
>
> Task to track enhancements to the documentation or website as part of PDFBox 
> 2.0.0
> - update javadoc (current as of writing)
> - migration guide 



[jira] [Resolved] (PDFBOX-3137) Reduce/remove dependency on commons.io in preflight/xmpbox

2015-12-01 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler resolved PDFBOX-3137.

Resolution: Fixed

> Reduce/remove dependency on commons.io in preflight/xmpbox
> --
>
> Key: PDFBOX-3137
> URL: https://issues.apache.org/jira/browse/PDFBOX-3137
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Preflight, XmpBox
>Affects Versions: 2.0.0
>Reporter: Andreas Lehmkühler
>Assignee: Andreas Lehmkühler
> Fix For: 2.0.0
>
>
> The usage of commons.io should be removed or at least reduced to a minimum to 
> avoid the dependency.



[jira] [Commented] (PDFBOX-3131) Reduce amount of intermediate data and objects to reduce memory footprint/complexity

2015-12-01 Thread JIRA

[ 
https://issues.apache.org/jira/browse/PDFBOX-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034163#comment-15034163
 ] 

Andreas Lehmkühler commented on PDFBOX-3131:


The font source data is no longer stored, since it isn't used anywhere. The 
only hint I've found is PDFBOX-2791, but I didn't get the point, as the CFF 
data isn't changed/repaired anywhere.
If the data is really needed, we might implement some sort of switch.

> Reduce amount of intermediate data and objects to reduce memory 
> footprint/complexity
> 
>
> Key: PDFBOX-3131
> URL: https://issues.apache.org/jira/browse/PDFBOX-3131
> Project: PDFBox
>  Issue Type: Improvement
>  Components: FontBox
>Affects Versions: 2.0.0
>Reporter: Andreas Lehmkühler
>Assignee: Andreas Lehmkühler
> Fix For: 2.0.0
>
>
> The CFFParser holds a lot of intermediate data and produces a lot of objects 
> to do so. The idea is to reduce the amount of such objects and data to reduce 
> the memory footprint and the complexity.
> - the class IndexData holds intermediate data and creates a byte array every 
> time getBytes is called. I'm going to replace the class with a simple list to 
> reduce the memory footprint and the complexity
> - remove unused members of private classes
> - create a list of strings instead of a list of byte arrays which is used to 
> create those strings
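The last bullet can be illustrated with a small sketch: decode each entry of an offset-delimited byte block into a string once, rather than keeping one byte array per entry and converting later. This is not the FontBox code. The class name, the 0-based offset convention, and the ISO-8859-1 decoding are assumptions made only for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the FontBox implementation.
final class IndexStrings {

    // offsets has one more element than the number of entries; entry i spans
    // data[offsets[i]] (inclusive) to data[offsets[i + 1]] (exclusive).
    static List<String> decode(byte[] data, int[] offsets) {
        List<String> result = new ArrayList<>(Math.max(0, offsets.length - 1));
        for (int i = 0; i + 1 < offsets.length; i++) {
            int start = offsets[i];
            int length = offsets[i + 1] - start;
            // Decode directly from the shared buffer; no per-entry byte[] is kept.
            result.add(new String(data, start, length, StandardCharsets.ISO_8859_1));
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] data = "FooBarBaz".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(decode(data, new int[] {0, 3, 6, 9}));
    }
}
```

Only the resulting strings stay reachable afterwards, so the source buffer can be released once parsing is done.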



[jira] [Commented] (PDFBOX-3141) Link annotation borders not rendered

2015-12-01 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034161#comment-15034161
 ] 

Tilman Hausherr commented on PDFBOX-3141:
-

The lines are thicker than with Adobe Reader, but identical to two other 
viewers (PDF.js and a Java competitor). Adobe is not following its own 
specification, which states that the width is in user units.
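For reference, one default user-space unit is 1/72 inch, so a border width given in user units scales with the rendering resolution. The sketch below shows that conversion only. It ignores any UserUnit entry on the page and any further transformation, and the names are hypothetical.

```java
// Hypothetical helper: converts a width in default user-space units to device pixels.
final class BorderWidth {

    // One default user-space unit is 1/72 inch, so the factor is dpi / 72.
    static double toPixels(double widthInUserUnits, double dpi) {
        return widthInUserUnits * dpi / 72.0;
    }

    public static void main(String[] args) {
        // A border of width 1 rendered at 72, 144 and 300 dpi.
        System.out.println(toPixels(1.0, 72.0));
        System.out.println(toPixels(1.0, 144.0));
        System.out.println(toPixels(1.0, 300.0));
    }
}
```

A viewer that draws a fixed hairline regardless of resolution will therefore show thinner borders than this conversion gives at higher dpi.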

> Link annotation borders not rendered
> 
>
> Key: PDFBOX-3141
> URL: https://issues.apache.org/jira/browse/PDFBOX-3141
> Project: PDFBox
>  Issue Type: Bug
>  Components: PDModel, Rendering
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
>Assignee: Tilman Hausherr
>  Labels: Annotations
> Attachments: annots.pdf, annots.pdf-1-NEW.png, annots.pdf-1.png
>
>
> Borders in link annotations are not rendered.



[jira] [Commented] (PDFBOX-3133) PDFBox 2.0.0-RC2 and earlier 2.0.0 SNAPSHOT Versions print performance is poor with systems having low RAM < 3GB and lower number of fonts.

2015-12-01 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034060#comment-15034060
 ] 

John Hewson commented on PDFBOX-3133:
-

I'm not seeing any evidence that the number of fonts makes a difference. The 
only number given in the previous test results is "90 fonts". Those results 
seem to show that more RAM results in faster rendering, up to 4GB, which makes 
perfect sense.

> PDFBox 2.0.0-RC2 and earlier 2.0.0 SNAPSHOT Versions print performance is 
> poor with systems having low RAM < 3GB and lower number of fonts.
> ---
>
> Key: PDFBOX-3133
> URL: https://issues.apache.org/jira/browse/PDFBOX-3133
> Project: PDFBox
>  Issue Type: Improvement
>  Components: PDModel
>Affects Versions: 2.0.0
> Environment: MS Windows systems with low RAM (< 3 GB) and fewer than 592 
> fonts (or where the fonts used in the PDF to be printed are not available 
> on the local system) 
>Reporter: Sridhar
>Assignee: John Hewson
>  Labels: performance
> Fix For: 2.0.0
>
>
> PDFBox 2.0.0-RC1, SNAPSHOT and RC2 versions take 15+ seconds to print.
> Steps to reproduce
> -- 
> Use a Windows system with < 3 GB RAM
> Use a system with fewer fonts installed, or without the specific fonts used 
> in the PDF file to be printed.
> Printing a PDF file 
> Took 14 to 20 seconds on a system with 3 GB RAM which had 522 fonts
> Took 24 to 34 seconds on a system with 2 GB RAM which had 90 fonts
> Took only 2.5 seconds on a system with 8 GB RAM which had 1025 fonts. 
> Doubt
>  
> I have not browsed the code, but the following is my suspicion about what 
> causes the performance issue.
> The code caches fonts by storing them in the local .pdfbox.cache file the 
> first time and reusing that cache on subsequent runs.
> It is not clear whether the code updates the font cache file if new fonts 
> are found in a new PDF file that is printed on a subsequent run. 
> If the fonts in the PDF file to be printed are not available in the 
> .pdfbox.cache file or on the local system, what is the behaviour? Will the 
> code download fonts and update the cache for subsequent runs, or is it limited 
> to the fonts available on the local system? It looks like the latter is the 
> case, and performance is hit either by the low RAM, by the font cache not 
> being updated, or by the fonts being unavailable on the local system.



[jira] [Commented] (PDFBOX-3144) NullPointerException in TTFSubsetter

2015-12-01 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034072#comment-15034072
 ] 

John Hewson commented on PDFBOX-3144:
-

Can you post the stack trace of the NullPointerException?

> NullPointerException in TTFSubsetter
> 
>
> Key: PDFBOX-3144
> URL: https://issues.apache.org/jira/browse/PDFBOX-3144
> Project: PDFBox
>  Issue Type: Bug
>  Components: FontBox
>Affects Versions: 2.0.0
> Environment: Version 2.0.0-RC2
>Reporter: Philip Helger
>
> An NPE happens in "public void TTFSubsetter.add(int unicode)" because the 
> "unicodeCmap" member is null.
> This might be because the passed "ttf" member is based on a 
> "MemoryTTFDataStream" and has only 38 glyphs (so it might already be a 
> subset). The only available tables of the TTF are: [fpgm, head, cvt , glyf, 
> loca, gasp, hmtx, prep, hhea, maxp]
> The variables of the underlying font are:
> {code}
> this  PDType0Font  (id=58)
>   afmStandard14   null
>   avgFontWidth  0.0
>   cMap  CMap  -> Identity-H
>   cMapUCS2  null
>   descendantFont  PDCIDFontType2  (id=155)
>   dict  COSDictionary  -> COSDictionary{(COSName{Type}:COSName{Font}) 
> (COSName{BaseFont}:COSName{AAAMSE+OpenSans-Bold}) 
> (COSName{Subtype}:COSName{Type0}) (COSName{Encoding}:COSName{Identity-H}) 
> (COSName{DescendantFonts}:COSArray{[COSDictionary{(COSName{Type}:COSName{Font})
>  (COSName{Subtype}:COSName{CIDFontType2}) 
> (COSName{BaseFont}:COSName{AAAMSE+OpenSans-Bold}) 
> (COSName{CIDSystemInfo}:COSDictionary{(COSName{Registry}:COSString{Adobe}) 
> (COSName{Ordering}:COSString{Identity}) (COSName{Supplement}:COSInt{0}) }) 
> (COSName{FontDescriptor}:COSDictionary{(COSName{Type}:COSName{FontDescriptor})
>  (COSName{FontName}:COSName{AAAMSE+OpenSans-Bold}) (COSName{Flags}:COSInt{4}) 
> (COSName{FontWeight}:COSFloat{700.0}) (COSName{ItalicAngle}:COSFloat{0.0}) 
> (COSName{FontBBox}:COSArray{[COSFloat{-619.1406}, COSFloat{-292.96875}, 
> COSFloat{1318.8477}, COSFloat{1068.8477}]}) 
> (COSName{Ascent}:COSFloat{1068.8477}) (COSName{Descent}:COSFloat{-292.96875}) 
> (COSName{CapHeight}:COSFloat{713.8672}) 
> (COSName{XHeight}:COSFloat{545.89844}) (COSName{StemV}:COSFloat{251.93846}) 
> (COSName{FontFile2}:COSStream{(COSName{Filter}:COSName{FlateDecode}) 
> (COSName{Length}:COSInt{5625}) (COSName{Length1}:COSInt{8036}) }) 
> (COSName{CIDSet}:COSStream{(COSName{Filter}:COSName{FlateDecode}) 
> (COSName{Length}:COSInt{20}) }) }) (COSName{W}:COSArray{[COSInt{3}, 
> COSArray{[COSInt{260}]}, COSInt{68}, COSArray{[COSInt{604}, COSInt{633}, 
> COSInt{514}, COSInt{633}, COSInt{591}]}, COSInt{74}, COSArray{[COSInt{565}, 
> COSInt{657}, COSInt{305}]}, COSInt{15}, COSArray{[COSInt{290}]}, COSInt{79}, 
> COSArray{[COSInt{305}]}, COSInt{16}, COSArray{[COSInt{322}]}, COSInt{80}, 
> COSArray{[COSInt{982}]}, COSInt{17}, COSArray{[COSInt{285}]}, COSInt{81}, 
> COSArray{[COSInt{657}, COSInt{619}]}, COSInt{19}, COSArray{[COSInt{571}]}, 
> COSInt{83}, COSArray{[COSInt{633}]}, COSInt{20}, COSArray{[COSInt{571}]}, 
> COSInt{85}, COSArray{[COSInt{454}, COSInt{497}, COSInt{434}, COSInt{657}]}, 
> COSInt{27}, COSArray{[COSInt{571}, COSInt{571}, COSInt{285}]}, COSInt{93}, 
> COSArray{[COSInt{488}]}, COSInt{36}, COSArray{[COSInt{690}, COSInt{672}]}, 
> COSInt{40}, COSArray{[COSInt{560}]}, COSInt{48}, COSArray{[COSInt{943}, 
> COSInt{813}]}, COSInt{53}, COSArray{[COSInt{660}, COSInt{551}, COSInt{579}, 
> COSInt{756}]}, COSInt{61}, COSArray{[COSInt{579}]}]}) 
> (COSName{CIDToGIDMap}:COSStream{(COSName{Filter}:COSName{FlateDecode}) 
> (COSName{Length}:COSInt{84}) (COSName{Length1}:COSInt{188}) }) }]}) 
> (COSName{ToUnicode}:COSStream{(COSName{Filter}:COSName{FlateDecode}) 
> (COSName{Length}:COSInt{324}) }) }
>   embedder  PDCIDFontType2Embedder
>   fontDescriptor  null
>   fontWidthOfSpace  -1.0
>   isCMapPredefined  true
>   isDescendantCJK false   
>   noUnicode   HashSet  -> empty
>   toUnicodeCMap   null
>   widths  null
> {code}
> I will try to find a minimal example that reproduces this. Currently it 
> is only reproducible as part of a bigger package :|
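The table list in the report contains no cmap entry, which would be consistent with the "unicodeCmap" member being null (an inference from the report, not something confirmed in it). A minimal sketch of a presence check that could run before subsetting is shown below. The class, the method, and the choice of required tables are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical helper, not part of FontBox: reports which required tables are absent.
final class TableCheck {

    // Returns the required table tags that do not appear in the present tags,
    // in the order in which they were required.
    static List<String> missing(Collection<String> present, List<String> required) {
        List<String> result = new ArrayList<>();
        for (String tag : required) {
            if (!present.contains(tag)) {
                result.add(tag);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Table tags as listed in the report ("cvt " includes a trailing space).
        List<String> present = Arrays.asList("fpgm", "head", "cvt ", "glyf", "loca",
                "gasp", "hmtx", "prep", "hhea", "maxp");
        System.out.println(missing(present, Arrays.asList("cmap", "glyf", "loca")));
    }
}
```

A caller could skip subsetting, or report a clear error, when the result is not empty.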



[jira] [Assigned] (PDFBOX-3140) Different fallback font rendering first and second time

2015-12-01 Thread John Hewson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson reassigned PDFBOX-3140:
---

Assignee: John Hewson

> Different fallback font rendering first and second time
> ---
>
> Key: PDFBOX-3140
> URL: https://issues.apache.org/jira/browse/PDFBOX-3140
> Project: PDFBox
>  Issue Type: Bug
>  Components: FontBox, Rendering
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
>Assignee: John Hewson
>Priority: Minor
> Fix For: 2.0.0
>
>
> The file from PDFBOX-2563 looks different depending on whether the cache 
> exists, because a different fallback font is used:
> After deleting .pdfbox.cache:
> {quote}
> Using fallback font Batang for CID-keyed TrueType font 
> {quote}
> Second run with existing cache:
> {quote}
> Using fallback font ArialUnicodeMS for CID-keyed TrueType font 
> {quote}



[jira] [Updated] (PDFBOX-3140) Different fallback font rendering first and second time

2015-12-01 Thread John Hewson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson updated PDFBOX-3140:

Fix Version/s: 2.0.0

> Different fallback font rendering first and second time
> ---
>
> Key: PDFBOX-3140
> URL: https://issues.apache.org/jira/browse/PDFBOX-3140
> Project: PDFBox
>  Issue Type: Bug
>  Components: FontBox, Rendering
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
>Assignee: John Hewson
>Priority: Minor
> Fix For: 2.0.0
>
>
> The file from PDFBOX-2563 looks different depending on whether the cache 
> exists, because a different fallback font is used:
> After deleting .pdfbox.cache:
> {quote}
> Using fallback font Batang for CID-keyed TrueType font 
> {quote}
> Second run with existing cache:
> {quote}
> Using fallback font ArialUnicodeMS for CID-keyed TrueType font 
> {quote}



[jira] [Updated] (PDFBOX-3141) Link annotation borders not rendered

2015-12-01 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-3141:

Attachment: annots.pdf-1-NEW.png

> Link annotation borders not rendered
> 
>
> Key: PDFBOX-3141
> URL: https://issues.apache.org/jira/browse/PDFBOX-3141
> Project: PDFBox
>  Issue Type: Bug
>  Components: PDModel, Rendering
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
>Assignee: Tilman Hausherr
> Attachments: annots.pdf, annots.pdf-1-NEW.png, annots.pdf-1.png
>
>
> Borders in link annotations are not rendered.



[jira] [Commented] (PDFBOX-3140) Different fallback font rendering first and second time

2015-12-01 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034087#comment-15034087
 ] 

John Hewson commented on PDFBOX-3140:
-

Oh boy.

> Different fallback font rendering first and second time
> ---
>
> Key: PDFBOX-3140
> URL: https://issues.apache.org/jira/browse/PDFBOX-3140
> Project: PDFBox
>  Issue Type: Bug
>  Components: FontBox, Rendering
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
>Assignee: John Hewson
>Priority: Minor
> Fix For: 2.0.0
>
>
> The file from PDFBOX-2563 looks different depending on whether the cache 
> exists, because a different fallback font is used:
> After deleting .pdfbox.cache:
> {quote}
> Using fallback font Batang for CID-keyed TrueType font 
> {quote}
> Second run with existing cache:
> {quote}
> Using fallback font ArialUnicodeMS for CID-keyed TrueType font 
> {quote}



[jira] [Updated] (PDFBOX-3138) PDTextField doesn't accept any Hebrew characters as new value

2015-12-01 Thread John Hewson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson updated PDFBOX-3138:

Description: 
Trying to set a UTF-8 encoded Hebrew string as the value of a PDTextField fails 
with the following exception:

{code}
Exception in thread "main" java.lang.IllegalArgumentException: No glyph for 
U+05D7 in font AdobeHebrew-Regular
at 
org.apache.pdfbox.pdmodel.font.PDType1CFont.encode(PDType1CFont.java:300)
at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:283)
at 
org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:341)
at 
org.apache.pdfbox.pdmodel.interactive.form.PlainTextFormatter.format(PlainTextFormatter.java:213)
at 
org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.insertGeneratedAppearance(AppearanceGeneratorHelper.java:373)
at 
org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceContent(AppearanceGeneratorHelper.java:237)
at 
org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceValue(AppearanceGeneratorHelper.java:144)
at 
org.apache.pdfbox.pdmodel.interactive.form.PDTextField.constructAppearances(PDTextField.java:263)
at 
org.apache.pdfbox.pdmodel.interactive.form.PDTerminalField.applyChange(PDTerminalField.java:221)
at 
org.apache.pdfbox.pdmodel.interactive.form.PDTextField.setValue(PDTextField.java:218)
at SetHebrewFieldValueTest.main(SetHebrewFieldValueTest.java:22)
{code}

I've tried using multiple fonts for the field, all of which can handle Hebrew 
characters just fine, and got the same results in all of them.
See attached files for a demonstration of the issue.

  was:
Trying to set a UTF-8 encoded Hebrew string as the value of a PDTextField fails 
with the following exception:

Exception in thread "main" java.lang.IllegalArgumentException: No glyph for 
U+05D7 in font AdobeHebrew-Regular
at 
org.apache.pdfbox.pdmodel.font.PDType1CFont.encode(PDType1CFont.java:300)
at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:283)
at 
org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:341)
at 
org.apache.pdfbox.pdmodel.interactive.form.PlainTextFormatter.format(PlainTextFormatter.java:213)
at 
org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.insertGeneratedAppearance(AppearanceGeneratorHelper.java:373)
at 
org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceContent(AppearanceGeneratorHelper.java:237)
at 
org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceValue(AppearanceGeneratorHelper.java:144)
at 
org.apache.pdfbox.pdmodel.interactive.form.PDTextField.constructAppearances(PDTextField.java:263)
at 
org.apache.pdfbox.pdmodel.interactive.form.PDTerminalField.applyChange(PDTerminalField.java:221)
at 
org.apache.pdfbox.pdmodel.interactive.form.PDTextField.setValue(PDTextField.java:218)
at SetHebrewFieldValueTest.main(SetHebrewFieldValueTest.java:22)

I've tried using multiple fonts for the field, all of which can handle Hebrew 
characters just fine, and got the same results in all of them.
See attached files for a demonstration of the issue.


> PDTextField doesn't accept any Hebrew characters as new value
> -
>
> Key: PDFBOX-3138
> URL: https://issues.apache.org/jira/browse/PDFBOX-3138
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm, FontBox
>Affects Versions: 2.0.0
> Environment: Eclipse 4.2.2, Windows 7 Pro, JRE 1.8.0_05
>Reporter: Gilad Denneboom
>Priority: Minor
> Attachments: SetHebrewFieldValueTest.java, Test.pdf, Test.txt
>
>
> Trying to set a UTF-8 encoded Hebrew string as the value of a PDTextField 
> fails with the following exception:
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: No glyph for 
> U+05D7 in font AdobeHebrew-Regular
>   at 
> org.apache.pdfbox.pdmodel.font.PDType1CFont.encode(PDType1CFont.java:300)
>   at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:283)
>   at 
> org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:341)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PlainTextFormatter.format(PlainTextFormatter.java:213)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.insertGeneratedAppearance(AppearanceGeneratorHelper.java:373)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceContent(AppearanceGeneratorHelper.java:237)
>   at 
> 

[jira] [Commented] (PDFBOX-3131) Reduce amount of intermediate data and objects to reduce memory footprint/complexity

2015-12-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034122#comment-15034122
 ] 

ASF subversion and git services commented on PDFBOX-3131:
-

Commit 1717473 from [~lehmi] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1717473 ]

PDFBOX-3131: don't store the source data of the font

> Reduce amount of intermediate data and objects to reduce memory 
> footprint/complexity
> 
>
> Key: PDFBOX-3131
> URL: https://issues.apache.org/jira/browse/PDFBOX-3131
> Project: PDFBox
>  Issue Type: Improvement
>  Components: FontBox
>Affects Versions: 2.0.0
>Reporter: Andreas Lehmkühler
>Assignee: Andreas Lehmkühler
> Fix For: 2.0.0
>
>
> The CFFParser holds a lot of intermediate data and produces a lot of objects 
> to do so. The idea is to reduce the amount of such objects and data to reduce 
> the memory footprint and the complexity.
> - the class IndexData holds intermediate data and creates a byte array every 
> time getBytes is called. I'm going to replace the class with a simple list to 
> reduce the memory footprint and the complexity
> - remove unused members of private classes
> - create a list of strings instead of a list of byte arrays which is used to 
> create those strings
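
The first and third bullets can be illustrated with a small sketch. This is not the FontBox code: the class and method names are hypothetical, and it assumes the INDEX layout described in the CFF specification (a 2-byte count, a 1-byte offset size, count + 1 one-based offsets, then the object data). Each entry is decoded straight from the source array into a String, so no per-entry byte array is kept.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not the FontBox implementation.
final class CffStringIndexSketch {

    // Reads a CFF INDEX starting at 'start' and returns its entries as strings.
    static List<String> readStringIndex(byte[] data, int start) {
        int pos = start;
        int count = ((data[pos] & 0xFF) << 8) | (data[pos + 1] & 0xFF);
        pos += 2;
        List<String> result = new ArrayList<>(count);
        if (count == 0) {
            return result; // an empty INDEX consists of the count field only
        }
        int offSize = data[pos++] & 0xFF;
        int offsetArray = pos;
        // Offsets are relative to the byte preceding the object data.
        int dataBase = offsetArray + (count + 1) * offSize - 1;
        for (int i = 0; i < count; i++) {
            int from = readOffset(data, offsetArray + i * offSize, offSize);
            int to = readOffset(data, offsetArray + (i + 1) * offSize, offSize);
            // Decode directly from the source array; no intermediate byte[] copy.
            result.add(new String(data, dataBase + from, to - from,
                    StandardCharsets.ISO_8859_1));
        }
        return result;
    }

    // Reads a big-endian unsigned integer of 'size' bytes.
    static int readOffset(byte[] data, int pos, int size) {
        int value = 0;
        for (int i = 0; i < size; i++) {
            value = (value << 8) | (data[pos + i] & 0xFF);
        }
        return value;
    }
}
```

The caller gets a plain `List<String>` and never sees the raw bytes, which is the simplification the description aims for.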



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[jira] [Commented] (PDFBOX-3133) PDFBox 2.0.0-RC2 and earlier 2.0.0 SNAPSHOT Versions print performance is poor with systems having low RAM < 3GB and lower number of fonts.

2015-12-01 Thread Sridhar (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034130#comment-15034130
 ] 

Sridhar commented on PDFBOX-3133:
-

Thanks. FYI, the first time, when the warning message appeared, it took 45 
seconds; subsequent runs took 18 to 19 seconds. The machine has 2 GB RAM.

> PDFBox 2.0.0-RC2 and earlier 2.0.0 SNAPSHOT Versions print performance is 
> poor with systems having low RAM < 3GB and lower number of fonts.
> ---
>
> Key: PDFBOX-3133
> URL: https://issues.apache.org/jira/browse/PDFBOX-3133
> Project: PDFBox
>  Issue Type: Improvement
>  Components: PDModel
>Affects Versions: 2.0.0
> Environment: MS Windows Systems with low RAM < 3GB and number of 
> fonts were less < 592 (or if desired fonts in PDF to be printed are not 
> available in local system ) 
>Reporter: Sridhar
>Assignee: John Hewson
>  Labels: performance
> Fix For: 2.0.0
>
>
> Printing with PDFBox 2.0.0-RC1, the SNAPSHOT builds and RC2 takes 15+ seconds.
> Steps to reproduce
> -- 
> Use a Windows system with < 3 GB RAM.
> Use a system with few installed fonts, or one lacking the specific fonts used 
> in the PDF file to be printed.
> Printing the PDF file:
> Took 14 to 20 seconds on a system with 3 GB RAM and 522 fonts
> Took 24 to 34 seconds on a system with 2 GB RAM and 90 fonts
> Took only 2.5 seconds on a system with 8 GB RAM and 1025 fonts.
> Doubt
>  
> I have not browsed the code, but I suspect the following is causing the 
> performance issue.
> The code does cache fonts, storing them in a local .pdfbox.cache file the 
> first time for use on subsequent runs.
> It is not clear whether the code updates the cache file when a later PDF file 
> to be printed uses new fonts.
> If the fonts in the PDF file to be printed are not in the .pdfbox.cache file 
> stored on the local system, what is the behaviour?  Will the code download 
> fonts and update the cache for subsequent runs, or is it limited to the fonts 
> available on the local system?  It looks like the latter is the case, and 
> performance suffers either because of low RAM, because the font cache is not 
> kept up to date, or because fonts are unavailable on the local system.



[jira] [Updated] (PDFBOX-3138) PDTextField doesn't accept any Hebrew characters as new value

2015-12-01 Thread John Hewson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson updated PDFBOX-3138:

Priority: Major  (was: Minor)

> PDTextField doesn't accept any Hebrew characters as new value
> -
>
> Key: PDFBOX-3138
> URL: https://issues.apache.org/jira/browse/PDFBOX-3138
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm, FontBox
>Affects Versions: 2.0.0
> Environment: Eclipse 4.2.2, Windows 7 Pro, JRE 1.8.0_05
>Reporter: Gilad Denneboom
> Fix For: 2.1.0
>
> Attachments: SetHebrewFieldValueTest.java, Test.pdf, Test.txt
>
>
> Trying to set a UTF-8 encoded Hebrew string as the value of a PDTextField 
> fails with the following exception:
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: No glyph for 
> U+05D7 in font AdobeHebrew-Regular
>   at 
> org.apache.pdfbox.pdmodel.font.PDType1CFont.encode(PDType1CFont.java:300)
>   at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:283)
>   at 
> org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:341)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PlainTextFormatter.format(PlainTextFormatter.java:213)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.insertGeneratedAppearance(AppearanceGeneratorHelper.java:373)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceContent(AppearanceGeneratorHelper.java:237)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceValue(AppearanceGeneratorHelper.java:144)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTextField.constructAppearances(PDTextField.java:263)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTerminalField.applyChange(PDTerminalField.java:221)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTextField.setValue(PDTextField.java:218)
>   at SetHebrewFieldValueTest.main(SetHebrewFieldValueTest.java:22)
> {code}
> I've tried using multiple fonts for the field, all of which can handle Hebrew 
> characters just fine, and got the same results in all of them.
> See attached files for a demonstration of the issue.



[jira] [Updated] (PDFBOX-3138) PDTextField doesn't accept any Hebrew characters as new value

2015-12-01 Thread John Hewson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson updated PDFBOX-3138:

Fix Version/s: 2.1.0

> PDTextField doesn't accept any Hebrew characters as new value
> -
>
> Key: PDFBOX-3138
> URL: https://issues.apache.org/jira/browse/PDFBOX-3138
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm, FontBox
>Affects Versions: 2.0.0
> Environment: Eclipse 4.2.2, Windows 7 Pro, JRE 1.8.0_05
>Reporter: Gilad Denneboom
> Fix For: 2.1.0
>
> Attachments: SetHebrewFieldValueTest.java, Test.pdf, Test.txt
>
>
> Trying to set a UTF-8 encoded Hebrew string as the value of a PDTextField 
> fails with the following exception:
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: No glyph for 
> U+05D7 in font AdobeHebrew-Regular
>   at 
> org.apache.pdfbox.pdmodel.font.PDType1CFont.encode(PDType1CFont.java:300)
>   at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:283)
>   at 
> org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:341)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PlainTextFormatter.format(PlainTextFormatter.java:213)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.insertGeneratedAppearance(AppearanceGeneratorHelper.java:373)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceContent(AppearanceGeneratorHelper.java:237)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceValue(AppearanceGeneratorHelper.java:144)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTextField.constructAppearances(PDTextField.java:263)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTerminalField.applyChange(PDTerminalField.java:221)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTextField.setValue(PDTextField.java:218)
>   at SetHebrewFieldValueTest.main(SetHebrewFieldValueTest.java:22)
> {code}
> I've tried using multiple fonts for the field, all of which can handle Hebrew 
> characters just fine, and got the same results in all of them.
> See attached files for a demonstration of the issue.



[jira] [Commented] (PDFBOX-3138) PDTextField doesn't accept any Hebrew characters as new value

2015-12-01 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034142#comment-15034142
 ] 

John Hewson commented on PDFBOX-3138:
-

The embedded font used by the field does indeed contain Hebrew glyphs, and a 
valid "cmap" table which can be used to look up those glyphs. The mentioned 
character, U+05D7, is indeed present in the font. 

The embedded font file is in OpenType format, however the PDF Font dictionary 
is Type1 and specifies WinAnsiEncoding, which does not include Hebrew 
characters. So, strictly speaking, the field cannot be filled using any 
non-ANSI characters and so PDFBox's behaviour is correct.

It would seem that PDFBox could do something more helpful in this instance. 
Filling the form with Acrobat results in the font from the form's DR being 
overridden in the Field itself with a new CIDFontType0 which has been created 
from the DR font. Ideally we would do that.

Do you have any control over the software producing these fields? I might be 
able to offer a workaround.
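
The encoding limitation described above can be checked with a minimal sketch. It uses the JDK's windows-1252 charset as a stand-in for WinAnsiEncoding, which is only an approximation (the two differ in a few code points), and it does not call PDFBox at all; the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical sketch: windows-1252 is used as an approximation of WinAnsiEncoding.
final class WinAnsiCoverageSketch {

    // Returns true if every character of 'text' has a windows-1252 code.
    static boolean encodable(String text) {
        CharsetEncoder encoder = Charset.forName("windows-1252").newEncoder();
        return encoder.canEncode(text);
    }

    public static void main(String[] args) {
        System.out.println(encodable("Hello"));   // Latin text fits the encoding
        System.out.println(encodable("\u05D7"));  // HEBREW LETTER HET has no code
    }
}
```

A single-byte Western encoding simply has no slot for U+05D7, regardless of which glyphs the embedded font program contains.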

> PDTextField doesn't accept any Hebrew characters as new value
> -
>
> Key: PDFBOX-3138
> URL: https://issues.apache.org/jira/browse/PDFBOX-3138
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm, FontBox
>Affects Versions: 2.0.0
> Environment: Eclipse 4.2.2, Windows 7 Pro, JRE 1.8.0_05
>Reporter: Gilad Denneboom
>Priority: Minor
> Fix For: 2.1.0
>
> Attachments: SetHebrewFieldValueTest.java, Test.pdf, Test.txt
>
>
> Trying to set a UTF-8 encoded Hebrew string as the value of a PDTextField 
> fails with the following exception:
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: No glyph for 
> U+05D7 in font AdobeHebrew-Regular
>   at 
> org.apache.pdfbox.pdmodel.font.PDType1CFont.encode(PDType1CFont.java:300)
>   at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:283)
>   at 
> org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:341)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PlainTextFormatter.format(PlainTextFormatter.java:213)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.insertGeneratedAppearance(AppearanceGeneratorHelper.java:373)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceContent(AppearanceGeneratorHelper.java:237)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceValue(AppearanceGeneratorHelper.java:144)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTextField.constructAppearances(PDTextField.java:263)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTerminalField.applyChange(PDTerminalField.java:221)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTextField.setValue(PDTextField.java:218)
>   at SetHebrewFieldValueTest.main(SetHebrewFieldValueTest.java:22)
> {code}
> I've tried using multiple fonts for the field, all of which can handle Hebrew 
> characters just fine, and got the same results in all of them.
> See attached files for a demonstration of the issue.



[jira] [Commented] (PDFBOX-3144) NullPointerException in TTFSubsetter

2015-12-01 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034068#comment-15034068
 ] 

John Hewson commented on PDFBOX-3144:
-

{quote}
The available tables of the TTF are only: [fpgm, head, cvt , glyf, loca, gasp, 
hmtx, prep, hhea, maxp]
{quote}

There's your problem. Without a "cmap" table, PDFBox can't use the font.
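
One way to confirm which tables a font program carries is to read the sfnt table directory directly. The sketch below is not PDFBox or FontBox code; the class and method names are hypothetical. It assumes the standard TrueType/OpenType header layout: a 12-byte header whose second field is the table count, followed by 16-byte records that each start with a 4-byte tag.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: lists the table tags in an sfnt (TrueType/OpenType) header.
final class SfntTableListSketch {

    static List<String> tableTags(byte[] font) {
        ByteBuffer buf = ByteBuffer.wrap(font); // big-endian by default
        buf.getInt();                            // sfnt version, ignored here
        int numTables = buf.getShort() & 0xFFFF;
        buf.position(12);                        // skip searchRange, entrySelector, rangeShift
        List<String> tags = new ArrayList<>(numTables);
        byte[] tag = new byte[4];
        for (int i = 0; i < numTables; i++) {
            buf.get(tag);
            tags.add(new String(tag, StandardCharsets.US_ASCII));
            buf.position(buf.position() + 12);   // skip checksum, offset, length
        }
        return tags;
    }

    // True if the table directory lists a "cmap" table.
    static boolean hasCmap(byte[] font) {
        return tableTags(font).contains("cmap");
    }
}
```

For the font in this report, such a check would be expected to return false for `hasCmap`, given the table list quoted above.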

> NullPointerException in TTFSubsetter
> 
>
> Key: PDFBOX-3144
> URL: https://issues.apache.org/jira/browse/PDFBOX-3144
> Project: PDFBox
>  Issue Type: Bug
>  Components: FontBox
>Affects Versions: 2.0.0
> Environment: Version 2.0.0-RC2
>Reporter: Philip Helger
>
> An NPE happens in "public void TTFSubsetter.add(int unicode)" because the 
> "unicodeCmap" member is null.
> This might be because the passed "ttf" member is based on a 
> "MemoryTTFDataStream" and has only 38 glyphs (so it might already be a 
> subset). The available tables of the TTF are only: [fpgm, head, cvt , glyf, 
> loca, gasp, hmtx, prep, hhea, maxp]
> The variables of the underlying font are:
> {code}
> this  PDType0Font  (id=58)
>   afmStandard14   null
>   avgFontWidth    0.0
>   cMap            CMap  -> Identity-H
>   cMapUCS2        null
>   descendantFont  PDCIDFontType2  (id=155)
>   dict            COSDictionary  -> COSDictionary{(COSName{Type}:COSName{Font}) 
> (COSName{BaseFont}:COSName{AAAMSE+OpenSans-Bold}) 
> (COSName{Subtype}:COSName{Type0}) (COSName{Encoding}:COSName{Identity-H}) 
> (COSName{DescendantFonts}:COSArray{[COSDictionary{(COSName{Type}:COSName{Font})
>  (COSName{Subtype}:COSName{CIDFontType2}) 
> (COSName{BaseFont}:COSName{AAAMSE+OpenSans-Bold}) 
> (COSName{CIDSystemInfo}:COSDictionary{(COSName{Registry}:COSString{Adobe}) 
> (COSName{Ordering}:COSString{Identity}) (COSName{Supplement}:COSInt{0}) }) 
> (COSName{FontDescriptor}:COSDictionary{(COSName{Type}:COSName{FontDescriptor})
>  (COSName{FontName}:COSName{AAAMSE+OpenSans-Bold}) (COSName{Flags}:COSInt{4}) 
> (COSName{FontWeight}:COSFloat{700.0}) (COSName{ItalicAngle}:COSFloat{0.0}) 
> (COSName{FontBBox}:COSArray{[COSFloat{-619.1406}, COSFloat{-292.96875}, 
> COSFloat{1318.8477}, COSFloat{1068.8477}]}) 
> (COSName{Ascent}:COSFloat{1068.8477}) (COSName{Descent}:COSFloat{-292.96875}) 
> (COSName{CapHeight}:COSFloat{713.8672}) 
> (COSName{XHeight}:COSFloat{545.89844}) (COSName{StemV}:COSFloat{251.93846}) 
> (COSName{FontFile2}:COSStream{(COSName{Filter}:COSName{FlateDecode}) 
> (COSName{Length}:COSInt{5625}) (COSName{Length1}:COSInt{8036}) }) 
> (COSName{CIDSet}:COSStream{(COSName{Filter}:COSName{FlateDecode}) 
> (COSName{Length}:COSInt{20}) }) }) (COSName{W}:COSArray{[COSInt{3}, 
> COSArray{[COSInt{260}]}, COSInt{68}, COSArray{[COSInt{604}, COSInt{633}, 
> COSInt{514}, COSInt{633}, COSInt{591}]}, COSInt{74}, COSArray{[COSInt{565}, 
> COSInt{657}, COSInt{305}]}, COSInt{15}, COSArray{[COSInt{290}]}, COSInt{79}, 
> COSArray{[COSInt{305}]}, COSInt{16}, COSArray{[COSInt{322}]}, COSInt{80}, 
> COSArray{[COSInt{982}]}, COSInt{17}, COSArray{[COSInt{285}]}, COSInt{81}, 
> COSArray{[COSInt{657}, COSInt{619}]}, COSInt{19}, COSArray{[COSInt{571}]}, 
> COSInt{83}, COSArray{[COSInt{633}]}, COSInt{20}, COSArray{[COSInt{571}]}, 
> COSInt{85}, COSArray{[COSInt{454}, COSInt{497}, COSInt{434}, COSInt{657}]}, 
> COSInt{27}, COSArray{[COSInt{571}, COSInt{571}, COSInt{285}]}, COSInt{93}, 
> COSArray{[COSInt{488}]}, COSInt{36}, COSArray{[COSInt{690}, COSInt{672}]}, 
> COSInt{40}, COSArray{[COSInt{560}]}, COSInt{48}, COSArray{[COSInt{943}, 
> COSInt{813}]}, COSInt{53}, COSArray{[COSInt{660}, COSInt{551}, COSInt{579}, 
> COSInt{756}]}, COSInt{61}, COSArray{[COSInt{579}]}]}) 
> (COSName{CIDToGIDMap}:COSStream{(COSName{Filter}:COSName{FlateDecode}) 
> (COSName{Length}:COSInt{84}) (COSName{Length1}:COSInt{188}) }) }]}) 
> (COSName{ToUnicode}:COSStream{(COSName{Filter}:COSName{FlateDecode}) 
> (COSName{Length}:COSInt{324}) }) }
>   embedder          PDCIDFontType2Embedder
>   fontDescriptor    null
>   fontWidthOfSpace  -1.0
>   isCMapPredefined  true
>   isDescendantCJK   false
>   noUnicode         HashSet  -> empty
>   toUnicodeCMap     null
>   widths            null
> {code}
> I will try to find a minimal example that reproduces this. Currently it 
> is only reproducible as part of a bigger package :|



[jira] [Commented] (PDFBOX-3030) Enhance documentation for PDFBox 2.0.0

2015-12-01 Thread JIRA

[ 
https://issues.apache.org/jira/browse/PDFBOX-3030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034088#comment-15034088
 ] 

Andreas Lehmkühler commented on PDFBOX-3030:


I've already reduced the usage to a minimum, see PDFBOX-3137

> Enhance documentation for PDFBox 2.0.0
> --
>
> Key: PDFBOX-3030
> URL: https://issues.apache.org/jira/browse/PDFBOX-3030
> Project: PDFBox
>  Issue Type: Task
>  Components: Documentation
>Affects Versions: 2.0.0
>Reporter: Maruan Sahyoun
>Assignee: Maruan Sahyoun
> Attachments: TGH-16862c48-6b0b-410e-8fc6-b1d9f4418ecc.htm
>
>
> Task to track enhancements to the documentation or website as part of PDFBox 
> 2.0.0
> - update javadoc (current as of writing)
> - migration guide 



[jira] [Commented] (PDFBOX-3141) Link annotation borders not rendered

2015-12-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034094#comment-15034094
 ] 

ASF subversion and git services commented on PDFBOX-3141:
-

Commit 1717471 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1717471 ]

PDFBOX-3141: imports

> Link annotation borders not rendered
> 
>
> Key: PDFBOX-3141
> URL: https://issues.apache.org/jira/browse/PDFBOX-3141
> Project: PDFBox
>  Issue Type: Bug
>  Components: PDModel, Rendering
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
>Assignee: Tilman Hausherr
> Attachments: annots.pdf, annots.pdf-1-NEW.png, annots.pdf-1.png
>
>
> Borders in link annotations are not rendered.



[jira] [Commented] (PDFBOX-3131) Reduce amount of intermediate data and objects to reduce memory footprint/complexity

2015-12-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034110#comment-15034110
 ] 

ASF subversion and git services commented on PDFBOX-3131:
-

Commit 1717472 from [~lehmi] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1717472 ]

PDFBOX-3131: don't copy data

> Reduce amount of intermediate data and objects to reduce memory 
> footprint/complexity
> 
>
> Key: PDFBOX-3131
> URL: https://issues.apache.org/jira/browse/PDFBOX-3131
> Project: PDFBox
>  Issue Type: Improvement
>  Components: FontBox
>Affects Versions: 2.0.0
>Reporter: Andreas Lehmkühler
>Assignee: Andreas Lehmkühler
> Fix For: 2.0.0
>
>
> The CFFParser holds a lot of intermediate data and produces a lot of objects 
> to do so. The idea is to reduce the amount of such objects and data to reduce 
> the memory footprint and the complexity.
> - the class IndexData holds intermediate data and creates a byte array every 
> time getBytes is called. I'm going to replace the class with a simple list to 
> reduce the memory footprint and the complexity
> - remove unused members of private classes
> - create a list of strings instead of a list of byte arrays which is used to 
> create those strings



[jira] [Resolved] (PDFBOX-3141) Link annotation borders not rendered

2015-12-01 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr resolved PDFBOX-3141.
-
   Resolution: Fixed
Fix Version/s: 2.0.0

> Link annotation borders not rendered
> 
>
> Key: PDFBOX-3141
> URL: https://issues.apache.org/jira/browse/PDFBOX-3141
> Project: PDFBox
>  Issue Type: Bug
>  Components: PDModel, Rendering
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
>Assignee: Tilman Hausherr
>  Labels: Annotations
> Fix For: 2.0.0
>
> Attachments: annots.pdf, annots.pdf-1-NEW.png, annots.pdf-1.png
>
>
> Borders in link annotations are not rendered.



[jira] [Commented] (PDFBOX-3133) PDFBox 2.0.0-RC2 and earlier 2.0.0 SNAPSHOT Versions print performance is poor with systems having low RAM < 3GB and lower number of fonts.

2015-12-01 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034039#comment-15034039
 ] 

Tilman Hausherr commented on PDFBOX-3133:
-

I can do it, but I need to find a time window where the PC can be powered off.

I already tested low -Xmx values (50m) some days ago (when we exchanged mails) 
and it worked fine. I just retested with PDFToImage and no resolution parameter 
and it took about 5 seconds, maybe 6 with low memory. PC is 6 years old.

> PDFBox 2.0.0-RC2 and earlier 2.0.0 SNAPSHOT Versions print performance is 
> poor with systems having low RAM < 3GB and lower number of fonts.
> ---
>
> Key: PDFBOX-3133
> URL: https://issues.apache.org/jira/browse/PDFBOX-3133
> Project: PDFBox
>  Issue Type: Improvement
>  Components: PDModel
>Affects Versions: 2.0.0
> Environment: MS Windows Systems with low RAM < 3GB and number of 
> fonts were less < 592 (or if desired fonts in PDF to be printed are not 
> available in local system ) 
>Reporter: Sridhar
>Assignee: John Hewson
>  Labels: performance
> Fix For: 2.0.0
>
>
> Printing with PDFBox 2.0.0-RC1, the SNAPSHOT builds and RC2 takes 15+ seconds.
> Steps to reproduce
> -- 
> Use a Windows system with < 3 GB RAM.
> Use a system with few installed fonts, or one lacking the specific fonts used 
> in the PDF file to be printed.
> Printing the PDF file:
> Took 14 to 20 seconds on a system with 3 GB RAM and 522 fonts
> Took 24 to 34 seconds on a system with 2 GB RAM and 90 fonts
> Took only 2.5 seconds on a system with 8 GB RAM and 1025 fonts.
> Doubt
>  
> I have not browsed the code, but I suspect the following is causing the 
> performance issue.
> The code does cache fonts, storing them in a local .pdfbox.cache file the 
> first time for use on subsequent runs.
> It is not clear whether the code updates the cache file when a later PDF file 
> to be printed uses new fonts.
> If the fonts in the PDF file to be printed are not in the .pdfbox.cache file 
> stored on the local system, what is the behaviour?  Will the code download 
> fonts and update the cache for subsequent runs, or is it limited to the fonts 
> available on the local system?  It looks like the latter is the case, and 
> performance suffers either because of low RAM, because the font cache is not 
> kept up to date, or because fonts are unavailable on the local system.



[jira] [Commented] (PDFBOX-3141) Annotation borders not rendered

2015-12-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034057#comment-15034057
 ] 

ASF subversion and git services commented on PDFBOX-3141:
-

Commit 1717463 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1717463 ]

PDFBOX-3141: add getter/setter /Border

> Annotation borders not rendered
> ---
>
> Key: PDFBOX-3141
> URL: https://issues.apache.org/jira/browse/PDFBOX-3141
> Project: PDFBox
>  Issue Type: Bug
>  Components: PDModel, Rendering
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
>Assignee: Tilman Hausherr
> Attachments: annots.pdf, annots.pdf-1.png
>
>
> Borders in annotations are not rendered.



[jira] [Commented] (PDFBOX-3141) Annotation borders not rendered

2015-12-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034064#comment-15034064
 ] 

ASF subversion and git services commented on PDFBOX-3141:
-

Commit 1717464 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1717464 ]

PDFBOX-3141: render border of link annotations

> Annotation borders not rendered
> ---
>
> Key: PDFBOX-3141
> URL: https://issues.apache.org/jira/browse/PDFBOX-3141
> Project: PDFBox
>  Issue Type: Bug
>  Components: PDModel, Rendering
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
>Assignee: Tilman Hausherr
> Attachments: annots.pdf, annots.pdf-1.png
>
>
> Borders in annotations are not rendered.



[jira] [Comment Edited] (PDFBOX-3133) PDFBox 2.0.0-RC2 and earlier 2.0.0 SNAPSHOT Versions print performance is poor with systems having low RAM < 3GB and lower number of fonts.

2015-12-01 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034060#comment-15034060
 ] 

John Hewson edited comment on PDFBOX-3133 at 12/1/15 4:59 PM:
--

I'm not seeing any evidence that the number of fonts makes a difference. The 
only number given in the previous test results is "90 fonts". Those results 
seem to show that more RAM results in faster rendering, up to 4GB, which makes 
perfect sense.

Also, what -Xmx are you using for those machines? That's almost as important as 
how much RAM is installed.
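
To answer the -Xmx question on a given machine, the effective heap limit can be read from inside the JVM. This is a minimal sketch using only the standard `Runtime` API; the class and method names are hypothetical and nothing here is PDFBox-specific.

```java
// Hypothetical sketch: reports the heap limit the running JVM was started with.
final class HeapLimitSketch {

    // Maximum heap the JVM will attempt to use, in MiB.
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("Max heap (MiB): " + maxHeapMiB());
    }
}
```

Running it once with the default settings and once with an explicit option such as `java -Xmx512m HeapLimitSketch` shows how far the default limit is from the installed RAM.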


was (Author: jahewson):
I'm not seeing any evidence that the number of fonts makes a difference. The 
only number given in the previous test results is "90 fonts". Those results 
seem to show that more RAM results in faster rendering, up to 4GB, which makes 
perfect sense.

> PDFBox 2.0.0-RC2 and earlier 2.0.0 SNAPSHOT Versions print performance is 
> poor with systems having low RAM < 3GB and lower number of fonts.
> ---
>
> Key: PDFBOX-3133
> URL: https://issues.apache.org/jira/browse/PDFBOX-3133
> Project: PDFBox
>  Issue Type: Improvement
>  Components: PDModel
>Affects Versions: 2.0.0
> Environment: MS Windows Systems with low RAM < 3GB and number of 
> fonts were less < 592 (or if desired fonts in PDF to be printed are not 
> available in local system ) 
>Reporter: Sridhar
>Assignee: John Hewson
>  Labels: performance
> Fix For: 2.0.0
>
>
> Printing with PDFBox 2.0.0-RC1, the SNAPSHOT builds and RC2 takes 15+ seconds.
> Steps to reproduce
> -- 
> Use a Windows system with < 3 GB RAM.
> Use a system with few installed fonts, or one lacking the specific fonts used 
> in the PDF file to be printed.
> Printing the PDF file:
> Took 14 to 20 seconds on a system with 3 GB RAM and 522 fonts
> Took 24 to 34 seconds on a system with 2 GB RAM and 90 fonts
> Took only 2.5 seconds on a system with 8 GB RAM and 1025 fonts.
> Doubt
>  
> I have not browsed the code, but I suspect the following is causing the 
> performance issue.
> The code does cache fonts, storing them in a local .pdfbox.cache file the 
> first time for use on subsequent runs.
> It is not clear whether the code updates the cache file when a later PDF file 
> to be printed uses new fonts.
> If the fonts in the PDF file to be printed are not in the .pdfbox.cache file 
> stored on the local system, what is the behaviour?  Will the code download 
> fonts and update the cache for subsequent runs, or is it limited to the fonts 
> available on the local system?  It looks like the latter is the case, and 
> performance suffers either because of low RAM, because the font cache is not 
> kept up to date, or because fonts are unavailable on the local system.



[jira] [Assigned] (PDFBOX-3144) NullPointerException in TTFSubsetter

2015-12-01 Thread John Hewson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson reassigned PDFBOX-3144:
---

Assignee: John Hewson

> NullPointerException in TTFSubsetter
> 
>
> Key: PDFBOX-3144
> URL: https://issues.apache.org/jira/browse/PDFBOX-3144
> Project: PDFBox
>  Issue Type: Bug
>  Components: FontBox
>Affects Versions: 2.0.0
> Environment: Version 2.0.0-RC2
>Reporter: Philip Helger
>Assignee: John Hewson
>
> An NPE happens in "public void TTFSubsetter.add(int unicode)" because the 
> "unicodeCmap" member is null.
> This might be because the passed "ttf" member is based on a 
> "MemoryTTFDataStream" and has only 38 glyphs (so it might already be a 
> subset). The available tables of the TTF are only: [fpgm, head, cvt , glyf, 
> loca, gasp, hmtx, prep, hhea, maxp]
> The variables of the underlying font are:
> {code}
> this  PDType0Font  (id=58)
>   afmStandard14   null
>   avgFontWidth    0.0
>   cMap            CMap  -> Identity-H
>   cMapUCS2        null
>   descendantFont  PDCIDFontType2  (id=155)
>   dict            COSDictionary  -> COSDictionary{(COSName{Type}:COSName{Font}) 
> (COSName{BaseFont}:COSName{AAAMSE+OpenSans-Bold}) 
> (COSName{Subtype}:COSName{Type0}) (COSName{Encoding}:COSName{Identity-H}) 
> (COSName{DescendantFonts}:COSArray{[COSDictionary{(COSName{Type}:COSName{Font})
>  (COSName{Subtype}:COSName{CIDFontType2}) 
> (COSName{BaseFont}:COSName{AAAMSE+OpenSans-Bold}) 
> (COSName{CIDSystemInfo}:COSDictionary{(COSName{Registry}:COSString{Adobe}) 
> (COSName{Ordering}:COSString{Identity}) (COSName{Supplement}:COSInt{0}) }) 
> (COSName{FontDescriptor}:COSDictionary{(COSName{Type}:COSName{FontDescriptor})
>  (COSName{FontName}:COSName{AAAMSE+OpenSans-Bold}) (COSName{Flags}:COSInt{4}) 
> (COSName{FontWeight}:COSFloat{700.0}) (COSName{ItalicAngle}:COSFloat{0.0}) 
> (COSName{FontBBox}:COSArray{[COSFloat{-619.1406}, COSFloat{-292.96875}, 
> COSFloat{1318.8477}, COSFloat{1068.8477}]}) 
> (COSName{Ascent}:COSFloat{1068.8477}) (COSName{Descent}:COSFloat{-292.96875}) 
> (COSName{CapHeight}:COSFloat{713.8672}) 
> (COSName{XHeight}:COSFloat{545.89844}) (COSName{StemV}:COSFloat{251.93846}) 
> (COSName{FontFile2}:COSStream{(COSName{Filter}:COSName{FlateDecode}) 
> (COSName{Length}:COSInt{5625}) (COSName{Length1}:COSInt{8036}) }) 
> (COSName{CIDSet}:COSStream{(COSName{Filter}:COSName{FlateDecode}) 
> (COSName{Length}:COSInt{20}) }) }) (COSName{W}:COSArray{[COSInt{3}, 
> COSArray{[COSInt{260}]}, COSInt{68}, COSArray{[COSInt{604}, COSInt{633}, 
> COSInt{514}, COSInt{633}, COSInt{591}]}, COSInt{74}, COSArray{[COSInt{565}, 
> COSInt{657}, COSInt{305}]}, COSInt{15}, COSArray{[COSInt{290}]}, COSInt{79}, 
> COSArray{[COSInt{305}]}, COSInt{16}, COSArray{[COSInt{322}]}, COSInt{80}, 
> COSArray{[COSInt{982}]}, COSInt{17}, COSArray{[COSInt{285}]}, COSInt{81}, 
> COSArray{[COSInt{657}, COSInt{619}]}, COSInt{19}, COSArray{[COSInt{571}]}, 
> COSInt{83}, COSArray{[COSInt{633}]}, COSInt{20}, COSArray{[COSInt{571}]}, 
> COSInt{85}, COSArray{[COSInt{454}, COSInt{497}, COSInt{434}, COSInt{657}]}, 
> COSInt{27}, COSArray{[COSInt{571}, COSInt{571}, COSInt{285}]}, COSInt{93}, 
> COSArray{[COSInt{488}]}, COSInt{36}, COSArray{[COSInt{690}, COSInt{672}]}, 
> COSInt{40}, COSArray{[COSInt{560}]}, COSInt{48}, COSArray{[COSInt{943}, 
> COSInt{813}]}, COSInt{53}, COSArray{[COSInt{660}, COSInt{551}, COSInt{579}, 
> COSInt{756}]}, COSInt{61}, COSArray{[COSInt{579}]}]}) 
> (COSName{CIDToGIDMap}:COSStream{(COSName{Filter}:COSName{FlateDecode}) 
> (COSName{Length}:COSInt{84}) (COSName{Length1}:COSInt{188}) }) }]}) 
> (COSName{ToUnicode}:COSStream{(COSName{Filter}:COSName{FlateDecode}) 
> (COSName{Length}:COSInt{324}) }) }
>   embedder          PDCIDFontType2Embedder
>   fontDescriptor    null
>   fontWidthOfSpace  -1.0
>   isCMapPredefined  true
>   isDescendantCJK   false
>   noUnicode         HashSet  -> empty
>   toUnicodeCMap     null
>   widths            null
> {code}
> I will try to find a minimal example that reproduces this. Currently it 
> is only reproducible as part of a bigger package :|






Build failed in Jenkins: PDFBox-trunk » Apache PDFBox #2642

2015-12-01 Thread Apache Jenkins Server
See 


Changes:

[tilman] PDFBOX-3141: render border of link annotations

[tilman] PDFBOX-3141: add getter/setter /Border

--
[INFO] 
[INFO] 
[INFO] Building Apache PDFBox 2.0.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ pdfbox ---
[TASKS] Scanning folder 
' for 
files matching the pattern '**/*.java' - excludes: 
[TASKS] Found 631 files to scan for tasks
Found 221 open tasks.
[TASKS] Computing warning deltas based on reference build #2641
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ pdfbox ---
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ pdfbox ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] Copying 21 resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ pdfbox ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 547 source files to 

[INFO] -
[WARNING] COMPILATION WARNING : 
[INFO] -
[WARNING] bootstrap class path not set in conjunction with -source 1.6
[WARNING] 
:[268,27]
 getHeight(int) in org.apache.pdfbox.pdmodel.font.PDFontLike has been deprecated
[WARNING] 
:[291,27]
 getHeight(int) in org.apache.pdfbox.pdmodel.font.PDFontLike has been deprecated
[WARNING] 
:
 Some input files use unchecked or unsafe operations.
[WARNING] 
:
 Recompile with -Xlint:unchecked for details.
[INFO] 5 warnings 
[INFO] -
[INFO] -
[ERROR] COMPILATION ERROR : 
[INFO] -
[ERROR] 
:[853,43]
 cannot find symbol
  symbol:   class PDAnnotationLink
  location: class org.apache.pdfbox.rendering.PageDrawer
[ERROR] 
:[847,35]
 cannot find symbol
  symbol:   class PDAnnotationLink
  location: class org.apache.pdfbox.rendering.PageDrawer
[ERROR] 
:[849,39]
 cannot find symbol
  symbol:   class PDAnnotationLink
  location: class org.apache.pdfbox.rendering.PageDrawer
[ERROR] 
:[859,9]
 cannot find symbol
  symbol:   class COSArray
  location: class org.apache.pdfbox.rendering.PageDrawer
[ERROR] 
:[860,9]
 cannot find symbol
  symbol:   class PDBorderStyleDictionary
  location: class org.apache.pdfbox.rendering.PageDrawer
[ERROR] 
:[866,42]
 cannot find symbol
  symbol:   class COSNumber
  location: class org.apache.pdfbox.rendering.PageDrawer
[ERROR] 
:[868,27]
 cannot find symbol
  symbol:   class COSNumber
  location: class org.apache.pdfbox.rendering.PageDrawer
[ERROR] 
:[872,17]
 cannot find symbol
  symbol:   class COSBase
  location: class org.apache.pdfbox.rendering.PageDrawer
[ERROR] 

[jira] [Updated] (PDFBOX-3141) Link annotation borders not rendered

2015-12-01 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-3141:

Description: Borders in link annotations are not rendered.  (was: Borders 
in annotations are not rendered.)

> Link annotation borders not rendered
> 
>
> Key: PDFBOX-3141
> URL: https://issues.apache.org/jira/browse/PDFBOX-3141
> Project: PDFBox
>  Issue Type: Bug
>  Components: PDModel, Rendering
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
>Assignee: Tilman Hausherr
> Attachments: annots.pdf, annots.pdf-1.png
>
>
> Borders in link annotations are not rendered.






[jira] [Commented] (PDFBOX-3141) Annotation borders not rendered

2015-12-01 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034082#comment-15034082
 ] 

Tilman Hausherr commented on PDFBOX-3141:
-

Improved in my file list:
- annots.pdf
- example.pdf
- PDFBOX-1325.pdf
- PDFBOX-1606.pdf p3
- PDFBOX-2019-Annotations.pdf
- PDFBOX-2898.pdf
- PDFBOX-2348.pdf p5

About the last commit:
- This issue is only about link annotations. I haven't found an example of 
another type of annotation. I am changing the title accordingly.
- Border styles "Beveled" and "Inset" are not supported; "Solid" is used 
instead. I tried with Adobe Reader, and it doesn't do anything different either.
- Drawing rounded rectangles is not supported.
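
The fallback described in the list above could be sketched roughly as follows. This is only an illustration, not the code from the commit: the class `BorderStyleFallback`, the enum, and both method names are hypothetical. The one-letter values (S, D, B, I, U) are the border style names from the PDF border style dictionary, where a missing /S entry means Solid.

```java
class BorderStyleFallback {
    /** Border styles named by the /S entry of a border style dictionary. */
    enum Style { SOLID, DASHED, BEVELED, INSET, UNDERLINE }

    /** Maps the one-letter /S value to a style; missing or unknown values become SOLID. */
    static Style parse(String s) {
        if (s == null) {
            return Style.SOLID;
        }
        switch (s) {
            case "D": return Style.DASHED;
            case "B": return Style.BEVELED;
            case "I": return Style.INSET;
            case "U": return Style.UNDERLINE;
            default:  return Style.SOLID;
        }
    }

    /** Style that is actually drawn: BEVELED and INSET are rendered as SOLID. */
    static Style effective(Style requested) {
        switch (requested) {
            case BEVELED:
            case INSET:
                return Style.SOLID;
            default:
                return requested;
        }
    }
}
```

Keeping the requested style and the drawn style as separate steps means the dictionary value is preserved even though the renderer substitutes a simpler style.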

Re: annotations, we're really just at the beginning; try rendering 
http://www.pdfill.com/example/pdf_commenting_new.pdf .

@Maruan, could you please look at the 2.0 spec and:
- check whether they use "color" or "colour" and ask your contacts to use only 
one of these.
- check whether, in annotations, the default color (black) is now mentioned if 
/C is missing.







> Annotation borders not rendered
> ---
>
> Key: PDFBOX-3141
> URL: https://issues.apache.org/jira/browse/PDFBOX-3141
> Project: PDFBox
>  Issue Type: Bug
>  Components: PDModel, Rendering
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
>Assignee: Tilman Hausherr
> Attachments: annots.pdf, annots.pdf-1.png
>
>
> Borders in annotations are not rendered.






[jira] [Resolved] (PDFBOX-3139) Custom FontMapper cant be used

2015-12-01 Thread John Hewson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson resolved PDFBOX-3139.
-
Resolution: Fixed

> Custom FontMapper cant be used
> --
>
> Key: PDFBOX-3139
> URL: https://issues.apache.org/jira/browse/PDFBOX-3139
> Project: PDFBox
>  Issue Type: Bug
>  Components: Utilities
>Affects Versions: 2.0.0
>Reporter: simon steiner
>Assignee: John Hewson
> Fix For: 2.0.0
>
>
> CIDFontMapping and FontMapping have private constructors, so I'm not sure how 
> you can even use this FontMapper.






[jira] [Updated] (PDFBOX-3139) Custom FontMapper cant be used

2015-12-01 Thread John Hewson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson updated PDFBOX-3139:

Fix Version/s: 2.0.0

> Custom FontMapper cant be used
> --
>
> Key: PDFBOX-3139
> URL: https://issues.apache.org/jira/browse/PDFBOX-3139
> Project: PDFBox
>  Issue Type: Bug
>  Components: Utilities
>Affects Versions: 2.0.0
>Reporter: simon steiner
> Fix For: 2.0.0
>
>
> CIDFontMapping and FontMapping have private constructors, so I'm not sure how 
> you can even use this FontMapper.






[jira] [Commented] (PDFBOX-3139) Custom FontMapper cant be used

2015-12-01 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034093#comment-15034093
 ] 

John Hewson commented on PDFBOX-3139:
-

Fixed. If you have any problems using the new FontMapper APIs, just let me know.

> Custom FontMapper cant be used
> --
>
> Key: PDFBOX-3139
> URL: https://issues.apache.org/jira/browse/PDFBOX-3139
> Project: PDFBox
>  Issue Type: Bug
>  Components: Utilities
>Affects Versions: 2.0.0
>Reporter: simon steiner
> Fix For: 2.0.0
>
>
> CIDFontMapping and FontMapping have private constructors, so I'm not sure how 
> you can even use this FontMapper.






[jira] [Commented] (PDFBOX-3139) Custom FontMapper cant be used

2015-12-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034091#comment-15034091
 ] 

ASF subversion and git services commented on PDFBOX-3139:
-

Commit 1717470 from [~jahewson] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1717470 ]

PDFBOX-3139: Make FontMapping public

> Custom FontMapper cant be used
> --
>
> Key: PDFBOX-3139
> URL: https://issues.apache.org/jira/browse/PDFBOX-3139
> Project: PDFBox
>  Issue Type: Bug
>  Components: Utilities
>Affects Versions: 2.0.0
>Reporter: simon steiner
> Fix For: 2.0.0
>
>
> CIDFontMapping and FontMapping have private constructors, so I'm not sure how 
> you can even use this FontMapper.






[jira] [Updated] (PDFBOX-3138) PDTextField doesn't accept any Hebrew characters as new value

2015-12-01 Thread John Hewson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson updated PDFBOX-3138:

Attachment: Test-3-filled.pdf

I've attached the same form filled by Acrobat.

> PDTextField doesn't accept any Hebrew characters as new value
> -
>
> Key: PDFBOX-3138
> URL: https://issues.apache.org/jira/browse/PDFBOX-3138
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm, FontBox
>Affects Versions: 2.0.0
> Environment: Eclipse 4.2.2, Windows 7 Pro, JRE 1.8.0_05
>Reporter: Gilad Denneboom
> Fix For: 2.1.0
>
> Attachments: SetHebrewFieldValueTest.java, Test-3-filled.pdf, 
> Test.pdf, Test.txt
>
>
> Trying to set a UTF-8 encoded Hebrew string as the value of a PDTextField 
> fails with the following exception:
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: No glyph for 
> U+05D7 in font AdobeHebrew-Regular
>   at 
> org.apache.pdfbox.pdmodel.font.PDType1CFont.encode(PDType1CFont.java:300)
>   at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:283)
>   at 
> org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:341)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PlainTextFormatter.format(PlainTextFormatter.java:213)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.insertGeneratedAppearance(AppearanceGeneratorHelper.java:373)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceContent(AppearanceGeneratorHelper.java:237)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceValue(AppearanceGeneratorHelper.java:144)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTextField.constructAppearances(PDTextField.java:263)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTerminalField.applyChange(PDTerminalField.java:221)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTextField.setValue(PDTextField.java:218)
>   at SetHebrewFieldValueTest.main(SetHebrewFieldValueTest.java:22)
> {code}
> I've tried using multiple fonts for the field, all of which can handle Hebrew 
> characters just fine, and got the same results in all of them.
> See attached files for a demonstration of the issue.






[jira] [Commented] (PDFBOX-3024) Preflight validation call PDType0Font.clear at the wrong time

2015-12-01 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034153#comment-15034153
 ] 

John Hewson commented on PDFBOX-3024:
-

It's worth mentioning that CID 0 is the "missing CID", just like GID 0 is the 
missing glyph.
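
As a small illustration of that convention, here is a sketch of a lookup that reserves index 0 for "missing". The class `MissingGlyphLookup` and its methods are made up for this example and are not part of PDFBox or FontBox.

```java
import java.util.HashMap;
import java.util.Map;

class MissingGlyphLookup {
    /** Index 0 is reserved: CID 0 is the missing CID, GID 0 the missing glyph. */
    static final int NOTDEF = 0;

    private final Map<Integer, Integer> cidToGid = new HashMap<>();

    void map(int cid, int gid) {
        cidToGid.put(cid, gid);
    }

    /** Returns the mapped GID, or 0 when the CID has no mapping. */
    int gidFor(int cid) {
        return cidToGid.getOrDefault(cid, NOTDEF);
    }

    /** A CID has a real glyph only if it resolves to something other than GID 0. */
    boolean hasGlyph(int cid) {
        return gidFor(cid) != NOTDEF;
    }
}
```

A caller that validates glyph widths would then treat a result of 0 as "not present" rather than as an ordinary glyph.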

> Preflight validation call PDType0Font.clear at the wrong time
> -
>
> Key: PDFBOX-3024
> URL: https://issues.apache.org/jira/browse/PDFBOX-3024
> Project: PDFBox
>  Issue Type: Bug
>  Components: Preflight
>Affects Versions: 1.8.10
>Reporter: Guillaume Monteils
> Attachments: 004973.pdf, PDF-Tools.png, PDFBox.png, eclipse-1.jpg, 
> eclipse-2.jpg
>
>
> I used the algorithm here to test PDF/A compliance:
> https://pdfbox.apache.org/1.8/cookbook/pdfavalidation.html
> With one PDF document (which I can't give you due to confidentiality), a 
> NullPointerException occurs here:
> {code}
> java.lang.NullPointerException
>   at 
> org.apache.pdfbox.pdmodel.font.PDType0Font.getFontWidth(PDType0Font.java:188)
>   at 
> org.apache.pdfbox.preflight.font.container.FontContainer.checkGlyphWith(FontContainer.java:114)
>   at 
> org.apache.pdfbox.preflight.content.ContentStreamWrapper.validText(ContentStreamWrapper.java:372)...
> {code}
> As I dug deeper, I found that preflight loads a font context where it puts 
> all PDF fonts. The PDType0Font is also created and put in this context.
> {code}
> (CSObject : 
> COSDictionary{(COSName{BaseFont}:COSName{INWHIX+TimesNewRomanPSMT})   
> (COSName{DescendantFonts}:COSArray{[COSObject{349, 0}]}) 
> (COSName{Encoding}:COSName{Identity-H})   
> (COSName{Subtype}:COSName{Type0}) 
> (COSName{ToUnicode}:COSDictionary{(COSName{Filter}:COSName{FlateDecode})  
> (COSName{Length}:COSInt{260}) }) (COSName{Type}:COSName{Font}) })
> {code}
> The problem is that at the end of one step of the analysis, the clear method 
> is called on the PDType0Font (see eclipse-1.jpg), but the font is still 
> present in the context. On a second step, the same font is retrieved from the 
> context, with no data in it, and the NullPointerException occurs (see 
> eclipse-2.jpg).
> I tried the validation after removing the clear method from PDType0Font and 
> it works just fine.
> I think the problem comes from this context, and a clear on a font should 
> also trigger a deletion in this map.
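
The suggestion in the last sentence of the report, clearing a font and removing it from the map in one step, might look something like the sketch below. It is not the preflight code: `FontCacheSketch`, `CachedFont`, and all method names are hypothetical stand-ins chosen for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class FontCacheSketch {
    /** Hypothetical stand-in for a font whose internal data can be released. */
    static class CachedFont {
        private float[] widths;

        CachedFont(float[] widths) {
            this.widths = widths;
        }

        /** Releases the data; any later width lookup would fail with a NullPointerException. */
        void clear() {
            widths = null;
        }

        float widthOf(int code) {
            return widths[code];
        }
    }

    private final Map<String, CachedFont> fonts = new HashMap<>();

    void put(String key, CachedFont font) {
        fonts.put(key, font);
    }

    /** Returns the cached font, or null if it was never added or has been released. */
    CachedFont get(String key) {
        return fonts.get(key);
    }

    /** Clears and evicts together, so a cleared font is never handed out again. */
    void release(String key) {
        CachedFont font = fonts.remove(key);
        if (font != null) {
            font.clear();
        }
    }
}
```

With this shape, a second validation step that asks the context for the same key gets `null` and can rebuild the font, instead of receiving an emptied object.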






[jira] [Commented] (PDFBOX-3062) Text extraction and height different in 2.0

2015-12-01 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034170#comment-15034170
 ] 

John Hewson commented on PDFBOX-3062:
-

Then the next JIRA issue will be "Text extraction and height different in 2.1". 
We can stick with bbox for 2.0 but we shouldn't be trying to be backwards 
compatible with 1.8's broken hybrid metrics. So I'd suggest bbox for 2.0, new 
stuff for 2.1, no hybrid stuff anywhere.

> Text extraction and height different in 2.0
> ---
>
> Key: PDFBOX-3062
> URL: https://issues.apache.org/jira/browse/PDFBOX-3062
> Project: PDFBox
>  Issue Type: Sub-task
>  Components: Text extraction
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
>Assignee: Tilman Hausherr
> Fix For: 2.0.0
>
> Attachments: 005021-reduced.pdf, 
> PDFBOX-3062-H6NIYQXHLPGD3GI6SNIYINRAZBCDHUCB-reduced-marked-1.png, 
> PDFBOX-3062-H6NIYQXHLPGD3GI6SNIYINRAZBCDHUCB-reduced.pdf, 
> PDFBOX-3062-H6NIYQXHLPGD3GI6SNIYINRAZBCDHUCB.pdf, 
> PDFBOX-3062-N2MOQ7YZICIYGTPLQJAWJ4HLN6CCEMHZ-reduced.pdf, garbled text 2.pdf
>
>
> AR:
> {code}
> WITH THE increasing complexity of optical modules,
> {code}
> 1.8:
> {code}
> WITH THE increasing complexity of optical modules,
> String[39.6,399.6 fs=1.0 xscale=29.888 height=20.114626 space=7.472 
> width=28.214272]W
> String[69.488,386.16 fs=1.0 xscale=9.963 height=6.5955067 space=2.49075 
> width=3.3176804]I
> String[72.80568,386.16 fs=1.0 xscale=9.963 height=6.5955067 space=2.49075 
> width=6.0873947]T
> String[78.893074,386.16 fs=1.0 xscale=9.963 height=6.5955067 space=2.49075 
> width=7.1932907]H
> String[90.71916,386.16 fs=1.0 xscale=9.963 height=6.5955067 space=2.49075 
> width=6.0873947]T
> String[96.80656,386.16 fs=1.0 xscale=9.963 height=6.5955067 space=2.49075 
> width=7.1932907]H
> {code}
> 2.0:
> {code}
> W
> ITH THE increasing complexity of optical modules,
> String[39.6,399.6 fs=1.0 xscale=29.888 height=9.584274 space=7.472 
> width=28.209717]W
> String[69.488,386.16 fs=1.0 xscale=9.963 height=3.194865 space=2.49075 
> width=3.3177567]I
> String[72.805756,386.16 fs=1.0 xscale=9.963 height=3.194865 space=2.49075 
> width=6.0858]T
> String[78.891556,386.16 fs=1.0 xscale=9.963 height=3.194865 space=2.49075 
> width=7.1949615]H
> String[90.719315,386.16 fs=1.0 xscale=9.963 height=3.194865 space=2.49075 
> width=6.0858]T
> String[96.805115,386.16 fs=1.0 xscale=9.963 height=3.194865 space=2.49075 
> width=7.1949615]H
> {code}






[jira] [Comment Edited] (PDFBOX-3138) PDTextField doesn't accept any Hebrew characters as new value

2015-12-01 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034181#comment-15034181
 ] 

John Hewson edited comment on PDFBOX-3138 at 12/1/15 5:48 PM:
--

If you can embed the font as a CIDFontType0 instead of Type1 then PDFBox will 
be able to find the Hebrew glyphs. Failing that, try creating the field with 
some placeholder Hebrew text in it - that might trigger Type 0 font embedding.

Once you have that working, you'll need to reorder the Hebrew string visually 
before embedding it in the PDF, as PDFBox doesn't know about RTL. Java's 
[Bidi|http://docs.oracle.com/javase/7/docs/api/java/text/Bidi.html] class has a 
method which can do this for you.


was (Author: jahewson):
If you can embed the font as a CIDFontType0 instead of Type1 then PDFBox will 
be able to find the Hebrew glyphs. Failing that, try creating the field with 
some placeholder Hebrew text in it - that might trigger Type 0 font embedding.

Once you have that working, you'll need to reorder the Hebrew string visually 
before embedding it in the PDF, as PDFBox doesn't know about RTL. Java's 
[Bidi|http://docs.oracle.com/javase/7/docs/api/java/text/Bidi.html]. class has 
a method which can do this for you.

> PDTextField doesn't accept any Hebrew characters as new value
> -
>
> Key: PDFBOX-3138
> URL: https://issues.apache.org/jira/browse/PDFBOX-3138
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm, FontBox
>Affects Versions: 2.0.0
> Environment: Eclipse 4.2.2, Windows 7 Pro, JRE 1.8.0_05
>Reporter: Gilad Denneboom
> Fix For: 2.1.0
>
> Attachments: SetHebrewFieldValueTest.java, Test-3-filled.pdf, 
> Test.pdf, Test.txt
>
>
> Trying to set a UTF-8 encoded Hebrew string as the value of a PDTextField 
> fails with the following exception:
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: No glyph for 
> U+05D7 in font AdobeHebrew-Regular
>   at 
> org.apache.pdfbox.pdmodel.font.PDType1CFont.encode(PDType1CFont.java:300)
>   at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:283)
>   at 
> org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:341)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PlainTextFormatter.format(PlainTextFormatter.java:213)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.insertGeneratedAppearance(AppearanceGeneratorHelper.java:373)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceContent(AppearanceGeneratorHelper.java:237)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceValue(AppearanceGeneratorHelper.java:144)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTextField.constructAppearances(PDTextField.java:263)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTerminalField.applyChange(PDTerminalField.java:221)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTextField.setValue(PDTextField.java:218)
>   at SetHebrewFieldValueTest.main(SetHebrewFieldValueTest.java:22)
> {code}
> I've tried using multiple fonts for the field, all of which can handle Hebrew 
> characters just fine, and got the same results in all of them.
> See attached files for a demonstration of the issue.






[jira] [Comment Edited] (PDFBOX-3138) PDTextField doesn't accept any Hebrew characters as new value

2015-12-01 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034181#comment-15034181
 ] 

John Hewson edited comment on PDFBOX-3138 at 12/1/15 5:49 PM:
--

If you can embed the font as a CIDFontType0 instead of Type1 then PDFBox will 
be able to find the Hebrew glyphs. Failing that, try creating the field with 
some placeholder Hebrew text in it - that might trigger Type 0 font embedding.

Once you have that working, you'll need to reorder the Hebrew string visually 
before embedding it in the PDF, as PDFBox doesn't know about RTL. Java's 
[Bidi|http://docs.oracle.com/javase/7/docs/api/java/text/Bidi.html] class has a 
method which can do this for you. You'll have to draw the RTL strings 
glyph-by-glyph in reverse.


was (Author: jahewson):
If you can embed the font as a CIDFontType0 instead of Type1 then PDFBox will 
be able to find the Hebrew glyphs. Failing that, try creating the field with 
some placeholder Hebrew text in it - that might trigger Type 0 font embedding.

Once you have that working, you'll need to reorder the Hebrew string visually 
before embedding it in the PDF, as PDFBox doesn't know about RTL. Java's 
[Bidi|http://docs.oracle.com/javase/7/docs/api/java/text/Bidi.html] class has a 
method which can do this for you.

> PDTextField doesn't accept any Hebrew characters as new value
> -
>
> Key: PDFBOX-3138
> URL: https://issues.apache.org/jira/browse/PDFBOX-3138
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm, FontBox
>Affects Versions: 2.0.0
> Environment: Eclipse 4.2.2, Windows 7 Pro, JRE 1.8.0_05
>Reporter: Gilad Denneboom
> Fix For: 2.1.0
>
> Attachments: SetHebrewFieldValueTest.java, Test-3-filled.pdf, 
> Test.pdf, Test.txt
>
>
> Trying to set a UTF-8 encoded Hebrew string as the value of a PDTextField 
> fails with the following exception:
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: No glyph for 
> U+05D7 in font AdobeHebrew-Regular
>   at 
> org.apache.pdfbox.pdmodel.font.PDType1CFont.encode(PDType1CFont.java:300)
>   at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:283)
>   at 
> org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:341)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PlainTextFormatter.format(PlainTextFormatter.java:213)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.insertGeneratedAppearance(AppearanceGeneratorHelper.java:373)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceContent(AppearanceGeneratorHelper.java:237)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceValue(AppearanceGeneratorHelper.java:144)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTextField.constructAppearances(PDTextField.java:263)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTerminalField.applyChange(PDTerminalField.java:221)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTextField.setValue(PDTextField.java:218)
>   at SetHebrewFieldValueTest.main(SetHebrewFieldValueTest.java:22)
> {code}
> I've tried using multiple fonts for the field, all of which can handle Hebrew 
> characters just fine, and got the same results in all of them.
> See attached files for a demonstration of the issue.






[jira] [Comment Edited] (PDFBOX-3138) PDTextField doesn't accept any Hebrew characters as new value

2015-12-01 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034181#comment-15034181
 ] 

John Hewson edited comment on PDFBOX-3138 at 12/1/15 5:51 PM:
--

If you can embed the font as a CIDFontType0 instead of Type1 then PDFBox will 
be able to find the Hebrew glyphs. Failing that, try creating the field with 
some placeholder Hebrew text in it - that might trigger Type 0 font embedding.

Once you have that working, you'll need to reorder the Hebrew string visually 
before embedding it in the PDF, as PDFBox doesn't know about RTL. Java's 
[Bidi|http://docs.oracle.com/javase/7/docs/api/java/text/Bidi.html] class has a 
method which can do this for you. Alternatively, try setting 
PDAcroForm#setNeedAppearances(true) which bypasses PDFBox appearance generation 
and lets the viewer generate their own appearance for the field - however this 
can be incompatible with some viewers.
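
For the visual reordering step mentioned above, a minimal sketch using only `java.text.Bidi` could look like this. The class and method names (`VisualOrder`, `toVisualOrder`) are made up for illustration. The sketch reorders per `char`, so it does not handle surrogate pairs, combining marks, or mirrored characters such as parentheses.

```java
import java.text.Bidi;

class VisualOrder {
    /** Returns the characters of a logically ordered string in visual (left-to-right) order. */
    static String toVisualOrder(String logical) {
        int n = logical.length();
        if (n == 0) {
            return logical;
        }
        // Base direction comes from the first strong character, defaulting to LTR.
        Bidi bidi = new Bidi(logical, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        byte[] levels = new byte[n];
        Character[] chars = new Character[n];
        for (int i = 0; i < n; i++) {
            levels[i] = (byte) bidi.getLevelAt(i);
            chars[i] = logical.charAt(i);
        }
        // Reorders the object array in place according to the embedding levels.
        Bidi.reorderVisually(levels, 0, chars, 0, n);
        StringBuilder visual = new StringBuilder(n);
        for (Character c : chars) {
            visual.append(c);
        }
        return visual.toString();
    }
}
```

The result of `toVisualOrder` is what would then be passed to the text-drawing call, on the assumption that the drawing code lays glyphs out strictly left to right.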


was (Author: jahewson):
If you can embed the font as a CIDFontType0 instead of Type1 then PDFBox will 
be able to find the Hebrew glyphs. Failing that, try creating the field with 
some placeholder Hebrew text in it - that might trigger Type 0 font embedding.

Once you have that working, you'll need to reorder the Hebrew string visually 
before embedding it in the PDF, as PDFBox doesn't know about RTL. Java's 
[Bidi|http://docs.oracle.com/javase/7/docs/api/java/text/Bidi.html] class has a 
method which can do this for you. You'll have to draw the RTL strings 
glyph-by-glyph in reverse.

> PDTextField doesn't accept any Hebrew characters as new value
> -
>
> Key: PDFBOX-3138
> URL: https://issues.apache.org/jira/browse/PDFBOX-3138
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm, FontBox
>Affects Versions: 2.0.0
> Environment: Eclipse 4.2.2, Windows 7 Pro, JRE 1.8.0_05
>Reporter: Gilad Denneboom
> Fix For: 2.1.0
>
> Attachments: SetHebrewFieldValueTest.java, Test-3-filled.pdf, 
> Test.pdf, Test.txt
>
>
> Trying to set a UTF-8 encoded Hebrew string as the value of a PDTextField 
> fails with the following exception:
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: No glyph for 
> U+05D7 in font AdobeHebrew-Regular
>   at 
> org.apache.pdfbox.pdmodel.font.PDType1CFont.encode(PDType1CFont.java:300)
>   at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:283)
>   at 
> org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:341)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PlainTextFormatter.format(PlainTextFormatter.java:213)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.insertGeneratedAppearance(AppearanceGeneratorHelper.java:373)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceContent(AppearanceGeneratorHelper.java:237)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceValue(AppearanceGeneratorHelper.java:144)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTextField.constructAppearances(PDTextField.java:263)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTerminalField.applyChange(PDTerminalField.java:221)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTextField.setValue(PDTextField.java:218)
>   at SetHebrewFieldValueTest.main(SetHebrewFieldValueTest.java:22)
> {code}
> I've tried using multiple fonts for the field, all of which can handle Hebrew 
> characters just fine, and got the same results in all of them.
> See attached files for a demonstration of the issue.






[jira] [Commented] (PDFBOX-3131) Reduce amount of intermediate data and objects to reduce memory footprint/complexity

2015-12-01 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034225#comment-15034225
 ] 

John Hewson commented on PDFBOX-3131:
-

Yep, whatever the CFF data was exposed for, it's not used, even in my private code.

> Reduce amount of intermediate data and objects to reduce memory 
> footprint/complexity
> 
>
> Key: PDFBOX-3131
> URL: https://issues.apache.org/jira/browse/PDFBOX-3131
> Project: PDFBox
>  Issue Type: Improvement
>  Components: FontBox
>Affects Versions: 2.0.0
>Reporter: Andreas Lehmkühler
>Assignee: Andreas Lehmkühler
> Fix For: 2.0.0
>
>
> The CFFParser holds a lot of intermediate data and produces a lot of objects 
> to do so. The idea is to reduce the amount of such objects and data to reduce 
> the memory footprint and the complexity.
> - the class IndexData holds intermediate data and creates a byte array every 
> time getBytes is called. I'm going to replace the class with a simple list to 
> reduce the memory footprint and the complexity
> - remove unused members of private classes
> - create a list of strings instead of a list of byte arrays which is used to 
> create those strings
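
The last bullet, building strings directly rather than keeping a list of byte arrays, can be sketched roughly as below. This is not the CFFParser code. `IndexStrings` and `readStrings` are hypothetical names, and the choice of ISO-8859-1 is an assumption made for the example. The offset convention follows the CFF INDEX layout, where the offset array has count + 1 entries and the first offset is 1.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class IndexStrings {
    /**
     * Decodes INDEX entries straight into strings, without an intermediate
     * list of byte arrays. Entry i spans data[offsets[i] - 1] up to, but not
     * including, data[offsets[i + 1] - 1].
     */
    static List<String> readStrings(byte[] data, int[] offsets) {
        int count = Math.max(0, offsets.length - 1);
        List<String> result = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            int start = offsets[i] - 1;
            int length = offsets[i + 1] - offsets[i];
            // Assumption: entries are single-byte text; ISO-8859-1 maps each byte to one char.
            result.add(new String(data, start, length, StandardCharsets.ISO_8859_1));
        }
        return result;
    }
}
```

Each entry is decoded once from the shared data buffer, so no per-entry byte array is allocated or retained.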






[jira] [Updated] (PDFBOX-2420) DateConverter doesn't handle time zones outside -12 to +12 range properly

2015-12-01 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maciej Woźniak updated PDFBOX-2420:
---
Attachment: 
_PDFBOX_2420__Improved_handling_timezones_between__14_and__14_hours__Correct_version_.patch

Correct version.
Thank you for your comment.

> DateConverter doesn't handle time zones outside -12 to +12 range properly
> -
>
> Key: PDFBOX-2420
> URL: https://issues.apache.org/jira/browse/PDFBOX-2420
> Project: PDFBox
>  Issue Type: Bug
>  Components: Utilities
>Affects Versions: 2.0.0
>Reporter: Arjohn Kampman
> Fix For: 2.1.0
>
> Attachments: 
> _PDFBOX_2420__Improved_handling_timezones_between__14_and__14_hours_.patch, 
> _PDFBOX_2420__Improved_handling_timezones_between__14_and__14_hours__Correct_version_.patch
>
>
> DateConverter normalizes time zones in restrainTZoffset(...) to a value that 
> is between -12:00 and +12:00. So a time zone like +13:00 gets normalized to 
> -11:00. However, the date itself is not adapted accordingly. As a result, a 
> time stamp like "2014-7-20T05:0:00+1300" gets changed to 
> "2014-7-20T05:0:00-1100", which is actually 24 hours later! To compensate for 
> the time zone change, 24 hours should have been subtracted from the date: 
> "2014-7-19T05:0:00-1100".
> Personally, I'd prefer to leave the time zones untouched completely. Note 
> that XML Schema defines time zones up to +/- 14:00 to be valid: 
> http://www.w3.org/TR/xmlschema-2/#dateTime-timezones. For any time zones out 
> of that range either generate an error or consider a garbage-in-garbage-out 
> policy.
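The 24-hour shift described above can be reproduced with java.time. This is only an illustrative sketch, not the DateConverter code: wrapOffset is a hypothetical stand-in for the normalization that restrainTZoffset(...) is described as performing, and the class name is made up.

```java
import java.time.Duration;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;

final class TzOffsetDemo {
    // Hypothetical helper: wraps an offset into the -12:00..+12:00 range
    // by adding or subtracting a full day.
    static ZoneOffset wrapOffset(ZoneOffset offset) {
        int seconds = offset.getTotalSeconds();
        int day = 24 * 3600;
        if (seconds > 12 * 3600) {
            seconds -= day;
        } else if (seconds < -12 * 3600) {
            seconds += day;
        }
        return ZoneOffset.ofTotalSeconds(seconds);
    }

    // Swaps the offset but keeps the local date and time: the instant moves.
    static OffsetDateTime naive(OffsetDateTime t) {
        return t.withOffsetSameLocal(wrapOffset(t.getOffset()));
    }

    // Swaps the offset and adjusts the local date and time: the instant is kept.
    static OffsetDateTime compensated(OffsetDateTime t) {
        return t.withOffsetSameInstant(wrapOffset(t.getOffset()));
    }

    public static void main(String[] args) {
        OffsetDateTime original =
                OffsetDateTime.of(2014, 7, 20, 5, 0, 0, 0, ZoneOffset.ofHours(13));
        System.out.println(naive(original));       // 2014-07-20T05:00-11:00
        System.out.println(compensated(original)); // 2014-07-19T05:00-11:00
        System.out.println(Duration.between(original, naive(original))); // PT24H
    }
}
```

Leaving the offset untouched, as the reporter prefers, avoids the problem entirely; the sketch only shows why changing the offset without changing the date is wrong.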






[jira] [Commented] (PDFBOX-3131) Reduce amount of intermediate data and objects to reduce memory footprint/complexity

2015-12-01 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034230#comment-15034230
 ] 

ASF subversion and git services commented on PDFBOX-3131:
-

Commit 1717478 from [~jahewson] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1717478 ]

PDFBOX-3131: Completely remove CFF data methods

> Reduce amount of intermediate data and objects to reduce memory 
> footprint/complexity
> 
>
> Key: PDFBOX-3131
> URL: https://issues.apache.org/jira/browse/PDFBOX-3131
> Project: PDFBox
>  Issue Type: Improvement
>  Components: FontBox
>Affects Versions: 2.0.0
>Reporter: Andreas Lehmkühler
>Assignee: Andreas Lehmkühler
> Fix For: 2.0.0
>
>
> The CFFParser holds a lot of intermediate data and produces a lot of objects 
> to do so. The idea is to reduce the amount of such objects and data to reduce 
> the memory footprint and the complexity.
> - the class IndexData holds intermediate data and creates a byte array every 
> time getBytes is called. I'm going to replace the class with a simple list to 
> reduce the memory footprint and the complexity
> - remove unused members of private classes
> - create a list of strings instead of a list of byte arrays which is used to 
> create those strings
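As a rough illustration of the last bullet, a CFF INDEX of strings can be decoded straight into a list of strings without keeping the intermediate byte arrays around. This is a sketch under assumptions, not the FontBox implementation: the class and method names are hypothetical, the input is assumed to be a ByteBuffer positioned at the start of the INDEX, and entries are assumed to be ISO-8859-1 text.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class IndexStrings {
    // Reads a CFF INDEX (count, offSize, count+1 offsets, data) and returns
    // its entries as strings; only one temporary byte array is alive at a time.
    static List<String> readStringIndex(ByteBuffer in) {
        int count = in.getShort() & 0xFFFF;          // Card16, big-endian
        List<String> result = new ArrayList<>(count);
        if (count == 0) {
            return result;                           // an empty INDEX is just the count
        }
        int offSize = in.get() & 0xFF;               // bytes per offset, 1..4
        int[] offsets = new int[count + 1];
        for (int i = 0; i <= count; i++) {
            int value = 0;
            for (int b = 0; b < offSize; b++) {
                value = (value << 8) | (in.get() & 0xFF);
            }
            offsets[i] = value;
        }
        // Entries are contiguous, so consecutive offsets give each length.
        for (int i = 0; i < count; i++) {
            byte[] entry = new byte[offsets[i + 1] - offsets[i]];
            in.get(entry);
            result.add(new String(entry, StandardCharsets.ISO_8859_1));
        }
        return result;
    }
}
```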






[jira] [Commented] (PDFBOX-3138) PDTextField doesn't accept any Hebrew characters as new value

2015-12-01 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034181#comment-15034181
 ] 

John Hewson commented on PDFBOX-3138:
-

If you can embed the font as a CIDFontType0 instead of Type1 then PDFBox will 
be able to find the Hebrew glyphs. Failing that, try creating the field with 
some placeholder Hebrew text in it - that might trigger Type 0 font embedding.

Once you have that working, you'll need to reorder the Hebrew string visually 
before embedding it in the PDF, as PDFBox doesn't know about RTL. Java's 
[Bidi|http://docs.oracle.com/javase/7/docs/api/java/text/Bidi.html] class has 
a method which can do this for you.

> PDTextField doesn't accept any Hebrew characters as new value
> -
>
> Key: PDFBOX-3138
> URL: https://issues.apache.org/jira/browse/PDFBOX-3138
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm, FontBox
>Affects Versions: 2.0.0
> Environment: Eclipse 4.2.2, Windows 7 Pro, JRE 1.8.0_05
>Reporter: Gilad Denneboom
> Fix For: 2.1.0
>
> Attachments: SetHebrewFieldValueTest.java, Test-3-filled.pdf, 
> Test.pdf, Test.txt
>
>
> Trying to set a UTF-8 encoded Hebrew string as the value of a PDTextField 
> fails with the following exception:
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: No glyph for 
> U+05D7 in font AdobeHebrew-Regular
>   at 
> org.apache.pdfbox.pdmodel.font.PDType1CFont.encode(PDType1CFont.java:300)
>   at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:283)
>   at 
> org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:341)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PlainTextFormatter.format(PlainTextFormatter.java:213)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.insertGeneratedAppearance(AppearanceGeneratorHelper.java:373)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceContent(AppearanceGeneratorHelper.java:237)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceValue(AppearanceGeneratorHelper.java:144)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTextField.constructAppearances(PDTextField.java:263)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTerminalField.applyChange(PDTerminalField.java:221)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTextField.setValue(PDTextField.java:218)
>   at SetHebrewFieldValueTest.main(SetHebrewFieldValueTest.java:22)
> {code}
> I've tried using multiple fonts for the field, all of which can handle Hebrew 
> characters just fine, and got the same results in all of them.
> See attached files for a demonstration of the issue.






[jira] [Comment Edited] (PDFBOX-3138) PDTextField doesn't accept any Hebrew characters as new value

2015-12-01 Thread John Hewson (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034181#comment-15034181
 ] 

John Hewson edited comment on PDFBOX-3138 at 12/1/15 5:53 PM:
--

If you can embed the font as a CIDFontType0 instead of Type1 then PDFBox will 
be able to find the Hebrew glyphs. Failing that, try creating the field with 
some placeholder Hebrew text in it - that might trigger Type 0 font embedding.

Once you have that working, you'll need to reorder the Hebrew string visually 
before embedding it in the PDF, as PDFBox doesn't know about RTL. Java's 
[Bidi|http://docs.oracle.com/javase/7/docs/api/java/text/Bidi.html] class has a 
method which can do this for you, then create one final string which has the 
necessary characters reversed. Alternatively, try setting 
PDAcroForm#setNeedAppearances(true) which bypasses PDFBox appearance generation 
and lets the viewer generate their own appearance for the field - however this 
can be incompatible with some viewers.
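A minimal sketch of the reordering step with java.text.Bidi might look like the following. The class and method names are hypothetical, and the approach is simplified: it reverses the characters of each right-to-left run and then lets Bidi.reorderVisually put the runs in visual order. It does not mirror brackets or treat combining marks specially, so take it as a starting point only.

```java
import java.text.Bidi;

final class VisualOrder {
    // Converts a string from logical order to a simple visual order.
    static String toVisual(String logical) {
        Bidi bidi = new Bidi(logical, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        int runCount = bidi.getRunCount();
        byte[] levels = new byte[runCount];
        String[] runs = new String[runCount];
        for (int i = 0; i < runCount; i++) {
            levels[i] = (byte) bidi.getRunLevel(i);
            String run = logical.substring(bidi.getRunStart(i), bidi.getRunLimit(i));
            // Odd embedding levels are right-to-left: reverse their characters.
            runs[i] = (levels[i] & 1) == 1
                    ? new StringBuilder(run).reverse().toString()
                    : run;
        }
        // Reorders the runs themselves according to their levels.
        Bidi.reorderVisually(levels, 0, runs, 0, runCount);
        StringBuilder visual = new StringBuilder();
        for (String run : runs) {
            visual.append(run);
        }
        return visual.toString();
    }
}
```

The result of toVisual would then be passed to the field or content stream in place of the logical string.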


was (Author: jahewson):
If you can embed the font as a CIDFontType0 instead of Type1 then PDFBox will 
be able to find the Hebrew glyphs. Failing that, try creating the field with 
some placeholder Hebrew text in it - that might trigger Type 0 font embedding.

Once you have that working, you'll need to reorder the Hebrew string visually 
before embedding it in the PDF, as PDFBox doesn't know about RTL. Java's 
[Bidi|http://docs.oracle.com/javase/7/docs/api/java/text/Bidi.html] class has a 
method which can do this for you. Alternatively, try setting 
PDAcroForm#setNeedAppearances(true) which bypasses PDFBox appearance generation 
and lets the viewer generate their own appearance for the field - however this 
can be incompatible with some viewers.

> PDTextField doesn't accept any Hebrew characters as new value
> -
>
> Key: PDFBOX-3138
> URL: https://issues.apache.org/jira/browse/PDFBOX-3138
> Project: PDFBox
>  Issue Type: Bug
>  Components: AcroForm, FontBox
>Affects Versions: 2.0.0
> Environment: Eclipse 4.2.2, Windows 7 Pro, JRE 1.8.0_05
>Reporter: Gilad Denneboom
> Fix For: 2.1.0
>
> Attachments: SetHebrewFieldValueTest.java, Test-3-filled.pdf, 
> Test.pdf, Test.txt
>
>
> Trying to set a UTF-8 encoded Hebrew string as the value of a PDTextField 
> fails with the following exception:
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: No glyph for 
> U+05D7 in font AdobeHebrew-Regular
>   at 
> org.apache.pdfbox.pdmodel.font.PDType1CFont.encode(PDType1CFont.java:300)
>   at org.apache.pdfbox.pdmodel.font.PDFont.encode(PDFont.java:283)
>   at 
> org.apache.pdfbox.pdmodel.PDPageContentStream.showText(PDPageContentStream.java:341)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PlainTextFormatter.format(PlainTextFormatter.java:213)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.insertGeneratedAppearance(AppearanceGeneratorHelper.java:373)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceContent(AppearanceGeneratorHelper.java:237)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.AppearanceGeneratorHelper.setAppearanceValue(AppearanceGeneratorHelper.java:144)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTextField.constructAppearances(PDTextField.java:263)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTerminalField.applyChange(PDTerminalField.java:221)
>   at 
> org.apache.pdfbox.pdmodel.interactive.form.PDTextField.setValue(PDTextField.java:218)
>   at SetHebrewFieldValueTest.main(SetHebrewFieldValueTest.java:22)
> {code}
> I've tried using multiple fonts for the field, all of which can handle Hebrew 
> characters just fine, and got the same results in all of them.
> See attached files for a demonstration of the issue.






[jira] [Commented] (PDFBOX-3062) Text extraction and height different in 2.0

2015-12-01 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034207#comment-15034207
 ] 

Tilman Hausherr commented on PDFBOX-3062:
-

This isn't really about metrics, this is about text extraction. The text 
extraction fails for some files if one uses the bbox only. It just isn't 
reliable enough. Users won't be happy if some files that did extract properly 
in the past no longer extract.

People who just want the sizes can still decide whether they want to use the 
values from PDFBox, or calculate their own.

> Text extraction and height different in 2.0
> ---
>
> Key: PDFBOX-3062
> URL: https://issues.apache.org/jira/browse/PDFBOX-3062
> Project: PDFBox
>  Issue Type: Sub-task
>  Components: Text extraction
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
>Assignee: Tilman Hausherr
> Fix For: 2.0.0
>
> Attachments: 005021-reduced.pdf, 
> PDFBOX-3062-H6NIYQXHLPGD3GI6SNIYINRAZBCDHUCB-reduced-marked-1.png, 
> PDFBOX-3062-H6NIYQXHLPGD3GI6SNIYINRAZBCDHUCB-reduced.pdf, 
> PDFBOX-3062-H6NIYQXHLPGD3GI6SNIYINRAZBCDHUCB.pdf, 
> PDFBOX-3062-N2MOQ7YZICIYGTPLQJAWJ4HLN6CCEMHZ-reduced.pdf, garbled text 2.pdf
>
>
> AR:
> {code}
> WITH THE increasing complexity of optical modules,
> {code}
> 1.8:
> {code}
> WITH THE increasing complexity of optical modules,
> String[39.6,399.6 fs=1.0 xscale=29.888 height=20.114626 space=7.472 
> width=28.214272]W
> String[69.488,386.16 fs=1.0 xscale=9.963 height=6.5955067 space=2.49075 
> width=3.3176804]I
> String[72.80568,386.16 fs=1.0 xscale=9.963 height=6.5955067 space=2.49075 
> width=6.0873947]T
> String[78.893074,386.16 fs=1.0 xscale=9.963 height=6.5955067 space=2.49075 
> width=7.1932907]H
> String[90.71916,386.16 fs=1.0 xscale=9.963 height=6.5955067 space=2.49075 
> width=6.0873947]T
> String[96.80656,386.16 fs=1.0 xscale=9.963 height=6.5955067 space=2.49075 
> width=7.1932907]H
> {code}
> 2.0:
> {code}
> W
> ITH THE increasing complexity of optical modules,
> String[39.6,399.6 fs=1.0 xscale=29.888 height=9.584274 space=7.472 
> width=28.209717]W
> String[69.488,386.16 fs=1.0 xscale=9.963 height=3.194865 space=2.49075 
> width=3.3177567]I
> String[72.805756,386.16 fs=1.0 xscale=9.963 height=3.194865 space=2.49075 
> width=6.0858]T
> String[78.891556,386.16 fs=1.0 xscale=9.963 height=3.194865 space=2.49075 
> width=7.1949615]H
> String[90.719315,386.16 fs=1.0 xscale=9.963 height=3.194865 space=2.49075 
> width=6.0858]T
> String[96.805115,386.16 fs=1.0 xscale=9.963 height=3.194865 space=2.49075 
> width=7.1949615]H
> {code}






Jenkins build is back to normal : PDFBox-trunk » Apache PDFBox #2643

2015-12-01 Thread Apache Jenkins Server
See 






Jenkins build is back to normal : PDFBox-trunk #2643

2015-12-01 Thread Apache Jenkins Server
See 





[jira] [Commented] (PDFBOX-2996) StackOverflow in Quicksort

2015-12-01 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034255#comment-15034255
 ] 

Tilman Hausherr commented on PDFBOX-2996:
-

Here's a list of files from my rendering tests that fail with TimSort. The 
files can be found with their issue or with google.

Basiswissen-Vorschriften.pdf
PDFBOX-1292.pdf
PDFBOX-1359.pdf
PDFBOX-2163-152584.pdf
PDFBOX-2163-258126.pdf
PDFBOX-2163-876636.pdf
PDFBOX-2187-002145.pdf
PDFBOX-2188.pdf
PDFBOX-2250-113223.pdf
PDFBOX-2367.pdf
PDFBOX-2385-032992.pdf
PDFBOX-2385-862497.pdf
PDFBOX-2845.pdf
PDFBOX-2904-378255.pdf
PDFBOX-2939.pdf
PDFJS-1324-2012_visitorsguide_web.pdf
PDFJS-1732.pdf


> StackOverflow in Quicksort
> --
>
> Key: PDFBOX-2996
> URL: https://issues.apache.org/jira/browse/PDFBOX-2996
> Project: PDFBox
>  Issue Type: Bug
>  Components: Text extraction
>Affects Versions: 1.8.10, 2.0.0
> Environment: Java 7
>Reporter: Manuel Aristaran
> Attachments: 001991.pdf, Lars-v0-PDFBOX-2996.patch, 
> Lars-v1-PDFBOX-2996.patch, Lars-v2-PDFBOX-2996.patch, QuickSort.java, 
> TestSortingAlgorithms.java, artikel1_20_arab.pdf-sorted-bubble.txt, 
> artikel1_20_arab.pdf-sorted-diff.txt, 
> artikel1_20_arab.pdf-sorted-iter-withRightPivot.txt, 
> artikel1_20_arab.pdf-sorted-iter.txt, 
> artikel1_20_arab.pdf-sorted-java8-legacyMergeSort.txt, 
> artikel1_20_arab.pdf-sorted-java8-timsort.txt, 
> artikel1_20_arab.pdf-sorted-qs-iterative-withMiddlePivot.txt, 
> artikel1_20_arab.pdf-sorted-qs-iterative-withRightPivot.txt, 
> artikel1_20_arab.pdf-sorted-qs-recursive.txt, 
> artikel1_20_arab.pdf-sorted-rekur.txt, diff-delta.png, failing_sort.pdf, 
> quicksort.patch
>
>
> Running PDFTextStripper through ExtractText triggers a StackOverflow 
> exception in the QuickSort implementation for [this particular 
> document|https://www.dropbox.com/s/6crie7y5gqadwa5/1.pdf?dl=0].
> To reproduce: {{java -jar pdfbox-app-1.8.11-SNAPSHOT.jar ExtractText -sort 
> failing_sort.pdf}}
> (Related to PDFBOX-1512)
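For context, a quicksort that keeps its pending ranges on an explicit stack cannot overflow the call stack. The sketch below is an illustration only, not one of the attached patches; the class name is hypothetical and the middle-pivot choice is just one of the variants named in the attachments.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Comparator;
import java.util.Deque;
import java.util.List;

final class IterativeQuickSort {
    // Sorts the list in place without recursion.
    static <T> void sort(List<T> list, Comparator<? super T> cmp) {
        Deque<int[]> pending = new ArrayDeque<>();
        pending.push(new int[] {0, list.size() - 1});
        while (!pending.isEmpty()) {
            int[] range = pending.pop();
            int lo = range[0];
            int hi = range[1];
            if (lo >= hi) {
                continue; // zero or one element: nothing to do
            }
            int p = partition(list, cmp, lo, hi);
            pending.push(new int[] {lo, p - 1});
            pending.push(new int[] {p + 1, hi});
        }
    }

    // Lomuto partition using the middle element as pivot.
    private static <T> int partition(List<T> list, Comparator<? super T> cmp,
                                     int lo, int hi) {
        Collections.swap(list, lo + (hi - lo) / 2, hi);
        T pivot = list.get(hi);
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (cmp.compare(list.get(i), pivot) < 0) {
                Collections.swap(list, i, store++);
            }
        }
        Collections.swap(list, store, hi);
        return store;
    }
}
```

Because the loops depend only on indices, this variant also terminates when the comparator is inconsistent, although the resulting order is then unspecified.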






[jira] [Issue Comment Deleted] (PDFBOX-3131) Reduce amount of intermediate data and objects to reduce memory footprint/complexity

2015-12-01 Thread John Hewson (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Hewson updated PDFBOX-3131:

Comment: was deleted

(was: Yep, whatever the CFF data was exposed for, it's not used, even in my private 
code.)

> Reduce amount of intermediate data and objects to reduce memory 
> footprint/complexity
> 
>
> Key: PDFBOX-3131
> URL: https://issues.apache.org/jira/browse/PDFBOX-3131
> Project: PDFBox
>  Issue Type: Improvement
>  Components: FontBox
>Affects Versions: 2.0.0
>Reporter: Andreas Lehmkühler
>Assignee: Andreas Lehmkühler
> Fix For: 2.0.0
>
>
> The CFFParser holds a lot of intermediate data and produces a lot of objects 
> to do so. The idea is to reduce the amount of such objects and data to reduce 
> the memory footprint and the complexity.
> - the class IndexData holds intermediate data and creates a byte array every 
> time getBytes is called. I'm going to replace the class with a simple list to 
> reduce the memory footprint and the complexity
> - remove unused members of private classes
> - create a list of strings instead of a list of byte arrays which is used to 
> create those strings






[jira] [Created] (PDFBOX-3147) PDFBox fail to render Thai character properly

2015-12-01 Thread Nattapong Sirilappanich (JIRA)
Nattapong Sirilappanich created PDFBOX-3147:
---

 Summary: PDFBox fail to render Thai character properly
 Key: PDFBOX-3147
 URL: https://issues.apache.org/jira/browse/PDFBOX-3147
 Project: PDFBox
  Issue Type: Bug
  Components: Rendering
Affects Versions: 2.0.0
 Environment: Windows 7 x86-64.
JRE 8 build 1.8.0_66-b17
Reporter: Nattapong Sirilappanich


{code}
try {
    // Create a document and add a page to it
    PDDocument document = new PDDocument();
    PDPage page = new PDPage();
    document.addPage( page );

    // Create a new font object by loading a TrueType font into the document
    PDFont font = PDType0Font.load(document, new File("ARIALUNI.TTF"));

    // Start a new content stream which will "hold" the content to be created
    PDPageContentStream contentStream = new PDPageContentStream(document, page);

    // Define a text content stream using the selected font, moving the
    // cursor and drawing the Thai sample text
    contentStream.beginText();
    contentStream.setFont( font, 12 );
    contentStream.newLineAtOffset( 100, 700 );
    contentStream.showText( "กูกินก้งปิ้งอยู่ในถ้ำ" );
    contentStream.endText();

    // Make sure that the content stream is closed:
    contentStream.close();

    // Save the results and ensure that the document is properly closed:
    document.save( "ArialUnicode.pdf");
    document.close();
} catch (IOException e) {
    e.printStackTrace();
}
{code}

The code above is modified from sample code provided via PDFBox example.
I tried to use Arial Unicode font which is shipped as part of Windows 7.
The generated PDF is missing some glyphs and renders others as gibberish.







[jira] [Updated] (PDFBOX-3147) PDFBox fail to render Thai character properly

2015-12-01 Thread Nattapong Sirilappanich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nattapong Sirilappanich updated PDFBOX-3147:

Attachment: ThaiText.txt

The UTF-8 encoded text file contains Thai text similar to the string in the code 
that was used to generate the PDF file.

> PDFBox fail to render Thai character properly
> -
>
> Key: PDFBOX-3147
> URL: https://issues.apache.org/jira/browse/PDFBOX-3147
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.0
> Environment: Windows 7 x86-64.
> JRE 8 build 1.8.0_66-b17
>Reporter: Nattapong Sirilappanich
> Attachments: ArialUnicode.pdf, ThaiText.txt, compareresult.jpg
>
>
> {code}
> try {
>     // Create a document and add a page to it
>     PDDocument document = new PDDocument();
>     PDPage page = new PDPage();
>     document.addPage( page );
>     // Create a new font object by loading a TrueType font into the document
>     PDFont font = PDType0Font.load(document, new File("ARIALUNI.TTF"));
>     // Start a new content stream which will "hold" the content to be created
>     PDPageContentStream contentStream = new PDPageContentStream(document, page);
>     // Define a text content stream using the selected font, moving the
>     // cursor and drawing the Thai sample text
>     contentStream.beginText();
>     contentStream.setFont( font, 12 );
>     contentStream.newLineAtOffset( 100, 700 );
>     contentStream.showText( "กูกินก้งปิ้งอยู่ในถ้ำ" );
>     contentStream.endText();
>     // Make sure that the content stream is closed:
>     contentStream.close();
>     // Save the results and ensure that the document is properly closed:
>     document.save( "ArialUnicode.pdf");
>     document.close();
> } catch (IOException e) {
>     e.printStackTrace();
> }
> {code}
> The code above is modified from sample code provided via PDFBox example.
> I tried to use Arial Unicode font which is shipped as part of Windows 7.
> The generated PDF is missing some glyphs and renders others as gibberish.






[jira] [Updated] (PDFBOX-3147) PDFBox fail to render Thai character properly

2015-12-01 Thread Nattapong Sirilappanich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nattapong Sirilappanich updated PDFBOX-3147:

Attachment: compareresult.jpg

A screenshot comparing the PDF generated by PDFBox with the actual text 
rendered in a similar font in Microsoft Notepad.

> PDFBox fail to render Thai character properly
> -
>
> Key: PDFBOX-3147
> URL: https://issues.apache.org/jira/browse/PDFBOX-3147
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.0
> Environment: Windows 7 x86-64.
> JRE 8 build 1.8.0_66-b17
>Reporter: Nattapong Sirilappanich
> Attachments: ArialUnicode.pdf, compareresult.jpg
>
>
> {code}
> try {
>     // Create a document and add a page to it
>     PDDocument document = new PDDocument();
>     PDPage page = new PDPage();
>     document.addPage( page );
>     // Create a new font object by loading a TrueType font into the document
>     PDFont font = PDType0Font.load(document, new File("ARIALUNI.TTF"));
>     // Start a new content stream which will "hold" the content to be created
>     PDPageContentStream contentStream = new PDPageContentStream(document, page);
>     // Define a text content stream using the selected font, moving the
>     // cursor and drawing the Thai sample text
>     contentStream.beginText();
>     contentStream.setFont( font, 12 );
>     contentStream.newLineAtOffset( 100, 700 );
>     contentStream.showText( "กูกินก้งปิ้งอยู่ในถ้ำ" );
>     contentStream.endText();
>     // Make sure that the content stream is closed:
>     contentStream.close();
>     // Save the results and ensure that the document is properly closed:
>     document.save( "ArialUnicode.pdf");
>     document.close();
> } catch (IOException e) {
>     e.printStackTrace();
> }
> {code}
> The code above is modified from sample code provided via PDFBox example.
> I tried to use Arial Unicode font which is shipped as part of Windows 7.
> The generated PDF is missing some glyphs and renders others as gibberish.






[jira] [Updated] (PDFBOX-3147) PDFBox fail to render Thai character properly

2015-12-01 Thread Nattapong Sirilappanich (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nattapong Sirilappanich updated PDFBOX-3147:

Attachment: ArialUnicode.pdf

The generated PDF file

> PDFBox fail to render Thai character properly
> -
>
> Key: PDFBOX-3147
> URL: https://issues.apache.org/jira/browse/PDFBOX-3147
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.0
> Environment: Windows 7 x86-64.
> JRE 8 build 1.8.0_66-b17
>Reporter: Nattapong Sirilappanich
> Attachments: ArialUnicode.pdf
>
>
> {code}
> try {
>     // Create a document and add a page to it
>     PDDocument document = new PDDocument();
>     PDPage page = new PDPage();
>     document.addPage( page );
>     // Create a new font object by loading a TrueType font into the document
>     PDFont font = PDType0Font.load(document, new File("ARIALUNI.TTF"));
>     // Start a new content stream which will "hold" the content to be created
>     PDPageContentStream contentStream = new PDPageContentStream(document, page);
>     // Define a text content stream using the selected font, moving the
>     // cursor and drawing the Thai sample text
>     contentStream.beginText();
>     contentStream.setFont( font, 12 );
>     contentStream.newLineAtOffset( 100, 700 );
>     contentStream.showText( "กูกินก้งปิ้งอยู่ในถ้ำ" );
>     contentStream.endText();
>     // Make sure that the content stream is closed:
>     contentStream.close();
>     // Save the results and ensure that the document is properly closed:
>     document.save( "ArialUnicode.pdf");
>     document.close();
> } catch (IOException e) {
>     e.printStackTrace();
> }
> {code}
> The code above is modified from sample code provided via PDFBox example.
> I tried to use Arial Unicode font which is shipped as part of Windows 7.
> The generated PDF is missing some glyphs and renders others as gibberish.






[jira] [Comment Edited] (PDFBOX-3147) PDFBox fail to render Thai character properly

2015-12-01 Thread Nattapong Sirilappanich (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15035312#comment-15035312
 ] 

Nattapong Sirilappanich edited comment on PDFBOX-3147 at 12/2/15 5:52 AM:
--

The screenshot compares the PDF generated by PDFBox with the actual text 
rendered in a similar font in Microsoft Notepad. Notice that in the leftmost 
red oval one glyph below the baseline character is missing, compared with the 
leftmost green oval as rendered in Notepad.
The second oval from the left contains overlapping rectangular characters 
because the glyph fails to render properly. The rightmost green oval shows how 
Notepad renders it.


was (Author: na...@th.ibm.com):
The screen shot comparing rendered PDF generated by PDFBox and actual text with 
similar font rendered in Microsoft Notepad.

> PDFBox fail to render Thai character properly
> -
>
> Key: PDFBOX-3147
> URL: https://issues.apache.org/jira/browse/PDFBOX-3147
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.0
> Environment: Windows 7 x86-64.
> JRE 8 build 1.8.0_66-b17
>Reporter: Nattapong Sirilappanich
> Attachments: ArialUnicode.pdf, ThaiText.txt, compareresult.jpg
>
>
> {code}
> try {
>     // Create a document and add a page to it
>     PDDocument document = new PDDocument();
>     PDPage page = new PDPage();
>     document.addPage( page );
>     // Create a new font object by loading a TrueType font into the document
>     PDFont font = PDType0Font.load(document, new File("ARIALUNI.TTF"));
>     // Start a new content stream which will "hold" the content to be created
>     PDPageContentStream contentStream = new PDPageContentStream(document, page);
>     // Define a text content stream using the selected font, moving the
>     // cursor and drawing the Thai sample text
>     contentStream.beginText();
>     contentStream.setFont( font, 12 );
>     contentStream.newLineAtOffset( 100, 700 );
>     contentStream.showText( "กูกินก้งปิ้งอยู่ในถ้ำ" );
>     contentStream.endText();
>     // Make sure that the content stream is closed:
>     contentStream.close();
>     // Save the results and ensure that the document is properly closed:
>     document.save( "ArialUnicode.pdf");
>     document.close();
> } catch (IOException e) {
>     e.printStackTrace();
> }
> {code}
> The code above is modified from sample code provided via PDFBox example.
> I tried to use Arial Unicode font which is shipped as part of Windows 7.
> The generated PDF is missing some glyphs and renders others as gibberish.






[jira] [Commented] (PDFBOX-3145) Security manager fails for .pdfbox.cache

2015-12-01 Thread simon steiner (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034336#comment-15034336
 ] 

simon steiner commented on PDFBOX-3145:
---

I enabled the security manager, so it won't have read/write access to the filesystem.

> Security manager fails for .pdfbox.cache
> 
>
> Key: PDFBOX-3145
> URL: https://issues.apache.org/jira/browse/PDFBOX-3145
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: simon steiner
>Assignee: John Hewson
> Fix For: 2.0.0
>
>
> Caused by: java.security.AccessControlException: access denied 
> ("java.io.FilePermission" "/home/simon/.pdfbox.cache" "read")
>   at 
> java.security.AccessControlContext.checkPermission(AccessControlContext.java:457)
>   at 
> java.security.AccessController.checkPermission(AccessController.java:884)
>   at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
>   at java.lang.SecurityManager.checkRead(SecurityManager.java:888)
>   at java.io.File.exists(File.java:814)
>   at 
> org.apache.pdfbox.pdmodel.font.FileSystemFontProvider.loadDiskCache(FileSystemFontProvider.java:357)
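The trace shows File.exists throwing under the security manager. One defensive pattern is to treat a denied permission the same as a missing cache file. The sketch below is an assumption about how such a guard could look, not the change made for this issue; the class and method names are hypothetical.

```java
import java.io.File;

final class CacheProbe {
    // Returns true only if the cache file exists and may be read.
    // AccessControlException is a SecurityException, so a denied
    // FilePermission is handled like an absent cache.
    static boolean cacheReadable(File cacheFile) {
        try {
            return cacheFile.exists() && cacheFile.canRead();
        } catch (SecurityException e) {
            return false;
        }
    }
}
```

A caller would then skip the disk cache and rebuild the font list in memory when cacheReadable returns false.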






[jira] [Commented] (PDFBOX-3145) Security manager fails for .pdfbox.cache

2015-12-01 Thread simon steiner (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034341#comment-15034341
 ] 

simon steiner commented on PDFBOX-3145:
---

-Djava.security.manager -Djava.security.policy==someURL

> Security manager fails for .pdfbox.cache
> 
>
> Key: PDFBOX-3145
> URL: https://issues.apache.org/jira/browse/PDFBOX-3145
> Project: PDFBox
>  Issue Type: Bug
>  Components: Rendering
>Affects Versions: 2.0.0
>Reporter: simon steiner
>Assignee: John Hewson
> Fix For: 2.0.0
>
>
> Caused by: java.security.AccessControlException: access denied 
> ("java.io.FilePermission" "/home/simon/.pdfbox.cache" "read")
>   at 
> java.security.AccessControlContext.checkPermission(AccessControlContext.java:457)
>   at 
> java.security.AccessController.checkPermission(AccessController.java:884)
>   at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
>   at java.lang.SecurityManager.checkRead(SecurityManager.java:888)
>   at java.io.File.exists(File.java:814)
>   at 
> org.apache.pdfbox.pdmodel.font.FileSystemFontProvider.loadDiskCache(FileSystemFontProvider.java:357)






[jira] [Commented] (PDFBOX-3133) PDFBox 2.0.0-RC2 and earlier 2.0.0 SNAPSHOT Versions print performance is poor with systems having low RAM < 3GB and lower number of fonts.

2015-12-01 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034397#comment-15034397
 ] 

Tilman Hausherr commented on PDFBOX-3133:
-

I tried, but failed: my PC went into endless beep mode. I suspect that RAM 
elements must be in pairs. My PC has 4 elements. When I bought it, it had 2 RAM 
elements, and I later added two more.

I was able to boot with 4GB (2 elements) and it took about 4 seconds, even with 
low -Xmx. (Yes it was faster, this is because this was after a fresh start, no 
applications loaded, nothing else running).

> PDFBox 2.0.0-RC2 and earlier 2.0.0 SNAPSHOT Versions print performance is 
> poor with systems having low RAM < 3GB and lower number of fonts.
> ---
>
> Key: PDFBOX-3133
> URL: https://issues.apache.org/jira/browse/PDFBOX-3133
> Project: PDFBox
>  Issue Type: Improvement
>  Components: PDModel
>Affects Versions: 2.0.0
> Environment: MS Windows Systems with low RAM < 3GB and number of 
> fonts were less < 592 (or if desired fonts in PDF to be printed are not 
> available in local system ) 
>Reporter: Sridhar
>Assignee: John Hewson
>  Labels: performance
> Fix For: 2.0.0
>
>
> Printing with PDFBox 2.0.0-RC1, the SNAPSHOTs and RC2 takes 15+ seconds.
> Steps to reproduce
> -- 
> Use a Windows system with < 3 GB RAM.
> Use a system with few fonts installed, or one that lacks the specific fonts 
> used in the PDF file to be printed.
> Printing the PDF file:
> Took 14 to 20 seconds on a system with 3 GB RAM and 522 fonts
> Took 24 to 34 seconds on a system with 2 GB RAM and 90 fonts
> Took only 2.5 seconds on a system with 8 GB RAM and 1025 fonts.
> Doubt
>  
> I have not browsed the code, but the following is my guess at the cause of 
> the performance issue.
> The code caches fonts by storing them in a local .pdfbox.cache file the first 
> time and reading from that cache on subsequent runs.
> It is not clear whether the code updates the font cache file if new fonts 
> are found in a new PDF file during later print runs.
> If the fonts in the PDF file to be printed are not available in the 
> .pdfbox.cache file stored on the local system, what is the behaviour? Will 
> the code download fonts and update the cache for subsequent runs, or is it 
> limited to the fonts available on the local system? It looks like the latter 
> is the case, and performance is hit either by low RAM, by the font cache not 
> being constantly updated, or by the unavailability of fonts on the local 
> system.






[jira] [Commented] (PDFBOX-3142) PDFMergerUtility with scratch file generates result with blank pages for certain source files.

2015-12-01 Thread Jim deVos (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15034433#comment-15034433
 ] 

Jim deVos commented on PDFBOX-3142:
---

Andreas - thanks for your reply. I'll run these source documents through a PDF 
validator to see what it finds. Individually they open just fine (i.e. no 
blank pages) in various PDF viewers, but I suspect that these viewers are 
pretty forgiving with non-compliant files. On that note, it would be nice to 
know of a way to anticipate whether a file will cause these issues before 
attempting to merge it with a cover page. At the moment all I see is the 
aforementioned error message in the log, but I don't see a way to interrogate 
the parser to see if it has issues with the file.

As for v2, that's a good suggestion. I'll rewrite my test for 2.0.0 and report 
the results.

> PDFMergerUtility with scratch file generates result with blank pages for 
> certain source files.
> --
>
> Key: PDFBOX-3142
> URL: https://issues.apache.org/jira/browse/PDFBOX-3142
> Project: PDFBox
>  Issue Type: Bug
>  Components: Utilities
>Affects Versions: 1.8.10
> Environment: Ubuntu 14.04.3, java 1.8.0_66
>Reporter: Jim deVos
>
> My team uses PDFMergerUtility to attach cover pages to various PDFs. We 
> recently tried using a scratch file (e.g. 
> PDFMergerUtility.mergeDocumentsNonSeq()) to cut down on the amount of RAM we 
> are using. This approach works for the majority of PDFs in our system, but 
> some files cause the merger utility to generate result PDFs with a blank 
> page. Specifically, the result PDF contains a blank page after the cover page 
> instead of the first page of the second document sent to the merger utility.
> Whenever this problem occurs, we see the following line in our logs:
> {{org.apache.pdfbox.pdfparser.NonSequentialPDFParser - Can't find the object 
> 52 0 (origin offset 7187557)}}
> I'll try to attach/link an example pdf soon, but currently I don't have 
> permission to redistribute any files that exhibit the problem.  However,  
> here's a simple snippet that replicates the problem - it's pretty 
> straightforward.
> {code}
> @Test
> public void testMergeNonSeq() throws IOException, COSVisitorException {
>     destinationPdf = new File(TMP_FOLDER, "result-nonseq.pdf");
>     PDFMergerUtility ut = new PDFMergerUtility();
>     // scratch file used by the non-sequential merge instead of main memory
>     RandomAccess ram = new RandomAccessFile(File.createTempFile("mergeram", ".bin"), "rw");
>     ut.addSource(coverpagePdf);
>     ut.addSource(documentPdf);
>     ut.setDestinationFileName(destinationPdf.getCanonicalPath());
>     ut.mergeDocumentsNonSeq(ram);
> 
>     // the only automated way we have to tell that something went wrong is
>     // to check the size of the result
>     assertThat("destination pdf should be larger than the original pdf",
>             destinationPdf.length(), is(greaterThan(documentPdf.length())));
> }
> {code}
> Note we only see this problem with PDFMergerUtility.mergeDocumentsNonSeq().  
> Using PDFMergerUtility.mergeDocuments() does not exhibit any problems.






[jira] [Created] (PDFBOX-3146) Ink annotation borders not rendered

2015-12-01 Thread Tilman Hausherr (JIRA)
Tilman Hausherr created PDFBOX-3146:
---

 Summary: Ink annotation borders not rendered
 Key: PDFBOX-3146
 URL: https://issues.apache.org/jira/browse/PDFBOX-3146
 Project: PDFBox
  Issue Type: Bug
  Components: Rendering
Affects Versions: 2.0.0
Reporter: Tilman Hausherr
Assignee: Tilman Hausherr


I'll write something to render Ink annotations. One can be found in the file of 
PDFBOX-2583.






[jira] [Commented] (PDFBOX-3133) PDFBox 2.0.0-RC2 and earlier 2.0.0 SNAPSHOT Versions print performance is poor with systems having low RAM < 3GB and lower number of fonts.

2015-12-01 Thread Sridhar (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15033439#comment-15033439
 ] 

Sridhar commented on PDFBOX-3133:
-

John, Tilman
Below are the test results:

1. PDFToImage on the command line, on a machine with 2 GB RAM and 90 fonts, 
took 17 to 18 seconds (with some warnings about fonts)
2. PDFToImage on the command line, on a machine with 8 GB RAM, took 4 to 5 
seconds
3. PDFToImage on the command line, on another machine with 4 GB RAM, took 3 to 
5 seconds
4. Printing to an HP LaserJet from a program using 300 dpi in Ctors, from a 
machine with 4 GB RAM, took 8 seconds

John, there are 2 variables in the equation: one is RAM and the other is the 
number of fonts.
Performance is a function of RAM and the number of fonts:
a) Time to create a JPG or PNG image using PDFToImage is f(RAM, # of fonts)

Tilman
Since we tested with PDFToImage, you can eliminate the printer driver, printer 
speed, and cable/network bandwidth from your suspicion.
One request: it is easier to reduce RAM and test than to add RAM and test. If 
you can remove 6GB of memory from your desktop or laptop and test, I am sure 
you should be able to reproduce the slow performance in PDFToImage or printing 
and see the difference.
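
To make timings like the ones above comparable across machines, each measurement could be recorded together with the two variables under discussion. The following is only a rough sketch under stated assumptions: RenderTimingProbe and its methods are hypothetical names, the font directory default is an assumption, the JVM's maximum heap is used as a stand-in for physical RAM (it is not the same thing), and the timed task is a placeholder that a tester would replace with the actual rendering or printing call.

```java
import java.io.File;
import java.util.Locale;

// Hypothetical measurement helper for illustration only; not part of PDFBox.
final class RenderTimingProbe {
    private RenderTimingProbe() {}

    // Counts .ttf and .ttc files directly inside the given directory.
    // Returns 0 if the directory does not exist or cannot be listed.
    static int countFontFiles(File dir) {
        File[] files = dir.listFiles();
        if (files == null) {
            return 0;
        }
        int count = 0;
        for (File f : files) {
            String name = f.getName().toLowerCase(Locale.ROOT);
            if (f.isFile() && (name.endsWith(".ttf") || name.endsWith(".ttc"))) {
                count++;
            }
        }
        return count;
    }

    // Runs the task once and returns the elapsed wall-clock time in milliseconds.
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000L;
    }

    public static void main(String[] args) {
        // Assumed default; pass another directory as the first argument.
        File fontDir = new File(args.length > 0 ? args[0] : "C:\\Windows\\Fonts");
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024L * 1024L);
        long elapsed = timeMillis(() -> {
            // Placeholder: call the rendering or printing code under test here.
        });
        System.out.println("max heap (MB): " + maxHeapMb);
        System.out.println("font files:    " + countFontFiles(fontDir));
        System.out.println("elapsed (ms):  " + elapsed);
    }
}
```

Running it with different -Xmx values on one machine would separate the heap effect from the font-count effect without removing physical memory.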

 




[jira] [Commented] (PDFBOX-2945) PDType1Font.getNameInFont(String) very slow when Unicode fallback is used

2015-12-01 Thread Philip Helger (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15033450#comment-15033450
 ] 

Philip Helger commented on PDFBOX-2945:
---

No problem - glad I found it :)

> PDType1Font.getNameInFont(String) very slow when Unicode fallback is used
> -
>
> Key: PDFBOX-2945
> URL: https://issues.apache.org/jira/browse/PDFBOX-2945
> Project: PDFBox
>  Issue Type: Improvement
>  Components: PDModel
>Affects Versions: 2.0.0
> Environment: Windows 10, Pdfbox SNAPSHOT as of revision 1697721 from 
> today, Java 1.7.0_76, 64Bit
>Reporter: Philip Helger
>Assignee: Tilman Hausherr
> Fix For: 2.0.0
>
> Attachments: pdfbox2945.patch
>
>
> When the method is called on a non-embedded font and the unicode fallback is 
> used, the line "String uniName = String.format("uni%04X", 
> unicodes.codePointAt(0));" is called and it is very slow. I suggest either 
> adding a cache (codepoint to uniname) or at least replace the String.format 
> call with something different, as this internally invokes a new RegExp 
> Matcher etc.
> Something like the following might do the trick (maybe you have a better 
> utility class):
> {code}
> final StringBuilder aID = new StringBuilder (Integer.toString 
> (unicodes.codePointAt (0), 16).toUpperCase (Locale.US));
> while (aID.length () < 4)
>  aID.insert (0, '0');
> aID.insert (0, "uni");
> final String uniName = aID.toString ();
> {code}
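
The other option mentioned in the description, a cache from code point to "uniXXXX" name, could look roughly like the sketch below. It is an illustration under stated assumptions, not the change that was committed: UniNameCache, build and get are hypothetical names, and the cache is unbounded, which is only reasonable if the set of code points seen is small.

```java
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper for illustration only; not part of PDFBox.
final class UniNameCache {
    // Unbounded cache from code point to its "uniXXXX" glyph name (assumption:
    // the number of distinct code points stays small).
    private static final Map<Integer, String> CACHE = new ConcurrentHashMap<>();

    private UniNameCache() {}

    // Builds the name without String.format: "uni" followed by the upper-case
    // hex value of the code point, left-padded with zeros to at least 4 digits.
    static String build(int codePoint) {
        String hex = Integer.toHexString(codePoint).toUpperCase(Locale.ROOT);
        StringBuilder sb = new StringBuilder(3 + Math.max(4, hex.length()));
        sb.append("uni");
        for (int i = hex.length(); i < 4; i++) {
            sb.append('0');
        }
        return sb.append(hex).toString();
    }

    // Returns the cached name, building it on first use.
    static String get(int codePoint) {
        return CACHE.computeIfAbsent(codePoint, UniNameCache::build);
    }

    public static void main(String[] args) {
        System.out.println(get("A".codePointAt(0))); // prints uni0041
        System.out.println(get(0x20AC));             // prints uni20AC
    }
}
```

Either part can be used alone: build() avoids the formatter overhead on every call, and the map avoids rebuilding the string for code points that repeat.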






[jira] [Comment Edited] (PDFBOX-3133) PDFBox 2.0.0-RC2 and earlier 2.0.0 SNAPSHOT Versions print performance is poor with systems having low RAM < 3GB and lower number of fonts.

2015-12-01 Thread Sridhar (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15033579#comment-15033579
 ] 

Sridhar edited comment on PDFBOX-3133 at 12/1/15 3:32 PM:
--

Dear Andreas

Thanks for your comment.

The test file was already provided offline to Tilman.
Here is the PDF file again for you.

Although the content has been replaced, the layout and design are well known 
and belong to our customer. To maintain confidentiality, please don't 
share the file publicly. You can use it internally for testing; please 
delete it after a month.



What I could see is that the .pdfbox.cache file generated by the PDFBox code 
in the home directory contains fonts read and cached from the ttf and ttc 
font files available on the system (C:\Windows\Fonts).
The number of fonts might not have an influence, but RAM should.

I am unable to test with lower RAM and a large number of fonts, hence I 
requested Tilman, whose machine has 492+ fonts and 8 GB RAM, to reduce RAM 
to 2 GB and test.

Regards
Sridhar Sowmiyanarayanan
Tata Consultancy Services
Mailto: srid..@tcs.com
Website: http://www.tcs.com







[jira] [Created] (PDFBOX-3144) NullPointerException in TTFSubsetter

2015-12-01 Thread Philip Helger (JIRA)
Philip Helger created PDFBOX-3144:
-

 Summary: NullPointerException in TTFSubsetter
 Key: PDFBOX-3144
 URL: https://issues.apache.org/jira/browse/PDFBOX-3144
 Project: PDFBox
  Issue Type: Bug
  Components: FontBox
Affects Versions: 2.0.0
 Environment: Version 2.0.0-RC2
Reporter: Philip Helger


An NPE happens in "public void TTFSubsetter.add(int unicode)" because the 
"unicodeCmap" member is null.
This might be because the passed "ttf" member is based on a 
"MemoryTTFDataStream" and has only 38 glyphs (so it might already be a subset). 
The only available tables of the TTF are: [fpgm, head, cvt , glyf, loca, gasp, 
hmtx, prep, hhea, maxp]

The variables of the underlying font are:
this              PDType0Font  (id=58)
afmStandard14     null
avgFontWidth      0.0
cMap              CMap  -> Identity-H
cMapUCS2          null
descendantFont    PDCIDFontType2  (id=155)
dict              COSDictionary  -> COSDictionary{(COSName{Type}:COSName{Font}) 
(COSName{BaseFont}:COSName{AAAMSE+OpenSans-Bold}) 
(COSName{Subtype}:COSName{Type0}) (COSName{Encoding}:COSName{Identity-H}) 
(COSName{DescendantFonts}:COSArray{[COSDictionary{(COSName{Type}:COSName{Font}) 
(COSName{Subtype}:COSName{CIDFontType2}) 
(COSName{BaseFont}:COSName{AAAMSE+OpenSans-Bold}) 
(COSName{CIDSystemInfo}:COSDictionary{(COSName{Registry}:COSString{Adobe}) 
(COSName{Ordering}:COSString{Identity}) (COSName{Supplement}:COSInt{0}) }) 
(COSName{FontDescriptor}:COSDictionary{(COSName{Type}:COSName{FontDescriptor}) 
(COSName{FontName}:COSName{AAAMSE+OpenSans-Bold}) (COSName{Flags}:COSInt{4}) 
(COSName{FontWeight}:COSFloat{700.0}) (COSName{ItalicAngle}:COSFloat{0.0}) 
(COSName{FontBBox}:COSArray{[COSFloat{-619.1406}, COSFloat{-292.96875}, 
COSFloat{1318.8477}, COSFloat{1068.8477}]}) 
(COSName{Ascent}:COSFloat{1068.8477}) (COSName{Descent}:COSFloat{-292.96875}) 
(COSName{CapHeight}:COSFloat{713.8672}) (COSName{XHeight}:COSFloat{545.89844}) 
(COSName{StemV}:COSFloat{251.93846}) 
(COSName{FontFile2}:COSStream{(COSName{Filter}:COSName{FlateDecode}) 
(COSName{Length}:COSInt{5625}) (COSName{Length1}:COSInt{8036}) }) 
(COSName{CIDSet}:COSStream{(COSName{Filter}:COSName{FlateDecode}) 
(COSName{Length}:COSInt{20}) }) }) (COSName{W}:COSArray{[COSInt{3}, 
COSArray{[COSInt{260}]}, COSInt{68}, COSArray{[COSInt{604}, COSInt{633}, 
COSInt{514}, COSInt{633}, COSInt{591}]}, COSInt{74}, COSArray{[COSInt{565}, 
COSInt{657}, COSInt{305}]}, COSInt{15}, COSArray{[COSInt{290}]}, COSInt{79}, 
COSArray{[COSInt{305}]}, COSInt{16}, COSArray{[COSInt{322}]}, COSInt{80}, 
COSArray{[COSInt{982}]}, COSInt{17}, COSArray{[COSInt{285}]}, COSInt{81}, 
COSArray{[COSInt{657}, COSInt{619}]}, COSInt{19}, COSArray{[COSInt{571}]}, 
COSInt{83}, COSArray{[COSInt{633}]}, COSInt{20}, COSArray{[COSInt{571}]}, 
COSInt{85}, COSArray{[COSInt{454}, COSInt{497}, COSInt{434}, COSInt{657}]}, 
COSInt{27}, COSArray{[COSInt{571}, COSInt{571}, COSInt{285}]}, COSInt{93}, 
COSArray{[COSInt{488}]}, COSInt{36}, COSArray{[COSInt{690}, COSInt{672}]}, 
COSInt{40}, COSArray{[COSInt{560}]}, COSInt{48}, COSArray{[COSInt{943}, 
COSInt{813}]}, COSInt{53}, COSArray{[COSInt{660}, COSInt{551}, COSInt{579}, 
COSInt{756}]}, COSInt{61}, COSArray{[COSInt{579}]}]}) 
(COSName{CIDToGIDMap}:COSStream{(COSName{Filter}:COSName{FlateDecode}) 
(COSName{Length}:COSInt{84}) (COSName{Length1}:COSInt{188}) }) }]}) 
(COSName{ToUnicode}:COSStream{(COSName{Filter}:COSName{FlateDecode}) 
(COSName{Length}:COSInt{324}) }) }
embedder          PDCIDFontType2Embedder
fontDescriptor    null
fontWidthOfSpace  -1.0
isCMapPredefined  true
isDescendantCJK   false
noUnicode         HashSet  -> empty
toUnicodeCMap     null
widths            null


I will try to find a minimal example of how to reproduce this. Currently it is 
only reproducible as part of a bigger package :|






[jira] [Updated] (PDFBOX-3144) NullPointerException in TTFSubsetter

2015-12-01 Thread Philip Helger (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Philip Helger updated PDFBOX-3144:
--
Description: 
An NPE happens in "public void TTFSubsetter.add(int unicode)" because the 
"unicodeCmap" member is null.
This might be because the passed "ttf" member is based on a 
"MemoryTTFDataStream" and has only 38 glyphs (so it might already be a subset). 
The only available tables of the TTF are: [fpgm, head, cvt , glyf, loca, gasp, 
hmtx, prep, hhea, maxp]

The variables of the underlying font are:
{code}
this              PDType0Font  (id=58)
afmStandard14     null
avgFontWidth      0.0
cMap              CMap  -> Identity-H
cMapUCS2          null
descendantFont    PDCIDFontType2  (id=155)
dict              COSDictionary  -> COSDictionary{(COSName{Type}:COSName{Font}) 
(COSName{BaseFont}:COSName{AAAMSE+OpenSans-Bold}) 
(COSName{Subtype}:COSName{Type0}) (COSName{Encoding}:COSName{Identity-H}) 
(COSName{DescendantFonts}:COSArray{[COSDictionary{(COSName{Type}:COSName{Font}) 
(COSName{Subtype}:COSName{CIDFontType2}) 
(COSName{BaseFont}:COSName{AAAMSE+OpenSans-Bold}) 
(COSName{CIDSystemInfo}:COSDictionary{(COSName{Registry}:COSString{Adobe}) 
(COSName{Ordering}:COSString{Identity}) (COSName{Supplement}:COSInt{0}) }) 
(COSName{FontDescriptor}:COSDictionary{(COSName{Type}:COSName{FontDescriptor}) 
(COSName{FontName}:COSName{AAAMSE+OpenSans-Bold}) (COSName{Flags}:COSInt{4}) 
(COSName{FontWeight}:COSFloat{700.0}) (COSName{ItalicAngle}:COSFloat{0.0}) 
(COSName{FontBBox}:COSArray{[COSFloat{-619.1406}, COSFloat{-292.96875}, 
COSFloat{1318.8477}, COSFloat{1068.8477}]}) 
(COSName{Ascent}:COSFloat{1068.8477}) (COSName{Descent}:COSFloat{-292.96875}) 
(COSName{CapHeight}:COSFloat{713.8672}) (COSName{XHeight}:COSFloat{545.89844}) 
(COSName{StemV}:COSFloat{251.93846}) 
(COSName{FontFile2}:COSStream{(COSName{Filter}:COSName{FlateDecode}) 
(COSName{Length}:COSInt{5625}) (COSName{Length1}:COSInt{8036}) }) 
(COSName{CIDSet}:COSStream{(COSName{Filter}:COSName{FlateDecode}) 
(COSName{Length}:COSInt{20}) }) }) (COSName{W}:COSArray{[COSInt{3}, 
COSArray{[COSInt{260}]}, COSInt{68}, COSArray{[COSInt{604}, COSInt{633}, 
COSInt{514}, COSInt{633}, COSInt{591}]}, COSInt{74}, COSArray{[COSInt{565}, 
COSInt{657}, COSInt{305}]}, COSInt{15}, COSArray{[COSInt{290}]}, COSInt{79}, 
COSArray{[COSInt{305}]}, COSInt{16}, COSArray{[COSInt{322}]}, COSInt{80}, 
COSArray{[COSInt{982}]}, COSInt{17}, COSArray{[COSInt{285}]}, COSInt{81}, 
COSArray{[COSInt{657}, COSInt{619}]}, COSInt{19}, COSArray{[COSInt{571}]}, 
COSInt{83}, COSArray{[COSInt{633}]}, COSInt{20}, COSArray{[COSInt{571}]}, 
COSInt{85}, COSArray{[COSInt{454}, COSInt{497}, COSInt{434}, COSInt{657}]}, 
COSInt{27}, COSArray{[COSInt{571}, COSInt{571}, COSInt{285}]}, COSInt{93}, 
COSArray{[COSInt{488}]}, COSInt{36}, COSArray{[COSInt{690}, COSInt{672}]}, 
COSInt{40}, COSArray{[COSInt{560}]}, COSInt{48}, COSArray{[COSInt{943}, 
COSInt{813}]}, COSInt{53}, COSArray{[COSInt{660}, COSInt{551}, COSInt{579}, 
COSInt{756}]}, COSInt{61}, COSArray{[COSInt{579}]}]}) 
(COSName{CIDToGIDMap}:COSStream{(COSName{Filter}:COSName{FlateDecode}) 
(COSName{Length}:COSInt{84}) (COSName{Length1}:COSInt{188}) }) }]}) 
(COSName{ToUnicode}:COSStream{(COSName{Filter}:COSName{FlateDecode}) 
(COSName{Length}:COSInt{324}) }) }
embedder          PDCIDFontType2Embedder
fontDescriptor    null
fontWidthOfSpace  -1.0
isCMapPredefined  true
isDescendantCJK   false
noUnicode         HashSet  -> empty
toUnicodeCMap     null
widths            null
{code}

I will try to find a minimal example of how to reproduce this. Currently it is 
only reproducible as part of a bigger package :|


[jira] [Updated] (PDFBOX-3133) PDFBox 2.0.0-RC2 and earlier 2.0.0 SNAPSHOT Versions print performance is poor with systems having low RAM < 3GB and lower number of fonts.

2015-12-01 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler updated PDFBOX-3133:
---
Attachment: (was: FontTest.pdf)




[jira] [Commented] (PDFBOX-3136) False negative on PDF/A-1A with wrongly given causes " Invalid graphics object, DestOutputProfile isn't a valid ICCProfile: Invalid ICC Profile Data" and "Invalid Colo

2015-12-01 Thread Antoine Ribes (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15033545#comment-15033545
 ] 

Antoine Ribes commented on PDFBOX-3136:
---

Well, I'm confused because the file is valid according to "Adobe Acrobat DC, 
version 15.90 with preflight version 15.0.0", and ICC output profile seems to 
be resolved in the detail of the preflight analysis report.

I'm no PDF expert so I just don't know what to think about that.

> False negative on PDF/A-1A with wrongly given causes " Invalid graphics 
> object, DestOutputProfile isn't a valid ICCProfile: Invalid ICC Profile Data" 
> and "Invalid Color space, The operator "rg" can't be used with CMYK Profile"
> --
>
> Key: PDFBOX-3136
> URL: https://issues.apache.org/jira/browse/PDFBOX-3136
> Project: PDFBox
>  Issue Type: Bug
>  Components: Preflight
>Affects Versions: 2.0.0
>Reporter: Antoine Ribes
> Attachments: test_little-A1a.pdf
>
>
> Using the code of the CookBook for PDF/A validation (given for 1.8.10) :
> - with the test_little-A1a.pdf file (Adobe preflight (and pdfbox:1.8.10) 
> tells me it's a valid PDF/A-1A)
> - and only replacing the code "parser.parse()" with 
> "parser.parse(Format.PDF_A1A)",
> result.isValid() is false with version 2.0.0-RC2. Displayed results errors 
> are :
> - 2.1.4 - Invalid graphics object, DestOutputProfile isn't a valid 
> ICCProfile: Invalid ICC Profile Data
> - 2.1.4 - Invalid graphics object, DestOutputProfile isn't a valid 
> ICCProfile. Caused by : Invalid ICC Profile Data
> - 2.4.1 - Invalid Color space, The operator "rg" can't be used with CMYK 
> Profile
> Some log is displayed :
> WARN [org.apache.pdfbox.filter.FlateFilter] - FlateFilter: premature end of 
> stream due to a DataFormatException
> DEBUG [org.apache.pdfbox.io.ScratchFileBuffer] - ScratchFileBuffer not closed!
> WARN [org.apache.pdfbox.filter.FlateFilter] - FlateFilter: premature end of 
> stream due to a DataFormatException
> Note : Running same code with the pdfbox and preflight version 2.0.0-RC1 on 
> the same file, I get the exception :
> org.apache.pdfbox.preflight.exception.ValidationException: Unable to parse the ICC Profile.
>   at org.apache.pdfbox.preflight.process.CatalogValidationProcess.validateICCProfile(CatalogValidationProcess.java:383)
>   at org.apache.pdfbox.preflight.process.CatalogValidationProcess.validateOutputIntent(CatalogValidationProcess.java:285)
>   at org.apache.pdfbox.preflight.process.CatalogValidationProcess.validate(CatalogValidationProcess.java:148)
>   at org.apache.pdfbox.preflight.utils.ContextHelper.callValidation(ContextHelper.java:84)
>   at org.apache.pdfbox.preflight.utils.ContextHelper.validateElement(ContextHelper.java:122)
>   at org.apache.pdfbox.preflight.PreflightDocument.validate(PreflightDocument.java:163)
> [...]
> Caused by: java.io.IOException: java.util.zip.DataFormatException: incorrect data check
>   at org.apache.pdfbox.filter.FlateFilter.decode(FlateFilter.java:83)
>   at org.apache.pdfbox.cos.COSInputStream.create(COSInputStream.java:69)
>   at org.apache.pdfbox.cos.COSStream.createInputStream(COSStream.java:163)
>   at org.apache.pdfbox.preflight.process.CatalogValidationProcess.validateICCProfile(CatalogValidationProcess.java:360)
>   ... 29 more
> Caused by: java.util.zip.DataFormatException: incorrect data check
>   at java.util.zip.Inflater.inflateBytes(Native Method)
> And a similar result as with 2.0.0-RC2 is obtained with 1.8.8.






[jira] [Commented] (PDFBOX-3142) PDFMergerUtility with scratch file generates result with blank pages for certain source files.

2015-12-01 Thread JIRA

[ 
https://issues.apache.org/jira/browse/PDFBOX-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15033542#comment-15033542
 ] 

Andreas Lehmkühler commented on PDFBOX-3142:


Sounds like those PDFs are malformed and the non-sequential parser isn't able 
to repair them.

Did you ever give 2.0.0 a try? It contains a lot of improvements and bugfixes, 
and not all of them are/will be backported to 1.8.x. The second RC is available 
through the download page.

> PDFMergerUtility with scratch file generates result with blank pages for 
> certain source files.
> --
>
> Key: PDFBOX-3142
> URL: https://issues.apache.org/jira/browse/PDFBOX-3142
> Project: PDFBox
>  Issue Type: Bug
>  Components: Utilities
>Affects Versions: 1.8.10
> Environment: Ubuntu 14.04.3, java 1.8.0_66
>Reporter: Jim deVos
>
> My team uses PDFMergerUtility to attach cover pages to various PDFs. We 
> recently tried using a scratch file (e.g. 
> PDFMergerUtility.mergeDocumentsNonSeq()) to cut down on the amount of RAM we 
> are using. This approach works for the majority of PDFs in our system, but 
> some files cause the merger utility to generate a result PDF with a blank 
> page. Specifically, the result PDF contains a blank page after the cover page 
> instead of the first page of the second document sent to the merger utility.
> Whenever this problem occurs, we see the following line in our logs:
> {{org.apache.pdfbox.pdfparser.NonSequentialPDFParser - Can't find the object 
> 52 0 (origin offset 7187557)}}
> I'll try to attach/link an example PDF soon, but currently I don't have 
> permission to redistribute any files that exhibit the problem. However, 
> here's a simple snippet that replicates the problem; it's pretty 
> straightforward.
> {code}
> @Test
> public void testMergeNonSeq() throws IOException, COSVisitorException {
>     destinationPdf = new File(TMP_FOLDER, "result-nonseq.pdf");
>     PDFMergerUtility ut = new PDFMergerUtility();
>     // scratch file that backs the merge instead of main memory
>     RandomAccess ram = new RandomAccessFile(File.createTempFile("mergeram", ".bin"), "rw");
>     ut.addSource(coverpagePdf);
>     ut.addSource(documentPdf);
>     ut.setDestinationFileName(destinationPdf.getCanonicalPath());
>     ut.mergeDocumentsNonSeq(ram);
>
>     // the only automated way we have to tell that something went wrong is
>     // to check the size of the result
>     assertThat("destination pdf should be larger than the original pdf",
>             destinationPdf.length(), is(greaterThan(documentPdf.length())));
> }
> {code}
> Note we only see this problem with PDFMergerUtility.mergeDocumentsNonSeq().  
> Using PDFMergerUtility.mergeDocuments() does not exhibit any problems.
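
Besides checking the result size, the log line quoted in the report could in principle be used as a second signal. The following is only a sketch under assumptions: the class name MissingObjectLog and the method parse are hypothetical, the pattern is inferred solely from the single log line quoted above, and the two numbers are read as PDF object number and generation number. It uses only the JDK and makes no PDFBox calls.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of PDFBox): recognises the warning quoted in the report.
final class MissingObjectLog {
    // Pattern inferred from the one log line in the report; other PDFBox versions may differ.
    private static final Pattern MISSING = Pattern.compile(
            "Can't find the object\\s+(\\d+)\\s+(\\d+)\\s+\\(origin offset\\s+(\\d+)\\)");

    /** Returns {objectNumber, generation, originOffset}, or null if the line does not match. */
    static long[] parse(String logLine) {
        Matcher m = MISSING.matcher(logLine);
        if (!m.find()) {
            return null;
        }
        return new long[] {
            Long.parseLong(m.group(1)),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3))
        };
    }
}
```

A test could capture the logger output during the merge and fail if parse returns a non-null value for any line. How the output is captured depends on the logging setup and is left open here.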






[jira] [Updated] (PDFBOX-3133) PDFBox 2.0.0-RC2 and earlier 2.0.0 SNAPSHOT Versions print performance is poor with systems having low RAM < 3GB and lower number of fonts.

2015-12-01 Thread Sridhar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sridhar updated PDFBOX-3133:

Attachment: FontTest.pdf

Dear Andreas,

Thanks for your comment.

The test file was already provided offline to Tilman.
Here is the PDF file again for you.

Though the content has been replaced, the layout and design are well known 
and belong to our customer. To maintain confidentiality, please don't share 
the file publicly; you can use it internally for testing and delete it after 
a month. 



What I could see is that the .pdfbox.cache file generated by the PDFBox code 
in the home directory holds fonts read and cached from the TTF and TTC font 
files available on the system (C:\Windows\Fonts). 
The number of fonts might not have an influence, but RAM should. 

I am unable to test with less RAM and a large number of fonts, so I have 
asked Tilman, whose machine has 492+ fonts and 8 GB RAM, to reduce the RAM 
to 2 GB and test. 

Regards
Sridhar Sowmiyanarayanan
Tata Consultancy Services
Mailto: sridhar...@tcs.com
Website: http://www.tcs.com


> PDFBox 2.0.0-RC2 and earlier 2.0.0 SNAPSHOT Versions print performance is 
> poor with systems having low RAM < 3GB and lower number of fonts.
> ---
>
> Key: PDFBOX-3133
> URL: https://issues.apache.org/jira/browse/PDFBOX-3133
> Project: PDFBox
>  Issue Type: Improvement
>  Components: PDModel
>Affects Versions: 2.0.0
> Environment: MS Windows systems with low RAM (< 3 GB) and fewer than 
> 592 fonts (or where the fonts used in the PDF to be printed are not 
> available on the local system) 
>Reporter: Sridhar
>Assignee: John Hewson
>  Labels: performance
> Fix For: 2.0.0
>
> Attachments: FontTest.pdf
>
>
> Printing with PDFBox 2.0.0-RC1, the SNAPSHOTs, and RC2 takes 15+ seconds.
> Steps to reproduce
> -- 
> Use a Windows system with < 3 GB RAM.
> Use a system with a small number of fonts, or without the specific fonts 
> used in the PDF file to be printed.
> Printing the PDF file: 
> Took 14 to 20 seconds on a system with 3 GB RAM which had 522 fonts
> Took 24 to 34 seconds on a system with 2 GB RAM which had 90 fonts
> Took only 2.5 seconds on a system with 8 GB RAM which had 1025 fonts. 
> Doubt
>  
> I have not browsed the code, but the following is my suspicion about what 
> causes the performance issue.
> The code caches fonts by storing them in a local .pdfbox.cache file the 
> first time and reusing that cache on subsequent runs.
> It is not clear whether the code updates the font cache file when a new PDF 
> file to be printed uses fonts that are not yet in it. 
> If the fonts in the PDF file to be printed are not available in the 
> .pdfbox.cache file stored on the local system, what is the behaviour? Will 
> the code download fonts and update the cache for subsequent runs, or is it 
> limited to the fonts available on the local system? It looks like the latter 
> is the case, and performance suffers either because of low RAM, because the 
> font cache is not constantly updated, or because the fonts are unavailable 
> on the local system.
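
The timings in the report are given as ranges measured on different machines. To make such numbers easier to compare, a small harness along the following lines could record the spread over several runs together with the JVM heap limit, which may be lower than the physical RAM. This is a sketch only: PrintTiming, measure and heapInfo are hypothetical names, and the Runnable stands in for whatever printing or rendering call is being measured.

```java
// Hypothetical timing helper; uses only the JDK.
final class PrintTiming {
    /** Runs the task the given number of times and returns {minMillis, maxMillis}. */
    static long[] measure(Runnable task, int runs) {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be >= 1");
        }
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000L;
            min = Math.min(min, elapsedMillis);
            max = Math.max(max, elapsedMillis);
        }
        return new long[] {min, max};
    }

    /** Reports the maximum heap the JVM will attempt to use, in megabytes. */
    static String heapInfo() {
        long maxMb = Runtime.getRuntime().maxMemory() / (1024L * 1024L);
        return "max heap: " + maxMb + " MB";
    }
}
```

Running the first call separately from the following ones would also show whether the slowdown is limited to the run in which the font cache is built.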






[jira] [Commented] (PDFBOX-3133) PDFBox 2.0.0-RC2 and earlier 2.0.0 SNAPSHOT Versions print performance is poor with systems having low RAM < 3GB and lower number of fonts.

2015-12-01 Thread JIRA

[ 
https://issues.apache.org/jira/browse/PDFBOX-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15033623#comment-15033623
 ] 

Andreas Lehmkühler commented on PDFBOX-3133:


I've removed the sample pdf as it was automatically attached to this JIRA







[jira] [Commented] (PDFBOX-3133) PDFBox 2.0.0-RC2 and earlier 2.0.0 SNAPSHOT Versions print performance is poor with systems having low RAM < 3GB and lower number of fonts.

2015-12-01 Thread JIRA

[ 
https://issues.apache.org/jira/browse/PDFBOX-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15033535#comment-15033535
 ] 

Andreas Lehmkühler commented on PDFBOX-3133:


Did you ever test a machine with a small amount of RAM and lots of fonts? I'd 
expect that the number of fonts doesn't make a difference as long as the cache 
is already up to date.

One piece of information is missing: what kind of PDF do you use for your 
tests? Can you provide us with a sample?








[jira] [Comment Edited] (PDFBOX-3133) PDFBox 2.0.0-RC2 and earlier 2.0.0 SNAPSHOT Versions print performance is poor with systems having low RAM < 3GB and lower number of fonts.

2015-12-01 Thread JIRA

[ 
https://issues.apache.org/jira/browse/PDFBOX-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15033623#comment-15033623
 ] 

Andreas Lehmkühler edited comment on PDFBOX-3133 at 12/1/15 12:41 PM:
--

Thanks. I've removed the sample pdf as it was automatically attached to this 
JIRA


was (Author: lehmi):
I've removed the sample pdf as it was automatically attached to this JIRA







[jira] [Commented] (PDFBOX-3133) PDFBox 2.0.0-RC2 and earlier 2.0.0 SNAPSHOT Versions print performance is poor with systems having low RAM < 3GB and lower number of fonts.

2015-12-01 Thread Sridhar (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15033965#comment-15033965
 ] 

Sridhar commented on PDFBOX-3133:
-

The warning message appears only on the first run of PDFToImage, not on 
subsequent runs. It might be the font cache rebuilding message; I will get it 
for you. Deleting the .cache file and rerunning will hopefully produce the 
warning message again. 
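
To repeat the first-run behaviour on demand, the cache file can be removed before a test run. The sketch below assumes the file is named .pdfbox.cache and sits directly in the user's home directory, as described elsewhere in this thread; FontCacheFile, locate and deleteIfPresent are hypothetical names, and the home directory is passed in so the helper can be tried against a scratch directory first.

```java
import java.io.File;

// Hypothetical helper; the file name and location are taken from this thread, not from PDFBox docs.
final class FontCacheFile {
    /** Returns the assumed location of the font cache below the given home directory. */
    static File locate(File homeDir) {
        return new File(homeDir, ".pdfbox.cache");
    }

    /** Deletes the cache file if it is a regular file; returns true only if it was deleted. */
    static boolean deleteIfPresent(File homeDir) {
        File cache = locate(homeDir);
        return cache.isFile() && cache.delete();
    }
}
```

For a real run, the call would be FontCacheFile.deleteIfPresent(new File(System.getProperty("user.home"))).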







[jira] [Issue Comment Deleted] (PDFBOX-3133) PDFBox 2.0.0-RC2 and earlier 2.0.0 SNAPSHOT Versions print performance is poor with systems having low RAM < 3GB and lower number of fonts.

2015-12-01 Thread Sridhar (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sridhar updated PDFBOX-3133:

Comment: was deleted

(was: The warning message appears only on the first run of PDFToImage, not on 
subsequent runs. It might be the font cache rebuilding message; I will get it 
for you. Deleting the .cache file and rerunning will hopefully produce the 
warning message again. )







[jira] [Commented] (PDFBOX-3133) PDFBox 2.0.0-RC2 and earlier 2.0.0 SNAPSHOT Versions print performance is poor with systems having low RAM < 3GB and lower number of fonts.

2015-12-01 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15033998#comment-15033998
 ] 

Tilman Hausherr commented on PDFBOX-3133:
-

No need to, if it is really just the font cache rebuilding message (which 
should come only once).







[jira] [Created] (PDFBOX-3145) Security manager fails for .pdfbox.cache

2015-12-01 Thread simon steiner (JIRA)
simon steiner created PDFBOX-3145:
-

 Summary: Security manager fails for .pdfbox.cache
 Key: PDFBOX-3145
 URL: https://issues.apache.org/jira/browse/PDFBOX-3145
 Project: PDFBox
  Issue Type: Bug
  Components: Rendering
Affects Versions: 2.0.0
Reporter: simon steiner


Caused by: java.security.AccessControlException: access denied ("java.io.FilePermission" "/home/simon/.pdfbox.cache" "read")
    at java.security.AccessControlContext.checkPermission(AccessControlContext.java:457)
    at java.security.AccessController.checkPermission(AccessController.java:884)
    at java.lang.SecurityManager.checkPermission(SecurityManager.java:549)
    at java.lang.SecurityManager.checkRead(SecurityManager.java:888)
    at java.io.File.exists(File.java:814)
    at org.apache.pdfbox.pdmodel.font.FileSystemFontProvider.loadDiskCache(FileSystemFontProvider.java:357)
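
The trace shows File.exists() throwing because the security manager denies read access to the cache file. AccessControlException is a subclass of SecurityException, so one way for code to tolerate this is to treat a denied check like a missing cache. The sketch below only illustrates that idea; CacheFileAccess and isReadable are hypothetical names and this is not necessarily how the issue was resolved in PDFBox.

```java
import java.io.File;

// Hypothetical guard around the file checks that appear in the stack trace.
final class CacheFileAccess {
    /** Returns true only if the file exists and may be read; never throws SecurityException. */
    static boolean isReadable(File file) {
        try {
            return file.exists() && file.canRead();
        } catch (SecurityException e) {
            // A security manager denied FilePermission "read": behave as if there were no cache.
            return false;
        }
    }
}
```

With such a guard the caller would fall back to scanning the fonts without the disk cache instead of failing. Granting the FilePermission in the security policy remains the alternative on the deployment side.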



