[jira] [Closed] (PDFBOX-2688) sun.java2d.Disposer leak when using pdf to image conversion in a server(tomcat)

2016-02-19 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr closed PDFBOX-2688.
---
Resolution: Cannot Reproduce

Closing for lack of feedback.

> sun.java2d.Disposer leak when using pdf to image conversion in a 
> server(tomcat)
> ---
>
> Key: PDFBOX-2688
> URL: https://issues.apache.org/jira/browse/PDFBOX-2688
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 1.8.8
>Reporter: Ankit Khanal
>
> I am running with 6GB of heap space and running PDF to PNG conversion in a 
> servlet container(tomcat). This happens only when running thousands of 
> requests for conversion.
> JVM memory statistics shows heap space never going above 1GB and non-heap 
> memory is also constant but the linux process or windows process seems to 
> consume around 8GB of memory.
> Heap dump shows that the largest object is sun.java2d.Disposer and is around 
> 200MB.
> It seems that the leaked memory is native memory used by java2d and not 
> accounted in the heap memory statistic but this growth of sun.java2d.Disposer 
> memory is proportional to the growth of process memory(linux 'top' command).
> {code}
>   BufferedImage image = null;
>   ByteArrayInputStream pdfStream = getpdfbytesfromExistingDoc();
>   PDDocument document = null;
>   PDPage page = null;
>   COSDocument cosDoc = null;
>   PDFParser parser = null;
>   try {
>       parser = new PDFParser(pdfStream);
>       parser.parse();
>       cosDoc = parser.getDocument();
>       document = new PDDocument(cosDoc);
>       // getAllPages() returns a raw List in 1.8.x, hence the unchecked cast
>       @SuppressWarnings("unchecked")
>       List<PDPage> pages = document.getDocumentCatalog().getAllPages();
>       page = pages.get(0);
>       int imageType = BufferedImage.TYPE_INT_ARGB;
>       image = page.convertToImage(imageType, 72);
>   } finally {
>       if (cosDoc != null) {
>           cosDoc.close();
>       }
>       if (parser != null) {
>           parser.clearResources();
>       }
>       if (document != null) {
>           if (page != null) {
>               page.clear();
>           }
>           document.close();
>       }
>   }
>   return image;
> }
> {code}
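
The symptom reported above (flat heap and non-heap figures while the process keeps growing) can be checked from inside the JVM. The following is a minimal sketch and not part of the report; the class name `MemoryReport` is hypothetical. It reads the figures the JVM tracks through the standard `java.lang.management` API. Memory allocated natively, for example by Java2D, is typically not counted in either number, which would be consistent with the gap the reporter sees against `top`.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Hypothetical helper, not part of PDFBox or of the reporter's code.
class MemoryReport {
    private static final MemoryMXBean MEMORY = ManagementFactory.getMemoryMXBean();

    // Bytes currently used on the Java heap.
    static long heapUsedBytes() {
        return MEMORY.getHeapMemoryUsage().getUsed();
    }

    // Bytes used by JVM-managed non-heap pools (metaspace, code cache, ...).
    static long nonHeapUsedBytes() {
        return MEMORY.getNonHeapMemoryUsage().getUsed();
    }

    // One-line summary suitable for periodic logging next to the OS process size.
    static String summary() {
        long mb = 1024L * 1024L;
        return "heap=" + heapUsedBytes() / mb + "MB, non-heap=" + nonHeapUsedBytes() / mb + "MB";
    }

    public static void main(String[] args) {
        System.out.println(summary());
    }
}
```

Logging this summary after each batch of conversions and comparing it with the resident size of the process would show whether the growth happens outside the JVM-managed pools.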



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[jira] [Commented] (PDFBOX-3233) Create default resources with cache

2016-02-19 Thread Isto Nikula (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154050#comment-15154050
 ] 

Isto Nikula commented on PDFBOX-3233:
-

Using a resource cache seems to improve performance quite a lot, at least for 
some PDFs: I made a test case demonstrating this here 
https://github.com/istonikula/pdfbox-3233. When running without a cache the 
test takes around 8s to complete, compared to 0.8s with a cache.

I was running the test on OSX El Capitan, 2.2 GHz i7, with Oracle JDK 1.8.0_51-b16.

> Create default resources with cache
> ---
>
> Key: PDFBOX-3233
> URL: https://issues.apache.org/jira/browse/PDFBOX-3233
> Project: PDFBox
>  Issue Type: Wish
>  Components: AcroForm
>Affects Versions: 2.0.0
>Reporter: Isto Nikula
>Priority: Minor
>  Labels: newbie
>
> NOTE: actual version is 2.0.0-RC3
> In PDAcroForm#getDefaultResources existing resources are created like this:
> {code}
> COSDictionary dr = (COSDictionary) dictionary.getDictionaryObject(COSName.DR);
> if (dr != null)
> {
>     retval = new PDResources(dr);
> }
> {code}
> PDResources supports a resource cache, but the default resources are always 
> created without one.
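
To illustrate why a cache can make such a difference, here is a minimal, self-contained sketch. It is an assumption-laden analogy, not the PDFBox `ResourceCache` API: the class `SimpleResourceCache` and its methods are hypothetical. The point is only that an expensive object (a parsed font, say) is built once per key and reused on later lookups instead of being rebuilt every time.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for a resource cache; NOT the PDFBox ResourceCache API.
class SimpleResourceCache<K, V> {
    private final Map<K, V> cache = new HashMap<>();
    private int loads = 0;

    // Returns the cached value for key, building it with loader only on the first request.
    V get(K key, Function<K, V> loader) {
        V value = cache.get(key);
        if (value == null) {
            value = loader.apply(key);
            loads++;
            cache.put(key, value);
        }
        return value;
    }

    // Number of times the loader actually had to run.
    int loadCount() {
        return loads;
    }
}
```

Without such a cache every lookup would call the loader again, which would match the roughly tenfold difference reported in the comment above if building the resources dominates the run time.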






[jira] [Created] (PDFBOX-3240) Missing Type for standard type 1 fonts

2016-02-19 Thread Andrea Vacondio (JIRA)
Andrea Vacondio created PDFBOX-3240:
---

 Summary: Missing Type for standard type 1 fonts
 Key: PDFBOX-3240
 URL: https://issues.apache.org/jira/browse/PDFBOX-3240
 Project: PDFBox
  Issue Type: Bug
  Components: PDModel
Affects Versions: 2.0.0
Reporter: Andrea Vacondio
Priority: Trivial


The org.apache.pdfbox.pdmodel.font.PDFont(String baseFont) constructor, used to 
create standard type 1 fonts, doesn't set the required Type entry in the 
dictionary. According to table 111 it's required, so I guess it should set it.
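
As a rough sketch of what the complete dictionary would contain, the following models a font dictionary as a plain map. This is an illustration only and not the PDFBox `COSDictionary` API; the class `Type1FontDictSketch` is hypothetical. The three entries shown are the Type, Subtype and BaseFont entries of a Type 1 font dictionary.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of a font dictionary as a plain map; not the PDFBox COSDictionary API.
class Type1FontDictSketch {
    // Builds the entries for a standard Type 1 font, including the Type entry.
    static Map<String, String> create(String baseFont) {
        Map<String, String> dict = new LinkedHashMap<>();
        dict.put("Type", "Font");
        dict.put("Subtype", "Type1");
        dict.put("BaseFont", baseFont);
        return dict;
    }

    // True if the dictionary carries the Type entry with the value Font.
    static boolean hasRequiredType(Map<String, String> dict) {
        return "Font".equals(dict.get("Type"));
    }
}
```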






[jira] [Assigned] (PDFBOX-3240) Missing Type for standard type 1 fonts

2016-02-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler reassigned PDFBOX-3240:
--

Assignee: Andreas Lehmkühler

> Missing Type for standard type 1 fonts
> --
>
> Key: PDFBOX-3240
> URL: https://issues.apache.org/jira/browse/PDFBOX-3240
> Project: PDFBox
>  Issue Type: Bug
>  Components: PDModel
>Affects Versions: 2.0.0
>Reporter: Andrea Vacondio
>Assignee: Andreas Lehmkühler
>Priority: Trivial
>
> The org.apache.pdfbox.pdmodel.font.PDFont(String baseFont) constructor, used 
> to create standard type 1 fonts, doesn't set the required Type entry in the 
> dictionary. According to table 111 it's required, so I guess it should set it.






[jira] [Commented] (PDFBOX-3240) Missing Type for standard type 1 fonts

2016-02-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154443#comment-15154443
 ] 

ASF subversion and git services commented on PDFBOX-3240:
-

Commit 1731268 from [~lehmi] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1731268 ]

PDFBOX-3240: added missing TYPE value to font dictionary as proposed by Andrea 
Vacondio

> Missing Type for standard type 1 fonts
> --
>
> Key: PDFBOX-3240
> URL: https://issues.apache.org/jira/browse/PDFBOX-3240
> Project: PDFBox
>  Issue Type: Bug
>  Components: PDModel
>Affects Versions: 2.0.0
>Reporter: Andrea Vacondio
>Assignee: Andreas Lehmkühler
>Priority: Trivial
>
> The org.apache.pdfbox.pdmodel.font.PDFont(String baseFont) constructor, used 
> to create standard type 1 fonts, doesn't set the required Type entry in the 
> dictionary. According to table 111 it's required, so I guess it should set it.






[jira] [Resolved] (PDFBOX-3240) Missing Type for standard type 1 fonts

2016-02-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler resolved PDFBOX-3240.

   Resolution: Fixed
Fix Version/s: 2.0.0

Done, [~torakiki] thanks for the hint

> Missing Type for standard type 1 fonts
> --
>
> Key: PDFBOX-3240
> URL: https://issues.apache.org/jira/browse/PDFBOX-3240
> Project: PDFBox
>  Issue Type: Bug
>  Components: PDModel
>Affects Versions: 2.0.0
>Reporter: Andrea Vacondio
>Assignee: Andreas Lehmkühler
>Priority: Trivial
> Fix For: 2.0.0
>
>
> The org.apache.pdfbox.pdmodel.font.PDFont(String baseFont) constructor, used 
> to create standard type 1 fonts, doesn't set the required Type entry in the 
> dictionary. According to table 111 it's required, so I guess it should set it.






[jira] [Commented] (PDFBOX-3239) PDFTextStripper parses document as empty

2016-02-19 Thread JIRA

[ 
https://issues.apache.org/jira/browse/PDFBOX-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154469#comment-15154469
 ] 

Andreas Lehmkühler commented on PDFBOX-3239:


The PDF doesn't provide any information on how to map the unreadable encoding 
to something readable.

{quote}
 If I open the document with OSX preview,
{quote}
It doesn't work on OSX either, at least on mine. Are you sure that you are 
using the same PDF? Which tool did you use on Debian?



> PDFTextStripper parses document as empty
> 
>
> Key: PDFBOX-3239
> URL: https://issues.apache.org/jira/browse/PDFBOX-3239
> Project: PDFBox
>  Issue Type: Bug
>Affects Versions: 1.8.11, 2.0.0
> Environment: OSX 10.10.5, JAVA8
>Reporter: Vojtech Knyttl
>
> The document is parsed as empty with new lines at the end of each page.
> http://pub.goout.cz/malformed_parse.pdf






[jira] [Commented] (PDFBOX-2983) Corrupted PDF after adding text

2016-02-19 Thread JIRA

[ 
https://issues.apache.org/jira/browse/PDFBOX-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154480#comment-15154480
 ] 

Andreas Lehmkühler commented on PDFBOX-2983:


I came to the conclusion that it would be tricky to backport those changes. As 
2.0.0 will be released quite soon, I don't want to put that much effort into 
improving the 1.8 branch.

> Corrupted PDF after adding text
> ---
>
> Key: PDFBOX-2983
> URL: https://issues.apache.org/jira/browse/PDFBOX-2983
> Project: PDFBox
>  Issue Type: Bug
>  Components: Writing
>Affects Versions: 1.8.9, 1.8.10
>Reporter: Brian Schmoll
>Assignee: Andreas Lehmkühler
> Attachments: PdfDataLoss.zip, PdfStampingLogging.zip
>
>
> We have a web application which writes an official stamp to PDF documents 
> after they have been approved.  Recently some PDFs have become corrupted 
> after the stamp is written to the document.  The stamp appears on the 
> document, but all other content is removed.  Adobe Reader also displays a 
> dialogue box indicating the document has been corrupted.






[jira] [Closed] (PDFBOX-2983) Corrupted PDF after adding text

2016-02-19 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler closed PDFBOX-2983.
--
Resolution: Won't Fix

> Corrupted PDF after adding text
> ---
>
> Key: PDFBOX-2983
> URL: https://issues.apache.org/jira/browse/PDFBOX-2983
> Project: PDFBox
>  Issue Type: Bug
>  Components: Writing
>Affects Versions: 1.8.9, 1.8.10
>Reporter: Brian Schmoll
>Assignee: Andreas Lehmkühler
> Attachments: PdfDataLoss.zip, PdfStampingLogging.zip
>
>
> We have a web application which writes an official stamp to PDF documents 
> after they have been approved.  Recently some PDFs have become corrupted 
> after the stamp is written to the document.  The stamp appears on the 
> document, but all other content is removed.  Adobe Reader also displays a 
> dialogue box indicating the document has been corrupted.






[jira] [Commented] (PDFBOX-2852) Improve code quality (2)

2016-02-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154675#comment-15154675
 ] 

ASF subversion and git services commented on PDFBOX-2852:
-

Commit 1731286 from [~tilman] in branch 'pdfbox/branches/1.8'
[ https://svn.apache.org/r1731286 ]

PDFBOX-2852: clarify javadoc

> Improve code quality (2)
> 
>
> Key: PDFBOX-2852
> URL: https://issues.apache.org/jira/browse/PDFBOX-2852
> Project: PDFBox
>  Issue Type: Task
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
> Attachments: winansiencoding.patch, winansiencoding2.patch
>
>
> This is a long-term issue for the task of improving code quality by using the 
> [SonarQube 
> report|https://analysis.apache.org/dashboard/index/org.apache.pdfbox:pdfbox-reactor],
>  hints in different IDEs, the FindBugs tool and other code quality tools.
> This is a follow-up of PDFBOX-2576, which was getting too long.






[jira] [Commented] (PDFBOX-3030) Enhance documentation for PDFBox 2.0.0

2016-02-19 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154684#comment-15154684
 ] 

Tilman Hausherr commented on PDFBOX-3030:
-

2.0 for the FAQ or the migration guide:

Why was the ReplaceText example removed?

Because it gave the incorrect illusion that text can be replaced easily. Words 
are often split, as seen by this excerpt of a content stream:
{code}
[ (Do) -29 (c) -1 (umen) 30 (tation) ] TJ
{code}

Other problems will appear with font subsets: for example, if only the glyphs 
for a, b and c are used, these would be encoded as hex 0, 1 and 2, so you won't 
find "abc". Additionally, you can't replace "c" with "d" because it isn't part 
of the subset.

You could also have problems with ligatures, e.g. "ff", "fl", "fi", "ffi", 
"ffl", which can be represented by a single code in many fonts.

To understand this yourself, view any file with PDFDebugger and have a look at 
the "Contents" entry of a page.
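
The splitting described above can be made concrete with a small sketch. It is illustrative only and not PDFBox code: the class `TjArrayDemo` is hypothetical, and the regular expression handles only simple literal strings without escapes or nested parentheses. It extracts the string operands of the TJ array shown in the excerpt and shows that no single operand contains a word such as "Document", even though the operands joined together read "Documentation".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: handles simple literal strings without escapes or nested parentheses.
class TjArrayDemo {
    private static final Pattern LITERAL = Pattern.compile("\\(([^()]*)\\)");

    // Extracts the string operands of a TJ array, ignoring the numeric adjustments.
    static List<String> fragments(String tjArray) {
        List<String> parts = new ArrayList<>();
        Matcher m = LITERAL.matcher(tjArray);
        while (m.find()) {
            parts.add(m.group(1));
        }
        return parts;
    }

    // True if any single fragment contains the search word.
    static boolean anyFragmentContains(String tjArray, String word) {
        for (String part : fragments(tjArray)) {
            if (part.contains(word)) {
                return true;
            }
        }
        return false;
    }

    // Concatenates all fragments in order.
    static String joined(String tjArray) {
        return String.join("", fragments(tjArray));
    }
}
```

Joining the fragments only yields readable text here because the example uses plain character codes. With a subset font, as described above, the codes would not correspond to the letters at all, so even the joined string could not be searched.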

> Enhance documentation for PDFBox 2.0.0
> --
>
> Key: PDFBOX-3030
> URL: https://issues.apache.org/jira/browse/PDFBOX-3030
> Project: PDFBox
>  Issue Type: Task
>  Components: Documentation
>Affects Versions: 2.0.0
>Reporter: Maruan Sahyoun
>Assignee: Maruan Sahyoun
> Attachments: TGH-16862c48-6b0b-410e-8fc6-b1d9f4418ecc.htm
>
>
> Task to track enhancements to the documentation or website as part of PDFBox 
> 2.0.0
> - update javadoc (current as of writing)
> - migration guide 






[jira] [Comment Edited] (PDFBOX-3030) Enhance documentation for PDFBox 2.0.0

2016-02-19 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154684#comment-15154684
 ] 

Tilman Hausherr edited comment on PDFBOX-3030 at 2/19/16 7:11 PM:
--

2.0 for the FAQ or the migration guide:

Why was the ReplaceText example removed?

Because it gave the incorrect illusion that text can be replaced easily. Words 
are often split, as seen by this excerpt of a content stream:
{code}
[ (Do) -29 (c) -1 (umen) 30 (tation) ] TJ
{code}

Other problems will appear with font subsets: for example, if only the glyphs 
for a, b and c are used, these would be encoded as hex 0, 1 and 2, so you won't 
find "abc". Additionally, you can't replace "c" with "d" because it isn't part 
of the subset.

You could also have problems with ligatures, e.g. "ff", "fl", "fi", "ffi", 
"ffl", which can be represented by a single code in many fonts.

To understand this yourself, view any file with PDFDebugger and have a look at 
the "Contents" entry of a page.

reason why:
https://stackoverflow.com/questions/35420609/pdfbox-2-0-rc3-find-and-replace-text


was (Author: tilman):
2.0 for the FAQ or the migration guide:

Why was the ReplaceText example removed?

Because it gave the incorrect illusion that text can be replaced easily. Words 
are often split, as seen by this excerpt of a content stream:
{code}
[ (Do) -29 (c) -1 (umen) 30 (tation) ] TJ
{code}

Other problems will appear with font subsets: for example, if only the glyphs 
for a, b and c are used, these would be encoded as hex 0, 1 and 2, so you won't 
find "abc". Additionally, you can't replace "c" with "d" because it isn't part 
of the subset.

You could also have problems with ligatures, e.g. "ff", "fl", "fi", "ffi", 
"ffl", which can be represented by a single code in many fonts.

To understand this yourself, view any file with PDFDebugger and have a look at 
the "Contents" entry of a page.

> Enhance documentation for PDFBox 2.0.0
> --
>
> Key: PDFBOX-3030
> URL: https://issues.apache.org/jira/browse/PDFBOX-3030
> Project: PDFBox
>  Issue Type: Task
>  Components: Documentation
>Affects Versions: 2.0.0
>Reporter: Maruan Sahyoun
>Assignee: Maruan Sahyoun
> Attachments: TGH-16862c48-6b0b-410e-8fc6-b1d9f4418ecc.htm
>
>
> Task to track enhancements to the documentation or website as part of PDFBox 
> 2.0.0
> - update javadoc (current as of writing)
> - migration guide 






[jira] [Commented] (PDFBOX-1594) Add support for AES256 Encryption

2016-02-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154712#comment-15154712
 ] 

ASF subversion and git services commented on PDFBOX-1594:
-

Commit 1731289 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1731289 ]

PDFBOX-1594: fix operator precedence

> Add support for AES256 Encryption 
> --
>
> Key: PDFBOX-1594
> URL: https://issues.apache.org/jira/browse/PDFBOX-1594
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Crypto
>Reporter: Maruan Sahyoun
>Assignee: John Hewson
>  Labels: AES256
> Fix For: 2.0.0
>
> Attachments: fix-pdfbox-2.0.0-encrypt.diff, pdfbox-1.8.4-aes256.diff, 
> pdfbox-2.0.0-r1580297-aes256.diff
>
>
> Adobe 9 added support for AES 256 encryption. Further information is 
> available at  
> http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/adobe_supplement_iso32000.pdf
>  (specially 3.5.1) or ISO 32000-2.






[jira] [Commented] (PDFBOX-1594) Add support for AES256 Encryption

2016-02-19 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154714#comment-15154714
 ] 

Tilman Hausherr commented on PDFBOX-1594:
-

See Java operator precedence here: 
https://docs.oracle.com/javase/tutorial/java/nutsandbolts/operators.html

IMHO it is better to use "too many" parentheses than to try to properly 
remember these.
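
A generic illustration of the kind of surprise meant here (these expressions are an assumption for demonstration and are not taken from the PDFBox commit): in Java the additive operators bind more tightly than the shift operators, so the two methods below return different values. The class name `PrecedenceDemo` is hypothetical.

```java
// Generic illustration; these expressions are not taken from the PDFBox commit.
class PrecedenceDemo {
    // '+' binds more tightly than '<<', so this is evaluated as 1 << (2 + 3).
    static int withoutParentheses() {
        return 1 << 2 + 3;
    }

    // Explicit parentheses make the intended grouping obvious.
    static int withParentheses() {
        return (1 << 2) + 3;
    }
}
```

The first method returns 32 and the second returns 7, which is why explicit parentheses are the safer habit.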

> Add support for AES256 Encryption 
> --
>
> Key: PDFBOX-1594
> URL: https://issues.apache.org/jira/browse/PDFBOX-1594
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Crypto
>Reporter: Maruan Sahyoun
>Assignee: John Hewson
>  Labels: AES256
> Fix For: 2.0.0
>
> Attachments: fix-pdfbox-2.0.0-encrypt.diff, pdfbox-1.8.4-aes256.diff, 
> pdfbox-2.0.0-r1580297-aes256.diff
>
>
> Adobe 9 added support for AES 256 encryption. Further information is 
> available at  
> http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/adobe_supplement_iso32000.pdf
>  (specially 3.5.1) or ISO 32000-2.






[jira] [Comment Edited] (PDFBOX-1594) Add support for AES256 Encryption

2016-02-19 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-1594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154714#comment-15154714
 ] 

Tilman Hausherr edited comment on PDFBOX-1594 at 2/19/16 7:24 PM:
--

See Java operator precedence here: 
https://docs.oracle.com/javase/tutorial/java/nutsandbolts/operators.html

IMHO it is better to use "too many" parentheses than to try to properly 
remember these rules.


was (Author: tilman):
See Java operator precedence here: 
https://docs.oracle.com/javase/tutorial/java/nutsandbolts/operators.html

IMHO it is better to use "too many" parentheses than to try to properly 
remember these.

> Add support for AES256 Encryption 
> --
>
> Key: PDFBOX-1594
> URL: https://issues.apache.org/jira/browse/PDFBOX-1594
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Crypto
>Reporter: Maruan Sahyoun
>Assignee: John Hewson
>  Labels: AES256
> Fix For: 2.0.0
>
> Attachments: fix-pdfbox-2.0.0-encrypt.diff, pdfbox-1.8.4-aes256.diff, 
> pdfbox-2.0.0-r1580297-aes256.diff
>
>
> Adobe 9 added support for AES 256 encryption. Further information is 
> available at  
> http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/adobe_supplement_iso32000.pdf
>  (specially 3.5.1) or ISO 32000-2.






[jira] [Commented] (PDFBOX-2852) Improve code quality (2)

2016-02-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15154717#comment-15154717
 ] 

ASF subversion and git services commented on PDFBOX-2852:
-

Commit 1731290 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1731290 ]

PDFBOX-2852: add some comments, improve error msg

> Improve code quality (2)
> 
>
> Key: PDFBOX-2852
> URL: https://issues.apache.org/jira/browse/PDFBOX-2852
> Project: PDFBox
>  Issue Type: Task
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
> Attachments: winansiencoding.patch, winansiencoding2.patch
>
>
> This is a long-term issue for the task of improving code quality by using the 
> [SonarQube 
> report|https://analysis.apache.org/dashboard/index/org.apache.pdfbox:pdfbox-reactor],
>  hints in different IDEs, the FindBugs tool and other code quality tools.
> This is a follow-up of PDFBOX-2576, which was getting too long.






[jira] [Updated] (PDFBOX-2941) Improve PDFDebugger (2)

2016-02-19 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-2941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr updated PDFBOX-2941:

Description: 
This is a follow-up issue to PDFBOX-2530 to implement extra ideas that came up 
in GSoC2015, ideas that were not implemented due to lack of time, and new ideas.
- save modified PDFs
- refactor PDFDebugger.java
- render glyphs of fonts
- editing in hex viewer
- ✓ refactor StreamPane to share stream filtering among Text view and hex view
- ✓ password dialog when hitting protected PDF
- remove nodes (e.g. elements from a COSDictionary)
- show "pretty" XML
- delete array or dictionary elements
- edit & keep content streams
- load content streams
- display filtered streams even if the unfiltered stream is corrupt 
(PDFBOX-2976)
- ✓ display the "caused by" part of the exception stack trace (nested exceptions)
- keep zoom
- integrate DrawPrintTextLocations into rendering
- integrate area text extraction with a mouse-created rectangle that shows the 
coordinates in a status line
- show permission flags of {{Encrypt/P}} entry


  was:
This is a follow-up issue to PDFBOX-2530 to implement extra ideas that came up 
in GSoC2015, ideas that were not implemented due to lack of time, and new ideas.
- save modified PDFs
- refactor PDFDebugger.java
- render glyphs of fonts
- editing in hex viewer
- ✓ refactor StreamPane to share stream filtering among Text view and hex view
- ✓ password dialog when hitting protected PDF
- remove nodes (e.g. elements from a COSDictionary)
- show "pretty" XML
- delete array or dictionary elements
- edit & keep content streams
- load content streams
- display filtered streams even if the unfiltered stream is corrupt 
(PDFBOX-2976)
- ✓ display the "caused by" part of the exception stack trace (nested exceptions)
- keep zoom
- integrate DrawPrintTextLocations into rendering
- integrate area text extraction with a mouse-created rectangle that shows the 
coordinates in a status line



> Improve PDFDebugger (2)
> ---
>
> Key: PDFBOX-2941
> URL: https://issues.apache.org/jira/browse/PDFBOX-2941
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Utilities
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
> Attachments: gs-bugzilla694570.pdf, osx-tabs.png, 
> screenshot_debugger_new.png, screenshot_debugger_not_aligned.png, 
> screenshot_debugger_old.png, screenshot_w7_fontsize.png, 
> separate_filter_choice_from_text_hex_views.diff, sonar_qube_resolve.diff, 
> sonar_qube_resolve_25_08.diff
>
>
> This is a follow-up issue to PDFBOX-2530 to implement extra ideas that came 
> up in GSoC2015, ideas that were not implemented due to lack of time, and new 
> ideas.
> - save modified PDFs
> - refactor PDFDebugger.java
> - render glyphs of fonts
> - editing in hex viewer
> - ✓ refactor StreamPane to share stream filtering among Text view and hex view
> - ✓ password dialog when hitting protected PDF
> - remove nodes (e.g. elements from a COSDictionary)
> - show "pretty" XML
> - delete array or dictionary elements
> - edit & keep content streams
> - load content streams
> - display filtered streams even if the unfiltered stream is corrupt 
> (PDFBOX-2976)
> - ✓ display the "caused by" part of the exception stack trace (nested exceptions)
> - keep zoom
> - integrate DrawPrintTextLocations into rendering
> - integrate area text extraction with a mouse-created rectangle that shows 
> the coordinates in a status line
> - show permission flags of {{Encrypt/P}} entry






[jira] [Created] (PDFBOX-3241) return original PDF Header

2016-02-19 Thread Tilman Hausherr (JIRA)
Tilman Hausherr created PDFBOX-3241:
---

 Summary: return original PDF Header
 Key: PDFBOX-3241
 URL: https://issues.apache.org/jira/browse/PDFBOX-3241
 Project: PDFBox
  Issue Type: Wish
  Components: Parsing
Affects Versions: 1.8.11
Reporter: Tilman Hausherr
Assignee: Tilman Hausherr
 Fix For: 1.8.12


Wish by [~abyss] presented in the mailing list
{quote}
Yes, I know that the version in the catalog shall be used to determine the
version, and therefore the result of the COSDocument#getVersion() method is
expected to reflect that. But I am asking about the header string, and the
returned value differs from the actual header string in the file after
PDFParser finishes its job.

Please also bear in mind that Extensions Dictionary (see ISO 32000-1
chapter 7.12) validation should consider the values in both the document
catalog and the header:
"The value of BaseVersion, when treated as a version number, shall be less
than or equal to the PDF version, both in the document header (see 7.5.2,
"File Header") and the catalog Version key value, if present."

Since it says "both", BaseVersion may exceed neither the value in the header
nor the value in the catalog, so we need to validate that. 
{quote}
my answer:

How about something like this:
{code}
private String originalHeaderString = null;

public void setOriginalHeaderString(String header)
{
    // the header may only be recorded once
    if (originalHeaderString != null)
    {
        throw new IllegalStateException("original header string has already been set");
    }
    originalHeaderString = header;
}

public String getOriginalHeaderString()
{
    return originalHeaderString;
}
{code}
The setter should be called only once by parseHeader().

This was accepted, so I'll implement it (for 1.8 only)
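
The validation that motivates this wish could look roughly like the following sketch. It is an assumption for illustration and not PDFBox code: the class `BaseVersionCheck` and its methods are hypothetical, and versions are treated as simple floats. It checks that BaseVersion exceeds neither the version in the original header string nor the catalog Version value, if present.

```java
// Hypothetical helper; the names are illustrative and not part of PDFBox.
class BaseVersionCheck {
    // Parses a header line such as "%PDF-1.4" into its numeric version.
    static float headerVersion(String header) {
        String prefix = "%PDF-";
        if (!header.startsWith(prefix)) {
            throw new IllegalArgumentException("not a PDF header: " + header);
        }
        return Float.parseFloat(header.substring(prefix.length()).trim());
    }

    // catalogVersion may be null when the catalog has no Version key.
    static boolean isBaseVersionValid(float baseVersion, String header, Float catalogVersion) {
        if (baseVersion > headerVersion(header)) {
            return false;
        }
        return catalogVersion == null || baseVersion <= catalogVersion;
    }
}
```

A check like this needs the header exactly as it appeared in the file, which is why the original header string has to be kept.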






[jira] [Commented] (PDFBOX-3241) return original PDF Header

2016-02-19 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15155039#comment-15155039
 ] 

ASF subversion and git services commented on PDFBOX-3241:
-

Commit 1731305 from [~tilman] in branch 'pdfbox/branches/1.8'
[ https://svn.apache.org/r1731305 ]

PDFBOX-3241: return original header string

> return original PDF Header
> --
>
> Key: PDFBOX-3241
> URL: https://issues.apache.org/jira/browse/PDFBOX-3241
> Project: PDFBox
>  Issue Type: Wish
>  Components: Parsing
>Affects Versions: 1.8.11
>Reporter: Tilman Hausherr
>Assignee: Tilman Hausherr
> Fix For: 1.8.12
>
>
> Wish by [~abyss] presented in the mailing list
> {quote}
> Yes, I know that the version in the catalog shall be used to determine the
> version, and therefore the result of the COSDocument#getVersion() method is
> expected to reflect that. But I am asking about the header string, and the
> returned value differs from the actual header string in the file after
> PDFParser finishes its job.
> Please also bear in mind that Extensions Dictionary (see ISO 32000-1
> chapter 7.12) validation should consider the values in both the document
> catalog and the header:
> "The value of BaseVersion, when treated as a version number, shall be less
> than or equal to the PDF version, both in the document header (see 7.5.2,
> "File Header") and the catalog Version key value, if present."
> Since it says "both", BaseVersion may exceed neither the value in the header
> nor the value in the catalog, so we need to validate that. 
> {quote}
> my answer:
> How about something like this:
> {code}
> private String originalHeaderString = null;
> public void setOriginalHeaderString(String header)
> {
>     // the header may only be recorded once
>     if (originalHeaderString != null)
>     {
>         throw new IllegalStateException("original header string has already been set");
>     }
>     originalHeaderString = header;
> }
> public String getOriginalHeaderString()
> {
>     return originalHeaderString;
> }
> {code}
> The setter should be called only once by parseHeader().
> This was accepted, so I'll implement it (for 1.8 only)






[jira] [Commented] (PDFBOX-3241) return original PDF Header

2016-02-19 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15155043#comment-15155043
 ] 

Tilman Hausherr commented on PDFBOX-3241:
-

[~abyss] I committed a simpler change. Please give feedback whether this 
is helpful, or whether you prefer that I return the original version as a float.

> return original PDF Header
> --
>
> Key: PDFBOX-3241
> URL: https://issues.apache.org/jira/browse/PDFBOX-3241
> Project: PDFBox
>  Issue Type: Wish
>  Components: Parsing
>Affects Versions: 1.8.11
>Reporter: Tilman Hausherr
>Assignee: Tilman Hausherr
> Fix For: 1.8.12
>
>
> Wish by [~abyss] presented in the mailing list
> {quote}
> Yes, I know that the version in the catalog shall be used to determine the
> version, and therefore the result of the COSDocument#getVersion() method is
> expected to reflect that. But I am asking about the header string, and the
> returned value differs from the actual header string in the file after
> PDFParser finishes its job.
> Please also bear in mind that Extensions Dictionary (see ISO 32000-1
> chapter 7.12) validation should consider the values in both the document
> catalog and the header:
> "The value of BaseVersion, when treated as a version number, shall be less
> than or equal to the PDF version, both in the document header (see 7.5.2,
> "File Header") and the catalog Version key value, if present."
> Since it says "both", BaseVersion may exceed neither the value in the header
> nor the value in the catalog, so we need to validate that. 
> {quote}
> my answer:
> How about something like this:
> {code}
> private String originalHeaderString = null;
> public void setOriginalHeaderString(String header)
> {
>     // the header may only be recorded once
>     if (originalHeaderString != null)
>     {
>         throw new IllegalStateException("original header string has already been set");
>     }
>     originalHeaderString = header;
> }
> public String getOriginalHeaderString()
> {
>     return originalHeaderString;
> }
> {code}
> The setter should be called only once by parseHeader().
> This was accepted, so I'll implement it (for 1.8 only)






[jira] [Comment Edited] (PDFBOX-3241) return original PDF Header

2016-02-19 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15155043#comment-15155043
 ] 

Tilman Hausherr edited comment on PDFBOX-3241 at 2/19/16 10:44 PM:
---

[~abyss] I committed a simpler change. Please give feedback whether this 
is helpful, or whether you prefer that I return the original header version as 
a float.


was (Author: tilman):
[~abyss] I committed a simpler change. Please give feedback whether this 
is helpful, or whether you prefer that I return the original version as a float.

> return original PDF Header
> --
>
> Key: PDFBOX-3241
> URL: https://issues.apache.org/jira/browse/PDFBOX-3241
> Project: PDFBox
>  Issue Type: Wish
>  Components: Parsing
>Affects Versions: 1.8.11
>Reporter: Tilman Hausherr
>Assignee: Tilman Hausherr
> Fix For: 1.8.12
>
>
> Wish by [~abyss] presented in the mailing list
> {quote}
> Yes, I know that the version in the catalog shall be used to determine the
> version, and therefore the result of the COSDocument#getVersion() method is
> expected to reflect that. But I am asking about the header string, and the
> returned value differs from the actual header string in the file after
> PDFParser finishes its job.
> Please also bear in mind that Extensions Dictionary (see ISO 32000-1
> chapter 7.12) validation should consider the values in both the document
> catalog and the header:
> "The value of BaseVersion, when treated as a version number, shall be less
> than or equal to the PDF version, both in the document header (see 7.5.2,
> "File Header") and the catalog Version key value, if present."
> Since it says "both", BaseVersion may exceed neither the value in the header
> nor the value in the catalog, so we need to validate that. 
> {quote}
> my answer:
> How about something like this:
> {code}
> private String originalHeaderString = null;
> public void setOriginalHeaderString(String header)
> {
>     // the header may only be recorded once
>     if (originalHeaderString != null)
>     {
>         throw new IllegalStateException("original header string has already been set");
>     }
>     originalHeaderString = header;
> }
> public String getOriginalHeaderString()
> {
>     return originalHeaderString;
> }
> {code}
> The setter should be called only once by parseHeader().
> This was accepted, so I'll implement it (for 1.8 only)


