[jira] [Resolved] (PDFBOX-3541) Use /L entry to determine if a linearized file shall be treated as such for PDF/A validation
[ https://issues.apache.org/jira/browse/PDFBOX-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr resolved PDFBOX-3541.
    Resolution: Fixed
    Assignee: Tilman Hausherr

> Use /L entry to determine if a linearized file shall be treated as such for PDF/A validation
>
> Key: PDFBOX-3541
> URL: https://issues.apache.org/jira/browse/PDFBOX-3541
> Project: PDFBox
> Issue Type: Improvement
> Components: Preflight
> Affects Versions: 2.0.3
> Reporter: Maruan Sahyoun
> Assignee: Tilman Hausherr
> Priority: Minor
> Fix For: 2.0.4, 2.1.0
>
> With PDFBOX-3540 the detection of a linearized file which has later been updated for PDF/A validation was improved so that provisions can be properly applied or ignored. That could be improved by checking the /L entry of the linearization dictionary. The *ISO 19005-1:2005/Cor.2:2011* has this:
> {quote}
> In a linearized PDF, if the ID keyword is present in both the first page trailer dictionary and the last trailer dictionary, the value to both instances of the ID keyword shall be identical.
> ...
> This provision shall not apply where the value to the L key in the linearization dictionary does not match the actual length of the PDF.
> {quote}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org
[jira] [Updated] (PDFBOX-3541) Use /L entry to determine if a linearized file shall be treated as such for PDF/A validation
[ https://issues.apache.org/jira/browse/PDFBOX-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr updated PDFBOX-3541:
    Affects Version/s: 2.0.3

> Use /L entry to determine if a linearized file shall be treated as such for PDF/A validation
>
> Key: PDFBOX-3541
> URL: https://issues.apache.org/jira/browse/PDFBOX-3541
> Project: PDFBox
> Issue Type: Improvement
> Components: Preflight
> Affects Versions: 2.0.3
> Reporter: Maruan Sahyoun
> Priority: Minor
> Fix For: 2.0.4, 2.1.0
>
> With PDFBOX-3540 the detection of a linearized file which has later been updated for PDF/A validation was improved so that provisions can be properly applied or ignored. That could be improved by checking the /L entry of the linearization dictionary. The *ISO 19005-1:2005/Cor.2:2011* has this:
> {quote}
> In a linearized PDF, if the ID keyword is present in both the first page trailer dictionary and the last trailer dictionary, the value to both instances of the ID keyword shall be identical.
> ...
> This provision shall not apply where the value to the L key in the linearization dictionary does not match the actual length of the PDF.
> {quote}
[jira] [Commented] (PDFBOX-3541) Use /L entry to determine if a linearized file shall be treated as such for PDF/A validation
[ https://issues.apache.org/jira/browse/PDFBOX-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15608970#comment-15608970 ]

ASF subversion and git services commented on PDFBOX-3541:

Commit 1766703 from [~tilman] in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1766703 ]

PDFBOX-3541: also use length to determine whether it's a linearized file

> Use /L entry to determine if a linearized file shall be treated as such for PDF/A validation
>
> Key: PDFBOX-3541
> URL: https://issues.apache.org/jira/browse/PDFBOX-3541
> Project: PDFBox
> Issue Type: Improvement
> Components: Preflight
> Reporter: Maruan Sahyoun
> Priority: Minor
> Fix For: 2.0.4, 2.1.0
>
> With PDFBOX-3540 the detection of a linearized file which has later been updated for PDF/A validation was improved so that provisions can be properly applied or ignored. That could be improved by checking the /L entry of the linearization dictionary. The *ISO 19005-1:2005/Cor.2:2011* has this:
> {quote}
> In a linearized PDF, if the ID keyword is present in both the first page trailer dictionary and the last trailer dictionary, the value to both instances of the ID keyword shall be identical.
> ...
> This provision shall not apply where the value to the L key in the linearization dictionary does not match the actual length of the PDF.
> {quote}
[jira] [Commented] (PDFBOX-3541) Use /L entry to determine if a linearized file shall be treated as such for PDF/A validation
[ https://issues.apache.org/jira/browse/PDFBOX-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15608969#comment-15608969 ]

ASF subversion and git services commented on PDFBOX-3541:

Commit 1766702 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1766702 ]

PDFBOX-3541: also use length to determine whether it's a linearized file

> Use /L entry to determine if a linearized file shall be treated as such for PDF/A validation
>
> Key: PDFBOX-3541
> URL: https://issues.apache.org/jira/browse/PDFBOX-3541
> Project: PDFBox
> Issue Type: Improvement
> Components: Preflight
> Reporter: Maruan Sahyoun
> Priority: Minor
> Fix For: 2.0.4, 2.1.0
>
> With PDFBOX-3540 the detection of a linearized file which has later been updated for PDF/A validation was improved so that provisions can be properly applied or ignored. That could be improved by checking the /L entry of the linearization dictionary. The *ISO 19005-1:2005/Cor.2:2011* has this:
> {quote}
> In a linearized PDF, if the ID keyword is present in both the first page trailer dictionary and the last trailer dictionary, the value to both instances of the ID keyword shall be identical.
> ...
> This provision shall not apply where the value to the L key in the linearization dictionary does not match the actual length of the PDF.
> {quote}
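
The rule quoted from ISO 19005-1:2005/Cor.2:2011 comes down to a length comparison. The following is a minimal sketch of that comparison, not the Preflight code from the commits mentioned in this thread. The class name LinearizedLengthCheck and the method idProvisionApplies are hypothetical, and the sketch assumes the /L value has already been read from the linearization dictionary.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for illustration only; not part of PDFBox. */
final class LinearizedLengthCheck {

    private LinearizedLengthCheck() {
    }

    /**
     * Returns true when the /L value of the linearization dictionary matches
     * the actual file length, i.e. when the "identical ID" provision applies.
     */
    static boolean idProvisionApplies(long declaredLength, long actualLength) {
        return declaredLength == actualLength;
    }

    /** Same check, taking the actual length from the file system. */
    static boolean idProvisionApplies(long declaredLength, Path pdf) throws IOException {
        return idProvisionApplies(declaredLength, Files.size(pdf));
    }

    public static void main(String[] args) {
        // A file whose length still equals /L: the provision applies.
        System.out.println(idProvisionApplies(1200L, 1200L));
        // A file that grew after linearization (for example through an
        // incremental update): the provision does not apply.
        System.out.println(idProvisionApplies(1200L, 1500L));
    }
}
```

An incremental update appends data to the file, so the actual length no longer equals /L, and under the quoted rule a validator would then skip the ID comparison.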
[jira] [Commented] (PDFBOX-3542) Can PDFBOX use Streams to read PDSignatures from document?
[ https://issues.apache.org/jira/browse/PDFBOX-3542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15608930#comment-15608930 ]

Tilman Hausherr commented on PDFBOX-3542:

I don't have any better ideas now. PDFBox loads all document structures for signing because it doesn't have "parse on demand". PDFBox must load the document because the signature field + annotation must be appended in a way that conforms with the existing structures.

If you have a non-confidential file I could have a look whether there is some optimization that we missed. But don't expect any miracles, and this may take a few days.

Here's a list of huge files that I have:
{code}
72.168.407 475419.pdf
71.460.416 620038.pdf
50.820.260 302439.pdf
49.749.322 209086.pdf
46.733.747 755045.pdf
38.696.108 503657.pdf
37.580.965 767115.pdf
37.148.455 942416.pdf
36.775.297 364591.pdf
31.845.407 240242.pdf
31.196.981 574442.pdf
30.773.313 560466.pdf
30.397.179 134823.pdf
27.228.247 234570.pdf
26.519.885 071300.pdf
26.338.972 884613.pdf
26.262.215 022391.pdf
25.805.316 160655.pdf
25.465.233 898927.pdf
23.065.331 509787.pdf
22.805.751 125112.pdf
22.381.412 486395.pdf
21.718.527 510488.pdf
21.486.589 586504.pdf
{code}
These files can be found here:
http://digitalcorpora.org/corp/nps/files/govdocs1/zipfiles/
The first three digits of each file name in my list (e.g. 586504.pdf, whose size is 21.486.589 bytes) give the name of the zip file (586.zip).

> Can PDFBOX use Streams to read PDSignatures from document?
>
> Key: PDFBOX-3542
> URL: https://issues.apache.org/jira/browse/PDFBOX-3542
> Project: PDFBox
> Issue Type: Wish
> Components: PDModel
> Affects Versions: 2.0.3
> Reporter: Andrea Paternesi
> Priority: Critical
>
> I did not find a way to avoid loading the whole PDDocument into memory to read the signature dictionaries.
> If you have very big PDF files (30MB or more), Java gets an Out of Memory error.
> Right now I did not find a correct way to load signatures using a stream.
> Can you give any hint?
> Thanks in advance.
> Andrea.
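
The naming rule for the govdocs1 files (first three digits of the PDF name give the zip name) can be written down as a small helper. This is only a sketch: the class GovdocsZipName and its methods are hypothetical, and that each zip sits directly under the directory URL from the message is an assumption.

```java
/** Hypothetical helper for locating a govdocs1 file; not part of PDFBox. */
final class GovdocsZipName {

    /** Directory named in the message above. */
    static final String ZIP_DIRECTORY =
            "http://digitalcorpora.org/corp/nps/files/govdocs1/zipfiles/";

    private GovdocsZipName() {
    }

    /** Maps a file name such as 586504.pdf to its zip name, 586.zip. */
    static String zipNameFor(String pdfName) {
        if (pdfName == null || pdfName.length() < 3) {
            throw new IllegalArgumentException("expected a name such as 586504.pdf");
        }
        String prefix = pdfName.substring(0, 3);
        for (int i = 0; i < prefix.length(); i++) {
            if (!Character.isDigit(prefix.charAt(i))) {
                throw new IllegalArgumentException(
                        "name does not start with three digits: " + pdfName);
            }
        }
        return prefix + ".zip";
    }

    /** Assumption: the zip is stored directly under ZIP_DIRECTORY. */
    static String zipUrlFor(String pdfName) {
        return ZIP_DIRECTORY + zipNameFor(pdfName);
    }

    public static void main(String[] args) {
        System.out.println(zipNameFor("586504.pdf"));
        System.out.println(zipUrlFor("071300.pdf"));
    }
}
```

Leading zeros are kept because the prefix is handled as text, so 071300.pdf maps to 071.zip.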
[jira] [Commented] (PDFBOX-3544) Invalid ByteRange for getContents() method
[ https://issues.apache.org/jira/browse/PDFBOX-3544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15608927#comment-15608927 ]

Tilman Hausherr commented on PDFBOX-3544:

I've renamed the confusing variable names in PDFBOX-2852 ("len" instead of "end").

> Invalid ByteRange for getContents() method
>
> Key: PDFBOX-3544
> URL: https://issues.apache.org/jira/browse/PDFBOX-3544
> Project: PDFBox
> Issue Type: Bug
> Components: Signing
> Affects Versions: 2.0.3
> Reporter: Lonzak
>
> PDSignature.java class, getContents() method, line 325ff.
> {code:title=PDSignature.java|borderStyle=solid}
> /**
>  * Will return the embedded signature between the byterange gap.
>  *
>  * @param pdfFile The signed pdf file as byte array
>  * @return a byte array containing the signature
>  * @throws IOException if the pdfFile can't be read
>  */
> public byte[] getContents(byte[] pdfFile) throws IOException
> {
>     int[] byteRange = getByteRange();
>     int begin = byteRange[0]+byteRange[1]+1;
>     int end = byteRange[2]-begin;
>     return getContents(new COSFilterInputStream(pdfFile,new int[] {begin,end}));
> }
> {code:}
> Let's assume a byte range of
> /ByteRange[ 0, 840, 960, 240]
> The current implementation would return
> {841, 119} which is from *841 - 960*
> According to [adobe|http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/DigitalSignaturesInPDF.pdf] (page 5) this is invalid:
> {quote}
> "In this example, the hash is calculated for bytes 0 through 839, and 960 through 1200."
> {quote}
> Thus the values for the signature should be
> {840, 119} which is from *840 - 959*
> The implementation should be:
> {code:title=PDSignature.java|borderStyle=solid}
> /**
>  * Will return the embedded signature between the byterange gap.
>  *
>  * @param pdfFile The signed pdf file as byte array
>  * @return a byte array containing the signature
>  * @throws IOException if the pdfFile can't be read
>  */
> public byte[] getContents(byte[] pdfFile) throws IOException
> {
>     int[] byteRange = getByteRange();
>     int begin = byteRange[0]+byteRange[1];
>     int end = byteRange[2]-begin-1;
>     return getContents(new COSFilterInputStream(pdfFile,new int[] {begin,end}));
> }
> {code:}
[jira] [Commented] (PDFBOX-2852) Improve code quality (2)
[ https://issues.apache.org/jira/browse/PDFBOX-2852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15608921#comment-15608921 ]

ASF subversion and git services commented on PDFBOX-2852:

Commit 1766696 from [~tilman] in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1766696 ]

PDFBOX-2852: rename misleading variable names

> Improve code quality (2)
>
> Key: PDFBOX-2852
> URL: https://issues.apache.org/jira/browse/PDFBOX-2852
> Project: PDFBox
> Issue Type: Task
> Affects Versions: 2.0.0
> Reporter: Tilman Hausherr
> Attachments: PDNameTreeNode.java.patch, XMPSchema.java.patch, explicit_array_creation.patch, fix_javadoc.patch, foreach.patch, noarray.patch, semicolon.patch, stringbuilder.patch, unnecessary_type_casting.patch, unused_imports.patch, usestatic.patch, winansiencoding.patch, winansiencoding2.patch
>
> This is a long-term issue for the task of improving code quality, using the [SonarQube report|https://analysis.apache.org/dashboard/index/org.apache.pdfbox:pdfbox-reactor], hints in different IDEs, the FindBugs tool and other code quality tools.
> This is a follow-up of PDFBOX-2576, which was getting too long.
[jira] [Commented] (PDFBOX-2852) Improve code quality (2)
[ https://issues.apache.org/jira/browse/PDFBOX-2852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15608922#comment-15608922 ]

ASF subversion and git services commented on PDFBOX-2852:

Commit 1766697 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1766697 ]

PDFBOX-2852: rename misleading variable names

> Improve code quality (2)
>
> Key: PDFBOX-2852
> URL: https://issues.apache.org/jira/browse/PDFBOX-2852
> Project: PDFBox
> Issue Type: Task
> Affects Versions: 2.0.0
> Reporter: Tilman Hausherr
> Attachments: PDNameTreeNode.java.patch, XMPSchema.java.patch, explicit_array_creation.patch, fix_javadoc.patch, foreach.patch, noarray.patch, semicolon.patch, stringbuilder.patch, unnecessary_type_casting.patch, unused_imports.patch, usestatic.patch, winansiencoding.patch, winansiencoding2.patch
>
> This is a long-term issue for the task of improving code quality, using the [SonarQube report|https://analysis.apache.org/dashboard/index/org.apache.pdfbox:pdfbox-reactor], hints in different IDEs, the FindBugs tool and other code quality tools.
> This is a follow-up of PDFBOX-2576, which was getting too long.
[jira] [Commented] (PDFBOX-3544) Invalid ByteRange for getContents() method
[ https://issues.apache.org/jira/browse/PDFBOX-3544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15608901#comment-15608901 ]

Tilman Hausherr commented on PDFBOX-3544:

I didn't test your code change (by your own admission, it doesn't work). I looked at what we do: we take the offset at the "<" and the offset after the ">", and those are the values in /ByteRange. So there has to be an adjustment to get the actual signature. Adobe does the same, see the signed files in PDFBOX-3540. Because of that we have to add 1.

I agree that this contradicts the document you mention, which sets the offset after the "<".

You can also find a file signed with iText in PDFBOX-1751, and with Adobe in PDFBOX-3114.

> Invalid ByteRange for getContents() method
>
> Key: PDFBOX-3544
> URL: https://issues.apache.org/jira/browse/PDFBOX-3544
> Project: PDFBox
> Issue Type: Bug
> Components: Signing
> Affects Versions: 2.0.3
> Reporter: Lonzak
>
> PDSignature.java class, getContents() method, line 325ff.
> {code:title=PDSignature.java|borderStyle=solid}
> /**
>  * Will return the embedded signature between the byterange gap.
>  *
>  * @param pdfFile The signed pdf file as byte array
>  * @return a byte array containing the signature
>  * @throws IOException if the pdfFile can't be read
>  */
> public byte[] getContents(byte[] pdfFile) throws IOException
> {
>     int[] byteRange = getByteRange();
>     int begin = byteRange[0]+byteRange[1]+1;
>     int end = byteRange[2]-begin;
>     return getContents(new COSFilterInputStream(pdfFile,new int[] {begin,end}));
> }
> {code:}
> Let's assume a byte range of
> /ByteRange[ 0, 840, 960, 240]
> The current implementation would return
> {841, 119} which is from *841 - 960*
> According to [adobe|http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/DigitalSignaturesInPDF.pdf] (page 5) this is invalid:
> {quote}
> "In this example, the hash is calculated for bytes 0 through 839, and 960 through 1200."
> {quote}
> Thus the values for the signature should be
> {840, 119} which is from *840 - 959*
> The implementation should be:
> {code:title=PDSignature.java|borderStyle=solid}
> /**
>  * Will return the embedded signature between the byterange gap.
>  *
>  * @param pdfFile The signed pdf file as byte array
>  * @return a byte array containing the signature
>  * @throws IOException if the pdfFile can't be read
>  */
> public byte[] getContents(byte[] pdfFile) throws IOException
> {
>     int[] byteRange = getByteRange();
>     int begin = byteRange[0]+byteRange[1];
>     int end = byteRange[2]-begin-1;
>     return getContents(new COSFilterInputStream(pdfFile,new int[] {begin,end}));
> }
> {code:}
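
The offset arithmetic discussed in this thread can be checked with the example /ByteRange [0 840 960 240]. The sketch below is not PDFBox code: the class ByteRangeGap and its methods are hypothetical, and it assumes, as described in the comment above, that the "<" of the /Contents hex string sits at the first byte of the gap and the ">" at its last byte.

```java
/** Hypothetical helper for illustration only; not part of PDFBox. */
final class ByteRangeGap {

    private ByteRangeGap() {
    }

    private static void requireFourEntries(int[] byteRange) {
        if (byteRange == null || byteRange.length != 4) {
            throw new IllegalArgumentException("expected [offset1, length1, offset2, length2]");
        }
    }

    /** Offset of the first byte that is not covered by the first signed range. */
    static int gapStart(int[] byteRange) {
        requireFourEntries(byteRange);
        return byteRange[0] + byteRange[1];
    }

    /** Number of bytes between the two signed ranges. */
    static int gapLength(int[] byteRange) {
        return byteRange[2] - gapStart(byteRange);
    }

    /** Offset of the first hex digit, assuming '<' is the first byte of the gap. */
    static int hexStart(int[] byteRange) {
        return gapStart(byteRange) + 1;
    }

    /** Number of hex digits, assuming '<' and '>' are the first and last gap bytes. */
    static int hexLength(int[] byteRange) {
        return gapLength(byteRange) - 2;
    }

    /** Zero-based offset of the last byte covered by the second signed range. */
    static int lastSignedByte(int[] byteRange) {
        requireFourEntries(byteRange);
        return byteRange[2] + byteRange[3] - 1;
    }

    public static void main(String[] args) {
        int[] byteRange = {0, 840, 960, 240};
        System.out.println("gap:  start " + gapStart(byteRange) + ", length " + gapLength(byteRange));
        System.out.println("hex:  start " + hexStart(byteRange) + ", length " + hexLength(byteRange));
        System.out.println("last signed byte: " + lastSignedByte(byteRange));
    }
}
```

For the example this gives a gap starting at 840 with 120 bytes, and a hex string starting at 841 with 118 digits. The pair {841, 119} from the issue is a start offset and a length (hence the rename to "len"), so it covers offsets 841 through 959.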
[jira] [Comment Edited] (PDFBOX-3544) Invalid ByteRange for getContents() method
[ https://issues.apache.org/jira/browse/PDFBOX-3544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15608607#comment-15608607 ]

Lonzak edited comment on PDFBOX-3544 at 10/26/16 2:33 PM:

Or is that document wrong? Strangely I get an AIOOB Exception if I try that version...

was (Author: teewetee):
Or is that document wrong?

> Invalid ByteRange for getContents() method
>
> Key: PDFBOX-3544
> URL: https://issues.apache.org/jira/browse/PDFBOX-3544
> Project: PDFBox
> Issue Type: Bug
> Components: Signing
> Affects Versions: 2.0.3
> Reporter: Lonzak
>
> PDSignature.java class, getContents() method, line 325ff.
> {code:title=PDSignature.java|borderStyle=solid}
> /**
>  * Will return the embedded signature between the byterange gap.
>  *
>  * @param pdfFile The signed pdf file as byte array
>  * @return a byte array containing the signature
>  * @throws IOException if the pdfFile can't be read
>  */
> public byte[] getContents(byte[] pdfFile) throws IOException
> {
>     int[] byteRange = getByteRange();
>     int begin = byteRange[0]+byteRange[1]+1;
>     int end = byteRange[2]-begin;
>     return getContents(new COSFilterInputStream(pdfFile,new int[] {begin,end}));
> }
> {code:}
> Let's assume a byte range of
> /ByteRange[ 0, 840, 960, 240]
> The current implementation would return
> {841, 119} which is from *841 - 960*
> According to [adobe|http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/DigitalSignaturesInPDF.pdf] (page 5) this is invalid:
> {quote}
> "In this example, the hash is calculated for bytes 0 through 839, and 960 through 1200."
> {quote}
> Thus the values for the signature should be
> {840, 119} which is from *840 - 959*
> The implementation should be:
> {code:title=PDSignature.java|borderStyle=solid}
> /**
>  * Will return the embedded signature between the byterange gap.
>  *
>  * @param pdfFile The signed pdf file as byte array
>  * @return a byte array containing the signature
>  * @throws IOException if the pdfFile can't be read
>  */
> public byte[] getContents(byte[] pdfFile) throws IOException
> {
>     int[] byteRange = getByteRange();
>     int begin = byteRange[0]+byteRange[1];
>     int end = byteRange[2]-begin-1;
>     return getContents(new COSFilterInputStream(pdfFile,new int[] {begin,end}));
> }
> {code:}
[jira] [Commented] (PDFBOX-3544) Invalid ByteRange for getContents() method
[ https://issues.apache.org/jira/browse/PDFBOX-3544?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15608607#comment-15608607 ]

Lonzak commented on PDFBOX-3544:

Or is that document wrong?

> Invalid ByteRange for getContents() method
>
> Key: PDFBOX-3544
> URL: https://issues.apache.org/jira/browse/PDFBOX-3544
> Project: PDFBox
> Issue Type: Bug
> Components: Signing
> Affects Versions: 2.0.3
> Reporter: Lonzak
>
> PDSignature.java class, getContents() method, line 325ff.
> {code:title=PDSignature.java|borderStyle=solid}
> /**
>  * Will return the embedded signature between the byterange gap.
>  *
>  * @param pdfFile The signed pdf file as byte array
>  * @return a byte array containing the signature
>  * @throws IOException if the pdfFile can't be read
>  */
> public byte[] getContents(byte[] pdfFile) throws IOException
> {
>     int[] byteRange = getByteRange();
>     int begin = byteRange[0]+byteRange[1]+1;
>     int end = byteRange[2]-begin;
>     return getContents(new COSFilterInputStream(pdfFile,new int[] {begin,end}));
> }
> {code:}
> Let's assume a byte range of
> /ByteRange[ 0, 840, 960, 240]
> The current implementation would return
> {841, 119} which is from *841 - 960*
> According to [adobe|http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/DigitalSignaturesInPDF.pdf] (page 5) this is invalid:
> {quote}
> "In this example, the hash is calculated for bytes 0 through 839, and 960 through 1200."
> {quote}
> Thus the values for the signature should be
> {840, 119} which is from *840 - 959*
> The implementation should be:
> {code:title=PDSignature.java|borderStyle=solid}
> /**
>  * Will return the embedded signature between the byterange gap.
>  *
>  * @param pdfFile The signed pdf file as byte array
>  * @return a byte array containing the signature
>  * @throws IOException if the pdfFile can't be read
>  */
> public byte[] getContents(byte[] pdfFile) throws IOException
> {
>     int[] byteRange = getByteRange();
>     int begin = byteRange[0]+byteRange[1];
>     int end = byteRange[2]-begin-1;
>     return getContents(new COSFilterInputStream(pdfFile,new int[] {begin,end}));
> }
> {code:}
[jira] [Created] (PDFBOX-3544) Invalid ByteRange for getContents() method
TvT created PDFBOX-3544:

Summary: Invalid ByteRange for getContents() method
Key: PDFBOX-3544
URL: https://issues.apache.org/jira/browse/PDFBOX-3544
Project: PDFBox
Issue Type: Bug
Components: Signing
Affects Versions: 2.0.3
Reporter: TvT

PDSignature.java class, getContents() method, line 325ff.
{code:title=PDSignature.java|borderStyle=solid}
/**
 * Will return the embedded signature between the byterange gap.
 *
 * @param pdfFile The signed pdf file as byte array
 * @return a byte array containing the signature
 * @throws IOException if the pdfFile can't be read
 */
public byte[] getContents(byte[] pdfFile) throws IOException
{
    int[] byteRange = getByteRange();
    int begin = byteRange[0]+byteRange[1]+1;
    int end = byteRange[2]-begin;
    return getContents(new COSFilterInputStream(pdfFile,new int[] {begin,end}));
}
{code:}
Let's assume a byte range of
/ByteRange[ 0, 840, 960, 240]
The current implementation would return
{841, 119} which is from *841 - 960*
According to [adobe|http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/DigitalSignaturesInPDF.pdf] (page 5) this is invalid:
{quote}
"In this example, the hash is calculated for bytes 0 through 839, and 960 through 1200."
{quote}
Thus the values for the signature should be
{840, 119} which is from *840 - 959*
The implementation should be:
{code:title=PDSignature.java|borderStyle=solid}
/**
 * Will return the embedded signature between the byterange gap.
 *
 * @param pdfFile The signed pdf file as byte array
 * @return a byte array containing the signature
 * @throws IOException if the pdfFile can't be read
 */
public byte[] getContents(byte[] pdfFile) throws IOException
{
    int[] byteRange = getByteRange();
    int begin = byteRange[0]+byteRange[1];
    int end = byteRange[2]-begin-1;
    return getContents(new COSFilterInputStream(pdfFile,new int[] {begin,end}));
}
{code:}
[jira] [Resolved] (PDFBOX-3532) Java 6 errors
[ https://issues.apache.org/jira/browse/PDFBOX-3532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr resolved PDFBOX-3532.
    Resolution: Fixed

> Java 6 errors
>
> Key: PDFBOX-3532
> URL: https://issues.apache.org/jira/browse/PDFBOX-3532
> Project: PDFBox
> Issue Type: Bug
> Affects Versions: 1.8.12, 2.0.3
> Reporter: simon steiner
> Assignee: Tilman Hausherr
> Fix For: 1.8.13, 2.0.4, 2.1.0
>
> Under Java 6 and 8 with a clean ~/.m2 directory:
> mvn clean install -DskipTests
> Downloading: http://www.pdfa.org/wp-content/uploads/2011/08/isartor-pdfa-2008-08-13.zip
> javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target
> Java 6 only:
> [ERROR] pdf-box-svn/pdfbox/src/test/java/org/apache/pdfbox/pdmodel/TestPDDocument.java:[205,41] cannot find symbol
> [ERROR] symbol : class Builder
> [ERROR] location: class java.util.Locale
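
The "cannot find symbol ... class Builder ... java.util.Locale" error is what a Java 6 compiler reports for java.util.Locale.Builder, which was only added in Java 7. How the project actually fixed the test is not stated in this message. The following is just a sketch of one Java 6 compatible way to create a locale, with a hypothetical class name and an arbitrarily chosen example locale.

```java
import java.util.Locale;

/** Hypothetical illustration; not the change made for PDFBOX-3532. */
final class LocaleWithoutBuilder {

    private LocaleWithoutBuilder() {
    }

    /**
     * Creates a locale from a language and a country code using the
     * constructor that already exists in Java 6. (The constructor is
     * deprecated in recent JDKs in favour of Locale.of, but still compiles.)
     */
    static Locale of(String language, String country) {
        return new Locale(language, country);
    }

    public static void main(String[] args) {
        Locale locale = of("de", "DE");
        System.out.println(locale.getLanguage() + " / " + locale.getCountry());
    }
}
```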
[jira] [Commented] (PDFBOX-3540) Trailer Syntax error, ID is different in the first and the last trailer - for PDF with incremental updates
[ https://issues.apache.org/jira/browse/PDFBOX-3540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15607765#comment-15607765 ]

Maya Angelova commented on PDFBOX-3540:

Well, I have the file as an attachment in a mail, my tests extract the attachment, and thereafter perform the validation... I'll be able to look into it in a few hours and write the result here.
Thank you for your effort!

> Trailer Syntax error, ID is different in the first and the last trailer - for PDF with incremental updates
>
> Key: PDFBOX-3540
> URL: https://issues.apache.org/jira/browse/PDFBOX-3540
> Project: PDFBox
> Issue Type: Bug
> Components: Preflight
> Affects Versions: 1.8.12, 2.0.3
> Reporter: Maruan Sahyoun
> Assignee: Tilman Hausherr
> Fix For: 2.0.4, 2.1.0
> Attachments: Pardes13_Rez02.pdf, testfile_original.pdf, testfile_signed_once.pdf, testfile_signed_twice.pdf
>
> As reported at the users mailing list:
>
> Hello guys,
> I have the following problem using apache.pdfbox when validating a valid PDF/A-1 file, which is being signed twice:
> 1. The online validator confirms that the file is valid (https://www.pdf-tools.com/pdf/validate-pdfa-online.aspx)
> 2. But when I validate it using the following code:
> {code}
> PreflightParser parser = new PreflightParser(byteDatasource);
> parser.parse();
> PreflightDocument document = parser.getPreflightDocument();
> document.validate();
> result = document.getResult();
> {code}
> 3. The file is linearized
> 4. I get that the file is invalid and the error description reads:
> {code}
> Trailer Syntax error, ID is different in the first and the last trailer
> {code}
> According to issues PDFBOX-3256 and PDFBOX-2502 this should be fixed?
> Could anyone give me a tip how to go around this problem or would that be a bug?
> The pdf file is attached.
>
> *Analysis:*
> The original PDF is linearized with a subsequent incremental update.
> According to ISO 32000-1 F1
> {quote}
> Incremental update shall still be permitted, but the resulting PDF is no longer linearized and subsequently shall be treated as ordinary PDF. Linearizing it again may require reprocessing the entire file; see G.7, "Accessing an Updated File" for details.
> {quote}
> As the file shall no longer be treated as linearized, the provision about matching IDs as outlined in PDFBOX-2502 no longer applies.
[jira] [Issue Comment Deleted] (PDFBOX-3540) Trailer Syntax error, ID is different in the first and the last trailer - for PDF with incremental updates
[ https://issues.apache.org/jira/browse/PDFBOX-3540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maya Angelova updated PDFBOX-3540:
---------------------------------
    Comment: was deleted

(was: Well, I have the file as an attachment in a mail; my tests extract the attachment and then perform the validation... I'll be able to look into it in a few hours and write the result here. Thank you for your effort!)

> Trailer Syntax error, ID is different in the first and the last trailer - for PDF with incremental updates
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-3540
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3540
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Preflight
>    Affects Versions: 1.8.12, 2.0.3
>            Reporter: Maruan Sahyoun
>            Assignee: Tilman Hausherr
>             Fix For: 2.0.4, 2.1.0
>
>         Attachments: Pardes13_Rez02.pdf, testfile_original.pdf, testfile_signed_once.pdf, testfile_signed_twice.pdf
>
>
> As reported on the users mailing list:
>
> Hello guys,
> I have the following problem using apache.pdfbox when validating a valid PDF/A-1 file which has been signed twice:
> 1. The online validator confirms that the file is valid (https://www.pdf-tools.com/pdf/validate-pdfa-online.aspx)
> 2. But when I validate it using the following code:
> {code}
> PreflightParser parser = new PreflightParser(byteDatasource);
> parser.parse();
> PreflightDocument document = parser.getPreflightDocument();
> document.validate();
> result = document.getResult();
> {code}
> 3. The file is linearized
> 4. I get that the file is invalid and the error description reads:
> {code}
> Trailer Syntax error, ID is different in the first and the last trailer
> {code}
> According to issues PDFBOX-3256 and PDFBOX-2502 this should be fixed?
> Could anyone give me a tip on how to work around this problem, or would that be a bug?
> The PDF file is attached.
>
> *Analysis:*
> The original PDF is linearized with a subsequent incremental update.
> According to ISO 32000-1 F1
> {quote}
> Incremental update shall still be permitted, but the resulting PDF is no longer linearized and subsequently shall be treated as ordinary PDF. Linearizing it again may require reprocessing the entire file; see G.7, "Accessing an Updated File" for details.
> {quote}
> As the file shall no longer be treated as linearized, the provision about matching IDs as outlined in PDFBOX-2502 no longer applies.
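
The analysis above turns on whether a file that starts with a linearization dictionary has since been extended by an incremental update. The related issue PDFBOX-3541 quotes the rule that the ID provision does not apply when the /L value in the linearization dictionary differs from the actual file length. The following is a minimal sketch of that comparison using only the Java standard library. It is not the Preflight implementation; the class and method names are hypothetical, and it assumes the linearization dictionary is written uncompressed within the first 1024 bytes of the file.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDFBox.
public class LinearizationLengthCheck {

    // "/L" followed by whitespace and digits; cannot match "/Linearized".
    private static final Pattern LENGTH_ENTRY = Pattern.compile("/L\\s+(\\d+)");

    /** Reads up to the first 1024 bytes of the file as Latin-1 text. */
    public static String readHead(String path) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            byte[] buf = new byte[(int) Math.min(1024, raf.length())];
            raf.readFully(buf);
            return new String(buf, StandardCharsets.ISO_8859_1);
        }
    }

    /** Returns the declared /L value, or -1 if no linearization dictionary is found. */
    public static long declaredLength(String head) {
        int marker = head.indexOf("/Linearized");
        if (marker < 0) {
            return -1;
        }
        // Restrict the search to the dictionary that contains /Linearized.
        int dictStart = head.lastIndexOf("<<", marker);
        int dictEnd = head.indexOf(">>", marker);
        if (dictStart < 0 || dictEnd < 0) {
            return -1;
        }
        Matcher m = LENGTH_ENTRY.matcher(head.substring(dictStart, dictEnd));
        return m.find() ? Long.parseLong(m.group(1)) : -1;
    }

    /** True only if a /L entry exists and equals the actual file length. */
    public static boolean stillLinearized(String head, long actualLength) {
        long declared = declaredLength(head);
        return declared >= 0 && declared == actualLength;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: LinearizationLengthCheck <file.pdf>");
            return;
        }
        long actual;
        try (RandomAccessFile raf = new RandomAccessFile(args[0], "r")) {
            actual = raf.length();
        }
        System.out.println(stillLinearized(readHead(args[0]), actual)
                ? "treat as linearized"
                : "treat as ordinary PDF");
    }
}
```

A file such as testfile_signed_twice.pdf, whose signatures were appended as incremental updates, would be expected to report a length larger than /L and so fall into the "ordinary PDF" branch; this has not been run against the attachments.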
[jira] [Comment Edited] (PDFBOX-3542) Can PDFBOX use Streams to read PDSignatures from document?
[ https://issues.apache.org/jira/browse/PDFBOX-3542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15607728#comment-15607728 ]

Tilman Hausherr edited comment on PDFBOX-3542 at 10/26/16 7:33 AM:
-------------------------------------------------------------------

Loading from a file is best. To save memory, try
{code}
PDDocument.load (file, MemoryUsageSetting.setupTempFileOnly());
{code}

was (Author: tilman):
Loading from a file is best. To save memory, try
{code}
PDDocument.load (new File(), MemoryUsageSetting.setupTempFileOnly());
{code}

> Can PDFBOX use Streams to read PDSignatures from document?
> ----------------------------------------------------------
>
>                 Key: PDFBOX-3542
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3542
>             Project: PDFBox
>          Issue Type: Wish
>          Components: PDModel
>    Affects Versions: 2.0.3
>            Reporter: Andrea Paternesi
>            Priority: Critical
>
> I did not find a way to avoid loading the whole PDDocument into memory to read the signature dictionaries.
> If you have very big PDF files (30MB or more), Java gets an Out of Memory error.
> Right now I have not found a correct way to load signatures using a stream.
> Can you give any hint?
> Thanks in advance.
> Andrea.
[jira] [Commented] (PDFBOX-3542) Can PDFBOX use Streams to read PDSignatures from document?
[ https://issues.apache.org/jira/browse/PDFBOX-3542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15607728#comment-15607728 ]

Tilman Hausherr commented on PDFBOX-3542:
-----------------------------------------

Loading from a file is best. To save memory, try
{code}
PDDocument.load (new File(), MemoryUsageSetting.setupTempFileOnly());
{code}

> Can PDFBOX use Streams to read PDSignatures from document?
> ----------------------------------------------------------
>
>                 Key: PDFBOX-3542
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3542
>             Project: PDFBox
>          Issue Type: Wish
>          Components: PDModel
>    Affects Versions: 2.0.3
>            Reporter: Andrea Paternesi
>            Priority: Critical
>
> I did not find a way to avoid loading the whole PDDocument into memory to read the signature dictionaries.
> If you have very big PDF files (30MB or more), Java gets an Out of Memory error.
> Right now I have not found a correct way to load signatures using a stream.
> Can you give any hint?
> Thanks in advance.
> Andrea.
[jira] [Comment Edited] (PDFBOX-3542) Can PDFBOX use Streams to read PDSignatures from document?
[ https://issues.apache.org/jira/browse/PDFBOX-3542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15607660#comment-15607660 ]

Andrea Paternesi edited comment on PDFBOX-3542 at 10/26/16 7:18 AM:
--------------------------------------------------------------------

I usually use PDDocument.load(), passing a File, and in certain circumstances it throws an out of memory error. There are some cases in which you cannot use JVM parameters to get more memory. In my case we use the old Java AX Bridge to integrate some signing features with some old MS Visual Fox code. There is no way to pass arguments in Java 7 within the bridge.
If I use a FileInputStream to load the PDDocument, will it handle the stream in a different way?
What I need to do is only read the signatures to validate them, so I suppose I do not need to load the entire document into memory but only extract the signature dictionary, which is very small even with many signatures inside.
What I noticed is that this does not happen while signing, even with a big file.
I will try 2.0.4, but what is the scratch file method?
Any clue?
Thanks.
Andrea.

was (Author: patton73):
I usually use PDDocument.load(), passing a File, and in certain circumstances it throws an out of memory error. There are some cases in which you cannot use JVM parameters to get more memory. In my case we use the old Java AX Bridge to integrate some signing features with some old MS Visual Fox code. There is no way to pass arguments in Java 7 within the bridge.
If I use a FileInputStream to load the PDDocument, will it handle the stream in a different way?
What I need to do is only read the signatures to validate them, so I suppose I do not need to load the entire document into memory but only extract the signature dictionary, which is very small even with many signatures inside.
What I noticed is that this does not happen while signing, even with a big file.
I will try 2.0.4 and see the scratch file method.
Any clue?
Thanks.
Andrea.

> Can PDFBOX use Streams to read PDSignatures from document?
> ----------------------------------------------------------
>
>                 Key: PDFBOX-3542
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3542
>             Project: PDFBox
>          Issue Type: Wish
>          Components: PDModel
>    Affects Versions: 2.0.3
>            Reporter: Andrea Paternesi
>            Priority: Critical
>
> I did not find a way to avoid loading the whole PDDocument into memory to read the signature dictionaries.
> If you have very big PDF files (30MB or more), Java gets an Out of Memory error.
> Right now I have not found a correct way to load signatures using a stream.
> Can you give any hint?
> Thanks in advance.
> Andrea.
[jira] [Comment Edited] (PDFBOX-3542) Can PDFBOX use Streams to read PDSignatures from document?
[ https://issues.apache.org/jira/browse/PDFBOX-3542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15607660#comment-15607660 ]

Andrea Paternesi edited comment on PDFBOX-3542 at 10/26/16 7:17 AM:
--------------------------------------------------------------------

I usually use PDDocument.load(), passing a File, and in certain circumstances it throws an out of memory error. There are some cases in which you cannot use JVM parameters to get more memory. In my case we use the old Java AX Bridge to integrate some signing features with some old MS Visual Fox code. There is no way to pass arguments in Java 7 within the bridge.
If I use a FileInputStream to load the PDDocument, will it handle the stream in a different way?
What I need to do is only read the signatures to validate them, so I suppose I do not need to load the entire document into memory but only extract the signature dictionary, which is very small even with many signatures inside.
What I noticed is that this does not happen while signing, even with a big file.
I will try 2.0.4 and see the scratch file method.
Any clue?
Thanks.
Andrea.

was (Author: patton73):
I usually use PDDocument.load(), passing a File, and in certain circumstances it throws an out of memory error. There are some cases in which you cannot use JVM parameters to get more memory. In my case we use the old Java AX Bridge to integrate some signing features with some old MS Visual Fox code. There is no way to pass arguments in Java 7 within the bridge.
If I use a FileInputStream to load the PDDocument, will it handle the stream in a different way?
What I need to do is only read the signatures to validate them, so I suppose I do not need to load the entire document into memory but only extract the signature dictionary, which is very small even with many signatures inside.
What I noticed is that this does not happen while signing, even with a big file.
Any clue?
Thanks.
Andrea.

> Can PDFBOX use Streams to read PDSignatures from document?
> ----------------------------------------------------------
>
>                 Key: PDFBOX-3542
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3542
>             Project: PDFBox
>          Issue Type: Wish
>          Components: PDModel
>    Affects Versions: 2.0.3
>            Reporter: Andrea Paternesi
>            Priority: Critical
>
> I did not find a way to avoid loading the whole PDDocument into memory to read the signature dictionaries.
> If you have very big PDF files (30MB or more), Java gets an Out of Memory error.
> Right now I have not found a correct way to load signatures using a stream.
> Can you give any hint?
> Thanks in advance.
> Andrea.
[jira] [Commented] (PDFBOX-3542) Can PDFBOX use Streams to read PDSignatures from document?
[ https://issues.apache.org/jira/browse/PDFBOX-3542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15607660#comment-15607660 ]

Andrea Paternesi commented on PDFBOX-3542:
------------------------------------------

I usually use PDDocument.load(), passing a File, and in certain circumstances it throws an out of memory error. There are some cases in which you cannot use JVM parameters to get more memory. In my case we use the old Java AX Bridge to integrate some signing features with some old MS Visual Fox code. There is no way to pass arguments in Java 7 within the bridge.
If I use a FileInputStream to load the PDDocument, will it handle the stream in a different way?
What I need to do is only read the signatures to validate them, so I suppose I do not need to load the entire document into memory but only extract the signature dictionary, which is very small even with many signatures inside.
What I noticed is that this does not happen while signing, even with a big file.
Any clue?
Thanks.
Andrea.

> Can PDFBOX use Streams to read PDSignatures from document?
> ----------------------------------------------------------
>
>                 Key: PDFBOX-3542
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3542
>             Project: PDFBox
>          Issue Type: Wish
>          Components: PDModel
>    Affects Versions: 2.0.3
>            Reporter: Andrea Paternesi
>            Priority: Critical
>
> I did not find a way to avoid loading the whole PDDocument into memory to read the signature dictionaries.
> If you have very big PDF files (30MB or more), Java gets an Out of Memory error.
> Right now I have not found a correct way to load signatures using a stream.
> Can you give any hint?
> Thanks in advance.
> Andrea.
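
On the point that validating a signature should not require the whole file in memory: the bytes a PDF signature covers are described by the /ByteRange array of the signature dictionary (offset/length pairs), so the hashing step at least can be streamed with a small fixed buffer. The sketch below uses only the Java standard library and is not PDFBox code; the class name and method signature are hypothetical. It assumes the ByteRange values have already been obtained elsewhere (in PDFBox 2.x, `PDSignature.getByteRange()` returns them as an `int[]`), and it covers only the digest computation, not the CMS or certificate checks that a full validation needs.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of PDFBox.
public class ByteRangeDigest {

    /**
     * Hashes the file regions listed in byteRange (offset, length, offset, length, ...)
     * while holding only an 8 KiB buffer in memory.
     */
    public static byte[] digest(String path, long[] byteRange, String algorithm)
            throws IOException, NoSuchAlgorithmException {
        if (byteRange.length % 2 != 0) {
            throw new IllegalArgumentException("ByteRange must hold offset/length pairs");
        }
        MessageDigest md = MessageDigest.getInstance(algorithm);
        byte[] buffer = new byte[8192];
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            for (int i = 0; i < byteRange.length; i += 2) {
                long remaining = byteRange[i + 1];
                raf.seek(byteRange[i]);
                while (remaining > 0) {
                    int n = raf.read(buffer, 0, (int) Math.min(buffer.length, remaining));
                    if (n < 0) {
                        throw new IOException("ByteRange extends past end of file");
                    }
                    md.update(buffer, 0, n);
                    remaining -= n;
                }
            }
        }
        return md.digest();
    }
}
```

The code uses try-with-resources and no later language features, so it should compile on the Java 7 runtime mentioned in the comment. Parsing the document to locate the signature dictionaries is a separate cost, which is where the temp-file memory setting suggested in the reply applies.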