[jira] [Comment Edited] (PDFBOX-3975) ExtractText converts some diacritics to combining forms that don't get combined

2017-10-22 Thread Matthew Self (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214506#comment-16214506
 ] 

Matthew Self edited comment on PDFBOX-3975 at 10/23/17 1:31 AM:


After reading more about combining marks 
(http://unicode.org/faq/char_combmark.html) I see that my suggestion is based 
on an incorrect assumption.  There are many valid accented characters that 
don't have a combined form and can only be represented by the base character 
plus a combining diacritic.  So, the fact that Normalizer.normalize() doesn't 
convert a pair into a single character does not mean that it is not a valid 
combination.

So, back to the original issue, it seems that the correct solution to prevent 
U+005E (CIRCUMFLEX ACCENT) being turned into U+0302 (COMBINING CIRCUMFLEX 
ACCENT) in this particular PDF file is to tighten up the overlap detection so 
that combineDiacritic() is not called in this case at all.  It seems that there 
is no reliable way to reject the potential combination of a base character and 
diacritic mark based only on the characters.  A COMBINING CIRCUMFLEX ACCENT 
could in theory be applied to any base character.
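
The point about Normalizer.normalize() can be checked with a small stand-alone
program. This is only a sketch using java.text.Normalizer from the JDK; the
class name NfcCompositionDemo and the helper nfcCodePoints() are made up for
illustration and are not part of PDFBox.

```java
import java.text.Normalizer;

class NfcCompositionDemo {
    /** Normalizes s to NFC and returns how many code points remain. */
    static int nfcCodePoints(String s) {
        String nfc = Normalizer.normalize(s, Normalizer.Form.NFC);
        return nfc.codePointCount(0, nfc.length());
    }

    public static void main(String[] args) {
        // "s" + COMBINING CARON (U+030C) has a precomposed form, U+0161.
        System.out.println(nfcCodePoints("s\u030C")); // 1
        // "q" + COMBINING CIRCUMFLEX ACCENT (U+0302) is a valid sequence,
        // but Unicode defines no precomposed character for it.
        System.out.println(nfcCodePoints("q\u0302")); // 2
        // Space + U+0302, the case from this issue, also stays as two code points.
        System.out.println(nfcCodePoints(" \u0302")); // 2
    }
}
```

Since the second and third cases give the same count, the length of the
normalized string cannot tell a legitimate base-plus-mark pair from a
mis-detected one.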


> ExtractText converts some diacritics to combining forms that don't get 
> combined
> ---
>
> Key: PDFBOX-3975
> URL: https://issues.apache.org/jira/browse/PDFBOX-3975
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
> Affects Versions: 2.0.7
> Reporter: Matthew Self
>
> When I use ExtractText on the file 
> http://www.adobe.com/content/dam/acom/en/devnet/acrobat/pdfs/PDF32000_2008.pdf,
>  there is an issue with the "^" character on page 15.
> The extracted text is "special characters ( * ! & } ̂  % and so on ) . )".
> Note that the extracted "^" character is U+0302 (COMBINING CIRCUMFLEX ACCENT) 
> when it ought to be plain old U+005E (CIRCUMFLEX ACCENT).
> I believe that what is happening is the original U+005E character is being 
> converted to U+0302 by the DIACRITICS map in TextPosition.java:
> map.put(0x005e, "\u0302");
> This is probably because the character slightly overlaps the preceding space 
> character.  But this combining diacritic can't be combined with the space 
> character, so the extracted text contains the combining character instead of 
> the original.
> One solution would be to tighten up the detection of overlaps so that 
> combineDiacritic() is not called in this instance.
> Another (perhaps more robust) solution would be to verify in 
> combineDiacritic() that the call to Normalizer.normalize() actually does 
> combine the combining form of the diacritic with the previous character.  If 
> the result of calling Normalizer.normalize() has more than one character in 
> it, then the diacritic must not have been combined with the previous 
> character.  In that case, the diacritic should not be merged.
> The goal would be for the extracted text to never contain combining 
> characters that failed to combine.
> P.S.  Thank you for the great library of PDFBox!
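
The failure mode described above can be reproduced without a PDF. The following
is a minimal sketch, not PDFBox code: the map holds only the single entry
quoted from TextPosition.java, and the names OrphanMarkCheck, toCombining() and
hasOrphanMark() are hypothetical. It treats a non-spacing mark at the start of
the text or directly after whitespace as one that failed to combine, which is
a diagnostic heuristic rather than a fix.

```java
import java.util.HashMap;
import java.util.Map;

class OrphanMarkCheck {
    // Stand-alone diacritic -> combining form; the one entry quoted in this issue.
    private static final Map<Integer, String> DIACRITICS = new HashMap<>();
    static {
        DIACRITICS.put(0x005e, "\u0302");
    }

    /** Returns the combining form of a stand-alone diacritic, or the input if unmapped. */
    static String toCombining(String diacritic) {
        String mapped = DIACRITICS.get(diacritic.codePointAt(0));
        return mapped != null ? mapped : diacritic;
    }

    /** True if a non-spacing mark starts the text or directly follows whitespace. */
    static boolean hasOrphanMark(String text) {
        int previous = ' '; // treat the start of the text like whitespace
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (Character.getType(cp) == Character.NON_SPACING_MARK
                    && Character.isWhitespace(previous)) {
                return true;
            }
            previous = cp;
            i += Character.charCount(cp);
        }
        return false;
    }

    public static void main(String[] args) {
        // What the extraction produced: the "^" replaced by its combining form.
        String merged = "} " + toCombining("^") + " %";
        System.out.println(hasOrphanMark(merged));  // true
        // What was expected: the plain U+005E character.
        System.out.println(hasOrphanMark("} ^ %")); // false
    }
}
```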



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org



[jira] [Comment Edited] (PDFBOX-3975) ExtractText converts some diacritics to combining forms that don't get combined

2017-10-22 Thread Matthew Self (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214498#comment-16214498
 ] 

Matthew Self edited comment on PDFBOX-3975 at 10/23/17 1:30 AM:


Looking more closely at the code, I see that mergeDiacritic() isn't actually 
merging the base character and the diacritic into its NFC form (a single 
character), but rather leaving it in NFD form (the base char followed by 
combining diacritic).

For example, I have a PDF document that contains the name "Krkošek".  In the 
Tj, this consists of "s" followed by U+02C7 (CARON), which will be displayed as 
two characters in a text editor.  The output of ExtractText is "s" followed by 
U+030C (COMBINING CARON).  This is valid Unicode and will display correctly in 
a text editor, but it is in NFD form rather than NFC form.  The desired output 
would be the single character U+0161 (LATIN SMALL LETTER S WITH CARON), which 
is the canonically equivalent string in NFC form.

My suggestion would be to rework this code so that instead of just converting 
the diacritics from stand-alone form to combining form, it also calls 
Normalizer.normalize() with Normalizer.Form.NFC to combine the base character 
and the diacritic.  If this results in a single character, then the output is 
in the desired NFC form.  If this results in no change to the string, then 
mergeDiacritic() should not merge the characters (even though they appear to 
overlap) and should leave the diacritic character in its original 
(stand-alone) form.

This would fix both issues (unwanted conversion of U+005E to U+0302 and failure 
to produce the NFC form U+0161).
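
The suggestion above could be sketched as follows. This is an illustration
under stated assumptions, not the PDFBox implementation: mergeOrKeep() is a
hypothetical helper, and the map contains only the two stand-alone diacritics
mentioned in this issue. The last call in main() shows the weakness that the
follow-up comment on this issue describes: "q" plus a circumflex is a valid
combination, but it is rejected because no precomposed character exists.

```java
import java.text.Normalizer;
import java.util.HashMap;
import java.util.Map;

class MergeOrKeepSketch {
    // Stand-alone diacritic -> combining form (only the two from this issue).
    private static final Map<Integer, String> COMBINING = new HashMap<>();
    static {
        COMBINING.put(0x005e, "\u0302"); // CIRCUMFLEX ACCENT -> COMBINING CIRCUMFLEX ACCENT
        COMBINING.put(0x02c7, "\u030C"); // CARON -> COMBINING CARON
    }

    /** Merges base and diacritic only if NFC yields a single code point. */
    static String mergeOrKeep(String base, String diacritic) {
        String combining = COMBINING.get(diacritic.codePointAt(0));
        if (combining == null) {
            return base + diacritic; // not a known diacritic, leave untouched
        }
        String nfc = Normalizer.normalize(base + combining, Normalizer.Form.NFC);
        if (nfc.codePointCount(0, nfc.length()) == 1) {
            return nfc; // composed into one precomposed character
        }
        return base + diacritic; // keep the stand-alone form
    }

    public static void main(String[] args) {
        System.out.println(mergeOrKeep("s", "\u02C7").equals("\u0161")); // true
        System.out.println(mergeOrKeep(" ", "^").equals(" ^"));          // true
        System.out.println(mergeOrKeep("q", "^").equals("q^"));          // true
    }
}
```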


[jira] [Commented] (PDFBOX-3975) ExtractText converts some diacritics to combining forms that don't get combined

2017-10-22 Thread Matthew Self (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214506#comment-16214506
 ] 

Matthew Self commented on PDFBOX-3975:
--

After reading more about combining marks 
(http://unicode.org/faq/char_combmark.html) I see that my suggestion is based 
on an incorrect assumption.  There are many valid accented characters that 
don't have a combined form and can only be represented by the base character 
plus a combining diacritic.  So, the fact that Normalizer.Form.NFC() doesn't 
convert a pair into a single character does not mean that it is not a valid 
combination.

So, back to the original issue, it seems that the correct solution to prevent 
U+005E (CIRCUMFLEX ACCENT) being turned into U+0302 (COMBINING CIRCUMFLEX 
ACCENT) in this particular PDF file is to tighten up the overlap detection so 
that combineDiacritic() is not called in this case at all.  It seems that there 
is no reliable way to reject the potential combination of a base character and 
diacritic mark based only on the characters.  A COMBINING CIRCUMFLEX ACCENT 
could in theory be applied to any base character.




[jira] [Issue Comment Deleted] (PDFBOX-3975) ExtractText converts some diacritics to combining forms that don't get combined

2017-10-22 Thread Matthew Self (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthew Self updated PDFBOX-3975:
-
Comment: was deleted

(was: Looking more closely at the code, I see that mergeDiacritic() isn't 
actually merging the base character and the diacritic into its NFC form (a 
single character), but rather leaving it in NFD form (the base char followed by 
combining diacritic).

For example, I have a PDF document that contains the name "Krkošek".  In the 
Tj, this consists of "s" followed by U+02C7 (CARON), which will be displayed as 
two characters in a text editor.  The output of ExtractText is "s" followed by 
U+030C (COMBINING CARON).  This is valid Unicode and will display correctly in 
a text editor, but it is in NFD form rather than NFC form.  The desired output 
would be the single character U+0161 (LATIN SMALL LETTER S WITH CARON), which 
is the canonically equivalent string in NFC form.

My suggestion would be to rework this code so that instead of just converting 
the diacritics from stand-alone form to combining form, it also calls 
Normalizer.normalize() with Normalizer.Form.NFC to combine the base character 
and the diacritic.  If this results in a single character, then the output is 
in the desired NFC form.  If this results in no change to the string, then 
mergeDiacritic() should not merge the characters (even though they appear to 
overlap) and should leave the diacritic character in its original 
(stand-alone) form.

This would fix both issues (unwanted conversion of U+005E to U+0302 and failure 
to produce the NFC form U+0161).)




[jira] [Commented] (PDFBOX-3975) ExtractText converts some diacritics to combining forms that don't get combined

2017-10-22 Thread Matthew Self (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214498#comment-16214498
 ] 

Matthew Self commented on PDFBOX-3975:
--

Looking more closely at the code, I see that mergeDiacritic() isn't actually 
merging the base character and the diacritic into its NFC form (a single 
character), but rather leaving it in NFD form (the base char followed by 
combining diacritic).

For example, I have a PDF document that contains the name "Krkošek".  In the 
Tj, this consists of "s" followed by U+02C7 (CARON), which will be displayed as 
two characters in a text editor.  The output of ExtractText is "s" followed by 
U+030C (COMBINING CARON).  This is valid Unicode and will display correctly in 
a text editor, but it is in NFD form rather than NFC form.  The desired output 
would be the single character U+0161 (LATIN SMALL LETTER S WITH CARON), which 
is the canonically equivalent string in NFC form.

My suggestion would be to rework this code so that instead of just converting 
the diacritics from stand-alone form to combining form, it also calls 
Normalizer.normalize() with Normalizer.Form.NFC to combine the base character 
and the diacritic.  If this results in a single character, then the output is 
in the desired NFC form.  If this results in no change to the string, then 
mergeDiacritic() should not merge the characters (even though they appear to 
overlap) and should leave the diacritic character in its original 
(stand-alone) form.

This would fix both issues (unwanted conversion of U+005E to U+0302 and failure 
to produce the NFC form U+0161).

If you agree with this approach, I can work on a patchset and run the 
regression tests.




Re: 2.0.8?

2017-10-22 Thread Andreas Lehmkuehler
@Tim I've fixed the last open regression in 2.0.8, and Tilman's test run hasn't 
shown any regressions. Please re-run your tests to see if we can proceed 
with 2.0.8; I'd really like to push it out.


TIA again,
Andreas


On 08.10.2017 at 16:11, Andreas Lehmkuehler wrote:

On 03.10.2017 at 15:38, Allison, Timothy B. wrote:



And yes, we need another regressions run if possible


Sounds good.  Will do once I hear that we're good to go.  Thank you!

We are good now.

@Tim: Could you please re-run your test to see how good we are?

TIA,
Andreas






[jira] [Commented] (PDFBOX-3957) Pages lost

2017-10-22 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214411#comment-16214411
 ] 

Tilman Hausherr commented on PDFBOX-3957:
-

all good!

> Pages lost
> --
>
> Key: PDFBOX-3957
> URL: https://issues.apache.org/jira/browse/PDFBOX-3957
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 2.0.8
> Reporter: Tilman Hausherr
> Assignee: Andreas Lehmkühler
> Labels: regression
>
> The file from PDFBOX-3785 has only 1 page, but should have 11.
> Possibly also
> KHVPCI4WW5C5NYXYTG4UFWB53TKQAQVI






[jira] [Commented] (PDFBOX-3957) Pages lost

2017-10-22 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214391#comment-16214391
 ] 

ASF subversion and git services commented on PDFBOX-3957:
-

Commit 1812938 from [~lehmi] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1812938 ]

PDFBOX-3957: switch test to correct value after improving rebuild mechanism




[jira] [Comment Edited] (PDFBOX-3957) Pages lost

2017-10-22 Thread JIRA

[ 
https://issues.apache.org/jira/browse/PDFBOX-3957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214379#comment-16214379
 ] 

Andreas Lehmkühler edited comment on PDFBOX-3957 at 10/22/17 5:33 PM:
--

I've improved the rebuild mechanism. It first looks for valid trailer entries; 
only if none are found does it fall back to the existing brute-force search for 
root/info dictionaries.

[~tilman] Please run your tests as a first check (-after I've finished my work 
in trunk-)
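
The order of the two strategies can be illustrated with a small, purely
hypothetical sketch. None of these names come from PDFBox: Trailer, isValid()
and rebuild() are invented here, and the rule that a usable trailer must name a
Root object is an assumption made only for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

class TrailerRebuildSketch {
    /** Hypothetical stand-in for a trailer dictionary: just its Root and Info references. */
    static final class Trailer {
        final String root;
        final String info;

        Trailer(String root, String info) {
            this.root = root;
            this.info = info;
        }
    }

    /** Assumed validity rule for this sketch: a usable trailer must name a Root object. */
    static boolean isValid(Trailer t) {
        return t != null && t.root != null;
    }

    /** Prefers a valid trailer found in the file; otherwise runs the brute-force search. */
    static Trailer rebuild(List<Trailer> found, Supplier<Trailer> bruteForceSearch) {
        for (Trailer t : found) {
            if (isValid(t)) {
                return t;
            }
        }
        // Only reached when no valid trailer entry was found.
        return bruteForceSearch.get();
    }

    public static void main(String[] args) {
        Supplier<Trailer> bruteForce = () -> new Trailer("9 0 R", null);
        List<Trailer> found = Arrays.asList(new Trailer(null, "2 0 R"), new Trailer("1 0 R", "2 0 R"));
        System.out.println(rebuild(found, bruteForce).root);                            // 1 0 R
        System.out.println(rebuild(Collections.<Trailer>emptyList(), bruteForce).root); // 9 0 R
    }
}
```

Passing the brute-force search as a Supplier keeps it lazy, so the expensive
scan only runs when the first pass comes up empty.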





[jira] [Commented] (PDFBOX-3957) Pages lost

2017-10-22 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214390#comment-16214390
 ] 

ASF subversion and git services commented on PDFBOX-3957:
-

Commit 1812937 from [~lehmi] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1812937 ]

PDFBOX-3957: search for valid trailer entries when rebuilding the trailer




[jira] [Comment Edited] (PDFBOX-3957) Pages lost

2017-10-22 Thread JIRA

[ 
https://issues.apache.org/jira/browse/PDFBOX-3957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214379#comment-16214379
 ] 

Andreas Lehmkühler edited comment on PDFBOX-3957 at 10/22/17 5:09 PM:
--

I've improved the rebuild mechanism. It first looks for valid trailer entries; 
only if none are found does it fall back to the existing brute-force search for 
root/info dictionaries.

[~tilman] Please run your tests as a first check (after I've finished my work 
in trunk)





[jira] [Commented] (PDFBOX-3957) Pages lost

2017-10-22 Thread JIRA

[ 
https://issues.apache.org/jira/browse/PDFBOX-3957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214379#comment-16214379
 ] 

Andreas Lehmkühler commented on PDFBOX-3957:


I've improved the rebuild mechanism. It first looks for valid trailer entries; 
only if none are found does it fall back to the existing brute-force search for 
root/info dictionaries.

[~tilman] Please run your tests as a first check




[jira] [Commented] (PDFBOX-3957) Pages lost

2017-10-22 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214378#comment-16214378
 ] 

ASF subversion and git services commented on PDFBOX-3957:
-

Commit 1812934 from [~lehmi] in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1812934 ]

PDFBOX-3957: switch test to correct value after improving rebuild mechanism




[jira] [Commented] (PDFBOX-3957) Pages lost

2017-10-22 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214377#comment-16214377
 ] 

ASF subversion and git services commented on PDFBOX-3957:
-

Commit 1812933 from [~lehmi] in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1812933 ]

PDFBOX-3957: search for valid trailer entries when rebuilding the trailer




[jira] [Comment Edited] (PDFBOX-3971) Add Certificate Dictionary to seed value in signature field

2017-10-22 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214231#comment-16214231
 ] 

Tilman Hausherr edited comment on PDFBOX-3971 at 10/22/17 1:38 PM:
---

Thanks... My thoughts:
- setCertificate() has an NPE potential now (remove .getCOSObject()) (no need 
to resubmit, I can fix that too)
- line lengths (no need to resubmit, I can fix this with one keystroke)
- "public boolean remove..." removes an item and returns a result, i.e. it does 
two things. I searched for "public boolean remove" (4 matches) and "public void 
remove" (20 matches) in our code, and it looks as if we're mostly comfortable 
not bothering about the result.
- "setNeedToBeUpdated()" calls: they don't hurt, but I doubt that they are 
needed. These are used in incremental save. But with signatures, the whole 
signature is new, unless you're signing in an empty signature field. So setting 
that is only needed if you think that the structure exists before signing. If 
this is so, then keep it. If it isn't so, then remove these calls.
- line 341: dead code.
- why FLAG_RESERVED and related methods? My understanding is that this is 
reserved for future use, so it shouldn't be there at all.
- replace .getItem() with .getDictionaryObject() unless you expect it to be a 
direct object
- the "if (flag)" code isn't needed where it is now, the code there can be put 
at the place where flag is set to true, it is unlikely that the list element 
would be there twice. (If the code is needed at all, see 4th point)
- line 364 you can use "base", and the result is never null
- javadoc of getKeyUsage / setKeyUsage: use HTML to have a list, see 
AccessPermission class javadoc on how to do this
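
For the last point, a Javadoc comment with an HTML list could look roughly like
the sketch below. The class KeyUsageJavadocSketch and its methods are
hypothetical and only show the comment layout; they are not the code from the
patch. The meaning given for 0, 1 and X is my reading of the key usage entry
in the PDF specification and should be checked against it.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class KeyUsageJavadocSketch {
    private final List<String> keyUsage = new ArrayList<>();

    /**
     * Adds a key usage constraint string (hypothetical method, for the Javadoc layout only).
     * <p>
     * Each character of the string constrains one key usage bit:
     * <ul>
     * <li><code>0</code>: the bit must not be set</li>
     * <li><code>1</code>: the bit must be set</li>
     * <li><code>X</code>: the bit is not constrained</li>
     * </ul>
     *
     * @param usage the constraint string, for example <code>1X0</code>
     */
    void addKeyUsage(String usage) {
        keyUsage.add(usage);
    }

    /**
     * Returns the constraint strings added so far.
     *
     * @return an unmodifiable view of the key usage list, never null
     */
    List<String> getKeyUsage() {
        return Collections.unmodifiableList(keyUsage);
    }

    public static void main(String[] args) {
        KeyUsageJavadocSketch sketch = new KeyUsageJavadocSketch();
        sketch.addKeyUsage("1X0");
        System.out.println(sketch.getKeyUsage()); // [1X0]
    }
}
```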



> Add Certificate Dictionary to seed value in signature field
> ---
>
> Key: PDFBOX-3971
> URL: https://issues.apache.org/jira/browse/PDFBOX-3971
> Project: PDFBox
>  Issue Type: Improvement
>  Components: Signing
>Reporter: Hossam Hazem
>  Labels: documentation, features, newbie, patch, test
> Attachments: COSName.patch, PDSeedValue.patch, 
> PDSeedValueCertificate.java
>
>
> This dictionary is important because it makes it possible to put certificate 
> constraints on a signature field, e.g. to allow only signatures from a 
> specific issuer or authority to be used in a field.
> So far I have tested the Issuer constraint and it worked: Acrobat Reader 
> ignores other certificates and only allows the given issuer to sign the field. 
> The documentation is not complete; I am waiting for initial acceptance before 
> completing it.
> A new class PDSeedValueCertificate is added, which refers to this certificate.
> PDSeedValue is modified to add the new dictionary.
> COSName is modified to add the new PDF names that are included in the 
> dictionary.
> A reference for this dictionary can be found in PDF reference 1.7, section 
> 12.7.4.5, table 235, page 457, here: 
> http://www.adobe.com/content/dam/acom/en/devnet/pdf/PDF32000_2008.pdf
> or chapter 8, table 8.84, page 700, here: 
> http://archimedespalimpsest.net/Documents/External/pdf_reference_1-7.pdf
> and here:
> https://www.adobe.com/devnet-docs/acrobatetk/tools/DigSig/Acrobat_DigitalSignatures_in_PDF.pdf
> This is my first contribution, I hope everything goes well.

[jira] [Comment Edited] (PDFBOX-3971) Add Certificate Dictionary to seed value in signature field

2017-10-22 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214231#comment-16214231
 ] 

Tilman Hausherr edited comment on PDFBOX-3971 at 10/22/17 1:24 PM:
---

Thanks... I'll review the code today or Monday. I'll edit my thoughts here.
- setCertificate() has an NPE potential now (remove .getCOSObject()) (no need 
to resubmit)
- line lengths (no need to resubmit)
- "public boolean remove..." it removes an item and returns a result, i.e. does 
two things. I searched for "public boolean remove" (4 matches) and "public void 
remove" (20 matches) in our code and it looks as if we're mostly comfortable 
not bothering about the result.
- "setNeedToBeUpdated()" calls - they don't hurt, but I doubt that they are 
needed. These are used in incremental save. But with signatures, the whole 
signature is new, unless you're signing in an empty signature field. So setting 
that is only needed if you think that the structure exists before signing. If 
this is so, then keep it. If it isn't so, then remove these calls.
- line 341 dead code.
- why FLAG_RESERVED and related methods? My understanding is that this is 
reserved for future use, so it shouldn't be there at all.
- replace .getItem() with .getDictionaryObject() unless you expect it to be a 
direct object
- the "if (flag)" code isn't needed where it is now, the code there can be put 
at the place where flag is set to true, it is unlikely that the list element 
would be there twice. (If the code is needed at all, see 4th point)
- line 364 you can use "base", and the result is never null
- javadoc of getKeyUsage / setKeyUsage: use HTML to have a list, see 
AccessPermission class javadoc on how to do this
- 





--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Comment Edited] (PDFBOX-3971) Add Certificate Dictionary to seed value in signature field

2017-10-22 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214231#comment-16214231
 ] 

Tilman Hausherr edited comment on PDFBOX-3971 at 10/22/17 1:21 PM:
---

Thanks... I'll review the code today or Monday. I'll edit my thoughts here.
- setCertificate() has an NPE potential now (remove .getCOSObject()) (no need 
to resubmit)
- line lengths (no need to resubmit)
- "public boolean remove..." it removes an item and returns a result, i.e. does 
two things. I searched for "public boolean remove" (4 matches) and "public void 
remove" (20 matches) in our code and it looks as if we're mostly comfortable 
not bothering about the result.
- "setNeedToBeUpdated()" calls - they don't hurt, but I doubt that they are 
needed. These are used in incremental save. But with signatures, the whole 
signature is new, unless you're signing in an empty signature field. So setting 
that is only needed if you think that the structure exists before signing. If 
this is so, then keep it. If it isn't so, then remove these calls.
- line 341 dead code.
- why FLAG_RESERVED and related methods? My understanding is that this is 
reserved for future use, so it shouldn't be there at all.
- replace .getItem() with .getDictionaryObject() unless you expect it to be a 
direct object
- the "if (flag)" code isn't needed where it is now, the code there can be put 
at the place where flag is set to true, it is unlikely that the list element 
would be there twice. (If the code is needed at all, see 4th point)
- line 364 you can use "base", and the result is never null
- 





[jira] [Comment Edited] (PDFBOX-3971) Add Certificate Dictionary to seed value in signature field

2017-10-22 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214231#comment-16214231
 ] 

Tilman Hausherr edited comment on PDFBOX-3971 at 10/22/17 12:54 PM:


Thanks... I'll review the code today or Monday. I'll edit my thoughts here.
- setCertificate() has an NPE potential now (remove .getCOSObject()) (no need 
to resubmit)
- line lengths (no need to resubmit)
- "public boolean remove..." it removes an item and returns a result, i.e. does 
two things. I searched for "public boolean remove" (4 matches) and "public void 
remove" (20 matches) in our code and it looks as if we're mostly comfortable 
not bothering about the result.
- "setNeedToBeUpdated()" calls - they don't hurt, but I doubt that they are 
needed. These are used in incremental save. But with signatures, the whole 
signature is new, unless you're signing in an empty signature field. So setting 
that is only needed if you think that the structure exists before signing. If 
this is so, then keep it. If it isn't so, then remove these calls.
- line 341 dead code.
- why FLAG_RESERVED and related methods? My understanding is that this is 
reserved for future use, so it shouldn't be there at all.
- replace .getItem() with .getDictionaryObject() unless you expect it to be a 
direct object
- the "if (flag)" code isn't needed where it is now, the code there can be put 
at the place where flag is set to true, it is unlikely that the list element 
would be there twice. (If the code is needed at all, see 4th point)
- 





[jira] [Comment Edited] (PDFBOX-3971) Add Certificate Dictionary to seed value in signature field

2017-10-22 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214231#comment-16214231
 ] 

Tilman Hausherr edited comment on PDFBOX-3971 at 10/22/17 9:11 AM:
---

Thanks... I'll review the code today or Monday. I'll edit my thoughts here.
- setCertificate() has an NPE potential now (remove .getCOSObject()) (no need 
to resubmit)
- line lengths (no need to resubmit)
- "public boolean remove..." it removes an item and returns a result, i.e. does 
two things. I searched for "public boolean remove" (4 matches) and "public void 
remove" (20 matches) in our code and it looks as if we're mostly comfortable 
not bothering about the result.
- "setNeedToBeUpdated()" calls - they don't hurt, but I doubt that they are 
needed. These are used in incremental save. But with signatures, the whole 
signature is new, unless you're signing in an empty signature field. So setting 
that is only needed if you think that the structure exists before signing. If 
this is so, then keep it. If it isn't so, then remove these calls.
- line 341 dead code.
- why FLAG_RESERVED and related methods? My understanding is that this is 
reserved for future use, so it shouldn't be there at all.
- replace .getItem() with .getDictionaryObject() unless you expect it to be a 
direct object
- 





[jira] [Commented] (PDFBOX-2852) Improve code quality (2)

2017-10-22 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214240#comment-16214240
 ] 

ASF subversion and git services commented on PDFBOX-2852:
-

Commit 1812890 from [~tilman] in branch 'pdfbox/trunk'
[ https://svn.apache.org/r1812890 ]

PDFBOX-2852: make field final

> Improve code quality (2)
> 
>
> Key: PDFBOX-2852
> URL: https://issues.apache.org/jira/browse/PDFBOX-2852
> Project: PDFBox
>  Issue Type: Task
>Affects Versions: 2.0.0
>Reporter: Tilman Hausherr
> Attachments: PDNameTreeNode.java.patch, StringBuffer.patch, 
> XMPSchema.java.patch, explicit_array_creation.patch, fix_javadoc.patch, 
> foreach.patch, foreach2.patch, generic_type_arguments.patch, noarray.patch, 
> semicolon.patch, stringbuilder.patch, unnecessary_type_casting.patch, 
> unused_imports.patch, usestatic.patch, winansiencoding.patch, 
> winansiencoding2.patch
>
>
> This is a longterm issue for the task to improve code quality, by using the 
> [SonarQube 
> report|https://analysis.apache.org/dashboard/index/org.apache.pdfbox:pdfbox-reactor],
>  hints in different IDEs, the FindBugs tool and other code quality tools.
> This is a follow-up of PDFBOX-2576, which was getting too long.






[jira] [Commented] (PDFBOX-2852) Improve code quality (2)

2017-10-22 Thread ASF subversion and git services (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-2852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214239#comment-16214239
 ] 

ASF subversion and git services commented on PDFBOX-2852:
-

Commit 1812889 from [~tilman] in branch 'pdfbox/branches/2.0'
[ https://svn.apache.org/r1812889 ]

PDFBOX-2852: make field final




[jira] [Comment Edited] (PDFBOX-3971) Add Certificate Dictionary to seed value in signature field

2017-10-22 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214231#comment-16214231
 ] 

Tilman Hausherr edited comment on PDFBOX-3971 at 10/22/17 8:59 AM:
---

Thanks... I'll review the code today or Monday. I'll edit my thoughts here.
- setCertificate() has an NPE potential now (remove .getCOSObject()) (no need 
to resubmit)
- line lengths (no need to resubmit)
- "public boolean remove..." it removes an item and returns a result, i.e. does 
two things. I searched for "public boolean remove" (4 matches) and "public void 
remove" (20 matches) in our code and it looks as if we're mostly comfortable 
not bothering about the result.
- "setNeedToBeUpdated()" calls - they don't hurt, but I doubt that they are 
needed. These are used in incremental save. But with signatures, the whole 
signature is new, unless you're signing in an empty signature field. So setting 
that is only needed if you think that the structure exists before signing. If 
this is so, then keep it. If it isn't so, then remove these calls.
- line 341 dead code.
- why FLAG_RESERVED and related methods? My understanding is that this is 
reserved for future use, so it shouldn't be there at all.
- 





[jira] [Comment Edited] (PDFBOX-3971) Add Certificate Dictionary to seed value in signature field

2017-10-22 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214231#comment-16214231
 ] 

Tilman Hausherr edited comment on PDFBOX-3971 at 10/22/17 8:37 AM:
---

Thanks... I'll review the code today or Monday. I'll edit my thoughts here.
- setCertificate() has an NPE potential now (remove .getCOSObject()) (no need 
to resubmit)
- line lengths (no need to resubmit)
- "public boolean remove..." it removes an item and returns a result, i.e. does 
two things. I searched for "public boolean remove" (4 matches) and "public void 
remove" (20 matches) in our code and it looks as if we're mostly comfortable 
not bothering about the result.
- "setNeedToBeUpdated()" calls - they don't hurt, but I doubt that they are 
needed. These are used in incremental save. But with signatures, the whole 
signature is new, unless you're signing in an empty signature field. So setting 
that is only needed if you think that the structure exists before signing. If 
this is so, then keep it. If it isn't so, then remove these calls.
- line 341 dead code.
- 


was (Author: tilman):
Thanks... I'll review the code today or Monday. I'll edit my thoughts here.
- setCertificate() has an NPE potential now (remove .getCOSObject()) (no need 
to resubmit)
- line lengths (no need to resubmit)
- "public boolean remove..." it removes an item and returns a result, i.e. does 
two things. I searched for "public boolean remove" (4 matches) and "public void 
remove" (20 matches) 
- "setNeedToBeUpdated()" calls - they don't hurt, but I doubt that they are 
needed. These are used in incremental save. But with signatures, the whole 
signature is new, unless you're signing in an empty signature field. So setting 
that is only needed if you think that the structure exists before signing. If 
this is so, then keep it. If it isn't so, then remove these calls.
- line 341 dead code.
- 




[jira] [Commented] (PDFBOX-3971) Add Certificate Dictionary to seed value in signature field

2017-10-22 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214231#comment-16214231
 ] 

Tilman Hausherr commented on PDFBOX-3971:
-

Thanks... I'll review the code today or Monday. I'll edit my thoughts here.
- setCertificate() can now throw an NPE; remove the .getCOSObject() call (no 
need to resubmit)
- line lengths (no need to resubmit)
- "public boolean remove...": removes and returns a result, i.e. does two 
things. I searched for "public boolean remove" (4 matches) and "public void 
remove" (20 matches).
- "setNeedToBeUpdated()" calls: they don't hurt, but I doubt that they are 
needed. They are used for incremental saves. But with signatures, the whole 
signature is new, unless you're signing in an empty signature field. So setting 
the flag is only needed if you think that the structure exists before signing. 
If so, keep the calls; if not, remove them.
- line 341 is dead code.
- 




[jira] [Comment Edited] (PDFBOX-3971) Add Certificate Dictionary to seed value in signature field

2017-10-22 Thread Tilman Hausherr (JIRA)

[ 
https://issues.apache.org/jira/browse/PDFBOX-3971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16214231#comment-16214231
 ] 

Tilman Hausherr edited comment on PDFBOX-3971 at 10/22/17 8:36 AM:
---

Thanks... I'll review the code today or Monday. I'll edit my thoughts here.
- setCertificate() can now throw an NPE; remove the .getCOSObject() call (no 
need to resubmit)
- line lengths (no need to resubmit)
- "public boolean remove...": it removes an item and returns a result, i.e. it 
does two things. I searched for "public boolean remove" (4 matches) and "public 
void remove" (20 matches).
- "setNeedToBeUpdated()" calls: they don't hurt, but I doubt that they are 
needed. They are used for incremental saves. But with signatures, the whole 
signature is new, unless you're signing in an empty signature field. So setting 
the flag is only needed if you think that the structure exists before signing. 
If so, keep the calls; if not, remove them.
- line 341 is dead code.
- 


was (Author: tilman):
Thanks... I'll review the code today or Monday. I'll edit my thoughts here.
- setCertificate() can now throw an NPE; remove the .getCOSObject() call (no 
need to resubmit)
- line lengths (no need to resubmit)
- "public boolean remove...": removes and returns a result, i.e. does two 
things. I searched for "public boolean remove" (4 matches) and "public void 
remove" (20 matches).
- "setNeedToBeUpdated()" calls: they don't hurt, but I doubt that they are 
needed. They are used for incremental saves. But with signatures, the whole 
signature is new, unless you're signing in an empty signature field. So setting 
the flag is only needed if you think that the structure exists before signing. 
If so, keep the calls; if not, remove them.
- line 341 is dead code.
- 




[jira] [Resolved] (PDFBOX-3972) Incorrect page after merge for OpenAction with GoTo page destination

2017-10-22 Thread Tilman Hausherr (JIRA)

 [ 
https://issues.apache.org/jira/browse/PDFBOX-3972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tilman Hausherr resolved PDFBOX-3972.
-
   Resolution: Fixed
Fix Version/s: 3.0.0, 2.0.8

> Incorrect page after merge for OpenAction with GoTo page destination
> 
>
> Key: PDFBOX-3972
> URL: https://issues.apache.org/jira/browse/PDFBOX-3972
> Project: PDFBox
> Issue Type: Bug
> Components: Utilities
> Affects Versions: 2.0.7
> Reporter: Tilman Hausherr
> Assignee: Tilman Hausherr
> Fix For: 2.0.8, 3.0.0
>
> Attachments: pdf_layer_new-merged.pdf, pdf_layer_new.pdf
>
>
> Merge the attached file with itself. Open the result file with PDFDebugger 
> and look at {{Root/OpenAction/D/\[0]}}. The page referenced there has object 
> number 3. However, the page tree shows that the pages have object numbers 
> 6 and 7.
> I noticed this while researching a different problem with the file at
> https://stackoverflow.com/questions/46850515/






Jenkins build is back to normal : PDFBox-sonar #281

2017-10-22 Thread Apache Jenkins Server
See 





Build failed in Jenkins: PDFBox-sonar #280

2017-10-22 Thread Apache Jenkins Server
See 

--
[...truncated 1.01 KB...]
Deleting 
Deleting 
Deleting 
Deleting 
Deleting 
Deleting 
Deleting 
Deleting 

Updating http://svn.apache.org/repos/asf/pdfbox/trunk at revision 
'2017-10-22T07:25:27.365 +'
At revision 1812881

No changes for http://svn.apache.org/repos/asf/pdfbox/trunk since the previous 
build
Injecting SonarQube environment variables using the configuration: ASF Sonar 
Analysis
Parsing POMs
Established TCP socket on 37212
maven3-agent.jar already up to date
maven3-interceptor.jar already up to date
maven3-interceptor-commons.jar already up to date
[trunk] $ /home/jenkins/tools/java/latest1.8/bin/java -Xmx1g 
-XX:MaxPermSize=300m -cp 
/home/jenkins/jenkins-slave/maven3-agent.jar:/home/jenkins/tools/maven/apache-maven-3.0.4/boot/plexus-classworlds-2.4.jar
 org.jvnet.hudson.maven3.agent.Maven3Main 
/home/jenkins/tools/maven/apache-maven-3.0.4 
/home/jenkins/jenkins-slave/slave.jar 
/home/jenkins/jenkins-slave/maven3-interceptor.jar 
/home/jenkins/jenkins-slave/maven3-interceptor-commons.jar 37212
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=300m; 
support was removed in 8.0
<===[JENKINS REMOTING CAPACITY]===>   channel started
Executing Maven:  -B -f 
 
-Dmaven.repo.local=/home/jenkins/jenkins-slave/maven-repositories/0 compile 
sonar:sonar -Dsonar.host.url=https://builds.apache.org/analysis 
-Dskip-bavaria=false
[INFO] Scanning for projects...
[INFO] 
[INFO] Reactor Build Order:
[INFO] 
[INFO] PDFBox parent
[INFO] Apache FontBox
[INFO] Apache XmpBox
[INFO] Apache PDFBox
[INFO] Apache Preflight
[INFO] Apache Preflight application
[INFO] Apache PDFBox Debugger
[INFO] Apache PDFBox tools
[INFO] Apache PDFBox application
[INFO] Apache PDFBox Debugger application
[INFO] Apache PDFBox examples
[INFO] Apache PDFBox
[INFO] 
[INFO] 
[INFO] Building PDFBox parent 3.0.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ pdfbox-parent 
---
[INFO] 
[INFO] 
[INFO] Building Apache FontBox 3.0.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ fontbox ---
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ fontbox 
---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 91 resources
[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ fontbox ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 98 source files to 

[WARNING] bootstrap class path not set in conjunction with -source 1.7
[WARNING] 
:
 Some input files use unchecked or unsafe operations.
[WARNING] 
:
 Recompile with -Xlint:unchecked for details.
[INFO] 
[INFO] 
[INFO] Building Apache XmpBox 3.0.0-SNAPSHOT
[INFO] 
[INFO] 
[INFO] --- maven-remote-resources-plugin:1.5:process (default) @ xmpbox ---
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ xmpbox ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] skip non existing resourceDirectory 

[INFO] Copying 3 resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ xmpbox ---
[INFO] Changes detected - recompiling the module!
[INFO]