[ 
https://issues.apache.org/jira/browse/PDFBOX-4102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16358716#comment-16358716
 ] 

Tilman Hausherr commented on PDFBOX-4102:
-----------------------------------------

In ftp, use the "bin" or "binary" command so that the file is transferred 
unmodified. It may already be set by default. To get a checksum of a file:
{code}
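// Needs java.io.FileInputStream, java.io.InputStream,
// java.security.DigestInputStream and java.security.MessageDigest
MessageDigest md = MessageDigest.getInstance("MD5");
try (InputStream dis = new DigestInputStream(new FileInputStream(FILENAME), md))
{
    // Reading is enough: the stream updates the digest as the bytes pass through.
    byte[] buffer = new byte[8192];
    while (dis.read(buffer) >= 0)
        ;
}
byte[] digest = md.digest();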
{code}
Then print the hex values of these bytes and compare them with the checksum of 
the original file. Depending on your system, you may have some command line 
utility that does this, such as md5sum.
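
For the "print the hex values and compare" step, something like the following might do. It is only a sketch: the class name Md5Hex, the method names and the command line arguments are made up for illustration and are not part of PDFBox.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, named here only for illustration.
public class Md5Hex
{
    // Formats each byte as two lowercase hex digits.
    public static String toHex(byte[] bytes)
    {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes)
        {
            // Mask to 0..255 so negative bytes do not sign-extend.
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    // Streams the file through an MD5 digest and returns the checksum as hex.
    public static String md5Hex(Path file) throws IOException, NoSuchAlgorithmException
    {
        MessageDigest md = MessageDigest.getInstance("MD5");
        try (InputStream in = new DigestInputStream(Files.newInputStream(file), md))
        {
            byte[] buffer = new byte[8192];
            while (in.read(buffer) >= 0)
            {
                // Nothing to do: the stream updates the digest while reading.
            }
        }
        return toHex(md.digest());
    }

    public static void main(String[] args) throws Exception
    {
        // args[0]: file to check; args[1] (optional): checksum to compare against
        String actual = md5Hex(Paths.get(args[0]));
        System.out.println(actual);
        if (args.length > 1)
        {
            System.out.println(actual.equalsIgnoreCase(args[1]) ? "match" : "MISMATCH");
        }
    }
}
```

Run it once on the original file and once on the transferred copy. If the two hex strings differ, the file was altered somewhere along the way.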


Re filtering (only relevant if your file is in the resources and you're working 
with maven), see this recent answer:

https://stackoverflow.com/a/48692034/535646

> java.lang.IllegalArgumentException: root cannot be null
> -------------------------------------------------------
>
>                 Key: PDFBOX-4102
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4102
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 2.0.8
>            Reporter: lwf
>            Priority: Major
>         Attachments: Testing 123.pdf
>
>
> {color:#f00000}java.lang.IllegalArgumentException: root cannot be null{color}
>  at org.apache.pdfbox.pdmodel.PDPageTree.<init>(PDPageTree.java:75)
>  at org.apache.pdfbox.pdmodel.PDDocumentCatalog.getPages(PDDocumentCatalog.java:129)
>  at org.apache.pdfbox.pdmodel.PDDocument.getPages(PDDocument.java:1401)
>  at org.apache.pdfbox.text.PDFTextStripper.writeText(PDFTextStripper.java:266)
>  at org.apache.pdfbox.text.PDFTextStripper.getText(PDFTextStripper.java:227)
>  
>  
> Due to confidentiality of the original document, I've uploaded a test document 
> which results in the same error. I'm using pdfbox-app-2.0.8.jar... please 
> help asap :(



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: dev-h...@pdfbox.apache.org
