RE: 2.0.8?

Allison, Timothy B. Mon, 02 Oct 2017 14:49:56 -0700

> Re 308576.pdf: the text extraction has a huge loss, but a manual check shows 
> it is identical. However that file has the NPE from PDActionURI.getURI(), 
> could it be that this results in an abort of text extraction?
> Same for 569017.pdf.


Likely.  There are two "per file pair contents" files.  The one ending in 
"_ignore_exceptions.xlsx" leaves out a file pair if an exception was caught 
for either of the two files (308576.pdf and 569017.pdf aren't in that file).  
The other one, "*_with_exceptions.xlsx", includes every pair, with or without 
exceptions.  Based on your feedback, should I add 2 boolean columns to 
"*_with_exceptions.xlsx", exceptionInA and exceptionInB?
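
A rough sketch of what I have in mind, in Python for brevity.  This is not the 
actual report code: PairResult, with_exception_flags and ignore_exceptions are 
made-up names, and the sample rows are invented.  Only the column names 
exceptionInA and exceptionInB come from the proposal above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class PairResult:
    """Hypothetical record for one file processed by run A and run B."""
    file_name: str
    exception_a: Optional[str]  # exception text caught in run A, or None
    exception_b: Optional[str]  # exception text caught in run B, or None


def with_exception_flags(results: List[PairResult]) -> List[Dict[str, object]]:
    """Rows for the "with exceptions" view: every pair, plus two booleans."""
    return [
        {
            "file": r.file_name,
            "exceptionInA": r.exception_a is not None,
            "exceptionInB": r.exception_b is not None,
        }
        for r in results
    ]


def ignore_exceptions(rows: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """Rows for the "ignore exceptions" view: drop a pair if either side failed."""
    return [
        row for row in rows
        if not (row["exceptionInA"] or row["exceptionInB"])
    ]


# Invented sample data, for illustration only.
sample = [
    PairResult("clean.pdf", None, None),
    PairResult("broken_in_b.pdf", None, "NullPointerException"),
]
all_rows = with_exception_flags(sample)
clean_rows = ignore_exceptions(all_rows)
```

With the two flags in place, the "ignore exceptions" view is just a filter on 
the "with exceptions" view, and a large content loss can be checked against 
the flags before it is treated as a real regression.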
