Bug#930281: PDF which fails to index with TypeError: object of type 'NoneType' has no len()

2019-06-12 Thread Jean-Francois Dockes
Thanks, it's nice to see the Recoll metadata gathering facilities being put to use :) jf

2019-06-12 Thread Anthony DeRobertis
On Wed, Jun 12, 2019 at 07:40:14PM +0200, Jean-Francois Dockes wrote:
> I am attaching a fixed script for your testing, it should replace
> /usr/share/recoll/rclpdf.py

Appears to be /usr/share/recoll/filters/rclpdf.py here. Anyway, I put your new version in place, and that fixed all of the PDF

2019-06-12 Thread Jean-Francois Dockes
Hi,

The document has XMP metadata inside XML attributes, instead of element text. The script did not handle this well, and there were a few other issues too.

I am attaching a fixed script for your testing, it should replace /usr/share/recoll/rclpdf.py

J.F. Dockes
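
For readers following along, here is a minimal sketch of the general problem described above. It is not the attached rclpdf.py fix. XMP allows a simple property to be written either as a child element with text or as an attribute on rdf:Description, and code that only reads element text gets None for the second form, so passing that to len() would raise the TypeError in the subject. The helper name xmp_value and the sample packets are made up for illustration; only the standard xml.etree.ElementTree module is used.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
PDF_NS = "http://ns.adobe.com/pdf/1.3/"

# Hypothetical sample packets: the same property in both serializations.
AS_ELEMENT = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about="" xmlns:pdf="http://ns.adobe.com/pdf/1.3/">
<pdf:Producer>ExampleProducer</pdf:Producer>
</rdf:Description>
</rdf:RDF></x:xmpmeta>"""

AS_ATTRIBUTE = """<x:xmpmeta xmlns:x="adobe:ns:meta/">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about="" xmlns:pdf="http://ns.adobe.com/pdf/1.3/"
    pdf:Producer="ExampleProducer"/>
</rdf:RDF></x:xmpmeta>"""


def xmp_value(xmp_xml, ns, name):
    """Return the property {ns}name as a string, or "" if it is absent.

    Illustrative helper (not part of Recoll): checks the attribute form
    first, then the child-element form, and never returns None.
    """
    tag = "{%s}%s" % (ns, name)
    root = ET.fromstring(xmp_xml)
    for desc in root.iter("{%s}Description" % RDF_NS):
        # Attribute form: <rdf:Description pdf:Producer="..."/>
        value = desc.get(tag)
        if value is None:
            # Element form: <pdf:Producer>...</pdf:Producer>
            child = desc.find(tag)
            if child is not None:
                value = child.text  # None for an empty element
        if value:
            return value.strip()
    return ""


if __name__ == "__main__":
    for packet in (AS_ELEMENT, AS_ATTRIBUTE):
        print(repr(xmp_value(packet, PDF_NS, "Producer")))
```

Returning an empty string rather than None means callers can take len() of the result or concatenate it without a separate check.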

2019-06-09 Thread Anthony DeRobertis
Package: recoll
Version: 1.24.3-3
Severity: normal

I have a bunch of PDFs which fail to index with a Python exception. A bunch are confidential, but this one isn't.

Traceback (most recent call last):
  File "/usr/share/recoll/filters/rclpdf.py", line 523,