PDF text extraction leaves non-breaking spaces
----------------------------------------------
Key: NXP-4924
URL: https://jira.nuxeo.org/browse/NXP-4924
Project: Nuxeo Enterprise Platform
Issue Type: Bug
Affects Versions: 5.3.1
Reporter: Florent Guillaume
Assignee: Florent Guillaume
Fix For: 5.3.2
Some PDF files, when run through PDFBox for text extraction, end up with
non-breaking spaces (U+00A0) in the extracted text. Some fulltext parsers
do not index these correctly.
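
One possible workaround is to normalize the extracted text before it is handed
to the fulltext indexer. The following is only a minimal sketch under that
assumption, not the fix that was committed for this issue: the class name
NbspNormalizer and its normalize method are hypothetical, and the PDFBox
extraction step itself is left out, so the input is simply whatever string
the extractor returned.

```java
// Hypothetical helper, not part of Nuxeo or PDFBox.
public final class NbspNormalizer {

    private NbspNormalizer() {
        // static utility, no instances
    }

    /**
     * Replaces every non-breaking space (U+00A0) with a regular space (U+0020)
     * so that tokenizers splitting on ordinary whitespace see word boundaries.
     * Returns null when given null.
     */
    public static String normalize(String extracted) {
        if (extracted == null) {
            return null;
        }
        return extracted.replace('\u00a0', ' ');
    }

    public static void main(String[] args) {
        // Simulated extractor output containing non-breaking spaces.
        String extracted = "Nuxeo\u00a0Enterprise\u00a0Platform";
        System.out.println(normalize(extracted));
    }
}
```

Replacing the character rather than dropping it keeps the word boundaries
intact, which is what the fulltext parser needs in order to split terms.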