PDF text extraction leaves non-breaking spaces
----------------------------------------------

                 Key: NXP-4924
                 URL: https://jira.nuxeo.org/browse/NXP-4924
             Project: Nuxeo Enterprise Platform
          Issue Type: Bug
    Affects Versions: 5.3.1
            Reporter: Florent Guillaume
            Assignee: Florent Guillaume
             Fix For: 5.3.2


Some PDF files, when run through PDFBox for text extraction, end up containing
non-breaking spaces (U+00A0) where regular spaces are expected. Some fulltext
parsers do not index such text correctly.
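
One possible workaround, sketched here under assumptions and not taken from the actual fix for this ticket, is to normalize the extracted text before it reaches the fulltext indexer by replacing every U+00A0 with a plain space. The class name NbspNormalizer and its normalize method are hypothetical; only java.lang.String.replace(char, char) is a real API here. Note that Java's Character.isWhitespace returns false for U+00A0, which may be one reason whitespace-based tokenizers do not split on it.

```java
// Hypothetical helper: not part of PDFBox or Nuxeo.
final class NbspNormalizer {

    // U+00A0 NO-BREAK SPACE, as found in the extracted text.
    private static final char NBSP = '\u00A0';

    private NbspNormalizer() {
        // Utility class, no instances.
    }

    // Returns the input with every U+00A0 replaced by a regular space (U+0020).
    // A null input is returned unchanged.
    static String normalize(String extracted) {
        if (extracted == null) {
            return null;
        }
        return extracted.replace(NBSP, ' ');
    }
}
```

Such a helper would be called on the string returned by the text extraction step, before the text is handed to the indexer.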


