Hi,

Could you post a sample XML file, that is, the pdftotext output after you modified it so that your XML parser accepts it? That would help anyone kind enough to develop a patch.
Regards,
mpsuzuki

On 11/14/2013 10:04 PM, Paweł Leń wrote:
Hello,

I get an error when running:

pdftotext -bbox -htmlmeta 'myfile.pdf' 'tempFile.xml'

The output XML has a <title> tag at the beginning of the document (meta section), and the error appears when the title contains an "&" character. The title field is not wrapped in CDATA and its content is not escaped, so it causes an error in my xmllib parser. Can I (or you :) ) fix it somehow?

Best regards,
Paweł Leń
_______________________________________________
poppler mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/poppler
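Until a patch exists, one possible workaround is to escape the stray "&" in the title before handing the file to the parser. The following is only a minimal sketch in Python, not a fix for pdftotext itself. The sample string is a hypothetical stand-in for the real output (I have not reproduced the actual file here), and escape_title is a helper name made up for this example. It assumes the title sits in a plain <title>...</title> element and only deals with "&"; a literal "<" in the title would need separate handling.

```python
import re
import xml.etree.ElementTree as ET

# Matches "&" that does not already start one of the five predefined XML
# entities or a numeric character reference.
BARE_AMP = re.compile(r"&(?!(?:amp|lt|gt|quot|apos|#[0-9]+|#x[0-9A-Fa-f]+);)")

# Assumption: the title is a plain <title>...</title> element without attributes.
TITLE = re.compile(r"(<title>)(.*?)(</title>)", re.DOTALL)


def escape_title(markup: str) -> str:
    """Escape bare ampersands inside the first <title> element only."""
    def fix(match: re.Match) -> str:
        open_tag, text, close_tag = match.groups()
        return open_tag + BARE_AMP.sub("&amp;", text) + close_tag

    return TITLE.sub(fix, markup, count=1)


# Hypothetical sample, standing in for the real pdftotext output.
sample = "<html><head><title>Research & Development</title></head><body/></html>"

fixed = escape_title(sample)
root = ET.fromstring(fixed)  # parsing `sample` directly raises ParseError
print(root.find("head/title").text)
```

In practice you would read tempFile.xml into a string, pass it through escape_title, and parse the result. Text that is already escaped (for example "&amp;") is left untouched, so running the helper twice should be harmless.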
