Dear Tika Dev Team,


Hope this email finds you well.



I have been actively using Tika to read PDF files. One issue I have found
is the parsing order. As shown in the attached image, the order in which
the text of a PDF file is parsed is not based on the position of the text.



As suggested in this GitHub issue
<https://github.com/chrismattmann/tika-python/issues/266>, I used a
customized config file (see attached), hoping to solve the issue, but this
has not worked. If possible, could you please review this issue and
provide any insights or solutions?
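
For context, here is a minimal sketch of the kind of config I mean. It
assumes the relevant setting is the PDFParser parameter sortByPosition;
the helper name build_pdf_sort_config is only illustrative, and the
attached file may differ in its details.

```python
import xml.etree.ElementTree as ET

PDF_PARSER = "org.apache.tika.parser.pdf.PDFParser"
DEFAULT_PARSER = "org.apache.tika.parser.DefaultParser"


def build_pdf_sort_config(sort_by_position: bool = True) -> str:
    """Build a tika-config XML string that sets PDFParser's sortByPosition.

    Illustrative helper (not part of Tika or tika-python).
    """
    properties = ET.Element("properties")
    parsers = ET.SubElement(properties, "parsers")

    # Keep the default parsers, but exclude the PDF parser so that the
    # explicitly configured one below is used for PDF files.
    default = ET.SubElement(parsers, "parser", {"class": DEFAULT_PARSER})
    ET.SubElement(default, "parser-exclude", {"class": PDF_PARSER})

    # Configure the PDF parser to order text by its position on the page.
    pdf = ET.SubElement(parsers, "parser", {"class": PDF_PARSER})
    params = ET.SubElement(pdf, "params")
    param = ET.SubElement(
        params, "param", {"name": "sortByPosition", "type": "bool"}
    )
    param.text = "true" if sort_by_position else "false"

    return ET.tostring(properties, encoding="unicode")


if __name__ == "__main__":
    # Write the config next to the script; the file name is arbitrary.
    with open("tika-config.xml", "w", encoding="utf-8") as fh:
        fh.write(build_pdf_sort_config())
```

My understanding is that tika-python only passes a config file (the
config_path argument of parser.from_file) to a server that it starts
itself, so a Tika server that is already running may never see it. I have
not been able to confirm whether that is what is happening in my case.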



Thanks so much in advance.



Regards,

Luke

Attachment: tika.config
Description: XML document
