I answered, asked to have a look at your file (please upload it to a
file hosting service), and mentioned that your config file looks suspicious.
Tilman
On 20.12.2019 at 19:06, Lu Sun wrote:
Dear PDFBox Dev Team,
Hope this message finds you well.
I just wanted to bring this to your attention. Could you please suggest
a solution to the parsing order issue? Attached are my config file,
an example PDF file, and my parsing results.
Thanks so much in advance. Wish you and your team a Merry Christmas
and Happy New Year.
Regards,
Luke
On Tue, 17 Dec 2019 at 12:34, Tim Allison <talli...@apache.org> wrote:
PDFBox Colleagues,
Any recommendations?
On Mon, Dec 16, 2019 at 7:05 AM Lu Sun <vistax...@gmail.com> wrote:
Dear Tika Dev Team,
Hope this email finds you well.
I have been actively using Tika to read PDF files. One
issue I found is the parsing order. As shown in the attached
image, the parsing order of the PDF file does not follow the
position of the text.
As suggested in this GitHub issue
<https://github.com/chrismattmann/tika-python/issues/266>, I
used a customized config file (see attached), hoping to solve
the issue, but this has not worked. If possible, could you
please review this issue and provide any insights or solutions?
Thanks so much in advance.
Regards,
Luke