PDF Extraction Failed for scientific document

Morkus Mon, 06 Aug 2018 10:27:49 -0700

Hello all,

For the first time ever, a PDF I tried to extract with Tika failed.


It is a scientific article with lots of symbols and such, with this title and these authors:

Beyond the Words: Predicting User Personality from Heterogeneous Information

Honghao Wei†, Fuzheng Zhang†, Nicholas Jing Yuan‡, Chuan Cao‡, Hao Fu‡, Xing Xie†, Yong Rui†, Wei-Ying Ma†

†Microsoft Research ‡Microsoft

Department of Computer Science and Technology, Tsinghua University

[email protected],

{fuzzhang, nicholas.yuan, chcao, fuha, xingx, yongrui, wyma}@microsoft.com

------------

I have tika-core 1.18 and tika-parsers 1.18.

Is it unusual for a PDF extraction to fail?

Suggestions?

I can include the PDF in an email, but wanted to ask first.

Thanks!

Sent from [ProtonMail](https://protonmail.com), Swiss-based encrypted email.