Try UTF-8 encoding the URLs or the parameters themselves. If you are using
Tika-Python, then use Python's encoding library…
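
A minimal sketch of what that could look like, assuming Python 3 and the standard urllib.parse module. The helper name encode_url is hypothetical (it is not part of Tika-Python), and the choice of "safe" characters is an assumption that may need adjusting for your URLs:

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_url(url: str) -> str:
    """Percent-encode unsafe characters in a URL, using UTF-8 bytes.

    Hypothetical helper for illustration only. "%" is treated as safe so
    that URLs which are already encoded are not encoded a second time;
    the trade-off is that a literal "%" in the input is left untouched.
    """
    parts = urlsplit(url)
    # quote() encodes to UTF-8 by default, then percent-escapes the bytes.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%+")
    fragment = quote(parts.fragment, safe="%")
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))


if __name__ == "__main__":
    # example.com is a placeholder URL, not one from the original report.
    print(encode_url("http://example.com/my file é.pdf?q=a b"))
    # The cleaned URL could then be handed to Tika-Python, e.g.
    # (assumption: the tika package is installed and a Tika server is reachable):
    #   from tika import parser
    #   parsed = parser.from_file(encode_url(url))
```

Only the path, query, and fragment are touched here; the scheme and host are passed through unchanged, so this would not help with non-ASCII host names.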

 

Cheers,

Chris

 

 

 

From: radhia bezzine <bezzinerad...@gmail.com>
Date: Thursday, February 22, 2018 at 6:03 AM
To: "Mattmann, Chris A (1761)" <chris.a.mattm...@jpl.nasa.gov>
Subject: Issue with apache Tika

 

Hello Dear!

 

I hope you are doing well.

 

I am writing to you because I have an issue running Apache Tika from Python.

I'm trying to parse content and metadata from many URLs (publicly available on the internet).

However, Tika sometimes returns an error like "invalid argument".

I troubleshot the problem and realized that some URLs include forbidden
characters, which is why Apache Tika reports "invalid argument".

I really don't know how to deal with this problem. I tried other tools, but I
think Tika matches my needs.

 

Thank you very much for your time.

 

Best regards! 

 

Radhia
