Try UTF-8 encoding the URLs or the parameters themselves. If you are using
Tika-Python, then use the Python encode library…
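
As a rough illustration of the suggestion above, here is a minimal sketch that percent-encodes a URL as UTF-8 using only the standard library's urllib.parse. The helper name encode_url and the choice of which characters to leave unescaped are assumptions made for this example, not something Tika or Tika-Python prescribes.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def encode_url(url: str) -> str:
    """Percent-encode unsafe characters in a URL (UTF-8), keeping its structure.

    Hypothetical helper for illustration; adjust the `safe` sets to your URLs.
    """
    parts = urlsplit(url)
    # '%' stays safe so sequences that are already encoded are not encoded twice.
    path = quote(parts.path, safe="/%")
    # Keep the separators of a typical key=value&key=value query string.
    query = quote(parts.query, safe="=&%+")
    fragment = quote(parts.fragment, safe="%")
    # The scheme and host are left untouched in this sketch.
    return urlunsplit((parts.scheme, parts.netloc, path, query, fragment))


if __name__ == "__main__":
    print(encode_url("http://example.com/a b/é.pdf?q=x y"))
```

With Tika-Python, the encoded string could then be passed in place of the raw URL (for example to parser.from_file, assuming that call accepts URLs in your version).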







From: radhia bezzine <>
Date: Thursday, February 22, 2018 at 6:03 AM
To: "Mattmann, Chris A (1761)" <>
Subject: Issue with apache Tika


Hello!


I hope you are doing well.


I am writing to you because I have an issue running Apache Tika from Python.

I'm trying to parse content and metadata from many URLs (publicly available on
the internet).

However, Tika sometimes returns an error like "invalid argument".

I troubleshot the problem and realized that some URLs include forbidden
characters, which is why Apache Tika reports "invalid argument".

I really don't know how to deal with this problem. I tried other tools, but I
think Tika best matches my needs.


Thank you very much for your time.


Best regards! 


