Re: Fwd: Tika not parsing underlines

Nick Burch Wed, 04 Jan 2017 23:58:47 -0800

On Thu, 5 Jan 2017, Kamesh Joshi wrote:

I am trying to parse the attached PDF, but it does not give me the
places where underlines are present; it just returns plain text.
Please help: how can I also get the underlines present in the PDF, or some
way to split the text based on them?


I am using curl -T Downloads/kameshjoshi.pdf http://localhost:9998/tika
--header "Accept: text/plain" on my command line.

You need to ask Tika to give you the HTML version to be able to spot
markup like underlines. Swap that Accept header to text/html and you
should then be able to see them.
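
As a rough sketch of that change, the request below is the same one as in the original command, with only the Accept header swapped. The server URL and file path are taken from the question. The tika_html function name, the TIKA_URL variable and the -s flag are just illustrative choices. Whether underline markup actually shows up in the result depends on what Tika's PDF parser emits for this particular file, so treat the output as something to inspect rather than a guarantee.

```shell
# Assumption: a Tika server is listening where the original command pointed.
TIKA_URL="${TIKA_URL:-http://localhost:9998/tika}"

# Hypothetical helper: upload a file with PUT (-T) and ask for HTML output
# instead of plain text. Only the Accept header differs from the original.
tika_html() {
  curl -s -T "$1" "$TIKA_URL" --header "Accept: text/html"
}

# Usage (not run here):
#   tika_html Downloads/kameshjoshi.pdf > kameshjoshi.html
```

Saving the response to a file makes it easier to open in a browser or search for the formatting tags you care about.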


Nick
