Re: Fwd: Tika not parsing underlines

Kamesh Joshi Thu, 05 Jan 2017 00:52:13 -0800

I already tried that, but it does not give me any indication of the
underlines present in a line. It just gives me the text inside <p></p>
tags.


On Thu, Jan 5, 2017 at 1:27 PM, Nick Burch <[email protected]> wrote:

> On Thu, 5 Jan 2017, Kamesh Joshi wrote:
>
>> I am trying to parse the attached PDF, but it does not give me the
>> places where underlines are present; it just returns plain text.
>> Please help: how can I also get the underlines present in the PDF, or some
>> way to split the text based on them?
>>
>> I am using curl -T Downloads/kameshjoshi.pdf  http://localhost:9998/tika
>> --header "Accept: text/plain" in my command line.
>>
>
> You need to ask Tika to give you the HTML version to be able to spot
> markup like underlines. Swap that Accept header to text/html and you should
> then be able to see them.
>
> Nick
>
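
As a rough sketch of the change suggested above, the same request with only
the Accept header swapped could be wrapped like this. The function name
tika_html and the TIKA_URL variable are made up for illustration; the URL
default is the one from the original command. Whether any underline markup
shows up in the output depends on the parser, and the reply at the top of
this thread reports only <p> elements for this PDF.

```shell
# Hypothetical helper: send a file to a running tika-server and ask for
# HTML output instead of plain text.
# TIKA_URL is an assumed override; the default is the URL used in the thread.
tika_html() {
  curl -T "$1" "${TIKA_URL:-http://localhost:9998/tika}" --header "Accept: text/html"
}
```

It would be called as: tika_html Downloads/kameshjoshi.pdf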
