I already tried that, but it does not give me any indication of the underlines present in the lines. It just gives me the text inside <p></p> tags.
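For reference, the request being discussed looks roughly like the sketch below. It assumes the same PDF and local Tika server as in the quoted command, with only the Accept header changed to text/html. The helper name tika_html and the TIKA_URL variable are hypothetical, used here only for illustration.

```shell
# Sketch only: assumes a Tika server is listening on localhost:9998,
# as in the quoted command. TIKA_URL and tika_html are illustrative names.
TIKA_URL="${TIKA_URL:-http://localhost:9998/tika}"

# Upload a file to the Tika server and ask for the HTML rendering
# instead of plain text.
tika_html() {
    curl -s -T "$1" "$TIKA_URL" --header "Accept: text/html"
}
```

Running tika_html Downloads/kameshjoshi.pdf sends the same request as before with the HTML accept header.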
On Thu, Jan 5, 2017 at 1:27 PM, Nick Burch <[email protected]> wrote:
> On Thu, 5 Jan 2017, Kamesh Joshi wrote:
>
>> I am trying to parse the attached the pdf.but it does not give me the
>> places where the underline is present it just returns me plain text.
>> Please help me how can i also get the underline present in pdf or some way
>> to split text based on that.
>>
>> I am using curl -T Downloads/kameshjoshi.pdf http://localhost:9998/tika
>> --header "Accept: text/plain" in my command line.
>>
>
> You need to ask Tika to give you the HTML version to be able to spot
> markup like underlines. Swap that accept header to text/html and you should
> then be able to see them
>
> Nick
>
