Re: Tika 2.1.0 pdf parser

nskarthik Fri, 22 Oct 2021 08:15:02 -0700

Hi

I plan to extract text and images from PDF, DOCX, XLSX, HTML, CSV, MHT, and similar formats.


Instead of using POI, PDFBox, and other libraries individually, I thought Tika
could serve as a single entry point for data extraction, so I wanted to use it.
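
For the page-level part of the question, here is a minimal sketch of one possible approach. It rests on an assumption worth verifying against your Tika version: that the PDF parser's XHTML output wraps each page in a <div class="page"> element. The class name PageTextHandler is hypothetical, not part of Tika; it uses only the JDK's SAX classes and collects the text of each page div into a list.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

/**
 * Hypothetical helper (not a Tika class): gathers text per page,
 * ASSUMING the parser emits one <div class="page"> per page.
 */
class PageTextHandler extends DefaultHandler {
    private final List<String> pages = new ArrayList<>();
    private StringBuilder current; // null while outside a page div
    private int divDepth;          // div nesting depth inside the current page

    private static String name(String localName, String qName) {
        // Fall back to qName when namespace processing is off
        return (localName == null || localName.isEmpty()) ? qName : localName;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if (!"div".equals(name(localName, qName))) {
            return;
        }
        if (current == null) {
            // Only a div marked class="page" opens a new page buffer
            if ("page".equals(atts.getValue("class"))) {
                current = new StringBuilder();
                divDepth = 1;
            }
        } else {
            // Nested div inside a page: track it so its close does not end the page
            divDepth++;
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (current == null || !"div".equals(name(localName, qName))) {
            return;
        }
        if (--divDepth == 0) {
            pages.add(current.toString().trim());
            current = null;
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (current != null) {
            current.append(ch, start, length);
        }
    }

    /** Text of each page, in document order. */
    public List<String> getPages() {
        return pages;
    }
}
```

With Tika on the classpath, a handler like this could be passed to new AutoDetectParser().parse(stream, handler, new Metadata(), new ParseContext()) and the result read from getPages(). Images are not covered here; they would likely need inline image extraction enabled through PDFParserConfig plus an EmbeddedDocumentExtractor registered in the ParseContext. Formats such as DOCX or XLSX may not expose page boundaries at all.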


with regards
Karthik

On 2021/10/22 14:41:38, AJ Weber <[email protected]> wrote: 
> 
> >>> Question :  Need to extract Text / images at page level using java.
> >>> Did not find any example on www or Tika website.
> 
> Why not use a library specifically suited to the job like Apache PDFBox 
> (directly)?
