Re: --header "X-Tika-OCR: false" ; an option to fully disable OCR for each request

2021-06-10 Thread Nick Burch
On Thu, 10 Jun 2021, Cristian Zamfir wrote:
> Got it, thanks. What are your thoughts on using Tika 2.x while still in beta? Is it likely to be more stable than 1.26? I presume it has passed the same extensive test suite.

Usage stability wise, it's as good as 1.x. API stability wise things are s

Re: --header "X-Tika-OCR: false" ; an option to fully disable OCR for each request

2021-06-10 Thread Cristian Zamfir
Got it, thanks. What are your thoughts on using Tika 2.x while still in beta? Is it likely to be more stable than 1.26? I presume it has passed the same extensive test suite.

On Thu, Jun 10, 2021 at 4:00 PM Nick Burch wrote:
> On Thu, 10 Jun 2021, Cristian Zamfir wrote:
> > Thanks Nick. Looks l

Re: --header "X-Tika-OCR: false" ; an option to fully disable OCR for each request

2021-06-10 Thread Nick Burch
On Thu, 10 Jun 2021, Cristian Zamfir wrote:
> Thanks Nick. Looks like the option I was looking for is the 3rd one, but the docs say it is only available in Tika 2.x - am I right?

I've just done a grep of the codebase, and it isn't in the 1.x branch, only main = 2.x. So, Tika 2.x only.

Nick
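
[Editor's sketch, not part of the original message.] For readers who want the per-request switch discussed above, here is a rough Python illustration of keeping the if/else in client code rather than in the k8s YAML. It assumes a Tika 2.x server on localhost:9998 with the usual `/tika` endpoint, and it assumes the header is spelled `X-Tika-OCRskipOcr`; that spelling is not quoted anywhere in this thread, so please check it against the TikaOCR wiki page before relying on it. The function names are hypothetical. `http.client` is used on purpose because it sends header names exactly as written.

```python
import http.client


def tika_headers(skip_ocr: bool) -> dict:
    """Build request headers; the OCR header is only added when skipping OCR."""
    headers = {"Accept": "text/plain"}
    if skip_ocr:
        # ASSUMPTION: header name recalled from the Tika 2.x docs, not from
        # this thread. Verify against the TikaOCR wiki page.
        headers["X-Tika-OCRskipOcr"] = "true"
    return headers


def extract_text(data: bytes, skip_ocr: bool = True,
                 host: str = "localhost", port: int = 9998) -> str:
    """PUT a document to an (assumed) local tika-server and return plain text."""
    conn = http.client.HTTPConnection(host, port, timeout=60)
    try:
        conn.request("PUT", "/tika", body=data, headers=tika_headers(skip_ocr))
        resp = conn.getresponse()
        body = resp.read()
        if resp.status != 200:
            raise RuntimeError(f"tika-server returned HTTP {resp.status}")
        return body.decode("utf-8", errors="replace")
    finally:
        conn.close()
```

Calling `extract_text(open("scan.pdf", "rb").read(), skip_ocr=True)` would then send the document with OCR disabled for that one request, while `skip_ocr=False` leaves the server's configured behaviour untouched.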

Re: --header "X-Tika-OCR: false" ; an option to fully disable OCR for each request

2021-06-10 Thread Cristian Zamfir
Thanks Nick. Looks like the option I was looking for is the 3rd one, but the docs say it is only available in Tika 2.x - am I right?

On Thu, Jun 10, 2021 at 3:47 PM Nick Burch wrote:
> On Thu, 10 Jun 2021, Cristian Zamfir wrote:
> > It would be nice if this was feasible via the headers of each

Re: --header "X-Tika-OCR: false" ; an option to fully disable OCR for each request

2021-06-10 Thread Nick Burch
On Thu, 10 Jun 2021, Cristian Zamfir wrote:
> It would be nice if this was feasible via the headers of each request. I find it more convenient to use if/else in my code than in the yaml files used for k8s configuration. Is there such an option?

Three options, see https://cwiki.apache.org/conflu

--header "X-Tika-OCR: false" ; an option to fully disable OCR for each request

2021-06-10 Thread Cristian Zamfir
Hi, I read through how to disable OCR here https://cwiki.apache.org/confluence/display/TIKA/TikaOCR and I am wondering if there is another option I missed. I would basically like to disable OCR fully (also for images, images in doc files, etc --- not just for PDFs) -- so the same as through a cus