Re: Tesseract PSM=0

Tim Allison Fri, 22 Jan 2021 07:07:45 -0800

I don't think the TesseractOCRParser is set up to parse this type of
output. PRs welcome, if there's a generalizable use case for
this(?).
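
For anyone who wants to experiment outside Tika in the meantime, here is a
minimal sketch of what parsing that output could look like. It assumes the
output keeps the "Key: value" layout shown in the quoted message below.
parse_osd is a hypothetical helper name, not part of Tika or Tesseract, and
the sample text is copied from the quoted command-line run.

```python
def parse_osd(text):
    """Collect 'Key: value' lines from `tesseract --psm 0` output into a dict.

    Hypothetical helper for illustration only; not part of Tika or Tesseract.
    Values are kept as strings, so callers convert them as needed.
    """
    result = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and anything that is not a 'Key: value' pair.
        if not line or ":" not in line:
            continue
        key, _, value = line.partition(":")
        result[key.strip()] = value.strip()
    return result


# Sample output copied from the quoted message.
sample = """Page number: 0
Orientation in degrees: 0
Rotate: 0
Orientation confidence: 8.75
Script: Latin
Script confidence: 2.86
"""

osd = parse_osd(sample)
rotation = int(osd["Rotate"])
script = osd["Script"]
```

Keeping the values as strings and converting at the call site avoids guessing
which fields are numeric if the layout differs between Tesseract versions.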


On Fri, Jan 22, 2021 at 9:31 AM Peter Kronenberg
<[email protected]> wrote:
>
> What is the expected behavior of Tika when using PSM 0? When using
> Tesseract directly from the command line, I get this
>
> c:\TestFiles>tesseract --psm 0 Dickens.png stdout
> Page number: 0
> Orientation in degrees: 0
> Rotate: 0
> Orientation confidence: 8.75
> Script: Latin
> Script confidence: 2.86
>
> But from Tika, I’m not getting any output. There’s obviously no OCR output,
> since PSM 0 doesn’t do OCR. It just does Orientation and Script detection.
> So where is that Tesseract output going?
