Hi,

> But for the Aurochs file I'm getting "Empty page!!". I have not been able
> to get a command working for both.


Invest some time in reading the documentation:
https://github.com/tesseract-ocr/tessdoc/blob/main/ImproveQuality.md

> Is there a way to say something like "try without PSM and if empty page try
> with psm 7"?


Tesseract is an OCR *engine* (with simple image layout detection), so if you
need to apply some logic, you need to implement it yourself.
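
A minimal sketch of such a wrapper in Python, assuming the `tesseract`
binary is on the PATH and that an "Empty page!!" run leaves stdout empty
(the message itself is expected on stderr). The function names
`run_tesseract` and `ocr_with_fallback` are made up for this example; the
command-line options are the ones from the original message.

```python
import subprocess
from typing import Callable, Sequence


def run_tesseract(image_path: str, extra_args: Sequence[str] = ()) -> str:
    """Run the tesseract CLI on one image and return its stdout, stripped."""
    cmd = ["tesseract", image_path, "stdout", "--dpi", "600", "-l", "eng", *extra_args]
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout.strip()


def ocr_with_fallback(
    image_path: str,
    attempts: Sequence[Sequence[str]] = ((), ("--psm", "7")),
    runner: Callable[[str, Sequence[str]], str] = run_tesseract,
) -> str:
    """Try each set of extra arguments in order; return the first non-empty text."""
    for extra_args in attempts:
        text = runner(image_path, extra_args)
        if text:  # assumption: an empty string means nothing was recognized
            return text
    return ""
```

Calling `ocr_with_fallback("image_1758836719_box0_score0_87.jpg")` would try
the default page segmentation first and `--psm 7` second. The order of
`attempts` can be changed, and a stricter "is this result usable" check
could replace the simple emptiness test.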

> Is that possible to provide my own list of possible words to look for?
> Like, can I provide "Aurochs, Greaves, Lightning" and enforce the OCR to
> use only those possible words?


Yes, it is; the documentation explains how. But the effect of customized
dictionaries is usually very limited.
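
Since a custom dictionary may not help much, one alternative (not a
Tesseract feature, just post-processing) is to filter the OCR output
against the known word list afterwards. The sketch below uses fuzzy
matching from Python's standard `difflib`; the function name
`snap_to_vocabulary` and the cutoff of 0.8 are assumptions for
illustration and would need tuning on real data.

```python
import difflib
from typing import List, Sequence


def snap_to_vocabulary(text: str, vocabulary: Sequence[str], cutoff: float = 0.8) -> List[str]:
    """Keep only tokens that closely match a known word, replaced by that word."""
    # Compare case-insensitively, but return the vocabulary's own spelling.
    lookup = {word.lower(): word for word in vocabulary}
    kept = []
    for token in text.split():
        match = difflib.get_close_matches(token.lower(), list(lookup), n=1, cutoff=cutoff)
        if match:
            kept.append(lookup[match[0]])
    return kept


words = ["Aurochs", "Greaves", "Lightning"]
noisy = "a e any aS |\nLightning Greaves"
print(snap_to_vocabulary(noisy, words))
```

With the noisy output quoted in the original message, the short garbage
tokens fall below the cutoff and only the two known words remain.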

Best regards,

Zdenko


On Fri, 26 Sep 2025 at 0:31, Jean-Marc Spaggiari <[email protected]>
wrote:

> Hi,
>
> I have 2 images pretty similar that I want to OCR.
>
> [image: image_1758836719_box0_score0_87.jpg]
> [image: image_1758836841_box0_score0_87.jpg]
> I think they are both pretty good quality. To OCR the 2nd one I'm using
> this command:
> tesseract image_1758836841_box0_score0_87.jpg stdout --dpi 600 --psm 7 -l eng
>
> And I'm getting exactly what is in the picture.
> However, the same command for the first picture doesn't return anything.
>
> Now, if I change the command for this one:
> tesseract image_1758836719_box0_score0_87.jpg stdout --dpi 600 -l eng
>
> I'm getting some output with a lot of noise:
> Detected 6 diacritics
> — sl O
>
> a e any aS |
> Lightning Greaves
>
> But for the Aurochs file I'm getting "Empty page!!". I have not been able
> to get a command working for both.
>
> So I have a few questions here.
>
>    - Is there a way to say something like "try without PSM and if empty
>    page try with psm 7"?
>    - Is that possible to provide my own list of possible words to look
>    for? Like, can I provide "Aurochs, Greaves, Lightning" and enforce the OCR
>    to use only those possible words?
>
>
> Thanks,
>
> JM
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion visit
> https://groups.google.com/d/msgid/tesseract-ocr/CAPQV63Uuzf7%2Bro%3Dfi3ff_7cswa%3DjvMAA7nPaynSxP1ZVG_YQ2g%40mail.gmail.com
> <https://groups.google.com/d/msgid/tesseract-ocr/CAPQV63Uuzf7%2Bro%3Dfi3ff_7cswa%3DjvMAA7nPaynSxP1ZVG_YQ2g%40mail.gmail.com?utm_medium=email&utm_source=footer>
> .
>

