> tesseract -v
tesseract 5.0.0-alpha-20210401-66-g91b2b4
 leptonica-1.81.0 (Apr 16 2021, 16:18:45) [MSC v.1928 LIB Release x64]
  libgif 5.2.1 : libjpeg 6b (libjpeg-turbo 2.0.91) : libpng 1.6.37 : libtiff 4.2.0 : zlib 1.2.11 : libwebp 1.2.0 : libopenjp2 2.4.0
 Found AVX2
 Found AVX
 Found FMA
 Found SSE4.1
 Found libarchive 3.5.1 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.6 libzstd/1.4.9
> echo %TESSDATA_PREFIX%
t:\Project-Personal\tessdata_best\tessdata

> tesseract 5_may_2021.jpg - --psm 7 -l eng
5 MAY 2021

5_may_2021.jpg is your first image (white text on black).

Zdenko

On Tue, 11 May 2021 at 11:19, Juanjo Gómez Navarro <[email protected]> wrote:

> Good morning, I'm trying to use Tesseract to read dates in image files.
> The problem I have is that the image is rather small. This is the cropped
> image with the date I have to process:
>
> [image: test-raw.jpg]
>
> After some processing with Scikit-Image (rescaling, adding a white border,
> erosion and binarising) I get this image:
>
> [image: processed.png]
>
> To me it reads pretty well. Still, tesseract reads "» MAY 2021". The "5"
> is missing.
>
> How can I process the image to get the desired output, i.e. "5 MAY 2021"?
>
> I'm using tesseract 4.1.1 with pytesseract.
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/75f4dfc8-7ec0-4334-8b11-72fc268f1b83n%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/75f4dfc8-7ec0-4334-8b11-72fc268f1b83n%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
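Since the original question mentions pytesseract, here is a rough sketch of how the command line from the reply (`--psm 7 -l eng` on the unprocessed image) might be expressed through it. The helper names `build_config` and `read_single_line` are made up for illustration. It assumes pytesseract and Pillow are installed and that the `tesseract` binary and the `eng` traineddata can be found. The reply used tesseract 5.0.0-alpha with tessdata_best, so results with 4.1.1 may differ.

```python
def build_config(psm: int = 7) -> str:
    """Build the extra options string handed to tesseract.

    PSM 7 tells tesseract to treat the image as a single text line,
    which matches the one-line date crop discussed in this thread.
    """
    return f"--psm {psm}"


def read_single_line(image_path: str, lang: str = "eng", psm: int = 7) -> str:
    """Run OCR on one image and return the recognised text, stripped.

    Hypothetical helper: the imports are local so that this module can be
    loaded even on a machine without pytesseract or Pillow.
    """
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as img:
        text = pytesseract.image_to_string(
            img, lang=lang, config=build_config(psm)
        )
    return text.strip()


# Example call, using the file name from the reply:
# print(read_single_line("5_may_2021.jpg"))
```

The point of the sketch is only that the page segmentation mode goes in through `config`, while the language goes in through `lang`; no preprocessing is applied to the image before recognition.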

