Try both single-line PSM modes: 7 (PSM_SINGLE_LINE, which your code already
sets) and 13 (PSM_RAW_LINE). I've had the best luck with 13 on single-line
images. Also, try removing the extra black marks that aren't part of the
letters before running OCR.
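
One possible way to get rid of small stray marks is to drop every connected
blob of black pixels below a size threshold before handing the image to
Tesseract. The following is only a rough sketch, not something taken from the
Tesseract or Leptonica APIs: the BinaryImage struct, the removeSmallBlobs
function, and the min_area threshold are all made-up names for illustration,
and it assumes the image has already been binarized (1 = black, 0 = white).

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical in-memory binary image: 1 = black (ink), 0 = white, row-major.
struct BinaryImage {
    int width = 0;
    int height = 0;
    std::vector<std::uint8_t> pixels;

    std::uint8_t& at(int x, int y) {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

// Hypothetical helper: erase every 4-connected black blob that has fewer than
// min_area pixels. Returns the number of blobs that were erased.
int removeSmallBlobs(BinaryImage& img, int min_area) {
    std::vector<std::uint8_t> visited(img.pixels.size(), 0);
    const int dx[4] = {1, -1, 0, 0};
    const int dy[4] = {0, 0, 1, -1};
    int removed = 0;

    for (int y = 0; y < img.height; ++y) {
        for (int x = 0; x < img.width; ++x) {
            const std::size_t idx = static_cast<std::size_t>(y) * img.width + x;
            if (img.pixels[idx] == 0 || visited[idx]) continue;

            // Flood fill from this pixel to collect the whole blob.
            std::vector<std::pair<int, int> > blob;
            std::vector<std::pair<int, int> > stack;
            stack.push_back(std::make_pair(x, y));
            visited[idx] = 1;

            while (!stack.empty()) {
                const std::pair<int, int> p = stack.back();
                stack.pop_back();
                blob.push_back(p);
                for (int d = 0; d < 4; ++d) {
                    const int nx = p.first + dx[d];
                    const int ny = p.second + dy[d];
                    if (nx < 0 || ny < 0 || nx >= img.width || ny >= img.height) continue;
                    const std::size_t nidx = static_cast<std::size_t>(ny) * img.width + nx;
                    if (img.pixels[nidx] == 0 || visited[nidx]) continue;
                    visited[nidx] = 1;
                    stack.push_back(std::make_pair(nx, ny));
                }
            }

            // Blobs below the threshold are treated as noise and painted white.
            if (static_cast<int>(blob.size()) < min_area) {
                for (std::size_t i = 0; i < blob.size(); ++i) {
                    img.at(blob[i].first, blob[i].second) = 0;
                }
                ++removed;
            }
        }
    }
    return removed;
}
```

The threshold needs tuning per image set: too high and it will also erase
small legitimate parts of characters, too low and the specks stay. Because the
sketch uses 4-connectivity, very thin diagonal strokes may be split into
several small blobs, so 8-connectivity might be the better choice for some
fonts.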

On Tue, Aug 27, 2019 at 5:12 AM Stephane Charette <
[email protected]> wrote:

> I have a large number of images that contain a single line of alphanumeric
> data.  My scans so far have not been great, and I could use some assistance.
>
> Several vars are turned off as recommended in the docs:
>
>     key.push_back("load_system_dawg");
>     val.push_back("false");
>     key.push_back("load_freq_dawg");
>     val.push_back("false");
>
>
> These are set at initialization:
>
>     tess->Init(nullptr, "eng", tesseract::OEM_DEFAULT, nullptr, 0,
>                &key, &val, false);
>     tess->SetPageSegMode(tesseract::PageSegMode::PSM_SINGLE_LINE);
>
>
> Some images are close, such as this one:
>
> [image: "32 EC 5"]
> ...which is interpreted as "SZ2EC 3".
>
> Others, like this one, return a blank string:
>
> [image: "30 B 9"]
>
> And then I have some like this one, which are so close, but Tesseract
> removes the spaces between the characters, so this example results in "1201":
>
> [image: "12 O 1"]
> I've posted my full .cpp test file and more example images showing the
> problem on StackOverflow:
> https://stackoverflow.com/questions/57670769/how-to-get-tesseract-to-recognize-these-alphanumeric-strings
>
> Thanks,
>
> Stéphane
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/f721e105-d0d6-4322-b9c5-6c5f2d487d06%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/f721e105-d0d6-4322-b9c5-6c5f2d487d06%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>

