Try the fine-tuned traineddata from:

https://github.com/Shreeshrii/tessdata_shreetest/commit/0108263ad0c4c9bd11e0c8190a81fb36e2e4e56a
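
If the output still contains characters outside the receipt character set, one
external workaround is to post-process the recognized text. Below is a minimal
sketch in Python, assuming the allowed set from the original question
([-a-zA-Z0-9,.$()/] plus whitespace). The function name clean_receipt_text and
the substitution table are illustrative assumptions, not part of Tesseract, and
the table would need tuning against real receipts.

```python
import re

# Allowed characters: the set from the original question, plus whitespace.
ALLOWED = re.compile(r"[-a-zA-Z0-9,.$()/\s]")

# Hypothetical look-alike substitutions; adjust for your own receipts.
SUBSTITUTIONS = {"|": "1", "!": "1", "[": "(", "]": ")", "{": "(", "}": ")"}


def clean_receipt_text(text, placeholder=""):
    """Map known look-alikes into the allowed set and replace everything else."""
    out = []
    for ch in text:
        ch = SUBSTITUTIONS.get(ch, ch)  # e.g. '|' becomes '1'
        out.append(ch if ALLOWED.fullmatch(ch) else placeholder)
    return "".join(out)


print(clean_receipt_text("TOTAL $|2.50 [tax]"))  # prints: TOTAL $12.50 (tax)
```

This only repairs the text after recognition; it cannot recover characters the
engine got wrong within the allowed set (for example 'O' read as '0').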


On Sat, Mar 30, 2019 at 1:47 AM Martin Emmerson <[email protected]> wrote:

> Yikes! Thanks for the reply, but I could barely follow the discussion on
> that pull request. It seems the answer at least for now is that there
> isn't a straightforward way to restrict character set without being
> somewhat familiar with the code base and dev environment (which I'm not).
> Thanks anyway; I'll try to figure out some external workarounds.
>
> On Thursday, March 28, 2019 at 11:03:59 PM UTC-7, shree wrote:
>>
>> See https://github.com/tesseract-ocr/tesseract/pull/2294
>>
>> On Fri, 29 Mar 2019, 11:17 Martin Emmerson, <[email protected]> wrote:
>>
>>> Is there a way to restrict the character set that tesseract-ocr will
>>> attempt to identify?  I'm scanning USA-based receipts which have a fairly
>>> simple set of monospaced characters but, for example, often '1' will get
>>> misidentified as '|', and a whole host of other simple substitution
>>> errors.  If I could just restrict tesseract to [-a-zA-Z0-9,.$()/] it would
>>> be an immediate boost to accuracy.  (Hoping for a way that doesn't involve
>>> having to retrain from scratch on the limited set.)
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to [email protected].
>>> To post to this group, send email to [email protected].
>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/2180d37f-50fd-47e6-9f48-c3ff73b1569e%40googlegroups.com
>>> <https://groups.google.com/d/msgid/tesseract-ocr/2180d37f-50fd-47e6-9f48-c3ff73b1569e%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>> For more options, visit https://groups.google.com/d/optout.
>>>


-- 

____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

