James,

Were you able to get this to work for you with 3.04/3.05?

I get accurate results using Tesseract 4.0 alpha, though it takes longer
with --oem 1 than --oem 0.


./troublewith98-300.jpg
Tesseract Open Source OCR Engine v4.00.00alpha-385-gab41465 with Leptonica

real    0m1.203s
user    0m0.578s
sys     0m0.203s
Tesseract Open Source OCR Engine v4.00.00alpha-385-gab41465 with Leptonica

real    0m4.485s
user    0m5.125s
sys     0m0.234s
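
If you want to repeat this comparison from a script, here is a minimal sketch in Python. It assumes the tesseract binary is on your PATH and that eng.traineddata is installed; the helper names build_command and time_ocr are made up for illustration and are not part of Tesseract. It runs the command-line tool once per engine mode and measures wall-clock time, which roughly corresponds to the "real" figure in the timings above (the slower run is presumably --oem 1).

```python
import subprocess
import time


def build_command(image_path, oem, lang="eng"):
    """Return the tesseract command line for one engine mode."""
    # Using "stdout" as the output base makes tesseract print the
    # recognized text instead of writing it to a .txt file.
    return ["tesseract", image_path, "stdout", "-l", lang, "--oem", str(oem)]


def time_ocr(image_path, oem, run=subprocess.run):
    """Run OCR once and return (elapsed_seconds, recognized_text).

    `run` defaults to subprocess.run; it is a parameter only so the
    function can be exercised without tesseract installed.
    """
    start = time.perf_counter()
    result = run(build_command(image_path, oem),
                 capture_output=True, text=True, check=True)
    elapsed = time.perf_counter() - start
    return elapsed, result.stdout.strip()
```

Calling time_ocr("troublewith98-300.jpg", 0) and then time_ocr("troublewith98-300.jpg", 1) gives one timing and one text result per engine mode. A single run is noisy, so repeat it a few times before drawing conclusions.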

See attached.

You can test with
https://sourceforge.net/projects/vietocr/files/vietocr.net/5.0alpha/
which uses Tesseract.NET (Tesseract 4.00alpha 362b68e)


ShreeDevi
____________________________________________________________
भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com

On Sun, Apr 23, 2017 at 9:25 AM, ShreeDevi Kumar <shreesh...@gmail.com>
wrote:

> Try training using more samples of 8, 9, B etc.
>
> What results do you get with the provided eng.traineddata?  Are they
> better or worse?
>
> Have you tried changing DPI of image to 300?
>
> - excuse the brevity, sent from mobile
>
> On 22-Apr-2017 10:29 PM, "James Abney" <abne...@gmail.com> wrote:
>
>> Oh yes, I forgot to include that information. I did train using only
>> that font, at the same font size. I am on Windows 7 and used 3.05 to
>> train, although the .NET wrapper I use is 3.04. I don't see why it has
>> difficulty with the 9 and 8; it seems very odd.
>>
>> On Friday, April 21, 2017 at 11:05:49 PM UTC-4, shree wrote:
>>>
>>> Which version of Tesseract? Which OS?
>>>
>>> If all your text is in tungsten-semibold, have you tried training with
>>> just that font?
>>>
>>> - excuse the brevity, sent from mobile
>>>
>>>
>>> On 22-Apr-2017 12:50 AM, "James Abney" <abn...@gmail.com> wrote:
>>>
>>> The font is tungsten semibold
>>>
>>>
>>> On Friday, April 21, 2017 at 2:08:53 PM UTC-4, James Abney wrote:
>>>>
>>>> I'm having issues with Tesseract misreading the digits 9 and 8,
>>>> especially when they are next to each other. This is really the only issue
>>>> I have. Even when I OCR a TIFF file, it reads 123456789 as 123456788. I will
>>>> link an example. Any help is appreciated. The following image is an example
>>>> where my software using Tesseract interprets 899B8993B as 88888-838.
>>>>
>>>>
>>>> <https://lh3.googleusercontent.com/-HF3RzbqMD6I/WPo8RYC6GaI/AAAAAAAAAJg/phkq6dgtvSE5f3upJQrfowEp1vyW8TQXwCLcB/s1600/troublewith98.png>
>>>>
>>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "tesseract-ocr" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to tesseract-oc...@googlegroups.com.
>>> To post to this group, send email to tesser...@googlegroups.com.
>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>> To view this discussion on the web visit
>>> https://groups.google.com/d/msgid/tesseract-ocr/4a0c2a52-3eb5-4884-9371-111a6fbea73b%40googlegroups.com.
>>>
>>> For more options, visit https://groups.google.com/d/optout.
>>>
>>>
>>
>

Bssss-s38

899B8993B

 

B8888-838

899889938

 

899B8993B

899889938
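
To see exactly which characters are being confused, each OCR string can be compared position by position with the expected text. The sketch below is only an illustration: the function name confusions is made up, and it assumes both strings have the same length, so it will not handle a result with inserted or dropped characters. The expected value 899B8993B is the one James gives for the sample image.

```python
def confusions(expected, actual):
    """List (index, expected_char, actual_char) for every mismatch."""
    if len(expected) != len(actual):
        # A length difference needs a real alignment (edit distance),
        # which this sketch does not attempt.
        raise ValueError("strings differ in length; align them first")
    return [(i, e, a)
            for i, (e, a) in enumerate(zip(expected, actual))
            if e != a]


print(confusions("899B8993B", "899889938"))
# prints [(3, 'B', '8'), (8, 'B', '8')]
```

In that comparison the 8s and 9s match and only the two B characters come back as 8.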
