Thanks for letting me know. No, I haven't had a chance. I will try 4.0,
although I have never worked with Tesseract manually. I've been using
programs for 3.x that handled training and generated box files automatically.
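
For anyone else who has only used a wrapper, here is a minimal sketch of calling the tesseract command line directly to compare the two engine modes mentioned below (--oem 0 and --oem 1). The helper names build_tesseract_cmd and run_ocr are made up for illustration. The sketch assumes a Tesseract 4.0 binary is on the PATH and that the installed eng.traineddata contains the legacy model that --oem 0 needs.

```python
import shutil
import subprocess


def build_tesseract_cmd(image_path, output_base="stdout", oem=1, lang="eng"):
    """Build the argument list for one tesseract CLI call.

    In Tesseract 4.0, --oem 0 selects the legacy engine and --oem 1 the
    LSTM engine. "stdout" as the output base prints the text instead of
    writing a .txt file.
    """
    if oem not in (0, 1, 2, 3):
        raise ValueError("oem must be 0, 1, 2 or 3")
    return ["tesseract", str(image_path), output_base,
            "-l", lang, "--oem", str(oem)]


def run_ocr(image_path, oem=1, lang="eng"):
    """Run tesseract and return the recognized text (hypothetical helper)."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_cmd(image_path, "stdout", oem, lang),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling run_ocr("troublewith98-300.jpg", oem=0) and then the same call with oem=1 should make it possible to compare the two engines on the same image.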

On Apr 24, 2017 12:43 AM, "ShreeDevi Kumar" <[email protected]> wrote:

> James,
>
> Were you able to get this to work for you with 3.04/3.05?
>
> I get accurate results using Tesseract 4.0 alpha, though it takes longer
> with --oem 1 than --oem 0.
>
>
> ./troublewith98-300.jpg
> Tesseract Open Source OCR Engine v4.00.00alpha-385-gab41465 with Leptonica
>
> real    0m1.203s
> user    0m0.578s
> sys     0m0.203s
> Tesseract Open Source OCR Engine v4.00.00alpha-385-gab41465 with Leptonica
>
> real    0m4.485s
> user    0m5.125s
> sys     0m0.234s
>
> See attached ..
>
> You can test with
> https://sourceforge.net/projects/vietocr/files/vietocr.net/5.0alpha/
> which uses Tesseract.NET (Tesseract 4.00alpha 362b68e)
>
>
> ShreeDevi
> ____________________________________________________________
> भजन - कीर्तन - आरती @ http://bhajans.ramparivar.com
>
> On Sun, Apr 23, 2017 at 9:25 AM, ShreeDevi Kumar <[email protected]>
> wrote:
>
>> Try training using more samples of 8, 9, B etc.
>>
>> What results do you get with the provided eng.traineddata?  Are they
>> better or worse?
>>
>> Have you tried changing DPI of image to 300?
>>
>> - excuse the brevity, sent from mobile
>>
>> On 22-Apr-2017 10:29 PM, "James Abney" <[email protected]> wrote:
>>
>>> Oh yes, I guess I forgot to include that information. I did train using
>>> only that font, at the same font size. I am on Windows 7 and I used
>>> 3.05 to train, although the .NET wrapper I use is 3.04. I don't see why it
>>> has difficulty with the 9 and 8; it seems very odd.
>>>
>>> On Friday, April 21, 2017 at 11:05:49 PM UTC-4, shree wrote:
>>>>
>>>> Which version of Tesseract? Which OS?
>>>>
>>>> If all your text is in tungsten-semibold, have you tried training with
>>>> just that font?
>>>>
>>>> - excuse the brevity, sent from mobile
>>>>
>>>>
>>>> On 22-Apr-2017 12:50 AM, "James Abney" <[email protected]> wrote:
>>>>
>>>> The font is tungsten semibold
>>>>
>>>>
>>>> On Friday, April 21, 2017 at 2:08:53 PM UTC-4, James Abney wrote:
>>>>>
>>>>> I'm having issues with Tesseract confusing the digits 9 and 8,
>>>>> especially when they are next to each other. This is really the only issue
>>>>> I have. Even when I OCR a TIFF file, it reads 123456789 as 123456788. I will
>>>>> link an example. Any help is appreciated. In the following image, my
>>>>> software using Tesseract interprets 899B8993B as 88888-838.
>>>>>
>>>>>
>>>>> <https://lh3.googleusercontent.com/-HF3RzbqMD6I/WPo8RYC6GaI/AAAAAAAAAJg/phkq6dgtvSE5f3upJQrfowEp1vyW8TQXwCLcB/s1600/troublewith98.png>
>>>>>
>>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "tesseract-ocr" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To post to this group, send email to [email protected].
>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/tesseract-ocr/4a0c2a52-3eb5-4884-9371-111a6fbea73b%40googlegroups.com.
>>>>
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>
>>>>
>>>
>>
>
