Hi Falke,

Thanks for trying this out.
The Hindi language data files for Tesseract should work. When I was working
on this in 2007-2008, Hindi language data files were not available. A
Bengali developer called debayanin tried hard to get Hindi / Devanagari
working.
Today the Hindi language data files (tessdata) are available. I haven't
tested them, but I expect they will work.
The question has been answered. Nepali should be able to use the Hindi data
files. It all depends on how accurate the results for Hindi are. If Hindi
is recognized flawlessly, Nepali should work similarly.
There is one slight difference: Nepali does not use some of the characters
that Hindi uses, although they are in the Devanagari chart. That works in
Nepali's favor. If it had been the reverse, we would have to train again to
incorporate the missing characters.
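
For anyone who wants to try this, here is a minimal sketch of running
Tesseract on a Nepali image with the Hindi data. It assumes the Hindi file is
installed as hin.traineddata in your tessdata directory and that the
tesseract binary is on your PATH. The helper names and the file names
(nepali_sample.png, nepali_out) are placeholders I made up for illustration.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, output_base, lang="hin"):
    """Build the command line: tesseract <image> <outputbase> -l <lang>."""
    # "hin" selects hin.traineddata; using it for Nepali is the assumption
    # discussed above, not a dedicated Nepali model.
    return ["tesseract", image_path, output_base, "-l", lang]


def run_ocr(image_path, output_base, lang="hin"):
    """Run Tesseract and return the recognized text (hypothetical helper)."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract binary not found on PATH")
    subprocess.run(build_tesseract_command(image_path, output_base, lang),
                   check=True)
    # Tesseract writes its plain-text result to <output_base>.txt
    with open(output_base + ".txt", encoding="utf-8") as handle:
        return handle.read()
```

Calling run_ocr("nepali_sample.png", "nepali_out") would return the text that
Tesseract wrote to nepali_out.txt. How good it is depends on the scan and on
the Hindi data.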

So everything should be fine.
Thanks for testing with the Nepali sample image. The result is not good,
but I think it can be improved, after some digging, with the correct Hindi
tessdata and the new Tesseract. Thanks, everyone, for reading this.


2012/5/1 Falke <[email protected]>

> I subjected your png to some pre-processing (resize, blur, threshold,
> etc.) and got slightly better results:
>
> ---------- my results -----------
> दृप्राक्वछ संसारन्मा पाइज्ञे प्रश्मीइरनंआ टूपबम्भन्दा चन्नाग्ध र
> बुद्धिन्मात्न प्राणी
> हो । यसले अक्वफ्लो बुद्धिको उपयोग ब्वगरेर संसारन्नाई बं सत्नाएको छा
> ह्नरासँचद्धरि इसको चन्नान्धीठो रांहॉका सवं प्रग़गीत्माई कट्सभाएको छा
> एक
> सइपरांन्मा सार:; प्रग़गीहाँ टज्ञइगत्मरज्ञइचक्वत्म चह्मर्ल आक्वछठो
> छात्र पात्रों छाडी
> चन्द्रछग़आ सठोत्त पाइत्मा इउप्तिसकेको छा रांप्तत्ये शयद्धत्प्त
> न्माप्तिसत्माई
> हृपृह्नयुको हपुरब्रबाट बत्ताठज्ञे अऋत्तलुल्या औंषणा बत्नाएप्त कि ।
> संसारका सवं
> न्माप्तिटूपत्माई ष्टकप्ताश सिपइरांश्वठज्ञे अज्वगुबन्म बत्नाएत्न कि ।
> घऊहिरिएर हैच हो अले
> न्नानंछ, शो अद-छे रास संदृपारको कति ड्डप्तलौठो प्राणी रहेछ ।
> ---------- end my results ------
>
> But, essentially, it's much better to start with higher-resolution
> scans.
>
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>



-- 
Rajesh Pandey

