Sir,
This is Ravi Roshan, a final-year MCA student at Pondicherry University. I 
am working on a Hindi OCR project and am using Tesseract for it, but 
it is not working for Hindi.
Could you please tell me which font you are using for Hindi?
 

On Wednesday, 10 July 2013 10:59:48 UTC+5:30, Kazem Jahanbakhsh wrote:
>
> Hi everyone,
>
> We have a set of images of bus head signs, which display the bus ID and 
> route details using LEDs. Our goal is to "*use Tesseract to extract the 
> text written in the cropped images*". When we ran it on the first image 
> shown below, which reads "*30 ROYAL OAK EX*", we got 
> "*30 RIWHL 0fl|( EX*" as the output. As you can see, Tesseract detected 
> only some of the characters correctly.
>
> <https://lh4.googleusercontent.com/-hFOIsEuVsUw/UdztzLbnqUI/AAAAAAAAAGw/OdNG99jkr3s/s1600/30_bus.jpg>
>
> We also tested Tesseract with another head sign image, shown below, 
> which reads "*26    UVIC*". However, in this case Tesseract returned 
> an empty string!
>
>
> <https://lh4.googleusercontent.com/-tVeJU0Hyjis/Udzu19sURfI/AAAAAAAAAG8/Zme6iJHd_sA/s1600/bus_26_headsign.jpg>
>
> So, we have two questions:
>
> 1- Can we use Tesseract for such a task, specifically passing in an image 
> like those above that contains English text and expecting it to extract 
> the text?
> 2- If so, why does Tesseract fail to detect the right text? Do we need to 
> train Tesseract on the fonts used in bus head signs? If so, how can we do 
> that? Finally, are there any wiki pages that explain Tesseract's internal 
> algorithms and how it extracts text from images?
>
> Any help would be really appreciated.
>
> Kazem
>
>

-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en
