Hi 

I am new to OCR and Tesseract. I have to work with the Sindhi script, which
is a little different from Arabic. Arabic is supported by Tesseract, but I
am not sure whether an image file of Sindhi script can be processed with
Tesseract. I have attached the Sindhi alphabet. It has 52 characters: some
are the same as in Arabic, some are like Persian, and there are a few more
besides. My question is: are these Sindhi characters recognized by the OCR?
If not, what should I do so that Tesseract can recognise the characters? Do
I just need to train Tesseract on the new characters, or do I need to extend
the Tesseract API?

Please find the image of the Sindhi alphabet attached.


Thanks

Meena

-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en


<<attachment: Sindhi script.jpg>>
