Salam,

Thanks for raising this topic :) I'm not that good at this, but I'll have a go and read the PDF you pointed out. From skimming through it, I think it has everything you need to understand the mechanisms required for an Arabic OCR engine. Or I may leave it for my final-year project :)

Salaam

On 9/25/07, Afief Halumi <[EMAIL PROTECTED]> wrote:
>
> Funny, I think me, linuxz and oomlx had a talk about it at #arabeyes.
> There has been a paper published outlining some nice methods for Arabic
> OCR, but the math of it is just beyond me.
>
> Here is the paper:
> http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-495.pdf
>
> On 9/25/07, Mohamed Magdy <[EMAIL PROTECTED]> wrote:
> >
> > Salam
> >
> > As mentioned earlier:
> >
> > http://lists.arabeyes.org/archives/developer/2006/September/msg00013.html
> >
> > It may be worthwhile and faster if Arabic support is implemented in
> > Tesseract-ocr.
> >
> > The important thing is Unicode support. Tesseract 2.0
> > (http://code.google.com/p/tesseract-ocr/) can use and understand Unicode
> > and can be trained for any language whose characters are not joined.
> >
> > What it is lacking is mentioned on the training page:
> >
> > > Tesseract can only handle left-to-right languages. While you can get
> > > something out with a right-to-left language, the output file will be
> > > ordered as if the text were left-to-right. Top-to-bottom languages
> > > will currently be hopeless.
> > >
> > > Tesseract is unlikely to be able to handle connected scripts like
> > > Arabic. It will take some specialized algorithms to handle this case,
> > > and right now it doesn't have them.
> >
> > http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract
> >
> > I did a very, very simple test:
> >
> > http://groups.google.com/group/tesseract-ocr/browse_thread/thread/b1b27838c68681ab
> >
> > If you can help, please do so.
> >
> > Note: As far as I know, there is currently NO working Arabic-capable
> > OCR engine, free or otherwise. I doubt that Sakhr's software can detect
> > anything.
> >
> > --alnokta
_______________________________________________
Developer mailing list
[email protected]
http://lists.arabeyes.org/mailman/listinfo/developer

