Salam
Thanks for bringing up this topic :)
I'm not that good at this, but I'll have a go at reading the PDF you pointed
out. Skimming through it, I think it has everything needed to understand the
mechanisms required for an Arabic OCR engine. Otherwise, I may leave it for
my final-year project during my studies :)
salaam

On 9/25/07, Afief Halumi <[EMAIL PROTECTED]> wrote:
>
> Funny, I think linuxz, oomlx and I had a talk about it at #arabeyes.
> A paper has been published outlining some nice methods for Arabic
> OCR, but the math in it is just beyond me.
>
> Here is the paper:
> http://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-495.pdf
>
> On 9/25/07, Mohamed Magdy <[EMAIL PROTECTED]> wrote:
> >
> > Salam
> >
> > As mentioned earlier
> >
> > http://lists.arabeyes.org/archives/developer/2006/September/msg00013.html
> >
> > It may be worthwhile, and faster, to implement Arabic support in
> > Tesseract-ocr.
> >
> > The important thing is Unicode support. Tesseract 2.0
> > (http://code.google.com/p/tesseract-ocr/) can use and understand Unicode
> > and can be trained for any language whose characters are not
> > joined.
> >
> > What it lacks is described on the training page:
> >
> > > Tesseract can only handle left-to-right languages. While you can get
> > > something out with a right-to-left language, the output file will be
> > > ordered as if the text were left-to-right. Top-to-bottom languages
> > > will currently be hopeless.
> > >
> > > Tesseract is unlikely to be able to handle connected scripts like
> > > Arabic. It will take some specialized algorithms to handle this case,
> > > and right now it doesn't have them.
> > >
> > http://code.google.com/p/tesseract-ocr/wiki/TrainingTesseract
> >
> > I did a very simple test:
> >
> > http://groups.google.com/group/tesseract-ocr/browse_thread/thread/b1b27838c68681ab
> >
> > If you could help, please please do so.
> >
> > Note: As far as I know, there is currently NO working Arabic-capable
> > OCR engine, free or otherwise. I doubt that the Sakhr software can
> > detect anything.
> >
> > --alnokta
> > _______________________________________________
> > Developer mailing list
> > [email protected]
> > http://lists.arabeyes.org/mailman/listinfo/developer
