Hi Sankarshan,

Having done some more reading [1], I am now positive that domain adaptability (coping with poorly scanned or tattered documents), which I was concerned about in my last email, is out of scope for now. It may, however, be included later when trying to make the system more robust.
I can see that most of the work has been done with Tesseract 2.x, but I would like to look into Tesseract 3.x, which is reported to have better support for connected-script languages. I am currently trying to find out more about how support for Hindi is implemented [2].

At this point, I would also like to read last year's proposal and work approach for the same project. Could you send me a copy?

[1] http://www.cvc.uab.es/icdar2009/papers/3725a671.pdf
[2] http://research.ijcaonline.org/volume39/number6/pxc3877076.pdf

--
Regards,
Debajyoti Nag
http://twitter.com/aramis7d
