Hi Debayan,

The Dhvani [1] (text-to-speech system) project includes a Bengali-to-English transliteration algorithm. I am not sure whether you can use it directly for your requirements, but it could be useful.
Here is a link to the code:
https://dhvani.svn.sourceforge.net/svnroot/dhvani/trunk/dhvani/src/

Best,
Golam

[1] http://dhvani.sourceforge.net/

On Sun, Apr 19, 2009 at 11:09 AM, Debayan Banerjee <debaya...@gmail.com> wrote:
> Does anyone know of any libraries that can transliterate Bengali to
> English? There are tools to do the reverse. I need this to solve the last
> remaining road-block in OCR.
> The thing is, Tesseract-OCR uses a data structure called
> directed-acyclic-word-graph to store dictionaries for lookup. After an
> OCR has been performed the OCR system matches the output with entries
> in this d.a.w.g. file. Unfortunately the data structure is not suited
> to complex scripts like ours
> <http://groups.google.com/group/tesseract-ocr/browse_thread/thread/5495c4e348a4b272/a6dcfe5d92babb35?lnk=gst&q=dawg%2Bwieghts#a6dcfe5d92babb35>.
> There are 2 solutions. 1) I figure out a suitable data structure that
> handles Indic script and implement it. 2) I transliterate the entire
> dictionary and the OCR output to English (26 characters instead of the
> 500 odd for Bengali) and then match. I think this should work.
> Any suggestions?
>
> [1] http://hacking-tesseract.blogspot.com/
> [2] http://code.google.com/p/tesseract-ocr
>
> --
> Be Intelligent, Use GNU/Linux
>
> http://debayanin.googlepages.com/
> http://debayan.wordpress.com
> http://lug.nitdgp.ac.in
>
> _______________________________________________
> Bengalinux-core mailing list
> Bengalinux-core@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bengalinux-core
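
For reference, option 2 from the quoted message (transliterate both the dictionary and the OCR output to Latin letters, then match) could be sketched roughly as below. This is only an illustration under stated assumptions: the mapping tables are tiny and hand-made, the romanization scheme (inherent vowel written as "a") is an arbitrary choice, and the names transliterate and build_index are hypothetical. It is not the Dhvani algorithm and not part of Tesseract.

```python
import unicodedata

# Hypothetical, deliberately tiny tables. A usable version would have to
# cover the whole Bengali block (U+0980-U+09FF) and handle conjuncts.
CONSONANTS = {
    "\u0995": "k",   # ka
    "\u0996": "kh",  # kha
    "\u09a8": "n",   # na
    "\u09ae": "m",   # ma
    "\u09b2": "l",   # la
}
INDEPENDENT_VOWELS = {"\u0985": "a", "\u0986": "aa", "\u0987": "i"}
VOWEL_SIGNS = {"\u09be": "aa", "\u09bf": "i", "\u09c1": "u", "\u09c7": "e"}
HASANTA = "\u09cd"   # suppresses the inherent vowel
INHERENT = "a"       # assumed spelling of the inherent vowel


def transliterate(word):
    """Map a Bengali word to Latin letters using the tables above."""
    chars = unicodedata.normalize("NFC", word)
    out = []
    i = 0
    while i < len(chars):
        ch = chars[i]
        nxt = chars[i + 1] if i + 1 < len(chars) else ""
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            if nxt in VOWEL_SIGNS:
                out.append(VOWEL_SIGNS[nxt])  # sign replaces inherent vowel
                i += 1
            elif nxt == HASANTA:
                i += 1                        # no vowel after this consonant
            else:
                out.append(INHERENT)
        elif ch in INDEPENDENT_VOWELS:
            out.append(INDEPENDENT_VOWELS[ch])
        else:
            out.append("?")                   # character not in the tables
        i += 1
    return "".join(out)


def build_index(words):
    """Group dictionary words by their transliteration.

    The mapping is lossy, so several Bengali words may share one key.
    """
    index = {}
    for w in words:
        index.setdefault(transliterate(w), set()).add(w)
    return index


# Example: a three-word dictionary and one OCR result.
dictionary = ["\u0995\u09b2\u09ae", "\u09a8\u09be\u09ae", "\u09ae\u09be"]
index = build_index(dictionary)
ocr_output = "\u09a8\u09be\u09ae"
candidates = index.get(transliterate(ocr_output), set())
```

Because the transliteration is lossy, a match on the Latin string only narrows the search to a set of candidate words; the original Bengali strings still have to be compared to pick the right one.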