Hi,

this is to announce a new project on SourceForge called BOCRA, which aims to develop OCR software targeted at Bengali but also usable for other scripts, especially ones similar to Bengali. See the webpage
http://bocra.sf.net for more details (the page is slightly outdated, but mostly accurate). There's a mailing list too, so feel free to ask questions there.

Before you get your hopes up, note that BOCRA does not work out of the box (it needs to be trained for every new font) and is very unlikely to work well with noisy input images.

Status and availability:

The software is currently in alpha status (by my definition), i.e. it mostly works to my satisfaction on most of my machines. It uses Qt 4 and R, both of which are required to compile the sources. The sources (licensed under the GNU GPL) are hosted in a Subversion repository on SourceForge, so you need a Subversion client to get up-to-date sources. A snapshot tarball is also available, which you can download without a Subversion client.

OS requirements:

Both Qt 4 and R are available on all major platforms. However, I have tested BOCRA only on Linux, so I don't know whether it will work on other platforms (e.g. Windows). There will probably be issues linking Qt and R together on Windows, so I don't expect it to work out of the box there.

Deepayan

-- 
http://www.stat.wisc.edu/~deepayan/