Vineet, you may be interested in the following:

In 2003, the DARPA TIDES program conducted a "surprise language" exercise, in which participants tried to ramp up a variety of Hindi language processing capabilities in a very short time frame (a month or so, I believe). This newsletter describes it in a little more detail:

http://language.cnri.reston.va.us/TeamTIDES/tt02e3-final.pdf

I think this page at the LDC describes some of these collected resources:

http://www.ldc.upenn.edu/myl/hindi.html

In particular, the latter lists several sources of parallel data. Some of the links are dead, but this may give you a lead on putting together a corpus.

Good luck!

- John Burger
  MITRE

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
