Hello all,

I am Karan Singla, pursuing a BTech in CSE and an MS in Computational
Linguistics at IIIT, Hyderabad. I have been working rigorously on Machine
Translation for the past year, and was part of the SEECAT project at CBS,
Denmark.

I haven't worked in open source before, but I would like to contribute to
this project.

I am thinking of adding a new language pair to the project, as there is no
existing MT system for English-Hindi and Hindi-English. Various
experiments could be done to build a state-of-the-art system.



Motivation:
===> For choosing Hindi: it can be kept as a pivot language for various
other Indian languages that have a similar word order.

===> Why? There is no existing good MT model for this language pair.


Freely available parallel data released in WMT 14
Large monolingual data available
Parallel data in 10 Indian languages including Hindi (further chaining
can be tried)

Experiments in Mind:
Data cleaning (the data is not really parallel, as it was formed by
extracting text from PDFs etc.)
Noise cleaning in the basic phrase table (removing mis-aligned pairs)
Applying a transfer grammar (a lab at LTRC, IIIT-Hyderabad has a transfer
grammar for re-ordering English into Hindi word order; this has proved to
give better alignments)
Re-ranking (using an RNN-based LM with features such as those from the
MICA parser)
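
As a rough illustration of the data-cleaning step only, here is a minimal
sketch in Python. It assumes the corpus is available as (English, Hindi)
sentence pairs and drops pairs that are empty, too long, badly mismatched
in length, or whose Hindi side is mostly not in Devanagari script. The
function names and all thresholds are hypothetical, chosen just for
illustration; they are not taken from any existing tool or from the WMT
data release.

```python
def is_devanagari(ch):
    """True if the character lies in the Unicode Devanagari block."""
    return "\u0900" <= ch <= "\u097F"


def keep_pair(en, hi, max_ratio=3.0, max_len=100, min_deva=0.5):
    """Heuristic check for a plausible English-Hindi sentence pair.

    All thresholds are illustrative assumptions, not tuned values.
    """
    en_tokens = en.split()
    hi_tokens = hi.split()

    # Drop pairs where either side is empty.
    if not en_tokens or not hi_tokens:
        return False

    # Drop very long sentences, which tend to align poorly.
    if len(en_tokens) > max_len or len(hi_tokens) > max_len:
        return False

    # Drop pairs whose token counts differ too much.
    longer = max(len(en_tokens), len(hi_tokens))
    shorter = min(len(en_tokens), len(hi_tokens))
    if longer / shorter > max_ratio:
        return False

    # Require most non-space characters on the Hindi side to be Devanagari.
    chars = [c for c in hi if not c.isspace()]
    deva = sum(1 for c in chars if is_devanagari(c))
    return deva / len(chars) >= min_deva


def clean_corpus(pairs):
    """Return only the pairs that pass keep_pair."""
    return [(en, hi) for en, hi in pairs if keep_pair(en, hi)]
```

For example, clean_corpus could be run over the extracted pairs before
word alignment, so that obviously broken pairs never reach the phrase
table.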

The other part of the project will be to assist the translator in this CAT
system with possible translations from the translation memory and the MT
output. The translator can choose among them and post-edit the result.

Do you think this could be a nice idea?

Also, if there is any, I would be happy to know the progress of the
chaining experiment for translation.

Regards,
Karan
LTRC, IIIT-Hyderabad
_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
