Hi all,
I am thinking of adding a new language pair to the project, as there is no
existing MT system for English-Hindi and Hindi-English. There are
various experiments that could be done to build a state-of-the-art system.

Motivation:
Parallel data was released for WMT 14
Large monolingual data is available


Experiments in Mind:
Data cleaning (the corpus is not really parallel, as it was formed by
extracting data from PDFs etc.)
Noise cleaning in the basic phrase table (removing misaligned pairs)
Applying a transfer grammar (a lab at LTRC, IIIT-Hyderabad has a transfer
grammar for reordering English into Hindi word order; this has proved to
give better alignments)
Re-ranking (using an RNN-based LM, with extra features such as those from
the MICA parser)
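
To make the data cleaning step concrete, here is a minimal sketch of a
sentence-pair filter. It is only an illustration, not a tested pipeline: the
function names (devanagari_ratio, keep_pair) and the thresholds (a token
length ratio of 3.0, a script ratio of 0.5) are assumptions made for this
example and would need tuning on the actual WMT 14 data.

```python
def devanagari_ratio(text):
    """Fraction of alphabetic characters that fall in the Devanagari block."""
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    in_block = sum(1 for ch in letters if "\u0900" <= ch <= "\u097F")
    return in_block / len(letters)


def keep_pair(en, hi, max_len_ratio=3.0, min_script_ratio=0.5):
    """Return True if an (English, Hindi) pair looks plausibly parallel.

    The thresholds are illustrative assumptions, not tuned values.
    """
    en_tokens = en.split()
    hi_tokens = hi.split()
    # Drop pairs where either side is empty.
    if not en_tokens or not hi_tokens:
        return False
    # Drop pairs whose token counts differ too much.
    longer = max(len(en_tokens), len(hi_tokens))
    shorter = min(len(en_tokens), len(hi_tokens))
    if longer / shorter > max_len_ratio:
        return False
    # The Hindi side should be mostly Devanagari, the English side mostly not.
    if devanagari_ratio(hi) < min_script_ratio:
        return False
    if devanagari_ratio(en) > 1.0 - min_script_ratio:
        return False
    return True


pairs = [
    ("Hello world", "नमस्ते दुनिया"),
    ("Hello world", "Hello world"),  # untranslated text copied to the Hindi side
]
cleaned = [pair for pair in pairs if keep_pair(*pair)]
```

A filter like this only catches the crudest extraction noise (empty lines,
untranslated segments, badly mismatched lengths); misaligned but well-formed
pairs would still have to be handled at the phrase table stage.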

The next step would be integrating this system to provide suggestions to
the translator, also using translation memory for suggestions.
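
As a rough idea of how translation memory suggestions could work, here is a
small sketch of fuzzy matching with Python's standard difflib module. The
function name tm_suggestions, the in-memory list of (source, target) pairs,
and the 0.7 threshold are all assumptions for illustration; a real CAT tool
would use an indexed store and a tuned similarity measure.

```python
from difflib import SequenceMatcher


def tm_suggestions(source, memory, threshold=0.7, limit=3):
    """Return up to `limit` TM entries whose source side resembles `source`.

    `memory` is a list of (source_sentence, target_sentence) pairs.
    Similarity is computed over lowercased whitespace tokens.
    """
    query = source.lower().split()
    scored = []
    for tm_source, tm_target in memory:
        candidate = tm_source.lower().split()
        score = SequenceMatcher(None, query, candidate).ratio()
        if score >= threshold:
            scored.append((score, tm_source, tm_target))
    # Best matches first.
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:limit]


memory = [
    ("the house is big", "घर बड़ा है"),
    ("the weather is nice", "मौसम अच्छा है"),
]
matches = tm_suggestions("the house is small", memory)
```

Matches below the threshold would fall back to the MT output, so the
translator sees a TM suggestion only when it is close enough to be useful.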

I have worked on building a CAT system called SEECAT:
http://bridge.cbs.dk/platform/?q=SEECAT

Another possibility is providing speech (ASR) as an alternative
input tool for the user, which has proved to make translators faster.

Am I on the right track?

Regards,
Karan
LTRC, IIIT-Hyderabad
_______________________________________________
Apertium-stuff mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/apertium-stuff
