Hi,

The LDC (Linguistic Data Consortium) provides parallel corpora on the order of 200 million words, which are widely used to build statistical machine translation engines.
-phi

On Fri, Oct 17, 2008 at 8:08 PM, Ray Wen <[EMAIL PROTECTED]> wrote:
> Hi,
>
> We are a software company specializing in information retrieval. One of
> our U.S. Air Force customers has been using Moses to translate from
> Chinese to English. We are able to produce XML markup files in which,
> generally, Chinese person, location, and organization names are tagged
> with their English equivalents as a parameter.
>
> We'd like to see how much feeding our named-entity-tagged XML files to
> Moses would help the translation. However, we do not have a large
> enough parallel Chinese-English corpus to train on, and our customer is
> not able to provide one because of copyright issues.
>
> Before resorting to training on movie subtitles, we are wondering if you
> know of any Chinese-English language model and phrase table that are
> readily available.
>
> Thank you for the help,
>
> Ray
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
