Hi,

You may want to read this paper (http://aclweb.org/anthology-new/E/E12/E12-1016.pdf) to find the answer to your question.
--
Cheers,
Vu

On Fri, May 11, 2012 at 9:00 AM, Wang Pidong <[email protected]> wrote:

> In my opinion, that depends on the differences between the source language
> and the target language, and also depends on the domain of the test set.
>
> 1. if the two languages are quite different, e.g. Chinese-English: the
> words are totally different, and the grammars are also different, so we
> need more training data;
>
> 2. if the test set contains many different domains of texts, of course the
> training data also need to contain these domains in order to get good
> performance.
>
> Best wishes!
> Pidong
>
> On 11 May 2012 00:02, tharaka weheragoda <[email protected]> wrote:
>
>> Hi All,
>> If anybody knows about the minimum amount of parallel data required for
>> SMT to perform well please let me know.
>>
>> Thanks in advance!
>> Tharaka
>>
>> _______________________________________________
>> Moses-support mailing list
>> [email protected]
>> http://mailman.mit.edu/mailman/listinfo/moses-support
>
> --
> Wang Pidong
>
> Department of Computer Science
> School of Computing
> National University of Singapore
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
