In my opinion, that depends on how different the source language and the
target language are, and also on the domains covered by the test set.

1. If the two languages are very different, e.g. Chinese-English, where
both the vocabulary and the grammar differ, then more training data is
needed.

2. If the test set covers many different text domains, the training data
also needs to cover those domains in order to get good performance.
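
As a rough way to check point 2, one could measure what fraction of the
test-set tokens never occur in the training data. This is only a sketch:
the function name oov_rate, the whitespace tokenization, and the toy
sentences are my own assumptions for illustration and are not part of
Moses.

```python
def oov_rate(train_sentences, test_sentences):
    """Return the fraction of test tokens that never appear in training."""
    # Assumption: lowercased whitespace tokenization. A real setup should
    # use the same tokenizer as the translation pipeline.
    train_vocab = {tok for sent in train_sentences for tok in sent.lower().split()}
    test_tokens = [tok for sent in test_sentences for tok in sent.lower().split()]
    if not test_tokens:
        return 0.0  # nothing to measure on an empty test set
    unseen = sum(1 for tok in test_tokens if tok not in train_vocab)
    return unseen / len(test_tokens)


# Hypothetical toy data: source-side sentences only.
train = ["the court approved the contract", "the contract was signed"]
in_domain_test = ["the court signed the contract"]
out_of_domain_test = ["the patient received the vaccine"]

print(oov_rate(train, in_domain_test))
print(oov_rate(train, out_of_domain_test))
```

A high rate on one part of the test set suggests that its domain is
poorly covered by the training data. A low rate does not guarantee good
translations, since this only looks at single words.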

Best wishes!
Pidong

On 11 May 2012 00:02, tharaka weheragoda <[email protected]> wrote:

> Hi All,
>   If anybody knows about the minimum amount of parallel data required for
> SMT to perform well please let me know.
>
> Thanks in advance!
> Tharaka
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>


-- 
Wang Pidong

Department of Computer Science
School of Computing
National University of Singapore