Re: [Moses-support] How to use relevant training data to train a good engine for our domain?

Barry Haddow Sat, 07 Aug 2010 10:19:20 -0700

Hi Wenlong

There's no single best way to do this, but it's something people have thought 
about, and there are a few different things you can try. For example, this paper 
compares different methods of combining training data:


http://aclweb.org/anthology-new/W/W07/W07-0733.pdf

There have been other papers on this topic. Have a look at the system papers in 
the recent ACL workshops on machine translation.

best regards
Barry

On Saturday 07 Aug 2010 11:48:43 Wenlong Yang wrote:
> Hi Guys,
> 
> 
> I have a question here:
> I want to train a Moses engine for domain A. I have some training data
> for domain A (for example, 40000 words) and more training data (for
>  example, 200000 words) which does not specifically belong to domain A but
>  is still relevant. How can I use the extra training data to generate the
>  highest quality Moses engine for domain A? I mean, how to use the 40000
>  lines' relevant data better?
> 
> Just simply combine these two sets of training data together? Is this the
> best solution?
> 
> Thanks so,
> Wenlong
> 

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
