Hi Wenlong,

There's no single best way to do this, but people have studied the problem, and there are a few different approaches you can try. For example, this paper compares different methods of combining training data:
http://aclweb.org/anthology-new/W/W07/W07-0733.pdf

There have been other papers on this topic. Have a look at the system papers in the recent ACL workshops on machine translation.

Best regards,
Barry

On Saturday 07 Aug 2010 11:48:43 Wenlong Yang wrote:
> Hi Guys,
>
>
> I have a question here:
> I want to train a moses engine for domain A, now I have some training data
> for domain A (for example, 40000 words) and more training data (for
> example, 200000 words) which is not specifically belongs domain A, but
> also relevant. How can I use the extra training data to generate the
> highest quality Moses engine for domain A? I mean, how to use the 40000
> lines' relevant data better?
>
> Just simply combine these two sets of training data together? Is this the
> best solution?
>
> Thanks so,
> Wenlong
>
_______________________________________________
Moses-support mailing list
http://mailman.mit.edu/mailman/listinfo/moses-support
