Hello, I have a large phrase-based translation system. Word alignment was done with MGIZA and took a few weeks. I now have a small amount of extremely relevant new bitext (~50,000 lines) that I would like to use to augment the model without retraining everything. The new data contains many important words that do not appear anywhere else in the training data, so at the very least the lexical tables would need to be updated, in addition to adding the new alignments. I can rerun the rest of training (steps 3 onward) without any problem, as long as the relevant files from steps 1 and 2 are updated in a reasonable way. Is there a way to do this, or should I just cut my losses and retrain the entire thing?
Thanks, Elliot
_______________________________________________ Moses-support mailing list [email protected] http://mailman.mit.edu/mailman/listinfo/moses-support
