Hi Vineet,

Vineet Kashyap wrote:
> Hi all
>
> Just wanted to know a few things about applying
> linguistic inputs to improve a baseline SMT system.
>
> 1. Should we tag both the source and the target language
> for training? And when we are tuning/testing should that
> data be tagged as well?
>

Whether you need to tag both sides of the corpus depends on what you want to do with them. If you translate with factors, e.g. 0,1,2 -> 0, then you only need factor 0 on the target side. Normally, though, you will want to tag both languages with all (or almost all) factors so that you can try out different factor combinations. Have a look at:

http://www.statmt.org/moses/?n=Moses.FactoredTutorial

It contains a couple of examples with different factor combinations.

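To make the factor notation concrete, here is a minimal Python sketch. It is not part of Moses: the function names are made up for illustration, and the example sentence and tags are invented. It assumes the factored text format used in the tutorial, where the factors of each token are joined by "|" and factor 0 is the surface form. It shows how a fully tagged line could be reduced to a subset of factors (for instance factor 0 only on the target side), and how one might check that every token carries the same number of factors.

```python
# Hypothetical helpers, not part of Moses. They assume the factors of a
# token are joined by "|" and that factor 0 is the surface word form.

def select_factors(line, keep, sep="|"):
    """Return `line` with only the factor indices in `keep` on every token."""
    out = []
    for token in line.split():
        factors = token.split(sep)
        if max(keep) >= len(factors):
            raise ValueError(f"token {token!r} has only {len(factors)} factor(s)")
        out.append(sep.join(factors[i] for i in keep))
    return " ".join(out)


def factor_counts(lines, sep="|"):
    """Return the set of distinct per-token factor counts seen in `lines`."""
    return {len(tok.split(sep)) for line in lines for tok in line.split()}


# Invented example: surface form | lemma | POS tag
tagged = "the|the|DT houses|house|NNS"

print(select_factors(tagged, [0]))      # the houses
print(select_factors(tagged, [0, 2]))   # the|DT houses|NNS
print(factor_counts([tagged]))          # {3}
```

If `factor_counts` returns more than one value for a file, some tokens are missing factors, which is the kind of mismatch worth fixing before training.
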
The tuning set for MERT should be in the same format as the training set, and so should the foreign-language (source) side of the test set. The format of the target-language test set depends on what kind of translation you want as output. If it is just the word tokens, you'll need an untagged version of the test set. Moses normally tells you if there is a problem with the number of factors within a data set.

> 2. Also, can language model be tagged and will it make
> any improvements.
>
> 3. I have the enhanced Brill's tagger for english are there
> any other good ones?
>
> Finally, when applying linguistic inputs to both source and
> target language should they have the same no. of tags like
> pos, any other morphological information.
>

You are free to choose the number of factors; it does not need to be the same on both sides. The essential question is how you want to combine these factors, because that is what you need to tell Moses through the train-factored-phrase-model.perl parameters.

Best regards,
Christine

> Thanks in advance.
>
> Regards,
>
> Vineet
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
