Hi all,

1. Is there a way to output unknown words to a separate file instead of dropping them? I think we could then add those words to the dictionary, which should improve accuracy.
2. Also, when adding a dictionary to the parallel corpus, as suggested by Phillip in the previous post, each entry has one word in the source language and the other in the target language. Is that correct?
3. Does BLEU use a reference file with accurate human translations to estimate a score? If not, would it be better to evaluate the system with such a reference file?
4. What BLEU value, as a percentage, indicates good translations? And, for comparison purposes, how would a human judge an MT system's performance?
5. Can we train higher-order language models with SRILM on a small corpus, or do we have to use IRSTLM?

Thanks a lot in advance for taking the time to answer these questions.

Regards,
Vineet

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
