The tokenizer falls back to a default setting if it doesn't know the language. It's set up for European languages, so it may not do a good job with Persian. However, if it crashes, that's not because of the language; something else is wrong. You should look in the .STDERR file for that step to see what went wrong.
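If it helps, here is a minimal sketch for pulling out the last few lines of that file. It assumes the .STDERR file sits next to the step file and has the same name with ".STDERR" appended; `tail_stderr` is just a hypothetical helper name, not part of Moses or EMS.

```python
from pathlib import Path


def tail_stderr(step_file, n=20):
    """Return the last n lines of the .STDERR file next to an EMS step file.

    Assumption: the log is named "<step file>.STDERR". Returns an empty
    list if no such file exists.
    """
    stderr_path = Path(str(step_file) + ".STDERR")
    if not stderr_path.exists():
        return []
    # errors="replace" keeps odd bytes in the log from raising an exception
    lines = stderr_path.read_text(errors="replace").splitlines()
    return lines[-n:]


if __name__ == "__main__":
    # Step path taken from the crashed step reported in the message below
    step = "/home/amin/mt/work/Basic/3lm/steps/1/EVALUATION_test_tokenize-reference.1"
    for line in tail_stderr(step):
        print(line)
```

The actual error message is usually near the end of the log, which is why the sketch only shows the tail.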
Hieu
Sent from my flying horse

On 14 Feb 2011, at 09:57 PM, amin farajian <[email protected]> wrote:

Hello Dear Hoang,

While training the system, I ran into a problem. In the first steps, EMS produces this error:

executing /home/amin/mt/work/Basic/3lm/steps/1/EVALUATION_test_tokenize-reference.1 via sh (2)
executing /home/amin/mt/work/Basic/3lm/steps/1/LM_toy_binarize.1 via --- on hold
executing /home/amin/mt/work/Basic/3lm/steps/1/EVALUATION_test_input-from-sgm.1 via --- on hold
step EVALUATION:test:tokenize-reference crashed

But it continues without any other problem. I think this is because I'm using the Persian-English language pair and there is no tokenizer for Persian. I wanted to know whether this crash can affect the training procedure.
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
