Dear Moses devs/users,

@Marcin, thanks for the tip on the trie; I'll try it out.

About the 100 MERT iterations: when I tried to run mert-moses.pl on that
target language with a 71GB binarized language model on a 3000-line dev
set, it took more than one day to tune using 10 threads. Is that normal?

For a different experiment with a 38GB binarized language model, it took
at most 4-5 hours to tune with 10 threads on a 3000-line dev set. (All the
phrase-tables and reordering-tables are binarized.)

I ran mert-moses.pl with only the model directory and the path to
moses.ini.

Regards,
Liling


Binarizing like this gives you a much smaller file:

build_binary trie -a 22 -b 8 -q 8 lm.arpa.gz lm.kenlm

This uses quantization. In theory that could cause quality loss, but I
never saw that happen. Remove "-b 8 -q 8" if you are worried about that;
the file will be larger, but still a lot smaller than what you have.
That's about all I do. You said "100 MERT iterations" ... what do you
mean by that? Also, the LM is memory-mapped into shared memory, so
running several moses instances in parallel does not use additional
memory for the LM; the same goes for the phrase table.
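
For reference, the two binarization variants described above (with and
without "-b 8 -q 8") can be put side by side. This is only a sketch:
kenlm_cmd is a hypothetical helper name, not part of KenLM, and it just
prints the command line instead of running build_binary, so it can be
tried without the LM files present. The file names are the ones from the
command above.

```shell
# Hypothetical dry-run helper: prints the build_binary command line
# for the quantized or unquantized trie, without executing it.
kenlm_cmd() {
  quantize="$1"   # "yes" to keep -b 8 -q 8, anything else to drop them
  arpa="$2"       # input ARPA file
  out="$3"        # output binary LM file
  if [ "$quantize" = "yes" ]; then
    echo "build_binary trie -a 22 -b 8 -q 8 $arpa $out"
  else
    echo "build_binary trie -a 22 $arpa $out"
  fi
}

kenlm_cmd yes lm.arpa.gz lm.kenlm   # quantized: smallest file
kenlm_cmd no  lm.arpa.gz lm.kenlm   # unquantized: larger file
```

To actually build the model, run the printed line in a shell where
build_binary is on the PATH.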
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
