Hi everyone,

One thing we did this week at MT Marathon was a speed comparison of Joshua 6.1 (release candidate) with Moses2, a ground-up rewrite of Moses designed for speed (see the attached paper). Moses2 is 4–6x faster than Moses phrase-based, and 100x (!) faster than Moses hiero.
I tested on two moderate-to-large datasets that Hieu Hoang (CC'd) provided: ar-en and ru-en. Timing results are from 10,000 sentences in each corpus. The average sentence length is 7.5 for ar-en and 28 for ru-en. I ran only one test per language pair, so the numbers could shift somewhat if averaged over several runs, but I think the results look pretty consistent. The timing is end-to-end, including model load time, where Moses2 tends to be a bit faster.

Note also that Joshua does not have lexicalized distortion, while Moses2 does. This means the BLEU scores are a bit lower for Joshua: 62.85 versus 63.49. This shouldn't really affect runtime, however.

I'm still working on ru-en, but here are the ar-en results:
Some conclusions:

- Hieu has done some bang-up work on the Moses2 rewrite! Joshua is in general about 3x slower than Moses2.
- We don't have a Moses comparison, but extrapolating from Hieu's paper, it seems we might be as fast as or faster than Moses phrase-based decoding, and are a ton faster on Hiero.

I'm going to send my models to Hieu so he can test on his machine, and then we'll have a better feel for this, including how it scales on a machine with many more processors.

matt