I'll ask Hieu; I don't anticipate any problems. One potential issue is that the models occupy about 15–20 GB; do you think Jenkins would host this?

(ru-en grammars still packing, results will probably not be in until much later today)

matt

> On Sep 17, 2016, at 3:19 PM, Tommaso Teofili <[email protected]> wrote:
>
> Hi Matt,
>
> I think it'd be really valuable if we could be able to repeat the same
> tests (given parallel corpus is available) in the future, any chance you
> can share script / code to do that ? We may even consider adding a Jenkins
> job dedicated to continuously monitor performances as we work on Joshua
> master branch.
>
> WDYT?
>
> Anyway thanks for sharing the very interesting comparisons.
> Regards,
> Tommaso
>
> On Sat, Sep 17, 2016 at 12:29, Matt Post <[email protected]> wrote:
>
>> Ugh, I think the mailing list deleted the attachment. Here is an attempt
>> around our censors:
>>
>> https://www.dropbox.com/s/80up63reu4q809y/ar-en-joshua-moses2.png?dl=0
>>
>>
>>> On Sep 17, 2016, at 12:21 PM, Matt Post <[email protected]> wrote:
>>>
>>> Hi everyone,
>>>
>>> One thing we did this week at MT Marathon was a speed comparison of
>>> Joshua 6.1 (release candidate) with Moses2, which is a ground-up rewrite
>>> of Moses designed for speed (see the attached paper). Moses2 is 4–6x
>>> faster than Moses phrase-based, and 100x (!) faster than Moses hiero.
>>>
>>> I tested using two moderate-to-large sized datasets that Hieu Hoang
>>> (CC'd) provided me with: ar-en and ru-en. Timing results are from 10,000
>>> sentences in each corpus. The average ar-en sentence length is 7.5, and
>>> for ru-en is 28. I only ran one test for each language, so there could
>>> be some variance if I averaged, but I think the results look pretty
>>> consistent. The timing is end-to-end (including model load times, which
>>> Moses2 tends to be a bit faster at).
>>>
>>> Note also that Joshua does not have lexicalized distortion, while Moses2
>>> does. This means the BLEU scores are a bit lower for Joshua: 62.85
>>> versus 63.49. This shouldn't really affect runtime, however.
>>>
>>> I'm working on the ru-en, but here are the ar-en results:
>>>
>>>
>>>
>>> Some conclusions:
>>>
>>> - Hieu has done some bang-up work on the Moses2 rewrite! Joshua is in
>>> general about 3x slower than Moses2
>>>
>>> - We don't have a Moses comparison, but extrapolating from Hieu's paper,
>>> it seems we might be as fast as or faster than Moses phrase-based
>>> decoding, and are a ton faster on Hiero. I'm going to send my models to
>>> Hieu so he can test on his machine, and then we'll have a better feel
>>> for this, including how it scales on a machine with many more
>>> processors.
>>>
>>> matt
>>>
>>
