Hi folks,

Sorry this took so long; long story. But the four models that Hieu shared with me are ready. You can download them here; they're each about 15–20 GB.
http://cs.jhu.edu/~post/files/joshua-hiero-ar-en.tbz
http://cs.jhu.edu/~post/files/joshua-phrase-ar-en.tbz
http://cs.jhu.edu/~post/files/joshua-hiero-ru-en.tbz

It'd be great if someone could test them on a machine with lots of cores, to see how things scale.

matt

> On Sep 22, 2016, at 9:09 AM, Matt Post <p...@cs.jhu.edu> wrote:
>
> Hi folks,
>
> I have finished the comparison. Here you can find graphs for ar-en and ru-en. The ground-up rewrite of Moses is about 2x–3x faster than Joshua.
>
> http://imgur.com/a/FcIbW
>
> One implication (untested) is that we are likely as fast as or faster than Moses.
>
> We could brainstorm things to do to close this gap. I'd be much happier with 2x or even 1.5x than with 3x, and I bet we could narrow this down. But I'd like to get the 6.1 release out of the way first, so I'm pushing this off to next month. Sound cool?
>
> matt
>
>
>> On Sep 19, 2016, at 6:26 AM, Matt Post <p...@cs.jhu.edu> wrote:
>>
>> I can't believe I did this, but I mis-colored one of the hiero lines, and the Numbers legend doesn't show the line type. If you reload the Dropbox file, it's fixed now. The difference is about 3x for both. Here's the table:
>>
>> Threads  Joshua  Moses2  Joshua (hiero)  Moses2 (hiero)  Phrase rate  Hiero rate
>>       1     178      65            2116            1137         2.74        1.86
>>       2     109      42            1014             389         2.60        2.61
>>       4      78      29             596             213         2.69        2.80
>>       6      72      25             473             154         2.88        3.07
>>
>> I'll put the models together and share them later today. This was on a 6-core machine, and I agree it'd be nice to test with something much higher.
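(As a sanity check, the two "rate" columns in the table are just the Joshua time divided by the Moses2 time at each thread count. A throwaway Python sketch to recompute them from the table's numbers:)

```python
# Times copied from the table: threads -> (Joshua, Moses2, Joshua hiero, Moses2 hiero).
TIMES = {
    1: (178, 65, 2116, 1137),
    2: (109, 42, 1014, 389),
    4: (78, 29, 596, 213),
    6: (72, 25, 473, 154),
}

def rates(row):
    """Return (phrase rate, hiero rate): how many times slower Joshua is than Moses2."""
    joshua, moses2, joshua_hiero, moses2_hiero = row
    return round(joshua / moses2, 2), round(joshua_hiero / moses2_hiero, 2)

for threads, row in sorted(TIMES.items()):
    phrase_rate, hiero_rate = rates(row)
    print(threads, phrase_rate, hiero_rate)
```

The recomputed ratios match the table, e.g. 178/65 ≈ 2.74 for single-threaded phrase-based.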
>>
>> matt
>>
>>
>>> On Sep 19, 2016, at 5:33 AM, kellen sunderland <kellen.sunderl...@gmail.com> wrote:
>>>
>>> Do we just want to store these models somewhere temporarily? I've got a OneDrive account and could share the models from there (as long as they're below 500 GB or so).
>>>
>>> On Mon, Sep 19, 2016 at 11:32 AM, kellen sunderland <kellen.sunderl...@gmail.com> wrote:
>>> Very nice results. I think getting to within 25% of an optimized C++ decoder from a Java decoder is impressive. It's great that Hieu has put in the work to make Moses2 so fast as well; that gives organizations two quite nice decoding engines to choose from, both with reasonable performance.
>>>
>>> Matt: I had a question about the x-axis here. Is that the number of threads? We should be scaling more or less linearly with the number of threads; is that the case here? If you post the models somewhere, I can also do a quick benchmark on a machine with a few more cores.
>>>
>>> -Kellen
>>>
>>>
>>> On Mon, Sep 19, 2016 at 10:53 AM, Tommaso Teofili <tommaso.teof...@gmail.com> wrote:
>>> On Sat, Sep 17, 2016 at 15:23, Matt Post <p...@cs.jhu.edu> wrote:
>>>
>>>> I'll ask Hieu; I don't anticipate any problems. One potential problem is that the models occupy about 15–20 GB; do you think Jenkins would host this?
>>>>
>>>
>>> I'm not sure. Can such models be downloaded and pruned at runtime, or do they need to exist on the Jenkins machine?
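(For the download-at-runtime question, a Jenkins step could fetch and unpack an archive with something like the sketch below. This is a hypothetical helper, not anything in the Joshua tree; the URL is one of the model links from this thread, and each archive is 15–20 GB, so the build machine needs the disk and bandwidth for it.)

```python
import os
import tarfile
import urllib.request

def fetch_model(url, dest_dir="."):
    """Download a .tbz model archive (unless already cached) and unpack it."""
    archive = os.path.join(dest_dir, os.path.basename(url))
    if not os.path.exists(archive):
        # Crude cache: skip the multi-GB download if the archive is already here.
        urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive, "r:bz2") as tar:  # .tbz = bzip2-compressed tar
        tar.extractall(dest_dir)
    return archive

# e.g. fetch_model("http://cs.jhu.edu/~post/files/joshua-hiero-ar-en.tbz")
```

Pruning after unpacking (Tommaso's other question) would depend on the grammar format, so it's not sketched here.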
>>>
>>>
>>>>
>>>> (ru-en grammars are still packing; results will probably not be in until much later today)
>>>>
>>>> matt
>>>>
>>>>
>>>>> On Sep 17, 2016, at 3:19 PM, Tommaso Teofili <tommaso.teof...@gmail.com> wrote:
>>>>>
>>>>> Hi Matt,
>>>>>
>>>>> I think it'd be really valuable if we could repeat the same tests (given that a parallel corpus is available) in the future. Any chance you can share the script / code to do that? We may even consider adding a Jenkins job dedicated to continuously monitoring performance as we work on the Joshua master branch.
>>>>>
>>>>> WDYT?
>>>>>
>>>>> Anyway, thanks for sharing the very interesting comparisons.
>>>>> Regards,
>>>>> Tommaso
>>>>>
>>>>> On Sat, Sep 17, 2016 at 12:29, Matt Post <p...@cs.jhu.edu> wrote:
>>>>>
>>>>>> Ugh, I think the mailing list deleted the attachment. Here is an attempt to get around our censors:
>>>>>>
>>>>>> https://www.dropbox.com/s/80up63reu4q809y/ar-en-joshua-moses2.png?dl=0
>>>>>>
>>>>>>
>>>>>>> On Sep 17, 2016, at 12:21 PM, Matt Post <p...@cs.jhu.edu> wrote:
>>>>>>>
>>>>>>> Hi everyone,
>>>>>>>
>>>>>>> One thing we did this week at MT Marathon was a speed comparison of Joshua 6.1 (release candidate) with Moses2, which is a ground-up rewrite of Moses designed for speed (see the attached paper). Moses2 is 4–6x faster than Moses phrase-based, and 100x (!) faster than Moses hiero.
>>>>>>>
>>>>>>> I tested using two moderate-to-large datasets that Hieu Hoang (CC'd) provided me with: ar-en and ru-en. Timing results are from 10,000 sentences in each corpus. The average ar-en sentence length is 7.5, and for ru-en it is 28. I only ran one test for each language pair, so there could be some variance if I averaged, but I think the results look pretty consistent. The timing is end-to-end (including model load times, which Moses2 tends to be a bit faster at).
>>>>>>>
>>>>>>> Note also that Joshua does not have lexicalized distortion, while Moses2 does. This means the BLEU scores are a bit lower for Joshua: 62.85 versus 63.49. This shouldn't really affect runtime, however.
>>>>>>>
>>>>>>> I'm still working on ru-en, but here are the ar-en results:
>>>>>>>
>>>>>>> [chart attachment stripped by the mailing list]
>>>>>>>
>>>>>>> Some conclusions:
>>>>>>>
>>>>>>> - Hieu has done some bang-up work on the Moses2 rewrite! Joshua is in general about 3x slower than Moses2.
>>>>>>>
>>>>>>> - We don't have a Moses comparison, but extrapolating from Hieu's paper, it seems we might be as fast as or faster than Moses phrase-based decoding, and a ton faster on hiero. I'm going to send my models to Hieu so he can test on his machine, and then we'll have a better feel for this, including how it scales on a machine with many more processors.
>>>>>>>
>>>>>>> matt
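(For anyone picking up the scaling test on a bigger machine, a minimal timing harness could look like the sketch below. Assumptions: the decoder runs as an external command reading source sentences on stdin, and `-threads` is a stand-in for whatever flag the actual Joshua or Moses2 invocation uses; the reported time is end-to-end wall clock, model load included, matching how the numbers in this thread were measured.)

```python
import subprocess
import time

def bench(cmd, input_path, thread_counts=(1, 2, 4, 8, 16, 32)):
    """Run one end-to-end decode per thread count; return wall-clock seconds each."""
    results = {}
    for n in thread_counts:
        start = time.perf_counter()
        with open(input_path, "rb") as src:
            # Discard output; we only care about timing here.
            subprocess.run(cmd + ["-threads", str(n)], stdin=src,
                           stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        results[n] = time.perf_counter() - start
    return results

# e.g. bench(["joshua-decoder", "-c", "joshua.config"], "test.ar")
```

Averaging a few runs per thread count would reduce the single-run variance Matt mentions above.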