Hi folks,

Sorry this took so long; long story. But the four models that Hieu shared with me are ready. You can download them here; they're each about 15–20 GB.
http://cs.jhu.edu/~post/files/joshua-hiero-ar-en.tbz
http://cs.jhu.edu/~post/files/joshua-phrase-ar-en.tbz
http://cs.jhu.edu/~post/files/joshua-hiero-ru-en.tbz

It'd be great if someone could test them on a machine with lots of cores, to see how things scale.

matt

> On Sep 22, 2016, at 9:09 AM, Matt Post <p...@cs.jhu.edu> wrote:
>
> Hi folks,
>
> I have finished the comparison. Here you can find graphs for ar-en and ru-en. The ground-up rewrite of Moses is about 2x–3x faster than Joshua.
>
> http://imgur.com/a/FcIbW
>
> One implication (untested) is that we are likely as fast as or faster than Moses.
>
> We could brainstorm things to do to close this gap. I'd be much happier with 2x or even 1.5x than with 3x, and I bet we could narrow this down. But I'd like to get the 6.1 release out of the way first, so I'm pushing this off to next month. Sound cool?
>
> matt
>
>
>> On Sep 19, 2016, at 6:26 AM, Matt Post <p...@cs.jhu.edu> wrote:
>>
>> I can't believe I did this, but I mis-colored one of the hiero lines, and the Numbers legend doesn't show the line type. If you reload the Dropbox file, it's fixed now. The difference is about 3x for both. Here's the table:
>>
>> Threads  Joshua  Moses2  Joshua (hiero)  Moses2 (hiero)  Phrase rate  Hiero rate
>>       1     178      65            2116            1137         2.74        1.86
>>       2     109      42            1014             389         2.60        2.61
>>       4      78      29             596             213         2.69        2.80
>>       6      72      25             473             154         2.88        3.07
>>
>> I'll put the models together and share them later today. This was on a 6-core machine, and I agree it'd be nice to test with something much higher.
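(As a sanity check, the two "rate" columns in the table are just the Joshua time divided by the Moses2 time at each thread count. A throwaway Python sketch to recompute them from the table's numbers:)

```python
# Times copied from the table: threads -> (Joshua, Moses2, Joshua hiero, Moses2 hiero).
TIMES = {
    1: (178, 65, 2116, 1137),
    2: (109, 42, 1014, 389),
    4: (78, 29, 596, 213),
    6: (72, 25, 473, 154),
}

def rates(row):
    """Return (phrase rate, hiero rate): how many times slower Joshua is than Moses2."""
    joshua, moses2, joshua_hiero, moses2_hiero = row
    return round(joshua / moses2, 2), round(joshua_hiero / moses2_hiero, 2)

for threads, row in sorted(TIMES.items()):
    phrase_rate, hiero_rate = rates(row)
    print(threads, phrase_rate, hiero_rate)
```

The recomputed ratios match the table, e.g. 178/65 ≈ 2.74 for single-threaded phrase-based.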
>>
>> matt
>>
>>
>>> On Sep 19, 2016, at 5:33 AM, kellen sunderland <kellen.sunderl...@gmail.com> wrote:
>>>
>>> Do we just want to store these models somewhere temporarily? I've got a OneDrive account and could share the models from there (as long as they're below 500 GB or so).
>>>
>>> On Mon, Sep 19, 2016 at 11:32 AM, kellen sunderland <kellen.sunderl...@gmail.com> wrote:
>>> Very nice results. I think getting to within 25% of an optimized C++ decoder from a Java decoder is impressive. It's great that Hieu has put in the work to make Moses2 so fast as well; that gives organizations two quite nice decoding engines to choose from, both with reasonable performance.
>>>
>>> Matt: I had a question about the x-axis here. Is that the number of threads? We should be scaling more or less linearly with the number of threads; is that the case here? If you post the models somewhere, I can also do a quick benchmark on a machine with a few more cores.
>>>
>>> -Kellen
>>>
>>>
>>> On Mon, Sep 19, 2016 at 10:53 AM, Tommaso Teofili <tommaso.teof...@gmail.com> wrote:
>>> On Sat, Sep 17, 2016 at 15:23, Matt Post <p...@cs.jhu.edu> wrote:
>>>
>>>> I'll ask Hieu; I don't anticipate any problems. One potential problem is that the models occupy about 15–20 GB; do you think Jenkins would host this?
>>>>
>>>
>>> I'm not sure. Can such models be downloaded and pruned at runtime, or do they need to exist on the Jenkins machine?
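(For the download-at-runtime question, a Jenkins step could fetch and unpack an archive with something like the sketch below. This is a hypothetical helper, not anything in the Joshua tree; the URL is one of the model links from this thread, and each archive is 15–20 GB, so the build machine needs the disk and bandwidth for it.)

```python
import os
import tarfile
import urllib.request

def fetch_model(url, dest_dir="."):
    """Download a .tbz model archive (unless already cached) and unpack it."""
    archive = os.path.join(dest_dir, os.path.basename(url))
    if not os.path.exists(archive):
        # Crude cache: skip the multi-GB download if the archive is already here.
        urllib.request.urlretrieve(url, archive)
    with tarfile.open(archive, "r:bz2") as tar:  # .tbz = bzip2-compressed tar
        tar.extractall(dest_dir)
    return archive

# e.g. fetch_model("http://cs.jhu.edu/~post/files/joshua-hiero-ar-en.tbz")
```

Pruning after unpacking (Tommaso's other question) would depend on the grammar format, so it's not sketched here.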
>>>
>>>
>>>>
>>>> (ru-en grammars are still packing; results will probably not be in until much later today)
>>>>
>>>> matt
>>>>
>>>>
>>>>> On Sep 17, 2016, at 3:19 PM, Tommaso Teofili <tommaso.teof...@gmail.com> wrote:
>>>>>
>>>>> Hi Matt,
>>>>>
>>>>> I think it'd be really valuable if we could repeat the same tests (given that a parallel corpus is available) in the future. Any chance you can share the script / code to do that? We may even consider adding a Jenkins job dedicated to continuously monitoring performance as we work on the Joshua master branch.
>>>>>
>>>>> WDYT?
>>>>>
>>>>> Anyway, thanks for sharing the very interesting comparisons.
>>>>> Regards,
>>>>> Tommaso
>>>>>
>>>>> On Sat, Sep 17, 2016 at 12:29, Matt Post <p...@cs.jhu.edu> wrote:
>>>>>
>>>>>> Ugh, I think the mailing list deleted the attachment. Here is an attempt to get around our censors:
>>>>>>
>>>>>> https://www.dropbox.com/s/80up63reu4q809y/ar-en-joshua-moses2.png?dl=0
>>>>>>
>>>>>>
>>>>>>> On Sep 17, 2016, at 12:21 PM, Matt Post <p...@cs.jhu.edu> wrote:
>>>>>>>
>>>>>>> Hi everyone,
>>>>>>>
>>>>>>> One thing we did this week at MT Marathon was a speed comparison of Joshua 6.1 (release candidate) with Moses2, which is a ground-up rewrite of Moses designed for speed (see the attached paper). Moses2 is 4–6x faster than Moses phrase-based, and 100x (!) faster than Moses hiero.
>>>>>>>
>>>>>>> I tested using two moderate-to-large datasets that Hieu Hoang (CC'd) provided me with: ar-en and ru-en. Timing results are from 10,000 sentences in each corpus. The average ar-en sentence length is 7.5, and for ru-en it is 28. I only ran one test for each language pair, so there could be some variance if I averaged, but I think the results look pretty consistent. The timing is end-to-end (including model load times, which Moses2 tends to be a bit faster at).
>>>>>>>
>>>>>>> Note also that Joshua does not have lexicalized distortion, while Moses2 does. This means the BLEU scores are a bit lower for Joshua: 62.85 versus 63.49. This shouldn't really affect runtime, however.
>>>>>>>
>>>>>>> I'm still working on ru-en, but here are the ar-en results:
>>>>>>>
>>>>>>> [chart attachment stripped by the mailing list]
>>>>>>>
>>>>>>> Some conclusions:
>>>>>>>
>>>>>>> - Hieu has done some bang-up work on the Moses2 rewrite! Joshua is in general about 3x slower than Moses2.
>>>>>>>
>>>>>>> - We don't have a Moses comparison, but extrapolating from Hieu's paper, it seems we might be as fast as or faster than Moses phrase-based decoding, and a ton faster on hiero. I'm going to send my models to Hieu so he can test on his machine, and then we'll have a better feel for this, including how it scales on a machine with many more processors.
>>>>>>>
>>>>>>> matt
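(For anyone picking up the scaling test on a bigger machine, a minimal timing harness could look like the sketch below. Assumptions: the decoder runs as an external command reading source sentences on stdin, and `-threads` is a stand-in for whatever flag the actual Joshua or Moses2 invocation uses; the reported time is end-to-end wall clock, model load included, matching how the numbers in this thread were measured.)

```python
import subprocess
import time

def bench(cmd, input_path, thread_counts=(1, 2, 4, 8, 16, 32)):
    """Run one end-to-end decode per thread count; return wall-clock seconds each."""
    results = {}
    for n in thread_counts:
        start = time.perf_counter()
        with open(input_path, "rb") as src:
            # Discard output; we only care about timing here.
            subprocess.run(cmd + ["-threads", str(n)], stdin=src,
                           stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        results[n] = time.perf_counter() - start
    return results

# e.g. bench(["joshua-decoder", "-c", "joshua.config"], "test.ar")
```

Averaging a few runs per thread count would reduce the single-run variance Matt mentions above.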