Not stupid! You can use the shell script I bundled up. Here's how I ran the
timing tests.
for n in 64 48 32 16 8 4 2 1; do
    for name in moses2 joshua; do
        echo $name $n
        bash time-$name.sh > out.$name.$n 2> log.$name.$n
    done
done
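[Editor's note: a runnable sketch of the same sweep with per-run wall-clock time captured explicitly. `true` is a stand-in for `bash time-$name.sh` (those scripts ship with the models, not this message), and `timings.txt` plus the shortened thread-count list are illustrative choices, not part of Matt's setup.]

```shell
# Sketch of the sweep above, recording wall-clock seconds per run.
# `true` stands in for `bash time-$name.sh`; swap it back for a real run.
outdir=$(mktemp -d)
for n in 4 2 1; do
  for name in moses2 joshua; do
    start=$(date +%s)
    true > "$outdir/out.$name.$n" 2> "$outdir/log.$name.$n"  # real run: bash time-$name.sh
    echo "$name $n $(( $(date +%s) - start ))" >> "$outdir/timings.txt"
  done
done
wc -l < "$outdir/timings.txt"
```

One line per (decoder, thread-count) pair lands in timings.txt, alongside the out.* / log.* files the original loop produces.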
matt
> On Oct 10, 2016, at 6:42 AM, Tommaso Teofili <[email protected]> wrote:
>
> Sorry if this is again a stupid question, but I'm still getting my head
> around all the possible execution options. Now that I've downloaded the
> above models, which scripts should I use to run and evaluate them so the
> comparison is consistent with what others did?
>
> Regards,
> Tommaso
>
> On Thu, Oct 6, 2016 at 18:13 Mattmann, Chris A (3980) <[email protected]> wrote:
>
>> Hear, hear! Great job, and thanks for hosting.
>>
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Chris Mattmann, Ph.D.
>> Principal Data Scientist, Engineering Administrative Office (3010)
>> Manager, Open Source Projects Formulation and Development Office (8212)
>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> Office: 168-519, Mailstop: 168-527
>> Email: [email protected]
>> WWW: http://sunset.usc.edu/~mattmann/
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Director, Information Retrieval and Data Science Group (IRDS)
>> Adjunct Associate Professor, Computer Science Department
>> University of Southern California, Los Angeles, CA 90089 USA
>> WWW: http://irds.usc.edu/
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>
>>
>> On 10/6/16, 12:49 AM, "kellen sunderland" <[email protected]> wrote:
>>
>> Will do, but it might be a few days before I get the time to do a
>> proper test. Thanks for hosting, Matt.
>>
>> On Thu, Oct 6, 2016 at 2:19 AM, Matt Post <[email protected]> wrote:
>>
>>> Hi folks,
>>>
>>> Sorry this took so long, long story. But the four models that Hieu
>>> shared with me are ready. You can download them here; they're each
>>> about 15–20 GB.
>>>
>>> http://cs.jhu.edu/~post/files/joshua-hiero-ar-en.tbz
>>> http://cs.jhu.edu/~post/files/joshua-phrase-ar-en.tbz
>>> http://cs.jhu.edu/~post/files/joshua-hiero-ru-en.tbz
>>> http://cs.jhu.edu/~post/files/joshua-phrase-ru-en.tbz
>>>
>>> It'd be great if someone could test them on a machine with lots of
>>> cores, to see how things scale.
>>>
>>> matt
>>>
>>> On Sep 22, 2016, at 9:09 AM, Matt Post <[email protected]> wrote:
>>>
>>> Hi folks,
>>>
>>> I have finished the comparison. Here you can find graphs for ar-en
>>> and ru-en. The ground-up rewrite of Moses is about 2x–3x faster than
>>> Joshua.
>>>
>>> http://imgur.com/a/FcIbW
>>>
>>> One implication (untested) is that we are likely as fast as or
>>> faster than Moses.
>>>
>>> We could brainstorm things to do to close this gap. I'd be much
>>> happier with 2x or even 1.5x than with 3x, and I bet we could narrow
>>> this down. But I'd like to get the 6.1 release out of the way first,
>>> so I'm pushing this off to next month. Sound cool?
>>>
>>> matt
>>>
>>>
>>> On Sep 19, 2016, at 6:26 AM, Matt Post <[email protected]> wrote:
>>>
>>> I can't believe I did this, but I mis-colored one of the hiero lines,
>>> and the Numbers legend doesn't show the line type. If you reload the
>>> Dropbox file, it's fixed now. The difference is about 3x for both.
>>> Here's the table.
>>>
>>> Threads | Joshua (phrase) | Moses2 (phrase) | Joshua (hiero) | Moses2 (hiero) | Phrase rate | Hiero rate
>>>       1 |             178 |              65 |           2116 |           1137 |        2.74 |       1.86
>>>       2 |             109 |              42 |           1014 |            389 |        2.60 |       2.61
>>>       4 |              78 |              29 |            596 |            213 |        2.69 |       2.80
>>>       6 |              72 |              25 |            473 |            154 |        2.88 |       3.07
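[Editor's note: the two rate columns are the Joshua/Moses2 time ratios, which can be recomputed directly from the timings above. A small awk one-liner over the table's numbers:]

```shell
# Recompute the speed-up columns from Matt's table. Input columns:
# threads, Joshua (phrase), Moses2 (phrase), Joshua (hiero), Moses2 (hiero).
# Output: phrase rate (col2/col3) and hiero rate (col4/col5).
ratios=$(awk '{ printf "%.2f %.2f\n", $2 / $3, $4 / $5 }' <<'EOF'
1 178 65 2116 1137
2 109 42 1014 389
4 78 29 596 213
6 72 25 473 154
EOF
)
echo "$ratios"
```

This reproduces the table's rate columns exactly, from 2.74/1.86 at one thread up to 2.88/3.07 at six.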
>>>
>>> I'll put the models together and share them later today. This was on
>>> a 6-core machine, and I agree it'd be nice to test with something
>>> much higher.
>>>
>>> matt
>>>
>>>
>>> On Sep 19, 2016, at 5:33 AM, kellen sunderland <[email protected]> wrote:
>>>
>>> Do we just want to store these models somewhere temporarily? I've
>>> got a OneDrive account and could share the models from there (as
>>> long as they're below 500 GB or so).
>>>
>>> On Mon, Sep 19, 2016 at 11:32 AM, kellen sunderland <[email protected]> wrote:
>>> Very nice results. I think getting to within 25% of an optimized C++
>>> decoder from a Java decoder is impressive. It's great that Hieu has
>>> put in the work to make Moses2 so fast as well; that gives
>>> organizations two quite nice decoding engines to choose from, both
>>> with reasonable performance.
>>>
>>> Matt: I had a question about the x axis here. Is that the number of
>>> threads? We should be scaling more or less linearly with the number
>>> of threads; is that the case here? If you post the models somewhere
>>> I can also do a quick benchmark on a machine with a few more cores.
>>>
>>> -Kellen
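[Editor's note: Kellen's linear-scaling question can be checked against the ar-en table Matt posted above. A sketch computing parallel efficiency, (time_1 / time_n) / n, for the Joshua phrase-based column; 1.00 would be perfectly linear. The efficiency definition and the choice of column are the editor's, not from the thread.]

```shell
# Parallel efficiency from the ar-en table, Joshua phrase-based column:
# (time at 1 thread / time at n threads) / n. Baseline time is 178.
eff=$(awk '{ printf "%d %.2f\n", $1, (178 / $2) / $1 }' <<'EOF'
1 178
2 109
4 78
6 72
EOF
)
echo "$eff"
```

The efficiency falls well below 1.00 as threads increase, i.e. the scaling in this data is noticeably sublinear.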
>>>
>>>
>>> On Mon, Sep 19, 2016 at 10:53 AM, Tommaso Teofili <[email protected]> wrote:
>>> On Sat, Sep 17, 2016 at 15:23 Matt Post <[email protected]> wrote:
>>>
>>> I'll ask Hieu; I don't anticipate any problems. One potential problem
>>> is that the models occupy about 15–20 GB; do you think Jenkins would
>>> host this?
>>>
>>>
>>> I'm not sure; can such models be downloaded and pruned at runtime,
>>> or do they need to exist on the Jenkins machine?
>>>
>>>
>>>
>>> (ru-en grammars still packing, results will probably not be in until
>>> much later today)
>>>
>>> matt
>>>
>>>
>>> On Sep 17, 2016, at 3:19 PM, Tommaso Teofili <[email protected]> wrote:
>>>
>>>
>>> Hi Matt,
>>>
>>> I think it'd be really valuable if we could repeat the same tests
>>> (given that the parallel corpus is available) in the future; any
>>> chance you can share the script / code to do that? We may even
>>> consider adding a Jenkins job dedicated to continuously monitoring
>>> performance as we work on the Joshua master branch.
>>>
>>> WDYT?
>>>
>>> Anyway thanks for sharing the very interesting comparisons.
>>> Regards,
>>> Tommaso
>>>
>>> On Sat, Sep 17, 2016 at 12:29 Matt Post <[email protected]> wrote:
>>>
>>> Ugh, I think the mailing list deleted the attachment. Here is an
>>> attempt around our censors:
>>>
>>>
>>> https://www.dropbox.com/s/80up63reu4q809y/ar-en-joshua-moses2.png?dl=0
>>>
>>>
>>> On Sep 17, 2016, at 12:21 PM, Matt Post <[email protected]> wrote:
>>>
>>> Hi everyone,
>>>
>>> One thing we did this week at MT Marathon was a speed comparison of
>>> Joshua 6.1 (release candidate) with Moses2, which is a ground-up
>>> rewrite of Moses designed for speed (see the attached paper). Moses2
>>> is 4–6x faster than Moses phrase-based, and 100x (!) faster than
>>> Moses hiero.
>>>
>>>
>>> I tested using two moderate-to-large datasets that Hieu Hoang (CC'd)
>>> provided me with: ar-en and ru-en. Timing results are from 10,000
>>> sentences in each corpus. The average ar-en sentence length is 7.5,
>>> and for ru-en it is 28. I only ran one test for each language, so
>>> there could be some variance if I averaged, but I think the results
>>> look pretty consistent. The timing is end-to-end (including model
>>> load times, which Moses2 tends to be a bit faster at).
>>>
>>>
>>> Note also that Joshua does not have lexicalized distortion, while
>>> Moses2 does. This means the BLEU scores are a bit lower for Joshua:
>>> 62.85 versus 63.49. This shouldn't really affect runtime, however.
>>>
>>>
>>> I'm working on the ru-en, but here are the ar-en results:
>>>
>>> [chart attachment stripped by the mailing list; see the Dropbox link
>>> Matt posted in his follow-up]
>>>
>>> Some conclusions:
>>>
>>> - Hieu has done some bang-up work on the Moses2 rewrite! Joshua is
>>> in general about 3x slower than Moses2.
>>>
>>> - We don't have a Moses comparison, but extrapolating from Hieu's
>>> paper, it seems we might be as fast as or faster than Moses
>>> phrase-based decoding, and are a ton faster on Hiero. I'm going to
>>> send my models to Hieu so he can test on his machine, and then we'll
>>> have a better feel for this, including how it scales on a machine
>>> with many more processors.
>>>
>>>
>>> matt
>>>
>>>
>>>