Not stupid! You can use the shell script I bundled up. Here's how I ran the
timing tests.
for n in 64 48 32 16 8 4 2 1; do
    for name in moses2 joshua; do
        echo $name $n
        bash time-$name.sh > out.$name.$n 2> log.$name.$n
    done
done
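[Editor's note: a runnable sketch of the same sweep with per-run wall-clock time captured explicitly. `true` is a stand-in for `bash time-$name.sh` (those scripts ship with the models, not this message), and `timings.txt` plus the shortened thread-count list are illustrative choices, not part of Matt's setup.]

```shell
# Sketch of the sweep above, recording wall-clock seconds per run.
# `true` stands in for `bash time-$name.sh`; swap it back for a real run.
outdir=$(mktemp -d)
for n in 4 2 1; do
  for name in moses2 joshua; do
    start=$(date +%s)
    true > "$outdir/out.$name.$n" 2> "$outdir/log.$name.$n"  # real run: bash time-$name.sh
    echo "$name $n $(( $(date +%s) - start ))" >> "$outdir/timings.txt"
  done
done
wc -l < "$outdir/timings.txt"
```

One line per (decoder, thread-count) pair lands in timings.txt, alongside the out.* / log.* files the original loop produces.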
matt
> On Oct 10, 2016, at 6:42 AM, Tommaso Teofili <[email protected]> wrote:
>
> Sorry if this is again a stupid question, but I'm still getting my head
> around all the possible execution options. Now that I've downloaded the
> above models, which scripts should I use to run and evaluate them so the
> comparison is consistent with what others did?
>
> Regards,
> Tommaso
>
> On Thu, Oct 6, 2016 at 18:13 Mattmann, Chris A (3980) <[email protected]> wrote:
>
>> Hear, hear! Great job, and thanks for hosting.
>>
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Chris Mattmann, Ph.D.
>> Principal Data Scientist, Engineering Administrative Office (3010)
>> Manager, Open Source Projects Formulation and Development Office (8212)
>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> Office: 168-519, Mailstop: 168-527
>> Email: [email protected]
>> WWW: http://sunset.usc.edu/~mattmann/
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Director, Information Retrieval and Data Science Group (IRDS)
>> Adjunct Associate Professor, Computer Science Department
>> University of Southern California, Los Angeles, CA 90089 USA
>> WWW: http://irds.usc.edu/
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>
>>
>> On 10/6/16, 12:49 AM, "kellen sunderland" <[email protected]> wrote:
>>
>> Will do, but it might be a few days before I get the time to do a
>> proper test. Thanks for hosting, Matt.
>>
>> On Thu, Oct 6, 2016 at 2:19 AM, Matt Post <[email protected]> wrote:
>>
>>> Hi folks,
>>>
>>> Sorry this took so long, long story. But the four models that Hieu
>>> shared with me are ready. You can download them here; they're each
>>> about 15–20 GB.
>>>
>>> http://cs.jhu.edu/~post/files/joshua-hiero-ar-en.tbz
>>> http://cs.jhu.edu/~post/files/joshua-phrase-ar-en.tbz
>>> http://cs.jhu.edu/~post/files/joshua-hiero-ru-en.tbz
>>> http://cs.jhu.edu/~post/files/joshua-phrase-ru-en.tbz
>>>
>>> It'd be great if someone could test them on a machine with lots of
>>> cores, to see how things scale.
>>>
>>> matt
>>>
>>> On Sep 22, 2016, at 9:09 AM, Matt Post <[email protected]> wrote:
>>>
>>> Hi folks,
>>>
>>> I have finished the comparison. Here you can find graphs for ar-en
>>> and ru-en. The ground-up rewrite of Moses is about 2x–3x faster than
>>> Joshua.
>>>
>>> http://imgur.com/a/FcIbW
>>>
>>> One implication (untested) is that we are likely as fast as or
>>> faster than Moses.
>>>
>>> We could brainstorm things to do to close this gap. I'd be much
>>> happier with 2x or even 1.5x than with 3x, and I bet we could narrow
>>> this down. But I'd like to get the 6.1 release out of the way first,
>>> so I'm pushing this off to next month. Sound cool?
>>>
>>> matt
>>>
>>>
>>> On Sep 19, 2016, at 6:26 AM, Matt Post <[email protected]> wrote:
>>>
>>> I can't believe I did this, but I mis-colored one of the hiero lines,
>>> and the Numbers legend doesn't show the line type. If you reload the
>>> Dropbox file, it's fixed now. The difference is about 3x for both.
>>> Here's the table.
>>>
>>> Threads | Joshua (phrase) | Moses2 (phrase) | Joshua (hiero) | Moses2 (hiero) | Phrase rate | Hiero rate
>>>       1 |             178 |              65 |           2116 |           1137 |        2.74 |       1.86
>>>       2 |             109 |              42 |           1014 |            389 |        2.60 |       2.61
>>>       4 |              78 |              29 |            596 |            213 |        2.69 |       2.80
>>>       6 |              72 |              25 |            473 |            154 |        2.88 |       3.07
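[Editor's note: the two rate columns are the Joshua/Moses2 time ratios, which can be recomputed directly from the timings above. A small awk one-liner over the table's numbers:]

```shell
# Recompute the speed-up columns from Matt's table. Input columns:
# threads, Joshua (phrase), Moses2 (phrase), Joshua (hiero), Moses2 (hiero).
# Output: phrase rate (col2/col3) and hiero rate (col4/col5).
ratios=$(awk '{ printf "%.2f %.2f\n", $2 / $3, $4 / $5 }' <<'EOF'
1 178 65 2116 1137
2 109 42 1014 389
4 78 29 596 213
6 72 25 473 154
EOF
)
echo "$ratios"
```

This reproduces the table's rate columns exactly, from 2.74/1.86 at one thread up to 2.88/3.07 at six.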
>>>
>>> I'll put the models together and share them later today. This was on
>>> a 6-core machine, and I agree it'd be nice to test with something
>>> much higher.
>>>
>>> matt
>>>
>>>
>>> On Sep 19, 2016, at 5:33 AM, kellen sunderland <[email protected]> wrote:
>>>
>>> Do we just want to store these models somewhere temporarily? I've
>>> got a OneDrive account and could share the models from there (as
>>> long as they're below 500 GB or so).
>>>
>>> On Mon, Sep 19, 2016 at 11:32 AM, kellen sunderland <[email protected]> wrote:
>>> Very nice results. I think getting to within 25% of an optimized C++
>>> decoder from a Java decoder is impressive. It's great that Hieu has
>>> put in the work to make Moses2 so fast as well; that gives
>>> organizations two quite nice decoding engines to choose from, both
>>> with reasonable performance.
>>>
>>> Matt: I had a question about the x axis here. Is that the number of
>>> threads? We should be scaling more or less linearly with the number
>>> of threads; is that the case here? If you post the models somewhere
>>> I can also do a quick benchmark on a machine with a few more cores.
>>>
>>> -Kellen
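[Editor's note: Kellen's linear-scaling question can be checked against the ar-en table Matt posted above. A sketch computing parallel efficiency, (time_1 / time_n) / n, for the Joshua phrase-based column; 1.00 would be perfectly linear. The efficiency definition and the choice of column are the editor's, not from the thread.]

```shell
# Parallel efficiency from the ar-en table, Joshua phrase-based column:
# (time at 1 thread / time at n threads) / n. Baseline time is 178.
eff=$(awk '{ printf "%d %.2f\n", $1, (178 / $2) / $1 }' <<'EOF'
1 178
2 109
4 78
6 72
EOF
)
echo "$eff"
```

The efficiency falls well below 1.00 as threads increase, i.e. the scaling in this data is noticeably sublinear.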
>>>
>>>
>>> On Mon, Sep 19, 2016 at 10:53 AM, Tommaso Teofili <[email protected]> wrote:
>>> On Sat, Sep 17, 2016 at 15:23 Matt Post <[email protected]> wrote:
>>>
>>> I'll ask Hieu; I don't anticipate any problems. One potential problem
>>> is that the models occupy about 15–20 GB; do you think Jenkins would
>>> host this?
>>>
>>>
>>> I'm not sure; can such models be downloaded and pruned at runtime,
>>> or do they need to exist on the Jenkins machine?
>>>
>>>
>>>
>>> (ru-en grammars still packing, results will probably not be in until
>>> much later today)
>>>
>>> matt
>>>
>>>
>>> On Sep 17, 2016, at 3:19 PM, Tommaso Teofili <[email protected]> wrote:
>>>
>>>
>>> Hi Matt,
>>>
>>> I think it'd be really valuable if we could repeat the same tests
>>> (given that the parallel corpus is available) in the future; any
>>> chance you can share the script / code to do that? We may even
>>> consider adding a Jenkins job dedicated to continuously monitoring
>>> performance as we work on the Joshua master branch.
>>>
>>> WDYT?
>>>
>>> Anyway thanks for sharing the very interesting comparisons.
>>> Regards,
>>> Tommaso
>>>
>>> On Sat, Sep 17, 2016 at 12:29 Matt Post <[email protected]> wrote:
>>>
>>> Ugh, I think the mailing list deleted the attachment. Here is an
>>> attempt around our censors:
>>>
>>>
>>> https://www.dropbox.com/s/80up63reu4q809y/ar-en-joshua-moses2.png?dl=0
>>>
>>>
>>> On Sep 17, 2016, at 12:21 PM, Matt Post <[email protected]> wrote:
>>>
>>> Hi everyone,
>>>
>>> One thing we did this week at MT Marathon was a speed comparison of
>>> Joshua 6.1 (release candidate) with Moses2, which is a ground-up
>>> rewrite of Moses designed for speed (see the attached paper). Moses2
>>> is 4–6x faster than Moses phrase-based, and 100x (!) faster than
>>> Moses hiero.
>>>
>>>
>>> I tested using two moderate-to-large datasets that Hieu Hoang (CC'd)
>>> provided me with: ar-en and ru-en. Timing results are from 10,000
>>> sentences in each corpus. The average ar-en sentence length is 7.5,
>>> and for ru-en it is 28. I only ran one test for each language, so
>>> there could be some variance if I averaged, but I think the results
>>> look pretty consistent. The timing is end-to-end (including model
>>> load times, which Moses2 tends to be a bit faster at).
>>>
>>>
>>> Note also that Joshua does not have lexicalized distortion, while
>>> Moses2 does. This means the BLEU scores are a bit lower for Joshua:
>>> 62.85 versus 63.49. This shouldn't really affect runtime, however.
>>>
>>>
>>> I'm working on the ru-en, but here are the ar-en results:
>>>
>>> [chart attachment stripped by the mailing list; see the Dropbox link
>>> Matt posted in his follow-up]
>>>
>>> Some conclusions:
>>>
>>> - Hieu has done some bang-up work on the Moses2 rewrite! Joshua is
>>> in general about 3x slower than Moses2.
>>>
>>> - We don't have a Moses comparison, but extrapolating from Hieu's
>>> paper, it seems we might be as fast as or faster than Moses
>>> phrase-based decoding, and are a ton faster on Hiero. I'm going to
>>> send my models to Hieu so he can test on his machine, and then we'll
>>> have a better feel for this, including how it scales on a machine
>>> with many more processors.
>>>
>>>
>>> matt
>>>
>>>
>>>