Sorry if this is again a stupid question, but I'm still getting my head around all the possible execution options. Now that I've downloaded the above models, which scripts should I use to run/evaluate them so that the comparison is consistent with what others did?
Regards,
Tommaso

On Thu, 6 Oct 2016 at 18:13, Mattmann, Chris A (3980) <[email protected]> wrote:

Hear, hear! Great job, and thanks for hosting.

Chris Mattmann, Ph.D.
Principal Data Scientist, Engineering Administrative Office (3010)
Manager, Open Source Projects Formulation and Development Office (8212)
NASA Jet Propulsion Laboratory, Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
Director, Information Retrieval and Data Science Group (IRDS)
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
WWW: http://irds.usc.edu/

On 10/6/16, 12:49 AM, "kellen sunderland" <[email protected]> wrote:

Will do, but it might be a few days before I get the time to do a proper test. Thanks for hosting, Matt.

On Thu, Oct 6, 2016 at 2:19 AM, Matt Post <[email protected]> wrote:

Hi folks,

Sorry this took so long; long story. But the four models that Hieu shared with me are ready. You can download them here; they're each about 15–20 GB.

http://cs.jhu.edu/~post/files/joshua-hiero-ar-en.tbz
http://cs.jhu.edu/~post/files/joshua-phrase-ar-en.tbz
http://cs.jhu.edu/~post/files/joshua-hiero-ru-en.tbz
http://cs.jhu.edu/~post/files/joshua-hiero-ru-en.tbz

It'd be great if someone could test them on a machine with lots of cores, to see how things scale.

matt

On Sep 22, 2016, at 9:09 AM, Matt Post <[email protected]> wrote:

Hi folks,

I have finished the comparison. Here you can find graphs for ar-en and ru-en. The ground-up rewrite of Moses is about 2x–3x faster than Joshua.
http://imgur.com/a/FcIbW

One implication (untested) is that we are likely as fast as or faster than Moses.

We could brainstorm things to do to close this gap. I'd be much happier with 2x or even 1.5x than with 3x, and I bet we could narrow this down. But I'd like to get the 6.1 release out of the way first, so I'm pushing this off to next month. Sound cool?

matt

On Sep 19, 2016, at 6:26 AM, Matt Post <[email protected]> wrote:

I can't believe I did this, but I mis-colored one of the hiero lines, and the Numbers legend doesn't show the line type. If you reload the Dropbox file, it's fixed now. The difference is about 3x for both. Here's the table:

    Threads   Joshua   Moses2   Joshua (hiero)   Moses2 (hiero)   Phrase rate   Hiero rate
          1      178       65             2116             1137          2.74         1.86
          2      109       42             1014              389          2.60         2.61
          4       78       29              596              213          2.69         2.80
          6       72       25              473              154          2.88         3.07

I'll put the models together and share them later today. This was on a 6-core machine, and I agree it'd be nice to test with something much higher.

matt

On Sep 19, 2016, at 5:33 AM, kellen sunderland <[email protected]> wrote:

Do we just want to store these models somewhere temporarily? I've got a OneDrive account and could share the models from there (as long as they're below 500 GB or so).

On Mon, Sep 19, 2016 at 11:32 AM, kellen sunderland <[email protected]> wrote:

Very nice results. I think getting to within 25% of an optimized C++ decoder from a Java decoder is impressive.
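As a sanity check on the table above, the two "rate" columns are just the ratio of the Joshua time to the Moses2 time at each thread count. A small script (values copied from the table; the variable names are mine) reproduces them:

```python
# Decoding times from the table, keyed by thread count:
# (Joshua phrase, Moses2 phrase, Joshua hiero, Moses2 hiero)
timings = {
    1: (178, 65, 2116, 1137),
    2: (109, 42, 1014, 389),
    4: (78, 29, 596, 213),
    6: (72, 25, 473, 154),
}

for threads, (j_p, m_p, j_h, m_h) in sorted(timings.items()):
    # "Phrase rate" / "Hiero rate": how many times slower Joshua is than Moses2
    print(f"{threads} threads: phrase {j_p / m_p:.2f}x, hiero {j_h / m_h:.2f}x")
```

Running it matches the table's last two columns, from 2.74x/1.86x at one thread up to 2.88x/3.07x at six.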
Great that Hieu has put in the work to make Moses2 so fast as well; that gives organizations two quite nice decoding engines to choose from, both with reasonable performance.

Matt: I had a question about the x axis here. Is that the number of threads? We should be scaling more or less linearly with the number of threads; is that the case here? If you post the models somewhere, I can also do a quick benchmark on a machine with a few more cores.

-Kellen

On Mon, Sep 19, 2016 at 10:53 AM, Tommaso Teofili <[email protected]> wrote:

On Sat, 17 Sep 2016 at 15:23, Matt Post <[email protected]> wrote:

> I'll ask Hieu; I don't anticipate any problems. One potential problem is
> that the models occupy about 15–20 GB; do you think Jenkins would host this?

I'm not sure. Can such models be downloaded and pruned at runtime, or do they need to exist on the Jenkins machine?

> (ru-en grammars still packing; results will probably not be in until much later today)
>
> matt

On Sep 17, 2016, at 3:19 PM, Tommaso Teofili <[email protected]> wrote:

Hi Matt,

I think it'd be really valuable if we could repeat the same tests (given that a parallel corpus is available) in the future. Any chance you can share the scripts / code to do that? We might even consider adding a Jenkins job dedicated to continuously monitoring performance as we work on the Joshua master branch.

WDYT?

Anyway, thanks for sharing the very interesting comparisons.
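Kellen's linear-scaling question can be checked directly from the timings already posted: divide the speedup over the single-threaded run by the thread count to get parallel efficiency, where 1.0 would be perfectly linear. A quick sketch using the Joshua phrase-based column (the data and labels are from the earlier table; the helper is mine):

```python
# Joshua phrase-based decoding times by thread count, from the earlier table.
joshua_phrase = {1: 178, 2: 109, 4: 78, 6: 72}

t1 = joshua_phrase[1]
for n, tn in sorted(joshua_phrase.items()):
    speedup = t1 / tn          # how much faster than the 1-thread run
    efficiency = speedup / n   # 1.0 == perfectly linear scaling
    print(f"{n} threads: speedup {speedup:.2f}x, efficiency {efficiency:.2f}")
```

Efficiency falls to roughly 0.41 at six threads, so these runs are well short of linear; one likely contributor is that the times are end-to-end and so include the fixed model-loading cost, which does not shrink with more threads.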
Regards,
Tommaso

On Sat, 17 Sep 2016 at 12:29, Matt Post <[email protected]> wrote:

Ugh, I think the mailing list deleted the attachment. Here is an attempt to get around our censors:

https://www.dropbox.com/s/80up63reu4q809y/ar-en-joshua-moses2.png?dl=0

On Sep 17, 2016, at 12:21 PM, Matt Post <[email protected]> wrote:

Hi everyone,

One thing we did this week at MT Marathon was a speed comparison of Joshua 6.1 (release candidate) with Moses2, which is a ground-up rewrite of Moses designed for speed (see the attached paper). Moses2 is 4–6x faster than Moses phrase-based, and 100x (!) faster than Moses hiero.

I tested using two moderate-to-large datasets that Hieu Hoang (CC'd) provided me with: ar-en and ru-en. Timing results are from 10,000 sentences in each corpus. The average ar-en sentence length is 7.5, and for ru-en it is 28. I only ran one test for each language pair, so there could be some variance if I averaged, but I think the results look pretty consistent. The timing is end-to-end (including model load times, at which Moses2 tends to be a bit faster).

Note also that Joshua does not have lexicalized distortion, while Moses2 does. This means the BLEU scores are a bit lower for Joshua: 62.85 versus 63.49. This shouldn't really affect runtime, however.

I'm working on the ru-en results, but here are the ar-en results:

Some conclusions:

- Hieu has done some bang-up work on the Moses2 rewrite!
  Joshua is in general about 3x slower than Moses2.

- We don't have a Moses comparison, but extrapolating from Hieu's paper, it seems we might be as fast as or faster than Moses phrase-based decoding, and are a ton faster on hiero. I'm going to send my models to Hieu so he can test on his machine, and then we'll have a better feel for this, including how it scales on a machine with many more processors.

matt
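The extrapolation in that last bullet can be made concrete with some rough arithmetic, chaining the 4–6x (phrase) and 100x (hiero) Moses2-over-Moses speedups quoted from Hieu's paper with the roughly 3x Joshua-vs-Moses2 gap measured in this thread. This is untested against real Moses runs, as noted above:

```python
# Rough, untested extrapolation: chain the reported ratios.
moses2_over_moses_phrase = (4, 6)   # Moses2 speedup vs. Moses phrase-based (Hieu's paper)
moses2_over_moses_hiero = 100       # Moses2 speedup vs. Moses hiero (Hieu's paper)
joshua_vs_moses2 = 3                # Joshua is ~3x slower than Moses2 (measured here)

lo = moses2_over_moses_phrase[0] / joshua_vs_moses2
hi = moses2_over_moses_phrase[1] / joshua_vs_moses2
print(f"Joshua vs. Moses phrase-based: {lo:.1f}x to {hi:.1f}x faster")
print(f"Joshua vs. Moses hiero: about {moses2_over_moses_hiero / joshua_vs_moses2:.0f}x faster")
```

So even at the low end of the range, Joshua would come out slightly faster than classic Moses phrase-based, and dramatically faster for hiero, which is what the bullet claims.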
