Sorry if this is again a stupid question, but I'm still getting my head around all the possible execution options. Now that I've downloaded the above models, which scripts should I use to run/evaluate them so that the comparison is consistent with what others did?
Regards,
Tommaso

On Thu, 6 Oct 2016 at 18:13, Mattmann, Chris A (3980) <[email protected]> wrote:

Hear, hear! Great job, and thanks for hosting.

Chris Mattmann, Ph.D.
Principal Data Scientist, Engineering Administrative Office (3010)
Manager, Open Source Projects Formulation and Development Office (8212)
NASA Jet Propulsion Laboratory, Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
Director, Information Retrieval and Data Science Group (IRDS)
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
WWW: http://irds.usc.edu/

On 10/6/16, 12:49 AM, "kellen sunderland" <[email protected]> wrote:

Will do, but it might be a few days before I get the time to do a proper test. Thanks for hosting, Matt.

On Thu, Oct 6, 2016 at 2:19 AM, Matt Post <[email protected]> wrote:

Hi folks,

Sorry this took so long; long story. But the four models that Hieu shared with me are ready. You can download them here; they're each about 15–20 GB.

http://cs.jhu.edu/~post/files/joshua-hiero-ar-en.tbz
http://cs.jhu.edu/~post/files/joshua-phrase-ar-en.tbz
http://cs.jhu.edu/~post/files/joshua-hiero-ru-en.tbz
http://cs.jhu.edu/~post/files/joshua-hiero-ru-en.tbz

It'd be great if someone could test them on a machine with lots of cores, to see how things scale.

matt

On Sep 22, 2016, at 9:09 AM, Matt Post <[email protected]> wrote:

Hi folks,

I have finished the comparison. Here you can find graphs for ar-en and ru-en. The ground-up rewrite of Moses is about 2x–3x faster than Joshua.
http://imgur.com/a/FcIbW

One implication (untested) is that we are likely as fast as or faster than Moses.

We could brainstorm things to do to close this gap. I'd be much happier with 2x or even 1.5x than with 3x, and I bet we could narrow this down. But I'd like to get the 6.1 release out of the way first, so I'm pushing this off to next month. Sound cool?

matt

On Sep 19, 2016, at 6:26 AM, Matt Post <[email protected]> wrote:

I can't believe I did this, but I mis-colored one of the hiero lines, and the Numbers legend doesn't show the line type. If you reload the Dropbox file, it's fixed now. The difference is about 3x for both. Here's the table:

    Threads   Joshua   Moses2   Joshua (hiero)   Moses2 (hiero)   Phrase rate   Hiero rate
          1      178       65             2116             1137          2.74         1.86
          2      109       42             1014              389          2.60         2.61
          4       78       29              596              213          2.69         2.80
          6       72       25              473              154          2.88         3.07

I'll put the models together and share them later today. This was on a 6-core machine, and I agree it'd be nice to test with something much higher.

matt

On Sep 19, 2016, at 5:33 AM, kellen sunderland <[email protected]> wrote:

Do we just want to store these models somewhere temporarily? I've got a OneDrive account and could share the models from there (as long as they're below 500 GB or so).

On Mon, Sep 19, 2016 at 11:32 AM, kellen sunderland <[email protected]> wrote:

Very nice results. I think getting to within 25% of an optimized C++ decoder from a Java decoder is impressive.
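As a sanity check on the table above, the two "rate" columns are just the ratio of the Joshua time to the Moses2 time at each thread count. A small script (values copied from the table; the variable names are mine) reproduces them:

```python
# Decoding times from the table, keyed by thread count:
# (Joshua phrase, Moses2 phrase, Joshua hiero, Moses2 hiero)
timings = {
    1: (178, 65, 2116, 1137),
    2: (109, 42, 1014, 389),
    4: (78, 29, 596, 213),
    6: (72, 25, 473, 154),
}

for threads, (j_p, m_p, j_h, m_h) in sorted(timings.items()):
    # "Phrase rate" / "Hiero rate": how many times slower Joshua is than Moses2
    print(f"{threads} threads: phrase {j_p / m_p:.2f}x, hiero {j_h / m_h:.2f}x")
```

Running it matches the table's last two columns, from 2.74x/1.86x at one thread up to 2.88x/3.07x at six.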
Great that Hieu has put in the work to make Moses2 so fast as well; that gives organizations two quite nice decoding engines to choose from, both with reasonable performance.

Matt: I had a question about the x axis here. Is that the number of threads? We should be scaling more or less linearly with the number of threads; is that the case here? If you post the models somewhere, I can also do a quick benchmark on a machine with a few more cores.

-Kellen

On Mon, Sep 19, 2016 at 10:53 AM, Tommaso Teofili <[email protected]> wrote:

On Sat, 17 Sep 2016 at 15:23, Matt Post <[email protected]> wrote:

> I'll ask Hieu; I don't anticipate any problems. One potential problem is
> that the models occupy about 15–20 GB; do you think Jenkins would host this?

I'm not sure. Can such models be downloaded and pruned at runtime, or do they need to exist on the Jenkins machine?

> (ru-en grammars still packing; results will probably not be in until much later today)
>
> matt

On Sep 17, 2016, at 3:19 PM, Tommaso Teofili <[email protected]> wrote:

Hi Matt,

I think it'd be really valuable if we could repeat the same tests (given that a parallel corpus is available) in the future. Any chance you can share the scripts / code to do that? We might even consider adding a Jenkins job dedicated to continuously monitoring performance as we work on the Joshua master branch.

WDYT?

Anyway, thanks for sharing the very interesting comparisons.
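Kellen's linear-scaling question can be checked directly from the timings already posted: divide the speedup over the single-threaded run by the thread count to get parallel efficiency, where 1.0 would be perfectly linear. A quick sketch using the Joshua phrase-based column (the data and labels are from the earlier table; the helper is mine):

```python
# Joshua phrase-based decoding times by thread count, from the earlier table.
joshua_phrase = {1: 178, 2: 109, 4: 78, 6: 72}

t1 = joshua_phrase[1]
for n, tn in sorted(joshua_phrase.items()):
    speedup = t1 / tn          # how much faster than the 1-thread run
    efficiency = speedup / n   # 1.0 == perfectly linear scaling
    print(f"{n} threads: speedup {speedup:.2f}x, efficiency {efficiency:.2f}")
```

Efficiency falls to roughly 0.41 at six threads, so these runs are well short of linear; one likely contributor is that the times are end-to-end and so include the fixed model-loading cost, which does not shrink with more threads.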
Regards,
Tommaso

On Sat, 17 Sep 2016 at 12:29, Matt Post <[email protected]> wrote:

Ugh, I think the mailing list deleted the attachment. Here is an attempt to get around our censors:

https://www.dropbox.com/s/80up63reu4q809y/ar-en-joshua-moses2.png?dl=0

On Sep 17, 2016, at 12:21 PM, Matt Post <[email protected]> wrote:

Hi everyone,

One thing we did this week at MT Marathon was a speed comparison of Joshua 6.1 (release candidate) with Moses2, which is a ground-up rewrite of Moses designed for speed (see the attached paper). Moses2 is 4–6x faster than Moses phrase-based, and 100x (!) faster than Moses hiero.

I tested using two moderate-to-large datasets that Hieu Hoang (CC'd) provided me with: ar-en and ru-en. Timing results are from 10,000 sentences in each corpus. The average ar-en sentence length is 7.5, and for ru-en it is 28. I only ran one test for each language pair, so there could be some variance if I averaged, but I think the results look pretty consistent. The timing is end-to-end (including model load times, at which Moses2 tends to be a bit faster).

Note also that Joshua does not have lexicalized distortion, while Moses2 does. This means the BLEU scores are a bit lower for Joshua: 62.85 versus 63.49. This shouldn't really affect runtime, however.

I'm working on the ru-en results, but here are the ar-en results:

Some conclusions:

- Hieu has done some bang-up work on the Moses2 rewrite!
  Joshua is in general about 3x slower than Moses2.

- We don't have a Moses comparison, but extrapolating from Hieu's paper, it seems we might be as fast as or faster than Moses phrase-based decoding, and are a ton faster on hiero. I'm going to send my models to Hieu so he can test on his machine, and then we'll have a better feel for this, including how it scales on a machine with many more processors.

matt
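The extrapolation in that last bullet can be made concrete with some rough arithmetic, chaining the 4–6x (phrase) and 100x (hiero) Moses2-over-Moses speedups quoted from Hieu's paper with the roughly 3x Joshua-vs-Moses2 gap measured in this thread. This is untested against real Moses runs, as noted above:

```python
# Rough, untested extrapolation: chain the reported ratios.
moses2_over_moses_phrase = (4, 6)   # Moses2 speedup vs. Moses phrase-based (Hieu's paper)
moses2_over_moses_hiero = 100       # Moses2 speedup vs. Moses hiero (Hieu's paper)
joshua_vs_moses2 = 3                # Joshua is ~3x slower than Moses2 (measured here)

lo = moses2_over_moses_phrase[0] / joshua_vs_moses2
hi = moses2_over_moses_phrase[1] / joshua_vs_moses2
print(f"Joshua vs. Moses phrase-based: {lo:.1f}x to {hi:.1f}x faster")
print(f"Joshua vs. Moses hiero: about {moses2_over_moses_hiero / joshua_vs_moses2:.0f}x faster")
```

So even at the low end of the range, Joshua would come out slightly faster than classic Moses phrase-based, and dramatically faster for hiero, which is what the bullet claims.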
