Hear, hear, great job and thanks for hosting.

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Principal Data Scientist, Engineering Administrative Office (3010)
Manager, Open Source Projects Formulation and Development Office (8212)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Director, Information Retrieval and Data Science Group (IRDS)
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
WWW: http://irds.usc.edu/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

On 10/6/16, 12:49 AM, "kellen sunderland" <[email protected]> wrote:

Will do, but it might be a few days before I get the time to do a proper
test. Thanks for hosting, Matt.

On Thu, Oct 6, 2016 at 2:19 AM, Matt Post <[email protected]> wrote:

> Hi folks,
>
> Sorry this took so long, long story. But the four models that Hieu shared
> with me are ready. You can download them here; they're each about 15–20 GB.
>
> http://cs.jhu.edu/~post/files/joshua-hiero-ar-en.tbz
> http://cs.jhu.edu/~post/files/joshua-phrase-ar-en.tbz
> http://cs.jhu.edu/~post/files/joshua-hiero-ru-en.tbz
> http://cs.jhu.edu/~post/files/joshua-hiero-ru-en.tbz
>
> It'd be great if someone could test them on a machine with lots of cores,
> to see how things scale.
>
> matt
>
> On Sep 22, 2016, at 9:09 AM, Matt Post <[email protected]> wrote:
>
> Hi folks,
>
> I have finished the comparison. Here you can find graphs for ar-en and
> ru-en. The ground-up rewrite of Moses is about 2x–3x faster than Joshua.
>
> http://imgur.com/a/FcIbW
>
> One implication (untested) is that we are likely as fast as or faster than
> Moses.
>
> We could brainstorm things to do to close this gap. I'd be much happier
> with 2x or even 1.5x than with 3x, and I bet we could narrow this down. But
> I'd like to get the 6.1 release out of the way, first, so I'm pushing this
> off to next month. Sound cool?
>
> matt
>
>
> On Sep 19, 2016, at 6:26 AM, Matt Post <[email protected]> wrote:
>
> I can't believe I did this, but I mis-colored one of the hiero lines, and
> the Numbers legend doesn't show the line type. If you reload the Dropbox
> file, it's fixed now. The difference is about 3x for both. Here's the table.
>
> Threads  Joshua  Moses2  Joshua (hiero)  Moses2 (hiero)  Phrase rate  Hiero rate
> 1        178     65      2116            1137            2.74         1.86
> 2        109     42      1014            389             2.60         2.61
> 4        78      29      596             213             2.69         2.80
> 6        72      25      473             154             2.88         3.07
>
> I'll put the models together and share them later today.
> This was on a 6-core machine and I agree it'd be nice to test with
> something much higher.
>
> matt
>
>
> On Sep 19, 2016, at 5:33 AM, kellen sunderland <[email protected]> wrote:
>
> Do we just want to store these models somewhere temporarily? I've got a
> OneDrive account and could share the models from there (as long as they're
> below 500 GB or so).
>
> On Mon, Sep 19, 2016 at 11:32 AM, kellen sunderland <[email protected]> wrote:
>
> Very nice results. I think getting to within 25% of an optimized C++
> decoder from a Java decoder is impressive. Great that Hieu has put in the
> work to make Moses2 so fast as well; that gives organizations two quite
> nice decoding engines to choose from, both with reasonable performance.
>
> Matt: I had a question about the x axis here. Is that the number of
> threads? We should be scaling more or less linearly with the number of
> threads; is that the case here? If you post the models somewhere I can also
> do a quick benchmark on a machine with a few more cores.
>
> -Kellen
>
>
> On Mon, Sep 19, 2016 at 10:53 AM, Tommaso Teofili <[email protected]> wrote:
>
> On Sat, Sep 17, 2016 at 15:23, Matt Post <[email protected]> wrote:
>
> I'll ask Hieu; I don't anticipate any problems. One potential problem is
> that the models occupy about 15–20 GB; do you think Jenkins would host
> this?
>
>
> I'm not sure, can such models be downloaded and pruned at runtime, or do
> they need to exist on the Jenkins machine?
>
>
> (ru-en grammars still packing, results will probably not be in until much
> later today)
>
> matt
>
>
> On Sep 17, 2016, at 3:19 PM, Tommaso Teofili <[email protected]> wrote:
>
> Hi Matt,
>
> I think it'd be really valuable if we could repeat the same tests (given
> that the parallel corpus is available) in the future. Any chance you can
> share the script/code to do that? We may even consider adding a Jenkins
> job dedicated to continuously monitoring performance as we work on the
> Joshua master branch.
>
> WDYT?
>
> Anyway, thanks for sharing the very interesting comparisons.
> Regards,
> Tommaso
>
> On Sat, Sep 17, 2016 at 12:29, Matt Post <[email protected]> wrote:
>
> Ugh, I think the mailing list deleted the attachment. Here is an attempt
> around our censors:
>
> https://www.dropbox.com/s/80up63reu4q809y/ar-en-joshua-moses2.png?dl=0
>
>
> On Sep 17, 2016, at 12:21 PM, Matt Post <[email protected]> wrote:
>
> Hi everyone,
>
> One thing we did this week at MT Marathon was a speed comparison of
> Joshua 6.1 (release candidate) with Moses2, which is a ground-up rewrite
> of Moses designed for speed (see the attached paper). Moses2 is 4–6x
> faster than Moses phrase-based, and 100x (!) faster than Moses hiero.
>
> I tested using two moderate-to-large sized datasets that Hieu Hoang
> (CC'd) provided me with: ar-en and ru-en. Timing results are from 10,000
> sentences in each corpus. The average ar-en sentence length is 7.5, and
> for ru-en is 28. I only ran one test for each language, so there could be
> some variance if I averaged, but I think the results look pretty
> consistent. The timing is end-to-end (including model load times, which
> Moses2 tends to be a bit faster at).
>
> Note also that Joshua does not have lexicalized distortion, while Moses2
> does. This means the BLEU scores are a bit lower for Joshua: 62.85 versus
> 63.49. This shouldn't really affect runtime, however.
>
> I'm working on the ru-en, but here are the ar-en results:
>
>
> Some conclusions:
>
> - Hieu has done some bang-up work on the Moses2 rewrite! Joshua is in
>   general about 3x slower than Moses2
>
> - We don't have a Moses comparison, but extrapolating from Hieu's paper,
>   it seems we might be as fast as or faster than Moses phrase-based
>   decoding, and are a ton faster on Hiero. I'm going to send my models to
>   Hieu so he can test on his machine, and then we'll have a better feel
>   for this, including how it scales on a machine with many more
>   processors.
>
> matt
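
For anyone rechecking the numbers in this thread: the "Phrase rate" and "Hiero rate" columns in the Sep 19 table appear to be the Joshua timing divided by the Moses2 timing, and the question about linear scaling can be checked against the same rows. The sketch below is only an illustration under stated assumptions: it treats each table value as an end-to-end run time where lower is better (the thread does not state the unit), and the names `TIMINGS`, `rates`, and `speedup` are made up here, not part of Joshua or Moses2.

```python
# Timings copied from the table in the Sep 19, 2016 message.
# Assumption: each value is an end-to-end run time (lower is better);
# the thread does not state the unit.
TIMINGS = {
    # threads: (Joshua, Moses2, Joshua hiero, Moses2 hiero)
    1: (178, 65, 2116, 1137),
    2: (109, 42, 1014, 389),
    4: (78, 29, 596, 213),
    6: (72, 25, 473, 154),
}


def rates(timings):
    """Return {threads: (phrase_rate, hiero_rate)} as Joshua time / Moses2 time."""
    return {
        threads: (round(j / m, 2), round(jh / mh, 2))
        for threads, (j, m, jh, mh) in timings.items()
    }


def speedup(timings, column):
    """Return {threads: speedup over the 1-thread run} for one column index."""
    base = timings[1][column]
    return {threads: base / row[column] for threads, row in timings.items()}


if __name__ == "__main__":
    for threads, (phrase, hiero) in rates(TIMINGS).items():
        print(f"{threads} threads: phrase {phrase:.2f}x, hiero {hiero:.2f}x")
    # Column 0 is Joshua phrase-based; compare its speedup with linear scaling.
    for threads, s in speedup(TIMINGS, 0).items():
        print(f"Joshua phrase, {threads} threads: {s:.2f}x over 1 thread "
              f"({s / threads:.0%} of linear)")
```

The recomputed ratios match the two rate columns in the table. Passing column index 1, 2, or 3 to `speedup` gives the same scaling view for Moses2, Joshua hiero, and Moses2 hiero.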
