Thanks Matt, I'll try it out.

Regards,
Tommaso
On Mon, Oct 10, 2016 at 16:49 Matt Post <[email protected]> wrote:

> Not stupid! You can use the shell script I bundled up. Here's how I ran the timing tests.
>
> for n in 64 48 32 16 8 4 2 1; do
>     for name in moses2 joshua; do
>         echo $name $n; bash time-$name.sh > out.$name.$n 2> log.$name.$n
>     done
> done
>
> matt
>
>
> > On Oct 10, 2016, at 6:42 AM, Tommaso Teofili <[email protected]> wrote:
> >
> > Sorry if this is again a stupid question, but I'm still getting my head around all the possible execution options. Now that I've downloaded the above models, which scripts should I use to run/evaluate them so that the comparison is consistent with what others did?
> >
> > Regards,
> > Tommaso
> >
> > On Thu, Oct 6, 2016 at 18:13 Mattmann, Chris A (3980) <[email protected]> wrote:
> >
> >> Hear, hear, great job and thanks for hosting.
> >>
> >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >> Chris Mattmann, Ph.D.
> >> Principal Data Scientist, Engineering Administrative Office (3010)
> >> Manager, Open Source Projects Formulation and Development Office (8212)
> >> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> >> Office: 168-519, Mailstop: 168-527
> >> Email: [email protected]
> >> WWW: http://sunset.usc.edu/~mattmann/
> >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >> Director, Information Retrieval and Data Science Group (IRDS)
> >> Adjunct Associate Professor, Computer Science Department
> >> University of Southern California, Los Angeles, CA 90089 USA
> >> WWW: http://irds.usc.edu/
> >> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>
> >> On 10/6/16, 12:49 AM, "kellen sunderland" <[email protected]> wrote:
> >>
> >> Will do, but it might be a few days before I get the time to do a proper test. Thanks for hosting, Matt.
> >>
> >> On Thu, Oct 6, 2016 at 2:19 AM, Matt Post <[email protected]> wrote:
> >>
> >>> Hi folks,
> >>>
> >>> Sorry this took so long, long story. But the four models that Hieu shared with me are ready. You can download them here; they're each about 15–20 GB.
> >>>
> >>> http://cs.jhu.edu/~post/files/joshua-hiero-ar-en.tbz
> >>> http://cs.jhu.edu/~post/files/joshua-phrase-ar-en.tbz
> >>> http://cs.jhu.edu/~post/files/joshua-hiero-ru-en.tbz
> >>> http://cs.jhu.edu/~post/files/joshua-hiero-ru-en.tbz
> >>>
> >>> It'd be great if someone could test them on a machine with lots of cores, to see how things scale.
> >>>
> >>> matt
> >>>
> >>> On Sep 22, 2016, at 9:09 AM, Matt Post <[email protected]> wrote:
> >>>
> >>> Hi folks,
> >>>
> >>> I have finished the comparison. Here you can find graphs for ar-en and ru-en. The ground-up rewrite of Moses is about 2x–3x faster than Joshua.
> >>>
> >>> http://imgur.com/a/FcIbW
> >>>
> >>> One implication (untested) is that we are likely as fast as or faster than Moses.
> >>>
> >>> We could brainstorm things to do to close this gap. I'd be much happier with 2x or even 1.5x than with 3x, and I bet we could narrow this down. But I'd like to get the 6.1 release out of the way first, so I'm pushing this off to next month. Sound cool?
> >>>
> >>> matt
> >>>
> >>> On Sep 19, 2016, at 6:26 AM, Matt Post <[email protected]> wrote:
> >>>
> >>> I can't believe I did this, but I mis-colored one of the hiero lines, and the Numbers legend doesn't show the line type.
> >>> If you reload the Dropbox file, it's fixed now. The difference is about 3x for both. Here's the table.
> >>>
> >>>   Threads   Joshua (phrase)   Moses2 (phrase)   Joshua (hiero)   Moses2 (hiero)   Phrase rate   Hiero rate
> >>>   1         178               65                2116             1137             2.74          1.86
> >>>   2         109               42                1014             389              2.60          2.61
> >>>   4         78                29                596              213              2.69          2.80
> >>>   6         72                25                473              154              2.88          3.07
> >>>
> >>> I'll put the models together and share them later today. This was on a 6-core machine, and I agree it'd be nice to test with something much higher.
> >>>
> >>> matt
> >>>
> >>> On Sep 19, 2016, at 5:33 AM, kellen sunderland <[email protected]> wrote:
> >>>
> >>> Do we just want to store these models somewhere temporarily? I've got a OneDrive account and could share the models from there (as long as they're below 500 GB or so).
> >>>
> >>> On Mon, Sep 19, 2016 at 11:32 AM, kellen sunderland <[email protected]> wrote:
> >>>
> >>> Very nice results. I think getting to within 25% of an optimized C++ decoder from a Java decoder is impressive. Great that Hieu has put in the work to make Moses2 so fast as well; that gives organizations two quite nice decoding engines to choose from, both with reasonable performance.
> >>>
> >>> Matt: I had a question about the x axis here. Is that the number of threads? We should be scaling more or less linearly with the number of threads; is that the case here? If you post the models somewhere, I can also do a quick benchmark on a machine with a few more cores.
> >>>
> >>> -Kellen
> >>>
> >>> On Mon, Sep 19, 2016 at 10:53 AM, Tommaso Teofili <[email protected]> wrote:
> >>>
> >>> On Sat, Sep 17, 2016 at 15:23 Matt Post <[email protected]> wrote:
> >>>
> >>> I'll ask Hieu; I don't anticipate any problems. One potential problem is that the models occupy about 15–20 GB; do you think Jenkins would host this?
> >>>
> >>> I'm not sure; can such models be downloaded and pruned at runtime, or do they need to exist on the Jenkins machine?
> >>>
> >>> (ru-en grammars still packing, results will probably not be in until much later today)
> >>>
> >>> matt
> >>>
> >>> On Sep 17, 2016, at 3:19 PM, Tommaso Teofili <[email protected]> wrote:
> >>>
> >>> Hi Matt,
> >>>
> >>> I think it'd be really valuable if we could repeat the same tests (given a parallel corpus is available) in the future. Any chance you can share the script / code to do that? We may even consider adding a Jenkins job dedicated to continuously monitoring performance as we work on the Joshua master branch.
> >>>
> >>> WDYT?
> >>>
> >>> Anyway, thanks for sharing the very interesting comparisons.
> >>> Regards,
> >>> Tommaso
> >>>
> >>> On Sat, Sep 17, 2016 at 12:29 Matt Post <[email protected]> wrote:
> >>>
> >>> Ugh, I think the mailing list deleted the attachment.
> >>> Here is an attempt around our censors:
> >>>
> >>> https://www.dropbox.com/s/80up63reu4q809y/ar-en-joshua-moses2.png?dl=0
> >>>
> >>> On Sep 17, 2016, at 12:21 PM, Matt Post <[email protected]> wrote:
> >>>
> >>> Hi everyone,
> >>>
> >>> One thing we did this week at MT Marathon was a speed comparison of Joshua 6.1 (release candidate) with Moses2, which is a ground-up rewrite of Moses designed for speed (see the attached paper). Moses2 is 4–6x faster than Moses phrase-based, and 100x (!) faster than Moses hiero.
> >>>
> >>> I tested using two moderate-to-large sized datasets that Hieu Hoang (CC'd) provided me with: ar-en and ru-en. Timing results are from 10,000 sentences in each corpus. The average ar-en sentence length is 7.5, and for ru-en it is 28. I only ran one test for each language, so there could be some variance if I averaged, but I think the results look pretty consistent. The timing is end-to-end (including model load times, which Moses2 tends to be a bit faster at).
> >>>
> >>> Note also that Joshua does not have lexicalized distortion, while Moses2 does. This means the BLEU scores are a bit lower for Joshua: 62.85 versus 63.49. This shouldn't really affect runtime, however.
> >>>
> >>> I'm working on the ru-en, but here are the ar-en results:
> >>>
> >>> [graph attachment; see the Dropbox link above]
> >>>
> >>> Some conclusions:
> >>>
> >>> - Hieu has done some bang-up work on the Moses2 rewrite! Joshua is in general about 3x slower than Moses2.
> >>>
> >>> - We don't have a Moses comparison, but extrapolating from Hieu's paper, it seems we might be as fast as or faster than Moses phrase-based decoding, and are a ton faster on Hiero. I'm going to send my models to Hieu so he can test on his machine, and then we'll have a better feel for this, including how it scales on a machine with many more processors.
> >>>
> >>> matt
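
For anyone wanting to reproduce the numbers above, here is a minimal sketch of the procedure Matt describes, assuming the bundled time-joshua.sh and time-moses2.sh scripts sit alongside the unpacked model directory; the tarball layout and how the scripts pick up the thread count are not spelled out in the thread, so treat those parts as assumptions.

    # Sketch only: fetch one of the shared model tarballs and run the bundled
    # timing scripts as described in the thread above.
    wget http://cs.jhu.edu/~post/files/joshua-phrase-ar-en.tbz
    tar xjf joshua-phrase-ar-en.tbz

    # Matt's timing loop: one run per decoder and thread count, with stdout
    # and stderr captured separately for later inspection.
    for n in 64 48 32 16 8 4 2 1; do
        for name in moses2 joshua; do
            echo $name $n
            bash time-$name.sh > out.$name.$n 2> log.$name.$n
        done
    done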
