Hi,

On Thu, Sep 29, 2011 at 10:49 PM, Marcello Federico <[email protected]> wrote:
> Hi folks,
>
> although "just slower" you might indeed still want to compile IRSTLM and
> SRILM at least to estimate and build your LM files.
>
> We are currently working to get a new release of IRSTLM ready which should
> be both faster and thread safe. It will also provide new features that will
> allow to manage different sorts of LMs which are currently not supported by
> the other LM libraries.

What exactly are the differences between the various language models? I
see Moses currently supports 4 different ones (SRI, IRST, RandLM, KenLM).
Reading the language model page
(http://www.statmt.org/moses/?n=FactoredTraining.BuildingLanguageModel),
I understand that SRI is not that good for big data, IRST is better for
big data (space and memory), and RandLM is even better for space and
memory (so for very huge data) but much slower (4 times!) at execution.
And finally KenLM seems to beat them all, both for memory and speed!

Read this way, the description clearly favors KenLM (which is also the
default in Moses). Is that accurate? Are there other characteristics of
the various LMs I should know about (especially the Open Source ones; I
don't really care about SRI)? Should I expect some of them to be more
experimental, less reliable, or less stable?

And what about the "quality" of translation? Is any comparison possible
there? (I know this obviously depends a lot on the data, and that we are
talking about Machine Translation, so "quality" is not a word that
applies very well. But maybe some flagrant differences have been observed
between these LMs on the same input data?)

> Please notice that we are NOT inviting you to NOT use other's software.
> We are in fact thankful to the other open source developers for providing
> to the community good quality and useful software.
>
> We also like to see fair comparisons among different implementations and
> believe that these can stimulate further technology improvements to the
> benefit of all.
>
> Finally, we like thinking of our work like a friendly competition in which no
> one is trying to diminish the other's work.

Don't worry. I understand all this. :-)
That's what is good about Free Software after all! It's perfectly normal
to advertise your project (as long as, as you say, it is done in a
friendly way towards the competition).

Thanks for all the responses I got here. This list is quite friendly indeed!
;-)

Jehan

> Greetings,
>
> Marcello Federico
>
> ------------------------------------------------------------------------------------------------
>
> On Sep 26, 2011, at 11:44 AM, Kenneth Heafield wrote:
>
>> Hi,
>>
>> Since the sample language models are provided for you, it is no
>> longer necessary to compile SRILM or IRSTLM (though you can if you want
>> to use the specific features they provide; otherwise they're just
>> slower). I've updated the getting started documentation.
>>
>> Kenneth
>>
>> On 09/26/11 09:32, Jehan Pages wrote:
>>> Hi,
>>>
>>> On Mon, Sep 26, 2011 at 3:48 PM, Nicola Bertoldi <[email protected]> wrote:
>>>> I am going to release (very soon) a new version of Moses including new LM
>>>> types
>>>> Stay tuned on IRSTLM webpage
>>>>
>>>> If you need immediately, get the code from the IRSTLM SF repository
>>>>
>>>> you can download revision 452, which properly interfaces with the latest
>>>> revision of Moses
>>> Thanks for the answer. As right now, I am mainly testing this engine,
>>> the development version from the repo suits me ok. Anyway Moses
>>> compiled fine using revision 452 of IRSTLM. So that's great. Thanks
>>> again!
>>>
>>> Also just to be sure, in the "getting started" page, the sample models
>>> which are linked are only for SRILM, right? Because I wanted to test
>>> as explained in the page, and I get:
>>>
>>> [...]
>>> Start loading LanguageModel lm/europarl.srilm.gz : [0.000] seconds
>>> ERROR:Language model type unknown. Probably not compiled into library
>>> Segmentation fault
>>>
>>>
>>> Seeing the srilm.gz extension, I guess that won't work with only
>>> IRSTLM compiled in. That information may be worth being updated into
>>> the "Getting started" page. :-)
>>> I guess I'll have to test directly with more complete data.
>>>
>>> Jehan
>>>
>>> _______________________________________________
>>> Moses-support mailing list
>>> [email protected]
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
