For KenLM, the answer to both your questions is yes. Visual Studio project files should be in Moses. If not, you can get the standalone version from https://github.com/kpu/kenlm and use kenlm.vcxproj.
Kenneth

On 10/05/12 12:24, krishna nanda wrote:
> Hello Hieu,
>
> Thanks for your reply.
>
> Just to reconfirm your answers:
>
> >Phrase-table and lexical reordering- no problem. LM - only ken LM.
> There is no visual studio project for IRSTLM
>
> 1. I can build the binarized language model, binarized phrase table and
> binarized lexical reordering table, all in *linux* and directly use them
> with the moses decoder compiled in *native* *windows*, and it should
> work fine right?
>
> 2. If I want to build the binarized phrase table and binarized lexical
> reordering table in *native windows*, I can still use GIZA. And if I
> want to build the binarized language model in *native windows*, KenLM
> has a visual studio project for it?
>
> Thank you
> Krishna
>
> On Fri, Oct 5, 2012 at 4:31 PM, Hieu Hoang <[email protected]> wrote:
>
> hi krishna
>
> On 04/10/2012 18:16, krishna nanda wrote:
>> Hello Hieu,
>>
>> I have some questions on getting moses to work on native windows:
>>
>> 1. I see that under other builds, there is a moses visual studio
>> project. Is that project up to date?
> i'm not sure. I was told it was working about a month ago. It should
> be reasonably up-to-date
>
>> 2. From this thread:
>> http://comments.gmane.org/gmane.comp.nlp.moses.user/6991
>> I understand that moses scripts for training/tuning have not been
>> ported to native windows.
> correct. It requires cygwin
>
>> And from here: http://www.statmt.org/moses/?n=Moses.FAQ#ntoc9
>> I understand that moses decoder works fine in native windows.
> The visual studio project compiles the decoder natively. See Q1.
>
>> So, if I build a binarized language model, binarized phrase table
>> and binarized reordering table in linux, would it be ok to use
>> these binarized models with the decoder in native windows?
> Phrase-table and lexical reordering- no problem. LM - only ken LM.
> There is no visual studio project for IRSTLM
>
>> 3.
>> Lastly, I also saw in the above thread that you are porting
>> KenLM to windows. That would be very helpful, as I was looking to
>> build binarized language models in native windows.
> it's included.
> NB - Both KenLM and IRSTLM uses memory mapping. Therefore, to use
> binary LM bigger than 2GB, a 64-bit OS has to be used.
> NB2. Cygwin is 32-bit, even on 64-bit windows.
>
>> Thank you Hieu
>> Krishna
>>
>> On Mon, Sep 24, 2012 at 7:31 AM, krishna nanda
>> <[email protected]> wrote:
>>
>> Hello Hieu,
>>
>> It was more out of curiosity. Since when we build moses, we do
>> specify one of the four language models (Ken/IRST/SRI/Rand), I
>> was wondering how easy it is to modify moses to accept some
>> other language model.
>>
>> When you say: "The Moses framework is allow anyone to write
>> their own LM wrapper so that it can be used in the decoder"
>>
>> do you mean if I do have an ARPA file created from some
>> language model, I can use it with moses like how I use IRSTLM
>> with moses, just that the ARPA file is not created using IRSTLM.
>>
>> I am experimenting creating a small ARPA file from raw n gram
>> data and feeding it to moses. I have n grams (1,2,3) and their
>> probabilities and backoff weights. I formatted them as
>> required by ARPA
>>
>> (http://www.speech.sri.com/projects/srilm/manpages/ngram-format.5.html)
>>
>> But, when I try to binarize the ARPA file in moses, I get the
>> following error:
>>
>> *"The context of every 3-gram should appear as a 2-gram"*
>>
>> Since the 2 grams and 3 grams are extracted from the same
>> data, I was not sure why the above error message would not be
>> true. I traced the error to the file "lm/search_hashed.cc".
>> The ARPA formatting in itself seems ok to me. I am not sure
>> what I might be missing.
>>
>> Thanks for your time Hieu
>> Krishna
>>
>> On Tue, Sep 18, 2012 at 5:31 PM, Hieu Hoang
>> <[email protected]> wrote:
>>
>> glad to hear that it's working with cygwin. If anyone out
>> there who's willing to occasionally test Moses on cygwin
>> and report any problems, I will be very grateful.
>>
>> I'm curious, what is the benefit of the MS Web LM and
>> MITLM? The Moses framework is allow anyone to write their
>> own LM wrapper so that it can be used in the decoder. If
>> these other LM have advantages, it'll be good to
>> incorporate them.
>>
>> On 18/09/2012 04:36, krishna nanda wrote:
>>> Hello Hieu,
>>>
>>> Thanks for your reply. Yes, I managed to run moses in
>>> cygwin without any problems. No changes were required.
>>>
>>> I have another question:
>>> I saw that moses has support for Ken/IRST/SRI/Rand
>>> Language models.
>>> But is there an easy way to consume other language models
>>> like the Microsoft web language model or MITLM in moses?
>>>
>>> Thank you
>>> Krishna
>>>
>>> On Fri, Aug 24, 2012 at 10:01 PM, Hieu Hoang
>>> <[email protected]> wrote:
>>>
>>> the last 2 number (4 31) are not scores and are
>>> ignored. They are counts that was used to calculate
>>> the probabilities.
>>>
>>> there's no option to calculate joint probabilities. i
>>> suppose you need to calculate p(s) or p(t) which can
>>> be done, but may require a lot of memory. try it
>>> yourself and add it to moses if it works.
>>>
>>> so you managed to run moses in cygwin all the way to
>>> getting a bleu score? was there anything you need to
>>> change?
>>>
>>> On 23/08/2012 16:57, krishna nanda wrote:
>>>> Hello Hoang and Barry,
>>>>
>>>> Thanks a lot for your reply. I was able to install
>>>> moses and run it in cygwin.
>>>> I have some quick questions:
>>>>
>>>> I found the format (contents) of the phrase table here:
>>>>
>>>> http://www.statmt.org/moses/?n=FactoredTraining.ScorePhrases
>>>>
>>>> according to which, there are 5 scores for a phrase
>>>> pair.
>>>>
>>>> But, the phrase table I generated from the news
>>>> corpus has 7 scores for a phrase pair like this:
>>>> ,au cours de ||| ,as of ||| .25 2.36556e-06 0.0333581 0.00128335 2.718 ||| 4 31
>>>> I was not clear on the above format.
>>>>
>>>> Secondly, is there an option to also generate joint
>>>> probabilities of phrase pairs?
>>>>
>>>> Thank you
>>>> Krishna
>>>>
>>>> On Sun, Aug 19, 2012 at 11:12 PM, Hieu Hoang
>>>> <[email protected]> wrote:
>>>>
>>>> running Moses on cygwin should be the same as
>>>> running it on linux or mac. If you have any
>>>> problems, please get back to us.
>>>>
>>>> the document you pointed to is old now, i've
>>>> changed the website to reflect that.
>>>>
>>>> On 17/08/2012 08:45, krishna nanda wrote:
>>>>> Hello,
>>>>>
>>>>> I am looking to install Moses in Cygwin.
>>>>> However, I found the document on the website
>>>>> under "windows installation" to be not up to date:
>>>>> http://www.statmt.org/moses/?n=Development.GetStarted
>>>>>
>>>>> It still uses
>>>>> "regenerate-makefiles.sh", which is not there in
>>>>> the latest source from github. Is there an
>>>>> updated version?
>>>>>
>>>>> Thank you
>>>>> Krishna
>>>>>
>>>>> _______________________________________________
>>>>> Moses-support mailing list
>>>>> [email protected]
>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
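
On the binarization error quoted in this thread ("The context of every 3-gram should appear as a 2-gram"): one way to look for offending entries before binarizing is to scan the ARPA file and list every n-gram whose first n-1 words are missing from the (n-1)-gram section. The sketch below assumes that "context" means the first n-1 words of the n-gram and that the file follows the usual ARPA layout (a \data\ header, \N-grams: sections, a closing \end\). The function names and the sample data are made up for illustration; this is not KenLM code.

```python
from collections import defaultdict

def read_arpa_ngrams(lines):
    """Collect the n-grams of each order from the lines of an ARPA file."""
    ngrams = defaultdict(set)
    order = None
    for raw in lines:
        line = raw.strip()
        if not line or line == "\\data\\" or line.startswith("ngram "):
            continue  # blank lines and the count header carry no n-grams
        if line == "\\end\\":
            break
        if line.startswith("\\") and line.endswith("-grams:"):
            order = int(line[1:-len("-grams:")])  # e.g. "\3-grams:" -> 3
            continue
        if order is None:
            continue
        fields = line.split()
        # fields[0] is the log10 probability, the next `order` fields are
        # the words, and an optional backoff weight may follow.
        ngrams[order].add(tuple(fields[1:1 + order]))
    return ngrams

def missing_contexts(ngrams):
    """Return (ngram, context) pairs whose context is not a lower-order entry."""
    missing = []
    for order in sorted(ngrams):
        if order < 2:
            continue
        lower = ngrams.get(order - 1, set())
        for gram in sorted(ngrams[order]):
            context = gram[:-1]  # assumption: context = first n-1 words
            if context not in lower:
                missing.append((gram, context))
    return missing

# Hypothetical toy model: the 3-gram "b c a" has context "b c",
# which is absent from the 2-gram section.
SAMPLE = """\\data\\
ngram 1=3
ngram 2=1
ngram 3=1

\\1-grams:
-1.0 a -0.5
-1.0 b -0.5
-1.0 c

\\2-grams:
-0.5 a b -0.3

\\3-grams:
-0.2 b c a

\\end\\
"""

for gram, context in missing_contexts(read_arpa_ngrams(SAMPLE.splitlines())):
    print(" ".join(gram), "is missing context:", " ".join(context))
```

With a real file, pass an open file handle instead of SAMPLE.splitlines(). An n-gram table built from raw counts with a frequency cutoff can easily drop a 2-gram that a surviving 3-gram still depends on, which is one possible cause of the error.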

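
On the phrase-table line quoted in this thread: as Hieu explains, the five numbers in the third field are the scores, and the trailing "4 31" are counts that the decoder ignores. A minimal sketch of splitting such a line, assuming fields are separated by "|||" exactly as in the quoted example (other phrase-table variants may carry additional fields). The function name parse_phrase_table_line is hypothetical and not part of Moses.

```python
def parse_phrase_table_line(line):
    """Split one phrase-table line into source, target, scores and counts."""
    fields = [field.strip() for field in line.split("|||")]
    source, target, scores = fields[0], fields[1], fields[2]
    # Assumption: a fourth field, when present, holds the counts that
    # were used to compute the probabilities.
    counts = fields[3].split() if len(fields) > 3 else []
    return {
        "source": source,
        "target": target,
        "scores": [float(value) for value in scores.split()],
        "counts": [float(value) for value in counts],
    }

# The example line from the thread.
LINE = ",au cours de ||| ,as of ||| .25 2.36556e-06 0.0333581 0.00128335 2.718 ||| 4 31"

entry = parse_phrase_table_line(LINE)
print(len(entry["scores"]), "scores:", entry["scores"])
print(len(entry["counts"]), "counts:", entry["counts"])
```

Counting the fields this way shows five scores plus two counts, which matches the five scores described on the ScorePhrases page.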