whoops, forgot link. see "class-based models" section in: http://kheafield.com/code/kenlm/estimation/
~amittai

On 1/23/16 13:08, amittai axelrod wrote:
> > The reason for using Witten-Bell was because
> > Kneser-Ney wasn’t able to cope with the counts being generated for
> > coarse language models.
>
> that is indeed an annoyance with kndiscount. however, you can now try
> using "--discount_fallback" with kenlm. it works for me, even with tens
> of classes.
>
> cheers,
> ~amittai
>
> On 1/23/16 07:11, Jasneet Sabharwal wrote:
>> Thanks Ken & Hieu,
>>
>> I’ll give KenLM a try. The reason for using Witten-Bell was because
>> Kneser-Ney wasn’t able to cope with the counts being generated for
>> coarse language models. So, I’ll train my LM using SRILM with ngram
>> order 8 and WB smoothing and use KenLM with order 8 in Moses.
>>
>> Best,
>> Jasneet
>>> On Jan 23, 2016, at 3:38 AM, Kenneth Heafield <mo...@kheafield.com> wrote:
>>>
>>> Hi,
>>>
>>> You can compile with --max-kenlm-order=8 or change the setting in the
>>> Eclipse files.
>>>
>>> The ARPA file format is interchangeable. You can build an ARPA using
>>> SRILM and Witten-Bell (though Bob Moore once called me out at a
>>> conference for suggesting that as an alternative to Kneser-Ney) then
>>> load with KenLM.
>>>
>>> Kenneth
>>>
>>> On 01/23/2016 05:39 AM, Jasneet Sabharwal wrote:
>>>> Thanks Hieu.
>>>>
>>>> I’m using the eclipse project for development. I followed your video to
>>>> set it up and I have linked the srilm and irstlm installations in the
>>>> root directory of mosesdecoder. I first tried to compile the project,
>>>> but neither the SRILM nor the IRSTLM LM cpp files get compiled. So, I
>>>> added LM_IRST and included the "${workspace_loc}/../../irstlm/include" path
>>>> in the C/C++ Build settings of the project. But I still cannot compile
>>>> IRST.cpp.
>>>>
>>>> The reason I’m not using the included KenLM is because my new feature
>>>> function requires an 8-gram language model with Witten-Bell smoothing,
>>>> which is provided by SRILM.
>>>> Since IRSTLM can use SRILM-generated language
>>>> models, I decided to call IRSTLM code inside my feature function to
>>>> get the score for a phrase.
>>>>
>>>> Any pointers on how I can debug the eclipse project with IRSTLM/SRILM?
>>>>
>>>> Best,
>>>> Jasneet
>>>>
>>>> PS: When I compile the whole project using "./bjam -j4
>>>> --with-boost=<absolute path to boost> --with-cmph=<absolute path to cmph>
>>>> --with-irstlm=<absolute path to irstlm>", it successfully compiles
>>>> without any errors.
>>>>
>>>>> On Jan 19, 2016, at 4:39 PM, Hieu Hoang <hieuho...@gmail.com> wrote:
>>>>>
>>>>> I believe Nadir Durrani's OSM uses KenLM inside it. You can look in
>>>>> moses/FF/OSM-Feature
>>>>> for tips
>>>>>
>>>>> On 20/01/16 00:31, Jasneet Sabharwal wrote:
>>>>>> Thanks Hieu.
>>>>>>
>>>>>> One last question. What do you think is the best way to load the
>>>>>> SRILM language model inside my custom feature function and to get a
>>>>>> score for a string that my feature function created?
>>>>>>
>>>>>> Best,
>>>>>> Jasneet
>>>>>>> On Jan 17, 2016, at 3:45 AM, Hieu Hoang <hieuho...@gmail.com> wrote:
>>>>>>>
>>>>>>> On 17/01/16 04:05, Jasneet Sabharwal wrote:
>>>>>>>> Thanks Hieu,
>>>>>>>>
>>>>>>>> I had subscribed to the mailing list and I’m getting the digest,
>>>>>>>> but not sure why my email went for your approval. When I get the
>>>>>>>> alignments from GetAlignTerm(), the index of the source word is
>>>>>>>> relative? To get the index in the source sentence, I’m assuming
>>>>>>>> that I would need to get the starting position of the source words
>>>>>>>> from CurrSourceWordsRange().GetStartPos() from current hypothesis
>>>>>>>> and offset the source alignment index with that value?
>>>>>>> yep.
>>>>>>> And to get the index in the target sentence, use
>>>>>>> GetCurrTargetWordsRange().GetStartPos()
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> Jasneet
>>>>>>>>> On Jan 15, 2016, at 3:43 AM, Hieu Hoang <hieuho...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> please subscribe to the Moses mailing list before posting to it.
>>>>>>>>> You can subscribe here:
>>>>>>>>> http://mailman.mit.edu/mailman/admin/moses-support
>>>>>>>>> To answer your question - the target phrase has a method called
>>>>>>>>> GetAlignTerm()
>>>>>>>>> that contains the alignment for terminals. This comes from the
>>>>>>>>> phrase-table, and ultimately from the word alignment.
>>>>>>>>>
>>>>>>>>> -------- Forwarded Message --------
>>>>>>>>> Subject: Moses-support post from jasneet.sabhar...@sfu.ca requires approval
>>>>>>>>> Date: Wed, 13 Jan 2016 23:36:50 -0500
>>>>>>>>> From: moses-support-ow...@mit.edu
>>>>>>>>> To: moses-support-ow...@mit.edu
>>>>>>>>>
>>>>>>>>> As list administrator, your authorization is requested for the
>>>>>>>>> following mailing list posting:
>>>>>>>>>
>>>>>>>>> List: Moses-support@mit.edu
>>>>>>>>> From: jasneet.sabhar...@sfu.ca
>>>>>>>>> Subject: Getting alignments for current hypothesis in phrase
>>>>>>>>> based model
>>>>>>>>> Reason: Post by non-member to a members-only list
>>>>>>>>>
>>>>>>>>> At your convenience, visit:
>>>>>>>>>
>>>>>>>>> http://mailman.mit.edu/mailman/admindb/moses-support
>>>>>>>>>
>>>>>>>>> to approve or deny the request.
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Moses-support mailing list
>>>>>>>> Moses-support@mit.edu
>>>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>>>>
>>>>>>> --
>>>>>>> Hieu Hoang
>>>>>>> http://www.hoang.co.uk/hieu
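
For readers following the GetAlignTerm() exchange above, here is a minimal sketch of the index arithmetic that was confirmed there: the alignment pairs are relative to the phrase pair, so each side is shifted by the start position of its word range. It is written in Python rather than Moses C++ and does not call Moses at all. The function name and the sample numbers are made up for illustration, and source_start / target_start simply stand in for whatever GetStartPos() returns on the source and target word ranges of the current hypothesis.

```python
from typing import Iterable, List, Tuple


def to_sentence_alignment(
    relative_pairs: Iterable[Tuple[int, int]],
    source_start: int,
    target_start: int,
) -> List[Tuple[int, int]]:
    """Shift phrase-relative (source, target) index pairs to sentence positions.

    Hypothetical helper, not part of Moses: it only adds the two offsets.
    """
    return [(s + source_start, t + target_start) for s, t in relative_pairs]


# Made-up example: a 3-word phrase pair whose source side starts at
# sentence position 4 and whose target side starts at position 6.
pairs = to_sentence_alignment([(0, 0), (1, 2), (2, 1)], source_start=4, target_start=6)
print(pairs)  # [(4, 6), (5, 8), (6, 7)]
```

In an actual feature function the same two additions would be done on the pairs stored in the target phrase's alignment, using the ranges of the hypothesis being scored.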