I have changed MAX_NUM_FACTORS in moses/src/TypeDef.h to 8, thinking that it would be easier to keep all possible factors in one giant factored corpus, and then at training time point train-model.perl to the specific factors to use in different experiments.
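For anyone else who hits this: below is a minimal standalone sketch of the kind of "global check" Kenneth suggests further down in the thread. It is not part of Moses; the helper name is hypothetical, and it simply assumes each `[lmodel-file]` line has the type/factor/order/file format shown later in this thread.

```python
# Hypothetical sanity check (not part of Moses): verify that the factor
# index on each [lmodel-file] line is valid for the compiled factor limit.
# Assumed line format, as in moses.ini: type factor order file

MAX_NUM_FACTORS = 4  # Moses default in moses/src/TypeDef.h


def check_lm_factors(lmodel_lines, max_factors=MAX_NUM_FACTORS):
    """Return error messages for LM lines whose factor index is out of
    range for the compiled factor limit (0 .. max_factors - 1)."""
    errors = []
    for line in lmodel_lines:
        fields = line.split()
        if len(fields) < 4:
            errors.append("malformed LM line: %r" % line)
            continue
        factor = int(fields[1])
        if factor >= max_factors:
            errors.append(
                "LM %s uses factor %d, but only factors 0..%d exist"
                % (fields[3], factor, max_factors - 1))
    return errors


# The two LM lines from my original moses.ini:
lines = [
    "9 0 3 europarl.cleaned.en.kblm.mmap",
    "9 5 3 europarl.cleaned.en.wsd.kblm.mmap",
]
for err in check_lm_factors(lines):
    print(err)
```

Run on the two LM lines from my original moses.ini, this flags the second line: factor 5 is out of range when Moses is built with the default MAX_NUM_FACTORS of 4.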
(By the way, is there any reason not to raise MAX_NUM_FACTORS by default? It took me a while to track down this error when I started working with more than 4 factors.)

As for the segmentation fault, it was caused by me erroneously specifying a redundant language model (unrelated to any target-side translation factor) to train-model.perl.

Nonetheless, thanks a bunch for the help!

On Feb 14, 2011, at 4:09 PM, Kenneth Heafield wrote:

> There should probably be some global check for this. Not just number of
> possible factors, but that the LM factor number is less than the factors
> present in the phrase table.
>
> Pretty sure you want:
>
> 9 0 3 europarl.cleaned.en.kblm.mmap
> 9 1 3 europarl.cleaned.en.wsd.kblm.mmap
>
> with 1 in the second row.
>
> Also, for what it's worth, kenlm doesn't care what the file extension
> is; I just use .mmap or .binary in the documentation. This is different
> from IRST, which does care.
>
> On 02/14/11 08:46, Hieu Hoang wrote:
>> I think the maximum number of factors is 3. Since you're only using 2
>> factors, you should just use 0 & 1.
>>
>> Hieu
>> Sent from my flying horse
>>
>> On 14 Feb 2011, at 09:36 PM, "Christian Rishøj Jensen"
>> <[email protected]> wrote:
>>
>>> Yes, using factors (in this case word sense disambiguation).
>>>
>>> The LM lines are:
>>>
>>> # language models: type(srilm/irstlm), factors, order, file
>>> [lmodel-file]
>>> 9 0 3 europarl.cleaned.en.kblm.mmap
>>> 9 5 3 europarl.cleaned.en.wsd.kblm.mmap
>>>
>>> Language models are created using:
>>>
>>> % ngram-count -order 3 -interpolate -text IN -lm OUT
>>> % build_binary IN OUT
>>>
>>> As you suspected, an error also occurs when using SRI:
>>>
>>> moses: LanguageModelSRI.cpp:150: virtual float
>>> Moses::LanguageModelSRI::GetValue(const std::vector<const Moses::Word*,
>>> std::allocator<const Moses::Word*> >&, const void**, unsigned int*) const:
>>> Assertion `(*contextFactor[count-1])[factorType] != __null' failed.
>>>
>>> I am not quite sure what is causing this.
>>> Could it be related to the use of binarized phrase tables?
>>>
>>> On Feb 10, 2011, at 4:00 PM, Kenneth Heafield wrote:
>>>
>>>> Weird. It's already checked that contextFactor is non-empty. This
>>>> could be a bad or NULL Word * object or factor set incorrectly.
>>>>
>>>> Are you using factors? What are your LM lines from moses.ini?
>>>>
>>>> On 02/10/11 04:39, Christian Rishøj Jensen wrote:
>>>>>
>>>>> I am seeing a segmentation fault in KenLM this morning:
>>>>>
>>>>> reading bin ttable
>>>>> size of OFF_T 8
>>>>> binary phrasefile loaded, default OFF_T: -1
>>>>> reading bin ttable
>>>>> size of OFF_T 8
>>>>> binary phrasefile loaded, default OFF_T: -1
>>>>> Collecting options took 0.700 seconds
>>>>>
>>>>> Program received signal SIGSEGV, Segmentation fault.
>>>>> GetValueGivenState (this=0x14dd8c0, contextFactor=<value optimized out>,
>>>>> state=..., len=0x7fffffffd72c) at LanguageModelKen.cpp:179
>>>>> 179 std::size_t factor =
>>>>> contextFactor.back()->GetFactor(GetFactorType())->GetId();
>>>>> (gdb) where
>>>>> #0 GetValueGivenState (this=0x14dd8c0, contextFactor=<value optimized
>>>>> out>, state=..., len=0x7fffffffd72c) at LanguageModelKen.cpp:179
>>>>> #1 0x00000000004927c8 in Moses::LanguageModel::Evaluate (this=0x14e3ee0,
>>>>> hypo=..., ps=<value optimized out>, out=<value optimized out>)
>>>>> at LanguageModel.cpp:227
>>>>> #2 0x0000000000426d5b in Moses::Hypothesis::CalcScore (this=0x5c02010,
>>>>> futureScore=<value optimized out>) at Hypothesis.cpp:298
>>>>> #3 0x000000000044cc9a in Moses::SearchNormal::ExpandHypothesis
>>>>> (this=0x22bed80, hypothesis=..., transOpt=<value optimized out>,
>>>>> expectedScore=<value optimized out>) at SearchNormal.cpp:308
>>>>> #4 0x000000000044ceb9 in Moses::SearchNormal::ExpandAllHypotheses
>>>>> (this=0x22bed80, hypothesis=..., startPos=<value optimized out>,
>>>>> endPos=<value optimized out>) at SearchNormal.cpp:281
>>>>> #5 0x000000000044d23b in Moses::SearchNormal::ProcessOneHypothesis
>>>>> (this=0x22bed80, hypothesis=<value optimized out>) at SearchNormal.cpp:247
>>>>> #6 0x000000000044e5a0 in Moses::SearchNormal::ProcessSentence
>>>>> (this=0x22bed80) at SearchNormal.cpp:95
>>>>> #7 0x000000000043081c in Moses::Manager::ProcessSentence
>>>>> (this=0x7fffffffdfc0) at Manager.cpp:100
>>>>> #8 0x000000000040a518 in TranslationTask::Run (this=0x22bd830) at
>>>>> Main.cpp:87
>>>>> #9 0x00000000004086cf in main (argc=<value optimized out>, argv=<value
>>>>> optimized out>) at Main.cpp:392
>>>>>
>>>>> Is it obvious to anyone what might be the cause of this?
>>>>>
>>>>> I am using binarized, memory mapped language models.
>>>>>
>>>>> best
>>>>> Christian
>>>>> _______________________________________________
>>>>> Moses-support mailing list
>>>>> [email protected]
>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
