Revision 4145 fixes this. All probing binary files will continue to work. All trie files built from an ARPA containing <unk> will continue to work. Trie binary files built from an ARPA without <unk> must be rebuilt using build_binary; after this is done it will no longer segfault and will return correct values. There isn't a nice way to check for these broken binary files and I'm hesitant to increment the version number because that would require everybody to rebuild.
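As a rough aid for deciding which trie binaries need rebuilding, here is a minimal Python sketch that checks whether an ARPA file lists <unk> among its unigrams. It is not part of KenLM or Moses: the function name `arpa_has_unk` is made up for illustration, and the code assumes the usual ARPA layout with a `\1-grams:` section and whitespace-separated fields. It inspects the ARPA source, not the binary, so it only helps if you still have the ARPA file that a trie binary was built from.

```python
def arpa_has_unk(lines):
    """Return True if the \\1-grams: section of an ARPA file lists <unk>.

    `lines` is any iterable of text lines, e.g. an open file object.
    Hypothetical helper for illustration; not a KenLM or Moses API.
    """
    in_unigrams = False
    for raw in lines:
        line = raw.strip()
        if line == "\\1-grams:":
            in_unigrams = True
            continue
        if in_unigrams:
            # Any later section marker (\2-grams:, \end\) ends the unigrams.
            if line.startswith("\\"):
                break
            fields = line.split()
            # Unigram entries look like: log10_prob  word  [backoff]
            if len(fields) >= 2 and fields[1] == "<unk>":
                return True
    return False
```

For a file on disk, something like `with open(path, encoding="utf-8") as f: needs_rebuild = not arpa_has_unk(f)` would flag the ARPA files whose trie binaries should be regenerated with build_binary.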
The issue was that trie writes corrected (including <unk>) counts to the header but the vocabulary lookup table was sized based on the counts given in the ARPA file. When <unk> is missing from the ARPA file, I now pad the vocabulary to the size it expects for the corrected count.

Sorry it took so long!

Kenneth

On 08/15/11 22:12, Kenneth Heafield wrote:
> Ok I have reproduced the problem. It only happens when the ARPA file
> is missing <unk> and is probably an off-by-one on vocabulary size.
> I'll have a fix soon.
>
> Kenneth
>
> On 08/15/11 19:20, Kenneth Heafield wrote:
>> Hi,
>>
>> Back from vacation and sorry but I'm having trouble reproducing
>> this locally.
>>
>> - Latest Moses (revision 4143); I haven't made any changes that
>> should impact language modeling since 4096.
>> - svn status says the relevant source code is unmodified.
>> - Tried an SRI model, including rebuilding with build_binary that
>> ships with Moses.
>> - Ran threaded and not threaded.
>>
>> Can you send me your very small SRILM model? Does it have <unk>?
>>
>> Kenneth
>>
>> On 08/04/11 11:42, Kenneth Heafield wrote:
>>> Sorry I am slow to respond. This is my first thing to look at, but I
>>> am traveling a lot through the 14th.
>>>
>>> Alex Fraser <[email protected]> wrote:
>>>
>>> Hi Kenneth --
>>>
>>> Latest revision, 4096. Single threaded also crashes.
>>>
>>> Cheers, Alex
>>>
>>>
>>> On Fri, Jul 29, 2011 at 6:00 PM, Kenneth Heafield <[email protected]>
>>> wrote:
>>> > Hi,
>>> >
>>> > There was a problem with this; thought it was fixed but maybe it came
>>> > back. Which revision are you running? Does it still happen if you run
>>> > single-threaded?
>>> >
>>> > Kenneth
>>> >
>>> > On 07/29/11 09:39, Alex Fraser wrote:
>>> >> Hi Folks,
>>> >>
>>> >> Tom Hoar previously mentioned that he had a problem with KenLMs built
>>> >> from SRILM crashing Moses.
>>> >>
>>> >> Fabienne Cap and I also have had a problem with this. It seems to be
>>> >> restricted to using the trie option with build-binary.
>>> >>
>>> >> Ken, if you have any problems reproducing this, please let me know. I
>>> >> can send you a very small SRILM trained language model that crashes
>>> >> moses when converted to binary with the trie option, but works fine as
>>> >> a probing binary and using the original ARPA. (BTW, this is running
>>> >> the decoder multi-threaded and the crash comes at some point during
>>> >> decoding the first sentence, not during loading files)
>>> >>
>>> >> Cheers, Alex
_______________________________________________ Moses-support mailing list [email protected] http://mailman.mit.edu/mailman/listinfo/moses-support
