For SRILM, use the -unk flag; RandLM does this by default, if I
recall.
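
With SRILM that would look something like `ngram-count -order 3 -text corpus.txt -unk -lm model.arpa` (the file names and the order here are placeholders, not from this thread). To check whether an existing ARPA file already lists <unk> among its unigrams, a minimal sketch along these lines should do. It assumes the standard ARPA text layout with a `\1-grams:` section, and the helper name `arpa_has_unk` is made up for illustration:

```python
def arpa_has_unk(lines, unk="<unk>"):
    """Return True if the unigram section of an ARPA file lists `unk`.

    `lines` is any iterable of text lines, e.g. an open file object.
    """
    in_unigrams = False
    for line in lines:
        line = line.strip()
        if line == "\\1-grams:":
            in_unigrams = True
            continue
        if in_unigrams:
            # Any other backslash line (\2-grams:, \end\) ends the unigrams.
            if line.startswith("\\"):
                break
            fields = line.split()
            # Unigram entry: log10 probability, word, optional backoff weight.
            if len(fields) >= 2 and fields[1] == unk:
                return True
    return False


# Hypothetical usage; "model.arpa" is a placeholder path:
# with open("model.arpa", encoding="utf-8") as f:
#     print(arpa_has_unk(f))
```

This only reads the text ARPA file, so run it before converting the model with build_binary.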

Miles

On 16 August 2011 06:28, Tom Hoar <[email protected]> wrote:

> Ken,
>
> Does the online moses documentation refer to how to ensure the language
> model has <unk> in the vocabulary? I've never seen it.
>
> What's the best way to ensure a LM has the <unk> token in the vocabulary?
> Is it as simple as appending one line consisting of one <unk> token to the
> language model corpus? Or, is there a command-line switch for ngram-count,
> build-lm.sh, buildlm? Or, should we just edit the raw text language model
> and add it to the vocabulary manually?
>
> Thanks,
> Tom
>
>
>
> On Mon, 15 Aug 2011 22:12:36 +0100, Kenneth Heafield <[email protected]>
> wrote:
>
> Ok I have reproduced the problem.  It only happens when the ARPA file is
> missing <unk> and is probably an off-by-one on vocabulary size.  I'll have
> a fix soon.
>
> Kenneth
>
> On 08/15/11 19:20, Kenneth Heafield wrote:
>
> Hi,
>
>     Back from vacation and sorry but I'm having trouble reproducing this
> locally.
>
> - Latest Moses (revision 4143); I haven't made any changes that should
> impact language modeling since 4096.
> - svn status says the relevant source code is unmodified.
> - Tried an SRI model, including rebuilding with build_binary that ships
> with Moses.
> - Ran threaded and not threaded.
>
> Can you send me your very small SRILM model?  Does it have <unk>?
>
> Kenneth
>
> On 08/04/11 11:42, Kenneth Heafield wrote:
>
> Sorry I am slow to respond. This is my first thing to look at, but I am
> traveling a lot through the 14th.
>
> Alex Fraser <[email protected]> wrote:
>>
>> Hi Kenneth --
>>
>> Latest revision, 4096. Single threaded also crashes.
>>
>> Cheers, Alex
>>
>>
>> On Fri, Jul 29, 2011 at 6:00 PM, Kenneth Heafield  <[email protected]> 
>> wrote:
>>
>> > Hi,
>> >
>> >        There was a problem with this; thought it was fixed but maybe it
>> > came back.  Which revision are you running?  Does it still happen if you
>> > run single-threaded?
>> >
>> > Kenneth
>>
>> >
>> > On 07/29/11 09:39, Alex Fraser wrote:
>> >> Hi Folks,
>> >>
>> >> Tom Hoar previously mentioned that he had a problem with KenLMs built
>> >> from SRILM crashing Moses.
>> >>
>> >> Fabienne Cap and I also have had a problem with this. It seems to be
>> >> restricted to using the trie option with build-binary.
>> >>
>> >> Ken, if you have any problems reproducing this, please let me know. I
>> >> can send you a very small SRILM trained language model that crashes
>> >> moses when converted to binary with the trie option, but works fine as
>> >> a probing binary and using the original ARPA. (BTW, this is running
>> >> the decoder multi-threaded and the crash comes at some point during
>> >> decoding the first sentence, not during loading files)
>> >>
>> >> Cheers, Alex
>> >>
>> ------------------------------
>>
>> >> Moses-support mailing list
>>
>> >> [email protected]
>> >> http://mailman.mit.edu/mailman/listinfo/moses-support
>>


-- 
The University of Edinburgh is a charitable body, registered in Scotland,
with registration number SC005336.
