Works as intended.
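
For anyone reading this in the archive, here is a rough sketch of why a one-sentence corpus trips the check. It assumes the closed-form modified Kneser-Ney discounts from Chen and Goodman, and it approximates the adjusted count of a unigram as its number of distinct left contexts. The function names are made up for illustration; this is not the lmplz code.

```python
from collections import Counter


def unigram_adjusted_counts(tokens):
    """Approximate adjusted count of each word: number of distinct left contexts.

    Assumption for this sketch: the sentence is padded with <s> and </s>.
    """
    contexts = {}
    prev = "<s>"
    for tok in tokens + ["</s>"]:
        contexts.setdefault(tok, set()).add(prev)
        prev = tok
    return {word: len(lefts) for word, lefts in contexts.items()}


def kneser_ney_discounts(n):
    """Return [D1, D2, D3+] from counts of counts n[k].

    n[k] is the number of n-grams whose adjusted count is k.
    The closed form divides by n[1], n[2] and n[3], so all must be non-zero.
    """
    for k in (1, 2, 3):
        if n.get(k, 0) == 0:
            raise ValueError(f"no n-grams observed with adjusted count {k}")
    y = n[1] / (n[1] + 2 * n[2])
    return [k - (k + 1) * y * n.get(k + 1, 0) / n[k] for k in (1, 2, 3)]


sentence = ("that is what happens ? cssd has nothing more or voldemort "
            "or pastries in prague .").split()
counts_of_counts = Counter(unigram_adjusted_counts(sentence).values())

try:
    kneser_ney_discounts(counts_of_counts)
except ValueError as err:
    print("discounts undefined:", err)
```

In this sketch only "or" has two distinct left contexts and no word has three, so the count of counts for 3 is zero and the discount is undefined. That is the same condition the exception reports, and it is the case --discount_fallback exists to override.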

On 01/13/2016 02:59 PM, Lane Schwartz wrote:
> Thanks, Kenneth. Here's what I get now. 
> 
>     $ ~/mosesdecoder.multisource.git/bin/lmplz -o 2 <<< "that is what happens ? cssd has nothing more or voldemort or pastries in prague ."
>     === 1/5 Counting and sorting n-grams ===
>     Reading /tmp/sh-thd-1452698150 (deleted)
>     ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
>     tcmalloc: large alloc 29442056192 bytes == 0x1c74000 @ 
>     tcmalloc: large alloc 78512136192 bytes == 0x6de346000 @ 
>     ****************************************************************************************************
>     Unigram tokens 16 types 18
>     === 2/5 Calculating and sorting adjusted counts ===
>     Chain sizes: 1:216 2:107979354931
>     tcmalloc: large alloc 107979358208 bytes == 0x192a648000 @ 
>     terminate called after throwing an instance of 'lm::builder::BadDiscountException'
>       what(): /home/lanes/mosesdecoder.multisource.git/lm/builder/adjust_counts.cc:53 in void lm::builder::{anonymous}::StatCollector::CalculateDiscounts(const lm::builder::DiscountConfig&) threw BadDiscountException because `s.n[j] == 0'.
>     Could not calculate Kneser-Ney discounts for 1-grams with adjusted count 4 because we didn't observe any 1-grams with adjusted count 3; Is this small or artificial data?
>     Try deduplicating the input.  To override this error for e.g. a class-based model, rerun with --discount_fallback
>     Aborted (core dumped)
> 
> 
> 
> On Tue, Jan 12, 2016 at 5:40 PM, Kenneth Heafield <mo...@kheafield.com> wrote:
> 
>     Pushed the fix from kenlm master in October to Moses master.
> 
>     On 01/12/2016 10:34 PM, Lane Schwartz wrote:
>     > Steps to reproduce this error:
>     >
>     >     $ ~/mosesdecoder.git/bin/lmplz -o 2 <<< "that is what happens ? cssd has nothing more or voldemort or pastries in prague ."
>     >     === 1/5 Counting and sorting n-grams ===
>     >     Reading /tmp/sh-thd-107574999377 (deleted)
>     >     ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
>     >     tcmalloc: large alloc 29442056192 bytes == 0x2ae2000 @
>     >     tcmalloc: large alloc 78512136192 bytes == 0x6df1b4000 @
>     >     ****************************************************************************************************
>     >     Unigram tokens 16 types 18
>     >     === 2/5 Calculating and sorting adjusted counts ===
>     >     Chain sizes: 1:216 2:107979354931
>     >     tcmalloc: large alloc 107979358208 bytes == 0x192b4b6000 @
>     >     lmplz: ./util/fixed_array.hh:104: T&
>     >     util::FixedArray<T>::operator[](std::size_t) [with T =
>     >     lm::NGramStream<lm::builder::BuildingPayload>; std::size_t = long
>     >     unsigned int]: Assertion `i < size()' failed.
>     >
>     >
>     >
>     >
>     > On Wed, Sep 30, 2015 at 11:41 AM, Kenneth Heafield <mo...@kheafield.com> wrote:
>     >
>     >     That's bad.  Would you mind sending me privately a minimal example of
>     >     the data that reproduces the problem?
>     >
>     >     Kenneth
>     >
>     >     On 09/30/2015 04:29 PM, Alex Martinez wrote:
>     >     > Hello,
>     >     > Today I pulled the Moses code and recompiled, and some experiments
>     >     > (EMS) that were already working are now failing at the LM training
>     >     > step with the following error:
>     >     >
>     >     > Executing: /opt/moses/bin/lmplz --text
>     >     > /home/alexmc/devel/toydata/process/lm/nc=pos.factored.1 --order 5 --arpa
>     >     > /home/alexmc/devel/toydata/process/lm/nc=pos.lm.1 --discount_fallback
>     >     > === 1/5 Counting and sorting n-grams ===
>     >     > Reading /mnt/a62/devel/toydata/process/lm/nc=pos.factored.1
>     >     >
>     >     > ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
>     >     > tcmalloc: large alloc 4753956864 bytes == 0x1f7c000 @
>     >     > tcmalloc: large alloc 22185107456 bytes == 0x11d536000 @
>     >     >
>     >     > ****************************************************************************************************
>     >     > Unigram tokens 2433135 types 47
>     >     > === 2/5 Calculating and sorting adjusted counts ===
>     >     > Chain sizes: 1:564 2:2630656000 3:4932480000 4:7891967488 5:11509120000
>     >     > tcmalloc: large alloc 11509121024 bytes == 0x1f7c000 @
>     >     > tcmalloc: large alloc 2630656000 bytes == 0x2aff70000 @
>     >     > tcmalloc: large alloc 4932485120 bytes == 0x34cc3a000 @
>     >     > tcmalloc: large alloc 7891968000 bytes == 0x64933c000 @
>     >     > lmplz: ./util/fixed_array.hh:104: T&
>     >     > util::FixedArray<T>::operator[](std::size_t) [with T =
>     >     > lm::NGramStream<lm::builder::BuildingPayload>; std::size_t = long
>     >     > unsigned int]: Assertion `i < size()' failed.
>     >     >
>     >     > I'm running a Linux server with Ubuntu 15.04.
>     >     >
>     >     > Any help will be appreciated
>     >     >
>     >     > Alex Martínez
>     >     >
>     >     >
>     >     > _______________________________________________
>     >     > Moses-support mailing list
>     >     > Moses-support@mit.edu
>     >     > http://mailman.mit.edu/mailman/listinfo/moses-support
>     >     >
>     >
>     >
>     >
>     >
>     > --
>     > When a place gets crowded enough to require ID's, social collapse is not
>     > far away.  It is time to go elsewhere.  The best thing about space travel
>     > is that it made it possible to go elsewhere.
>     >                 -- R.A. Heinlein, "Time Enough For Love"
> 
> 
> 
> 
