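Works as intended. The `i < size()' assertion is gone, and the
BadDiscountException quoted below is the expected error for input this
small; as the message says, --discount_fallback overrides it.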
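
To illustrate why the quoted run stops in CalculateDiscounts, here is a
minimal Python sketch of the closed-form modified Kneser-Ney discount
estimate from Chen and Goodman. It is not lmplz's code, and it assumes
lmplz follows that estimate; the function name kn_discounts and the
example counts are made up for illustration. Each discount D_j divides by
n_j, the number of n-grams with adjusted count j, so an empty bucket for
j = 1, 2 or 3 leaves the estimate undefined.

```python
def kn_discounts(n):
    """Return [D1, D2, D3+] from counts of counts.

    n maps an adjusted count j (1..4) to the number of n-grams that have
    exactly that adjusted count. Hypothetical helper for illustration only.
    """
    # Every D_j below divides by n[j], so an empty bucket is fatal.
    for j in (1, 2, 3):
        if n.get(j, 0) == 0:
            raise ValueError(
                "cannot estimate discount for adjusted count %d: "
                "no n-grams with adjusted count %d" % (j + 1, j)
            )
    # Y = n1 / (n1 + 2 * n2)
    y = n[1] / (n[1] + 2 * n[2])
    # D_j = j - (j + 1) * Y * n_{j+1} / n_j  for j = 1, 2, 3
    return [j - (j + 1) * y * n.get(j + 1, 0) / n[j] for j in (1, 2, 3)]
```

With hypothetical counts such as {1: 14, 2: 1, 3: 0, 4: 0}, the kind a
single short sentence might produce, the function raises because no
n-gram has adjusted count 3. That matches the wording of the quoted
message.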
On 01/13/2016 02:59 PM, Lane Schwartz wrote:
> Thanks, Kenneth. Here's what I get now.
>
> $ ~/mosesdecoder.multisource.git/bin/lmplz -o 2 <<< "that is what
> happens ? cssd has nothing more or voldemort or pastries in prague ."
> === 1/5 Counting and sorting n-grams ===
> Reading /tmp/sh-thd-1452698150 (deleted)
>
> ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> tcmalloc: large alloc 29442056192 bytes == 0x1c74000 @
> tcmalloc: large alloc 78512136192 bytes == 0x6de346000 @
>
> ****************************************************************************************************
> Unigram tokens 16 types 18
> === 2/5 Calculating and sorting adjusted counts ===
> Chain sizes: 1:216 2:107979354931
> tcmalloc: large alloc 107979358208 bytes == 0x192a648000 @
> terminate called after throwing an instance of
> 'lm::builder::BadDiscountException'
> what():
> /home/lanes/mosesdecoder.multisource.git/lm/builder/adjust_counts.cc:53
> in void
> lm::builder::{anonymous}::StatCollector::CalculateDiscounts(const
> lm::builder::DiscountConfig&) threw BadDiscountException because
> `s.n[j] == 0'.
> Could not calculate Kneser-Ney discounts for 1-grams with adjusted
> count 4 because we didn't observe any 1-grams with adjusted count 3;
> Is this small or artificial data?
> Try deduplicating the input. To override this error for e.g. a
> class-based model, rerun with --discount_fallback
> Aborted (core dumped)
>
>
>
> On Tue, Jan 12, 2016 at 5:40 PM, Kenneth Heafield <[email protected]
> <mailto:[email protected]>> wrote:
>
> Pushed the fix from kenlm master in October to Moses master.
>
> On 01/12/2016 10:34 PM, Lane Schwartz wrote:
> > Steps to reproduce this error:
> >
> > $ ~/mosesdecoder.git/bin/lmplz -o 2 <<< "that is what happens ? cssd
> > has nothing more or voldemort or pastries in prague ."
> > === 1/5 Counting and sorting n-grams ===
> > Reading /tmp/sh-thd-107574999377 (deleted)
> >
> > ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> > tcmalloc: large alloc 29442056192 bytes == 0x2ae2000 @
> > tcmalloc: large alloc 78512136192 bytes == 0x6df1b4000 @
> >
> > ****************************************************************************************************
> > Unigram tokens 16 types 18
> > === 2/5 Calculating and sorting adjusted counts ===
> > Chain sizes: 1:216 2:107979354931
> > tcmalloc: large alloc 107979358208 bytes == 0x192b4b6000 @
> > lmplz: ./util/fixed_array.hh:104: T&
> > util::FixedArray<T>::operator[](std::size_t) [with T =
> > lm::NGramStream<lm::builder::BuildingPayload>; std::size_t = long
> > unsigned int]: Assertion `i < size()' failed.
> >
> >
> >
> >
> > On Wed, Sep 30, 2015 at 11:41 AM, Kenneth Heafield <[email protected]
> <mailto:[email protected]>
> > <mailto:[email protected] <mailto:[email protected]>>> wrote:
> >
> > That's bad. Would you mind sending me privately a minimal example of
> > the data that reproduces the problem?
> >
> > Kenneth
> >
> > On 09/30/2015 04:29 PM, Alex Martinez wrote:
> > > Hello,
> > > Today I pulled the Moses code and recompiled, and some experiments (EMS)
> > > that were already working are now failing at the LM training step with
> > > the following error:
> > >
> > > Executing: /opt/moses/bin/lmplz --text
> > > /home/alexmc/devel/toydata/process/lm/nc=pos.factored.1 --order 5 --arpa
> > > /home/alexmc/devel/toydata/process/lm/nc=pos.lm.1 --discount_fallback
> > > === 1/5 Counting and sorting n-grams ===
> > > Reading /mnt/a62/devel/toydata/process/lm/nc=pos.factored.1
> > >
> > > ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> > > tcmalloc: large alloc 4753956864 bytes == 0x1f7c000 @
> > > tcmalloc: large alloc 22185107456 bytes == 0x11d536000 @
> > >
> > > ****************************************************************************************************
> > > Unigram tokens 2433135 types 47
> > > === 2/5 Calculating and sorting adjusted counts ===
> > > Chain sizes: 1:564 2:2630656000 3:4932480000 4:7891967488 5:11509120000
> > > tcmalloc: large alloc 11509121024 bytes == 0x1f7c000 @
> > > tcmalloc: large alloc 2630656000 bytes == 0x2aff70000 @
> > > tcmalloc: large alloc 4932485120 bytes == 0x34cc3a000 @
> > > tcmalloc: large alloc 7891968000 bytes == 0x64933c000 @
> > > lmplz: ./util/fixed_array.hh:104: T&
> > > util::FixedArray<T>::operator[](std::size_t) [with T =
> > > lm::NGramStream<lm::builder::BuildingPayload>; std::size_t = long
> > > unsigned int]: Assertion `i < size()' failed.
> > >
> > > I'm running a Linux server with Ubuntu 15.04
> > >
> > > Any help will be appreciated
> > >
> > > Alex Martínez
> > >
> > >
> > > _______________________________________________
> > > Moses-support mailing list
> > > [email protected] <mailto:[email protected]>
> <mailto:[email protected] <mailto:[email protected]>>
> > > http://mailman.mit.edu/mailman/listinfo/moses-support
> > >
> >
> >
> >
> >
> > --
> > When a place gets crowded enough to require ID's, social collapse is not
> > far away. It is time to go elsewhere. The best thing about space travel
> > is that it made it possible to go elsewhere.
> > -- R.A. Heinlein, "Time Enough For Love"
>
>
>
>