Works as intended.
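For readers wondering why the abort quoted below is expected on a one-sentence input, here is a minimal sketch of the modified Kneser-Ney discount formula from Chen and Goodman. It is not KenLM's actual code. The function name `modified_kn_discounts` and the count-of-counts values are hypothetical, chosen only for illustration. Each discount divides by the number of n-grams seen with adjusted count j, so the estimate is undefined when no n-gram has adjusted count 1, 2, or 3.

```python
def modified_kn_discounts(n):
    """Return [D1, D2, D3+] from count-of-counts.

    n maps an adjusted count j (1..4) to the number of n-grams
    that have exactly that adjusted count.
    """
    # Each D_j divides by n[j], so every n[j] for j = 1..3 must be non-zero.
    for j in (1, 2, 3):
        if n.get(j, 0) == 0:
            raise ValueError(
                f"cannot compute discount for adjusted count {j + 1}: "
                f"no n-grams observed with adjusted count {j}"
            )
    y = n[1] / (n[1] + 2 * n[2])
    # D_j = j - (j + 1) * Y * n[j + 1] / n[j]
    return [j - (j + 1) * y * n.get(j + 1, 0) / n[j] for j in (1, 2, 3)]


if __name__ == "__main__":
    # Hypothetical counts for a reasonably sized corpus.
    print(modified_kn_discounts({1: 100, 2: 40, 3: 20, 4: 10}))

    # Hypothetical counts for a tiny input: nothing has adjusted count 3.
    try:
        modified_kn_discounts({1: 14, 2: 1, 3: 0, 4: 0})
    except ValueError as err:
        print("error:", err)
```

On very small or artificial data the count-of-counts tend to have gaps like the second call, which is the situation the error message describes and the case `--discount_fallback` exists for.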
On 01/13/2016 02:59 PM, Lane Schwartz wrote:
> Thanks, Kenneth. Here's what I get now.
>
> $ ~/mosesdecoder.multisource.git/bin/lmplz -o 2 <<< "that is what
> happens ? cssd has nothing more or voldemort or pastries in prague ."
> === 1/5 Counting and sorting n-grams ===
> Reading /tmp/sh-thd-1452698150 (deleted)
>
> ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> tcmalloc: large alloc 29442056192 bytes == 0x1c74000 @
> tcmalloc: large alloc 78512136192 bytes == 0x6de346000 @
>
> ****************************************************************************************************
> Unigram tokens 16 types 18
> === 2/5 Calculating and sorting adjusted counts ===
> Chain sizes: 1:216 2:107979354931
> tcmalloc: large alloc 107979358208 bytes == 0x192a648000 @
> terminate called after throwing an instance of
> 'lm::builder::BadDiscountException'
> what():
> /home/lanes/mosesdecoder.multisource.git/lm/builder/adjust_counts.cc:53
> in void
> lm::builder::{anonymous}::StatCollector::CalculateDiscounts(const
> lm::builder::DiscountConfig&) threw BadDiscountException because
> `s.n[j] == 0'.
> Could not calculate Kneser-Ney discounts for 1-grams with adjusted
> count 4 because we didn't observe any 1-grams with adjusted count 3;
> Is this small or artificial data?
> Try deduplicating the input. To override this error for e.g. a
> class-based model, rerun with --discount_fallback
> Aborted (core dumped)
>
> On Tue, Jan 12, 2016 at 5:40 PM, Kenneth Heafield <mo...@kheafield.com> wrote:
>
> Pushed the fix from kenlm master in October to Moses master.
>
> On 01/12/2016 10:34 PM, Lane Schwartz wrote:
> > Steps to reproduce this error:
> >
> > $ ~/mosesdecoder.git/bin/lmplz -o 2 <<< "that is what happens ? cssd
> > has nothing more or voldemort or pastries in prague ."
> > === 1/5 Counting and sorting n-grams ===
> > Reading /tmp/sh-thd-107574999377 (deleted)
> >
> > ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> > tcmalloc: large alloc 29442056192 bytes == 0x2ae2000 @
> > tcmalloc: large alloc 78512136192 bytes == 0x6df1b4000 @
> >
> > ****************************************************************************************************
> > Unigram tokens 16 types 18
> > === 2/5 Calculating and sorting adjusted counts ===
> > Chain sizes: 1:216 2:107979354931
> > tcmalloc: large alloc 107979358208 bytes == 0x192b4b6000 @
> > lmplz: ./util/fixed_array.hh:104: T&
> > util::FixedArray<T>::operator[](std::size_t) [with T =
> > lm::NGramStream<lm::builder::BuildingPayload>; std::size_t = long
> > unsigned int]: Assertion `i < size()' failed.
> >
> > On Wed, Sep 30, 2015 at 11:41 AM, Kenneth Heafield <mo...@kheafield.com> wrote:
> >
> > That's bad. Would you mind sending me privately a minimal example of
> > the data that reproduces the problem?
> >
> > Kenneth
> >
> > On 09/30/2015 04:29 PM, Alex Martinez wrote:
> > > Hello,
> > > today I've pulled moses code and recompiled and some experiments (EMS)
> > > that were already working are failing on the LM training step with the
> > > following error:
> > >
> > > Executing: /opt/moses/bin/lmplz --text
> > > /home/alexmc/devel/toydata/process/lm/nc=pos.factored.1 --order 5 --arpa
> > > /home/alexmc/devel/toydata/process/lm/nc=pos.lm.1 --discount_fallback
> > > === 1/5 Counting and sorting n-grams ===
> > > Reading /mnt/a62/devel/toydata/process/lm/nc=pos.factored.1
> > >
> > > ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> > > tcmalloc: large alloc 4753956864 bytes == 0x1f7c000 @
> > > tcmalloc: large alloc 22185107456 bytes == 0x11d536000 @
> > >
> > > ****************************************************************************************************
> > > Unigram tokens 2433135 types 47
> > > === 2/5 Calculating and sorting adjusted counts ===
> > > Chain sizes: 1:564 2:2630656000 3:4932480000 4:7891967488 5:11509120000
> > > tcmalloc: large alloc 11509121024 bytes == 0x1f7c000 @
> > > tcmalloc: large alloc 2630656000 bytes == 0x2aff70000 @
> > > tcmalloc: large alloc 4932485120 bytes == 0x34cc3a000 @
> > > tcmalloc: large alloc 7891968000 bytes == 0x64933c000 @
> > > lmplz: ./util/fixed_array.hh:104: T&
> > > util::FixedArray<T>::operator[](std::size_t) [with T =
> > > lm::NGramStream<lm::builder::BuildingPayload>; std::size_t = long
> > > unsigned int]: Assertion `i < size()' failed.
> > >
> > > I'm running a Linux server with Ubuntu 15.04
> > >
> > > Any help will be appreciated
> > >
> > > Alex Martínez
> > >
> > > _______________________________________________
> > > Moses-support mailing list
> > > Moses-support@mit.edu
> > > http://mailman.mit.edu/mailman/listinfo/moses-support
> >
> > --
> > When a place gets crowded enough to require ID's, social collapse is not
> > far away. It is time to go elsewhere. The best thing about space travel
> > is that it made it possible to go elsewhere.
> > -- R.A. Heinlein, "Time Enough For Love"