Hi all,

I am trying to train a language model with the command below, following the baseline instructions at http://www.statmt.org/moses/?n=Moses.Baseline :

    ~/mosesdecoder/bin/lmplz -o 3 <~/corpus/news-commentary-v8.fr-en.true.en > news-commentary-v8.fr-en.arpa.en

But I ran into this issue:

    === 1/5 Counting and sorting n-grams ===
    Reading /home/namrata/smt/corpus/news-commentary-v8.fr-en.true.en
    ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
    tcmalloc: large alloc 3135389696 bytes == 0x144e000 @
    tcmalloc: large alloc 10451279872 bytes == 0xbc272000 @
    Unigram tokens 0 types 3
    === 2/5 Calculating and sorting adjusted counts ===
    Chain sizes: 1:36 2:4734547456 3:8877277184
    tcmalloc: large alloc 8877277184 bytes == 0x144e000 @
    tcmalloc: large alloc 4734550016 bytes == 0x32ba4e000 @
    terminate called after throwing an instance of 'lm::builder::BadDiscountException'
      what(): /home/namrata/smt/mosesdecoder/lm/builder/adjust_counts.cc:53 in void lm::builder::{anonymous}::StatCollector::CalculateDiscounts(const lm::builder::DiscountConfig&) threw BadDiscountException because `s.n[j] == 0'.
    Could not calculate Kneser-Ney discounts for 1-grams with adjusted count 2 because we didn't observe any 1-grams with adjusted count 1; Is this small or artificial data?
    Try deduplicating the input. To override this error for e.g. a class-based model, rerun with --discount_fallback

So I changed my command to the following:

    ~/mosesdecoder/bin/lmplz -o 3 <~/corpus/news-commentary-v8.fr-en.true.en > news-commentary-v8.fr-en.arpa.en --discount_fallback

After this, when I run the command:

    ~/mosesdecoder/bin/build_binary \
        news-commentary-v8.fr-en.arpa.en \
        news-commentary-v8.fr-en.blm.en

I get the following error:

    lm/vocab.cc:324 in void lm::ngram::MissingSentenceMarker(const lm::ngram::Config&, const char*) threw SpecialWordMissingException.
    The ARPA file is missing </s> and the model is configured to reject these models. Run build_binary -s to disable this check.
    Byte: 66
    ERROR

Could anyone help me out with this, please?

Regards,
Namrata Hadimani

On Thu, 22 Apr 2021 at 18:32, Namrata Hadimani <namrata.hadim...@mycit.ie> wrote:

> Hi Hieu,
>
> Thanks for the help, I am able to successfully compile the moses ToolKit.
>
> Regards,
> Namrata Hadimani
>
> On Thu, 22 Apr 2021 at 17:23, Hieu Hoang <hieuho...@gmail.com> wrote:
>
>> i've just successfully compiled moses-4.0 on ubuntu 20.04 with boost 1.71.
>>
>> there's something wrong with your boost installation
>> On 4/22/2021 3:09 AM, ram anirudh cherukupally wrote:
>>
>> There is atleast 60 GB space, so I think it is not space issue. Has
>> moses-4.0 been tested for compilation using boost 1.71? Do you recommend
>> using boost 1.64 (as exemplified in Moses manual?)
>> Thank you
>>
>> On Thu, Apr 22, 2021 at 1:15 PM Hieu Hoang <hieuho...@gmail.com> wrote:
>>
>>> there seems to be a problem with the boost library. Is the disk full?
>>> Perhaps you should re-install boost
>>> On 4/21/2021 10:39 PM, ram anirudh cherukupally wrote:
>>>
>>> Dear Moses-Support,
>>>
>>> Please find the build.log.gz attached as per the instructions when the
>>> build failed.
>>> My system specs:
>>>
>>> OS: Ubuntu 20.04
>>> RAM: 8 GB
>>> libboost-dev version: 1.71.0.0ubuntu2
>>>
>>> Command used for compiling moses: ./bjam -j4
>>>
>>> Thanks and Regards
>>>
>>> --
>>> CH Ram Anirudh
>>>
>>> --
>>> Hieu Hoang
>>> http://statmt.org/hieu
>>
>> --
>> CH Ram Anirudh
>>
>> --
>> Hieu Hoang
>> http://statmt.org/hieu
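
P.S. One detail in the lmplz log stands out: "Unigram tokens 0 types 3" suggests that no tokens were counted from the input, which would also be consistent with the ARPA file lacking </s>. Below is a minimal sketch of a sanity check for both files, assuming a POSIX shell. The function names corpus_stats and arpa_has_eos are hypothetical helpers written for this message, not part of Moses or KenLM.

```shell
#!/bin/sh
# Hypothetical helper: print line and token counts for a corpus file,
# or "empty-or-missing" (with a non-zero exit status) if there is nothing to read.
corpus_stats() {
    file="$1"
    if [ ! -s "$file" ]; then
        echo "empty-or-missing"
        return 1
    fi
    # tr strips the padding some wc implementations add around the number
    lines=$(wc -l < "$file" | tr -d ' ')
    tokens=$(wc -w < "$file" | tr -d ' ')
    echo "lines=$lines tokens=$tokens"
}

# Hypothetical helper: succeed if the ARPA file mentions the </s> marker anywhere.
arpa_has_eos() {
    grep -q '</s>' "$1"
}

# Example usage with the file names from this thread (uncomment to run):
# corpus_stats ~/corpus/news-commentary-v8.fr-en.true.en
# arpa_has_eos news-commentary-v8.fr-en.arpa.en && echo "found </s>" || echo "no </s>"
```

If corpus_stats reports an empty or missing file, the truecasing step that produces the .true.en file may be the place to look before rerunning lmplz.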
_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support