---------- Forwarded message ---------
From: Namrata Hadimani <[email protected]>
Date: Fri, 23 Apr 2021 at 00:39
Subject: Re: [Moses-support] reg. moses installation
To: Kenneth Heafield <[email protected]>
Hi Kenneth,
Actually, the problem starts at the tokenisation step itself. I ran the
tokenizer.perl script, but it did not produce any output, and no log is
generated for this step.
Could you guide me further?
*Below is the command I ran:*
~/mosesdecoder/scripts/tokenizer/tokenizer.perl -l en \
< ~/corpus/training/news-commentary-v8.fr-en.en \
> ~/corpus/news-commentary-v8.fr-en.tok.en
*And this is the output I get:*
Tokenizer Version 1.1
Language: en
Number of threads: 1
*After this step, no tokens are produced and the output file is empty. How
can I fix this problem?*
*Thanks in advance*
Regards,
Namrata Hadimani
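
A possible first check, sketched under assumptions: the tokenizer reads
standard input, so if the input file exists but has no content, the output
file would be created empty, which would match the response shown above.
The helper name check_nonempty is hypothetical (it is not part of Moses),
and the paths in the comments are copied from the command above.

```shell
#!/bin/sh
# check_nonempty: hypothetical helper, not part of Moses.
# Prints the line count of a file, or fails if the file is missing or empty.
check_nonempty() {
    # [ -s FILE ] is true only if FILE exists and its size is greater than zero
    if [ ! -s "$1" ]; then
        echo "EMPTY or missing: $1" >&2
        return 1
    fi
    # tr strips the padding some wc implementations add around the count
    echo "OK: $1 ($(wc -l < "$1" | tr -d ' ') lines)"
}

# Usage, with paths assumed from the command above (adjust to your layout):
# check_nonempty ~/corpus/training/news-commentary-v8.fr-en.en   # tokenizer input
# check_nonempty ~/corpus/news-commentary-v8.fr-en.tok.en        # tokenizer output
```

If the first call already reports an empty or missing file, the problem is
with the downloaded corpus or its path rather than with the tokenizer.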
On Thu, 22 Apr 2021 at 23:45, Kenneth Heafield <[email protected]> wrote:
> Your training corpus is empty.
>
> cat ~/corpus/news-commentary-v8.fr-en.true.en
>
> On 4/22/21 9:50 PM, Namrata Hadimani wrote:
> > Hi All,
> >
> > I am trying to perform Language Model Training using the below command
> >
> > ~/mosesdecoder/bin/lmplz -o 3 <~/corpus/news-commentary-v8.fr-en.true.en > news-commentary-v8.fr-en.arpa.en
> >
> >
> > following the baseline instructions at the link below:
> > http://www.statmt.org/moses/?n=Moses.Baseline
> >
> > But I got this error:
> > === 1/5 Counting and sorting n-grams ===
> > Reading /home/namrata/smt/corpus/news-commentary-v8.fr-en.true.en
> > ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> > tcmalloc: large alloc 3135389696 bytes == 0x144e000 @
> > tcmalloc: large alloc 10451279872 bytes == 0xbc272000 @
> > Unigram tokens 0 types 3
> > === 2/5 Calculating and sorting adjusted counts ===
> > Chain sizes: 1:36 2:4734547456 3:8877277184
> > tcmalloc: large alloc 8877277184 bytes == 0x144e000 @
> > tcmalloc: large alloc 4734550016 bytes == 0x32ba4e000 @
> > terminate called after throwing an instance of
> > 'lm::builder::BadDiscountException'
> > what(): /home/namrata/smt/mosesdecoder/lm/builder/adjust_counts.cc:53
> > in void
> > lm::builder::{anonymous}::StatCollector::CalculateDiscounts(const
> > lm::builder::DiscountConfig&) threw BadDiscountException because `s.n[j]
> > == 0'.
> > Could not calculate Kneser-Ney discounts for 1-grams with adjusted count
> > 2 because we didn't observe any 1-grams with adjusted count 1; Is this
> > small or artificial data?
> > Try deduplicating the input. To override this error for e.g. a
> > class-based model, rerun with --discount_fallback
> >
> > So I changed my command to the following:
> > ~/mosesdecoder/bin/lmplz -o 3 <~/corpus/news-commentary-v8.fr-en.true.en > news-commentary-v8.fr-en.arpa.en --discount_fallback
> >
> > After this, when I run the command:
> >
> > ~/mosesdecoder/bin/build_binary \
> > news-commentary-v8.fr-en.arpa.en \
> > news-commentary-v8.fr-en.blm.en
> >
> > I get this error:
> > lm/vocab.cc:324 in void
> > lm::ngram::MissingSentenceMarker(const lm::ngram::Config&, const char*)
> > threw SpecialWordMissingException.
> > The ARPA file is missing </s> and the model is configured to reject
> > these models. Run build_binary -s to disable this check. Byte: 66
> > ERROR
> >
> > Could anyone help me out with this, please?
> >
> > Regards,
> > Namrata Hadimani
> >
> > On Thu, 22 Apr 2021 at 18:32, Namrata Hadimani
> > <[email protected] <mailto:[email protected]>> wrote:
> >
> > Hi Hieu,
> >
> > Thanks for the help. I was able to compile the Moses toolkit
> > successfully.
> >
> > Regards,
> > Namrata Hadimani
> >
> > On Thu, 22 Apr 2021 at 17:23, Hieu Hoang <[email protected]
> > <mailto:[email protected]>> wrote:
> >
> > I've just successfully compiled moses-4.0 on Ubuntu 20.04 with
> > boost 1.71.
> >
> > There's something wrong with your boost installation.
> >
> > On 4/22/2021 3:09 AM, ram anirudh cherukupally wrote:
> >> There is at least 60 GB of free space, so I don't think it is a
> >> disk-space issue. Has moses-4.0 been tested for compilation using
> >> boost 1.71? Do you recommend using boost 1.64 (as exemplified in
> >> the Moses manual)?
> >> Thank you
> >>
> >> On Thu, Apr 22, 2021 at 1:15 PM Hieu Hoang
> >> <[email protected] <mailto:[email protected]>> wrote:
> >>
> >> There seems to be a problem with the boost library. Is the
> >> disk full? Perhaps you should re-install boost.
> >>
> >> On 4/21/2021 10:39 PM, ram anirudh cherukupally wrote:
> >>> Dear Moses-Support,
> >>>
> >>> Please find the build.log.gz attached as per the
> >>> instructions when the build failed.
> >>> My system specs:
> >>>
> >>> OS: Ubuntu 20.04
> >>> RAM: 8 GB
> >>> libboost-dev version: 1.71.0.0ubuntu2
> >>>
> >>> Command used for compiling moses: ./bjam -j4
> >>>
> >>> Thanks and Regards
> >>>
> >>> --
> >>> CH Ram Anirudh
> >>>
> >>>
> >>> _______________________________________________
> >>> Moses-support mailing list
> >>> [email protected] <mailto:[email protected]>
> >>> http://mailman.mit.edu/mailman/listinfo/moses-support
> >>
> >> --
> >> Hieu Hoang
> >> http://statmt.org/hieu
> >>
> >>
> >>
> >> --
> >> CH Ram Anirudh
> >>
> > --
> > Hieu Hoang
> > http://statmt.org/hieu
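
The build_binary error quoted in the thread reports that the ARPA file is
missing </s>. Rather than disabling that check with -s, it may help to look
at the ARPA file first. The following is only a rough sketch: the helper
name arpa_has_eos is hypothetical (it is not part of Moses or KenLM), it
merely searches for the literal string </s>, and the file name in the
comments is the one used in the thread.

```shell
#!/bin/sh
# arpa_has_eos: hypothetical helper, not part of Moses or KenLM.
# Succeeds only if the file is non-empty and contains the literal token </s>.
arpa_has_eos() {
    # -s: file exists and is non-empty; grep -F matches the fixed string, -q is silent
    [ -s "$1" ] && grep -q -F '</s>' "$1"
}

# Usage before running build_binary (file name taken from the thread):
# if arpa_has_eos news-commentary-v8.fr-en.arpa.en; then
#     echo "ARPA contains </s>"
# else
#     echo "ARPA is empty or lacks </s>; check the training corpus first" >&2
# fi
```

If this check fails, the more likely fix is upstream (a non-empty training
corpus, as Kenneth pointed out) rather than a flag that overrides the error.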