Thanks a lot. I managed to create the lm using the perl script instead of using steps 1-5.
Regards
Renu

On December 5, 2013 at 10:00 PM Hieu Hoang <[email protected]> wrote:
> Sorry, I was wrong and Prashant was correct.
>    ./compile-lm --text
> creates the ARPA file.
>
> Perhaps an easier way to create a LM using IRSTLM is to use the Moses wrapper
> script
>    scripts/generic/trainlm-irst2.perl
>
> This does steps 1 to 5 for you. Here is an example of how to run it
>
> /home/s0565741/workspace/github/hh/scripts/generic/trainlm-irst2.perl
> -cores 4 -irst-dir /home/s0565741/workspace/bin/irstlm/bin -p 0 -order 5
> -text
> /home/s0565741/workspace/experiment/europarl/en-es/lm/europarl.lowercased.1
> -lm /home/s0565741/workspace/experiment/europarl/en-es/lm/europarl.lm.1
>
> On 5 December 2013 15:12, renubalyan <[email protected]> wrote:
>
> > Hi,
> >
> > Thanks for the response.
> >
> > I tried this option too. If I run the command without the '--text yes'
> > option, the command runs fine. However, I wanted to ask one thing: does
> > this give me an ARPA file or a binarized one? Because when I run the next
> > command mentioned in the manual:
> >
> > 6. /home/renu/Desktop/mosesdecoder/bin/build_binary
> > news-commentary-v8.fr-en.arpa.en news-commentary-v8.fr-en.blm.en
> >
> > I get the following output:
> >
> > Reading news-commentary-v8.fr-en.arpa.en
> > ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
> > ****************************************************************************************************
> > lm/read_arpa.cc:63 in void lm::ReadARPACounts(util::FilePiece&,
> > std::vector<long long unsigned int>&) threw FormatLoadException because
> > `line.size() >= 4 && StringPiece(line.data(), 4) == "blmt"'.
> > This looks like an IRSTLM binary file. Did you forget to pass --text yes
> > to compile-lm? Byte: 40 File: news-commentary-v8.fr-en.arpa.en
> > ERROR
> >
> > The second-to-last line (put in bold) indicates that the file I am using
> > is a binary file.
> > Does that mean I already have a binary file and I do not need to use
> > step 6 mentioned above (which is in fact for converting from ARPA to a
> > binary file)?
> >
> > Thanks
> > Renu
> >
> > On December 5, 2013 at 4:19 PM Hieu Hoang <[email protected]> wrote:
> >
> > > I'm not sure what is
> > >    --text yes
> > > this is how the EMS runs IRSTLM compile-lm:
> > >    .../compile-lm .../europarl_pos.lm.4 .../europarl_pos.binlm.4
> > >
> > > On 4 December 2013 15:58, renubalyan <[email protected]> wrote:
> > >
> > > > Hi,
> > > >
> > > > I am building the baseline system based on the Moses manual
> > > > instructions.
> > > >
> > > > I have installed Moses, GIZA++ and IRSTLM as mentioned in the
> > > > manual. The corpus preparation (tokenization, ...cleaning) steps
> > > > also go well.
> > > >
> > > > However, when I move to Language Model Training, I have some
> > > > problems.
> > > >
> > > > I am following these steps:
> > > >
> > > > 1. mkdir ~/lm
> > > >
> > > > 2. cd ~/lm
> > > >
> > > > 3. /home/renu/Desktop/irstlm/bin/add-start-end.sh <
> > > > /home/renu/Desktop/corpus/news-commentary-v8.fr-en.true.en >
> > > > news-commentary-v8.fr-en.sb.en
> > > >
> > > > 4. export IRSTLM=/home/renu/Desktop/irstlm;
> > > > /home/renu/Desktop/irstlm/bin/build-lm.sh -i
> > > > news-commentary-v8.fr-en.sb.en -t ./tmp -p -s improved-kneser-ney -o
> > > > news-commentary-v8.fr-en.lm.en
> > > >
> > > > 5. /home/renu/Desktop/irstlm/bin/compile-lm --text yes
> > > > news-commentary-v8.fr-en.lm.en.gz news-commentary-v8.fr-en.arpa.en
> > > >
> > > > Steps 1-4 work well but step 5 gives me -------(Warning:Too many
> > > > parameters)
> > > >
> > > > I have searched the web for any possible solution but could not
> > > > find any.
> > > >
> > > > I am not able to move ahead, kindly help.
> > > >
> > > > Thanks
> > > > Renu
> > > >
> > > > -------------------------------------------------------------------------------------------------------------------------------
> > > > This e-mail is for the sole use of the intended recipient(s) and may
> > > > contain confidential and privileged information. If you are not the
> > > > intended recipient, please contact the sender by reply e-mail and
> > > > destroy all copies and the original message. Any unauthorized review,
> > > > use, disclosure, dissemination, forwarding, printing or copying of
> > > > this email is strictly prohibited and appropriate legal action will
> > > > be taken.
> > > > -------------------------------------------------------------------------------------------------------------------------------
> > > >
> > > > _______________________________________________
> > > > Moses-support mailing list
> > > > [email protected]
> > > > http://mailman.mit.edu/mailman/listinfo/moses-support
> > >
> > > --
> > > Hieu Hoang
> > > Research Associate
> > > University of Edinburgh
> > > http://www.hoang.co.uk/hieu
>
> --
> Hieu Hoang
> Research Associate
> University of Edinburgh
> http://www.hoang.co.uk/hieu
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
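
The build_binary error quoted in the thread is raised because the input file begins with the four bytes "blmt", which KenLM takes as the marker of an IRSTLM binary file. A minimal shell sketch of the same check follows; it may help to tell which kind of file compile-lm produced before running step 6. The function name lm_format is hypothetical, and treating a "\data\" line within the first five lines as the sign of an ARPA file is an assumption based on the usual ARPA header, not something stated in the thread.

```shell
#!/bin/sh
# Hypothetical helper: guess the format of a language-model file from its head.
lm_format() {
    file="$1"
    # First four bytes; "blmt" is the magic string named in the
    # build_binary error message quoted in the thread.
    magic=$(head -c 4 "$file")
    case "$magic" in
        blmt)
            echo "irstlm-binary"
            ;;
        *)
            # Assumption: an ARPA file has a "\data\" header line near the top.
            if head -n 5 "$file" | grep -q '^\\data\\'; then
                echo "arpa"
            else
                echo "unknown"
            fi
            ;;
    esac
}
```

For example, running `lm_format news-commentary-v8.fr-en.arpa.en` on the file from step 5 would be expected to print "irstlm-binary" if compile-lm was run without the text option, in which case build_binary will reject it as shown above.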
