Sorry, I was wrong and Prashant was correct.
./compile-lm --text
creates the ARPA file.
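
If you are unsure which format a file is in, one option is to look at how it starts. The build_binary error quoted further down shows that an IRSTLM binary file begins with "blmt", whereas a standard ARPA file has a \data\ line near the top. Here is a rough sketch, assuming bash with head and grep available. The function name lm_format is made up for this example (it is not part of IRSTLM or Moses), and it only looks at plain, non-gzipped files:

```shell
#!/usr/bin/env bash
# lm_format FILE: guess the language model format from the start of FILE.
# Hypothetical helper for illustration; not shipped with IRSTLM or Moses.
lm_format() {
    local file="$1"
    # IRSTLM binary files begin with the 4 bytes "blmt"
    # (this is the check named in the build_binary error message).
    if [ "$(head -c 4 "$file")" = "blmt" ]; then
        echo "irstlm-binary"
    # ARPA files contain a line reading exactly \data\ near the top.
    elif head -n 20 "$file" | grep -q '^\\data\\[[:space:]]*$'; then
        echo "arpa"
    else
        echo "unknown"
    fi
}
```

You would call it as "lm_format news-commentary-v8.fr-en.arpa.en". Anything other than "arpa" suggests build_binary will not accept the file as it is.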
Perhaps an easier way to create an LM using IRSTLM is to use the Moses
wrapper script
scripts/generic/trainlm-irst2.perl
This does steps 1 to 5 for you. Here is an example of how to run it:
/home/s0565741/workspace/github/hh/scripts/generic/trainlm-irst2.perl \
  -cores 4 -irst-dir /home/s0565741/workspace/bin/irstlm/bin -p 0 -order 5 \
  -text /home/s0565741/workspace/experiment/europarl/en-es/lm/europarl.lowercased.1 \
  -lm /home/s0565741/workspace/experiment/europarl/en-es/lm/europarl.lm.1
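
The same call may be easier to adapt with the paths pulled out into variables. This is only a sketch: the directory values are placeholders to replace with your own, the variable and function names are invented for the example, and the options are just the ones used in the command above. The function prints the command instead of running it, so you can check it first:

```shell
#!/usr/bin/env bash
# Placeholder locations; replace with your own installation paths.
MOSES_DIR="$HOME/mosesdecoder"
IRSTLM_BIN="$HOME/irstlm/bin"
CORPUS="$HOME/lm/corpus.lowercased"   # input text (placeholder name)
LM_OUT="$HOME/lm/corpus.lm"           # output language model (placeholder name)

# Print the trainlm-irst2.perl command line, using the same options
# as the example command in this message.
trainlm_cmd() {
    local cmd=(
        "$MOSES_DIR/scripts/generic/trainlm-irst2.perl"
        -cores 4
        -irst-dir "$IRSTLM_BIN"
        -p 0
        -order 5
        -text "$CORPUS"
        -lm "$LM_OUT"
    )
    # %q quotes each argument so the printed line can be pasted into a shell.
    printf '%q ' "${cmd[@]}"
    printf '\n'
}

trainlm_cmd
```

Once the printed line looks right, paste it into the shell to run it.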
On 5 December 2013 15:12, renubalyan <[email protected]> wrote:
> Hi,
>
> Thanks for the response.
>
> I tried this option too. If I run the command without the '--text yes'
> option, it runs fine. However, I wanted to ask one thing: does this give me
> an ARPA file or a binarized one? I ask because when I run the next command
> mentioned in the manual:
>
> 6. /home/renu/Desktop/mosesdecoder/bin/build_binary
> news-commentary-v8.fr-en.arpa.en news-commentary-v8.fr-en.blm.en
>
> *I get the following output:*
>
> Reading news-commentary-v8.fr-en.arpa.en
> ----5---10---15---20---25---30---35---40---45---50---55---60---65---70---75---80---85---90---95--100
>
> ****************************************************************************************************
>
> lm/read_arpa.cc:63 in void lm::ReadARPACounts(util::FilePiece&,
> std::vector<long long unsigned int>&) threw FormatLoadException because
> `line.size() >= 4 && StringPiece(line.data(), 4) == "blmt"'.
> *This looks like an IRSTLM binary file. Did you forget to pass --text yes
> to compile-lm? Byte: 40 File: news-commentary-v8.fr-en.arpa.en*
> ERROR
>
> The second-to-last line (in bold) indicates that the file I am using is a
> binary file.
> Does that mean I already have a binary file and do not need step 6 above
> (which is in fact for converting from ARPA to binary)?
>
>
> Thanks
> Renu
>
> On December 5, 2013 at 4:19 PM Hieu Hoang <[email protected]> wrote:
>
> I'm not sure what
> --text yes
> is. This is how the EMS runs IRSTLM compile-lm:
> .../compile-lm .../europarl_pos.lm.4 .../europarl_pos.binlm.4
>
>
>
> On 4 December 2013 15:58, renubalyan <[email protected]> wrote:
>
> Hi,
>
> I am building the baseline system based on Moses manual instructions.
>
> I have installed Moses, GIZA++ and IRSTLM as mentioned in the manual.
> The corpus preparation (tokenization, ...cleaning) steps also go well.
>
> However, when I move on to language model training, I have some problems.
>
> I am following these steps:
>
> 1. mkdir ~/lm
>
> 2. cd ~/lm
>
> 3. /home/renu/Desktop/irstlm/bin/add-start-end.sh <
> /home/renu/Desktop/corpus/news-commentary-v8.fr-en.true.en >
> news-commentary-v8.fr-en.sb.en
>
> 4. export IRSTLM=/home/renu/Desktop/irstlm;
> /home/renu/Desktop/irstlm/bin/build-lm.sh -i news-commentary-v8.fr-en.sb.en
> -t ./tmp -p -s improved-kneser-ney -o news-commentary-v8.fr-en.lm.en
>
> 5. /home/renu/Desktop/irstlm/bin/compile-lm --text yes
> news-commentary-v8.fr-en.lm.en.gz news-commentary-v8.fr-en.arpa.en
>
> Steps 1-4 work well, but step 5 gives me this message:
> (Warning:Too many parameters)
>
> I have searched the web for any possible solution but could not find any.
>
> I am not able to move ahead, kindly help.
>
> Thanks
> Renu
>
> -------------------------------------------------------------------------------------------------------------------------------
>
> This e-mail is for the sole use of the intended recipient(s) and may
> contain confidential and privileged information. If you are not the
> intended recipient, please contact the sender by reply e-mail and destroy
> all copies and the original message. Any unauthorized review, use,
> disclosure, dissemination, forwarding, printing or copying of this email
> is strictly prohibited and appropriate legal action will be taken.
> -------------------------------------------------------------------------------------------------------------------------------
>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
>
>
> --
> Hieu Hoang
> Research Associate
> University of Edinburgh
> http://www.hoang.co.uk/hieu
--
Hieu Hoang
Research Associate
University of Edinburgh
http://www.hoang.co.uk/hieu