Re: [Moses-support] Fwd: Moses: Prepare Data, Build Language Model and Train Model

Llio Humphreys Tue, 19 Aug 2008 08:41:40 -0700

Dear Eric/Moses Support Group,

I am using Ubuntu with 3.5GB RAM and finally got
train-factored-phrase-model.perl to run!
I am now on the tuning part of the tutorial, and I'm still using the
Baseline data to test out the system on my machine.
I adapted the command for tuning from:


bin/moses-scripts/scripts-YYYYMMDD-HHMM/training/mert-moses.pl
working-dir/tuning/input working-dir/tuning/reference
moses/moses-cmd/src/moses working-dir/model/moses.ini --working-dir
working-dir/tuning --rootdir bin/moses-scripts/scripts-YYYYMMDD-HHMM

to

mert-moses.pl europarl/tuning/input europarl/tuning/reference moses
model/moses.ini --working-dir europarl/tuning --rootdir
/usr/share/moses/scripts >&mert-moses-run.out

I get the error message:

After default: -l mem_free=0.5G -hard
Using SCRIPTS_ROOTDIR: /usr/share/moses/scripts
Not executable: moses at /usr/bin/mert-moses.pl line 297.

mert-moses.pl line 297 is empty but the previous line says:

die "Not executable: $___DECODER" if ! -x $___DECODER;

Grateful for your advice.

Thanks,
Llio Humphreys
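
A likely cause, assuming the script is unchanged from what is quoted above:
`-x $___DECODER` is a plain Perl file test, so it checks the string "moses"
as a path relative to the current directory and does not search $PATH.
Passing the decoder's full path should get past this check. A minimal sketch
follows; the helper name resolve_exe is made up for illustration, and the
commented-out command only repeats the tuning command from this message with
the decoder path substituted.

```shell
# Hypothetical helper: print the absolute path of a command found on $PATH.
# Returns non-zero if the command cannot be found.
resolve_exe() {
    command -v "$1"
}

if decoder=$(resolve_exe moses); then
    echo "decoder is at: $decoder"
    # Tuning command from the message above, with the full decoder path
    # (left commented out; adjust the paths to your own setup):
    # mert-moses.pl europarl/tuning/input europarl/tuning/reference \
    #     "$decoder" model/moses.ini --working-dir europarl/tuning \
    #     --rootdir /usr/share/moses/scripts > mert-moses-run.out 2>&1
else
    echo "moses was not found on PATH" >&2
fi
```

With the packages described below, which symlink the moses command into
/usr/bin, this would be expected to print a path under /usr/bin.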




On Thu, Aug 14, 2008 at 12:52 PM, Eric Nichols <[EMAIL PROTECTED]> wrote:
> Greetings,
>
> In the moses package, I install everything into /usr/share/moses and
> symlink the scripts and moses command into /usr/bin.
> You can see a list of installed files by running the following command:
>
> # dpkg -L moses
>
> When you call a command like ngram-count or
> train-factored-phrase-model.perl, you do not need to specify the full
> path;
> the system will be able to find it. I do not know if it is strictly
> necessary to set -scripts-root-dir, but the value
> /usr/share/moses/scripts works fine.
>
> Eric Nichols
>
> On Thu, Aug 14, 2008 at 8:02 PM, Llio Humphreys <[EMAIL PROTECTED]> wrote:
>> Dear Murat, Anung, Hieu, Josh, Eric, Miles, Sara, Amittai,
>> thank you all for your help.  It is very, very much appreciated. I
>> decided to try Eric's packages, and it looks like the installation
>> worked.  I typed some of the
>>  commands in the Baseline instructions without arguments, and the
>>  program either output to the screen that I missed some arguments or
>>  gave a description of the program.  Thank you Eric!!!
>>
>>  Following the Baseline instructions
>>  (http://www.statmt.org/wmt08/baseline.html) I have now got to the
>>  following step:
>>
>>  Use SRILM to build language model:
>>  /path-to-srilm/bin/i686/ngram-count -order 5 -interpolate -kndiscount
>>  -text working-dir/lm/europarl.lowercased -lm
>>  working-dir/lm/europarl.lm
>>
>>  In my case, I was in folder home/llio/MOSESMTDATA.  I didn't know the
>>  path to ngram-count, but it was possible to invoke it without the
>>  path:
>>
>>  ngram-count -order 5 -interpolate -kndiscount -text
>>  europarl/lm/europarl.lowercased -lm europarl/lm/europarl.lm
>>
>>  I'm concerned about two things:
>>  1) this ngram-count step is taking a very long time.  I think I started
>>  it off around 6pm yesterday, but it's still going.  It's very
>>  resource-intensive, and it's difficult to get other windows to open.
>>  I went to check up on it around 9pm, and couldn't find that particular
>>  terminal.  I thought I had closed that terminal by mistake, so I stupidly
>>  opened another one, and entered the same command.  I subsequently
>>  found that the original terminal was still open, so I closed the
>>  second one.  I'm not sure if issuing this command a second time on the
>>  same program and files on a different terminal would corrupt the
>>  original ngram-count step, and whether I should start it off again, or
>>  whether starting it off again would make things worse?   I looked up
>>  ngram-count 
>> (http://www.speech.sri.com/projects/srilm/manpages/ngram-count.1.html)
>>  and I don't think it outputs to any file, so I guess you have to be in
>>  the same terminal to do the next step?  I opened
>>  another terminal and typed 'top' to see what processes are running,
>>  and I know that ngram-count is doing something, but whether it's doing
>>  well or stuck in a loop, I can't say.  What I do find strange is that
>> the time for ngram-count is said to be 00:58:20, and it's been going
>> for hours.. I searched this problem in previous Moses Group emails and
>> I understand that if I run this with order 4 instead of 5 it will run
>> quicker with very similar results?  So, can I just stop what it's
>> doing, and run this command in the same terminal with order 4?  Are
>> there any files I need to 'touch' to ensure that it doesn't leave any
>> stone unturned?
>>
>>  2) how to do the next step:
>>
>>  
>> bin/moses-scripts/scripts-YYYYMMDD-HHMM/training/train-factored-phrase-model.perl
>>  -scripts-root-dir bin/moses-scripts/scripts-YYYYMMDD-HHMM -root-dir
>>  working-dir -corpus working-dir/corpus/europarl.lowercased -f fr -e en
>>  -alignment grow-diag-final-and -reordering msd-bidirectional-fe -lm
>>  0:5:working-dir/lm/europarl.lm:0
>>
>> I assume that like ngram-count, I can just type in
>> train-factored-phrase-model.perl without the full path...Do I need to
>> set the -scripts-root-dir parameter?  Are all the scripts in the same
>> place?
>>
>> Thank you,
>>
>> Llio
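
Two points may help with the first concern. The TIME column in top is
accumulated CPU time, not elapsed time, so a figure of 00:58:20 after many
hours may mean the process is spending most of its time waiting (for example
on swap) rather than computing. Also, ngram-count does write its result to a
file: the -lm option names the output file, so the next step does not have
to run in the same terminal. A long job can be detached from the terminal
altogether. The sketch below uses nohup for this; run_detached is a
hypothetical helper name, and the ngram-count call is left commented out.

```shell
# Hypothetical helper: start a command detached from the terminal.
# Usage: run_detached LOGFILE COMMAND [ARGS...]
# All output from the command goes to LOGFILE; the PID is printed.
run_detached() {
    logfile=$1
    shift
    nohup "$@" > "$logfile" 2>&1 &
    echo $!
}

# Example with the command from the message above (commented out):
# pid=$(run_detached ngram-count.log ngram-count -order 5 -interpolate \
#     -kndiscount -text europarl/lm/europarl.lowercased \
#     -lm europarl/lm/europarl.lm)
# kill -0 "$pid" && echo "ngram-count is still running"
```

Because the PID is printed, a later "kill -0 PID" from any terminal can tell
whether the job is still alive.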
>>
>>
>>
>>
>>  On 8/14/08, Murat ALPEREN <[EMAIL PROTECTED]> wrote:
>>  > Dear Llio,
>>  >
>>  > You should finally be able to install moses, provided you have already
>>  > installed all the dependent packages. I am not familiar with the
>>  > 'whereis' command, but once you train your model, the moses.ini file
>>  > created by the training script will take care of the paths. However,
>>  > you should supply paths carefully when training your model. Before
>>  > training, you should have two separate corpus files that are
>>  > lowercased, sentence-aligned and tokenized (there are supplementary
>>  > tools for this). Once you have your corpus in two separate files, such
>>  > as corpus.en and corpus.fr, you will run a training Perl script,
>>  > train-factored-phrase-model.perl, with various parameters. If you need
>>  > further help with this command after installing moses and all the
>>  > training scripts, send me a reply including the exact paths of your
>>  > corpus files and I will try to work out the training command for your
>>  > paths.
>>  >
>>  > Cheers
>>  >
>>  >
>>  > On 8/13/08, Llio Humphreys <[EMAIL PROTECTED]> wrote:
>>  > > Hi Murat,
>>  > > thanks for this.  I've got Ubuntu 8.04 so the Hardy Heron packages are
>>  > > what I need also
>>  > > (http://cl.naist.jp/~eric-n/ubuntu-nlp/dists/hardy/all/).
>>  > >
>>  > > I think I already got the order wrong...(sign of panic maybe?)
>>  > > I clicked on the mkcls deb and the package installer said it was
>>  > > already installed.
>>  > > I clicked on the srilm deb and the package installer said it was
>>  > > already installed, so I clicked Reinstall package.
>>  > >
>>  > > I can't find anything that states the order of installation, but note
>>  > > that the workshop baseline model requires installing giza before
>>  > > mkcls. Do I need to uninstall mkcls (if so, how? is it just a matter
>>  > > of deleting the .exc file?) or is it enough to click on Reinstall
>>  > > package?
>>  > >
>>  > > When all this is done, how do I use Moses?  Many of the commands in
>>  > > the baseline model (http://www.statmt.org/wmt08/baseline.html)
>>  > > require pathnames to the various scripts and data:  is it necessary
>>  > > to amend these commands or can I just type 'whereis' command to find
>>  > > what I need?
>>  > >
>>  > > Thanks,
>>  > > Llio
>>  > >
>>  > >
>>  > > On Wed, Aug 13, 2008 at 1:48 PM, Murat ALPEREN <[EMAIL PROTECTED]> wrote:
>>  > > > Dear Llio,
>>  > > >
>>  > > > Eric's page will probably help you; I have installed the
>>  > > > pre-compiled Debian-based packages for Ubuntu Hardy Heron. All the
>>  > > > necessary binaries are included in Eric's repository, which will
>>  > > > guide you through the dependencies. That means there is an order of
>>  > > > installation which you should follow. As far as I remember, you
>>  > > > should first install srilm, then mkcls, giza and finally moses.
>>  > > > Then you will be able to train your models or run any model on your
>>  > > > machine.
>>  > > >
>>  > > > Regards
>>  > > >
>>  > > >
>>  > > > On 8/13/08, Anung Ariwibowo <[EMAIL PROTECTED]> wrote:
>>  > > >>
>>  > > >> Hi Llio,
>>  > > >>
>>  > > >> I can compile SRILM in Linux Ubuntu without problem. Can you post the
>>  > > >> error message here, maybe we can help.
>>  > > >>
>>  > > >> Cheers,
>>  > > >> Anung
>>  > > >>
>>  > > >> On Wed, Aug 13, 2008 at 8:29 PM, Llio Humphreys <[EMAIL PROTECTED]>
>>  > > >> wrote:
>>  > > >>>
>>  > > >>> Dear Josh/Hieu,
>>  > > >>> many thanks for your replies.  The default shell is bash, and
>>  > > >>> updating the .profile file worked - thanks for that tip.  I look
>>  > > >>> forward to hearing more from you about the
>>  > > >>> ./model/extract.0-0.o.part* problem.
>>  > > >>> My apologies for my ignorance of Unix matters: I'd like to think of
>>  > > >>> myself as a newbie rather than one who is averse to learning about
>>  > > >>> these things, and the further information you have provided has been
>>  > > >>> useful and interesting.  Hieu mentioned that Anung Ariwibowo got
>>  > > >>> Moses to work when he transferred to a Linux machine.  A colleague
>>  > > >>> has kindly let me borrow a Linux/Ubuntu machine, but I have already
>>  > > >>> run into problems compiling SRILM!  So, I'll see if Eric Nichols's
>>  > > >>> packages will take care of that:
>>  > > >>> http://cl.naist.jp/~eric-n/ubuntu-nlp/dists/feisty/nlp/
>>  > > >>> Best regards,
>>  > > >>> Llio
>>  > > >>>
>>  > > >>>
>>  > > >>>
>>  > > >>> On 8/13/08, Josh Schroeder <[EMAIL PROTECTED]> wrote:
>>  > > >>> > Hi Llio,
>>  > > >>> >
>>  > > >>> >
>>  > > >>> > > you may have already received my email on the following problem
>>  > > >>> > > when building the language model:
>>  > > >>> > >
>>  > > >>> > > Executing: cat ./model/extract.0-0.o.part* > ./model/extract.0-0.o
>>  > > >>> > > cat: ./model/extract.0-0.o.part*: No such file or directory
>>  > > >>> > > Exit code: 1
>>  > > >>> > >
>>  > > >>> >
>>  > > >>> >  That's building the phrase table, not the language model. It seems
>>  > > >>> > like several people on the list are having problems with this step,
>>  > > >>> > so I'm going to take a look at the training process and post
>>  > > >>> > something to the list in the next day or two.
>>  > > >>> >
>>  > > >>> >
>>  > > >>> > >
>>  > > >>> > > 1. You mention that Moses does not use environment variables.
>>  > > >>> > > However, in order to get SRILM to work, I found it necessary to
>>  > > >>> > > create environment variables and pass these on to SRILM's make:
>>  > > >>> > >
>>  > > >>> > > make SRILM=$PWD MACHINE_TYPE=macosx
>>  > > >>> > >
>>  > > >>> > > PATH=/bin:/sbin:/usr/bin:/usr/sbin:/Users/lliohumphreys/MT/MOSESSUITE/srilm:/Users/lliohumphreys/MT/MOSESSUITE/srilm/bin:/Users/lliohumphreys/MT/MOSESSUITE/srilm/bin/macosx:/sw/bin/gawk
>>  > > >>> > > MANPATH=/Users/lliohumphreys/MT/MOSESSUITE/srilm/man
>>  > > >>> > > LC_NUMERIC=C
>>  > > >>> > >
>>  > > >>> > > In addition, I was also required to type in the following command
>>  > > >>> > > for moses-scripts:
>>  > > >>> > >
>>  > > >>> > > export SCRIPTS_ROOTDIR=/Users/lliohumphreys/MT/MOSESSUITE/bin/moses-scripts/scripts-20080811-1801
>>  > > >>> > >
>>  > > >>> > >
>>  > > >>> > >
>>  > > >>> >
>>  > > >>> >  Sorry, I should have been more clear. Moses itself, the decoder
>>  > > >>> > that loads a trained phrase table and language model and translates
>>  > > >>> > text, is a self-contained command-line program that doesn't require
>>  > > >>> > environment variables.
>>  > > >>> >
>>  > > >>> >  Your first example is compiling SRILM. This is not part of the
>>  > > >>> > Moses toolkit: it's a toolkit of its own for language modeling and a
>>  > > >>> > ton of other stuff. We use it as one of two possible integrated
>>  > > >>> > language models (the other is IRSTLM) with Moses.
>>  > > >>> >
>>  > > >>> >  Your second example is part of the training regime. Yes, there is
>>  > > >>> > some use of the SCRIPTS_ROOTDIR in the
>>  > > >>> > train-factored-phrase-model.perl, but for most training support
>>  > > >>> > scripts that come with moses there is a flag that lets you specify
>>  > > >>> > SCRIPTS_ROOTDIR at the command line instead of storing it as an
>>  > > >>> > environment variable. In train-factored-phrase-model it's
>>  > > >>> > "-scripts-root-dir", which I think you've actually used in one of
>>  > > >>> > your other emails.
>>  > > >>> >
>>  > > >>> >
>>  > > >>> >
>>  > > >>> > > If I open a new terminal and echo these variables, most of them
>>  > > >>> > > are blank, and PATH just gives the default bin paths.
>>  > > >>> > >
>>  > > >>> > > So, how do I make them permanent?  I assume that if I want to use
>>  > > >>> > > Moses again, it needs to have access to these variables?  How can
>>  > > >>> > > I ensure that I can close the terminal, go home, open a new
>>  > > >>> > > terminal the next day and get Moses working again?  A colleague
>>  > > >>> > > suggested I update the .bashrc file to update each new terminal
>>  > > >>> > > session with these environment variables. However, my Mac system
>>  > > >>> > > does not appear to have a .bashrc system as a default, and when I
>>  > > >>> > > created one in my home directory and opened a new terminal, it did
>>  > > >>> > > not access the .bashrc file.
>>  > > >>> > >
>>  > > >>> >
>>  > > >>> >  Here's some info on environment variables on the Mac, found with a
>>  > > >>> > quick Google search:
>>  > > >>> > http://www.macdevcenter.com/pub/a/mac/2004/02/24/bash.html
>>  > > >>> >
>>  > > >>> >  I tried it with .profile, that worked fine. Are you sure you're set
>>  > > >>> > to use the bash shell? Try ' echo $SHELL ' in Terminal.
>>  > > >>> >
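
Some background that may explain the .bashrc behaviour: bash reads ~/.bashrc
only for interactive non-login shells, whereas a login shell reads the first
of ~/.bash_profile, ~/.bash_login and ~/.profile that exists. Terminal on
the Mac normally starts login shells, which would be why .profile worked and
.bashrc did not. A minimal sketch of making a variable permanent follows;
persist_export is a hypothetical helper name, and the SCRIPTS_ROOTDIR value
in the commented-out example is the one quoted above.

```shell
# Hypothetical helper: append "export NAME=VALUE" to a startup file,
# unless a line exporting NAME is already present.
# Usage: persist_export FILE NAME VALUE
# Assumes VALUE contains no double quotes, dollar signs or backslashes.
persist_export() {
    file=$1 name=$2 value=$3
    if ! grep -q "^export $name=" "$file" 2>/dev/null; then
        printf 'export %s="%s"\n' "$name" "$value" >> "$file"
    fi
}

# Example (commented out), using the value quoted above:
# persist_export "$HOME/.profile" SCRIPTS_ROOTDIR \
#     /Users/lliohumphreys/MT/MOSESSUITE/bin/moses-scripts/scripts-20080811-1801
```

The check before appending keeps the file from accumulating duplicate lines
if the helper is run more than once.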
>>  > > >>> >
>>  > > >>> > > 2. You say that you ran the decoder on your laptop just fine, but
>>  > > >>> > > had to change a few scripts for training.  I have very basic
>>  > > >>> > > knowledge of Unix systems and installing open-source software:
>>  > > >>> > > would it be possible for you to detail the changes you did to the
>>  > > >>> > > scripts to get it to run on a Mac?  Although I need this
>>  > > >>> > > information urgently, it may also be useful for other students who
>>  > > >>> > > are installing Moses on a Mac and who may also have basic
>>  > > >>> > > knowledge of Unix installation procedures.
>>  > > >>> > >
>>  > > >>> >
>>  > > >>> >  I'll look into this. Mac isn't really the platform of choice for
>>  > > >>> > training a Moses model and I do most of my work on linux. If I
>>  > > >>> > recall correctly, an Intel-based Mac should be easier to get working
>>  > > >>> > than a PowerPC one. The *decoder* does work on my Intel-based
>>  > > >>> > laptop, but I haven't run a full training setup locally in some time
>>  > > >>> > -- most of the time we're working with so much data that I use a
>>  > > >>> > cluster of linux machines instead of my Mac.
>>  > > >>> >
>>  > > >>> >  As a word of caution: Moses isn't an out-of-the-box translation
>>  > > >>> > solution for end users. It's research software undergoing active
>>  > > >>> > development, so almost every user -- on any platform -- will need
>>  > > >>> > to muck around in the scripts at some point, or face a compile error
>>  > > >>> > or runtime crash. The ability to deal with unix/linux command line
>>  > > >>> > tools, and debug code and scripts when necessary, is really
>>  > > >>> > important. That being said, I'll see what I can do about
>>  > > >>> > highlighting where the scripts might have problems on the Mac.
>>  > > >>> >
>>  > > >>> >
>>  > > >>> > > 3. My final question: which is embarrassingly basic...can I use
>>  > > >>> > > the one installation of Moses for different corpora, or do I need
>>  > > >>> > > to do a separate installation for each one?  Can I have separate
>>  > > >>> > > installations of SRILM, Giza and mkcls, or should they all
>>  > > >>> > > reference the same libraries?
>>  > > >>> > >
>>  > > >>> >
>>  > > >>> >  All you need to do to have moses use different corpora is point it
>>  > > >>> > to a different moses.ini file. Assuming you have compiled moses with
>>  > > >>> > support for the language model specified in the file (IRSTLM or
>>  > > >>> > SRILM), it will translate. You should only need one copy of giza,
>>  > > >>> > mkcls, irst/srilm, and moses. The code stays the same, it's the data
>>  > > >>> > model that's different.
>>  > > >>> >
>>  > > >>> >  -Josh
>>  > > >>> >
>>  > > >>> >
>>  > > >>> >
>>  > > >>> >  --
>>  > > >>> >  The University of Edinburgh is a charitable body, registered in
>>  > > >>> >  Scotland, with registration number SC005336.
>>  > > >>> >
>>  > > >>> >
>>  > > >>> _______________________________________________
>>  > > >>> Moses-support mailing list
>>  > > >>> [email protected]
>>  > > >>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>  > > >>>
>>  > > >>
>>  > > >>
>>  > > >> --
>>  > > >> barliant at {gmail.com, yahoo.com}
>>  > > >> Starting July 2008, barliant at cbn.net.id is no longer active
>>  > > >> Visit my Blog at barliant dot blogspot dot com
>>  > > >>
>>  > > >
>>  > > >
>>  > >
>>  >
>>  >
>>
>
