Hello everyone!  I'm having some trouble I am hoping someone can help
me with.  For the most part I have been following the instructions on
this page: http://www.statmt.org/moses_steps.html.  I am running
Ubuntu Jaunty Jackalope (9.04) 64 bit OS with about 10 gigs of memory
on it.  With a few modifications and changes I was able to get all the
software installed and working.  I used the following alignment and
corpus:

http://wt.jrc.it/lt/Acquis/JRC-Acquis.3.0/alignments/jrc-en-es.xml.gz
http://wt.jrc.it/lt/Acquis/JRC-Acquis.3.0/corpus/jrc-en.tgz
http://wt.jrc.it/lt/Acquis/JRC-Acquis.3.0/corpus/jrc-es.tgz

I ran these commands on them:

perl getAlignmentWithText.pl jrc-en-es.xml > alignedCorpus_en_es.xml
grep '^<s1>' alignedCorpus_en_es.xml > ../../work/corpus/alignment_en
grep '^<s2>' alignedCorpus_en_es.xml > ../../work/corpus/alignment_es

Which I believe is how I am supposed to do it.  I then tokenized and
lowercased each file.  I also cleaned it using the following:

clean-corpus-n.perl work/corpus/alignment.tok.lowercase en es \
    work/corpus/alignment.tok.lowercase.clean 1 50
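
Since the grep step writes the two language sides to separate files, one cheap check before tokenizing and cleaning is that both sides have the same number of lines. The following is only a sketch on a toy file: the temporary directory, the file name aligned.xml, and the sample sentences are made up for illustration and are not taken from the JRC-Acquis data.

```shell
#!/bin/sh
# Toy stand-in for alignedCorpus_en_es.xml (contents are hypothetical).
tmp=$(mktemp -d)
cat > "$tmp/aligned.xml" <<'EOF'
<link>
<s1>the house is white</s1>
<s2>la casa es blanca</s2>
</link>
<link>
<s1>the book is new</s1>
<s2>el libro es nuevo</s2>
</link>
EOF

# Same grep patterns as in the commands above: one file per language side.
grep '^<s1>' "$tmp/aligned.xml" > "$tmp/alignment_en"
grep '^<s2>' "$tmp/aligned.xml" > "$tmp/alignment_es"

# A parallel corpus needs the same number of lines on both sides.
en_lines=$(wc -l < "$tmp/alignment_en" | tr -d ' ')
es_lines=$(wc -l < "$tmp/alignment_es" | tr -d ' ')

if [ "$en_lines" -eq "$es_lines" ]; then
    echo "OK: $en_lines lines on each side"
else
    echo "MISMATCH: en=$en_lines es=$es_lines"
fi
```

On the real data the same two wc -l counts would be taken on work/corpus/alignment_en and work/corpus/alignment_es.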

I then followed the Memory-Map LM and Phrase Table instructions to get
the binary phrase and reordering tables.  Here is my moses-bin.ini
file:

---
# input factors
[input-factors]
0

# mapping steps
[mapping]
0 T 0

# translation tables: source-factors, target-factors, number of scores, file
[ttable-file]
1 0 0 5 /home/mwade/demo/work2/model/phrase-table

# no generation models, no generation-file section

# language models: type(srilm/irstlm), factors, order, file
[lmodel-file]
1 0 3 /home/mwade/demo/work2/lm/alignment.tok.lowercase.clean.en.blm.mm


# limit on how many phrase translations e for each phrase f are loaded
# 0 = all elements loaded
[ttable-limit]
20

# distortion (reordering) files
[distortion-file]
0-0 wbe-msd-bidirectional-fe-allff 6 /home/mwade/demo/work2/model/reordering-table.wbe-msd-bidirectional-fe
---

Everything else in the file is default, but I can post the entire
thing if needed.  I do not fully understand the "1 0 0 5" in front of
the phrase table, the "1 0 3" in front of the lmodel-file, or the "0-0"
and "6" in the distortion file.  That may be part of my problem.  I ran the
sanity check and it passed.  However when I run the following command:

nohup nice tools/moses-scripts/scripts-20100503-1638/training/mert-moses.pl \
    work/corpus/alignment.lowercase.en work/corpus/alignment.lowercase.es \
    tools/moses/moses-cmd/src/moses work/model/moses.ini \
    --working-dir work/tuning/mert \
    --rootdir /home/mwade/demo/tools/moses-scripts/scripts-20100503-1638 \
    --decoder-flags "-v 0" >& work/tuning/mert.out &

The machine runs out of memory.  Here is the relevant info from mert.out:

filtering /home/mwade/demo/work/model/reordering-table.wbe-msd-bidirectional-fe -> /home/mwade/demo/work/tuning/mert/filtered/reordering-table.wbe-msd-bidirectional-fe...
Can't open 'gzip -cd /home/mwade/demo/work/model/reordering-table.wbe-msd-bidirectional-fe.gz |' at /home/mwade/demo/tools/moses-scripts/scripts-20100503-1638//training/filter-model-given-input.pl line 222.
Exit code: 12
Failed to filter the tables. at tools/moses-scripts/scripts-20100503-1638/training/mert-moses.pl line 491.


It doesn't look like it is using the binary version of the
reordering-table to me.  The file it says it cannot open does exist
and I am able to open it fine by hand.  I ran the command using the
'head -n 100' subset and it worked fine, so everything else appears to
be working properly.  I just can't get it to run all the way through
without running out of memory on the full file set.  I ran an strace
on the command that showed my memory issue:

clone(child_stack=0,
flags=CLONE_CHILD_CLEARTID|CLONE_CHILD_SETTID|SIGCHLD,
child_tidptr=0x2b7986ec3e20) = -1 ENOMEM (Cannot allocate memory)


I know this is a lot of data to go through, and I hope I provided
enough information for someone to help me out.  If anything else is
needed, please let me know.  I really need to find a solution to this,
and I am sure it is a simple fix, but I can't find it.  Thanks in
advance!

-----
Eric Parker