Hi Hieu!
I commented out line 593 of moses-parallel.pl and removed the
$tmpStartTranslationId variable from the following line. Everything
works fine now!
Thank you!
Sandra
# my $tmpStartTranslationId = "-start-translation-id $currStartTranslationId";
print OUT "$mosescmd $mosesparameters $tmpalioutfile $tmpwordgraphlist $tmpsearchgraphlist $tmpnbestlist $inputmethod ${inputfile}.$splitpfx$idx > $tmpdir/${inputfile}.$splitpfx$idx.trans\n\n";
From: Hieu Hoang [mailto:[email protected]]
Sent: Thursday, 26 January 2012 05:01
To: Noubours, Sandra
Cc: [email protected]
Subject: Re: [Moses-support] Problem tuning hierarchical models on
Cluster
Hi Sandra,
I added the -start-translation-id argument to the script
moses-parallel.pl at line 593.
However, I must admit I didn't test it on an SGE cluster and assumed it
would work. It's not important: it's only used when the phrase table
needs to be loaded per sentence, typically when using a suffix array.
This has not been fully implemented.
If you delete it and it works for you, please tell me and I'll roll it
back.
Apologies
On Wed, Jan 25, 2012 at 4:15 PM, Noubours, Sandra
<[email protected]> wrote:
Hello,
I encountered a problem when tuning hierarchical models using SunGrid.
The step TUNING_tune crashes and the error log tells me that file splits
-ab, -ac, etc. have not been entirely translated:
---------------------->
...
Split (-ab) were not entirely translated
outputN=0 inputN=84
outputfile=input.tok.1.split4926-ab.trans
inputfile=input.tok.1.split4926-ab
Split (-ac) were not entirely translated
outputN=0 inputN=84
...
Executing: qdel 717048
Exit code: 1
Translation was not performed correctly
or some of the submitted jobs died.
qdel function was called for all submitted jobs
Exit code: 1
The decoder died. CONFIG WAS -w -0.285714 -lm 0.142857 -tm 0.057143
0.057143 0.057143 0.057143 0.057143 0.285714
...
<----------------------
I saw that the first split is translated correctly (-aa) and everything
is fine but then, from the second split on, the generated files are
empty, i.e.:
---------------------->
.../tuning/tmp.1/tmp4926/run1.best100.out.split4926-ab
.../tuning/tmp.1/tmp4926/input.tok.1.split4926-ab.trans
...
<----------------------
The files generated from the first split (-aa) are ok.
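The outputN/inputN check from the error log can be repeated by hand to see which splits came back short. The following is a minimal sketch only, not code from moses-parallel.pl: the function name check_splits and its three arguments (directory with the input splits, directory with the .trans files, split prefix) are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Moses): list every input split whose
# .trans file is missing or has a different number of lines.
check_splits() {
  local input_dir="$1" trans_dir="$2" prefix="$3"
  local input trans input_n output_n status=0
  for input in "$input_dir"/"$prefix"*; do
    [ -f "$input" ] || continue
    # Skip translation outputs if they live in the same directory.
    case "$input" in *.trans) continue ;; esac
    trans="$trans_dir/$(basename "$input").trans"
    input_n=$(( $(wc -l < "$input") ))
    output_n=0
    if [ -f "$trans" ]; then
      output_n=$(( $(wc -l < "$trans") ))
    fi
    if [ "$output_n" -ne "$input_n" ]; then
      echo "$(basename "$input"): outputN=$output_n inputN=$input_n"
      status=1
    fi
  done
  return "$status"
}

# Example call with placeholder directories:
#   check_splits INPUT_DIR TRANS_DIR input.tok.1.split4926-
```

The function returns a non-zero status when at least one split is incomplete, so it can be used in a shell condition.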
The logfile just tells me that the job has been submitted, and the file
out.job.4926-ab shows me that the decoder worked fine and translated
everything:
---------------------->
Linux tyr 2.6.37.6-0.7-default #1 SMP 2011-07-21 02:17:24 +0200 x86_64
x86_64 x86_64 GNU/Linux
ulimit: Command not found.
Defined parameters (per moses.ini or switch):
config: /smt-work/tuning/moses.filtered.ini.1
cube-pruning-pop-limit: 1000
input-factors: 0
input-file: input.tok.1.split16649-aa
inputtype: 0
lmodel-file: 1 0 5 /smt-work/lm/prsde.binlm.1
mapping: 0 T 0 1 T 1
max-chart-span: 20 1000
n-best-list:
/smt-work/tuning/tmp.1/tmp16649/run1.best100.out.split16649-aa 100
non-terminals: X
search-algorithm: 3
start-translation-id: 0
ttable-file: 2 0 0 5
/smt-work/tuning/filtered.1/phrase-table.0-0.1.1.bin 6 0 0 1
/smt-work/model/glue-grammar.1
ttable-limit: 20
weight-l: 0.142857
weight-t: 0.057143 0.057143 0.057143 0.057143 0.057143
0.285714
weight-w: -0.285714
Loading lexical distortion models...have 0 models
Start loading LanguageModel /smt-work/lm/prsde.binlm.1 : [0.000] seconds
In LanguageModelIRST::Load: nGramOrder = 5
Language Model Type of /smt-work/lm/prsde.binlm.1 is 1
Qblmt
loadbin()
reading 256 centers
reading 256 centers
reading 256 centers
reading 256 centers
reading 256 centers
lmtable::loadbin_dict()
dict->size(): 260011
loadbin_level (level 1)
loading 260011 1-grams
done (level1)
loadbin_level (level 2)
loading 2194489 2-grams
done (level2)
loadbin_level (level 3)
loading 4658390 3-grams
done (level3)
loadbin_level (level 4)
loading 5850497 4-grams
done (level4)
loadbin_level (level 5)
loading 6015709 5-grams
done (level5)
done
OOV code is 260010
IRST: m_unknownId=260010
Finished loading LanguageModels : [0.000] seconds
Using uniform ttable-limit of 20 for all translation tables.
Start loading PhraseTable
/smt-work/tuning/filtered.1/phrase-table.0-0.1.1.bin : [0.000] seconds
filePath: /smt-work/tuning/filtered.1/phrase-table.0-0.1.1.bin
Start loading PhraseTable /smt-work/model/glue-grammar.1 : [0.000]
seconds
filePath: /smt-work/model/glue-grammar.1
Finished loading phrase tables : [0.000] seconds
Start loading phrase table from /smt-work/model/glue-grammar.1 : [0.000]
seconds
Start loading new format pt model : [0.000] seconds
Finished loading phrase tables : [0.000] seconds
Created input-output object : [0.000] seconds
Translating: <s> I go home </s>
0 1 2
1 20 0
19 0
1
BEST TRANSLATION: 44 S </s> :0-0 : pC=0.000, c=-0.573 [0..2] 24
[total=-1.166] <<-1.303, 0.000, -6.626, -6.675, -4.867, -3.650, -1.163,
1.000, 1.000>>
reset caches
Translation took 0.050 seconds
...
End. : [645.000] seconds
reset mmap
exit status 0
exit status 0
exit status 0
<----------------------
But as mentioned above the generated translation and n-best files are
empty.
Then I had a look at the starting bash scripts and I saw that it may
have something to do with the option "-start-translation-id" : The bash
script of split -aa is run with "-start-translation-id 0" but the
following splits are run with "-start-translation-id 84",
"-start-translation-id 168", and so on.
The job bash script then looks like this:
---------------------->
/smt/moses/dist/bin/moses_chart -w -0.285714 -lm 0.142857 -tm 0.057143
0.057143 0.057143 0.057143 0.057143 0.285714 -config
/smt-work/tuning/moses.filtered.ini.1 -inputtype 0 -start-translation-id
84-n-best-list
/smt-work/tuning/tmp.1/tmp4926/run1.best100.out.split4926-ab 100
-input-file /smt-work/tuning/tmp.1/input.tok.1.split4926-ab >
/smt-work/tuning/tmp.1/tmp4926/input.tok.1.split4926-ab.trans
<----------------------
When I run exactly the same script changing "-start-translation-id 84"
to "-start-translation-id 0" everything works fine and the files are
generated.
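The manual test described above (same job script, start ID set to 0) can be reproduced without editing the file by hand. This is a small sketch under stated assumptions: the function name reset_start_id is hypothetical, and it assumes the option and its numeric value sit on the same line of the script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a job script on stdin and replace the numeric
# value after -start-translation-id with 0. Only the first occurrence on
# each line is rewritten.
reset_start_id() {
  sed -E 's/-start-translation-id[[:space:]]+[0-9]+/-start-translation-id 0/'
}

# Example call with placeholder file names:
#   reset_start_id < JOB_SCRIPT > JOB_SCRIPT_ID0
```

Running the rewritten script next to the original one makes it easy to confirm whether the start ID alone decides if the output files are filled.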
I thought about deleting the option "-start-translation-id" but I fear
that it might be important for the overall tuning on the cluster (when
the corpus file is split and then processed in parts). So maybe
something is broken in "moses_chart" concerning parallel processing,
or maybe I made an error when compiling? (When I run experiments for
normal phrase models calling "moses" instead of "moses_chart" and
without using "-hierarchical" everything works fine.)
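The start IDs seen in the job scripts (0, 84, 168, ...) are consistent with a running total of the sentences in the preceding splits, each split holding 84 sentences here. The sketch below only illustrates that arithmetic. It is an assumption about how the values arise, not the code from moses-parallel.pl, and the function name start_ids is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical illustration: given the number of sentences in each split,
# print the start ID each split would receive if the ID is the running
# total of sentences in all earlier splits.
start_ids() {
  local offset=0 n
  for n in "$@"; do
    echo "$offset"
    offset=$(( offset + n ))
  done
}

# Three splits of 84 sentences each: prints 0, 84 and 168 on separate lines.
start_ids 84 84 84
```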
Thanks for your help in advance!
Sandra
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support