Re: [Moses-support] Problem tuning hierarchical models on Cluster

Hieu Hoang Wed, 25 Jan 2012 20:02:45 -0800

Hi Sandra,

I added the -start-translation-id argument to the script
moses-parallel.pl (line 593).

However, I must admit I didn't test it on an SGE cluster and assumed it
would work. The argument isn't important: it's only used when the phrase
table needs to be loaded per sentence, typically when using a suffix array,
and that has not been fully implemented.

If you delete it and it works for you, please tell me and I'll roll back
the change.
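
The values in the job scripts quoted below (0, 84, 168 for splits of 84
sentences) are consistent with each split receiving the id of its first
sentence within the whole input. A minimal sketch of that arithmetic, in
Python for illustration only: the real script is Perl, the function name is
hypothetical, and zero-based ids are an assumption taken from the log.

```python
def start_translation_ids(split_sizes):
    """Return the id of the first sentence of each split.

    Assumption: ids are zero-based and each split continues
    counting where the previous split stopped.
    """
    offsets = []
    total = 0
    for size in split_sizes:
        offsets.append(total)  # first sentence id of this split
        total += size          # sentences consumed so far
    return offsets


# Three splits of 84 sentences each, as in the quoted log.
print(start_translation_ids([84, 84, 84]))  # [0, 84, 168]
```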

apologies

On Wed, Jan 25, 2012 at 4:15 PM, Noubours, Sandra <
[email protected]> wrote:

> Hello,
>
> I encountered a problem when tuning hierarchical models using SunGrid.
>
> The step TUNING_tune crashes and the error log tells me that file splits
> -ab, -ac, etc. have not been entirely translated:
>
> ---------------------->
> …
> Split (-ab) were not entirely translated
> outputN=0 inputN=84
> outputfile=input.tok.1.split4926-ab.trans
> inputfile=input.tok.1.split4926-ab
> Split (-ac) were not entirely translated
> outputN=0 inputN=84
> …
> Executing: qdel 717048
> Exit code: 1
> Translation was not performed correctly
> or some of the submitted jobs died.
> qdel function was called for all submitted jobs
> Exit code: 1
> The decoder died. CONFIG WAS -w -0.285714 -lm 0.142857 -tm 0.057143
> 0.057143 0.057143 0.057143 0.057143 0.285714
> …
> <----------------------
>
> I saw that the first split (-aa) is translated correctly and everything is
> fine, but from the second split on the generated files are empty,
> i.e.:
>
> ---------------------->
> …
> …/tuning/tmp.1/tmp4926/run1.best100.out.split4926-ab
> …/tuning/tmp.1/tmp4926/input.tok.1.split4926-ab.trans
> …
> <----------------------
>
> The files generated from the first split (-aa) are ok.
>
> The logfile just tells me that the job has been submitted, and the file
> out.job.4926-ab shows me that the decoder worked fine and translated
> everything:
>
> ---------------------->
> Linux tyr 2.6.37.6-0.7-default #1 SMP 2011-07-21 02:17:24 +0200 x86_64
> x86_64 x86_64 GNU/Linux
> ulimit: Command not found.
> Defined parameters (per moses.ini or switch):
>                 config: /smt-work/tuning/moses.filtered.ini.1
>                 cube-pruning-pop-limit: 1000
>                 input-factors: 0
>                 input-file: input.tok.1.split16649-aa
>                 inputtype: 0
>                 lmodel-file: 1 0 5 /smt-work/lm/prsde.binlm.1
>                 mapping: 0 T 0 1 T 1
>                 max-chart-span: 20 1000
>                 n-best-list:
> /smt-work/tuning/tmp.1/tmp16649/run1.best100.out.split16649-aa 100
>                 non-terminals: X
>                 search-algorithm: 3
>                 start-translation-id: 0
>                 ttable-file: 2 0 0 5
> /smt-work/tuning/filtered.1/phrase-table.0-0.1.1.bin 6 0 0 1
> /smt-work/model/glue-grammar.1
>                 ttable-limit: 20
>                 weight-l: 0.142857
>                 weight-t: 0.057143 0.057143 0.057143 0.057143 0.057143
> 0.285714
>                 weight-w: -0.285714
> Loading lexical distortion models...have 0 models
> Start loading LanguageModel /smt-work/lm/prsde.binlm.1 : [0.000] seconds
> In LanguageModelIRST::Load: nGramOrder = 5
> Language Model Type of /smt-work/lm/prsde.binlm.1 is 1
> Qblmt
> loadbin()
> reading  256 centers
> reading  256 centers
> reading  256 centers
> reading  256 centers
> reading  256 centers
> lmtable::loadbin_dict()
> dict->size(): 260011
> loadbin_level (level 1)
> loading 260011 1-grams
> done (level1)
> loadbin_level (level 2)
> loading 2194489 2-grams
> done (level2)
> loadbin_level (level 3)
> loading 4658390 3-grams
> done (level3)
> loadbin_level (level 4)
> loading 5850497 4-grams
> done (level4)
> loadbin_level (level 5)
> loading 6015709 5-grams
> done (level5)
> done
> OOV code is 260010
> IRST: m_unknownId=260010
> Finished loading LanguageModels : [0.000] seconds
> Using uniform ttable-limit of 20 for all translation tables.
> Start loading PhraseTable
> /smt-work/tuning/filtered.1/phrase-table.0-0.1.1.bin : [0.000] seconds
> filePath: /smt-work/tuning/filtered.1/phrase-table.0-0.1.1.bin
> Start loading PhraseTable /smt-work/model/glue-grammar.1 : [0.000] seconds
> filePath: /smt-work/model/glue-grammar.1
> Finished loading phrase tables : [0.000] seconds
> Start loading phrase table from /smt-work/model/glue-grammar.1 : [0.000]
> seconds
> Start loading new format pt model : [0.000] seconds
> Finished loading phrase tables : [0.000] seconds
> Created input-output object : [0.000] seconds
> Translating: <s> I go home </s>
>
>   0   1   2
>   1  20   0
>    19   0
>       1
>
> BEST TRANSLATION: 44 S </s> :0-0 : pC=0.000, c=-0.573 [0..2] 24
> [total=-1.166] <<-1.303, 0.000, -6.626, -6.675, -4.867, -3.650, -1.163,
> 1.000, 1.000>>
> reset caches
> Translation took 0.050 seconds
> ...
> End. : [645.000] seconds
> reset mmap
> exit status 0
> exit status 0
> exit status 0
> <----------------------
>
> But as mentioned above, the generated translation and n-best files are
> empty.
>
> Then I had a look at the starting bash scripts and saw that it may have
> something to do with the option "-start-translation-id": the bash script
> of split -aa is run with "-start-translation-id 0", but the following splits
> are run with "-start-translation-id 84", "-start-translation-id 168", and
> so on.
>
> The job bash script then looks like this:
>
> ---------------------->
> /smt/moses/dist/bin/moses_chart  -w -0.285714 -lm 0.142857 -tm 0.057143
> 0.057143 0.057143 0.057143 0.057143 0.285714 -config
> /smt-work/tuning/moses.filtered.ini.1 -inputtype 0 -start-translation-id
> 84-n-best-list /smt-work/tuning/tmp.1/tmp4926/run1.best100.out.split4926-ab
> 100 -input-file /smt-work/tuning/tmp.1/input.tok.1.split4926-ab >
> /smt-work/tuning/tmp.1/tmp4926/input.tok.1.split4926-ab.trans
> <----------------------
>
> When I run exactly the same script, changing "-start-translation-id 84" to
> "-start-translation-id 0", everything works fine and the files are
> generated.
>
> I thought about deleting the option "-start-translation-id", but I fear
> that it might be important for the overall tuning on the cluster (when the
> corpus file is split and then processed in parts). So maybe something is
> broken in "moses_chart" concerning parallel processing, or maybe I made
> an error when compiling? (When I run experiments for normal phrase models,
> calling "moses" instead of "moses_chart" and without using "--hierarchical",
> everything works fine.)
>
> Thanks for your help in advance!
>
> Sandra
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>

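One detail in the quoted job script may be worth checking: as wrapped in the
mail, the value and the following switch read "84-n-best-list", with no space
between them. If the generated script really contains that (and it is not
just a line-wrapping artifact of the mail), the shell would pass it to the
decoder as a single token. A minimal sketch of a check, in Python for
illustration; the function name and the regular expression are hypothetical
and not part of Moses, and the file name in the example is shortened.

```python
import re
import shlex

# A number immediately followed by "-" and a letter, e.g. "84-n-best-list".
FUSED = re.compile(r"^\d+-[A-Za-z]")


def fused_tokens(command):
    """Return tokens where a numeric value runs into the next switch."""
    return [tok for tok in shlex.split(command) if FUSED.match(tok)]


# Abbreviated form of the quoted command line (paths shortened).
cmd = ("moses_chart -inputtype 0 -start-translation-id 84-n-best-list "
       "run1.best100.out.split4926-ab 100")
print(fused_tokens(cmd))  # ['84-n-best-list']
```

An empty result means every numeric value is followed by whitespace before
the next switch.
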