Hi Guchun

The mert.out file doesn't help much. Is there any more information in the
err and out files? For example:
/home/guchun/Work/tasks/ro-en/tuning-sge/out.job12017-aa
/home/guchun/Work/tasks/ro-en/tuning-sge/err.job12017-aa
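
A quick way to look at them all at once, as a rough sketch only: the function name show_job_logs is made up for illustration, and it assumes the per-job logs in the working directory follow the out.job*/err.job* naming shown above.

```shell
# Sketch: print the tail of every per-job log found in a working directory.
# Assumption: logs are named out.job* and err.job*, as in the paths above.
show_job_logs() {
    dir=$1
    for f in "$dir"/out.job* "$dir"/err.job*; do
        # Skip patterns that matched nothing (the glob stays literal).
        [ -e "$f" ] || continue
        echo "== $f =="
        tail -n 20 "$f"
    done
}

# Example call, using the working directory from this thread:
# show_job_logs /home/guchun/Work/tasks/ro-en/tuning-sge
```

The err files are usually the more interesting ones, since that is where the decoder's own error messages would end up if it failed to start on the compute node.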

cheers - Barry

On Tuesday 15 Nov 2011 22:01:41 Guchun Zhang wrote:
> Hi there,
> 
> I am trying to tune on an SGE cluster. I ran the following command on the
> head node:
> 
> /home/guchun/Work/moses-scripts/scripts-20111111-1703/training/mert-moses.pl \
> /home/guchun/Work/tasks/ro-en/corpus/euparl.lc.ro \
> /home/guchun/Work/tasks/ro-en/corpus/euparl.lc.en \
> /home/guchun/Work/mosesdecoder/moses-cmd/src/moses \
> /home/guchun/Work/tasks/ro-en/trained/model/moses.ini \
> --mertdir /home/guchun/Work/mosesdecoder/mert/ \
> --rootdir /home/guchun/Work/moses-scripts/scripts-20111111-1703/ \
> --working-dir /home/guchun/Work/tasks/ro-en/tuning-sge/ \
> --jobs 2 --decoder-flag "-v 0" >& /home/guchun/Work/tasks/ro-en/tuning-sge/mert.out &
> 
> I got the following error,
> 
> check_exit_status
> check_exit_status of job -aa
> check_exit_status of job -ab
> *wc: euparl.lc.ro.split12017-aa.trans: No such file or directory*
> *Split (-aa) were not entirely translated*
> outputN= inputN=11966
> outputfile=euparl.lc.ro.split12017-aa.trans
> inputfile=euparl.lc.ro.split12017-aa
> *Split (-ab) were not entirely translated*
> outputN=0 inputN=11966
> outputfile=euparl.lc.ro.split12017-ab.trans
> inputfile=euparl.lc.ro.split12017-ab
> *everything crashed, not trying to resubmit jobs*
> *Got interrupt or something failed.*
> kill_all_and_quit
> qdel 56
> Executing: qdel 56
> Exit code: 1
> qdel 57
> Executing: qdel 57
> Exit code: 1
> Translation was not performed correctly
> or some of the submitted jobs died.
> qdel function was called for all submitted jobs
> Exit code: 1
> The decoder died. CONFIG WAS -w -0.322581 -lm 0.161290 -d 0.193548 -tm
> 0.064516 0.064516 0.064516 0.064516 0.064516
> 
> Any clue what may cause the problem? I have also attached the output file
> (mert.out) for full inspection.
> 
> Everything runs fine in serial execution (without --jobs 2).
> 
> I wonder whether this could be due to my SGE configuration. If possible,
> could you please also give some advice on how to configure the SGE
> parameters?
> 
> Many thanks in advance,
> 
> Guchun
> 
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
