Hi Barry,

In out.job12017-aa,

Linux bunix-server 2.6.35-30-generic #60-Ubuntu SMP Mon Sep 19 20:45:08 UTC 2011 i686 GNU/Linux
ulimit: Command not found.
/home/guchun/Work/mosesdecoder/moses-cmd/src/moses: Exec format error.
Wrong Architecture.
Newline in variable name.

bunix-server is the hostname of the execution node. The errors in
out.job12017-ab (which ran on another node) are similar.
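
The "Exec format error" and "Wrong Architecture" messages on an i686 node are
what one would expect if the moses binary had been built for a different
architecture (for example, x86_64 on the head node). That is an assumption, not
something the logs above confirm. A minimal sketch of how it could be checked
on each execution node is below. The helper names binary_bits, machine_bits and
check_arch are hypothetical; only `file -b` and `uname -m` are standard
commands.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Moses or SGE) for comparing the word size
# of a binary with the word size of the node that is asked to run it.

# Classify a description string as printed by `file -b <binary>`.
binary_bits() {
    case "$1" in
        *"ELF 64-bit"*) echo 64 ;;
        *"ELF 32-bit"*) echo 32 ;;
        *) echo unknown ;;
    esac
}

# Classify a machine name as printed by `uname -m`.
machine_bits() {
    case "$1" in
        x86_64|amd64) echo 64 ;;
        i?86) echo 32 ;;
        *) echo unknown ;;
    esac
}

# Print "mismatch" when the binary is wider than the node (a 64-bit binary
# on a 32-bit node), "ok" otherwise, and "unknown" if either input is not
# recognised. A 32-bit binary on a 64-bit node is reported as "ok" here,
# which assumes the 32-bit runtime libraries are installed on that node.
check_arch() {
    b=$(binary_bits "$1")
    m=$(machine_bits "$2")
    if [ "$b" = unknown ] || [ "$m" = unknown ]; then
        echo unknown
    elif [ "$b" -gt "$m" ]; then
        echo mismatch
    else
        echo ok
    fi
}

# Example invocation, to be run on the execution node itself:
#   check_arch "$(file -b /home/guchun/Work/mosesdecoder/moses-cmd/src/moses)" "$(uname -m)"
```

If this prints "mismatch" on bunix-server, rebuilding moses on (or for) the
execution nodes would be the thing to try first.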

Cheers,

Guchun

On 16 November 2011 09:21, Barry Haddow <[email protected]> wrote:

> Hi Guchun
>
> The mert.out file doesn't help that much. Is there any more information in
> the err and out files? E.g.
> /home/guchun/Work/tasks/ro-en/tuning-sge/out.job12017-aa
> /home/guchun/Work/tasks/ro-en/tuning-sge/err.job12017-aa
>
> cheers - Barry
>
> On Tuesday 15 Nov 2011 22:01:41 Guchun Zhang wrote:
> > Hi there,
> >
> > I am trying to tune on an SGE cluster. I ran the following command on the
> > head node,
> >
> >
> > /home/guchun/Work/moses-scripts/scripts-20111111-1703/training/mert-moses.pl \
> > /home/guchun/Work/tasks/ro-en/corpus/euparl.lc.ro \
> > /home/guchun/Work/tasks/ro-en/corpus/euparl.lc.en \
> > /home/guchun/Work/mosesdecoder/moses-cmd/src/moses \
> > /home/guchun/Work/tasks/ro-en/trained/model/moses.ini \
> > --mertdir /home/guchun/Work/mosesdecoder/mert/ \
> > --rootdir /home/guchun/Work/moses-scripts/scripts-20111111-1703/ \
> > --working-dir /home/guchun/Work/tasks/ro-en/tuning-sge/ \
> > --jobs 2 --decoder-flag "-v 0" >&
> > /home/guchun/Work/tasks/ro-en/tuning-sge/mert.out &
> >
> > I got the following error,
> >
> > check_exit_status
> > check_exit_status of job -aa
> > check_exit_status of job -ab
> > *wc: euparl.lc.ro.split12017-aa.trans: No such file or directory*
> > *Split (-aa) were not entirely translated*
> > outputN= inputN=11966
> > outputfile=euparl.lc.ro.split12017-aa.trans
> > inputfile=euparl.lc.ro.split12017-aa
> > *Split (-ab) were not entirely translated*
> > outputN=0 inputN=11966
> > outputfile=euparl.lc.ro.split12017-ab.trans
> > inputfile=euparl.lc.ro.split12017-ab
> > *everything crashed, not trying to resubmit jobs*
> > *Got interrupt or something failed.*
> > kill_all_and_quit
> > qdel 56
> > Executing: qdel 56
> > Exit code: 1
> > qdel 57
> > Executing: qdel 57
> > Exit code: 1
> > Translation was not performed correctly
> > or some of the submitted jobs died.
> > qdel function was called for all submitted jobs
> > Exit code: 1
> > The decoder died. CONFIG WAS -w -0.322581 -lm 0.161290 -d 0.193548 -tm
> > 0.064516 0.064516 0.064516 0.064516 0.064516
> >
> > Any clue what may cause the problem? I have also attached the output file
> > (mert.out) for full inspection.
> >
> > Everything runs fine in serial execution (without --jobs 2).
> >
> > I wonder whether this can be attributed to my SGE configuration. If
> > possible, could you please also give some advice on configuring the SGE
> > parameters?
> >
> > Many thanks in advance,
> >
> > Guchun
> >
>
>


-- 

*Guchun Zhang*

Localization Engineer
Alpha CRC Ltd | Cambridge, UK
Direct: +44 1223 431035
[email protected] <[email protected]>
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
