Hi Lane,

According to your qacct output, your job was running for over 2 days and hit 25G of memory. Could the system have killed the job for exceeding some resource limit?
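On the exit code 137 in your log: shells such as bash and dash report a child that died from a signal as 128 plus the signal number, so 137 corresponds to signal 9 (SIGKILL). Here is a minimal sketch that reproduces this, assuming a POSIX-style sh; the self-killing child is only a stand-in for the decoder and has nothing to do with moses or SGE themselves:

```shell
#!/bin/sh
# Start a child shell that sends SIGKILL (signal 9) to itself,
# standing in for a process that gets killed from outside.
sh -c 'kill -9 $$'
status=$?

# bash and dash report "killed by signal N" as 128 + N.
echo "exit status: $status"

# Recover the signal number from the exit status.
if [ "$status" -gt 128 ]; then
    echo "signal: $((status - 128))"
fi
```

Depending on the shell, the parent may also print its own "Killed" notice on stderr when the child dies this way.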
I think it's the shell that prints 'sh: line 1: 29188 Killed', and the 29188 is the pid.

cheers - Barry

On Friday 17 Feb 2012 19:41:35 Lane Schwartz wrote:
> Hi all,
>
> A number of my jobs keep dying during MERT, and I'm having trouble
> tracking down what's going on. I submit all of my jobs using SGE, so
> it's possible there's an interaction there.
>
> Can anyone help me understand what's going on below:
>
>
> sh: line 1: 29188 Killed
> /free/lane/slm-merging-trunk/moses-cmd/src/moses -config
> /scratch4/lane/2011-12-15_europarl/config/de-en/filtered/filtered.ttable20.
> dist05.synlm50.ini -inputtype 0 -w -0.178571 -slm 0.178571 -lm 0.089286 -d
> 0.053571
> 0.053571 0.053571 0.053571 0.053571 0.053571 0.053571 -tm 0.035714
> 0.035714 0.035714 0.035714 0.035714 -n-best-list run1.best100.out 100
> -input-file /scratch4/lane/2011-12-15_europarl/corpus/dev.tok.norm.de
> > > run1.out
>
> Exit code: 137
> The decoder died. CONFIG WAS -w -0.178571 -slm 0.178571 -lm 0.089286
> -d 0.053571 0.053571 0.053571 0.053571 0.053571 0.053571 0.053571 -tm
> 0.035714 0.035714 0.035714 0.035714 0.035714
>
>
>
> I've searched for the meaning of exit code 137, and what I've read
> says that's the exit code for a process that received kill signal 9.
>
> I'm especially puzzled by "sh: line 1: 29188 Killed".
>
> I'm pretty sure that the safesystem function in the moses-mert.pl
> script is printing "Exit code: 137", and I'm assuming that the moses
> command itself is being launched by the "system(@_)" command within
> that same safesystem function. But I don't know what is responsible
> for printing "sh: line 1: 29188 Killed", or what "line 1" and "29188"
> refer to.
>
> For what it's worth, I'm attaching the results of running qacct -j on
> the job after it died. I don't think it is relevant, but I guess it
> could be.
>
> Thanks,
> Lane
