Re: [Moses-support] Use of qsub array in moses-parallel.pl

Suzy Howlett Wed, 22 Dec 2010 14:20:47 -0800

Hi Lane,

Sorry it's taken me a while to get back to you. I've attempted to run an EMS experiment with your moses-parallel.pl, and (somewhat unsurprisingly) it failed.


For reference, my tuning/tmp.1 directory contains:
WR18184.W.log
filtered/
filterphrases.err
filterphrases.out
input.lc.1.split19899-aa    to    input.lc.1.split19899-aj
input.lc.1.split19899.trans    (empty)
job19899.bash
job19899.bash.e14659-1 to job19899.bash.e14659-10
job19899.log
job19899.sync_workaround_script.sh
mert1.W.log   (empty)
run1.moses.ini
run1.out    (empty)
tmp19899/   (empty)

I've attached a copy of job19899.bash and job19899.bash.e14659-1.

First problem: I don't have an environment variable $SGE_TASK_ID. So, it's always equal to "", and ${idxarray[$SGE_TASK_ID]} is also "".


I'm also not sure whether I have a $TASK_ID. Is it an SGE variable?
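A minimal sketch of how the job script might pick up the array index on either scheduler. It rests on two assumptions: that SGE exports SGE_TASK_ID (set to "undefined" for non-array jobs), and that Torque exports PBS_ARRAYID for jobs submitted with qsub -t. As far as I know, $TASK_ID is not an environment variable at all: SGE only substitutes it inside the -o and -e paths, and the "#$" directive lines themselves are SGE-specific. The function name resolve_task_id is made up for illustration and is not part of moses-parallel.pl.

```shell
#!/bin/bash
# Sketch only: resolve the array task index under SGE or Torque.
# Assumption: SGE sets SGE_TASK_ID, Torque sets PBS_ARRAYID (qsub -t).
# resolve_task_id is a hypothetical helper, not part of moses-parallel.pl.
resolve_task_id() {
    local id="${SGE_TASK_ID:-}"

    # SGE reports "undefined" when the job is not an array job
    if [ -z "$id" ] || [ "$id" = "undefined" ]; then
        id="${PBS_ARRAYID:-}"
    fi

    if [ -z "$id" ]; then
        echo "Job was not submitted as an array job" >&2
        return 1
    fi

    echo "$id"
}
```

In the job script this could be used as task_id=$(resolve_task_id) || exit 1, with ${idxarray[$task_id]} in place of ${idxarray[$SGE_TASK_ID]}.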

Second problem: This if statement:
if [ "" == "$SGE_TASK_ID" ]; then
        echo "Job was not submitted as an array job
"       exit 1
fi

For me, this prints "exit 1" instead of executing it. I assume that's not what's supposed to happen?
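The reason "exit 1" is printed is that the closing quote sits on the following line, so the newline becomes part of the echoed string and "exit 1" is passed to echo as extra arguments. Presumably a newline ended up inside the string when the script was generated, but that is a guess. A sketch of the guard with the quote closed on the echo line follows; it is wrapped in a function (hypothetical name require_array_job) and uses return so it can be tried interactively, whereas the job script itself would use exit 1.

```shell
#!/bin/bash
# Sketch: the same guard with the closing quote on the echo line, so the
# failure status is produced by a separate command instead of being echoed.
# require_array_job is a hypothetical name; $1 is the task index to check.
require_array_job() {
    if [ "" == "$1" ]; then
        echo "Job was not submitted as an array job" >&2
        return 1    # the job script would use "exit 1" here
    fi
}
```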

Third: the output job19899.bash.e14659-1 shows it didn't like the Moses command in job19899.bash. Apart from the input file not existing (because the affix from idxarray didn't work), can anyone spot what's wrong? I can't figure it out.


Suzy


On 18/12/10 1:55 AM, Lane Schwartz wrote:

I have a modified version of moses-parallel.pl that uses the qsub -t flag to submit child
jobs as array jobs. I've verified that I get identical results using the
modified version and the current version from trunk.
Before I check this in, I would appreciate it if others could do a small
test run to verify that the modified version works the same on their
systems. Suzy, I'm especially interested in your feedback, since you're
running Torque instead of SGE.
From your perspective as a user, there is no change in how you call
moses-parallel.pl. The changes that you
should expect to see are:
* When you look at your child jobs using qstat or qmon, they will all
share the same job-ID, but will each have a unique ja-task-ID
* Child jobs will all show up with the name MOSES, instead of MOSES-aa,
MOSES-ab, etc. I tried to find a way to maintain the old naming format,
but AFAIK there's no way to do that with array job submission
* The temporary out.job* and err.job* files created during the run will
end with numeric suffixes (corresponding to the child ja-task-ID)
instead of the current alphabetic (-aa, -ab, -ac,...) suffixes. Again,
I tried but was unable to maintain the old naming scheme.
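The correspondence between a numeric ja-task-ID and the old alphabetic suffix can be sketched as below. The attached job script does this with an explicit idxarray table; this version computes the suffix instead. It assumes two-letter split suffixes (-aa, -ab, ...) and task IDs in the range 1 to 676, and the function name suffix_for_task is invented for illustration.

```shell
#!/bin/bash
# Sketch: map a 1-based array task ID to a two-letter split suffix.
# Assumption: suffixes run -aa, -ab, ..., -az, -ba, ... (IDs 1..676).
# suffix_for_task is a hypothetical helper, not part of moses-parallel.pl.
suffix_for_task() {
    local n=$(( $1 - 1 ))
    local letters=abcdefghijklmnopqrstuvwxyz

    # First letter advances every 26 tasks, second letter cycles a..z
    printf -- '-%s%s\n' "${letters:$(( n / 26 )):1}" "${letters:$(( n % 26 )):1}"
}
```

For example, suffix_for_task 1 gives -aa and suffix_for_task 10 gives -aj, matching idxarray[1] and idxarray[10] in the generated job script.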
Thanks,
Lane

On Tue, Dec 14, 2010 at 4:07 PM, Lane Schwartz <[email protected]> wrote:

    I was wondering if any consideration has been given to using qsub's
    job array functionality in moses-parallel.pl.
    Using the qsub -t flag, jobs can be tied together, so that if the
    parent job is killed via qdel, all of the children are also killed.
    Currently, if a parallel job needs to be killed, the children must
    be manually deleted. This is OK if you only have one parallel job
    running, but if you have many, and you haven't overridden the
    default job name, things become hairier.
    I would potentially be willing to make the change, but I wanted to
    hear people's thoughts on the matter first.
    Cheers,
    Lane
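A rough sketch of what an array submission looks like, for readers who have not used qsub -t. The helper only builds and prints the command so it can be inspected without a scheduler. The name array_submit_cmd and the script name in the usage note are placeholders, and the 1-N range form is an assumption about how the child jobs would be numbered.

```shell
#!/bin/bash
# Sketch: build (but do not run) a qsub array submission for N child jobs.
# array_submit_cmd is a hypothetical helper; it only prints the command.
array_submit_cmd() {
    local n="$1" script="$2"
    echo "qsub -t 1-${n} ${script}"
}
```

For instance, array_submit_cmd 10 job19899.bash prints "qsub -t 1-10 job19899.bash". All ten tasks then share one job ID, which is what lets a single qdel remove them; the exact form in which qdel addresses an array differs between schedulers, so check the local man page.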




--
When a place gets crowded enough to require ID's, social collapse is not
far away.  It is time to go elsewhere.  The best thing about space travel
is that it made it possible to go elsewhere.
                 -- R.A. Heinlein, "Time Enough For Love"


--
Suzy Howlett
http://www.showlett.id.au/

#!/bin/bash

#$ -S /bin/bash
#$ -o /home/showlett/working/reordered-wmt09/tuning/tmp.1/out.job19899.$TASK_ID
#$ -e /home/showlett/working/reordered-wmt09/tuning/tmp.1/err.job19899.$TASK_ID
#$ -N mert1

if [ "" == "$SGE_TASK_ID" ]; then
        echo "Job was not submitted as an array job
"       exit 1
fi

uname -a

ulimit -c 0

cd /home/showlett/working/reordered-wmt09/tuning/tmp.1

idxarray[1]=-aa
idxarray[2]=-ab
idxarray[3]=-ac
idxarray[4]=-ad
idxarray[5]=-ae
idxarray[6]=-af
idxarray[7]=-ag
idxarray[8]=-ah
idxarray[9]=-ai
idxarray[10]=-aj

/usr/local/moses/moses-cmd/src/moses  -w -0.217391 -lm 0.108696 -d 0.065217 0.065217 0.065217 0.065217 0.065217 0.065217 0.065217 -tm 0.043478 0.043478 0.043478 0.043478 0.043478  -config filtered/moses.ini -inputtype 0   -n-best-list /home/showlett/working/reordered-wmt09/tuning/tmp.1/tmp19899/run1.best100.out.split19899${idxarray[$SGE_TASK_ID]} 100 -input-file input.lc.1.split19899${idxarray[$SGE_TASK_ID]} > /home/showlett/working/reordered-wmt09/tuning/tmp.1/tmp19899/input.lc.1.split19899${idxarray[$SGE_TASK_ID]}.trans

echo exit status $?

\mv -f /home/showlett/working/reordered-wmt09/tuning/tmp.1/tmp19899/run1.best100.out.split19899${idxarray[$SGE_TASK_ID]} .

echo exit status $?

\mv -f /home/showlett/working/reordered-wmt09/tuning/tmp.1/tmp19899/input.lc.1.split19899${idxarray[$SGE_TASK_ID]}.trans .

echo exit status $?

Job was not submitted as an array job
 exit 1
Linux n002 2.6.31.6-2.caos #1 SMP Tue Nov 17 11:46:26 PST 2009 x86_64 x86_64 
x86_64 GNU/Linux
Defined parameters (per moses.ini or switch):
        config: filtered/moses.ini 
        distortion-file: 0-0 wbe-msd-bidirectional-fe-allff 6 
/home/showlett/working/reordered-wmt09/tuning/tmp.1/filtered/reordering-table.1.wbe-msd-bidirectional-fe
 
        distortion-limit: 6 
        input-factors: 0 
        input-file: input.lc.1.split19899 
        inputtype: 0 
        lmodel-file: 0 0 3 
/home/showlett/working/baseline-wmt09/lm/europarl.lm.1 
        mapping: 0 T 0 
        n-best-list: 
/home/showlett/working/reordered-wmt09/tuning/tmp.1/tmp19899/run1.best100.out.split19899
 100 
        ttable-file: 1 0 0 5 
/home/showlett/working/reordered-wmt09/tuning/tmp.1/filtered/phrase-table.0-0.1.1
 
        ttable-limit: 20 
        weight-d: 0.065217 0.065217 0.065217 0.065217 0.065217 0.065217 
0.065217 
        weight-l: 0.108696 
        weight-t: 0.043478 0.043478 0.043478 0.043478 0.043478 
        weight-w: -0.217391 
Usage:
        -beam-threshold (b): threshold for threshold pruning
        -cache-path: ?
        -clean-lm-cache: clean language model caches after N translations 
(default N=1)
        -config (f): location of the configuration file
        -consensus-decoding (con): use consensus decoding (De Nero et. al. 2009)
        -constraint: Location of the file with target sentences to produce 
constraining the search
        -continue-partial-translation (cpt): start from nonempty hypothesis
        -cube-pruning-diversity (cbd): How many hypotheses should be created 
for each coverage. (default = 0)
        -cube-pruning-pop-limit (cbp): How many hypotheses should be popped for 
each stack. (default = 1000)
        -description: Source language, target language, description
        -disable-discarding (dd): disable hypothesis discarding
        -distortion: configurations for each factorized/lexicalized reordering 
model.
        -distortion-file: source factors (0 if table independent of source), 
target factors, location of the factorized/lexicalized reordering tables
        -distortion-limit (dl): distortion (reordering) limit in maximum number 
of words (0 = monotone, -1 = unlimited)
        -drop-unknown (du): drop unknown words instead of copying them
        -early-discarding-threshold (edt): threshold for constructing 
hypotheses based on estimate cost
        -factor-delimiter (fd): specify a different factor delimiter than the 
default
        -generation-file: location and properties of the generation table
        -global-lexical-file (gl): discriminatively trained global lexical 
translation model file
        -include-alignment-in-n-best: include word alignment in the n-best 
list. default is false
        -input-factors: list of factors in the input
        -input-file (i): location of the input file to be translated
        -inputtype: text (0), confusion network (1), word lattice (2) (default 
= 0)
        -labeled-n-best-list: print out labels for each weight type in n-best 
list. default is true
        -lattice-hypo-set: to use lattice as hypo set during lattice MBR
        -link-param-count: Number of parameters on word links when using 
confusion networks or lattices (default = 1)
        -lmbr-map-weight: weight given to map solution when doing lattice MBR 
(default 0)
        -lmbr-p: unigram precision value for lattice mbr
        -lmbr-pruning-factor: average number of nodes/word wanted in pruned 
lattice
        -lmbr-r: ngram precision decay value for lattice mbr
        -lmbr-thetas: theta(s) for lattice mbr calculation
        -lminimum-bayes-risk (lmbr): use lattice miminum Bayes risk to 
determine best translation
        -lmodel-dub: dictionary upper bounds of language models
        -lmodel-file: location and properties of the language models
        -mapping: description of decoding steps
        -max-chart-span: maximum num. of source word chart rules can consume 
(default 10)
        -max-partial-trans-opt: maximum number of partial translation options 
per input span (during mapping steps)
        -max-phrase-length: maximum phrase length (default 20)
        -max-trans-opt-per-coverage: maximum number of translation options per 
input span (after applying mapping steps)
        -mbr-scale: scaling factor to convert log linear score probability in 
MBR decoding (default 1.0)
        -mbr-size: number of translation candidates considered in MBR decoding 
(default 200)
        -minimum-bayes-risk (mbr): use miminum Bayes risk to determine best 
translation
        -monotone-at-punctuation (mp): do not reorder over punctuation
        -n-best-factor: factor to compute the maximum number of contenders 
(=factor*nbest-size). value 0 means infinity, i.e. no threshold. default is 0
        -n-best-list: file and size of n-best-list to be generated; specify - 
as the file in order to write to STDOUT
        -non-terminals: list of non-term symbols, space separated
        -output-factors: list if factors in the output
        -output-hypo-score: Output the hypo score to stdout with the output 
string. For search error analysis. Default is false
        -output-search-graph (osg): Output connected hypotheses of search into 
specified filename
        -output-search-graph-extended (osgx): Output connected hypotheses of 
search into specified filename, in extended format
        -output-word-graph (owg): Output stack info as word graph. Takes 
filename, 0=only hypos in stack, 1=stack + nbest hypos
        -persistent-cache-size: maximum size of cache for translation options 
(default 10,000 input phrases)
        -phrase-drop-allowed (da): if present, allow dropping of source words
        -print-alignment-info: Output word-to-word alignment into the log file. 
Word-to-word alignments are takne from the phrase table if any. Default is false
        -print-alignment-info-in-n-best: Include word-to-word alignment in the 
n-best list. Word-to-word alignments are takne from the phrase table if any. 
Default is false
        -print-all-derivations: to print all derivations in search graph
        -recover-input-path (r): (conf net/word lattice only) - recover input 
path corresponding to the best translation
        -report-all-factors: report all factors in output, not just first
        -report-all-factors-in-n-best: Report all factors in n-best-lists. 
Default is false
        -report-segmentation (t): report phrase segmentation in the output
        -rule-limit: a little like table limit. But for chart decoding rules. 
Default is DEFAULT_MAX_TRANS_OPT_SIZE
        -search-algorithm: Which search algorithm to use. 0=normal stack, 
1=cube pruning, 2=cube growing. (default = 0)
        -show-weights: print feature weights and exit
        -source-label-overlap: What happens if a span already has a label. 
0=add more. 1=replace. 2=discard. Default is 0
        -stack (s): maximum stack size for histogram pruning
        -stack-diversity (sd): minimum number of hypothesis of each coverage in 
stack (default 0)
        -threads (th): number of threads to use in decoding (defaults to 
single-threaded)
        -time-out: seconds after which is interrupted (-1=no time-out, default 
is -1)
        -translation-details (T): for each best hypothesis, report translation 
details to the given file
        -translation-option-threshold (tot): threshold for translation options 
relative to best for input phrase
        -translation-systems: specify multiple translation systems, each 
consisting of an id, followed by a set of models ids, eg '0 T1 R1 L0'
        -ttable-file: location and properties of the translation tables
        -ttable-limit (ttl): maximum number of translation table entries per 
input phrase
        -unknown-lhs: file containing target lhs of unknown words. 1 per line: 
LHS prob
        -use-alignment-info: Use word-to-word alignment: actually it is only 
used to output the word-to-word alignment. Word-to-word alignments are taken 
from the phrase table if any. Default is false.
        -use-persistent-cache: cache translation options across sentences 
(default true)
        -verbose (v): verbosity level of the logging
        -weight-d (d): weight(s) for distortion (reordering components)
        -weight-e (e): weight for word deletion
        -weight-generation (g): weight(s) for generation components
        -weight-i (I): weight(s) for word insertion - used for parameters from 
confusion network and lattice input links
        -weight-l (lm): weight(s) for language models
        -weight-lex (lex): weight for global lexical model
        -weight-lr (lr): weight(s) for lexicalized reordering, if not included 
in weight-d
        -weight-t (tm): weights for translation model components
        -weight-u (u): weight for unknown word penalty
        -weight-w (w): weight for word penalty
        -xml-input (xi): allows markup of input with desired translations and 
probabilities. values can be 'pass-through' (default), 'inclusive', 
'exclusive', 'ignore'
exit status 1
mv: cannot stat 
`/home/showlett/working/reordered-wmt09/tuning/tmp.1/tmp19899/run1.best100.out.split19899':
 No such file or directory
exit status 1
mv: cannot move 
`/home/showlett/working/reordered-wmt09/tuning/tmp.1/tmp19899/input.lc.1.split19899.trans'
 to `./input.lc.1.split19899.trans': No such file or directory
exit status 1

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
