Hi Lane,
Sorry it's taken me a while to get back to you. I've attempted to run an
EMS experiment with your moses-parallel.pl, and (somewhat
unsurprisingly) it failed.
For reference, my tuning/tmp.1 directory contains:
WR18184.W.log
filtered/
filterphrases.err
filterphrases.out
input.lc.1.split19899-aa to input.lc.1.split19899-aj
input.lc.1.split19899.trans (empty)
job19899.bash
job19899.bash.e14659-1 to job19899.bash.e14659-10
job19899.log
job19899.sync_workaround_script.sh
mert1.W.log (empty)
run1.moses.ini
run1.out (empty)
tmp19899/ (empty)
I've attached a copy of job19899.bash and job19899.bash.e14659-1.
First problem: I don't have an environment variable $SGE_TASK_ID. So,
it's always equal to "", and ${idxarray[$SGE_TASK_ID]} is also "".
I'm also not sure whether I have a $TASK_ID. Is it an SGE variable?
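On the variable question: as far as I can tell, SGE exports $SGE_TASK_ID to array job tasks, whereas Torque exports $PBS_ARRAYID instead, and $TASK_ID looks like an SGE pseudo-variable that is only expanded inside the -o and -e paths. Below is a minimal sketch of a lookup that would cover both schedulers, assuming those variable names are right. The function name resolve_task_id is made up for illustration and is not in the generated script.

```shell
# Hypothetical helper: print the array task index under SGE or Torque.
# Assumption: SGE exports SGE_TASK_ID and Torque exports PBS_ARRAYID.
resolve_task_id() {
    local id="${SGE_TASK_ID:-${PBS_ARRAYID:-}}"
    # SGE reports the literal string "undefined" for non-array jobs.
    if [ -z "$id" ] || [ "$id" = "undefined" ]; then
        return 1
    fi
    printf '%s\n' "$id"
}
```

The job script could then index the array with something like ${idxarray[$(resolve_task_id)]}.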
Second problem: This if statement:
if [ "" == "$SGE_TASK_ID" ]; then
echo "Job was not submitted as an array job
" exit 1
fi
For me, this prints "exit 1" instead of executing it. I assume that's
not what's supposed to happen?
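My guess at the cause: the closing quote of the message sits on the following line, so "exit 1" is handed to echo as extra arguments instead of being run as a command. Here is a sketch of what I assume was intended, wrapped in a function so it can be tried outside the queue. The name require_array_job is hypothetical.

```shell
# Hypothetical rewrite of the guard; takes the task ID as its only argument.
require_array_job() {
    if [ -z "$1" ]; then
        # The quote is closed on this line, so the next line is its own command.
        echo "Job was not submitted as an array job" >&2
        return 1
    fi
}
```

In the job script this would be called as: require_array_job "$SGE_TASK_ID" || exit 1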
Third: the output job19899.bash.e14659-1 shows it didn't like the Moses
command in job19899.bash. Apart from the input file not existing
(because the affix from idxarray didn't work), can anyone spot what's
wrong? I can't figure it out.
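One possibility I haven't confirmed: the decoder may print its usage text simply because the file given to -input-file does not exist, in which case the command itself could be fine. A sketch of a check that could run just before the decoder to separate the two cases (check_split_exists is a made-up name):

```shell
# Hypothetical pre-flight check: fail early if the input split is missing or empty.
check_split_exists() {
    local file="$1"
    if [ ! -s "$file" ]; then
        echo "Missing or empty input split: $file" >&2
        return 1
    fi
}
```

If this check fails, the problem is upstream of Moses (the empty suffix from idxarray) rather than in the decoder options.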
Suzy
On 18/12/10 1:55 AM, Lane Schwartz wrote:
I have a modified version of moses-parallel.pl that uses the qsub -t flag
to submit child jobs as array jobs. I've verified that I get identical
results using the modified version and the current version from trunk.
Before I check this in, I would appreciate it if others could do a small
test run to verify that the modified version works the same on their
systems. Suzy, I'm especially interested in your feedback, since you're
running Torque instead of SGE.
From your perspective as a user, there is no change in how you call
moses-parallel.pl. The changes that you should expect to see are:
* When you look at your child jobs using qstat or qmon, they will all
share the same job-ID, but will each have a unique ja-task-ID.
* Child jobs will all show up with the name MOSES, instead of MOSES-aa,
MOSES-ab, etc. I tried to find a way to maintain the old naming format,
but AFAIK there's no way to do that with array job submission.
* The temporary out.job* and err.job* files created during the run will
end with numeric suffixes (corresponding to the child ja-task-ID)
instead of the current alphabetic (-aa, -ab, -ac,...) suffixes. Again,
I tried but was unable to maintain the old naming scheme.
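For anyone who needs to relate the new numeric suffixes back to the old alphabetic ones, here is a rough sketch of the mapping. It assumes two-letter suffixes in the order aa, ab, ..., az, ba, ... (so at most 676 tasks) and task IDs starting at 1. The function name task_suffix is purely illustrative and is not part of moses-parallel.pl.

```shell
# Hypothetical helper: map a 1-based array task ID to the old split suffix.
# Assumes two-letter suffixes (-aa .. -zz), i.e. task IDs from 1 to 676.
task_suffix() {
    local n=$(( $1 - 1 ))
    local letters=abcdefghijklmnopqrstuvwxyz
    local first=$(( n / 26 ))
    local second=$(( n % 26 ))
    # printf rather than echo, since echo could treat a suffix like -ee as options.
    printf '%s\n' "-${letters:$first:1}${letters:$second:1}"
}
```

For example, task_suffix 1 gives -aa and task_suffix 10 gives -aj.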
Thanks,
Lane
On Tue, Dec 14, 2010 at 4:07 PM, Lane Schwartz wrote:
I was wondering if any consideration has been given to using qsub's
job array functionality in moses-parallel.pl.
Using the qsub -t flag, jobs can be tied together, so that if the
parent job is killed via qdel, all of the children are also killed.
Currently, if a parallel job needs to be killed, the children must
be manually deleted. This is OK if you only have one parallel job
running, but if you have many, and you haven't overridden the
default job name, things become hairier.
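To make the idea concrete, here is a dry-run sketch that only prints the submission command rather than running it. The helper name array_submit_cmd and the script name in the usage line are placeholders, and the exact qdel behaviour for array jobs may differ between schedulers.

```shell
# Hypothetical dry-run helper: print the qsub command for an array job
# covering tasks 1..num_tasks. It does not contact the scheduler.
array_submit_cmd() {
    local num_tasks="$1" script="$2"
    printf 'qsub -t 1-%s %s\n' "$num_tasks" "$script"
}
```

For ten splits, array_submit_cmd 10 job.bash prints "qsub -t 1-10 job.bash"; the whole group then shares one job ID that can be passed to qdel.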
I would potentially be willing to make the change, but I wanted to
hear people's thoughts on the matter first.
Cheers,
Lane
--
When a place gets crowded enough to require ID's, social collapse is not
far away. It is time to go elsewhere. The best thing about space travel
is that it made it possible to go elsewhere.
-- R.A. Heinlein, "Time Enough For Love"
--
Suzy Howlett
http://www.showlett.id.au/
#!/bin/bash
#$ -S /bin/bash
#$ -o /home/showlett/working/reordered-wmt09/tuning/tmp.1/out.job19899.$TASK_ID
#$ -e /home/showlett/working/reordered-wmt09/tuning/tmp.1/err.job19899.$TASK_ID
#$ -N mert1
if [ "" == "$SGE_TASK_ID" ]; then
echo "Job was not submitted as an array job
" exit 1
fi
uname -a
ulimit -c 0
cd /home/showlett/working/reordered-wmt09/tuning/tmp.1
idxarray[1]=-aa
idxarray[2]=-ab
idxarray[3]=-ac
idxarray[4]=-ad
idxarray[5]=-ae
idxarray[6]=-af
idxarray[7]=-ag
idxarray[8]=-ah
idxarray[9]=-ai
idxarray[10]=-aj
/usr/local/moses/moses-cmd/src/moses -w -0.217391 -lm 0.108696 -d 0.065217
0.065217 0.065217 0.065217 0.065217 0.065217 0.065217 -tm 0.043478 0.043478
0.043478 0.043478 0.043478 -config filtered/moses.ini -inputtype 0
-n-best-list
/home/showlett/working/reordered-wmt09/tuning/tmp.1/tmp19899/run1.best100.out.split19899${idxarray[$SGE_TASK_ID]}
100 -input-file input.lc.1.split19899${idxarray[$SGE_TASK_ID]} >
/home/showlett/working/reordered-wmt09/tuning/tmp.1/tmp19899/input.lc.1.split19899${idxarray[$SGE_TASK_ID]}.trans
echo exit status $?
\mv -f
/home/showlett/working/reordered-wmt09/tuning/tmp.1/tmp19899/run1.best100.out.split19899${idxarray[$SGE_TASK_ID]}
.
echo exit status $?
\mv -f
/home/showlett/working/reordered-wmt09/tuning/tmp.1/tmp19899/input.lc.1.split19899${idxarray[$SGE_TASK_ID]}.trans
.
echo exit status $?
Job was not submitted as an array job
exit 1
Linux n002 2.6.31.6-2.caos #1 SMP Tue Nov 17 11:46:26 PST 2009 x86_64 x86_64
x86_64 GNU/Linux
Defined parameters (per moses.ini or switch):
config: filtered/moses.ini
distortion-file: 0-0 wbe-msd-bidirectional-fe-allff 6
/home/showlett/working/reordered-wmt09/tuning/tmp.1/filtered/reordering-table.1.wbe-msd-bidirectional-fe
distortion-limit: 6
input-factors: 0
input-file: input.lc.1.split19899
inputtype: 0
lmodel-file: 0 0 3
/home/showlett/working/baseline-wmt09/lm/europarl.lm.1
mapping: 0 T 0
n-best-list:
/home/showlett/working/reordered-wmt09/tuning/tmp.1/tmp19899/run1.best100.out.split19899
100
ttable-file: 1 0 0 5
/home/showlett/working/reordered-wmt09/tuning/tmp.1/filtered/phrase-table.0-0.1.1
ttable-limit: 20
weight-d: 0.065217 0.065217 0.065217 0.065217 0.065217 0.065217
0.065217
weight-l: 0.108696
weight-t: 0.043478 0.043478 0.043478 0.043478 0.043478
weight-w: -0.217391
Usage:
-beam-threshold (b): threshold for threshold pruning
-cache-path: ?
-clean-lm-cache: clean language model caches after N translations
(default N=1)
-config (f): location of the configuration file
-consensus-decoding (con): use consensus decoding (De Nero et. al. 2009)
-constraint: Location of the file with target sentences to produce
constraining the search
-continue-partial-translation (cpt): start from nonempty hypothesis
-cube-pruning-diversity (cbd): How many hypotheses should be created
for each coverage. (default = 0)
-cube-pruning-pop-limit (cbp): How many hypotheses should be popped for
each stack. (default = 1000)
-description: Source language, target language, description
-disable-discarding (dd): disable hypothesis discarding
-distortion: configurations for each factorized/lexicalized reordering
model.
-distortion-file: source factors (0 if table independent of source),
target factors, location of the factorized/lexicalized reordering tables
-distortion-limit (dl): distortion (reordering) limit in maximum number
of words (0 = monotone, -1 = unlimited)
-drop-unknown (du): drop unknown words instead of copying them
-early-discarding-threshold (edt): threshold for constructing
hypotheses based on estimate cost
-factor-delimiter (fd): specify a different factor delimiter than the
default
-generation-file: location and properties of the generation table
-global-lexical-file (gl): discriminatively trained global lexical
translation model file
-include-alignment-in-n-best: include word alignment in the n-best
list. default is false
-input-factors: list of factors in the input
-input-file (i): location of the input file to be translated
-inputtype: text (0), confusion network (1), word lattice (2) (default
= 0)
-labeled-n-best-list: print out labels for each weight type in n-best
list. default is true
-lattice-hypo-set: to use lattice as hypo set during lattice MBR
-link-param-count: Number of parameters on word links when using
confusion networks or lattices (default = 1)
-lmbr-map-weight: weight given to map solution when doing lattice MBR
(default 0)
-lmbr-p: unigram precision value for lattice mbr
-lmbr-pruning-factor: average number of nodes/word wanted in pruned
lattice
-lmbr-r: ngram precision decay value for lattice mbr
-lmbr-thetas: theta(s) for lattice mbr calculation
-lminimum-bayes-risk (lmbr): use lattice miminum Bayes risk to
determine best translation
-lmodel-dub: dictionary upper bounds of language models
-lmodel-file: location and properties of the language models
-mapping: description of decoding steps
-max-chart-span: maximum num. of source word chart rules can consume
(default 10)
-max-partial-trans-opt: maximum number of partial translation options
per input span (during mapping steps)
-max-phrase-length: maximum phrase length (default 20)
-max-trans-opt-per-coverage: maximum number of translation options per
input span (after applying mapping steps)
-mbr-scale: scaling factor to convert log linear score probability in
MBR decoding (default 1.0)
-mbr-size: number of translation candidates considered in MBR decoding
(default 200)
-minimum-bayes-risk (mbr): use miminum Bayes risk to determine best
translation
-monotone-at-punctuation (mp): do not reorder over punctuation
-n-best-factor: factor to compute the maximum number of contenders
(=factor*nbest-size). value 0 means infinity, i.e. no threshold. default is 0
-n-best-list: file and size of n-best-list to be generated; specify -
as the file in order to write to STDOUT
-non-terminals: list of non-term symbols, space separated
-output-factors: list if factors in the output
-output-hypo-score: Output the hypo score to stdout with the output
string. For search error analysis. Default is false
-output-search-graph (osg): Output connected hypotheses of search into
specified filename
-output-search-graph-extended (osgx): Output connected hypotheses of
search into specified filename, in extended format
-output-word-graph (owg): Output stack info as word graph. Takes
filename, 0=only hypos in stack, 1=stack + nbest hypos
-persistent-cache-size: maximum size of cache for translation options
(default 10,000 input phrases)
-phrase-drop-allowed (da): if present, allow dropping of source words
-print-alignment-info: Output word-to-word alignment into the log file.
Word-to-word alignments are takne from the phrase table if any. Default is false
-print-alignment-info-in-n-best: Include word-to-word alignment in the
n-best list. Word-to-word alignments are takne from the phrase table if any.
Default is false
-print-all-derivations: to print all derivations in search graph
-recover-input-path (r): (conf net/word lattice only) - recover input
path corresponding to the best translation
-report-all-factors: report all factors in output, not just first
-report-all-factors-in-n-best: Report all factors in n-best-lists.
Default is false
-report-segmentation (t): report phrase segmentation in the output
-rule-limit: a little like table limit. But for chart decoding rules.
Default is DEFAULT_MAX_TRANS_OPT_SIZE
-search-algorithm: Which search algorithm to use. 0=normal stack,
1=cube pruning, 2=cube growing. (default = 0)
-show-weights: print feature weights and exit
-source-label-overlap: What happens if a span already has a label.
0=add more. 1=replace. 2=discard. Default is 0
-stack (s): maximum stack size for histogram pruning
-stack-diversity (sd): minimum number of hypothesis of each coverage in
stack (default 0)
-threads (th): number of threads to use in decoding (defaults to
single-threaded)
-time-out: seconds after which is interrupted (-1=no time-out, default
is -1)
-translation-details (T): for each best hypothesis, report translation
details to the given file
-translation-option-threshold (tot): threshold for translation options
relative to best for input phrase
-translation-systems: specify multiple translation systems, each
consisting of an id, followed by a set of models ids, eg '0 T1 R1 L0'
-ttable-file: location and properties of the translation tables
-ttable-limit (ttl): maximum number of translation table entries per
input phrase
-unknown-lhs: file containing target lhs of unknown words. 1 per line:
LHS prob
-use-alignment-info: Use word-to-word alignment: actually it is only
used to output the word-to-word alignment. Word-to-word alignments are taken
from the phrase table if any. Default is false.
-use-persistent-cache: cache translation options across sentences
(default true)
-verbose (v): verbosity level of the logging
-weight-d (d): weight(s) for distortion (reordering components)
-weight-e (e): weight for word deletion
-weight-generation (g): weight(s) for generation components
-weight-i (I): weight(s) for word insertion - used for parameters from
confusion network and lattice input links
-weight-l (lm): weight(s) for language models
-weight-lex (lex): weight for global lexical model
-weight-lr (lr): weight(s) for lexicalized reordering, if not included
in weight-d
-weight-t (tm): weights for translation model components
-weight-u (u): weight for unknown word penalty
-weight-w (w): weight for word penalty
-xml-input (xi): allows markup of input with desired translations and
probabilities. values can be 'pass-through' (default), 'inclusive',
'exclusive', 'ignore'
exit status 1
mv: cannot stat
`/home/showlett/working/reordered-wmt09/tuning/tmp.1/tmp19899/run1.best100.out.split19899':
No such file or directory
exit status 1
mv: cannot move
`/home/showlett/working/reordered-wmt09/tuning/tmp.1/tmp19899/input.lc.1.split19899.trans'
to `./input.lc.1.split19899.trans': No such file or directory
exit status 1
_______________________________________________
Moses-support mailing list
http://mailman.mit.edu/mailman/listinfo/moses-support