Lewis John McGibbney created JOSHUA-318:
-------------------------------------------
Summary: scripts/training/run_tuner.py should enable configurable
memory usage when invoking joshua-decoder
Key: JOSHUA-318
URL: https://issues.apache.org/jira/browse/JOSHUA-318
Project: Joshua
Issue Type: Improvement
Components: tuner
Affects Versions: 6.0.5
Reporter: Lewis John McGibbney
Fix For: 6.2
When I run the run_tuner.py script, I can easily run into the following error:
{code}
[mert-1] rebuilding...
dep=/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en
dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config [CHANGED]
dep=tune/model/grammar.gz.packed/slice_00000.source [CHANGED]
dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config.final [NOT FOUND]
cmd=/usr/local/incubator-joshua/scripts/training/run_tuner.py /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.ru --tunedir /usr/local/joshua_resources/russian_experiments/exp3/tune --tuner mert --decoder /usr/local/joshua_resources/russian_experiments/exp3/tune/decoder_command --decoder-config /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config --decoder-output-file /usr/local/joshua_resources/russian_experiments/exp3/tune/output.nbest --decoder-log-file /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.log --iterations 10 --metric 'BLEU 4 closest'
JOB FAILED (return code 1)
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at org.apache.joshua.decoder.ff.tm.packed.PackedGrammar$PackedSlice.initializeFeatureStructures(PackedGrammar.java:385)
    at org.apache.joshua.decoder.ff.tm.packed.PackedGrammar$PackedSlice.<init>(PackedGrammar.java:368)
    at org.apache.joshua.decoder.ff.tm.packed.PackedGrammar.<init>(PackedGrammar.java:153)
    at org.apache.joshua.decoder.Decoder.initializeTranslationGrammars(Decoder.java:458)
    at org.apache.joshua.decoder.Decoder.initialize(Decoder.java:389)
    at org.apache.joshua.decoder.Decoder.<init>(Decoder.java:128)
    at org.apache.joshua.decoder.JoshuaDecoder.main(JoshuaDecoder.java:69)
Traceback (most recent call last):
  File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 553, in <module>
    main(sys.argv)
  File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 536, in main
    run_zmert(opts.tunedir, opts.source, opts.target, opts.decoder, opts.decoder_config, opts.decoder_output_file, opts)
  File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 417, in run_zmert
    opts.metric, opts.iterations or 10)
  File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 399, in setup_configs
    for feature,weight in get_features(config):
  File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 351, in get_features
    output = check_output("%s/bin/joshua-decoder -c %s -show-weights -v 0" % (JOSHUA, config_file), shell=True)
  File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 626, in check_output
    **kwargs).stdout
  File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 708, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '/usr/local/incubator-joshua/bin/joshua-decoder -c /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config -show-weights -v 0' returned non-zero exit status 1
{code}
This is because the joshua-decoder script runs with 4g of memory by default.
The run_tuner.py script should be flexible enough to reuse the memory
allocation provided when the pipeline was initially invoked. This value should
then be passed to the joshua-decoder script.
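
One possible shape for this change is sketched below. It is not the actual patch: the `--decoder-memory` option and both helper names are hypothetical, and it assumes the joshua-decoder wrapper script accepts a heap size through an `-m` flag, which should be checked against the script before relying on it.

```python
import argparse


def parse_decoder_memory(argv):
    """Read a hypothetical --decoder-memory option (e.g. "8g") from argv.

    Returns None when the option is absent, so the decoder's own
    default is left untouched.
    """
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--decoder-memory", default=None)
    # parse_known_args ignores the other run_tuner.py options.
    opts, _unused = parser.parse_known_args(argv)
    return opts.decoder_memory


def build_show_weights_command(joshua_home, config_file, memory=None):
    """Build the argument list for the -show-weights decoder call.

    Assumption: the joshua-decoder wrapper takes the heap size as
    `-m <size>`. When memory is None the flag is omitted.
    """
    cmd = ["%s/bin/joshua-decoder" % joshua_home]
    if memory:
        cmd += ["-m", memory]
    cmd += ["-c", config_file, "-show-weights", "-v", "0"]
    return cmd


if __name__ == "__main__":
    memory = parse_decoder_memory(["--decoder-memory", "8g"])
    print(" ".join(build_show_weights_command("/usr/local/incubator-joshua",
                                              "joshua.config", memory)))
```

The resulting list could be passed to `subprocess.check_output` without `shell=True`, which also avoids quoting problems when the config path contains spaces.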
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)