[
https://issues.apache.org/jira/browse/JOSHUA-318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15609503#comment-15609503
]
Lewis John McGibbney commented on JOSHUA-318:
---------------------------------------------
The following code is where the sh*t hits the fan:
{code}
def get_features(config_file):
    """Queries the decoder for all dense features that will be fired by the
    feature functions activated in the config file"""
    output = check_output("%s/bin/joshua-decoder -c %s -show-weights -v 0" %
                          (JOSHUA, config_file), shell=True)
    features = []
    for index, item in enumerate(output.split('\n')):
        if item != "":
            features.append(tuple(item.split()))
    return features
{code}
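One possible direction, as a rough sketch rather than a tested patch: let get_features take an optional memory value and forward it to the decoder wrapper. This assumes bin/joshua-decoder accepts a leading "-m <size>" option (worth checking against the script itself). The helper names build_show_weights_command and parse_features, and the memory parameter, are hypothetical and not part of run_tuner.py today.

```python
import subprocess


def build_show_weights_command(joshua_home, config_file, memory=None):
    """Build the argument list used to ask the decoder for its dense features."""
    cmd = ["%s/bin/joshua-decoder" % joshua_home]
    if memory:
        # Assumption: the wrapper script takes "-m <size>" (e.g. "8g") as its
        # first option and otherwise falls back to its built-in default.
        cmd += ["-m", memory]
    cmd += ["-c", config_file, "-show-weights", "-v", "0"]
    return cmd


def parse_features(output):
    """Turn the decoder output into a list of whitespace-split tuples."""
    return [tuple(line.split()) for line in output.splitlines() if line.strip()]


def get_features(joshua_home, config_file, memory=None):
    """Query the decoder for its features, optionally with a custom heap size."""
    # universal_newlines=True makes check_output return text rather than bytes,
    # so the parsing step behaves the same on Python 2 and Python 3.
    output = subprocess.check_output(
        build_show_weights_command(joshua_home, config_file, memory),
        universal_newlines=True)
    return parse_features(output)
```

With something like this, run_tuner.py could grow a command-line option for the memory value and hand it down to get_features, leaving the current behaviour unchanged when no value is given. Passing an argument list instead of shell=True also avoids quoting problems with paths that contain spaces.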
> scripts/training/run_tuner.py should enable configurable memory usage when
> invoking joshua-decoder
> ---------------------------------------------------------------------------------------------------
>
> Key: JOSHUA-318
> URL: https://issues.apache.org/jira/browse/JOSHUA-318
> Project: Joshua
> Issue Type: Improvement
> Components: tuner
> Affects Versions: 6.0.5
> Reporter: Lewis John McGibbney
> Fix For: 6.2
>
>
> When I run the run_tuner.py script I can easily run into the following
> {code}
> [mert-1] rebuilding...
> dep=/usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en
> dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config
> [CHANGED]
> dep=tune/model/grammar.gz.packed/slice_00000.source [CHANGED]
>
> dep=/usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config.final
> [NOT FOUND]
> cmd=/usr/local/incubator-joshua/scripts/training/run_tuner.py
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.en
> /usr/local/joshua_resources/russian_experiments/exp3/data/tune/corpus.ru
> --tunedir /usr/local/joshua_resources/russian_experiments/exp3/tune --tuner
> mert --decoder
> /usr/local/joshua_resources/russian_experiments/exp3/tune/decoder_command
> --decoder-config
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config
> --decoder-output-file
> /usr/local/joshua_resources/russian_experiments/exp3/tune/output.nbest
> --decoder-log-file
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.log
> --iterations 10 --metric 'BLEU 4 closest'
> JOB FAILED (return code 1)
> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
> at
> org.apache.joshua.decoder.ff.tm.packed.PackedGrammar$PackedSlice.initializeFeatureStructures(PackedGrammar.java:385)
> at
> org.apache.joshua.decoder.ff.tm.packed.PackedGrammar$PackedSlice.<init>(PackedGrammar.java:368)
> at
> org.apache.joshua.decoder.ff.tm.packed.PackedGrammar.<init>(PackedGrammar.java:153)
> at
> org.apache.joshua.decoder.Decoder.initializeTranslationGrammars(Decoder.java:458)
> at org.apache.joshua.decoder.Decoder.initialize(Decoder.java:389)
> at org.apache.joshua.decoder.Decoder.<init>(Decoder.java:128)
> at org.apache.joshua.decoder.JoshuaDecoder.main(JoshuaDecoder.java:69)
> Traceback (most recent call last):
> File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 553,
> in <module>
> main(sys.argv)
> File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 536,
> in main
> run_zmert(opts.tunedir, opts.source, opts.target, opts.decoder,
> opts.decoder_config, opts.decoder_output_file, opts)
> File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 417,
> in run_zmert
> opts.metric, opts.iterations or 10)
> File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 399,
> in setup_configs
> for feature,weight in get_features(config):
> File "/usr/local/incubator-joshua/scripts/training/run_tuner.py", line 351,
> in get_features
> output = check_output("%s/bin/joshua-decoder -c %s -show-weights -v 0" %
> (JOSHUA, config_file), shell=True)
> File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 626, in
> check_output
> **kwargs).stdout
> File "/Users/lmcgibbn/miniconda3/lib/python3.5/subprocess.py", line 708, in
> run
> output=stdout, stderr=stderr)
> subprocess.CalledProcessError: Command
> '/usr/local/incubator-joshua/bin/joshua-decoder -c
> /usr/local/joshua_resources/russian_experiments/exp3/tune/joshua.config
> -show-weights -v 0' returned non-zero exit status 1
> {code}
> This is because the joshua-decoder script runs with 4g of memory by default.
> The run_tuner.py script should be flexible enough to carry over the memory
> allocation that was provided when the pipeline was initially invoked. This
> value should then be passed to the joshua-decoder script.