Hi,
I am trying to use MML (modified Moore-Lewis) filtering, but it crashes at
the TRAINING_mml-filter-before-wa step, and I have not been able to resolve
the problem. The error and the relevant conf entries are copied below.
The corpus-mml-score.3 file has as many lines as my in-domain data, and
every line has a score of 99999. Is this correct?
Thank you,
Regards,
Hassan
------------------------------------------------------------------------------
/work/moses-2013-07-10/scripts/ems/support/mml-filter.py /training/corpus-mml.3.ini
2013-08-31 12:29:57,126 Loading configuration from /training/corpus-mml.3.ini
2013-08-31 12:29:57,128 Configuration:
general:strategy = Score
general:source_language = ar
general:target_language = en
general:input_stem = /training/corpus.1
general:output_stem = /training/corpus-mml.3
general:domain_file = /model/domains.3
general:domain_file_out = /training/corpus-mml.3
score:score_file = /training/corpus-mml-score.3
score:proportion = 0.9
2013-08-31 12:29:57,170 Retaining at least 0 entries and ignoring 149244
Traceback (most recent call last):
  File "/work/moses-2013-07-10/scripts/ems/support/mml-filter.py", line 156, in <module>
    main()
  File "/work/moses-2013-07-10/scripts/ems/support/mml-filter.py", line 111, in main
    strategy = strategy_class(config)
  File "/work/moses-2013-07-10/scripts/ems/support/mml-filter.py", line 72, in __init__
    [float(line[:-1]) for line in open(self.score_file)],
    reverse=True)[ignore_count + count]
IndexError: list index out of range
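
The failing expression can be reproduced in isolation. The sketch below is not the Moses code: `pick_threshold` is a hypothetical helper that mirrors only the sort-and-index expression visible in the traceback, and the tiny lists stand in for the real numbers in the log (149244 ignored, 0 retained). Under the assumption that the score file holds only as many lines as are being ignored, the index lands one past the end of the sorted list, which would be consistent with the IndexError above.

```python
def pick_threshold(scores, ignore_count, count):
    """Mirror the expression from the traceback: sort descending, then index.

    Hypothetical helper for illustration; not part of mml-filter.py.
    """
    ranked = sorted(scores, reverse=True)
    index = ignore_count + count
    if index >= len(ranked):
        raise IndexError(
            "need index %d but only %d scores are available" % (index, len(ranked))
        )
    return ranked[index]


# Hypothetical miniature: 3 in-domain lines scored 99999 and nothing else.
in_domain_only = [99999.0, 99999.0, 99999.0]
try:
    pick_threshold(in_domain_only, ignore_count=3, count=0)
    failed = False
except IndexError:
    failed = True

# With scores for out-of-domain lines also present, the lookup succeeds.
with_out_domain = in_domain_only + [0.5, -0.2, -1.3, 0.1]
threshold = pick_threshold(with_out_domain, ignore_count=3, count=2)
```

If this reading is right, the open question is why the score file carries no entries for the out-of-domain corpus, rather than anything in the sorting step itself.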
------------------------------------------------------------------------
Here are the entries in the conf file:
[MML]
### specifications for language models to be trained
#
lm-training = $srilm-dir/ngram-count
lm-settings = "-interpolate -kndiscount -unk"
lm-binarizer = $moses-src-dir/bin/build_binary
lm-query = $moses-src-dir/bin/query
order = 5
type = 8
raw-indomain-source = $training/train.$pair-extension.$input-extension
raw-indomain-target = $training/train.$pair-extension.$output-extension
outdomain-stem = /adapt/un.$pair-extension.utf8.ng.clean
settings = "--line-count 100000"
In [TRAINING]:
### filtering some corpora with modified Moore-Lewis
# specify corpora to be filtered and ratio to be kept, either before or
# after word alignment
mml-filter-corpora = /adapt/un.$pair-extension.utf8.ng.clean
mml-before-wa = "-proportion 0.9"
#mml-after-wa = "-proportion 0.9"
### domain adaptation settings
# options: sparse, any of: indicator, subset, ratio
domain-features = "subset"
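
A quick way to check the line counts mentioned at the top of this message is sketched below. It assumes, without confirmation from the Moses documentation, that the score file should contain one score per line of the corpus being filtered; `count_lines` and `score_file_matches` are hypothetical helper names, not part of EMS.

```python
def count_lines(path):
    """Return the number of lines in a file, read in binary mode."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)


def score_file_matches(corpus_path, score_path):
    """True if the score file has exactly one line per corpus line.

    Assumption: one score per corpus line is what the filter expects.
    """
    return count_lines(corpus_path) == count_lines(score_path)
```

For this run the score file is /training/corpus-mml-score.3. The corpus path to compare against would be the input stem /training/corpus.1 plus a language extension, which is itself an assumption about how the stem is expanded.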
_______________________________________________
Moses-support mailing list
http://mailman.mit.edu/mailman/listinfo/moses-support