Hi,

I am trying to use MML (modified Moore-Lewis) filtering, but it crashes at the
TRAINING_mml-filter-before-wa step, and I have not been able to resolve the
problem. The error and the relevant conf entries are copied below.

The file corpus-mml-score.3 contains as many lines as my in-domain data, and
every line has the score 99999. Is this correct?

Thank you,

Regards,
Hassan

------------------------------------------------------------------------------
/work/moses-2013-07-10/scripts/ems/support/mml-filter.py /training/corpus-mml.3.ini
2013-08-31 12:29:57,126 Loading configuration from /training/corpus-mml.3.ini
2013-08-31 12:29:57,128 Configuration:
general:strategy = Score
general:source_language = ar
general:target_language = en
general:input_stem = /training/corpus.1
general:output_stem = /training/corpus-mml.3
general:domain_file = /model/domains.3
general:domain_file_out = /training/corpus-mml.3
score:score_file = /training/corpus-mml-score.3
score:proportion = 0.9

2013-08-31 12:29:57,170 Retaining at least 0 entries and ignoring 149244
Traceback (most recent call last):
  File "/work/moses-2013-07-10/scripts/ems/support/mml-filter.py", line 156, in <module>
    main()
  File "/work/moses-2013-07-10/scripts/ems/support/mml-filter.py", line 111, in main
    strategy = strategy_class(config)
  File "/work/moses-2013-07-10/scripts/ems/support/mml-filter.py", line 72, in __init__
    [float(line[:-1]) for line in open(self.score_file)], reverse=True)[ignore_count + count]
IndexError: list index out of range
------------------------------------------------------------------------
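
To show how I read the traceback, here is a minimal sketch of the failing
lookup. It is not the actual mml-filter.py code: the function name
threshold_or_none and the toy scores are hypothetical, and only the names
ignore_count and count are taken from the traceback. I am assuming the script
sorts all scores in descending order and then indexes at ignore_count + count,
so the lookup fails whenever that index is not smaller than the number of
lines in the score file.

```python
def threshold_or_none(scores, ignore_count, count):
    """Mimic the indexing shown on line 72 of the traceback.

    Returns None where the original expression would raise IndexError.
    """
    ranked = sorted(scores, reverse=True)
    index = ignore_count + count
    if index >= len(ranked):
        # Same condition that produces "list index out of range".
        return None
    return ranked[index]


# Hypothetical miniature of my case: every line scored 99999,
# all of them ignored, and 0 entries to retain.
scores = [99999.0] * 5
print(threshold_or_none(scores, ignore_count=5, count=0))
```

If this reading is right, "Retaining at least 0 entries and ignoring 149244"
would mean the index is 149244, which is out of range if the score file has
149244 lines or fewer.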

Here are the entries in the conf file:

[MML]

### specifications for language models to be trained
#

lm-training = $srilm-dir/ngram-count
lm-settings = "-interpolate -kndiscount -unk"
lm-binarizer = $moses-src-dir/bin/build_binary
lm-query = $moses-src-dir/bin/query
order = 5
type = 8

raw-indomain-source = $training/train.$pair-extension.$input-extension
raw-indomain-target = $training/train.$pair-extension.$output-extension

outdomain-stem = /adapt/un.$pair-extension.utf8.ng.clean
settings = "--line-count 100000"


In the [TRAINING] section:

### filtering some corpora with modified Moore-Lewis
# specify corpora to be filtered and ratio to be kept, either before or
# after word alignment
mml-filter-corpora =  /adapt/un.$pair-extension.utf8.ng.clean
mml-before-wa = "-proportion 0.9"
#mml-after-wa = "-proportion 0.9"

### domain adaptation settings
# options: sparse, any of: indicator, subset, ratio
domain-features = "subset"