Hi Moses friends,
My friend Junhui Li told me about a bug involving the -max-phrase-length
parameter of train-model.perl in hierarchical training. I tested it with a
very small corpus, and it appears to be true. Junhui Li and I both think that
-max-phrase-length is not passed to extract-rules by train-model.perl in
hierarchical training.
The rule extraction part of train-model.perl is as follows.
-----------------------------------------------------------------
if ($_HIERARCHICAL)
{
    $cmd = "$RULE_EXTRACT $alignment_file_e $alignment_file_f $alignment_file_a $extract_file";
    $cmd .= " --GlueGrammar $___GLUE_GRAMMAR_FILE" if $_GLUE_GRAMMAR;
    $cmd .= " --UnknownWordLabel $_UNKNOWN_WORD_LABEL_FILE" if $_TARGET_SYNTAX && defined($_UNKNOWN_WORD_LABEL_FILE);
    if (!defined($_GHKM)) {
        $cmd .= " --SourceSyntax" if $_SOURCE_SYNTAX;
        $cmd .= " --TargetSyntax" if $_TARGET_SYNTAX;
    }
    $cmd .= " ".$_EXTRACT_OPTIONS if defined($_EXTRACT_OPTIONS);
}
-----------------------------------------------------------------
I have just checked extract-rules.cpp, and it accepts the following
parameters.
-----------------------------------------------------------------
if (argc < 5) {
    cerr << "syntax: extract-rules corpus.target corpus.source corpus.align extract "
         << " [ --GlueGrammar FILE"
         << " | --UnknownWordLabel FILE"
         << " | --OnlyDirect"
         << " | --OutputNTLengths"
         << " | --MaxSpan[" << options.maxSpan << "]"
         << " | --MinHoleTarget[" << options.minHoleTarget << "]"
         << " | --MinHoleSource[" << options.minHoleSource << "]"
         << " | --MinWords[" << options.minWords << "]"
         << " | --MaxSymbolsTarget[" << options.maxSymbolsTarget << "]"
         << " | --MaxSymbolsSource[" << options.maxSymbolsSource << "]"
         << " | --MaxNonTerm[" << options.maxNonTerm << "]"
         << " | --MaxScope[" << options.maxScope << "]"
         << " | --SourceSyntax | --TargetSyntax"
         << " | --AllowOnlyUnalignedWords | --DisallowNonTermConsecTarget"
         << " | --NonTermConsecSource | --NoNonTermFirstWord | --NoFractionalCounting ]\n";
    exit(1);
}
-----------------------------------------------------------------
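A possible fix could look like the sketch below. It is only an illustration, not tested against Moses: build_rule_extract_cmd and its hash keys are hypothetical names that do not exist in train-model.perl, and the choice of --MaxSpan as the counterpart of -max-phrase-length (and the assumption that it takes its value as the next argument) is a guess that still needs to be confirmed.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of train-model.perl): builds the
# extract-rules command line from a hash of settings.
sub build_rule_extract_cmd {
    my (%opt) = @_;
    my $cmd = join(" ", $opt{rule_extract}, $opt{target}, $opt{source},
                        $opt{align}, $opt{extract});
    # ASSUMPTION: --MaxSpan is the extract-rules option that corresponds
    # to -max-phrase-length, and its value follows as a separate argument.
    $cmd .= " --MaxSpan $opt{max_phrase_length}"
        if defined $opt{max_phrase_length};
    # User-supplied extract options are appended last, as in the snippet
    # from train-model.perl quoted earlier.
    $cmd .= " " . $opt{extract_options} if defined $opt{extract_options};
    return $cmd;
}

# Example call with placeholder file names.
my $cmd = build_rule_extract_cmd(
    rule_extract      => "extract-rules",
    target            => "corpus.target",
    source            => "corpus.source",
    align             => "corpus.align",
    extract           => "extract",
    max_phrase_length => 10,
);
print "$cmd\n";
```

Until something like this is in git, the same option could probably be passed by hand through the extract options setting, since $_EXTRACT_OPTIONS is appended to the command unchanged.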
Can anyone tell us which of the extract-rules parameters above corresponds
to -max-phrase-length? Then we can fix this bug in git.
Thanks in advance.
Happy New Year~!
-Lang Jun
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support