Hi Moses friends,
        My friend Junhui Li told me about a bug concerning the
-max-phrase-length parameter of train-model.perl in hierarchical training.
I tested it with a very small corpus and can confirm it. Junhui Li and I
both think that -max-phrase-length is not passed on to extract-rules by
train-model.perl in hierarchical training.
        The rule extraction part of train-model.perl is as follows.
-----------------------------------------------------------------
    if ($_HIERARCHICAL)
    {
        $cmd = "$RULE_EXTRACT $alignment_file_e $alignment_file_f $alignment_file_a $extract_file";
        $cmd .= " --GlueGrammar $___GLUE_GRAMMAR_FILE" if $_GLUE_GRAMMAR;
        $cmd .= " --UnknownWordLabel $_UNKNOWN_WORD_LABEL_FILE" if $_TARGET_SYNTAX && defined($_UNKNOWN_WORD_LABEL_FILE);
        if (!defined($_GHKM)) {
          $cmd .= " --SourceSyntax" if $_SOURCE_SYNTAX;
          $cmd .= " --TargetSyntax" if $_TARGET_SYNTAX;
        }
        $cmd .= " ".$_EXTRACT_OPTIONS if defined($_EXTRACT_OPTIONS);
    }
-----------------------------------------------------------------



I have just checked extract-rules.cpp. Its usage message lists the
following parameters:


-----------------------------------------------------------------
  if (argc < 5) {
    cerr << "syntax: extract-rules corpus.target corpus.source corpus.align extract "
         << " [ --GlueGrammar FILE"
         << " | --UnknownWordLabel FILE"
         << " | --OnlyDirect"
         << " | --OutputNTLengths"
         << " | --MaxSpan[" << options.maxSpan << "]"
         << " | --MinHoleTarget[" << options.minHoleTarget << "]"
         << " | --MinHoleSource[" << options.minHoleSource << "]"
         << " | --MinWords[" << options.minWords << "]"
         << " | --MaxSymbolsTarget[" << options.maxSymbolsTarget << "]"
         << " | --MaxSymbolsSource[" << options.maxSymbolsSource << "]"
         << " | --MaxNonTerm[" << options.maxNonTerm << "]"
         << " | --MaxScope[" << options.maxScope << "]"
         << " | --SourceSyntax | --TargetSyntax"
         << " | --AllowOnlyUnalignedWords | --DisallowNonTermConsecTarget |--NonTermConsecSource |  --NoNonTermFirstWord | --NoFractionalCounting ]\n";
    exit(1);
  }
-----------------------------------------------------------------
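
To make the question concrete, here is a minimal sketch of how an option
such as --MaxSpan might be read from the command line. This is not the
actual code in extract-rules.cpp. The names SketchOptions and
parseSketchOptions and the default value 10 are hypothetical, and whether
--MaxSpan is really the counterpart of -max-phrase-length is only an
assumption based on its name.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical stand-in for the options object in extract-rules.cpp.
// The default below is a placeholder, not necessarily the Moses default.
struct SketchOptions {
  int maxSpan = 10;
};

// Scan the optional arguments for "--MaxSpan N" and store N.
// Arguments this sketch does not recognise are simply ignored.
SketchOptions parseSketchOptions(const std::vector<std::string>& args) {
  SketchOptions options;
  for (std::size_t i = 0; i < args.size(); ++i) {
    if (args[i] == "--MaxSpan" && i + 1 < args.size()) {
      options.maxSpan = std::atoi(args[i + 1].c_str());
      assert(options.maxSpan > 0);  // a span limit must be positive
      ++i;  // skip the value that was just consumed
    }
  }
  return options;
}
```

Since the Perl snippet above appends $_EXTRACT_OPTIONS to the command
unchanged, a value could presumably be passed through that route to test
whether such an option has the intended effect.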

      Can anyone tell us which of the extract-rules parameters above
corresponds to -max-phrase-length? Then we can fix this bug in git.

Thanks in advance.

Happy New Year~!
-Lang Jun
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
