Good to know that constrained decoding works. And yes, reachability of the training data is only theoretical: it holds only in the absence of pruning such as cube pruning, beam limits, etc.

On 15/11/2016 20:00, Shuoyang Ding wrote:
Hi Hieu,

I’d made changes 1, 2, and 4 before emailing you, and the coverage didn’t change much. It turns out the bottleneck is beam-threshold: the default value was 1e-5, which is a pretty tight limit for constrained decoding.

After setting that to 0, I played around a bit with the cube-pruning pop limit. Coverage ranges from about 25% to 40% depending on the limit you use, but higher coverage comes with longer decoding time, as one would expect.

Still, for string-to-tree constrained decoding the easiest approach may be to decode with phrase tables built per sentence, since that kind of decoding is generally slower. Even then, the default beam-threshold needs to be overridden for it to work properly.
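For concreteness, here is a rough sketch of the per-sentence setup. It assumes Moses's filter-model-given-input.pl script and the decoder's -constraint switch; all file names are placeholders, and per-sentence filtering of the full table is used here as a stand-in for actually building a table per sentence:

    # Split the source and reference into one sentence per file
    # (file names are placeholders).
    split -l 1 -d corpus.src src.
    split -l 1 -d corpus.ref ref.

    for f in src.*; do
        i=${f#src.}
        # Filter the full model down to rules that can apply to this
        # one source sentence (syntax models may need the script's
        # hierarchical option).
        filter-model-given-input.pl filtered.$i moses.ini src.$i
        # Decode with pruning relaxed and the reference as the constraint.
        moses -f filtered.$i/moses.ini \
              -beam-threshold 0 \
              -constraint ref.$i \
              < src.$i > out.$i
    done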

Hope the info helps.

Regards,
Shuoyang Ding

Ph.D. Student
Center for Language and Speech Processing
Department of Computer Science
Johns Hopkins University

Hackerman Hall 225A
3400 N. Charles St.
Baltimore, MD 21218

http://cs.jhu.edu/~sding

On Oct 28, 2016, at 9:27 AM, Hieu Hoang <hieuho...@gmail.com> wrote:

Good point. The decoder is set up to translate quickly, so there are a few pruning parameters that throw out low-scoring rules or hypotheses.

These are some of the pruning parameters you'll need to change (there may be more):
  1. [feature]
      PhraseDictionaryWHATEVER table-limit=0
  2. [cube-pruning-pop-limit]
      1000000
  3. [beam-threshold]
      0
  4. [stack]
      1000000
Make the changes one at a time in case one of them makes decoding too slow, even with constrained decoding.
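For reference, here is roughly what those four settings look like together in a moses.ini (the PhraseDictionary line is only a sketch; keep your table's actual name, path, and other arguments):

    [feature]
    PhraseDictionaryWHATEVER ... table-limit=0

    [cube-pruning-pop-limit]
    1000000

    [beam-threshold]
    0

    [stack]
    1000000

If I remember right, each of these can also be overridden on the command line without editing the config, e.g. moses -f moses.ini -beam-threshold 0.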

It may be that you have to run the decoder with phrase tables that are trained on only one sentence at a time.

I'd be interested to hear how you get on, so let me know how it goes.

On 26/10/2016 13:56, Shuoyang Ding wrote:
Hi All,

I’m trying to do syntax-based constrained decoding on the same data from which I extracted my rules, and I’m getting very low coverage (~12%). I’m using GHKM rule extraction, which in theory should be able to reconstruct the target translation even with only minimal rules.

Judging from the search graph output, the decoder seems to prune out rules with very low scores, even when such a rule is the only one that can reconstruct the original reference.

Is there a way in the current constrained decoding implementation to disable pruning? Or, at least, would it be feasible to do so?

Thanks!

Regards,
Shuoyang Ding

Ph.D. Student
Center for Language and Speech Processing
Department of Computer Science
Johns Hopkins University

Hackerman Hall 225A
3400 N. Charles St.
Baltimore, MD 21218

http://cs.jhu.edu/~sding


_______________________________________________
Moses-support mailing list
Moses-support@mit.edu
http://mailman.mit.edu/mailman/listinfo/moses-support


