Hi Hieu,

I had made changes 1, 2, and 4 before emailing you, and the coverage didn’t change much. It turns out the bottleneck is beam-threshold: the default value is 1e-5, which is a pretty tough limit for constrained decoding.
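In case the concrete config is useful, here is roughly what the relevant chunk of moses.ini looks like once all four limits from your list are lifted. This is only a sketch: the PhraseDictionaryMemory line, table name, and path below are placeholders, so keep whatever feature line your config already has and just add table-limit=0.

[feature]
# keep your existing phrase table line; the only change is table-limit=0
PhraseDictionaryMemory name=TranslationModel0 num-features=4 input-factor=0 output-factor=0 table-limit=0 path=/path/to/phrase-table

# pop (effectively) unboundedly many hypotheses per chart cell
[cube-pruning-pop-limit]
1000000

# 0 disables the score-based beam (default is 0.00001)
[beam-threshold]
0

# effectively unlimited stack size
[stack]
1000000

With these in place the decoder does essentially no score-based pruning, which is what makes the coverage numbers below possible.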
After setting that to 0, I played around a bit with the cube-pruning limit. Coverage comes out at roughly 25% to 40% depending on the value you use, but higher coverage comes with longer decoding time, as one would expect. Still, for string-to-tree constrained decoding the easiest route may be to decode with phrase tables built per sentence, since that kind of decoding is generally slower anyway. Even then, the default beam-threshold needs to be overridden to make it work properly.

Hope the info helps.

Regards,
Shuoyang Ding

Ph.D. Student
Center for Language and Speech Processing
Department of Computer Science
Johns Hopkins University

Hackerman Hall 225A
3400 N. Charles St.
Baltimore, MD 21218

http://cs.jhu.edu/~sding

> On Oct 28, 2016, at 9:27 AM, Hieu Hoang <[email protected]> wrote:
>
> good point. The decoder is set up to translate quickly, so there are a few
> pruning parameters that throw out low-scoring rules or hypotheses.
>
> These are some of the pruning parameters you'll need to change (there may be
> more):
> 1. [feature]
> PhraseDictionaryWHATEVER table-limit=0
> 2. [cube-pruning-pop-limit]
> 1000000
> 3. [beam-threshold]
> 0
> 4. [stack]
> 1000000
> Make the changes one at a time in case any of them makes decoding too slow,
> even with constrained decoding.
>
> It may be that you have to run the decoder with phrase tables that are
> trained on only one sentence at a time.
>
> I'll be interested to know how you get on, so let me know how it goes.
>
> On 26/10/2016 13:56, Shuoyang Ding wrote:
>> Hi All,
>>
>> I’m trying to do syntax-based constrained decoding on the same data from
>> which I extracted my rules, and I’m getting very low coverage (~12%). I’m
>> using GHKM rule extraction, which in theory should be able to reconstruct
>> the target translation even with only minimal rules.
>>
>> Judging from the search graph output, the decoder seems to prune out rules
>> with very low scores, even when they are the only rules that can
>> reconstruct the original reference.
>>
>> I’m curious whether there is a way in the current constrained decoding
>> implementation to disable pruning, or at least whether it would be
>> feasible to do so.
>>
>> Thanks!
>>
>> Regards,
>> Shuoyang Ding
>>
>> Ph.D. Student
>> Center for Language and Speech Processing
>> Department of Computer Science
>> Johns Hopkins University
>>
>> Hackerman Hall 225A
>> 3400 N. Charles St.
>> Baltimore, MD 21218
>>
>> http://cs.jhu.edu/~sding
