Hi Hieu,

I’ve made changes 1, 2, and 4 before emailing you, and the coverage didn’t
change much. It turns out the bottleneck is beam-threshold: the default value
is 1e-5, which is a pretty tight limit for constrained decoding.
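
Concretely, the fix is just overriding that section in moses.ini (same syntax
as in your list below):

  [beam-threshold]
  0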

After setting it to 0, I played around a bit with the cube-pruning pop limit.
Coverage comes out around 25% to 40% depending on the value, but higher
coverage comes with longer decoding time, as one would expect.
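
For reference, the knob is the [cube-pruning-pop-limit] section in moses.ini;
1000000 is the value from your list, and smaller values trade coverage for
speed:

  [cube-pruning-pop-limit]
  1000000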

Still, for string-to-tree constrained decoding, the easiest route may be to
decode with phrase tables built per sentence, since the decoding is generally
slower. Even then, the default beam-threshold needs to be overridden for it
to work properly.
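
If editing every per-sentence moses.ini gets tedious, the override can also go
on the command line; assuming the usual Moses convention that command-line
flags override moses.ini values, something like:

  moses -f moses.ini -beam-threshold 0 < source.txt > output.txt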

Hope the info helps.

Regards,
Shuoyang Ding

Ph.D. Student
Center for Language and Speech Processing
Department of Computer Science
Johns Hopkins University

Hackerman Hall 225A
3400 N. Charles St.
Baltimore, MD 21218

http://cs.jhu.edu/~sding

> On Oct 28, 2016, at 9:27 AM, Hieu Hoang <[email protected]> wrote:
> 
> Good point. The decoder is set up to translate quickly, so there are a few 
> pruning parameters that throw out low-scoring rules or hypotheses.
> 
> These are some of the pruning parameters you'll need to change (there may be 
> more):
>   1. [feature]
>       PhraseDictionaryWHATEVER table-limit=0
>   2. [cube-pruning-pop-limit]
>       1000000
>   3. [beam-threshold]
>       0
>   4. [stack]
>       1000000
> Make the changes one at a time, in case any of them makes decoding too slow, 
> even with constrained decoding.
> 
> It may be that you have to run the decoder with phrase tables trained on 
> only one sentence at a time.
> 
> I'll be interested to know how you get on, so let me know how it goes.
> 
> On 26/10/2016 13:56, Shuoyang Ding wrote:
>> Hi All,
>> 
>> I’m trying to do syntax-based constrained decoding on the same data from 
>> which I extracted my rules, and I’m getting very low coverage (~12%). I’m 
>> using GHKM rule extraction, which in theory should be able to reconstruct 
>> the target translation even with only minimal rules.
>> 
>> Judging from the search graph output, the decoder seems to prune out rules 
>> with very low scores, even when they are the only rules that can 
>> reconstruct the original reference.
>> 
>> Is there a way to disable pruning in the current constrained decoding 
>> implementation? Or, at the least, would it be feasible to do so?
>> 
>> Thanks!
>> 
>> Regards,
>> Shuoyang Ding
>> 
>> Ph.D. Student
>> Center for Language and Speech Processing
>> Department of Computer Science
>> Johns Hopkins University
>> 
>> Hackerman Hall 225A
>> 3400 N. Charles St.
>> Baltimore, MD 21218
>> 
>> http://cs.jhu.edu/~sding
>> 
> 
