Dear Philipp (and others, if that stupid Barracuda spam filter at MIT allows my 
question to the list),

I've noticed there's a flag to turn on 'proper' conditioning in phrase-extract. 
I have not carefully compared the outputs, but I guess it causes all 
occurrences of a foreign (source) phrase f to be counted, regardless of whether 
they were aligned to a target phrase in a compatible fashion.

Am I correct that P(e|f) then becomes deficient, i.e. it does not sum to 1 over 
all e for a given f (with P(not-aligned-consistently | f) being the missing 
part)?
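
To make the deficiency question concrete, here is a minimal sketch with 
invented counts. It assumes that the guess above is right, i.e. that 'proper' 
conditioning divides by all corpus occurrences of f rather than only by the 
occurrences that yielded an extracted phrase pair. The names p_standard and 
p_proper are hypothetical and are not taken from the Moses code.

```python
from collections import Counter

# Hypothetical toy counts, invented purely for illustration.
# pair_counts[(f, e)]: how often source phrase f was extracted with target e.
pair_counts = Counter({("f1", "e1"): 3, ("f1", "e2"): 1})
# total_occurrences[f]: how often f occurs in the corpus at all, including
# occurrences for which no consistently aligned target phrase exists.
total_occurrences = {"f1": 6}


def p_standard(e, f):
    """Relative frequency over extracted pairs only."""
    extracted = sum(c for (ff, _), c in pair_counts.items() if ff == f)
    return pair_counts[(f, e)] / extracted


def p_proper(e, f):
    """Relative frequency over all occurrences of f (the assumed reading)."""
    return pair_counts[(f, e)] / total_occurrences[f]


targets = ("e1", "e2")
mass_standard = sum(p_standard(e, "f1") for e in targets)
mass_proper = sum(p_proper(e, "f1") for e in targets)
missing_mass = 1.0 - mass_proper  # would be P(not-aligned-consistently | f1)

print(mass_standard, mass_proper, missing_mass)
```

Under these made-up counts the standard estimate sums to 1, while the 
'proper' estimate sums to 4/6, leaving 2/6 for the unextracted occurrences.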

Do properly-conditioned phrase tables indeed work better (in terms of BLEU or, 
e.g., the number of iterations of the MERT loop)?


And one additional question: when extracting phrases, phrase-extract actually 
extracts all phrases that *are not incompatible* with the alignment. I'm 
thinking about a different method: extracting only phrases that *are 'strictly' 
compatible*, which means I would extract:

a=A
c=C
abc=ABC

but not

ab=AB
bc=BC

from:

     a b c
A    *
B
C        *

Any experience with/intuition about that? Surely, there would be far fewer 
phrases extracted...
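
The two criteria can be compared on the example above with a small sketch. 
Both definitions here are assumptions: "loose" is the usual consistency check 
(at least one alignment point inside the pair, none crossing its boundary), 
and "strict" is one possible reading of the proposal, namely that the first 
and last word on each side must additionally be aligned. The function extract 
is hypothetical and does not mirror the phrase-extract implementation.

```python
def extract(src, tgt, links, strict=False):
    """Return the set of (source phrase, target phrase) pairs.

    links is a set of (source index, target index) alignment points.
    """
    aligned_src = {i for i, _ in links}
    aligned_tgt = {j for _, j in links}
    pairs = set()
    for i1 in range(len(src)):
        for i2 in range(i1, len(src)):
            for j1 in range(len(tgt)):
                for j2 in range(j1, len(tgt)):
                    # Alignment points with both ends inside the candidate pair.
                    inside = [(i, j) for i, j in links
                              if i1 <= i <= i2 and j1 <= j <= j2]
                    # Points with exactly one end inside violate consistency.
                    crossing = [(i, j) for i, j in links
                                if (i1 <= i <= i2) != (j1 <= j <= j2)]
                    if not inside or crossing:
                        continue
                    # Assumed 'strict' reading: boundary words must be aligned.
                    if strict and not ({i1, i2} <= aligned_src
                                       and {j1, j2} <= aligned_tgt):
                        continue
                    pairs.add(("".join(src[i1:i2 + 1]),
                               "".join(tgt[j1:j2 + 1])))
    return pairs


src, tgt = list("abc"), list("ABC")
links = {(0, 0), (2, 2)}  # a-A and c-C; b and B are unaligned

loose = extract(src, tgt, links)
strict = extract(src, tgt, links, strict=True)
print(sorted(loose))
print(sorted(strict))
```

Under these assumptions the strict variant keeps exactly a=A, c=C and abc=ABC, 
while the loose variant also yields ab=AB, bc=BC and the other pairs that 
attach the unaligned b or B to a neighbour (nine pairs in total).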

Thanks,
   Ondrej.

-- 
Ondrej Bojar (mailto:[EMAIL PROTECTED] / [EMAIL PROTECTED])
http://www.cuni.cz/~obo
