Hi Stefan,
On 07/11/2011 18:16, Stefan Dumitrescu wrote:
> Hi all,
> I am rather new to the MT domain, and I have a few theoretical questions:
> 1. In the manual, at the factored training, there is the following
> example (from language de to en):
> --translation-factors 0-0 \
> --generation-factors 0-2 \
> My question: when translating, what happens after the translation of
> surface form to surface form (T0)? How does the generation table of
> conditional probabilities p(surface_en|pos_en) affect the previous
> translation? I mean, is the generation table used during the hypothesis
> expansion of the 0-0 translation, or is the G0 generation table only
> used somehow after the translation results (after the beam search, etc.)?
With this translation + generation model, the model probability is
p(target 0,2 | source 0). A cartesian product of the candidates from the
translation and generation steps is computed BEFORE hypothesis expansion.
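
To make that concrete, here's a toy sketch in Python (made-up words and
probabilities, not the actual Moses code or data structures) of how one
fully-factored translation option is built before any search happens:

# Toy sketch: candidates from the T0-0 step are combined with
# candidates from the G0-2 step, and each combination is scored
# p(surface_en | surface_de) * p(pos_en | surface_en) BEFORE
# hypothesis expansion sees it.

# hypothetical candidate tables for one source word "haus"
translation_candidates = {          # T0-0: p(surface_en | surface_de)
    "house": 0.7,
    "home": 0.2,
    "building": 0.1,
}
generation_candidates = {           # G0-2: p(pos_en | surface_en)
    "house": {"NN": 0.9, "VB": 0.1},
    "home": {"NN": 0.8, "RB": 0.2},
    "building": {"NN": 0.6, "VBG": 0.4},
}

# cartesian product of both steps -> full translation options
options = []
for surface, p_t in translation_candidates.items():
    for pos, p_g in generation_candidates[surface].items():
        options.append(((surface, pos), p_t * p_g))

# these fully-factored options are what hypothesis expansion works with
for target, score in sorted(options, key=lambda x: -x[1]):
    print(target, score)
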
I think there's some explanation of it here:
http://homepages.inf.ed.ac.uk/pkoehn/publications/emnlp2007-factored.pdf
For a more long-winded description, try chapter 2 of my thesis:
http://www.statmt.org/~s0565741/ddd.pdf
> I am confused because the previous example in the manual was just
> --translation-factors 0-0,2, where I kind of understand that during
> hypothesis expansion, a hypothesis chain that is less probable from the
> factor 2 (POS) point of view will get a lower score (because of a lower
> probability in the POS LM). But how does the process work when adding
> a generation table?
> I'm trying to understand why I should choose to go with a t0-0 and
> then a g0-2 instead of going directly with a t0-0,2 (for example).
Good question ;) IMO, you should only do that when there are extreme
morphological differences between the two languages and t0-0,2 has
data-sparsity problems. Even then, it's not a good idea to depend on it
solely.
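
A toy illustration of the sparsity point (hypothetical tables, invented
words): the joint t0-0,2 table can only propose (surface, POS) pairs it
has actually seen for a given source word, whereas t0-0 + g0-2 can
recombine a seen surface translation with any POS the generation table
knows for that surface form:

# The joint table t0-0,2 only produces (surface, pos) pairs observed
# aligned to the source word in training; the factored route
# t0-0 + g0-2 combines a seen surface translation with any POS the
# generation table has for it. (Hypothetical data.)

joint_table = {
    ("arbeiten", ("work", "VB")),   # only pairing seen in training
}
surface_table = {"arbeiten": ["work"]}      # t0-0
generation_table = {"work": ["VB", "NN"]}   # g0-2

def joint_options(src):
    return [tgt for (s, tgt) in joint_table if s == src]

def factored_options(src):
    return [(surf, pos)
            for surf in surface_table.get(src, [])
            for pos in generation_table.get(surf, [])]

print(joint_options("arbeiten"))     # [('work', 'VB')] only
print(factored_options("arbeiten"))  # [('work', 'VB'), ('work', 'NN')]
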
> Another example: for a chain of t1-1 g1-2 t3-2 g1,2-0, how do
> the different translations and generations interact? Are they
> sequential, parallel? Is there some resource/book/article where they
> are explained in more detail?
You'll get into big trouble doing that. The steps in a decoding path are
applied in sequence, and the cartesian product will blow up the decoder
during decoding unless you severely prune it. Then you'll have issues
with search errors.
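
Back-of-the-envelope, with purely made-up per-step candidate counts:

# How the cartesian product grows across chained mapping steps
# (hypothetical candidate counts, not measured on any real model).
from functools import reduce

candidates_per_step = {
    "t1-1":   20,  # translation candidates per source phrase
    "g1-2":    5,  # POS candidates per target lemma
    "t3-2":   20,
    "g1,2-0": 10,
}

total = reduce(lambda a, b: a * b, candidates_per_step.values(), 1)
print("options per source span before pruning:", total)  # 20000

# Pruning (e.g. moses.ini's ttable-limit caps the candidates kept
# per table) keeps this tractable, but aggressive pruning is exactly
# where the search errors come from.
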
> 2. For multiple decoding paths like:
> --translation-factors 0-0,2+1-0,2 \
> --decoding-steps t0:t1
> how is the best translation chosen? If we consider that both the
> surface forms and the lemmas are found in the phrase tables (so no
> backoff is necessary), and thus each decoding path outputs a valid
> answer, how is the best answer chosen? Is it always the translation
> from the first table?
Hypothesis expansion is performed with translation rules from both paths,
and the best hypothesis is simply the one with the highest (weighted)
probability.
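
In other words, something like this rough sketch (invented scores, and
not how Moses actually stores its translation options): candidates from
both paths go into one pool, and the search scores them uniformly, with
no built-in preference for the first table:

# Options from both decoding paths are pooled; the search compares
# them on the weighted model score alone. (Hypothetical log scores
# for the source word "häuser".)

path_t0 = {"houses": -1.2, "homes": -1.9}      # surface-based path
path_t1 = {"houses": -1.5, "dwellings": -2.3}  # lemma-based path

pooled = []
for path_name, table in [("t0", path_t0), ("t1", path_t1)]:
    for target, score in table.items():
        pooled.append((score, target, path_name))

best = max(pooled)   # tuple comparison: highest score wins
print(best)          # (-1.2, 'houses', 't0')
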
> Thank you for taking the time to read this far :) and thanks again for
> answering, because I don't really know where else to ask this.
> Stefan
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support