On Tue, Oct 26, 2010 at 9:43 AM,  <sa...@kortec.nl> wrote:
> Hi,
>
> I have a question about PBSMT estimation. If I understand it correctly
> this is done in the following manner:
>
> - first IBM alignments in both directions
> - then an alignment heuristic such as grow-diag-final
> - from this we create all possible phrase pairs, with some restrictions
> (nothing going outward etc.)

This is by far the most common approach; for instance, it is used in
Moses.  A variety of other approaches have also been tried.

> 1) How are the phrase pairs counted for estimation? Will PBSMT include
> all segmentations that can be created under these restrictions and count
> the phrase pairs in each segmentation (uniform over all segmentations)?

There is no global inference in this method; each time a phrase pair is
observed to be consistent with the word alignment of a sentence pair, it
is counted once, regardless of how many segmentations of that sentence
pair it occurs in.
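
To make the counting concrete, here is a minimal sketch of how such
counts could be turned into a relative-frequency estimate for one
translation direction. It is an illustration only, not Moses code: the
function name and the toy phrase pairs are hypothetical, and it assumes
the list holds one entry per extraction event.

```python
from collections import Counter


def relative_frequencies(extracted):
    """Estimate p(target | source) by relative frequency.

    `extracted` holds one (source_phrase, target_phrase) entry per
    extraction event, i.e. per consistent phrase pair observed in a
    sentence pair. No segmentations are enumerated anywhere.
    """
    extracted = list(extracted)  # allow any iterable; we scan it twice
    pair_counts = Counter(extracted)
    source_counts = Counter(src for src, _ in extracted)
    return {
        (src, tgt): count / source_counts[src]
        for (src, tgt), count in pair_counts.items()
    }


# Hypothetical extraction events collected over a tiny corpus.
events = [
    ("das haus", "the house"),
    ("das haus", "the house"),
    ("das haus", "the building"),
    ("haus", "house"),
]
probs = relative_frequencies(events)
```

With these toy events, "das haus" was extracted three times, twice with
"the house", so that pair gets two thirds of the probability mass. The
other direction can be estimated the same way by normalizing over target
phrase counts instead.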

> 2) If this is the case, how does it deal with bottleneck sentences which
> have a lot of null alignments or a lot of smallest possible phrases. For
> instance in a long bitext sentence where the alignment will map the first
> source word to the first target word, then a null, then the third source
> to the third target word etc.
> In this example the search over all possible segmentations is very large.
> How does PBSMT deal with this?

Since there is no global inference, it isn't necessary to enumerate
segmentations.  The number of source phrases is at most quadratic in
the sentence length, and each one aligns to zero, one, or possibly a
small number of target phrases (depending on the treatment of null
alignments). You can easily enumerate them all and simply extract the
corresponding target phrases.  This is what Moses does.
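
The enumeration described above can be sketched roughly as follows. This
is a simplified illustration under my own assumptions, not the actual
Moses extractor: the function name, the `max_len` limit, and the
`extend_unaligned` switch (one possible treatment of null alignments)
are all hypothetical. Spans are 0-based and inclusive.

```python
def extract_phrase_pairs(src_len, tgt_len, alignment,
                         max_len=7, extend_unaligned=True):
    """Return all ((i1, i2), (j1, j2)) span pairs consistent with `alignment`.

    `alignment` is a set of (source_index, target_index) links.
    """
    aligned_tgt = {j for _, j in alignment}
    pairs = set()
    # At most a quadratic number of source spans (fewer with max_len).
    for i1 in range(src_len):
        for i2 in range(i1, min(i1 + max_len, src_len)):
            # Target positions linked to any source word in [i1, i2].
            tgt = [j for (i, j) in alignment if i1 <= i <= i2]
            if not tgt:
                continue  # wholly unaligned source span: zero phrase pairs
            j1, j2 = min(tgt), max(tgt)
            # Consistency: nothing inside [j1, j2] may link outside [i1, i2].
            if any(j1 <= j <= j2 and not (i1 <= i <= i2)
                   for (i, j) in alignment):
                continue
            # Emit the tight target span, then optionally grow it over
            # unaligned target words on either boundary.
            lo = j1
            while True:
                hi = j2
                while True:
                    if hi - lo + 1 <= max_len:
                        pairs.add(((i1, i2), (lo, hi)))
                    hi += 1
                    if (not extend_unaligned or hi >= tgt_len
                            or hi in aligned_tgt):
                        break
                lo -= 1
                if not extend_unaligned or lo < 0 or lo in aligned_tgt:
                    break
    return pairs


# The pattern from the question: word 1 -> word 1, a null, word 3 -> word 3.
links = {(0, 0), (2, 2)}
with_nulls = extract_phrase_pairs(3, 3, links)
tight_only = extract_phrase_pairs(3, 3, links, extend_unaligned=False)
```

Each source span is visited once and yields only a handful of target
spans, so the work stays polynomial in the sentence length no matter how
many segmentations the sentence pair admits.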

For the global inference view, see this paper, which influenced a
great deal of subsequent research:
http://aclweb.org/anthology-new/W/W02/W02-1018.pdf

Cheers
Adam

