I did it, following Barry's suggestion.
I tested on a very small corpus of 2 sentence pairs and generated 800
bilingual phrases :-D
Thanks to you, Barry and Prof. Marcello Federico.
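
For anyone who runs into the same gap, here is a minimal sketch of the kind of difference unaligned words can make. It is not Moses code and not my exact implementation: the function name, the `extend_unaligned` switch, the toy sentence pair and the length cap of 7 are all assumptions made up for illustration. It applies the consistency rule quoted below and then, optionally, also grows each target span over neighbouring unaligned words.

```python
def extract_phrases(src, tgt, alignment, max_len=7, extend_unaligned=True):
    """Return phrase pairs as spans (s_start, s_end, t_start, t_end), inclusive.

    alignment: set of (src_index, tgt_index) pairs, 0-based.
    max_len and extend_unaligned are illustrative parameters (assumptions).
    """
    tgt_aligned = {j for _, j in alignment}
    pairs = set()
    for s_start in range(len(src)):
        for s_end in range(s_start, min(s_start + max_len, len(src))):
            # Target positions linked to any word in the source span.
            linked = [j for i, j in alignment if s_start <= i <= s_end]
            if not linked:
                continue  # a span with no alignment points yields nothing
            t_min, t_max = min(linked), max(linked)
            # Consistency: no target word in [t_min, t_max] may be aligned
            # to a source word outside [s_start, s_end].
            if any(t_min <= j <= t_max and not (s_start <= i <= s_end)
                   for i, j in alignment):
                continue
            # Optionally grow the target span over unaligned neighbours.
            t_s = t_min
            while True:
                t_e = t_max
                while True:
                    if t_e - t_s < max_len:
                        pairs.add((s_start, s_end, t_s, t_e))
                    t_e += 1
                    if (not extend_unaligned or t_e >= len(tgt)
                            or t_e in tgt_aligned):
                        break
                t_s -= 1
                if not extend_unaligned or t_s < 0 or t_s in tgt_aligned:
                    break
    return pairs


# Toy example (made up): "ja" on the target side is left unaligned.
src = ["that", "is", "good"]
tgt = ["das", "ist", "ja", "gut"]
alignment = {(0, 0), (1, 1), (2, 3)}

with_ext = extract_phrases(src, tgt, alignment, extend_unaligned=True)
without_ext = extract_phrases(src, tgt, alignment, extend_unaligned=False)
print(len(with_ext), len(without_ext))
for s0, s1, t0, t1 in sorted(with_ext - without_ext):
    print(" ".join(src[s0:s1 + 1]), "|||", " ".join(tgt[t0:t1 + 1]))
```

On this toy pair the extension adds three phrase pairs (9 versus 6), all of them attaching the unaligned "ja" to a neighbouring phrase. On real data with many unaligned words the effect can be much larger, which is one plausible reason two extractors that both follow the quoted rule can disagree on the total.
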
On Thu, Jan 31, 2013 at 4:08 AM, Barry Haddow <[email protected]> wrote:
> Hi Cuong
>
> If you pass the aligned sentences through your phrase extraction and
> through Moses phrase extraction, one at a time, then you should be able to
> see where the difference is. As Marcello said, it could be in the handling
> of unaligned words.
>
> cheers - Barry
>
>
> On 30/01/13 16:39, Cuong Hoang wrote:
>
> Hi all,
> I wrote a phrase extractor that follows the simple rule from Koehn et al.,
> 2003:
>
> "We collect all aligned phrase pairs that are consistent with the word
> alignment: The words in a legal phrase pair are only aligned to each other,
> and not to words outside."
>
> I tested it on a fairly large bilingual corpus containing 500,000 sentence
> pairs and obtained 33 million phrase pairs.
> However, when I use Moses to extract phrases, I obtain around 90 million
> pairs.
>
> Does Moses apply some additional rules, or is there something wrong with my
> implementation?
>
> Thanks,
> C. Hoang
> --
> Best Regards,
> C. Hoang
>
> {Mimosa, SMT}@Addict
>
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>
>
--
Best Regards,
C. Hoang
{Mimosa, SMT}@Addict