Hi,

I ran this corpus through GIZA++ and did get the correct result:

% head training/corpus.11.*
==> training/corpus.11.en <==
sandalo camufluge
sandalo daino
sandalo madras
sandalo vernice

==> training/corpus.11.fr <==
camufluge sandalo
daino sandalo
madras sandalo
vernice sandalo

% zcat training/giz*11/fr-en.A3.final.gz
# Sentence pair (1) source length 2 target length 2 alignment score : 0.536702
camufluge sandalo
NULL ({ }) sandalo ({ 2 }) camufluge ({ 1 })
# Sentence pair (2) source length 2 target length 2 alignment score : 0.536702
daino sandalo
NULL ({ }) sandalo ({ 2 }) daino ({ 1 })
# Sentence pair (3) source length 2 target length 2 alignment score : 0.536702
madras sandalo
NULL ({ }) sandalo ({ 2 }) madras ({ 1 })
# Sentence pair (4) source length 2 target length 2 alignment score : 0.536702
vernice sandalo
NULL ({ }) sandalo ({ 2 }) vernice ({ 1 })


So, something must have gone wrong on your end. Are you sure that you
are preparing the data in the correct format?
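
To read the alignment lines in the output above, each word after NULL is
followed by the 1-based positions of the words it is aligned to in the
plain sentence on the line before it. The sketch below is a minimal
illustration of that reading and is not part of GIZA++ or Moses; the
function name is hypothetical, and it assumes the line has exactly the
"word ({ positions })" layout shown above.

```python
import re

# One "word ({ i j ... })" group on the alignment line of an A3 record.
TOKEN_RE = re.compile(r"(\S+) \(\{([\d ]*)\}\)")


def parse_a3_alignment_line(line):
    """Return 0-based (annotated_word_idx, plain_word_idx) pairs.

    Hypothetical helper: assumes the layout shown in the output above,
    with an optional leading NULL group listing unaligned words.
    """
    groups = TOKEN_RE.findall(line)
    # Drop the NULL group; positions listed there are unaligned words.
    if groups and groups[0][0] == "NULL":
        groups = groups[1:]
    pairs = []
    for i, (_word, positions) in enumerate(groups):
        for pos in positions.split():
            pairs.append((i, int(pos) - 1))  # convert 1-based to 0-based
    return pairs


if __name__ == "__main__":
    line = "NULL ({ }) sandalo ({ 2 }) camufluge ({ 1 })"
    print(" ".join(f"{a}-{b}" for a, b in parse_a3_alignment_line(line)))
```

For the first sentence pair this gives the pairs (0, 1) and (1, 0), that
is, a crossed alignment rather than the monotone 0-0 1-1.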


-phi

On Fri, Oct 9, 2015 at 7:02 AM, gang tang <[email protected]> wrote:

> Dear All,
>
> Since there are no answers to my questions, I assume that there are no
> easy fixes to the alignment problem. However, just out of curiosity,
> shouldn't there be alignment tools that take lexical considerations into
> account while aligning a parallel corpus? I mean, alignment tools that look
> up translations for specific words in a domain-specific dictionary during
> alignment? Is there any reason that this is not an interesting area to
> explore?
>
> Best Regards,
>
> Gang
>
>
> On 2015-09-25 19:34:13, "gang tang" <[email protected]> wrote:
>
> Dear all,
>
> I have a problem with alignment. I'd greatly appreciate it if anyone could
> help solve my issue.
>
> I have the following corpus:
>
> "sandalo camufluge" -> "camufluge sandal"
> "sandalo daino" -> "daino sandal"
> "sandalo madras" -> "madras sandal"
> "sandalo vernice" -> "vernice sandal"
>
> The alignment software I used was GIZA++, and the alignment result was
> always 0-0 1-1, which meant that "sandalo" wasn't aligned with "sandal".
> After training, phrase.translation.table always had entries such as
> "sandalo" -> "camufluge", "sandalo" -> "daino", "sandalo" -> "madras", and
> "sandalo" -> "vernice", but no "sandalo" -> "sandal". Is there any way this
> problem could be solved? Could I add more data to align "sandalo" with
> "sandal" and translate "sandalo" to "sandal"? How should I tune the system?
>
> Thanks for your attention,
>
> Gang
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>