Dear Gang, I don't know of any tool that does word alignment using a dictionary.
However, Hunalign does sentence alignment with the help of dictionaries.
I have done some promising experiments using dictionaries to clean
sentence-aligned corpora. I found that:
- dictionaries with domain-specific vocabulary are very beneficial
- bad dictionaries, e.g. ones created with GIZA++, are still somewhat beneficial
- dictionaries are best used to prevent suspicious sentence pairs from being
  unduly removed. Using them the other way around, to decide which pairs
  to remove, may discard a lot of good pairs with uncommon words.
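
The third point can be sketched roughly as follows. This is only an illustration under stated assumptions, not the setup used in the experiments: the length-ratio test, the thresholds, the toy lexicon and all function names are hypothetical. A pair is dropped only when it is already suspicious and the dictionary cannot vouch for it; pairs that are not suspicious are never checked against the dictionary.

```python
from typing import Dict, List, Set, Tuple


def dictionary_coverage(src: str, tgt: str, lexicon: Dict[str, Set[str]]) -> float:
    """Fraction of source tokens that have a dictionary translation in the target."""
    src_tokens = src.lower().split()
    tgt_tokens = set(tgt.lower().split())
    if not src_tokens:
        return 0.0
    hits = sum(1 for tok in src_tokens if lexicon.get(tok, set()) & tgt_tokens)
    return hits / len(src_tokens)


def is_suspicious(src: str, tgt: str, max_ratio: float = 2.0) -> bool:
    """Hypothetical first-pass filter: flag pairs with a large token-length ratio."""
    a, b = len(src.split()), len(tgt.split())
    if a == 0 or b == 0:
        return True
    return max(a, b) / min(a, b) > max_ratio


def clean_corpus(
    pairs: List[Tuple[str, str]],
    lexicon: Dict[str, Set[str]],
    min_coverage: float = 0.5,
) -> List[Tuple[str, str]]:
    """Drop a pair only if it is suspicious AND the dictionary does not support it."""
    kept = []
    for src, tgt in pairs:
        if is_suspicious(src, tgt) and dictionary_coverage(src, tgt, lexicon) < min_coverage:
            continue  # suspicious and not rescued by the dictionary
        kept.append((src, tgt))
    return kept


# Toy data (assumed for illustration only).
lexicon = {"sandalo": {"sandal"}}
pairs = [
    ("sandalo daino", "daino sandal"),           # not suspicious: kept
    ("sandalo", "sandal made of leather"),       # suspicious, rescued by the dictionary
    ("sandalo", "click here to subscribe now"),  # suspicious, no support: removed
    ("zoccolo madras", "madras clog"),           # uncommon words, not suspicious: kept
]
kept = clean_corpus(pairs, lexicon)
```

Note that the last pair survives even though the dictionary knows none of its words, which is the reason for using the dictionary only to rescue pairs.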

Yours, Per Tunedal


On Fri, Oct 9, 2015, at 13:02, gang tang wrote:
> Dear All,
>
> Since there are no answers to my questions, I assume that there are no
> easy fixes to the alignment problem. However, just out of curiosity,
> shouldn't there be alignment tools that take lexical considerations
> into account while aligning a parallel corpus? I mean, alignment tools
> that look up translations for specific words in a domain-specific
> dictionary during alignment? Could there be any reason that it is not
> an interesting area to explore?
>
> Best Regards, Gang
>
>
>
> On 2015-09-25 19:34:13, "gang tang" <[email protected]> wrote:
>> Dear all,
>>
>> I have a problem with alignment. I'd greatly appreciate it if anyone
>> could help solve my issue.
>>
>> I have the following corpus:
>>
>> "sandalo camufluge" -> "camufluge sandal"
>> "sandalo daino" -> "daino sandal"
>> "sandalo madras" -> "madras sandal"
>> "sandalo vernice" -> "vernice sandal"
>>
>> The alignment software I used was GIZA++, and the alignment result
>> was always 0-0 1-1, which meant that "sandalo" was never aligned with
>> "sandal". After training, phrase.translation.table always had
>> entries such as "sandalo" -> "camufluge", "sandalo" -> "daino",
>> "sandalo" -> "madras", and "sandalo" -> "vernice", but no
>> "sandalo" -> "sandal". Is there any way this problem could be solved?
>> Could I add more data to align "sandalo" with "sandal" and translate
>> "sandalo" to "sandal"? How should I tune the system?
>>
>> Thanks for your attention,
>>
>> Gang
>>
>
> _________________________________________________
> Moses-support mailing list [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support


