I am still waiting for the new results.

Ergun
On Fri, Aug 24, 2018 at 5:53 PM Bereketab Birhnu <[email protected]> wrote:

> Thanks
>
> On Friday, August 24, 2018, Ergun Bicici <[email protected]> wrote:
>
>> ok.
>>
>> On Fri, Aug 24, 2018 at 5:31 PM Hieu Hoang <[email protected]> wrote:
>>
>>> could you run with alignments, but WITHOUT -unknown-word-prefix UNK.
>>>
>>> alignments shouldn't change the translation but the OOV prefix may do
>>>
>>> Hieu Hoang
>>> http://statmt.org/hieu
>>>
>>> On Fri, 24 Aug 2018 at 15:29, Ergun Bicici <[email protected]> wrote:
>>>
>>>> ok, thank you. I'll upload and send you a link.
>>>>
>>>> On Fri, Aug 24, 2018 at 5:27 PM Hieu Hoang <[email protected]> wrote:
>>>>
>>>>> that would be a bug.
>>>>>
>>>>> could you please make the model and input files available for
>>>>> download. I'll check it out
>>>>>
>>>>> Hieu Hoang
>>>>> http://statmt.org/hieu
>>>>>
>>>>> On Fri, 24 Aug 2018 at 15:15, Ergun Bicici <[email protected]> wrote:
>>>>>
>>>>>> only the evaluation decoding steps are repeated, which are steps 10,
>>>>>> 9, and 7 among the following steps in the EMS output:
>>>>>>
>>>>>> 48 TRAINING:consolidate -> re-using (1)
>>>>>> 47 TRAINING:prepare-data -> re-using (1)
>>>>>> 46 TRAINING:run-giza -> re-using (1)
>>>>>> 45 TRAINING:run-giza-inverse -> re-using (1)
>>>>>> 44 TRAINING:symmetrize-giza -> re-using (1)
>>>>>> 43 TRAINING:build-lex-trans -> re-using (1)
>>>>>> 40 TRAINING:build-osm -> re-using (1)
>>>>>> 39 TRAINING:extract-phrases -> re-using (1)
>>>>>> 38 TRAINING:build-reordering -> re-using (1)
>>>>>> 37 TRAINING:build-ttable -> re-using (1)
>>>>>> 34 TRAINING:create-config -> re-using (1)
>>>>>> 28 TUNING:truecase-input -> re-using (1)
>>>>>> 24 TUNING:truecase-reference -> re-using (1)
>>>>>> 21 TUNING:filter -> re-using (1)
>>>>>> 20 TUNING:apply-filter -> re-using (1)
>>>>>> 19 TUNING:tune -> re-using (1)
>>>>>> 18 TUNING:apply-weights -> re-using (1)
>>>>>> 15 EVALUATION:test:truecase-input -> re-using (1)
>>>>>> 12 EVALUATION:test:filter -> re-using (1)
>>>>>> 11 EVALUATION:test:apply-filter -> re-using (1)
>>>>>> 10 EVALUATION:test:decode -> run
>>>>>> 9 EVALUATION:test:remove-markup -> run
>>>>>> 7 EVALUATION:test:detruecase-output -> run
>>>>>> 3 EVALUATION:test:multi-bleu-c -> run
>>>>>> 2 EVALUATION:test:analysis-coverage -> re-using (1)
>>>>>> 1 EVALUATION:test:analysis-precision -> run
>>>>>>
>>>>>> On Fri, Aug 24, 2018 at 4:39 PM Hieu Hoang <[email protected]> wrote:
>>>>>>
>>>>>>> are you rerunning tuning for each case? Or are you using exactly
>>>>>>> the same moses.ini file for the with- and without-alignment
>>>>>>> experiments?
>>>>>>>
>>>>>>> Hieu Hoang
>>>>>>> http://statmt.org/hieu
>>>>>>>
>>>>>>> On Fri, 24 Aug 2018 at 14:34, Ergun Bicici <[email protected]> wrote:
>>>>>>>
>>>>>>>> Dear Moses maintainers,
>>>>>>>>
>>>>>>>> I discovered that the translations obtained differ when the
>>>>>>>> alignment flags (--mark-unknown --unknown-word-prefix UNK
>>>>>>>> --print-alignment-inf) are used. A comparison table is attached
>>>>>>>> (en-ru and ru-en are being recomputed). We expect them to be the
>>>>>>>> same, since the alignment flags only print additional information
>>>>>>>> and are not supposed to alter decoding. In both cases, the same
>>>>>>>> EMS system was re-run, with and without the alignment information
>>>>>>>> flags.
>>>>>>>>
>>>>>>>> - Average of the absolute difference is 0.0094 BLEU (about 1 BLEU
>>>>>>>> point).
>>>>>>>> - Average of the difference is 0.0051 BLEU (about 0.5 BLEU
>>>>>>>> points; results are better with the alignment flags).
>>>>>>>>
>>>>>>>> /opt/Programs/SMT/moses/mosesdecoder/bin/moses --version
>>>>>>>> Moses code version (git tag or commit hash):
>>>>>>>> mmt-mvp-v0.12.1-2775-g65c75ff07-dirty
>>>>>>>> Libraries used:
>>>>>>>> Boost version 1.62.0
>>>>>>>>
>>>>>>>> git status
>>>>>>>> On branch RELEASE-4.0
>>>>>>>> Your branch is up to date with 'origin/RELEASE-4.0'.
>>>>>>>>
>>>>>>>> Note: Using alignment information to recase tokens was tried in
>>>>>>>> [1] for en-fi and en-tr, with reported positive results. We tried
>>>>>>>> this method in all translation directions we considered and, as
>>>>>>>> can be seen in the align row, this only improves the performance
>>>>>>>> for tr-en and en-tr, and for tr-en Moses provides better
>>>>>>>> translations without the alignment flags.
>>>>>>>>
>>>>>>>> [1] The JHU Machine Translation Systems for WMT 2016
>>>>>>>> Shuoyang Ding, Kevin Duh, Huda Khayrallah, Philipp Koehn and
>>>>>>>> Matt Post
>>>>>>>> http://www.statmt.org/wmt16/pdf/W16-2310.pdf
>>>>>>>>
>>>>>>>> Best Regards,
>>>>>>>> Ergun
>>>>>>>>
>>>>>>>> Ergun Biçici
>>>>>>>> http://bicici.github.com/ <http://ergunbicici.blogspot.com/>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Moses-support mailing list
>>>>>>>> [email protected]
>>>>>>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>>>>
>>>>>> --
>>>>>> Regards,
>>>>>> Ergun
>>>>
>>>> --
>>>> Regards,
>>>> Ergun
>>
>> --
>> Regards,
>> Ergun

--
Regards,
Ergun
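For reference, the two summary figures quoted in the report (average absolute difference vs. average signed difference of BLEU across translation directions) can be sketched as below. The per-direction scores here are made-up placeholders for illustration, not the actual results from the attached comparison table:

```python
# Sketch: summarizing per-direction BLEU differences between runs
# with and without the alignment flags. Scores are hypothetical.
bleu_with = {"en-de": 0.2110, "de-en": 0.2735, "en-tr": 0.1322}
bleu_without = {"en-de": 0.2052, "de-en": 0.2801, "en-tr": 0.1250}

# Signed difference per direction (positive = better with the flags).
diffs = [bleu_with[d] - bleu_without[d] for d in bleu_with]

# "Average of the absolute difference" vs. "average of the difference":
avg_abs = sum(abs(x) for x in diffs) / len(diffs)
avg_signed = sum(diffs) / len(diffs)

print(f"average |diff| = {avg_abs:.4f} BLEU")
print(f"average diff   = {avg_signed:+.4f} BLEU")
```

A nonzero absolute average with a smaller signed average, as in the report, indicates the flags shift results in both directions rather than uniformly helping.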
_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
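One way to check whether the two runs differ only in markup, rather than in the translations themselves, would be to strip the extra annotation before comparing outputs. The sketch below assumes, purely for illustration, that unknown words carry a `UNK` prefix and that an alignment field is appended after ` ||| `; the actual output format depends on the Moses options used, so adjust the stripping to match your own decoder output:

```python
import re

def strip_annotations(line: str) -> str:
    """Remove an appended ' ||| ...' alignment field and 'UNK' token
    prefixes so annotated and plain decoder output can be compared.
    The annotation format here is an assumption, not Moses's exact one."""
    line = line.split(" ||| ")[0]            # drop trailing alignment field, if any
    return re.sub(r"\bUNK(?=\S)", "", line)  # drop UNK prefix glued to a token

plain = "das ist ein Foo"
annotated = "das ist ein UNKFoo ||| 0-0 1-1 2-2 3-3"
print(strip_annotations(annotated) == plain)
```

If the stripped outputs still differ, the flags are genuinely changing decoding, which is the bug being reported.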
