fyi

---------- Forwarded message ----------
From: Bill_Lang(Gmail) <[email protected]>
Date: Mon, Jan 9, 2012 at 10:20 AM
Subject: Re: [Moses-support] Weird BLEUS of Moses Hierarchical
To: Hieu Hoang <[email protected]>


Hi Hieu Hoang,
     Thanks for your interest. I have been very busy with some reports these
days, so I am sorry for sending you the BLEU scores so late.

My newest results are as follows. Moses-Chart is better than Moses-Phrase.

FBIS, tuned on NIST02, mteval-v12, case-insensitive
NIST02 0.3500
NIST03 0.3131
NIST04 0.3389
NIST05 0.2993

FBIS-hi, tuned on NIST02, mteval-v12, case-insensitive
NIST02 0.3663
NIST03 0.3265
NIST04 0.3657
NIST05 0.3168

2012/1/9 Hieu Hoang <[email protected]>

> good to know it's been sorted. What are your scores for phrase-based and
> hierarchical, out of interest?
>
>
> On Mon, Jan 9, 2012 at 8:02 AM, Bill_Lang(Gmail) <[email protected]> wrote:
>
>> Dear Holger,
>>        Thanks for your kind help. The weird BLEUs could be caused by many
>> factors; in my case, I have located the reason. It was caused by my
>> preprocessing scripts for the NIST test data. When I use the original NIST
>> test data, everything goes well now.
>>
>> Thanks.
>> -Lang Jun
>>
>> 2012/1/9 Holger Schwenk <[email protected]>
>>
>>> On 12/29/2011 10:57 AM, Bill_Lang(Gmail) wrote:
>>> > Hi Moses Friends,
>>> >     These days I have been running phrase-based and hierarchical
>>> > Moses. My training corpus is FBIS (240k sentence pairs) for
>>> > Chinese-to-English translation. My Moses version was updated on Dec 19,
>>> > 2011. After training, I used NIST02, NIST03, and NIST05 for tuning,
>>> > respectively, and got the weird BLEUs below.
>>> >
>>> > Phrase-Based Tuning on NIST02: NIST02 0.3176, NIST03 0.2827, NIST05
>>> > 0.2761
>>> > Phrase-Based Tuning on NIST03: NIST02 0.3141, NIST03 0.2861, NIST05
>>> > 0.2746
>>> > Phrase-Based Tuning on NIST05: NIST02 0.3109, NIST03 0.2831, NIST05
>>> > 0.2822
>>> >
>>> > Hierarchical Tuning on NIST02: NIST02 0.3403, NIST03 0.1620, NIST05
>>> > 0.1577
>>> > Hierarchical Tuning on NIST03: NIST02 0.3259, NIST03 0.1732, NIST05
>>> > 0.1669
>>> > Hierarchical Tuning on NIST05: NIST02 0.3286, NIST03 0.1689, NIST05
>>> > 0.1678
>>> >
>>> > The phrase-based training, tuning, and testing results all look
>>> > normal. For the hierarchical system, NIST02 also looks normal, but
>>> > NIST03 and NIST05 are very strange. Strangest of all, even when tuning
>>> > on NIST03 or NIST05, the NIST02 BLEU is still normal.
>>>
>>> If I understand your numbers correctly, your hiero system gets good
>>> performance on NIST02, but not on the other data sets.
>>>
>>> In the past, I had some problems with filtering the rule table on the
>>> test data: the script combine_factors.pl crashed when there were
>>> multiple spaces. The result was that many rules were missing and the
>>> output was badly translated.
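
A cheap safeguard against that failure mode is to scan the filtered test
data for whitespace artifacts before running the pipeline. Below is a
minimal Python sketch, not part of Moses; the function name and usage are
my own:

```python
import re

def find_whitespace_issues(path):
    """Return (line_number, line) pairs for lines that contain doubled
    spaces or leading/trailing whitespace -- the kind of artifact reported
    to break rule-table filtering scripts."""
    issues = []
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, 1):
            text = line.rstrip("\n")
            if re.search(r"  +", text) or text != text.strip():
                issues.append((lineno, text))
    return issues
```

Running this over the preprocessed source side and fixing (or normalizing)
any flagged lines would have surfaced the problem before decoding.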
>>>
>>> You may want to check the log file, or check whether your output has an
>>> unusually large number of untranslated Chinese words.
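
That second check can be done mechanically by counting output tokens that
still contain CJK characters; a high ratio is a strong hint that many rules
were missing. A rough sketch (it covers only the basic CJK block
U+4E00..U+9FFF, and the function name is invented):

```python
def count_chinese_tokens(path):
    """Count tokens containing CJK characters in a translation output
    file. Returns (cjk_tokens, total_tokens); a high ratio suggests many
    source words were passed through untranslated."""
    total = cjk = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            for tok in line.split():
                total += 1
                if any("\u4e00" <= ch <= "\u9fff" for ch in tok):
                    cjk += 1
    return cjk, total
```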
>>>
>>> hope this helps,
>>>
>>> Holger
>>>
>>> _______________________________________________
>>> Moses-support mailing list
>>> [email protected]
>>> http://mailman.mit.edu/mailman/listinfo/moses-support
>>>
>>
>>
>>
>
