On 19 Sep 2008, at 02:25, Vineet Kashyap wrote:

> Hi Adam
>
> Thanks for the detailed explanation.
>
> I sorted out my training and testing sets and finally
> I have managed to get the following scores:
>
>     NIST      BLEU      Multi-bleu   IBM-BLEU
>     6.0819    0.2070    20.70        0.2438
>
> These were evaluated on a testing set of 5082 sentences; as
> suggested, to get reliable scores one needs to test on more
> data. The system was trained using only 6053 sentences.
>
> Obviously, accuracy can be further improved by tuning,
> dropping unknown words and increasing the corpus size.
>
> The decoding process took about 70 minutes, which I guess
> can be improved by running Moses in parallel.
>
> Now, my question is: can such a small corpus really give a
> BLEU score of 20% accuracy, given that I have made sure the
> training and testing sets are kept separate?
>
> Also, what is the range of each of these scores?
I'm not sure that I understand either of these questions. The absolute
value of the BLEU score is not especially meaningful -- it is a tool for
comparing multiple systems.

> With multi-bleu I guess 100% accuracy will be 100, and
> BLEU in general will be between 0 and 1, multiplied by 100
> to get accuracy as a percentage.

Yes, that's correct.

> But NIST, not sure

I don't recall the scale for NIST scores, but maybe someone else on the
list does.

> Thanks again
>
> Vineet
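Coming back to the question about the range of the scores: roughly speaking,
BLEU is a brevity penalty multiplied by the geometric mean of the modified
n-gram precisions, so it always falls between 0 and 1, and multi-bleu.perl
simply reports that value times 100 (your 0.2070 vs. 20.70). The snippet
below is only a rough single-sentence, single-reference sketch of that
formula -- not the actual multi-bleu.perl code, which works at the corpus
level and handles multiple references -- but it shows where the bounds
come from:

    import math
    from collections import Counter

    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    def bleu(candidate, reference, max_n=4):
        """Single-sentence, single-reference BLEU (illustration only)."""
        precisions = []
        for n in range(1, max_n + 1):
            cand = ngrams(candidate, n)
            ref = ngrams(reference, n)
            # clipped n-gram matches / total candidate n-grams
            overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
            total = sum(cand.values())
            precisions.append(overlap / total if total else 0.0)
        if min(precisions) == 0.0:
            return 0.0
        # brevity penalty is at most 1, so it can only lower the score
        bp = min(1.0, math.exp(1.0 - len(reference) / len(candidate)))
        # geometric mean of the n-gram precisions, each in [0, 1]
        return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)

    candidate = "the cat sat on the mat".split()
    reference = "the cat sat on a mat".split()
    score = bleu(candidate, reference)
    print(score)        # something like 0.54, i.e. between 0 and 1
    print(score * 100)  # the "percentage" form that multi-bleu.perl prints

Since the brevity penalty and each precision are at most 1, the product can
never exceed 1, so the 20.70 from multi-bleu and the 0.2070 BLEU are the
same number on different scales.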
