Hi,

A tuning set of 5000 sentences is pretty big. Decoding such a set
may take several hours, and since MERT re-decodes the set on every
iteration, tuning may take several days.

A commonly used sentence aligner is hunalign:
http://mokk.bme.hu/resources/hunalign
I am not sure what you mean by probability threshold in the context
of sentence alignment.

There is no straightforward, intuitive explanation of BLEU scores;
they depend too much on domain, language pair, and type of test set.
Higher is better, but only when compared on the same test set.
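
To make it clearer why a BLEU score does not read as an accuracy
percentage, here is a rough sketch of how the metric is put together:
a geometric mean of clipped n-gram precisions (n = 1..4) times a brevity
penalty. This is only an illustration, not the scoring script shipped
with Moses; the function names are my own, and it assumes one reference
per sentence, whitespace tokenization, and no smoothing.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Corpus-level BLEU on a 0-100 scale (hypothetical helper).

    Assumes exactly one reference per hypothesis and whitespace tokens.
    """
    matched = [0] * max_n  # clipped n-gram matches, per order
    total = [0] * max_n    # n-grams proposed by the system, per order
    hyp_len = ref_len = 0

    for hyp, ref in zip(hypotheses, references):
        hyp_tok, ref_tok = hyp.split(), ref.split()
        hyp_len += len(hyp_tok)
        ref_len += len(ref_tok)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tok, n)
            ref_counts = ngrams(ref_tok, n)
            # Clip each n-gram's count by its count in the reference.
            matched[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            total[n - 1] += sum(hyp_counts.values())

    # Without smoothing, any order with zero matches gives BLEU = 0.
    if hyp_len == 0 or min(matched) == 0:
        return 0.0

    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    # Penalize output that is shorter than the reference.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1.0 - ref_len / hyp_len)
    return 100.0 * brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    print(corpus_bleu(["the cat sat on a mat"], ["the cat sat on the mat"]))
```

In the toy call at the bottom, five of the six words match the
reference, yet the score lands in the 50s, because the single wrong word
also breaks several bigrams, trigrams and 4-grams. That is why a BLEU of
15 or 20 cannot be read as "15% or 20% correct".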

-phi

On Thu, Aug 7, 2008 at 12:29 PM, Vineet Kashyap
<[EMAIL PROTECTED]> wrote:
> Hi all
>
> I would like to know is the data used for both tuning
> and testing the same ?
>
> also how long would it take to tune say 5000 sentences using mert?
>
> can someone recommend a nice tool for sentence alignment ?
> i am currently using Microsoft's bilingual sentence aligner
> which seems to be very accurate but becomes slower for large
> number of sentences as it does a lot of iterations?
>
> also with respect to sentence alignment, there is something called
> the probability threshold, whose importance i don't understand
> other than that a value between 0 and 1 is chosen
>
> also how to interpret a bleu score of say 15 or 20 in terms of accuracy
> in percentage?
>
> Thanks
>
> Vineet
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
>
>