Hi,

A tuning set of 5000 sentences is pretty big. Decoding such a set may take several hours, and since MERT re-decodes the whole tuning set on every iteration, tuning may take several days.
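
To make the arithmetic concrete, here is a rough back-of-the-envelope sketch. The decoding speed (3 seconds per sentence) and the number of MERT iterations (15) are assumptions picked purely for illustration, not measurements; real values depend on your model, hardware and decoder settings, and the function name is made up for this example.

```python
def estimate_tuning_hours(num_sentences, seconds_per_sentence, mert_iterations):
    """Rough wall-clock estimate for tuning.

    Assumes every MERT iteration decodes the full tuning set once and
    ignores the time spent in the optimizer itself.
    """
    decode_hours = num_sentences * seconds_per_sentence / 3600.0
    total_hours = decode_hours * mert_iterations
    return decode_hours, total_hours


# Hypothetical figures: 5000 sentences, 3 s per sentence, 15 iterations.
decode_h, total_h = estimate_tuning_hours(5000, 3.0, 15)
print(f"one decoding pass: {decode_h:.1f} h, full tuning: {total_h:.1f} h")
```

With these assumed numbers a single pass takes a bit over four hours and the whole run about two and a half days, so a smaller tuning set shortens the run roughly in proportion.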

A commonly used sentence aligner is hunalign:
http://mokk.bme.hu/resources/hunalign

I am not sure what you mean by probability threshold in the context of sentence alignment.

There is no straightforward, intuitive interpretation of BLEU scores. They depend too much on the domain, the language pair and the type of test set. Higher is better, but scores are only comparable on the same test set.

-phi

On Thu, Aug 7, 2008 at 12:29 PM, Vineet Kashyap <[EMAIL PROTECTED]> wrote:
> Hi all
>
> I would like to know is the data used for both tuning
> and testing the same ?
>
> also how long would it take to tune say 5000 sentences using mert?
>
> can someone recommend a nice tool for sentence alignment ?
> i am currently using Microsoft's bilingual sentence aligner
> which seems to be very accurate but becomes slower for large
> number of sentences as it does a lot of iterations?
>
> also with respect to sentence alignment, there is something called
> as the probability threshold which i dont understand the importance
> of other than a value between 0 and 1 is chosen
>
> also how to interpret a bleu score of say 15 or 20 in terms accuracy
> in percentage?
>
> Thanks
>
> Vineet

_______________________________________________
Moses-support mailing list
[email protected]
http://mailman.mit.edu/mailman/listinfo/moses-support
