Hi all,

Could someone please explain what "significance test" and "p-value" mean? I've read both Koehn's <https://pdfs.semanticscholar.org/0bc5/0d75597b99999634d909009153673deff56d.pdf> and Clark's <https://www.cs.cmu.edu/~jhclark/pubs/significance.pdf> papers on significance testing, but I still don't understand what a p-value such as 0.05 means. Does it mean that if the difference between the scores of two systems is x, then with 95% probability we will get the same difference if we repeat the experiments? Does a significance test deal with the randomness of the tuning process, or with the selection of the test set?

Any help would be appreciated.
_______________________________________________
Moses-support mailing list
http://mailman.mit.edu/mailman/listinfo/moses-support
