The SGML format used by mteval is rather simple. In the example folder, check ref.xml, src.xml and tst.xml. You'll need to add a header and footer to your plain text files, and wrap each sentence in <seg id="SEG_ID"> and </seg> tags.

-- Милош
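The wrapping described above can be sketched in a few lines of Python. This is only a rough illustration, not an official converter: the function name wrap_sgm, the root tags (srcset, refset, tstset) and the attribute names (setid, docid, sysid, srclang, trglang) are assumptions based on typical mteval sample files, so compare the output against ref.xml, src.xml and tst.xml in the example folder before relying on it.

```python
from xml.sax.saxutils import escape, quoteattr


def wrap_sgm(lines, kind, setid="example_set", docid="doc1",
             sysid="sys1", srclang="src", trglang="tgt"):
    """Wrap plain-text sentences (one per line) in an mteval-style file.

    kind is "src", "ref" or "tst". All tag and attribute names here are
    assumptions; check them against the files in the example folder.
    """
    if kind not in ("src", "ref", "tst"):
        raise ValueError("kind must be 'src', 'ref' or 'tst'")

    set_attrs = [("setid", setid), ("srclang", srclang)]
    doc_attrs = [("docid", docid)]
    if kind != "src":
        # Reference and system files also name a target language and a system.
        set_attrs.append(("trglang", trglang))
        doc_attrs.append(("sysid", sysid))

    def fmt(pairs):
        return " ".join(f"{name}={quoteattr(value)}" for name, value in pairs)

    root = f"{kind}set"
    out = [f"<{root} {fmt(set_attrs)}>", f"<doc {fmt(doc_attrs)}>"]
    # One <seg> per input line, numbered from 1. Empty lines are kept so
    # that source, reference and system files stay aligned.
    for seg_id, line in enumerate(lines, start=1):
        out.append(f'<seg id="{seg_id}">{escape(line.strip())}</seg>')
    out += ["</doc>", f"</{root}>"]
    return "\n".join(out) + "\n"
```

For instance, something like wrap_sgm(open("devtest2007.fr", encoding="utf-8"), "src", srclang="fr") would return the wrapped source text, which can then be written to a .sgm file. The set and document IDs must match across the source, reference and system files.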
On Wed, Jun 9, 2010 at 11:42 AM, sripirakas sakthithasan <[email protected]> wrote:

> Hi Barry,
>
> I also had this same problem of converting plain text files into sgm
> format. Yes, multi-bleu.perl can be used to score with BLEU, and I did so.
> But scoring with NIST is not possible that way, as only mteval-v11b.pl
> implements it. What could be the solution for this issue? Please be kind
> enough to look at this request as well. Thanking you in advance.
>
> S.Sripirakas.
> University of Colombo School of Computing.
>
> --- On Wed, 6/9/10, Barry Haddow <[email protected]> wrote:
>
> From: Barry Haddow <[email protected]>
> Subject: Re: [Moses-support] FW: Missing .sgm ref and src files
> To: [email protected]
> Date: Wednesday, June 9, 2010, 2:04 AM
>
> Hi Sanne
>
> You can use multi-bleu.perl to score plain text files. It's distributed
> with moses,
>
> regards
> Barry
>
> On Wednesday 09 Jun 2010 08:36:46 Korzec, Sanne wrote:
> > Recently I asked for some help with evaluation. It has not been picked
> > up yet. In case you guys missed it, I'm reposting.
> >
> > Sanne
> >
> > ________________________________
> > From: [email protected] [mailto:[email protected]]
> > On Behalf Of Korzec, Sanne
> > Sent: Friday, 4 June 2010 10:36
> > To: '[email protected]'
> > Subject: [Moses-support] Missing .sgm ref and src files
> >
> > Hi,
> >
> > I would like to evaluate my system on the test set of wmt07. However, the
> > BLEU evaluation script needs .sgm reference and source files as in this
> > baseline script example:
> >
> > mteval-v11b.pl -r wmt07/devtest/devtest2006-ref.en.sgm -t
> > working-dir/evaluation/devtest2006.output.sgm -s
> > wmt07/devtest/devtest2006-src.fr.sgm -c
> >
> > When I download the test set from 2007, all I can find are
> > devtest2007.fr and devtest2007.en
> >
> > How can I evaluate this? Or how can I convert the plain text files to
> > the sgm format?
> >
> > Sanne
_______________________________________________ Moses-support mailing list [email protected] http://mailman.mit.edu/mailman/listinfo/moses-support
