I think wrap-xml.perl has limitations because it uses a prepared reference .xml file as a template to create the test set .xml file. Creating the references still takes work.
I wrote the attached "makexml.py" script to create a complete set of srcset, refset, and tstset xml files. I quickly cleaned them up for this email. Hope I didn't break them :) If it's broken, we will publish a fully debugged version on our web site very soon. I hope you find them helpful.

Tom

On Wed, 9 Jun 2010 03:16:04 -0700 (PDT), sripirakas sakthithasan <[email protected]> wrote:
> Hi Milos Stolic,
>
> Yes sure. I will check the examples and change my plain text data
> accordingly. I was informed by Barry that wrap-xml.perl would also be
> useful in automatic conversion. I hope that I can do it either way...
> Thanks a lot for your immediate attention on this issue...
> Thank you.
>
> S.Sripirakas.
> University of Colombo School of Computing.
>
> --- On Wed, 6/9/10, Milos Stolic <[email protected]> wrote:
>
> From: Milos Stolic <[email protected]>
> Subject: Re: [Moses-support] FW: Missing .sgm ref and src files
> To: "sripirakas sakthithasan" <[email protected]>
> Cc: "Barry Haddow" <[email protected]>, [email protected]
> Date: Wednesday, June 9, 2010, 2:52 AM
>
> The sgml format used in mteval is rather simple. In the example folder,
> check ref.xml, src.xml and tst.xml. You'll need to add a header/footer to
> your plain text files, and wrap each sentence in <seg id="SEG_ID"> and
> </seg> tags.
>
> --
> Милош
>
> On Wed, Jun 9, 2010 at 11:42 AM, sripirakas sakthithasan
> <[email protected]> wrote:
>
> Hi Barry,
>
> I also had this same problem of converting plain text files into sgm
> format. Yes, multi-bleu.perl is applicable for scoring with BLEU and I did
> that. But scoring with NIST is not possible that way, as only
> mteval-v11b.pl implements it. What could be the solution for this issue?
> Please be kind enough to pay attention to this request also. Thanking you
> in advance.
>
> S.Sripirakas.
> University of Colombo School of Computing.
>
> --- On Wed, 6/9/10, Barry Haddow <[email protected]> wrote:
>
> From: Barry Haddow <[email protected]>
> Subject: Re: [Moses-support] FW: Missing .sgm ref and src files
> To: [email protected]
> Date: Wednesday, June 9, 2010, 2:04 AM
>
> Hi Sanne
>
> You can use multi-bleu.perl to score plain text files. It's distributed
> with moses,
>
> regards
> Barry
>
> On Wednesday 09 Jun 2010 08:36:46 Korzec, Sanne wrote:
>> Recently I asked for some help with evaluation. It has not been picked up
>> yet. In case you guys missed it, I'm reposting.
>>
>> Sanne
>>
>> ________________________________
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of Korzec, Sanne
>> Sent: Friday, 4 June 2010 10:36
>> To: '[email protected]'
>> Subject: [Moses-support] Missing .sgm ref and src files
>>
>> Hi,
>>
>> I would like to evaluate my system on the test set of wmt07. However, the
>> BLEU evaluation script needs .sgm reference and source files as in this
>> baseline script example:
>>
>> mteval-v11b.pl -r wmt07/devtest/devtest2006-ref.en.sgm -t
>> working-dir/evaluation/devtest2006.output.sgm -s
>> wmt07/devtest/devtest2006-src.fr.sgm -c
>>
>> When I download the test set from 2007, all I can find are
>> devtest2007.fr and devtest2007.en
>>
>> How can I evaluate this? Or how can I convert the plain text files to
>> the sgm format?
>>
>> Sanne
>
> _______________________________________________
> Moses-support mailing list
> [email protected]
> http://mailman.mit.edu/mailman/listinfo/moses-support
mteval-makexml.tar.gz
Description: GNU Zip compressed data
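
For anyone who just wants the general idea: the wrapping Milos describes in the thread above (a header and footer plus <seg id="..."> tags around each sentence) might be sketched in Python roughly as below. This is not the attached makexml.py. The function name wrap_segments, the <doc> wrapper, and the attribute names (setid, srclang, docid) are assumptions for illustration only; check them against ref.xml, src.xml and tst.xml in the example folder before feeding the output to mteval.

```python
from xml.sax.saxutils import escape, quoteattr


def wrap_segments(sentences, root="srcset", set_attrs=None, doc_attrs=None):
    """Wrap plain-text sentences in <seg id="N"> tags inside a root element.

    root is one of srcset / refset / tstset. The attribute dictionaries and
    the <doc> wrapper are assumptions, not taken from the mteval sources.
    """
    set_attrs = set_attrs or {}
    doc_attrs = doc_attrs or {}

    def fmt(attrs):
        # quoteattr() adds the surrounding quotes and escapes the value.
        return "".join(f" {key}={quoteattr(value)}" for key, value in attrs.items())

    # Header: opening root element and (assumed) document element.
    lines = [f"<{root}{fmt(set_attrs)}>", f"<doc{fmt(doc_attrs)}>"]

    # One <seg> per input sentence, numbered from 1, with &, < and > escaped.
    for seg_id, sentence in enumerate(sentences, start=1):
        lines.append(f'<seg id="{seg_id}">{escape(sentence.strip())}</seg>')

    # Footer: close what the header opened.
    lines += ["</doc>", f"</{root}>"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical attribute values, for illustration only.
    print(wrap_segments(
        ["Bonjour le monde .", "Recherche & developpement ."],
        root="srcset",
        set_attrs={"setid": "demo", "srclang": "fr"},
        doc_attrs={"docid": "doc1"},
    ))
```

To convert a file such as devtest2007.fr, read it one sentence per line and pass the list of lines as sentences. The reference and test files would use root="refset" and root="tstset" with whatever attributes the example files show.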
