Hi Barry,

I found the problem: the generated test set file (testset.detokenized.sgm.9)
has the wrong structure. Its start tag is srcset instead of tstset (the end
tag, tstset, is inserted correctly), and two attributes are missing (trglang
and sysid). Which process creates this file?

Thank you
Mauro


<srcset setid="testiw" srclang="any">
  <doc docid="doc" genre="testset" origlang="en">
    <seg id="1">Attenzione: ...
...
    <seg id="266">Se la compagnia ...</seg>
  </doc>
</tstset>
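
A quick way to spot this kind of mismatch could be a small check like the one below. It is only a sketch: check_sgm is a hypothetical helper, not part of Moses or mteval-v13a.pl, and it assumes that trglang belongs on the set element and sysid on each doc element, as in the refset example quoted further down. It uses regular expressions rather than an XML parser because a file with mismatched start and end tags would not parse.

```python
import re

def check_sgm(text, expected_root="tstset"):
    """Return a list of problems found in an SGM test set file.

    Hypothetical helper for illustration; the attribute placement
    (trglang on the set element, sysid on doc) is an assumption.
    """
    problems = []
    # First opening tag and all closing tags in the file.
    opening = re.search(r"<([A-Za-z]+)((?:\s[^>]*)?)>", text)
    closings = re.findall(r"</([A-Za-z]+)\s*>", text)
    if opening is None or not closings:
        return ["no opening or closing tag found"]
    root, root_attrs = opening.group(1), opening.group(2)
    if root != expected_root:
        problems.append(f"start tag is <{root}>, expected <{expected_root}>")
    if closings[-1] != root:
        problems.append(
            f"start tag <{root}> does not match end tag </{closings[-1]}>"
        )
    if not re.search(r"\btrglang\s*=", root_attrs):
        problems.append("trglang attribute missing on the set element")
    for doc_attrs in re.findall(r"<doc\b([^>]*)>", text):
        if not re.search(r"\bsysid\s*=", doc_attrs):
            problems.append("sysid attribute missing on a doc element")
    return problems

# Shortened copy of the broken file shown above.
sample = '''<srcset setid="testiw" srclang="any">
  <doc docid="doc" genre="testset" origlang="en">
    <seg id="1">Attenzione: ...</seg>
  </doc>
</tstset>'''

for problem in check_sgm(sample):
    print(problem)
```

Run against the real testset.detokenized.sgm.9 (read with open(...).read()), an empty result would only mean that these particular checks pass, not that mteval-v13a.pl will accept the file.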



On Thu, Apr 12, 2012 at 5:11 PM, Barry Haddow <[email protected]> wrote:

> Hi Mauro
>
> > Could someone tell me where the problem is? Could someone send me a
> > well-formed sgm file that I can adapt my sgm test set to?
>
> The files released for wmt12 will work with the nist script. See
> http://www.statmt.org/wmt12/translation-task.html
> and look for the development sets, as they come with references.
> These should show which tags are required.
>
> cheers - Barry
>
>
> On Thursday 12 April 2012 11:39:31 Mauro Zanotti wrote:
> > Dear all,
> >
> > I ran an experiment and it crashed at the final evaluation step.
> >
> > In the corresponding step directory, REPORTING_report.9.STDERR shows
> >
> > ERROR (extract_nist_bleu): could not find BLEU score in file
> > '/opt/tools/moses/scripts/ems/example/ex7ama/evaluation/testset.nist-bleu-c.9'
> > ERROR (extract_nist_bleu): could not find BLEU score in file
> > '/opt/tools/moses/scripts/ems/example/ex7ama/evaluation/testset.nist-bleu.9'
> >
> > So I checked the previous step, EVALUATION_testset_nist-bleu.9.
> > Its EVALUATION_testset_nist-bleu.9.STDERR shows
> >
> > Use of uninitialized value $tst_id in string eq at
> > /opt/tools/moses/scripts/generic/mteval-v13a.pl line 488.
> > Not the same 'setid' attribute values across files at
> > /opt/tools/moses/scripts/generic/mteval-v13a.pl line 488.
> >
> > I ran the following command after adding some debugging prints and found
> > that the problem is that $tst_id is empty.
> >
> > /opt/tools/moses/scripts/generic/mteval-v13a.pl -s
> > /opt/tools/moses/scripts/ems/example/data-iw/testset/test-ama.en.sgm -r
> > /opt/tools/moses/scripts/ems/example/data-iw/testset/test-ama.it.sgm -t
> > /opt/tools/moses/scripts/ems/example/ex7ama/evaluation/testset.detokenized.sgm.9
> > > /opt/tools/moses/scripts/ems/example/ex7ama/evaluation/testset.nist-bleu.9
> >
> > My test files have the following structure
> >
> > <srcset setid="testiw" srclang="any">
> >   <doc docid="doc" genre="testset" origlang="en">
> >     <seg id="1">Warning: The booking cannot...</seg>
> > ...
> > <seg id="266">If the...</seg>
> >   </doc>
> > </srcset>
> >
> > <refset setid="testiw" trglang="it"  srclang="any">
> >   <doc sysid="ref" docid="doc" genre="testset" origlang="en">
> >     <seg id="1">Avvertenza: la prenotazione non può...</seg>
> > ...
> > <seg id="266">Se la...</seg>
> >   </doc>
> > </refset>
> >
> > Could someone tell me where the problem is? Could someone send me a
> > well-formed sgm file that I can adapt my sgm test set to?
> >
> > Thank you in advance
> > Mauro
> >
>
> --
> Barry Haddow
> University of Edinburgh
> +44 (0) 131 651 3173
>
> --
> The University of Edinburgh is a charitable body, registered in
> Scotland, with registration number SC005336.
>
>
