At OpenNLP we take the gold data, extract the text, run it through one of our components, and compute the scores on the fly.
In your case I would do something similar. I would take the XMI files generated from the gold standard corpus and copy the text and other input data from the initial view to a second view. You can then run the tagger over the second view, and afterwards an additional analysis engine can compute your performance scores.

Jörn

On 12/16/11 7:18 AM, [email protected] wrote:
Hi,

I have an XMI file with the gold standard as its initial view and another XMI file with the tagger output as its initial view. Now I would like to combine both into one CAS, such that the gold standard is in the gold view and the tagger output is in the tagger view. Then I can easily compute the F-values.

Any ideas how to do that? Is it possible to deserialize two CASes in such a way? The problem is that I have to use the two files.

With one file and some annotators one could do it like this:

1. Use a collection reader to read the gold standard from the XMI file.
2. Create a new view for the tagger.
3. Copy the sofa from the gold view to the new tagger view.
4. Use an annotator to tag the sofa of the tagger view.

You could repeat the last three steps for different taggers. Then you would have the annotations of all taggers in one huge CAS.

What do you think about this?

Greetings,
Armin
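
For the scoring step discussed in this thread, here is a minimal sketch of how precision, recall and F1 could be computed once the gold and tagger annotations are available side by side. It is plain Java and does not use the UIMA API: `SpanScorer` and `Span` are hypothetical names, and the sketch assumes that an annotation counts as correct only if its begin offset, end offset and label all match a gold annotation exactly. In a real analysis engine the two lists would be filled from the gold view and the tagger view.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

/** Hypothetical helper: exact-match precision, recall and F1 over annotation spans. */
final class SpanScorer {

    /** One annotation, identified by its character offsets and a label. */
    static final class Span {
        final int begin;
        final int end;
        final String label;

        Span(int begin, int end, String label) {
            this.begin = begin;
            this.end = end;
            this.label = label;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Span)) {
                return false;
            }
            Span s = (Span) o;
            return begin == s.begin && end == s.end && label.equals(s.label);
        }

        @Override
        public int hashCode() {
            return Objects.hash(begin, end, label);
        }
    }

    /** Returns {precision, recall, f1}; a ratio with a zero denominator is reported as 0. */
    static double[] score(List<Span> gold, List<Span> predicted) {
        Set<Span> goldSet = new HashSet<>(gold);
        Set<Span> predictedSet = new HashSet<>(predicted);

        // True positives: predicted spans that match a gold span exactly.
        Set<Span> truePositives = new HashSet<>(predictedSet);
        truePositives.retainAll(goldSet);
        int tp = truePositives.size();

        double precision = predictedSet.isEmpty() ? 0.0 : (double) tp / predictedSet.size();
        double recall = goldSet.isEmpty() ? 0.0 : (double) tp / goldSet.size();
        double sum = precision + recall;
        double f1 = sum == 0.0 ? 0.0 : 2 * precision * recall / sum;
        return new double[] {precision, recall, f1};
    }
}
```

Calling `SpanScorer.score(goldSpans, taggerSpans)` once per tagger view would give one score triple per tagger. Exact matching is the strictest choice; a partial-overlap criterion would need a different comparison than set intersection.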
