[ https://issues.apache.org/jira/browse/STANBOL-652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13401422#comment-13401422 ]
Pablo Mendes commented on STANBOL-652:
--------------------------------------

I like it a lot. Saving the evaluation results to an RDF store would then allow one to slice and dice the results however they want. For example: how many correct results are there for person annotations?

    # Prefix URIs are placeholders; the original query left them undeclared
    # and used a bare "entity" predicate, assumed here to be sbc:entity.
    PREFIX sbc: <http://stanbol.apache.org/ontology/benchmark#>
    PREFIX dbpedia: <http://dbpedia.org/ontology/>

    SELECT (COUNT(?result) AS ?succeeded)
    WHERE {
      ?result sbc:state sbc:benchmark-state-succeeded .
      ?result sbc:about ?annotation .
      ?annotation sbc:entity ?person .
      ?person a dbpedia:Person .
    }

Sharing this RDF via a SPARQL endpoint would then allow prompt Web-based report generation (i.e. visualization of results) with something like Sgvizler: http://code.google.com/p/sgvizler/

Although what I'd actually do is get a CSV from this and use R to analyze the results. Even producing that CSV should be trivial from the RDF (see the sketch below).

> Benchmark should report evaluation summary
> ------------------------------------------
>
>                 Key: STANBOL-652
>                 URL: https://issues.apache.org/jira/browse/STANBOL-652
>             Project: Stanbol
>          Issue Type: Improvement
>          Components: Testing
>            Reporter: Pablo Mendes
>            Priority: Minor
>              Labels: benchmark, evaluation
>
> The SBC is a nice way to perform manual inspection of the behavior of the
> enhancement chain for different examples in the evaluation dataset. However,
> for evaluations with several hundred examples, it would be interesting to
> have scores that summarize performance over the entire dataset, for example
> precision, recall, and F1. An evaluation dataset is available here in BDL:
> http://spotlight.dbpedia.org/download/stanbol/

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
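
To make the CSV idea above concrete, here is a minimal sketch using Apache Jena ARQ (a natural fit for Stanbol's Java stack). It runs a per-annotation variant of the counting query against a SPARQL endpoint and serializes the result bindings as CSV. The endpoint URL, the sbc: namespace URI, and the sbc:entity property are assumptions for illustration, not part of the issue.

    import java.io.FileOutputStream;
    import java.io.OutputStream;

    import org.apache.jena.query.QueryExecution;
    import org.apache.jena.query.QueryExecutionFactory;
    import org.apache.jena.query.ResultSet;
    import org.apache.jena.query.ResultSetFormatter;

    public class BenchmarkCsvExport {
        public static void main(String[] args) throws Exception {
            // Hypothetical endpoint; point at wherever the benchmark RDF is published.
            String endpoint = "http://localhost:8080/sparql";

            // Per-annotation variant of the counting query above: one row per
            // succeeded person annotation, so each binding becomes a CSV record.
            String query =
                "PREFIX sbc: <http://stanbol.apache.org/ontology/benchmark#>\n" + // placeholder URI
                "PREFIX dbpedia: <http://dbpedia.org/ontology/>\n" +
                "SELECT ?result ?annotation ?person WHERE {\n" +
                "  ?result sbc:state sbc:benchmark-state-succeeded .\n" +
                "  ?result sbc:about ?annotation .\n" +
                "  ?annotation sbc:entity ?person .\n" +
                "  ?person a dbpedia:Person .\n" +
                "}";

            try (QueryExecution qexec = QueryExecutionFactory.sparqlService(endpoint, query);
                 OutputStream out = new FileOutputStream("benchmark-results.csv")) {
                ResultSet results = qexec.execSelect();
                // ARQ ships a CSV writer for SELECT result sets.
                ResultSetFormatter.outputAsCSV(out, results);
            }
        }
    }

From there, read.csv("benchmark-results.csv") in R and aggregate the rows however you like, e.g. to compute precision, recall, and F1 per annotation type.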