Second that as well: ClearTK

On Tue, Nov 6, 2012 at 7:46 PM, Himanshu Gahlot <[email protected]> wrote:

> I second Steve. ClearTK is an NLP library built using UIMA (and uimaFIT).
> It has a fully functional Named Entity Extraction example annotator
> including evaluation using the MASC corpus.
>
> Himanshu
>
>
> On Tue, Nov 6, 2012 at 4:28 AM, Steven Bethard
> <[email protected]> wrote:
>
>> On Tue, Nov 6, 2012 at 9:26 AM, Yasen Kiprov <[email protected]>
>> wrote:
>> > I'm writing a named entity recognition system for text excerpts from the
>> > social/public domain: blogs, news, etc. I'm testing different approaches
>> > with rules and ML and I need to evaluate annotations accuracy (in terms of
>> > f-score against a gold corpus). My plan is to use the MASC corpus or build
>> > a custom one but the first task is to find the right tools for evaluation.
>>
>> This sounds a lot like an example we have in ClearTK (it also uses the
>> MASC named entity data). The full cross-validation evaluation code is
>> here:
>>
>> https://code.google.com/p/cleartk/source/browse/cleartk-examples/src/main/java/org/cleartk/examples/chunking/EvaluateNamedEntityChunker.java
>>
>> And the class that actually calculates F-score, etc. over annotations is
>> here:
>>
>> https://code.google.com/p/cleartk/source/browse/cleartk-eval/src/main/java/org/cleartk/eval/AnnotationStatistics.java
>>
>> Steve
>>
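
For anyone who only wants to see what "f-score against a gold corpus" boils down to, here is a minimal, library-independent sketch in Python. It is not ClearTK's AnnotationStatistics API and does not read MASC; the `Span`, `Scores` and `score_spans` names are made up for illustration, and it assumes exact-match scoring, where a predicted entity counts as correct only if its begin offset, end offset and label all agree with a gold entity.

```python
from typing import Iterable, NamedTuple


class Span(NamedTuple):
    """Hypothetical entity annotation: character offsets plus a label."""
    begin: int
    end: int
    label: str


class Scores(NamedTuple):
    precision: float
    recall: float
    f1: float


def score_spans(gold: Iterable[Span], predicted: Iterable[Span]) -> Scores:
    """Exact-match precision, recall and F1 over two sets of spans.

    Assumption: duplicates are collapsed, and a ratio with an empty
    denominator is reported as 0.0.
    """
    gold_set = set(gold)
    pred_set = set(predicted)
    correct = len(gold_set & pred_set)  # spans found in both sets

    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    if precision + recall == 0.0:
        return Scores(precision, recall, 0.0)
    f1 = 2 * precision * recall / (precision + recall)
    return Scores(precision, recall, f1)


if __name__ == "__main__":
    gold = [Span(0, 5, "PERSON"), Span(10, 16, "LOCATION"), Span(20, 23, "ORG")]
    predicted = [
        Span(0, 5, "PERSON"),    # correct
        Span(10, 16, "ORG"),     # right offsets, wrong label
        Span(20, 23, "ORG"),     # correct
        Span(30, 34, "PERSON"),  # spurious
    ]
    print(score_spans(gold, predicted))
```

Exact matching is the strictest choice; partial-overlap or label-agnostic matching would give different numbers, so it is worth deciding which one you want before comparing the rule-based and ML approaches.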
--
Renaud Richardet
Blue Brain Project PhD candidate
EPFL Station 15
CH-1015 Lausanne
phone: +41-78-675-9501
http://people.epfl.ch/renaud.richardet
