That's great. Thanks. Is there something that describes which model to use for which AE? Or could we put something in the model filename, or put each model in a separate subdirectory?
-- James

> -----Original Message-----
> From: [email protected] [mailto:dev-
> [email protected]] On Behalf Of Chen, Pei
> Sent: Tuesday, April 09, 2013 3:29 PM
> To: [email protected]
> Subject: RE: ClearNLP POSTagger
>
> FYI:
> This has been done in trunk in r. 1466216
> https://issues.apache.org/jira/browse/CTAKES-186
> If you would like to try it out or run some benchmarks before we decide if
> we should make the default pipeline use this, just uncomment the below in
> your Aggregate Descriptors.
>
> <delegateAnalysisEngine key="ClearPOSTagger">
>   <import location="../../../ctakes-pos-tagger/desc/ClearNLPPOSTagger.xml"/>
> </delegateAnalysisEngine>
> <node>ClearPOSTagger</node>
>
> > -----Original Message-----
> > From: Chen, Pei [mailto:[email protected]]
> > Sent: Monday, April 08, 2013 5:14 PM
> > To: [email protected]
> > Subject: RE: ClearNLP POSTagger
> >
> > Hi Richard,
> > Yes- the ClearNLP tools (POSTagger, Dependency Parser, SRL) in cTAKES
> > were retrained with additional data (MiPAQ/SHARP).
> > The Dependency Parser/SRL replaced the existing one because the old
> > ClearParser ones were no longer supported.
> >
> > The ClearPOSTagger wasn't previously available in cTAKES, but we can
> > certainly make it an optional one in case some folks may want to use
> > it. I'll leave the default one (OpenNLP) as-is for the time being
> > until we get some more users/tests/benchmarks/feedback...
> >
> > --Pei
> >
> > > -----Original Message-----
> > > From: Richard Eckart de Castilho [mailto:[email protected]
> > > darmstadt.de]
> > > Sent: Monday, April 08, 2013 1:43 PM
> > > To: <[email protected]>
> > > Subject: Re: ClearNLP POSTagger
> > >
> > > Hi,
> > >
> > > did you train new models for the ClearNLP/OpenNLP tools? (Maybe I
> > > knew if I had followed a past discussion on models more closely.)
> > >
> > > Cheers,
> > >
> > > -- Richard
> > >
> > > Am 08.04.2013 um 18:15 schrieb "Chen, Pei"
> > > <[email protected]>:
> > >
> > > > Hi,
> > > > While working on the Dependency Parser/SRL labeler, we also have a
> > > > POSTagger from ClearNLP. It is fairly simple and I have the code
> > > > ready (also trained on the same data as the dep parser- MiPaq/SHARP)
> > > > to be checked-in.
> > > > What does the folks think:
> > > > We can include both Analysis Engines in the ctakes-pos-tagger
> > > > project. But should we leave the current OpenNLP in the default
> > > > pipeline or default to the latest?
> > > >
> > > > "The ClearNLP POS tagger shows more robust results on unknown words
> > > > by generalizing lexical features. You can find the reference from
> > > > this paper.
> > > > Fast and Robust Part-of-Speech Tagging Using Dynamic Model
> > > > Selection, Jinho D. Choi, Martha Palmer, Proceedings of the 50th
> > > > Annual Meeting of the Association for Computational Linguistics
> > > > (ACL'12), 363-367, Jeju, Korea, 2012. [1]
> > > > It also uses AdaGrad for machine learning, which is a more
> > > > advanced learning algorithm than maximum entropy used by OpenNLP."
> > > >
> > > > [1] http://aclweb.org/anthology-new/P/P12/P12-2071.pdf
> > >
> > > --
> > > -------------------------------------------------------------------
> > > Richard Eckart de Castilho
> > > Technical Lead
> > > Ubiquitous Knowledge Processing Lab (UKP-TUD)
> > > FB 20 Computer Science Department
> > > Technische Universität Darmstadt
> > > Hochschulstr. 10, D-64289 Darmstadt, Germany
> > > phone [+49] (0)6151 16-7477, fax -5455, room S2/02/B117
> > > [email protected]
> > > www.ukp.tu-darmstadt.de
> > > Web Research at TU Darmstadt (WeRC) www.werc.tu-darmstadt.de
> > > -------------------------------------------------------------------
