Thanks James. Good idea. It's been moved to a clearnlp folder now (to indicate that it's a ClearNLP model):
org/apache/ctakes/postagger/models/clearnlp/mayo-en-pos-1.3.0.jar
Let me know if you get a chance to try it out or run some benchmarks to see how it performs against the current one.

--Pei

> -----Original Message-----
> From: Masanz, James J. [mailto:[email protected]]
> Sent: Tuesday, April 09, 2013 4:31 PM
> To: '[email protected]'
> Subject: RE: ClearNLP POSTagger
>
> That's great. Thanks.
>
> Is there something that describes which model to use for which AE?
> Or maybe put something in the model filename, or put the model in a
> separate subdirectory?
>
> -- James
>
> > -----Original Message-----
> > From: [email protected] [mailto:dev- [email protected]] On Behalf Of Chen, Pei
> > Sent: Tuesday, April 09, 2013 3:29 PM
> > To: [email protected]
> > Subject: RE: ClearNLP POSTagger
> >
> > FYI:
> > This has been done in trunk in r. 1466216
> > https://issues.apache.org/jira/browse/CTAKES-186
> > If you would like to try it out or run some benchmarks before we
> > decide if we should make the default pipeline use this, just uncomment
> > the below in your Aggregate Descriptors.
> >
> > <delegateAnalysisEngine key="ClearPOSTagger">
> >   <import location="../../../ctakes-pos-tagger/desc/ClearNLPPOSTagger.xml"/>
> > </delegateAnalysisEngine>
> > <node>ClearPOSTagger</node>
> >
> > > -----Original Message-----
> > > From: Chen, Pei [mailto:[email protected]]
> > > Sent: Monday, April 08, 2013 5:14 PM
> > > To: [email protected]
> > > Subject: RE: ClearNLP POSTagger
> > >
> > > Hi Richard,
> > > Yes, the ClearNLP tools (POSTagger, Dependency Parser, SRL) in
> > > cTAKES were retrained with additional data (MiPAQ/SHARP).
> > > The Dependency Parser/SRL replaced the existing one because the old
> > > ClearParser ones were no longer supported.
> > >
> > > The ClearPOSTagger wasn't previously available in cTAKES, but we can
> > > certainly make it an optional one in case some folks may want to use
> > > it. I'll leave the default one (OpenNLP) as-is for the time being
> > > until we get some more users/tests/benchmarks/feedback...
> > >
> > > --Pei
> > >
> > > > -----Original Message-----
> > > > From: Richard Eckart de Castilho [mailto:[email protected] darmstadt.de]
> > > > Sent: Monday, April 08, 2013 1:43 PM
> > > > To: <[email protected]>
> > > > Subject: Re: ClearNLP POSTagger
> > > >
> > > > Hi,
> > > >
> > > > did you train new models for the ClearNLP/OpenNLP tools? (Maybe I
> > > > would know if I had followed a past discussion on models more closely.)
> > > >
> > > > Cheers,
> > > >
> > > > -- Richard
> > > >
> > > > On 08.04.2013 at 18:15, "Chen, Pei" <[email protected]> wrote:
> > > >
> > > > > Hi,
> > > > > While working on the Dependency Parser/SRL labeler, we also have a
> > > > > POSTagger from ClearNLP. It is fairly simple and I have the code
> > > > > ready (also trained on the same data as the dep parser,
> > > > > MiPaq/SHARP) to be checked in. What do folks think?
> > > > > We can include both Analysis Engines in the ctakes-pos-tagger
> > > > > project. But should we leave the current OpenNLP in the default
> > > > > pipeline or default to the latest?
> > > > >
> > > > > "The ClearNLP POS tagger shows more robust results on unknown
> > > > > words by generalizing lexical features. You can find the reference
> > > > > from this paper.
> > > > > Fast and Robust Part-of-Speech Tagging Using Dynamic Model
> > > > > Selection, Jinho D. Choi, Martha Palmer, Proceedings of the 50th
> > > > > Annual Meeting of the Association for Computational Linguistics
> > > > > (ACL'12), 363-367, Jeju, Korea, 2012. [1]
> > > > > It also uses AdaGrad for machine learning, which is a more
> > > > > advanced learning algorithm than maximum entropy used by OpenNLP."
> > > > >
> > > > > [1] http://aclweb.org/anthology-new/P/P12/P12-2071.pdf
> > > >
> > > > --
> > > > -------------------------------------------------------------------
> > > > Richard Eckart de Castilho
> > > > Technical Lead
> > > > Ubiquitous Knowledge Processing Lab (UKP-TUD)
> > > > FB 20 Computer Science Department
> > > > Technische Universität Darmstadt
> > > > Hochschulstr. 10, D-64289 Darmstadt, Germany
> > > > phone [+49] (0)6151 16-7477, fax -5455, room S2/02/B117
> > > > [email protected]
> > > > www.ukp.tu-darmstadt.de
> > > > Web Research at TU Darmstadt (WeRC) www.werc.tu-darmstadt.de
> > > > -------------------------------------------------------------------
