Forgot the refs:

[1] http://wiki.apache.org/tika/GeoTopicParser
[2] https://github.com/chrismattmann/lucene-geo-gazetteer.git

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: chris.a.mattm...@nasa.gov
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: jpluser <chris.a.mattm...@jpl.nasa.gov>
Reply-To: "dev@opennlp.apache.org" <dev@opennlp.apache.org>
Date: Friday, November 6, 2015 at 8:54 AM
To: "dev@opennlp.apache.org" <dev@opennlp.apache.org>
Subject: Question about OpenNLP and comparison to e.g., NTLK, Stanford NER, etc.

>Hi Everyone,
>
>Hope you're well! I'm new to the list, but I just wanted to
>say I'm really happy with what I've seen of OpenNLP so far.
>My team and I have built a Tika Parser [1] that uses OpenNLP's
>location NER model, along with a Lucene GeoNames Gazetteer [2],
>to create a "GeoTopicParser". We are improving it day to day.
>OpenNLP has definitely come a long way from when I looked at it
>years ago in its nascence.
>
>That said, I keep hearing from people I talk to in the NLP community
>that, for example, OpenNLP is "old", and that I should be looking
>at e.g., Stanford's NER, NLTK, etc. Besides obvious license
>issues (Stanford NER is GPL, as an example, and I am only interested
>in ALv2 or permissively licensed code), I don't have a great answer
>to whether or not OpenNLP is old, or not active, or not as good, etc.
>Can devs on this list help me answer that question? I'd like to be
>able to tell these NLP people the next time I talk to them that
>no, in fact, OpenNLP isn't *old*, and it's active, and there are
>these X and Y lines of development, and here's where they are
>going, etc.
>
>Looking at:
>
>https://whimsy.apache.org/board/minutes/OpenNLP
>
>I see that you guys have done GSoC, are working on a Naive Bayes
>classifier and summarization components, and are making releases
>(seemingly more frequently), etc. I also looked at:
>
>https://reporter.apache.org/
>
>And I see your project health score and activity are excellent.
>
>Thanks and let me know.
>
>Cheers,
>Chris