It's up! https://issues.apache.org/jira/browse/LUCENE-2899
It has the sentence detector/tokenizer, POS, chunking and NER. Also some
utility filters to fiddle with payloads. It is smart about caching
models. It is done as a Lucene tokenizer/tokenfilter, which is a fairly
limiting arena.

The OpenNLP build needs a little upgrading to work with the license
validation in the Lucene build. OPENNLP-511 requests this.

On Fri, Jun 1, 2012 at 4:18 AM, Svetoslav Marinov <[email protected]> wrote:
> At Findwise we actively use a number of OpenNLP components with both Hydra
> and OpenPipeline when indexing with Solr.
>
> I look forward to seeing the result of the patch!
>
> Best,
> Svetoslav
>
> On 2012-05-31 23:10, "Lance Norskog" <[email protected]> wrote:
>
>>Thanks. I have looked at UIMA several times and it seemed very
>>complex. It has a lot of features, is mature, has an Eclipse app
>>builder, etc. I could not keep it all in my head at once. The
>>Solr/Lucene document pipeline features give little space for NLP
>>features. Hydra or OpenPipeline give UIMA and OpenNLP "room to
>>breathe".
>>
>>Are there free annotated text databases for UIMA? OpenNLP does not use
>>any with open licences. It has binary models made from copyrighted
>>annotations and so they cannot be checked into Apache.
>>
>>On Wed, May 30, 2012 at 6:11 PM, Christian Moen <[email protected]> wrote:
>>> Hello Lance,
>>>
>>> This is very cool! I'm looking forward to having a look at this.
>>>
>>>
>>> Christian Moen
>>> http://atilika.com
>>>
>>> On May 31, 2012, at 9:54 AM, Lance Norskog wrote:
>>>
>>>> I'm creating a patch to integrate OpenNLP into the Lucene/Solr
>>>> project. The SentenceDetector, Tokenizer, POS tagger, Chunker, and NER
>>>> tools are included. The SentenceDetector and Tokenizer are a Lucene
>>>> Tokenizer, and a Lucene TokenFilter takes this stream and runs
>>>> POS/Chunking/NER on it, saving the tags as upper-case payloads. The
>>>> patch includes a couple of handy combinations. For example, make a
>>>> more focused search index by only indexing the nouns & verbs.
>>>>
>>>> Do you have any hints on how to package it? The documentation should
>>>> include how to download and install the models.
>>>>
>>>> --
>>>> Lance Norskog
>>>> [email protected]
>>>
>>
>>
>>
>>--
>>Lance Norskog
>>[email protected]
>>
>
>

--
Lance Norskog
[email protected]
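
To make the "only index the nouns & verbs" idea from the thread concrete, here is a rough sketch in plain Java. It does not use the Lucene or OpenNLP APIs and is not the code from the patch: the class PosFilterSketch, the TaggedToken type and the keepNounsAndVerbs method are hypothetical names made up for illustration. It also assumes Penn Treebank-style tags (NN*, VB*), which the thread does not state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class PosFilterSketch {

    /** Hypothetical stand-in for a Lucene token whose payload carries a POS tag. */
    public static final class TaggedToken {
        public final String term;
        public final String tag;

        public TaggedToken(String term, String tag) {
            this.term = term;
            this.tag = tag;
        }
    }

    /**
     * Keeps only the terms whose tag starts with NN (nouns) or VB (verbs).
     * Assumption: tags follow Penn Treebank naming; tags are upper-cased
     * first so that lower-case input is tolerated.
     */
    public static List<String> keepNounsAndVerbs(List<TaggedToken> tokens) {
        List<String> kept = new ArrayList<>();
        for (TaggedToken t : tokens) {
            String tag = t.tag.toUpperCase(Locale.ROOT);
            if (tag.startsWith("NN") || tag.startsWith("VB")) {
                kept.add(t.term);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<TaggedToken> sentence = List.of(
                new TaggedToken("The", "DT"),
                new TaggedToken("patch", "NN"),
                new TaggedToken("integrates", "VBZ"),
                new TaggedToken("OpenNLP", "NNP"),
                new TaggedToken("quickly", "RB"));
        // Prints: [patch, integrates, OpenNLP]
        System.out.println(keepNounsAndVerbs(sentence));
    }
}
```

In a real analysis chain the same test would presumably sit in a TokenFilter that reads the tag from each token's payload and drops the tokens that fail it, so only nouns and verbs reach the index.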
