I'm creating a patch to integrate OpenNLP into the Lucene/Solr
project. The SentenceDetector, Tokenizer, POS tagger, Chunker, and NER
tools are included. The SentenceDetector and Tokenizer are exposed as
a Lucene Tokenizer, and a Lucene TokenFilter takes that token stream
and runs POS tagging, chunking, and NER on it, saving the tags as
upper-case payloads. The patch includes a couple of handy
combinations. For example, you can make a more focused search index
by indexing only the nouns and verbs.
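
To make the nouns-and-verbs idea concrete, here is a minimal sketch in
plain Java. It is not code from the patch and does not use the Lucene
or OpenNLP APIs. The class PosFilterSketch, the TaggedToken type, and
the method names are all hypothetical, and it assumes the POS tagger
emits Penn Treebank style tags (NN* for nouns, VB* for verbs).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: hypothetical names, not part of the patch.
public class PosFilterSketch {

    // A term paired with the POS tag that a filter might store as a payload.
    static final class TaggedToken {
        final String term;
        final String tag;

        TaggedToken(String term, String tag) {
            this.term = term;
            this.tag = tag;
        }
    }

    // Assumption: Penn Treebank tags, where nouns start with NN
    // and verbs start with VB.
    static boolean isNounOrVerb(String tag) {
        return tag.startsWith("NN") || tag.startsWith("VB");
    }

    // Keep only the terms whose tag marks them as a noun or a verb.
    static List<String> keepNounsAndVerbs(List<TaggedToken> tokens) {
        List<String> kept = new ArrayList<>();
        for (TaggedToken token : tokens) {
            if (isNounOrVerb(token.tag)) {
                kept.add(token.term);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<TaggedToken> tokens = Arrays.asList(
                new TaggedToken("The", "DT"),
                new TaggedToken("dog", "NN"),
                new TaggedToken("barks", "VBZ"),
                new TaggedToken("loudly", "RB"));
        System.out.println(keepNounsAndVerbs(tokens)); // prints [dog, barks]
    }
}
```

In a real analysis chain the same check would presumably sit in a
TokenFilter that reads the tag from each token's payload and drops
tokens that fail it.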

Do you have any hints on how to package it? The documentation should
include how to download and install the models.

-- 
Lance Norskog
[email protected]
