[ https://issues.apache.org/jira/browse/LUCENE-2899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294113#comment-13294113 ]
Lance Norskog commented on LUCENE-2899:
---------------------------------------
bq. This really should just be a part of the analysis modules (with the
exception of the Solr example parts). I don't know exactly how we are handling
Solr examples anymore, but I seem to recall the general consensus was to not
proliferate them. Can we just expose the functionality in the main one?
A lot of Solr/Lucene features are demoed only in the solrconfig/schema files
used by the unit tests (DIH, for example). That is fine.
bq. The models are indeed tricky and I wonder how we can properly hook them
into the tests, if at all.
D'oh! Forgot about that. If we have tagged data in the project, it helps show
the other parts of an NLP suite. It's hard to get a full picture of the jigsaw
puzzle if you don't know NLP software.
> Add OpenNLP Analysis capabilities as a module
> ---------------------------------------------
>
> Key: LUCENE-2899
> URL: https://issues.apache.org/jira/browse/LUCENE-2899
> Project: Lucene - Java
> Issue Type: New Feature
> Components: modules/analysis
> Reporter: Grant Ingersoll
> Assignee: Grant Ingersoll
> Priority: Minor
> Attachments: LUCENE-2899.patch, opennlp_trunk.patch
>
>
> Now that OpenNLP is an ASF project and has a nice license, it would be nice
> to have a submodule (under analysis) that exposed capabilities for it. Drew
> Farris, Tom Morton and I have code that does:
> * Sentence Detection as a Tokenizer (could also be a TokenFilter, although it
> would have to change slightly to buffer tokens)
> * NamedEntity recognition as a TokenFilter
> We are also planning a Tokenizer/TokenFilter that can put parts of speech as
> either payloads (PartOfSpeechAttribute?) on a token or at the same position.
> I'd propose it go under:
> modules/analysis/opennlp