[ 
https://issues.apache.org/jira/browse/OPENNLP-557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13555996#comment-13555996
 ] 

Joern Kottmann commented on OPENNLP-557:
----------------------------------------

To integrate this into OpenNLP, we need to enhance, modify, or extend the existing 
components rather than provide completely new implementations that are specific to 
Polish. The idea in OpenNLP is that a user provides a model, and this model 
then sets up a component to work for the language and domain the model was 
trained on. OpenNLP already contains language-specific feature generation for 
some languages, but not yet for Polish.
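
To make the design idea concrete, here is a minimal sketch of a single generic 
component whose language-specific behaviour comes only from the model it is given. 
All names in it (SketchModel, SketchSentenceDetector, ModelDrivenSketch) are 
hypothetical and are not OpenNLP classes, and the abbreviation lookup is a toy 
stand-in for a trained statistical model, so treat it as an illustration of the 
principle rather than of the actual OpenNLP API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical demo: the same component class is configured by two different models.
final class ModelDrivenSketch {
    public static void main(String[] args) {
        String text = "Mieszka przy ul. Polnej. Pracuje w Warszawie.";
        SketchSentenceDetector polish =
            new SketchSentenceDetector(SketchModel.of("pl", "ul.", "np."));
        SketchSentenceDetector english =
            new SketchSentenceDetector(SketchModel.of("en", "dr.", "e.g."));
        System.out.println(polish.detect(text));
        System.out.println(english.detect(text));
    }
}

// Hypothetical stand-in for a trained model: it carries the language-specific
// knowledge, reduced here to a language code and a set of abbreviations.
final class SketchModel {
    final String language;
    final Set<String> abbreviations;

    private SketchModel(String language, Set<String> abbreviations) {
        this.language = language;
        this.abbreviations = abbreviations;
    }

    static SketchModel of(String language, String... abbreviations) {
        return new SketchModel(language, new HashSet<>(Arrays.asList(abbreviations)));
    }
}

// One generic component; there is no Polish-specific subclass. What it treats
// as a sentence boundary depends only on the model passed to the constructor.
final class SketchSentenceDetector {
    private final SketchModel model;

    SketchSentenceDetector(SketchModel model) {
        this.model = model;
    }

    List<String> detect(String text) {
        List<String> sentences = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) != '.') {
                continue;
            }
            boolean atEnd = i + 1 == text.length();
            if (!atEnd && !Character.isWhitespace(text.charAt(i + 1))) {
                continue; // a period inside a token is never a boundary here
            }
            // The token ending at this period, starting after the previous space.
            int tokenStart = Math.max(text.lastIndexOf(' ', i) + 1, start);
            String token = text.substring(tokenStart, i + 1).toLowerCase(Locale.ROOT);
            if (model.abbreviations.contains(token)) {
                continue; // the model says this period belongs to an abbreviation
            }
            sentences.add(text.substring(start, i + 1).trim());
            start = i + 1;
        }
        String rest = text.substring(start).trim();
        if (!rest.isEmpty()) {
            sentences.add(rest);
        }
        return sentences;
    }
}
```

With the hypothetical Polish model the abbreviation "ul." does not end a 
sentence, while the same detector given the English model splits after it. 
Supporting a new language in this style means supplying a model (and, where 
needed, language-specific feature generation), not a new component class.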

I suggest that we move the discussion on how to achieve that to the dev mailing 
list. A good approach would probably be to discuss it component by component.
                
> Polish NLP
> ----------
>
>                 Key: OPENNLP-557
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-557
>             Project: OpenNLP
>          Issue Type: New Feature
>            Reporter: Tomek
>            Priority: Minor
>              Labels: patch
>
> Hi,
> we recently developed some NLP tools for the Polish language.
> We have implemented several OpenNLP interfaces (which we would like to 
> include in the OpenNLP project):
> - Sentence Detector
> - Tokenizer
> - Document Categorizer (it needs the tc.xml and cache.db files, which are 
> included in the package, to be included in the project)
> - Part-of-Speech Tagger
> - Chunker
> - Keyword Extractor
> Download the package: 
> https://dl.dropbox.com/u/4021344/polishNLP.7z
> The package consists of a manual (manual/open_nlp_manual.html), Javadoc, 
> compiled Java libraries, sources, and the cache.db and tc.xml files (used by 
> the Document Categorizer).
> Tomek
