[jira] [Commented] (OPENNLP-701) Polish language support - Maxent binaries

Chris Krol / IBM (JIRA) Thu, 12 Jun 2014 04:37:24 -0700

    [ 
https://issues.apache.org/jira/browse/OPENNLP-701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14029059#comment-14029059
 ]


Chris Krol / IBM commented on OPENNLP-701:
------------------------------------------

Thanks for your response. 

Could you point me to the important packages or interfaces that would have to 
be implemented in order to fit well into the general OpenNLP design? 

My current idea is a dedicated Parser class, in an opennlp.tools.lang.polish 
package, for the corpus's native format. 

I would still contribute at least the sentence detector and tokenizer, 
because they were created using a huge plaintext data set that is ready to use 
as provided. 

> Polish language support - Maxent binaries
> -----------------------------------------
>
>                 Key: OPENNLP-701
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-701
>             Project: OpenNLP
>          Issue Type: New Feature
>            Reporter: Chris Krol / IBM
>            Priority: Minor
>
> Hi, 
> I currently work at IBM Poland, and my manager has approved the idea of 
> contributing various Maxent binaries for the Polish language (sentence split, 
> sentence detection, POS tagging and morphological analysis, NER). 
> You could possibly put them on your download page. 
> We trained them using the Golden Standard human-annotated Polish National 
> Corpus (GPL 3.0). 
> Would it also be possible to give some credit (of any kind) for the fact that 
> the work was done at IBM?
> I've already sent a mail to the devs, but haven't seen any response for two 
> weeks now. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)
