Joern Kottmann created OPENNLP-933:
--------------------------------------
Summary: Replace Leipzig corpus data with Wikinews
Key: OPENNLP-933
URL: https://issues.apache.org/jira/browse/OPENNLP-933
Project: OpenNLP
Issue Type: Improvement
Reporter: Joern Kottmann
Wikinews is available in many languages and is licensed under CC-A 2.5
(Creative Commons Attribution 2.5), which Apache classifies as a Category B
license. It should be OK to include it as a testing resource to ensure that
OpenNLP works properly.
This data can be used to test existing models, and parts of it can be
annotated automatically to test training of all our components.