[
https://issues.apache.org/jira/browse/OPENNLP-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17761328#comment-17761328
]
Martin Wiesner commented on OPENNLP-1190:
-----------------------------------------
As of 2023, [https://www.lsi.upc.es/~nlp/tools/nerc/nerc.html] returns a 404,
so the resource mentioned on the mailing list in 2014 is no longer available
at that URL.
> CONLL02 format
> --------------
>
> Key: OPENNLP-1190
> URL: https://issues.apache.org/jira/browse/OPENNLP-1190
> Project: OpenNLP
> Issue Type: Bug
> Components: Formats
> Affects Versions: tools-1.5.3
> Reporter: Luca
> Priority: Major
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> According to the documentation, the following should work:
> bin/opennlp TokenNameFinderConverter conll02 -data esp.train -lang es -types
> per > es_corpus_train_persons.txt
> However, it currently fails with an error message because it expects 3
> columns instead of the 2 that are in the dataset.
> This is a bug introduced at line 130 of
> opennlp.tools.formats.Conll02NameSampleStream.java, where a length of 3 is
> imposed.
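
For illustration only, the column mismatch described above can be sketched as follows. This is not the OpenNLP implementation: the class Conll02LineSketch and the method parseLine are hypothetical names, and the sample lines are only meant to resemble CoNLL-2002 data. The sketch assumes whitespace-separated columns, with the named-entity tag always in the last column: 2 columns (token, tag) as in the Spanish files, or 3 columns (token, POS, tag) as in the Dutch files.

```java
// Hypothetical sketch, not part of OpenNLP: accept both 2- and 3-column lines.
final class Conll02LineSketch {

    // Returns {token, neTag}. Assumes the NE tag is always the last column.
    static String[] parseLine(String line) {
        String[] fields = line.trim().split("\\s+");
        if (fields.length == 2) {
            // token, NE tag (layout of the Spanish data, e.g. esp.train)
            return new String[] {fields[0], fields[1]};
        }
        if (fields.length == 3) {
            // token, POS tag, NE tag (layout of the Dutch data)
            return new String[] {fields[0], fields[2]};
        }
        throw new IllegalArgumentException(
            "Expected 2 or 3 columns, got " + fields.length + ": '" + line + "'");
    }

    public static void main(String[] args) {
        String[] es = parseLine("Melbourne B-LOC");
        String[] nl = parseLine("De Art O");
        System.out.println(es[0] + " " + es[1]);
        System.out.println(nl[0] + " " + nl[1]);
    }
}
```

Blank lines, which separate sentences in these files, are rejected by this sketch and would have to be handled by the caller. A check that insists on exactly 3 fields would reject every line of a 2-column file, which matches the behaviour reported in the issue.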
--
This message was sent by Atlassian Jira
(v8.20.10#820010)