File Encoding Issues
--------------------
Key: OPENNLP-367
URL: https://issues.apache.org/jira/browse/OPENNLP-367
Project: OpenNLP
Issue Type: Bug
Components: Command Line Interface
Affects Versions: tools-1.5.2-incubating
Environment: All
Reporter: James Kosin
Assignee: James Kosin
The input and output encodings are not handled correctly. A good example is
the CoNLL 2002 data: even when it is correctly encoded in UTF-8, training does
not work correctly unless -Dfile.encoding=UTF-8 is specified on the java
command line.
We already specify the input and expected output encoding on the command-line
interface with the -encoding parameter. For some reason this isn't being
followed.
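To illustrate the kind of failure described above, here is a minimal sketch of
how the same UTF-8 bytes decode differently depending on the charset handed to
the reader. It is not taken from the OpenNLP sources; the class EncodingSketch
and its decode method are hypothetical names used only for this example. The
assumption is that a reader created without an explicit charset falls back to
the JVM default (the value -Dfile.encoding influences), so ISO-8859-1 stands in
here for a non-UTF-8 platform default.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not taken from the OpenNLP code base.
final class EncodingSketch {

    // Decodes bytes through a Reader that is given an explicit charset,
    // so the result does not depend on the file.encoding system property.
    static String decode(byte[] data, Charset charset) {
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            StringBuilder text = new StringBuilder();
            char[] buffer = new char[1024];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                text.append(buffer, 0, read);
            }
            return text.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A word with one non-ASCII character, encoded as UTF-8 bytes.
        String original = "Espa\u00f1a";
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);

        // Same bytes, two charsets: only the matching one round-trips.
        String right = decode(utf8, StandardCharsets.UTF_8);
        String wrong = decode(utf8, StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 round trip intact: " + right.equals(original));
        System.out.println("ISO-8859-1 decode intact: " + wrong.equals(original));
    }
}
```

If the value of -encoding were passed through to every reader and writer in
this way, the result should no longer depend on -Dfile.encoding. Whether that
is where the current code goes wrong still needs to be confirmed.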
I'll work on fixing this for the next major release... :-)