File Encoding Issues
--------------------

                 Key: OPENNLP-367
                 URL: https://issues.apache.org/jira/browse/OPENNLP-367
             Project: OpenNLP
          Issue Type: Bug
          Components: Command Line Interface
    Affects Versions: tools-1.5.2-incubating
         Environment: All
            Reporter: James Kosin
            Assignee: James Kosin


The input and output encodings are not handled correctly.  A good example is 
the CoNLL 2002 data: even when it is correctly encoded in UTF-8, it does not 
work for training unless -Dfile.encoding=UTF-8 is passed to the java command.

We already specify the input and expected output encoding on the command line 
interface with the -encoding parameter.  For some reason this setting isn't 
being honored.
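
The symptom is consistent with a reader being built without an explicit 
charset, so that it falls back to the platform default (which is what 
-Dfile.encoding changes).  That cause is an assumption and is not confirmed in 
this report.  The sketch below only illustrates the difference: readFirstLine 
and EncodingSketch are hypothetical names, not OpenNLP code, and 
StandardCharsets assumes Java 7 or later.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingSketch {

    // Hypothetical helper: decodes the stream with the charset the caller
    // names instead of relying on the JVM's platform default.
    static String readFirstLine(InputStream in, Charset encoding) throws IOException {
        BufferedReader reader = new BufferedReader(new InputStreamReader(in, encoding));
        return reader.readLine();
    }

    public static void main(String[] args) throws IOException {
        // "Manana" with n-tilde, written as an escape so the source file's
        // own encoding cannot affect the demonstration.
        String word = "Ma\u00f1ana";
        byte[] utf8 = word.getBytes(StandardCharsets.UTF_8);

        // Decoding with the charset the data was written in round-trips.
        String good = readFirstLine(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8);

        // Decoding the same bytes with a single-byte charset, as a
        // mismatched platform default would, splits the two-byte n-tilde.
        String bad = readFirstLine(new ByteArrayInputStream(utf8), StandardCharsets.ISO_8859_1);

        System.out.println(good.equals(word)); // true
        System.out.println(good.length());     // 6
        System.out.println(bad.length());      // 7
    }
}
```

If the value given to -encoding were passed down to every reader and writer in 
this way, the result would no longer depend on -Dfile.encoding.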

I'll work on fixing this for the next major release...  :-)


