On 10/13/2011 4:23 AM, Jörn Kottmann wrote:
> On 10/13/11 4:54 AM, James Kosin wrote:
>> I found this article and several other references on how to fix these.
>> We may need to refactor the general output to the same encoding as the
>> input files to fix this on the terminal.
>>
>> http://hints.macworld.com/article.php?story=20050208053951714
>
> Doesn't the console need to know in which encoding characters are
> printed?
> I wonder if it works to use UTF-8 on windows.
>
> And the Linux system might already have a UTF-8 default encoding.
> I will try it with the data we got on my Ubuntu test system.
>
> Jörn
>
Windows only offers the options ANSI and Unicode.
Unfortunately, when I switch to Unicode, the models fail to load.
The file still shows block characters in place of the special
characters that don't display correctly.  I use Notepad++, and there the
file shows the correct characters when read as UTF-8.
Hmmm.....
Also, "type utf-input.txt" doesn't produce the correct output on the
display either.
I'll experiment more with the chcp command.
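
As a rough sketch of what "output in the same encoding as the input"
could look like on the Java side, the snippet below wraps an output
stream in a PrintStream with an explicit UTF-8 encoding.  It uses only
the standard library; the class and method names (Utf8Console,
utf8Stream) are made up for illustration and are not part of opennlp.
Whether the Windows console then renders the bytes correctly still
depends on the active code page (chcp 65001 selects UTF-8) and on the
console font, so treat this as an assumption to test, not a fix.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

// Hypothetical helper, not part of opennlp.
public class Utf8Console {

    // Wraps an output stream in a PrintStream that encodes text as UTF-8,
    // regardless of the platform default encoding.
    static PrintStream utf8Stream(OutputStream out)
            throws UnsupportedEncodingException {
        return new PrintStream(out, true, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        PrintStream out = utf8Stream(System.out);
        // "\u00f6" is the o-umlaut; written as an escape so the source
        // file encoding does not matter.
        out.println("J\u00f6rn");
    }
}
```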

The bad thing is that there is no way to specify the encoding for the
opennlp models input file.  The default encoding for the OS is assumed
here for some reason.
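
For comparison, here is a minimal sketch of reading text with an
explicitly chosen encoding instead of the OS default, using only the
standard Java library.  The class and method names
(ExplicitEncodingReader, readLines) are hypothetical and this is not how
opennlp reads its input today; it only illustrates what an encoding
parameter could look like.

```java
import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of opennlp.
public class ExplicitEncodingReader {

    // Decodes the stream with the given charset rather than the
    // platform default, and returns its lines.
    static List<String> readLines(InputStream in, Charset charset)
            throws IOException {
        List<String> lines = new ArrayList<String>();
        BufferedReader reader =
                new BufferedReader(new InputStreamReader(in, charset));
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } finally {
            reader.close();
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: input file, args[1]: encoding name such as UTF-8
        Charset charset = Charset.forName(args[1]);
        for (String line : readLines(new FileInputStream(args[0]), charset)) {
            System.out.println(line);
        }
    }
}
```

Run as "java ExplicitEncodingReader utf-input.txt UTF-8"; the println
at the end still goes through the console's own encoding, so the
display problem above is a separate issue.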

James
