On Thursday, February 28, 2013 at 5:26 PM, Jörn Kottmann wrote:

> Hmm, pretty sure there is an encoding mismatch, do you know which  
> encoding is used by
> your JVM? I would guess that is not UTF-8. You can probably get around  
> the issue by re-encoding the input
> file to the encoding the JVM is using.
>  
> Have a look here:
> http://stackoverflow.com/questions/1749064/how-to-find-default-charset-encoding-in-java
>  
> Would be nice if you can run the println statements there.
>  
> Jörn  
Wherever this comes from ...

$ java CharsetTest  
Default Charset=US-ASCII
file.encoding=Latin-1
Default Charset=US-ASCII
Default Charset in Use=ASCII

$ echo $JAVA_TOOL_OPTIONS
(empty)

$ export JAVA_TOOL_OPTIONS='-Dfile.encoding=UTF8'

$ java CharsetTest  
Picked up JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8
Default Charset=UTF-8
file.encoding=Latin-1
Default Charset=UTF-8
Default Charset in Use=UTF8
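
For readers without the linked page at hand, the test behind the output above probably looks roughly like the following. This is a sketch reconstructed from the printed lines, not the exact file that was run; the helper name charsetInUse is an assumption. If the test really does set file.encoding itself at runtime, as assumed here, that would explain why the "file.encoding=Latin-1" line is identical in both runs while the other lines follow JAVA_TOOL_OPTIONS.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;

// Sketch reconstructed from the output above; an assumption, not the original test file.
class CharsetTest {

    // Name of the encoding a writer falls back to when no charset is given
    // (historical names such as "ASCII" or "UTF8").
    static String charsetInUse() {
        OutputStreamWriter writer = new OutputStreamWriter(new ByteArrayOutputStream());
        return writer.getEncoding();
    }

    public static void main(String[] args) {
        System.out.println("Default Charset=" + Charset.defaultCharset());
        // Setting the property this late only changes the property value;
        // the JVM has already determined its default charset by this point.
        System.setProperty("file.encoding", "Latin-1");
        System.out.println("file.encoding=" + System.getProperty("file.encoding"));
        System.out.println("Default Charset=" + Charset.defaultCharset());
        System.out.println("Default Charset in Use=" + charsetInUse());
    }
}
```

The lines to watch are therefore "Default Charset" and "Default Charset in Use", not "file.encoding".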



But this change by itself didn't help: the output remained unchanged. So I took
the road down to dirty-hack-land and applied the following change to
bin/opennlp. It is certainly not how it should be done, but it works, at least
for the moment:

-$JAVACMD -Xmx1024m -jar $OPENNLP_HOME/lib/opennlp-tools-*.jar $@
+$JAVACMD -Xmx1024m -Dfile.encoding=UTF8 -jar $OPENNLP_HOME/lib/opennlp-tools-*.jar $@
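
A less hacky direction than patching the launcher, wherever one controls the code that reads the input, might be to decode with an explicit charset instead of relying on the JVM default. The following is only a minimal sketch of that idea: the class ExplicitCharsetDemo and the helper readFirstLineUtf8 are hypothetical names, and this is not how the OpenNLP command line tools read their input.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration; not part of OpenNLP.
class ExplicitCharsetDemo {

    // Decodes the stream as UTF-8 regardless of -Dfile.encoding or the platform default.
    static String readFirstLineUtf8(InputStream in) {
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // "J\u00f6rn" is "Jörn"; the escape keeps this source file ASCII-only.
        byte[] utf8Bytes = "J\u00f6rn".getBytes(StandardCharsets.UTF_8);
        String line = readFirstLineUtf8(new ByteArrayInputStream(utf8Bytes));
        // Prints "true" even when the default charset is US-ASCII.
        System.out.println(line.equals("J\u00f6rn"));
    }
}
```

With the charset passed explicitly, the result no longer depends on JAVA_TOOL_OPTIONS or on the flag added to bin/opennlp.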


