On 02/28/2013 04:54 PM, Stefan Matheis wrote:
Hey Guys

I know this may not be the normal use case, but my Java knowledge is limited 
and a command-line call would be the easiest way to integrate the OpenNLP 
capabilities into the project, so bear with me :)

$ cat input.txt
Sein Song "Nightcall" hat den Film "Drive" mit Ryan Gosling erst so richtig 
bekannt gemacht. Wir haben uns mit Vincent Belorgey, besser bekannt als Kavinsky, über sein 
Debütalbum, seine Musik und die 80er Jahre unterhalten.

$ bin/opennlp SimpleTokenizer < input.txt
Sein Song " Nightcall " hat den Film " Drive " mit Ryan Gosling erst so richtig 
bekannt gemacht . Wir haben uns mit Vincent Belorgey , besser bekannt als Kavinsky , ?? ber sein 
Deb ?? talbum , seine Musik und die 80 er Jahre unterhalten .
Average: 166.7 sent/s
Total: 1 sent

Runtime: 0.006s

Is my console misconfigured? Is the input perhaps not encoded correctly? Or does it 
just not work? Of course I can work around that and somehow create a mapping 
for the words that originally contained umlauts, but is it possible 
to avoid that? (:

While searching the web I found 
https://issues.apache.org/jira/browse/OPENNLP-172, but I'm not sure whether it is 
related to my problem.


OpenNLP uses the platform default encoding to read from the console. That usually works as long as the platform default encoding
can encode the content that is passed to OpenNLP.

Which OS are you running? What is your platform encoding?
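
To check the platform encoding from the Java side, a small stand-alone program along these lines may help. It is only a sketch: the class name EncodingCheck and the helper decode are made up for illustration and are not part of OpenNLP. It prints the JVM's default charset and shows what happens to a UTF-8 encoded "ü" when it is decoded with a charset that cannot represent it, which is one plausible explanation for the "??" in the output above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of OpenNLP.
public class EncodingCheck {

    // Decode raw bytes with an explicitly named charset
    // instead of relying on the platform default.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // The charset the JVM uses when no charset is given explicitly.
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("file.encoding:   " + System.getProperty("file.encoding"));

        // "über" written with a Unicode escape so the source file encoding does not matter.
        byte[] utf8 = "\u00fcber".getBytes(StandardCharsets.UTF_8);

        // Decoding with the matching charset restores the original text.
        System.out.println("as UTF-8 matches: "
                + decode(utf8, StandardCharsets.UTF_8).equals("\u00fcber"));

        // Decoding as US-ASCII turns each of the two bytes of "ü"
        // into the replacement character U+FFFD.
        String broken = decode(utf8, StandardCharsets.US_ASCII);
        System.out.println("as US-ASCII starts with U+FFFD: "
                + broken.startsWith("\uFFFD\uFFFD"));
    }
}
```

If the default charset printed here is not UTF-8 while input.txt is stored as UTF-8, that mismatch would be the first thing to fix, for example by changing the locale of the shell or by starting the JVM with -Dfile.encoding=UTF-8 (assuming the launcher script lets you pass JVM options).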

Jörn
