Apologies for being late to this thread.

On 02/12/10 17:39, Sascha Wolfer wrote:
I seem to have a problem with the openNLP package; I'm actually stuck at the very beginning. Here's what I did:
> install.packages("openNLP")
> install.packages("openNLPmodels.de", repos = "http://datacube.wu.ac.at/", type = "source")

> library(openNLPmodels.de)
> library(openNLP)

So I installed the main package as well as the supplementary German model. Now I try to use the "sentDetect" function:

> s <- c("Das hier ist ein Satz. Und hier ist noch einer - sogar mit Gedankenstrich. Ist das nicht toll?")
> sentDetect(s, language = "de", model = "openNLPmodels.de")

I get the following error message which I can't make any sense of:

Error in .jnew("opennlp/maxent/io/SuffixSensitiveGISModelReader", .jnew("java.io.File", : java.io.FileNotFoundException: openNLPmodels.de (No such file or directory)

The correct syntax seems to be

sentDetect(s, model = system.file("models", "de-sent.bin", package = "openNLPmodels.de"))


but unfortunately I get

Error in .jcall(.jnew("opennlp/maxent/io/SuffixSensitiveGISModelReader",  :
  java.io.UTFDataFormatException: malformed input around byte 48


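YMMV, but you get the idea of the syntax for the model= argument: it expects the path to a model file, not a package name. This "works", although it applies the English model to German text: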

sentDetect(s, model = system.file("models", "sentdetect", "EnglishSD.bin.gz", package = "openNLPmodels.en"))
# [1] "Das hier ist ein Satz. "
# [2] "Und hier ist noch einer - sogar mit Gedankenstrich. "
# [3] "Ist das nicht toll?"
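
For what it's worth, the FileNotFoundException in the first attempt suggests that model= is treated as a file path, so it may help to check that the path actually resolves before handing it to sentDetect. Here is a minimal sketch in base R. Note that find_model_file is a hypothetical helper of my own naming, not part of openNLP; it relies only on system.file() returning an empty string when nothing is found:

```r
# Hypothetical helper (not part of openNLP): resolve a file inside an
# installed package and stop with a readable message if it is missing.
find_model_file <- function(..., package) {
  path <- system.file(..., package = package)
  # system.file() returns "" when the package or the file cannot be found
  if (!nzchar(path)) {
    stop("no such file in package '", package, "': ", file.path(...))
  }
  path
}

# Possible usage, assuming openNLP and the model package are installed:
# list.files(system.file("models", package = "openNLPmodels.de"), recursive = TRUE)
# sentDetect(s, model = find_model_file("models", "de-sent.bin",
#                                       package = "openNLPmodels.de"))
```

This only confirms that the file exists. It says nothing about whether the model format matches what the installed openNLP expects, which is what the UTFDataFormatException above seems to point at.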


Hope this helps you a little.

Allan

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
