Gabor,
Thanks for the suggestion, I'll try it out tonight or tomorrow.
Regards,
Richard
--
Richard R. Liu
Dittingerstr. 33
CH-4053 Basel
Switzerland
Tel. +41 79 708 67 66
Sent from my iPhone 3GS
On Apr 29, 2010, at 13:06, Gabor Grothendieck <ggrothendi...@gmail.com> wrote:
In developing a machine learner to classify sentences in plain text
sources of scientific documents, I have been using the caret package and
following the procedures described in the vignettes. What I miss in the
package -- but quite possibly I am overlooking it! -- is functions
I'm running R 2.10.0 under Mac OS X 10.5.8; however, I don't think this
is a Mac-specific problem.

I have a very large (158,908 possible sentences, ca. 58 MB) plain text
document d which I am trying to tokenize:

    t <- strapply(d, "\\w+", perl = TRUE)

I am encountering the following error:

Error in
When I run sentDetect in the openNLP package I receive a Java heap space
exception. How can I increase the heap space?
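A common remedy (not given in this thread; a sketch assuming the rJava-based openNLP build, with "-Xmx4g" as an illustrative heap size) is to set the JVM heap option before rJava initializes, since the JVM is started only once per R session:

```r
# Request a 4 GB Java heap. This must run BEFORE rJava (and hence
# openNLP) is loaded; once the JVM has started, the option has no effect.
options(java.parameters = "-Xmx4g")

# library(openNLP)  # load openNLP only after setting the option
```

If openNLP has already been attached, restart R and set the option first.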
I am running the 64-bit Leopard version of R 2.9.2 and R.app on a Mac with
OS X 10.5.8.
Thanks,
Richard
I'm new to R. I'm working with the text mining package tm. I have several
plain text documents in a directory, and I would like to read all the files
with extension .txt in that directory into a vector, one text document per
vector element. That is, v[1] would be the first document, v[2] the second,
and so on.
kenhorvath wrote:
Paul Hiemstra wrote:
    file_list <- list.files("/where/are/the/files", pattern = "\\.txt$",
                            full.names = TRUE)
    obj_list <- lapply(file_list, FUN = yourfunction)

(full.names = TRUE is needed so the files can be opened from outside that
directory.) yourfunction is probably either read.table or some read function
from the tm package. So obj_list will become a list of either data.frames or
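As a self-contained sketch of the pattern above (the directory and file names are made up for illustration, and readLines stands in for yourfunction), this reads every .txt file in a directory into one character vector, one document per element:

```r
# Create a throwaway directory with two small .txt files for illustration.
dir <- tempfile("txts")
dir.create(dir)
writeLines("first document", file.path(dir, "a.txt"))
writeLines("second document", file.path(dir, "b.txt"))

# List only the .txt files, with full paths so they can be opened directly.
file_list <- list.files(dir, pattern = "\\.txt$", full.names = TRUE)

# Read each file and collapse its lines into a single string, giving a
# character vector v with one element per document.
v <- vapply(file_list,
            function(f) paste(readLines(f), collapse = "\n"),
            character(1), USE.NAMES = FALSE)
```

Here v[1] holds the text of a.txt and v[2] the text of b.txt.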