Hi list

Closing this one off myself. This is what I did:

The error seems to stem from the update of tm to version 0.6, in which the conversion to lower case should now be written as:

> docs <- tm_map(docs, content_transformer(tolower))

Everything else seems to work fine thereafter.

The issue in the tutorial concerns section 3.1, where Graham creates a function toSpace. This seems to introduce an additional term that tm_map, and later DocumentTermMatrix, do not know how to handle. That is probably an incorrect interpretation of what's going on, but the fix appears to be to use the above line earlier in the preparation stage.
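
For anyone finding this later, here is a minimal sketch of the ordering that worked for me, assuming tm 0.6 or later. The two-document corpus is made up for illustration, and the toSpace definition is my own reconstruction of a helper in the style of the tutorial, not a verbatim copy of Graham's code:

```r
library(tm)  # assumes tm >= 0.6

# Tiny in-memory corpus, purely illustrative (not the tutorial's data)
docs <- VCorpus(VectorSource(c("Text/Mining with R", "Hello@World AGAIN")))

# Lower-case first; content_transformer() keeps each element a proper
# document object instead of collapsing it to a bare character vector
docs <- tm_map(docs, content_transformer(tolower))

# Assumed toSpace-style helper: replace anything matching `pattern` with a space
toSpace <- content_transformer(function(x, pattern) gsub(pattern, " ", x))
docs <- tm_map(docs, toSpace, "/|@")

# With every element still a document, building the matrix should succeed
dtm <- DocumentTermMatrix(docs)
inspect(dtm)
```

The point of wrapping every custom function in content_transformer() is that DocumentTermMatrix expects documents it can read metadata from, which seems to be what the 'meta' error was complaining about.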

If anyone has more informed insight, please share.

Cheers

Sun

On 25/02/15 17:33, Sun Shine wrote:
Hi list

I've been working my way through a tutorial on text mining ( http://onepager.togaware.com/TextMiningO.pdf ) and all was well until I came across this problem using tm (text miner):

++++++++++code+++++++++++++++++++
> docs <- tm_map(docs, content_transformer(tolower))
Warning messages:
1: In mclapply(x$content[i], function(d) tm_reduce(d, x$lazy$maps)) :
  all scheduled cores encountered errors in user code
2: In mclapply(content(x), FUN, ...) :
  all scheduled cores encountered errors in user code
++++++++++end-code++++++++++++++++

After some searching, it appears the best fix for this problem is to pass an explicit lazy=TRUE argument to tm_map, like this:

> docs <- tm_map(docs, content_transformer(tolower), lazy=TRUE)

However, a little further on in the tutorial to set up the text matrix, a related (?) error was returned:

++++++++++code+++++++++++++++++++
> dtm <- DocumentTermMatrix(docs)
Error in UseMethod("meta", x) :
no applicable method for 'meta' applied to an object of class "try-error"
In addition: Warning message:
In mclapply(unname(content(x)), termFreq, control) :
  all scheduled cores encountered errors in user code
++++++++++end-code++++++++++++++++

I tried applying the explicit lazy=TRUE again, but it doesn't change things. I have gone over the tutorial again and have followed all of the steps (including loading the requisite libraries). Moreover, searching on the web seems to return several contradictory suggestions and I'm no wiser than I was before.
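
A note on the "all scheduled cores encountered errors" warning: it comes from parallel::mclapply, which returns "try-error" objects when its workers fail and so hides the real message. A possible way to surface the underlying error is to limit it to one core, where it falls back to a plain lapply. The sketch below uses base R only; bad_step is a hypothetical stand-in for whatever transformation is failing, and I am assuming tm picks up the mc.cores option in its own mclapply calls:

```r
library(parallel)

# Hypothetical stand-in for a transformation that fails on every document
bad_step <- function(d) stop("underlying problem in user code")

# mclapply() takes its default core count from this option; with a single
# core it runs as an ordinary lapply(), so the error is raised directly
options(mc.cores = 1)

# Capture the message rather than aborting, so it can be read
res <- tryCatch(mclapply(1:2, bad_step),
                error = function(e) conditionMessage(e))
print(res)
```

Setting options(mc.cores = 1) before rerunning the tm_map and DocumentTermMatrix calls may likewise show the actual error rather than the generic warning.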

The closest I came to an answer was at Stack Overflow http://stackoverflow.com/questions/24771165/r-project-no-applicable-method-for-meta-applied-to-an-object-of-class-charact and that answer suggested using the latest tm (v 0.6) and claimed that the earlier tolower step was wrong. However, my code used the recommended: corpus <- tm_map(corpus, content_transformer(tolower))

Is there anyone on the list who could either sign-post me to a solution or assist in debugging this please?

I'm running R version 3.1.2 and tm is 0.6

Many thanks

Sun



______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
