[R] arules killed
Hi,

I recently got a bizarre message when running arules. It just said "Killed" and quit. Does anyone know why this might have happened? I am running R on an AWS quad XL Ubuntu instance.

Here is some information, including the dataset size and the parameters:

    parameter specification:
       confidence minval smax arem  aval originalSupport      support minlen maxlen target   ext
     0.0003581251    0.1    1 none FALSE            TRUE 3.581251e-05      2      4  rules FALSE

    algorithmic control:
     filter tree heap memopt load sort verbose
        0.1 TRUE TRUE  FALSE TRUE    2    TRUE

    apriori - find association rules with the apriori algorithm
    version 4.21 (2004.05.09)        (c) 1996-2004   Christian Borgelt
    set item appearances ...[1712 item(s)] done [0.00s].
    set transactions ...[1712 item(s), 837696 transaction(s)] done [3.99s].
    sorting and recoding items ... [1561 item(s)] done [1.83s].
    creating transaction tree ... done [1.65s].
    checking subsets of size 1 2 3Killed

Thanks,
Patrick McCann

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
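A bare "Killed" is what a shell typically prints when a process receives SIGKILL. On Linux one common cause is the kernel's out-of-memory killer, although nothing in the log above confirms that this is what happened here. As a rough illustration of why memory use could jump while "checking subsets of size ... 3", the sketch below computes worst-case candidate counts from the numbers in the log (1561 recoded items, 837696 transactions, maxlen 4). These are upper bounds only, since apriori prunes candidates, and the variable names are made up for the example.

```r
# Numbers taken from the apriori log above.
n_items <- 1561          # items left after "sorting and recoding items"
n_trans <- 837696        # transactions
support <- 3.581251e-05  # relative minimum support

# Absolute support: how many transactions an itemset must appear in.
min_count <- support * n_trans

# Worst-case number of candidate itemsets of size 2, 3 and 4
# (every combination of frequent items; apriori prunes many of these).
sizes <- 2:4
upper_bound <- choose(n_items, sizes)
names(upper_bound) <- paste0("size_", sizes)

cat("minimum support count (approx.):", round(min_count), "\n")
print(format(upper_bound, big.mark = ",", scientific = FALSE))
```

If memory does turn out to be the limit, raising support or lowering maxlen (both already present in the parameter list above) are the obvious things to try first.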
[R] pmml for random forest rules
Hi,

I am having some trouble using R 2.13.1 to generate a pmml object from an object of class c('randomForest.formula', 'randomForest').

I see that these methods are available:

    > methods(pmml)
     [1] pmml.coxph*    pmml.hclust*   pmml.itemsets* pmml.kmeans*   pmml.ksvm*     pmml.lm*       pmml.multinom* pmml.nnet*     pmml.rpart*
    [10] pmml.rsf*      pmml.rules*    pmml.survreg*

However, the R Journal 1/1, p. 64 says there should be a method available (http://journal.r-project.org/2009-1/RJournal_2009-1_Guazzelli+et+al.pdf):

    Random Forest (and randomSurvivalForest) -- randomForest (Breiman and Cutler. R port by A. Liaw and M. Wiener, 2009) and randomSurvivalForest (Ishwaran and Kogalur, 2009): PMML export of a randomSurvivalForest rsf object. This function gives the user the ability to export PMML containing the geometry of a forest.

However, if I run these lines of code:

    library(pmml)
    library(randomForest)
    (iris.rf <- randomForest(Species ~ ., data=iris))
    pmml(iris.rf)

I get this error:

    Error in UseMethod("pmml") :
      no applicable method for 'pmml' applied to an object of class "c('randomForest.formula', 'randomForest')"

Also, if I run these lines of code:

    library(arules)
    data(Adult)
    ## Mine association rules.
    rules <- apriori(Adult, parameter = list(supp = 0.5, conf = 0.9, target = "rules"))
    pmml(rules)

I get this error:

    > pmml(rules)
    Error in function (classes, fdef, mtable)  :
      unable to find an inherited method for function "size", for signature "itemMatrix"

With this traceback:

    > traceback()
    5: stop("unable to find an inherited method for function \"", fdef@generic,
           "\", for signature ", cnames)
    4: function (classes, fdef, mtable)
       {
           methods <- .findInheritedMethods(classes, fdef, mtable)
           if (length(methods) == 1L)
               return(methods[[1L]])
           else if (length(methods) == 0L) {
               cnames <- paste("\"", sapply(classes, as.character), "\"", sep = "", collapse = ", ")
               stop("unable to find an inherited method for function \"", fdef@generic, "\", for signature ", cnames)
           }
           else stop("Internal error in finding inherited methods; didn't return a unique method")
       }(list("itemMatrix"), function (object)
       standardGeneric("size"), <environment>)
    3: size(is.unique)
    2: pmml.rules(rules)
    1: pmml(rules)

Thanks,
Patrick McCann
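The first error above is the standard S3 dispatch failure: UseMethod() walks the object's class vector and stops when no method matches any entry. The sketch below reproduces that mechanism in base R only. export_model and fake_rf are hypothetical stand-ins for pmml() and iris.rf, so the code runs without the pmml or randomForest packages; it shows how the error arises, not why the installed pmml version lacks a randomForest method.

```r
# Hypothetical generic standing in for pmml(); not part of any package.
export_model <- function(model, ...) UseMethod("export_model")

# A single method, registered for class "lm" only.
export_model.lm <- function(model, ...) "exported lm"

# Stand-in object with the same class vector as iris.rf in the post.
fake_rf <- structure(list(), class = c("randomForest.formula", "randomForest"))

# getS3method(..., optional = TRUE) returns NULL instead of failing
# when no method exists for the given class.
lm_method <- getS3method("export_model", "lm", optional = TRUE)
rf_method <- getS3method("export_model", "randomForest", optional = TRUE)
print(is.function(lm_method))  # a method was found for "lm"
print(is.null(rf_method))      # nothing found for "randomForest"

# Calling the generic with no matching method raises the dispatch error;
# tryCatch captures its message instead of stopping the script.
msg <- tryCatch(export_model(fake_rf),
                error = function(e) conditionMessage(e))
print(msg)
```

With the real package, the equivalent check is the methods(pmml) listing shown in the post, which contains no entry for either randomForest class.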
Re: [R] unique possible bug
The error I am referring to is in unique.c in base R: it cannot accommodate more than 2^29 values, even though it appears the overflow protection should be 2^30. The only relevance of the arules package is that I was using it when I discovered this issue.

Thanks,
Patrick

2011/10/6 Uwe Ligges <lig...@statistik.tu-dortmund.de>:
> On 05.10.2011 22:15, Patrick McCann wrote:
>> Hi,
>>
>> I am trying to read in a rather large list of transactions using the
>> arules library.
>
> You mean the arules package?
>
>> It seems in the coerce method into the dgCmatrix, it somewhere calls
>> unique. Unique.c throws an error when n > 536870912; however, when 4*n
>> was modified to 2*n in 2004, the overflow protection should have changed
>> from 2^29 to 2^30, right?
>>
>> If so, how would I change it in my copy? Do I have to recompile
>> everything?
>
> Yes. There is also the way to ask the maintainer to improve it, but it
> won't work without reinstallation of the changed package sources.
>
> Uwe Ligges
>
>> Thanks,
>> Patrick McCann
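Short of rebuilding R, one possible workaround when only the set of distinct values is needed in user code is to avoid the hashing path altogether: sort the vector and drop adjacent duplicates. This is a minimal base R sketch. unique_by_sort is a hypothetical helper, not a function in base R or arules, and unlike unique() it returns the values in sorted order and drops NAs (the default behaviour of sort()). It does not help when unique() is called inside package code, as in the arules coercion described above.

```r
# Hypothetical helper: distinct values via sorting instead of hashing.
# Differences from unique(): result is sorted, and NAs are dropped by sort().
unique_by_sort <- function(x) {
  s <- sort(x)
  if (length(s) < 2L) return(s)
  # Keep the first element, then every element that differs from its
  # predecessor in the sorted vector.
  keep <- c(TRUE, s[-1L] != s[-length(s)])
  s[keep]
}

small <- c(3, 1, 3, 2, 1)
print(unique_by_sort(small))
```

On a small input the result agrees with sort(unique(x)); the sketch has not been timed or memory-profiled at the 2^29 scale discussed in the thread.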
[R] unique possible bug
Hi,

I am trying to read in a rather large list of transactions using the arules library. It seems in the coerce method into the dgCmatrix, it somewhere calls unique. Unique.c throws an error when n > 536870912; however, when 4*n was modified to 2*n in 2004, the overflow protection should have changed from 2^29 to 2^30, right?

If so, how would I change it in my copy? Do I have to recompile everything?

Thanks,
Patrick McCann

Here is a simple to reproduce example:

    > runif(2^29+5) -> a
    > sum(unique(a)) -> b
    Error in unique.default(a) : length 536870917 is too large for hashing
    > traceback()
    3: unique.default(a)
    2: unique(a)
    1: unique(a)
    > unique.default
    function (x, incomparables = FALSE, fromLast = FALSE, ...)
    {
        z <- .Internal(unique(x, incomparables, fromLast))
        if (is.factor(x))
            factor(z, levels = seq_len(nlevels(x)), labels = levels(x),
                ordered = is.ordered(x))
        else if (inherits(x, "POSIXct"))
            structure(z, class = class(x), tzone = attr(x, "tzone"))
        else if (inherits(x, "Date"))
            structure(z, class = class(x))
        else z
    }
    <environment: namespace:base>

From http://svn.r-project.org/R/trunk/src/main/unique.c I see:

    /*
       Choose M to be the smallest power of 2
       not less than 2*n and set K = log2(M).
       Need K >= 1 and hence M >= 2, and 2^M <= 2^31 -1, hence n <= 2^29.

       Dec 2004: modified from 4*n to 2*n, since in the worst case we have
       a 50% full table, and that is still rather efficient -- see
       R. Sedgewick (1998) Algorithms in C++ 3rd edition p.606.
    */
    static void MKsetup(int n, HashData *d)
    {
        int n4 = 2 * n;

        if(n < 0 || n > 536870912) /* protect against overflow to -ve */
            error(_("length %d is too large for hashing"), n);
        d->M = 2;
        d->K = 1;
        while (d->M < n4) {
            d->M *= 2;
            d->K += 1;
        }
    }
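To make the arithmetic behind the question concrete, the sketch below mirrors the MKsetup loop quoted above in R, using doubles so that nothing overflows inside the simulation itself. mk_table_size is a hypothetical name for illustration. It assumes M is held in a 32-bit signed int in the C code, so any M above .Machine$integer.max (2^31 - 1) would overflow there.

```r
# Mirror of the MKsetup loop from unique.c, in doubles (no overflow in R).
# mk_table_size is a made-up name; it returns the table size M and K = log2(M).
mk_table_size <- function(n) {
  n2 <- 2 * n          # the C code calls this n4 but computes 2 * n
  M <- 2
  K <- 1
  while (M < n2) {     # smallest power of 2 not less than 2 * n
    M <- M * 2
    K <- K + 1
  }
  list(M = M, K = K)
}

int_max    <- .Machine$integer.max      # 2^31 - 1 for a 32-bit signed int
at_limit   <- mk_table_size(2^29)       # n exactly at the guard
past_limit <- mk_table_size(2^29 + 1)   # one element past the guard

cat("n = 2^29     : K =", at_limit$K,
    " M fits in int:", at_limit$M <= int_max, "\n")
cat("n = 2^29 + 1 : K =", past_limit$K,
    " M fits in int:", past_limit$M <= int_max, "\n")
```

Under that assumption, 2*n itself stays within int range up to n = 2^30, but M is rounded up to the next power of two, so it would already need 2^31 for any n above 2^29. This suggests the 2^29 guard may still be the appropriate bound for the 2*n version, although the sketch only checks the arithmetic and not the rest of unique.c.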