When I tried to run svm() on the same data frame, memory usage as reported by top(1) doubled to 4GB almost right away, and the function never returned (it has been running for ~15 hours now). ^C does not stop it. This is most unusual; libsvm has always seemed very fast.
This is R version 2.13.1 (2011-07-08) (as distributed with Ubuntu).

> * Sam Steingold <f...@tah.bet> [2012-02-09 21:43:30 -0500]:
>
> I did this:
> nb <- naiveBayes(users, platform)
> pl <- predict(nb,users)
> nrow(users) ==> 314781
> ncol(users) ==> 109
>
> 1. naiveBayes() was quite fast (~20 seconds), while predict() was slow
> (tens of minutes). why?
>
> 2. the predict results were completely off the mark (quite the opposite
> of the expected overfitting). suffice it to show the tables:
>
> pl:
>
>    android blackberry       ipad     iphone         lg      linux        mac
>          3          5         11         14     312723          5         11
>     mobile      nokia    samsung    symbian    unknown    windows
>       1864         17         16        112          0          0
>
> platform:
>    android blackberry       ipad     iphone         lg      linux        mac
>      18013       1221       2647       1328          4       2936      34336
>     mobile      nokia    samsung    symbian    unknown    windows
>         18         88         39        103       2660     251388
>
> i.e., nb classified nearly everything as "lg" while in the actual data
> "lg" is virtually nonexistent.
>
> 3. when I print "nb", I see "A-priori probabilities" (which are what I
> expected) and "Conditional probabilities" which are confusing because
> there are only two of them, e.g.:
>
> android    0.048464998 0.43946764
> blackberry 0.001638002 0.04045564
> ipad       0.322251606 1.84940588
> iphone     0.030873494 0.23250250
> lg         0.000000000 0.00000000
> linux      0.023501362 0.34698919
> mac        0.082653774 1.22535027
> mobile     0.000000000 0.00000000
> nokia      0.000000000 0.00000000
> samsung    0.000000000 0.00000000
> symbian    0.000000000 0.00000000
> unknown    0.003759398 0.08219078
> windows    0.021158528 0.32916970
>
> the predictors are integers.
> is the first column for the 0 predictors and the second for all non-0?
> Is there a way to ask naiveBayes to differentiate between non-0 values?
>
> thanks!
-- 
Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X 11.0.11004000