Re: [R] Decision Tree: Am I Missing Anything?

2012-09-22 Thread Vik Rubenfeld
Bhupendrashinh, thanks again for telling me about RWeka. That made a big difference in a job I was working on this week. Have a great weekend. -Vik __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the

Re: [R] Decision Tree: Am I Missing Anything?

2012-09-22 Thread Bhupendrasinh Thakre
My pleasure. As a part of R team we are always here to help each other.
Best Regards, Bhupendrasinh Thakre
On Sep 22, 2012, at 1:46 PM, Vik Rubenfeld v...@mindspring.com wrote:
Bhupendrashinh, thanks again for telling me about RWeka. That made a big difference in a job

Re: [R] Decision Tree: Am I Missing Anything?

2012-09-22 Thread Max Kuhn
Vik,
On Fri, Sep 21, 2012 at 12:42 PM, Vik Rubenfeld v...@mindspring.com wrote:
Max, I installed C50. I have a question about the syntax. Per the C50 manual:
## Default S3 method:
C5.0(x, y, trials = 1, rules = FALSE, weights = NULL, control = C5.0Control(), costs = NULL, ...)
## S3

Re: [R] Decision Tree: Am I Missing Anything?

2012-09-21 Thread Achim Zeileis
Hi, just to add a few points to the discussion:
- rpart() is able to deal with responses with more than two classes. Setting method="class" explicitly is not necessary if the response is a factor (as in this case).
- If your tree on this data is so huge that it can't even be plotted, I
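A minimal sketch of the first point, using the built-in iris data as a stand-in (the poster's data is not reproduced here): with a factor response that has three levels, rpart() should choose the classification method on its own, so passing method="class" should give the same fit.

```r
library(rpart)  # recommended package, shipped with standard R installations

# iris is only a stand-in: Species is a factor with three levels
fit_default  <- rpart(Species ~ ., data = iris)
fit_explicit <- rpart(Species ~ ., data = iris, method = "class")

# Method rpart selected for the factor response
fit_default$method

# Compare the fitted classes from the two calls
identical(predict(fit_default,  type = "class"),
          predict(fit_explicit, type = "class"))
```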

Re: [R] Decision Tree: Am I Missing Anything?

2012-09-21 Thread mxkuhn
There is also C5.0 in the C50 package. It tends to have smaller trees than C4.5 and much smaller trees than J48 when there are factor predictors. Also, it has an optional feature selection (winnow) step that can be used.
Max
On Sep 21, 2012, at 2:18 AM, Achim Zeileis achim.zeil...@uibk.ac.at
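For anyone who wants to try the winnow step mentioned above, here is a small sketch. It assumes the C50 package is installed and uses iris purely as placeholder data; whether winnowing actually drops any predictors depends on the data.

```r
library(C50)  # install.packages("C50") if it is not installed yet

# Plain C5.0 tree (iris is placeholder data, not the brand data from the thread)
fit_plain <- C5.0(Species ~ ., data = iris)

# Same model with the optional winnowing (feature selection) step switched on
fit_winnow <- C5.0(Species ~ ., data = iris,
                   control = C5.0Control(winnow = TRUE))

# The summary prints the tree and reports on attribute usage
summary(fit_winnow)
```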

Re: [R] Decision Tree: Am I Missing Anything?

2012-09-21 Thread Vik Rubenfeld
Max, I installed C50. I have a question about the syntax. Per the C50 manual:
## Default S3 method:
C5.0(x, y, trials = 1, rules = FALSE, weights = NULL, control = C5.0Control(), costs = NULL, ...)
## S3 method for class 'formula'
C5.0(formula, data, weights, subset, na.action = na.pass, ...)
I
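The two call forms quoted from the manual can be sketched roughly as follows. This is only an illustration on the built-in iris data (an assumption, since the question itself is cut off above): the default method takes the predictors and the factor outcome separately, while the formula method takes a formula plus a data frame.

```r
library(C50)  # install.packages("C50") if it is not installed yet

# Default S3 method: predictors as a data frame, outcome as a factor
x <- iris[, 1:4]
y <- iris$Species
fit_xy <- C5.0(x, y, trials = 1, rules = FALSE)

# Formula method: outcome ~ predictors, with the data frame passed as `data`
fit_formula <- C5.0(Species ~ ., data = iris)

# Predicted classes for a few rows
pred <- predict(fit_formula, newdata = head(iris))
pred
```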

[R] Decision Tree: Am I Missing Anything?

2012-09-20 Thread Vik Rubenfeld
I'm working with some data from which a client would like to make a decision tree predicting brand preference based on inputs such as price, speed, etc. After running the decision tree analysis using rpart, it appears that this data is not capable of predicting brand preference. Here's the

Re: [R] Decision Tree: Am I Missing Anything?

2012-09-20 Thread Bhupendrasinh Thakre
I'm not sure what the problem is, as I was not able to run your data. You might want to use the dput() command to present the data. Now, on the programming side: as we can see, we have more than 2 levels for the brands, and hence method = "class" is not able to understand what
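As a quick illustration of the dput() suggestion: the sketch below uses a made-up three-row data frame (the column names and values are hypothetical). dput() prints an R expression that a reader can paste into their own session to rebuild the object.

```r
# Hypothetical toy data frame standing in for the real data
df <- data.frame(BRND = factor(c("Brand 1", "Brand 2", "Brand 3")),
                 PRI  = c(3, 2.5, 4))

# Print a text representation suitable for pasting into an email
dput(df)

# Round trip: capture that text and evaluate it to rebuild the object
txt      <- paste(capture.output(dput(df)), collapse = "\n")
restored <- eval(parse(text = txt))
all.equal(df, restored)
```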

Re: [R] Decision Tree: Am I Missing Anything?

2012-09-20 Thread Vik Rubenfeld
Thanks! Here's the dput output:
dput(test.df)
structure(list(BRND = structure(c(1L, 12L, 16L, 17L, 18L, 19L, 20L, 21L, 22L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 13L, 14L, 15L), .Label = c("Brand 1", "Brand 10", "Brand 11", "Brand 12", "Brand 13", "Brand 14", "Brand 15", "Brand 16", "Brand 17", "Brand 18",

Re: [R] Decision Tree: Am I Missing Anything?

2012-09-20 Thread Vik Rubenfeld
Bhupendrashinh, thanks very much! I ran J48 on a respondent-level data set and got a 61.75% correct classification rate!
Correctly Classified Instances 988 61.75 %
Incorrectly Classified Instances 612 38.25 %
Kappa statistic

Re: [R] Decision Tree: Am I Missing Anything?

2012-09-20 Thread Bhupendrasinh Thakre
One possible way to think of it is to use variable reduction before going for J48. You may want to use one of the several methods available for that. Again, prediction for brands is more of a business question to me. Two solutions I can think of:
1. Variable reduction before decision tree.
2. Let the

Re: [R] Decision Tree: Am I Missing Anything?

2012-09-20 Thread Vik Rubenfeld
Very good. Could you point me in a couple of potential directions for variable reduction? E.g. correlation analysis? On Sep 20, 2012, at 10:36 PM, Bhupendrasinh Thakre wrote: One possible way to think of it is using variable reduction before going for J48. You may want to use several
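One way the correlation idea raised in the question could look is sketched below. It is a simple base-R correlation filter on simulated predictors, and not necessarily what was recommended in the reply. The variable names and the 0.9 cutoff are assumptions for illustration.

```r
set.seed(1)
n <- 200

# Simulated numeric predictors (hypothetical); price_copy nearly duplicates price
price      <- rnorm(n)
speed      <- rnorm(n)
price_copy <- price + rnorm(n, sd = 0.05)
preds <- data.frame(price, speed, price_copy)

# Absolute pairwise correlations; zero the lower triangle and diagonal
# so that each pair is looked at once, in the column of the later variable
cors <- abs(cor(preds))
cors[lower.tri(cors, diag = TRUE)] <- 0

# Drop any variable that is highly correlated with an earlier one
cutoff  <- 0.9
to_drop <- colnames(cors)[apply(cors, 2, function(col) any(col > cutoff))]
reduced <- preds[, setdiff(names(preds), to_drop), drop = FALSE]
names(reduced)
```

A filter like this only looks at pairs of numeric predictors; factor predictors would need a different treatment.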