Re: [R] Interquartile Range

2016-04-19 Thread Michael Artz
sticking things into it." > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) > > > On Tue, Apr 19, 2016 at 5:29 PM, Michael Artz > wrote: > > Again, IQR returns two both a .25 and a .75 value and it failed, which is > > why I didn't u

Re: [R] Interquartile Range

2016-04-19 Thread Michael Artz
he IQR; > > and the mode of a sample defined as above is generally a bad estimator > > of the mode of the distribution. To say more than that would take me > > too far afield. Post on stats.stackexchange.com if you want to know > > why (if it's even relevant). > > >

Re: [R] Interquartile Range

2016-04-19 Thread Michael Artz
PM, William Dunlap wrote: > If you show us, not just tell us about, a self-contained example > someone might show you a non-hacky way of getting the job done. > (I don't see an argument to plyr::ddply called 'transform'.) > > Bill Dunlap > TIBCO Software > wdunlap

Re: [R] Interquartile Range

2016-04-19 Thread Michael Artz
> >>> paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-") > >>> } > >>> ddply(data, ~groupColumn, summarise, col1_myIqr=myIqr(col1), > >>> col1_IQR=stats::IQR(col1)) > >>> # groupColumn col1_myIqr col1_

Re: [R] Interquartile Range

2016-04-19 Thread Michael Artz
t a function, it is an expression. ddply wants functions. > > > Bill Dunlap > TIBCO Software > wdunlap tibco.com > > On Tue, Apr 19, 2016 at 7:56 AM, Michael Artz > wrote: > >> That didn't work Jim! >> >> Thanks anyway >> >> On Mon,

Re: [R] Interquartile Range

2016-04-19 Thread Michael Artz
oming along > and sticking things into it." > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) > > > On Tue, Apr 19, 2016 at 7:56 AM, Michael Artz > wrote: > > That didn't work Jim! > > > > Thanks anyway > > > > On

Re: [R] Interquartile Range

2016-04-19 Thread Michael Artz
tat$tenure) > > Jim > > > > On Tue, Apr 19, 2016 at 11:15 AM, Michael Artz > wrote: > > Hi, > > I am trying to show an interquartile range while grouping values using > > the function ddply(). So my function call now is like > > > >

[R] Interquartile Range

2016-04-18 Thread Michael Artz
Hi, I am trying to show an interquartile range while grouping values using the function ddply(). So my function call now is like groupedAll <- ddply(data ,~groupColumn ,summarise ,col1_mean=mean(col1) ,col2_mode=Mode(col2) #Fun
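A minimal sketch of the per-group interquartile range asked about here, assuming made-up data and using base R's tapply() in place of plyr::ddply() so it runs without extra packages. The data frame values and the helper name iqr_range are hypothetical; only groupColumn and col1 come from the thread. The helper follows the paste(round(quantile(...))) idea quoted elsewhere in the thread.

```r
# Hypothetical data, invented for illustration only.
data <- data.frame(
  groupColumn = rep(c("a", "b"), each = 5),
  col1 = c(1, 2, 3, 4, 5, 10, 20, 30, 40, 50)
)

# stats::IQR() returns ONE number per group: Q3 minus Q1.
iqr_width <- tapply(data$col1, data$groupColumn, IQR)

# Hypothetical helper: show the two quartiles as a "Q1-Q3" label instead.
iqr_range <- function(x) {
  q <- quantile(x, c(0.25, 0.75), names = FALSE)
  paste(round(q[1]), round(q[2]), sep = "-")
}
iqr_label <- tapply(data$col1, data$groupColumn, iqr_range)

print(iqr_width)
print(iqr_label)
```

The same two functions can be passed to ddply(..., summarise, ...) as in the original call; tapply() is only a stand-in here.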

Re: [R] Decision Tree and Random Forrest

2016-04-15 Thread Michael Artz
l, > newdata=newdata) > > > Bill Dunlap > TIBCO Software > wdunlap tibco.com > > On Fri, Apr 15, 2016 at 3:09 PM, Michael Artz > wrote: > >> I need the output to have groups and the probability any given record in >> that group then has of being in the resp

Re: [R] Decision Tree and Random Forrest

2016-04-15 Thread Michael Artz
ed to them. The examples >> are sort of similar. You just provided links to general info about trees. >> >> >> >> Sent from my Verizon, Samsung Galaxy smartphone >> >> >> ---- Original message >> From: Sarah Goslee >> Date: 4/13/

Re: [R] Decision Tree and Random Forrest

2016-04-13 Thread Michael Artz
f-the-trees-and-forests/ You can get the same kind of information from random forests, but it's less straightforward. If you want a clear set of rules as in your golf example, then you need rpart or similar. Sarah On Wed, Apr 13, 2016 at 6:02 PM, Michael Artz wrote: > Ah yes I will have to u

Re: [R] Decision Tree and Random Forrest

2016-04-13 Thread Michael Artz
"The trouble with having an open mind is that people keep coming along > and sticking things into it." > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) > > > On Wed, Apr 13, 2016 at 2:11 PM, Michael Artz > wrote: > > Ok is there a way to

Re: [R] Decision Tree and Random Forrest

2016-04-13 Thread Michael Artz
Also, that being said, just because random forests are not the same thing as decision trees does not mean that you can't get decision rules from a random forest. On Wed, Apr 13, 2016 at 4:11 PM, Michael Artz wrote: > Ok is there a way to do it with decision tree? I just need to

Re: [R] Decision Tree and Random Forrest

2016-04-13 Thread Michael Artz
Cheers, > Bert > > > Bert Gunter > > "The trouble with having an open mind is that people keep coming along > and sticking things into it." > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) > > > On Wed, Apr 13, 2016 at 1:40

[R] Decision Tree and Random Forrest

2016-04-13 Thread Michael Artz
Hi, I'm trying to get the top decision rules from a decision tree. Eventually I would like to do this with R and random forests. There has to be a way to output the decision rules of each leaf node in an easily readable way. I am looking at the randomForest and rpart packages and I don't see anything

[R] No color in plotting

2016-04-12 Thread Michael Artz
Hi, I am having a problem with plot() and ggplot(). When I call one of these functions, the plotting area starts to look as though it is working, but nothing is ever visible, unless it is a dendrogram. With the bar chart, the plotting area just had an x and y axis and nothing else. I tried a b

[R] Dissimilarity matrix and number clusters determination

2016-04-11 Thread Michael Artz
Hi, I already have a dissimilarity matrix and I am submitting the results to the elbow.obj method to get an optimal number of clusters. Am I reading the below output correctly that I should have 17 clusters? code: top150 <- sampleset[1:150,] {cluster1 <- daisy(top150 , metric

Re: [R] why data frame's logical index isnt working

2016-04-07 Thread Michael Artz
I don't get it, I thought the double index was to indicate an individual element within a column (vector)? I will stop using data.frame, thanks a lot! On Thu, Apr 7, 2016 at 9:29 PM, David Winsemius wrote: > > > On Apr 7, 2016, at 6:46 PM, Michael Artz wrote: > > >

Re: [R] simple question on data frames assignment

2016-04-07 Thread Michael Artz
> Hadley > > On Thu, Apr 7, 2016 at 6:52 AM, David Barron wrote: > > ifelse is vectorised, so just use that without the loop. > > > > colordata$response <- ifelse(colordata$color == 'blue', 1, 0) > > > > David > > > > On 7 April 2016

Re: [R] simple question on data frames assignment

2016-04-07 Thread Michael Artz
' + 0 > > Hope this helps, > > Rui Barradas > > > Citando David Barron : > > ifelse is vectorised, so just use that without the loop. > > colordata$response <- ifelse(colordata$color == 'blue', 1, 0) > > David > > On 7 April 2016 at 12:41, M

[R] why data frame's logical index isnt working

2016-04-07 Thread Michael Artz
data.frame.$columnToAdd["CurrentColumnName" == "ConditionMet"] <- 1 Can someone please explain to me why the above command gives all NAs to columnToAdd? I thought it was possible in R to use a logical expression in the index of a data frame.
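One likely reading of the problem above, sketched with a hypothetical data frame (df, status and flag are invented names): the quoted index compares two string literals with each other, so it yields a single FALSE and never looks at the column. Indexing with a comparison on the column itself gives one TRUE/FALSE per row.

```r
# Hypothetical data, invented for illustration only.
df <- data.frame(status = c("met", "not met", "met"),
                 stringsAsFactors = FALSE)
df$flag <- NA

# Comparing two literal strings: one FALSE, unrelated to any column.
literal_test <- "status" == "met"

# Comparing the column's values: a logical vector, one entry per row.
row_test <- df$status == "met"

# Assign 1 only where the condition holds; other rows stay NA.
df$flag[row_test] <- 1

print(literal_test)
print(df)
```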

Re: [R] simple question on data frames assignment

2016-04-07 Thread Michael Artz
of numbers (the answer f gave you) to the second argument of lapply > instead of a function. > -- > Sent from my phone. Please excuse my brevity. > > On April 7, 2016 7:31:18 AM PDT, Michael Artz > wrote: >> >> If you are not using an anonymous function and say you had writ

Re: [R] simple question on data frames assignment

2016-04-07 Thread Michael Artz
function that does the response calculation. The result is a data > frame (list of columns) with no column names, so I give the new columns > names based on the old column names. You could choose different names, e.g. > > names(responses) <- paste0( "response", 1:2 ) > > but y

Re: [R] simple question on data frames assignment

2016-04-07 Thread Michael Artz
Thanks so much! And how would you incorporate lapply() here? On Thu, Apr 7, 2016 at 6:52 AM, David Barron wrote: > ifelse is vectorised, so just use that without the loop. > > colordata$response <- ifelse(colordata$color == 'blue', 1, 0) > > David > > On 7

[R] simple question on data frames assignment

2016-04-07 Thread Michael Artz
Hi, I'm not sure how to ask this, but it's a very easy question to answer for an R person. What is an easy way to check for a column value and then assign a new column a value based on that old column value? For example, I'm doing colordata <- data.frame(id = c(1,2,3,4,5), color = c("blue", "red",
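A small sketch of the vectorised approach suggested in the replies, assuming a made-up colordata (the original data frame is cut off in the archive, so the colour values below are hypothetical; response2 is an invented column name). ifelse() tests every row at once, so no loop is needed.

```r
# Hypothetical values; the original data frame is truncated in the archive.
colordata <- data.frame(
  id = 1:5,
  color = c("blue", "red", "green", "blue", "red"),
  stringsAsFactors = FALSE
)

# ifelse() is vectorised: one test and one result per row.
colordata$response <- ifelse(colordata$color == "blue", 1, 0)

# Equivalent shortcut: convert the logical vector straight to 0/1.
colordata$response2 <- as.numeric(colordata$color == "blue")

print(colordata)
```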

Re: [R] p values from GLM

2016-04-02 Thread Michael Artz
Maybe it's not the article itself for sale. Sometimes a company will charge a fee to have access to its knowledge base. Not because it owns all of the content, but because the articles, publications, etc have been tracked down and centralized. This is also the whole idea behind paying a company

Re: [R] Could not find function even though I have all necessary packages

2016-03-28 Thread Michael Artz
Thank you everyone, I got it! All I needed was to install munsell. I had made a typo when I tried to install munsell. On Mon, Mar 28, 2016 at 12:01 PM, Michael Artz wrote: > Thanks. sessionInfo() did not show it. > > This is the error when I try library(caret) > >

Re: [R] Could not find function even though I have all necessary packages

2016-03-28 Thread Michael Artz
‘ggplot2’ On Mon, Mar 28, 2016 at 11:57 AM, Jeff Newmiller wrote: > Post plain text only please. > > Are you sure it loaded? Verify with sessionInfo()... > -- > Sent from my phone. Please excuse my brevity. > > On March 28, 2016 9:21:56 AM PDT, Michael Artz > wrote: >

[R] Could not find function even though I have all necessary packages

2016-03-28 Thread Michael Artz
Hi, I am getting the error, Error: could not find function "createDataPartition" when I run the code dataFrame_data <- createDataPartition(data$colA, p=.7, list=FALSE) even though I have already run install.packages("caret", dependencies = c("Depends", "Imports", "Suggests")) and install.packa

Re: [R] Logistic Regression output baseline (reference) category

2016-03-25 Thread Michael Artz
it. However, it does not represent any one of the x variables by itself. Is there a way in R, to extrapolate the individual x variable intercepts from the equation somehow. On Tue, Mar 15, 2016 at 8:26 PM, David Winsemius wrote: > > > On Mar 15, 2016, at 1:27 PM, Michael Artz

[R] Logistic Regression output baseline (reference) category

2016-03-15 Thread Michael Artz
Hi, I am trying to use the summary from the glm function as a data source. I am using the call sink() then summary(logisticRegModel)$coefficients then sink(). The independent variables are categorical and thus there is always a baseline value for every category that is omitted from the glm outp

Re: [R] Prediction from a rank deficient fit may be misleading

2016-03-10 Thread Michael Artz
.2497 tenure -0.0702813 0.0077113 -9.114 < 2e-16 *** TotalCharges 0.0004276 0.874 4.892 9.97e-07 *** On Thu, Mar 10, 2016 at 4:05 PM, David Winsemius wrote: > > > On Mar 10, 2016, at 8:08 AM, Michael Artz > wrote: > > > >

[R] Prediction from a rank deficient fit may be misleading

2016-03-10 Thread Michael Artz
Hi all, I have the following error - > resultVector <- predict(logitregressmodel, dataset1, type='response') Warning message: In predict.lm(object, newdata, se.fit, scale = 1, type = ifelse(type == : prediction from a rank-deficient fit may be misleading I have seen on the internet that there ma