Again, IQR returns both a .25 and a .75 value and it failed, which is why I didn't use it before. Also, the first function just returns the same value repeated on every row of a group. Since the values are all the same by the time of the second call, using the mode function is just a way to grab one of them. I could have used average, min, or max; they all would have returned the same thing.
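For anyone following along, here is a minimal sketch of the two-step idea described above. It is only an illustration under stated assumptions: it uses base R (tapply) as a stand-in for the two ddply calls, borrows the toy data from Bill Dunlap's example quoted below, and the helper names q_pair, iqr_one, labels, by_mode, by_min and by_max are made up for this sketch.

```r
# Toy data taken from Bill Dunlap's example quoted below.
data <- data.frame(groupColumn = rep(1:5, 1:5), col1 = 2^(0:14))

# Mode function as posted in this thread.
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Range label: returns ONE character string per call.
myIqr <- function(x) {
  q <- round(quantile(x, c(0.25, 0.75)), 0)
  paste(q[1], q[2], sep = "-")
}

# quantile() with two probabilities gives two values;
# stats::IQR() gives a single number (their difference).
q_pair  <- quantile(data$col1, c(0.25, 0.75))
iqr_one <- IQR(data$col1)

# Step 1 (stand-in for the transform-style call): compute the label per
# group and repeat it on every row of that group.
labels <- tapply(data$col1, data$groupColumn, myIqr)
data$col1_Range <- as.vector(labels[as.character(data$groupColumn)])

# Step 2 (stand-in for the summarise call): collapse the repeated label
# back to one value per group. Any function that picks one element works.
by_mode <- tapply(data$col1_Range, data$groupColumn, Mode)
by_min  <- tapply(data$col1_Range, data$groupColumn, min)
by_max  <- tapply(data$col1_Range, data$groupColumn, max)

# Note: mean() would not apply here, because the labels are strings.
print(by_mode)
```

Since myIqr() already returns a single string per group, step 1 is not strictly needed; calling it directly inside the summarising step gives the same labels.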
Mike

On Tue, Apr 19, 2016 at 7:24 PM, Marc Schwartz <marc_schwa...@me.com> wrote:

> Hi,
>
> Jumping into this thread mainly on the point of the mode of the
> distribution, while also supporting Bert's comments below on theory.
>
> If the vector 'x' that is being passed to this function is an integer
> vector, then a tabulation of the integers can yield a 'mode', presuming of
> course that there is only one unique mode. You may have to decide how you
> want to handle a multi-modal discrete distribution.
>
> If the vector 'x' is continuous (e.g. contains floating point values),
> then a tabulation is going to be problematic for a variety of reasons.
>
> In that case, prior discussions on this point have yielded the following
> estimation of the mode of a continuous distribution by using:
>
> Mode <- function(x) {
>   D <- density(x)
>   D$x[which.max(D$y)]
> }
>
> where the second line of the function gets you the value of 'x' at the
> maximum of the density estimate. Of course, there is still the possibility
> of a multi-modal distribution and the nuances of which kernel is used,
> etc., etc.
>
> Food for thought.
>
> Regards,
>
> Marc Schwartz
>
>
> > On Apr 19, 2016, at 7:07 PM, Bert Gunter <bgunter.4...@gmail.com> wrote:
> >
> > Well, instead of your functions try:
> >
> > Mode <- function(x) {
> >   tabx <- table(x)
> >   tabx[which.max(tabx)]
> > }
> >
> > and use R's IQR function instead of yours.
> >
> > ... so I still don't get why you want to return a character string
> > instead of a value for the IQR;
> > and the mode of a sample defined as above is generally a bad estimator
> > of the mode of the distribution. To say more than that would take me
> > too far afield. Post on stats.stackexchange.com if you want to know
> > why (if it's even relevant).
> >
> > Cheers,
> > Bert
> > Bert Gunter
> >
> > "The trouble with having an open mind is that people keep coming along
> > and sticking things into it."
> > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >
> >
> > On Tue, Apr 19, 2016 at 4:25 PM, Michael Artz <michaelea...@gmail.com> wrote:
> >> Hi,
> >> Here is what I am doing
> >>
> >> notGroupedAll <- ddply(data
> >>                  ,~groupColumn
> >>                  ,summarise
> >>                  ,col1_mean=mean(col1)
> >>                  ,col2_mode=Mode(col2) #Function I wrote for getting the mode shown below
> >>                  ,col3_Range=myIqr(col3)
> >>                  )
> >>
> >> groupedAll <- ddply(data
> >>                  ,~groupColumn
> >>                  ,summarise
> >>                  ,col1_mean=mean(col1)
> >>                  ,col2_mode=Mode(col2) #Function I wrote for getting the mode shown below
> >>                  ,col3_Range=Mode(col3)
> >>                  )
> >>
> >> #custom Mode function
> >> Mode <- function(x) {
> >>   ux <- unique(x)
> >>   ux[which.max(tabulate(match(x, ux)))]
> >> }
> >>
> >> #the range function
> >> myIqr <- function(x) {
> >>   paste(round(quantile(x,0.375),0),round(quantile(x,0.625),0),sep="-")
> >> }
> >>
> >>
> >> Here is what I am doing!! :)
> >>
> >>
> >> On Tue, Apr 19, 2016 at 2:57 PM, William Dunlap <wdun...@tibco.com> wrote:
> >>>
> >>> If you show us, not just tell us about, a self-contained example
> >>> someone might show you a non-hacky way of getting the job done.
> >>> (I don't see an argument to plyr::ddply called 'transform'.)
> >>>
> >>> Bill Dunlap
> >>> TIBCO Software
> >>> wdunlap tibco.com
> >>>
> >>> On Tue, Apr 19, 2016 at 12:18 PM, Michael Artz <michaelea...@gmail.com> wrote:
> >>>>
> >>>> Oh thanks for that clarification Bert! Hope you enjoyed your coffee! I
> >>>> ended up just using the transform argument in the ddply function. It worked
> >>>> and it repeated, then I called a mode function in another call to ddply that
> >>>> summarised. Kinda hacky but oh well!
> >>>>
> >>>> On Tue, Apr 19, 2016 at 12:31 PM, Bert Gunter <bgunter.4...@gmail.com> wrote:
> >>>>>
> >>>>> ... and I'm getting another cup of coffee...
> >>>>>
> >>>>> -- Bert
> >>>>> Bert Gunter
> >>>>>
> >>>>> "The trouble with having an open mind is that people keep coming along
> >>>>> and sticking things into it."
> >>>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>>>>
> >>>>>
> >>>>> On Tue, Apr 19, 2016 at 10:30 AM, Bert Gunter <bgunter.4...@gmail.com> wrote:
> >>>>>> NO NO -- I am wrong! The paste() expression is of course evaluated.
> >>>>>> It's just that a character string is returned of the form "something -
> >>>>>> something".
> >>>>>>
> >>>>>> I apologize for the confusion.
> >>>>>>
> >>>>>> -- Bert
> >>>>>>
> >>>>>> Bert Gunter
> >>>>>>
> >>>>>> "The trouble with having an open mind is that people keep coming along
> >>>>>> and sticking things into it."
> >>>>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>>>>>
> >>>>>>
> >>>>>> On Tue, Apr 19, 2016 at 10:25 AM, Bert Gunter <bgunter.4...@gmail.com> wrote:
> >>>>>>> To be precise:
> >>>>>>>
> >>>>>>> paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-")
> >>>>>>>
> >>>>>>> is an expression that evaluates to a character string:
> >>>>>>> "round(quantile(x,.25),0) - round(quantile(x,0.75),0)"
> >>>>>>>
> >>>>>>> no matter what the argument of your function, x. Hence
> >>>>>>>
> >>>>>>> return(paste(...)) will return this exact character string and never
> >>>>>>> evaluates x.
> >>>>>>>
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Bert
> >>>>>>>
> >>>>>>> Bert Gunter
> >>>>>>>
> >>>>>>> "The trouble with having an open mind is that people keep coming along
> >>>>>>> and sticking things into it."
> >>>>>>> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
> >>>>>>>
> >>>>>>>
> >>>>>>> On Tue, Apr 19, 2016 at 8:34 AM, William Dunlap via R-help
> >>>>>>> <r-help@r-project.org> wrote:
> >>>>>>>>> That didn't work Jim!
> >>>>>>>>
> >>>>>>>> It always helps to say how the suggestion did not work. Jim's
> >>>>>>>> function had a typo in it - was that the problem? Or did you not
> >>>>>>>> change the call to ddply to use that function? Here is something
> >>>>>>>> that might "work" for you:
> >>>>>>>>
> >>>>>>>> library(plyr)
> >>>>>>>>
> >>>>>>>> data <- data.frame(groupColumn=rep(1:5,1:5), col1=2^(0:14))
> >>>>>>>> myIqr <- function(x) {
> >>>>>>>>   paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-")
> >>>>>>>> }
> >>>>>>>> ddply(data, ~groupColumn, summarise, col1_myIqr=myIqr(col1),
> >>>>>>>>       col1_IQR=stats::IQR(col1))
> >>>>>>>> #  groupColumn col1_myIqr col1_IQR
> >>>>>>>> #1           1        1-1        0
> >>>>>>>> #2           2        2-4        1
> >>>>>>>> #3           3      12-24       12
> >>>>>>>> #4           4    112-320      208
> >>>>>>>> #5           5  2048-8192     6144
> >>>>>>>>
> >>>>>>>> The important point is that
> >>>>>>>>
> >>>>>>>> paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-")
> >>>>>>>> is not a function, it is an expression. ddply wants functions.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Bill Dunlap
> >>>>>>>> TIBCO Software
> >>>>>>>> wdunlap tibco.com
> >>>>>>>>
> >>>>>>>> On Tue, Apr 19, 2016 at 7:56 AM, Michael Artz <michaelea...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> That didn't work Jim!
> >>>>>>>>>
> >>>>>>>>> Thanks anyway
> >>>>>>>>>
> >>>>>>>>> On Mon, Apr 18, 2016 at 9:02 PM, Jim Lemon <drjimle...@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi Michael,
> >>>>>>>>>> At a guess, try this:
> >>>>>>>>>>
> >>>>>>>>>> iqr<-function(x) {
> >>>>>>>>>>   return(paste(round(quantile(x,0.25),0),round(quantile(x,0.75),0),sep="-")
> >>>>>>>>>> }
> >>>>>>>>>>
> >>>>>>>>>> .col3_Range=iqr(datat$tenure)
> >>>>>>>>>>
> >>>>>>>>>> Jim
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Apr 19, 2016 at 11:15 AM, Michael Artz <michaelea...@gmail.com> wrote:
> >>>>>>>>>>> Hi,
> >>>>>>>>>>> I am trying to show an interquartile range while grouping values using
> >>>>>>>>>>> the function ddply(). So my function call now is like
> >>>>>>>>>>>
> >>>>>>>>>>> groupedAll <- ddply(data
> >>>>>>>>>>>               ,~groupColumn
> >>>>>>>>>>>               ,summarise
> >>>>>>>>>>>               ,col1_mean=mean(col1)
> >>>>>>>>>>>               ,col2_mode=Mode(col2) #Function I wrote for getting the mode shown below
> >>>>>>>>>>>               ,col3_Range=paste(as.character(round(quantile(datat$tenure,c(.25)))),
> >>>>>>>>>>>                   as.character(round(quantile(data$tenure,c(.75)))), sep = "-")
> >>>>>>>>>>>               )
> >>>>>>>>>>>
> >>>>>>>>>>> #custom Mode function
> >>>>>>>>>>> Mode <- function(x) {
> >>>>>>>>>>>   ux <- unique(x)
> >>>>>>>>>>>   ux[which.max(tabulate(match(x, ux)))]
> >>>>>>>>>>> }
> >>>>>>>>>>>
> >>>>>>>>>>> I am not sure what is going wrong on my interquartile range
> >>>>>>>>>>> function, it works on its own outside of ddply()
> >>>>>>>>>>>
> >>>>>>>>>>> [[alternative HTML version deleted]]
> >>>>>>>>>>>
> >>>>>>>>>>> ______________________________________________
> >>>>>>>>>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> >>>>>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>>>>>>>>>> PLEASE do read the posting guide
> >>>>>>>>>>> http://www.R-project.org/posting-guide.html
> >>>>>>>>>>> and provide commented, minimal, self-contained, reproducible code.