Re: [R] Bug in by() function which works for some FUN argument and does not work for others

2016-04-17 Thread Akhilesh Singh
Dear All,

Yes, I certainly now agree with the suggestion of Adrian Dusa for using
colMeans in place of mean in the situation that I had reported to r-help.
And I am sorry that I did not personally extend my thanks to him. I really
wish to thank him for his suggestion, and I do this now.

However, for the future I wanted a way to apply a function more complex than
mean, e.g. a function for skewness or kurtosis and the like, within by().
The function colMeans is applicable to the mean only. That is why I later
arrived at the solution in my previous message.

Still, in the course of these deliberations, I wish to mention the suggestion
of Dr. Jim Lemon, who proposed the following code. It passes by() to sapply(),
taking advantage of R's positional matching of arguments, and it meets my
objective of using more complex functions in place of mean in my future
project:

sapply(brain[,-1],by,brain$Gender,mean,na.rm=TRUE)
        FSIQ    VIQ    PIQ   Weight   Height MRI_Count
Female 111.9 109.45 110.45 137.2000 65.76500  862654.6
Male   115.0 115.25 111.60 166. 71.43158  954855.4
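
For functions beyond mean, the same pattern can be sketched as follows. This
is only an illustration under assumptions: skewness() is a hypothetical
moment-based helper written for this example (it is not part of base R), and
the built-in mtcars data stands in for the 'brain' data set, which is not
bundled with R.

```r
# Hypothetical helper (not from the thread): moment-based sample skewness.
skewness <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}

# mtcars stands in for the 'brain' data; the three columns are arbitrary.
vars <- mtcars[, c("mpg", "hp", "wt")]

# Same positional pattern as above: sapply() hands each column to by(),
# followed by the grouping vector, the function, and its extra arguments.
skew_by_am <- sapply(vars, by, mtcars$am, skewness, na.rm = TRUE)
print(skew_by_am)
```

Any function that takes a numeric vector and an na.rm argument can be put in
the position of skewness in the same way.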

Secondly, I also wish to apologize for having used the word "bug" for the
by() function instead of considering that the mistake could be mine. I
should simply have sought help from r-help rather than calling it a "bug".
If this has hurt anybody's feelings, I express my regret and offer my
apologies to all of them.

With best regards,

Dr. A.K. Singh
Head, Department of Agril. Statistics
Indira Gandhi Krishi Vishwavidyalaya, Raipur
Chhattisgarh, India, PIN-492012
Mobile: +919752620740
Email: akhileshsingh.i...@gmail.com




Re: [R] Bug in by() function which works for some FUN argument and does not work for others

2016-04-16 Thread David Winsemius

> On Apr 16, 2016, at 2:03 AM, Akhilesh Singh  
> wrote:
> 
> Dear All, 
> 
> I have got your core message, that it is my responsibility to determine 
> whether any particular function in my version of R satisfies the language 
> requirements at the time of your use. Jim Albert and Maria Rizzo must have 
> used their code, which was permitted in the R-code of their time (2012). 
> 
> Therefore, I have now modified my R-code, as per R-3.2.4 version, according 
> to my requirement as follows, which is working for my 'brain' data set, whose 
> output is reproduced below for your information please:
> 
> > by(brain[,-1], INDICES=list(Gender=brain$Gender), FUN=function(x, 
> > na.rm=FALSE) sapply(x, mean, na.rm=na.rm), na.rm=TRUE)
> Gender: Female
>       FSIQ        VIQ        PIQ     Weight     Height  MRI_Count
>    111.900    109.450    110.450    137.200     65.765 862654.600
> --
> Gender: Male
>      FSIQ       VIQ       PIQ    Weight    Height MRI_Count
>     115.0 115.25000     111.6     166.4  71.43158  954855.4

Yes, that is certainly a workable alternative, although I thought the question 
of "how to do it" had been effectively answered with the suggestion from Adrian 
Dusa to use colMeans. It, too, has an `na.rm=TRUE` option.

I was only responding to your plaintive complaint that the current version of R 
had a "bug" because it was not behaving as promised by an introductory text 
with a three year-old publishing date.

-- 
David.



Re: [R] Bug in by() function which works for some FUN argument and does not work for others

2016-04-16 Thread Akhilesh Singh
Dear All,

I have understood your core message: it is my responsibility to determine
whether any particular function in my version of R satisfies the language
requirements at the time of use. Jim Albert and Maria Rizzo must have
used code that was permitted in the R of their time (2012).

Therefore, I have now modified my R code for R version 3.2.4 as follows.
It works for my 'brain' data set; the output is reproduced below for your
information:

> by(brain[,-1], INDICES=list(Gender=brain$Gender),
+    FUN=function(x, na.rm=FALSE) sapply(x, mean, na.rm=na.rm), na.rm=TRUE)
Gender: Female
      FSIQ        VIQ        PIQ     Weight     Height  MRI_Count
   111.900    109.450    110.450    137.200     65.765 862654.600
--
Gender: Male
     FSIQ       VIQ       PIQ    Weight    Height MRI_Count
    115.0 115.25000     111.6     166.4  71.43158  954855.4

With best regards,

Dr. A.K. Singh
Head, Department of Agril. Statistics
Indira Gandhi Krishi Vishwavidyalaya, Raipur
Chhattisgarh, India, PIN-492012
Mobile: +919752620740
Email: akhileshsingh.i...@gmail.com

Re: [R] Bug in by() function which works for some FUN argument and does not work for others

2016-04-15 Thread Duncan Murdoch

On 15/04/2016 4:16 AM, Akhilesh Singh wrote:

> Dear All,
>
> Thanks for your help. However, I would like to draw your attention to the
> following:
>
> Actually, I was replicating the Example 2.3, using the dataset
> "brainsize.txt" given in Section 2.3.3 ("Summarize by group") at page 55,
> of a famous book "R by Example" written by "Jim Albert and Maria Rizzo"
> published in Springers (2012) in a Use R! Series. The output of the by()
> function printed in the book is being reproduced below for information to
> all:


See their errata page http://personal.bgsu.edu/~mrizzo/Rx/Rx-errata.txt. 
 They corrected "mean" to "colMeans".
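
A sketch of the corrected call with missing values present, using mtcars as
a stand-in since brainsize.txt is not bundled with R. The NAs are introduced
artificially for the illustration, and the column choice is arbitrary.

```r
# Stand-in data: three numeric columns of mtcars.
cars <- mtcars[, c("mpg", "hp", "wt")]
cars$wt[c(1, 20)] <- NA   # artificial missing values, for illustration only

# by() forwards extra arguments such as na.rm to FUN, so colMeans()
# can skip the NAs within each group.
res <- by(cars, INDICES = list(am = mtcars$am), FUN = colMeans, na.rm = TRUE)
print(res)
```

Without na.rm = TRUE, the wt mean of any group containing an NA would itself
be NA.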


Duncan Murdoch





Re: [R] Bug in by() function which works for some FUN argument and does not work for others

2016-04-15 Thread peter dalgaard
Books don't rewrite themselves retroactively

NEWS for 3.0.0 has

   • mean() for data frames and sd() for data frames and matrices are
  defunct.

and 3.0.0 was released April 3, 2013.

A book published in 2012 would likely be based on R 2.13.x or maybe even 2.12.x.

So mean(dataframe) worked in the past. It was changed because of 
inconsistencies, e.g. mean(as.matrix(dataframe)) is a single number, 
median.data.frame never existed, var(dataframe) differed from sd(dataframe)^2, 
etc. The deprecation/defunct process started with 2.14.0-pre in October 2011.
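
To make the first of these inconsistencies concrete, here is a small sketch
on two columns of the built-in mtcars data (the choice of columns is
arbitrary, made only for this illustration):

```r
df <- mtcars[, c("mpg", "wt")]

# Coercing to a matrix first pools every cell into one grand mean.
grand_mean <- mean(as.matrix(df))

# colMeans() keeps the per-column view that mean(dataframe) used to return.
col_means <- colMeans(df)

# sapply() gives the same per-column view and also works for median(),
# which never had a data frame method.
col_medians <- sapply(df, median)

print(grand_mean)
print(col_means)
print(col_medians)
```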

-pd 




Re: [R] Bug in by() function which works for some FUN argument and does not work for others

2016-04-15 Thread David Winsemius

> On Apr 15, 2016, at 1:16 AM, Akhilesh Singh  
> wrote:
> 
> Dear All,
> 
> Thanks for your help. However, I would like to draw your attention to the
> following:
> 
> Actually, I was replicating the Example 2.3, using the dataset
> "brainsize.txt" given in Section 2.3.3 ("Summarize by group") at page 55,
> of a famous book "R by Example" written by "Jim Albert and Maria Rizzo"
> published in Springers (2012) in a Use R! Series. The output of the by()
> function printed in the book is being reproduced below for information to
> all:
> 
>> by(data=brain[, -1], INDICES=brain$Gender, FUN=mean, na.rm=TRUE)
> brain$Gender: Female
>       FSIQ        VIQ        PIQ     Weight     Height  MRI_Count
>    111.900    109.450    110.450    137.200     65.765 862654.600
> 
> brain$Gender: Male
>      FSIQ       VIQ       PIQ    Weight    Height MRI_Count
>     115.0 115.25000     111.6     166.4  71.43158  954855.4
> 
> 
> I do not know how could the writers of the book have produced the above
> results by by() function.


There was in the not-so-distant past a function named `mean.data.frame` which 
would have "worked" in that instance. That function was removed. I thought one 
could find the exact date of that action by searching the NEWS, but I failed to 
do so. Reviewing the citations of `mean.data.frame` in the r-help archives, I 
see that users were being warned in mid-2012 that its use was deprecated. It's 
very possible that the authors of a book published in 2012 were using an earlier 
version of R that still had that facility available. With a newer-than-current 
version of R (3.3.0) and a modest number of loaded packages, I see this:

> methods(mean)
 [1] mean,ANY-method          mean,Matrix-method       mean,Raster-method
 [4] mean,sparseMatrix-method mean,sparseVector-method mean.Date
 [7] mean.default             mean.difftime            mean.POSIXct
[10] mean.POSIXlt             mean.yearmon*            mean.yearqtr*
[13] mean.zoo*

It is your responsibility to determine whether any particular function in your 
version of R satisfies the language requirements at the time of your use. Jim 
Albert and Maria Rizzo do not set the standards for what is an evolving piece 
of software.
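
One way to make that check programmatically, rather than by reading the 
methods() listing, is sketched below. It relies only on utils::getS3method(), 
whose optional = TRUE argument returns NULL when no method is registered; the 
variable names are made up for the example.

```r
# NULL means no mean() method for data frames is registered in this session.
df_method <- utils::getS3method("mean", "data.frame", optional = TRUE)
has_df_method <- !is.null(df_method)
print(has_df_method)

# colMeans() does not depend on that method, so it is safe either way.
column_means <- colMeans(mtcars)
print(column_means)
```

The result depends on the R version and on which packages are loaded, so it 
is worth running in the same session as the code being checked.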

-- 
David.



Re: [R] Bug in by() function which works for some FUN argument and does not work for others

2016-04-15 Thread Akhilesh Singh
Dear All,

Thanks for your help. However, I would like to draw your attention to the
following:

I was replicating Example 2.3, which uses the dataset
"brainsize.txt", from Section 2.3.3 ("Summarize by group") on page 55
of the well-known book "R by Example" by Jim Albert and Maria Rizzo,
published by Springer (2012) in the Use R! series. The output of the by()
function printed in the book is reproduced below for everyone's
information:

> by(data=brain[, -1], INDICES=brain$Gender, FUN=mean, na.rm=TRUE)
brain$Gender: Female
      FSIQ        VIQ        PIQ     Weight     Height  MRI_Count
   111.900    109.450    110.450    137.200     65.765 862654.600

brain$Gender: Male
     FSIQ       VIQ       PIQ    Weight    Height MRI_Count
    115.0 115.25000     111.6     166.4  71.43158  954855.4


I do not know how the authors of the book could have produced the above
results with the by() function. When I could not reproduce these results,
I first thought that the cause might be the missing values (NAs) in the
Weight and Height variables. I then tried the same code on the "mtcars"
dataset with INDICES=mtcars$am. When I got the same result there too,
I reported the case to "r-help@R-project.org".

With best regards,

Dr. A.K. Singh
Head, Department of Agril. Statistics
Indira Gandhi Krishi Vishwavidyalaya, Raipur
Chhattisgarh, India, PIN-492012
Mobile: +919752620740
Email: akhileshsingh.i...@gmail.com

On Fri, Apr 15, 2016 at 3:06 AM, Adrian Dușa  wrote:

> I think you are not using the best function for what your intentions are.
> Try:
>
> > by(data=mtcars, INDICES=list(as.factor(mtcars$am)), FUN=colMeans)
> : 0
>         mpg         cyl        disp          hp        drat          wt
>  17.1473684   6.9473684 290.3789474 160.2631579   3.2863158   3.7688947
>        qsec          vs          am        gear        carb
>  18.1831579   0.3684211   0.0000000   3.2105263   2.7368421
>
> ---
> : 1
>         mpg         cyl        disp          hp        drat          wt
>  24.3923077   5.0769231 143.5307692 126.8461538   4.0500000   2.4110000
>        qsec          vs          am        gear        carb
>  17.3600000   0.5384615   1.0000000   4.3846154   2.9230769
>
> See the difference between colMeans() and mean() in their respective help
> files.
> Hth,
> Adrian
>
> On Thu, Apr 14, 2016 at 11:14 PM, Akhilesh Singh <
> akhileshsingh.i...@gmail.com> wrote:
>
>> Dear Sirs,
>>
>> I am Professor at Indira Gandhi Krishi Vishwavidyalaya, Raipur,
>> Chhattisgarh, India.
>>
>> While taking classes, I found the by() function producing the following
>> error when I use FUN=mean or median and some other functions; however,
>> FUN=summary works.
>>
>> Given below is the output of the example I used on a built-in dataset
>> "mtcars", along with error message reproduced herewith:
>>
>> > by(data=mtcars, INDICES=list(mtcars$am), FUN=mean)
>> : 0
>> [1] NA
>> 
>> : 1
>> [1] NA
>> Warning messages:
>> 1: In mean.default(data[x, , drop = FALSE], ...) :
>>   argument is not numeric or logical: returning NA
>> 2: In mean.default(data[x, , drop = FALSE], ...) :
>>   argument is not numeric or logical: returning NA
>>
>> However, the same by() function works for FUN=summary, given below is the
>> output:
>>
>> > by(data=mtcars, INDICES=list(mtcars$am), FUN=summary)
>> : 0
>>       mpg             cyl             disp             hp
>>  Min.   :10.40   Min.   :4.000   Min.   :120.1   Min.   : 62.0
>>  1st Qu.:14.95   1st Qu.:6.000   1st Qu.:196.3   1st Qu.:116.5
>>  Median :17.30   Median :8.000   Median :275.8   Median :175.0
>>  Mean   :17.15   Mean   :6.947   Mean   :290.4   Mean   :160.3
>>  3rd Qu.:19.20   3rd Qu.:8.000   3rd Qu.:360.0   3rd Qu.:192.5
>>  Max.   :24.40   Max.   :8.000   Max.   :472.0   Max.   :245.0
>>       drat             wt             qsec             vs               am
>>  Min.   :2.760   Min.   :2.465   Min.   :15.41   Min.   :0.0000   Min.   :0
>>  1st Qu.:3.070   1st Qu.:3.438   1st Qu.:17.18   1st Qu.:0.0000   1st Qu.:0
>>  Median :3.150   Median :3.520   Median :17.82   Median :0.0000   Median :0
>>  Mean   :3.286   Mean   :3.769   Mean   :18.18   Mean   :0.3684   Mean   :0
>>  3rd Qu.:3.695   3rd Qu.:3.842   3rd Qu.:19.17   3rd Qu.:1.0000   3rd Qu.:0
>>  Max.   :3.920   Max.   :5.424   Max.   :22.90   Max.   :1.0000   Max.   :0
>>       gear            carb
>>  Min.   :3.000   Min.   :1.000
>>  1st Qu.:3.000   1st Qu.:2.000
>>  Median :3.000   Median :3.000
>>  Mean   :3.211   Mean   :2.737
>>  3rd Qu.:3.000   3rd Qu.:4.000
>>  Max.   :4.000   Max.   :4.000
>> 
>> : 1
>>   mpg cyl disp hp drat
>>
>>  Min.   :15.00   Min.   :4.000   Min.   : 71.1   

Re: [R] Bug in by() function which works for some FUN argument and does not work for others

2016-04-14 Thread Jim Lemon
Hi Dr Singh,
The object mtcars is a data frame, and the mean is not defined for a
data frame. If you try it on a component of the data frame for which
mean is defined, it works:

 by(mtcars$mpg,mtcars$am,mean)
mtcars$am: 0
[1] 17.14737

mtcars$am: 1
[1] 24.39231

Jim
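
The same idea extends from one column to all of them. The following is a minimal sketch, assuming the built-in mtcars data; the names mpg_by_am and group_means are introduced here for illustration and do not appear in the thread.

```r
# Apply mean() to one numeric column at a time, split by transmission type (am).
# tapply() hands FUN a plain numeric vector, which mean() accepts.
mpg_by_am <- tapply(mtcars$mpg, mtcars$am, mean)

# Repeat for every column: sapply() loops over the columns of the data frame,
# so each tapply() call again sees a vector, never a data frame.
group_means <- sapply(mtcars, function(col) tapply(col, mtcars$am, mean))

print(round(mpg_by_am, 5))
print(round(group_means[, c("mpg", "hp", "wt")], 3))
```

The result is a matrix with one row per group and one column per variable, so any function that accepts a numeric vector can be substituted for mean.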


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Bug in by() function which works for some FUN argument and does not work for others

2016-04-14 Thread Bert Gunter
You're right, but I think this fails to pinpoint the error. The
problem is that FUN is "applied to (usually data-frame) subsets of
data," and the OP has used FUN = mean, which takes a vector (plus a
few other classes), not a data frame, as its argument. See
?mean
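
The point can be checked directly. This is a minimal sketch, assuming a plain R session with only the base packages loaded (an add-on package could in principle register its own mean method for data frames); the variable names are illustrative.

```r
# by() passes each group's rows to FUN as a data frame, not as a vector.
subset_classes <- by(mtcars, mtcars$am, class)
print(subset_classes)

# summary() has a data-frame method; mean() is assumed to have none here,
# so mean() on a data frame falls through to mean.default().
has_mean_method    <- !is.null(getS3method("mean", "data.frame", optional = TRUE))
has_summary_method <- !is.null(getS3method("summary", "data.frame", optional = TRUE))

# mean.default() warns that the argument is not numeric or logical
# and returns NA; the warning is silenced here to keep the output short.
result <- suppressWarnings(mean(mtcars))
print(c(mean = has_mean_method, summary = has_summary_method))
print(result)
```

This is why FUN=summary works in by() while FUN=mean returns NA for every group.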

Morals:

1. It is rather presumptuous to think that long used, well-tested,
core R functionality like by() has bugs; a (new?) user's first
thought should be to assume it is HIS error, not R's.


2. DO read the Help docs carefully. They are often terse, but usually
they mean what they (appear to) say.

Cheers,
Bert


Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


On Thu, Apr 14, 2016 at 2:36 PM, Adrian Dușa  wrote:
> I think you are not using the best function for what your intentions are.
> Try:
>
>> by(data=mtcars, INDICES=list(as.factor(mtcars$am)), FUN=colMeans)
> : 0
>         mpg         cyl        disp          hp        drat          wt
>  17.1473684   6.9473684 290.3789474 160.2631579   3.2863158   3.7688947
>        qsec          vs          am        gear        carb
>  18.1831579   0.3684211   0.0000000   3.2105263   2.7368421
> ---
> : 1
>         mpg         cyl        disp          hp        drat          wt
>  24.3923077   5.0769231 143.5307692 126.8461538   4.0500000   2.4110000
>        qsec          vs          am        gear        carb
>  17.3600000   0.5384615   1.0000000   4.3846154   2.9230769
>
> See the difference between colMeans() and mean() in their respective help
> files.
> Hth,
> Adrian

Re: [R] Bug in by() function which works for some FUN argument and does not work for others

2016-04-14 Thread Adrian Dușa
I think you are not using the best function for what your intentions are.
Try:

> by(data=mtcars, INDICES=list(as.factor(mtcars$am)), FUN=colMeans)
: 0
        mpg         cyl        disp          hp        drat          wt
 17.1473684   6.9473684 290.3789474 160.2631579   3.2863158   3.7688947
       qsec          vs          am        gear        carb
 18.1831579   0.3684211   0.0000000   3.2105263   2.7368421
---
: 1
        mpg         cyl        disp          hp        drat          wt
 24.3923077   5.0769231 143.5307692 126.8461538   4.0500000   2.4110000
       qsec          vs          am        gear        carb
 17.3600000   0.5384615   1.0000000   4.3846154   2.9230769

See the difference between colMeans() and mean() in their respective help
files.
Hth,
Adrian
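
colMeans() covers the mean only. For other column-wise statistics, one option is to wrap a vector function so that by() can call it on each data-frame subset. The following is a sketch under stated assumptions: skewness() and colwise() are hypothetical helpers written for this illustration, not functions from base R or from the thread, and the skewness formula is the simple moment-based one.

```r
# Hypothetical helper (not from the thread): moment-based sample skewness.
skewness <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]
  centred <- x - mean(x)
  mean(centred^3) / mean(centred^2)^1.5
}

# Hypothetical wrapper: turn a vector function into one that by() can call
# on a data-frame subset, applying it to each column in turn.
colwise <- function(f, ...) function(df) sapply(df, f, ...)

# Drop the grouping column: it is constant within each group, and a
# constant column has zero variance, which would give NaN skewness.
vars <- setdiff(names(mtcars), "am")

medians_by_am <- by(mtcars[vars], mtcars$am, colwise(median))
skew_by_am    <- by(mtcars[vars], mtcars$am, colwise(skewness, na.rm = TRUE))

print(medians_by_am)
print(skew_by_am)
```

With colwise(mean) this reproduces the colMeans() result for the remaining columns, so the wrapper is a generalisation rather than a different method.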


[R] Bug in by() function which works for some FUN argument and does not work for others

2016-04-14 Thread Akhilesh Singh
Dear Sirs,

I am a Professor at Indira Gandhi Krishi Vishwavidyalaya, Raipur,
Chhattisgarh, India.

While taking classes, I found that the by() function produces the following
error when I use FUN=mean, FUN=median, and some other functions; FUN=summary,
however, works.

Below is the output of the example I ran on the built-in dataset
"mtcars", along with the error message:

> by(data=mtcars, INDICES=list(mtcars$am), FUN=mean)
: 0
[1] NA

: 1
[1] NA
Warning messages:
1: In mean.default(data[x, , drop = FALSE], ...) :
  argument is not numeric or logical: returning NA
2: In mean.default(data[x, , drop = FALSE], ...) :
  argument is not numeric or logical: returning NA

However, the same by() call works with FUN=summary; the output is given
below:

> by(data=mtcars, INDICES=list(mtcars$am), FUN=summary)
: 0
      mpg             cyl             disp             hp
 Min.   :10.40   Min.   :4.000   Min.   :120.1   Min.   : 62.0
 1st Qu.:14.95   1st Qu.:6.000   1st Qu.:196.3   1st Qu.:116.5
 Median :17.30   Median :8.000   Median :275.8   Median :175.0
 Mean   :17.15   Mean   :6.947   Mean   :290.4   Mean   :160.3
 3rd Qu.:19.20   3rd Qu.:8.000   3rd Qu.:360.0   3rd Qu.:192.5
 Max.   :24.40   Max.   :8.000   Max.   :472.0   Max.   :245.0
      drat             wt             qsec             vs               am
 Min.   :2.760   Min.   :2.465   Min.   :15.41   Min.   :0.0000   Min.   :0
 1st Qu.:3.070   1st Qu.:3.438   1st Qu.:17.18   1st Qu.:0.0000   1st Qu.:0
 Median :3.150   Median :3.520   Median :17.82   Median :0.0000   Median :0
 Mean   :3.286   Mean   :3.769   Mean   :18.18   Mean   :0.3684   Mean   :0
 3rd Qu.:3.695   3rd Qu.:3.842   3rd Qu.:19.17   3rd Qu.:1.0000   3rd Qu.:0
 Max.   :3.920   Max.   :5.424   Max.   :22.90   Max.   :1.0000   Max.   :0
      gear            carb
 Min.   :3.000   Min.   :1.000
 1st Qu.:3.000   1st Qu.:2.000
 Median :3.000   Median :3.000
 Mean   :3.211   Mean   :2.737
 3rd Qu.:3.000   3rd Qu.:4.000
 Max.   :4.000   Max.   :4.000

: 1
      mpg             cyl             disp             hp             drat
 Min.   :15.00   Min.   :4.000   Min.   : 71.1   Min.   : 52.0   Min.   :3.54
 1st Qu.:21.00   1st Qu.:4.000   1st Qu.: 79.0   1st Qu.: 66.0   1st Qu.:3.85
 Median :22.80   Median :4.000   Median :120.3   Median :109.0   Median :4.08
 Mean   :24.39   Mean   :5.077   Mean   :143.5   Mean   :126.8   Mean   :4.05
 3rd Qu.:30.40   3rd Qu.:6.000   3rd Qu.:160.0   3rd Qu.:113.0   3rd Qu.:4.22
 Max.   :33.90   Max.   :8.000   Max.   :351.0   Max.   :335.0   Max.   :4.93
       wt             qsec             vs               am          gear
 Min.   :1.513   Min.   :14.50   Min.   :0.0000   Min.   :1   Min.   :4.000
 1st Qu.:1.935   1st Qu.:16.46   1st Qu.:0.0000   1st Qu.:1   1st Qu.:4.000
 Median :2.320   Median :17.02   Median :1.0000   Median :1   Median :4.000
 Mean   :2.411   Mean   :17.36   Mean   :0.5385   Mean   :1   Mean   :4.385
 3rd Qu.:2.780   3rd Qu.:18.61   3rd Qu.:1.0000   3rd Qu.:1   3rd Qu.:5.000
 Max.   :3.570   Max.   :19.90   Max.   :1.0000   Max.   :1   Max.   :5.000
      carb
 Min.   :1.000
 1st Qu.:1.000
 Median :2.000
 Mean   :2.923
 3rd Qu.:4.000
 Max.   :8.000
>

I am using the latest version, R 3.2.4 on Windows; however, this error
also occurs in the previous version.

I hope this report will receive serious attention in debugging.

With best regards,

Dr. A.K. Singh
Head, Department of Agril. Statistics
Indira Gandhi Krishi Vishwavidyalaya, Raipur
Chhattisgarh, India, PIN-492012
Mobile: +919752620740
Email: akhileshsingh.i...@gmail.com
