Re: [R] Why does summary show number of NAs as non-integer?

2005-06-01 Thread Earl F. Glynn
Berton Gunter [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
> summary() is an S3 generic that for your vector dispatches
> summary.default(). The output of summary.default() has class "table" and so
> calls print.table() (print() is another S3 generic). Look at the code of
> print.table() to see how it formats the output.

Marc Schwartz [EMAIL PROTECTED] wrote in message
news:[EMAIL PROTECTED]
> On Tue, 2005-05-31 at 17:14 -0500, Earl F. Glynn wrote:
>
>> Why isn't the number of NA's just 2 instead of the 2.000 shown above?
>
> The same number of decimal places is used throughout a vector

I'm talking about how this should be designed.  The current implementation
may be to print a vector using generic logic, but why use generic logic to
produce a wrong solution? Shouldn't correctness be more important than using
a generic solution?

There is special logic to suppress NA's when they don't exist (see below),
so why isn't there special logic to print the count of NAs, which MUST be an
integer, correctly when they do exist?

An integer should NOT be displayed with meaningless decimal places. Why
would this ever be desirable?  The generic solution should be dropped in
favor of a correct solution.

# Why not use special logic to show the number of NA's correctly as an integer?
> set.seed(19)
> summary( c(NA, runif(10,1,100), NaN) )
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
  7.771  24.850  43.040  43.940  63.540  83.830   2.000

# There is already special logic to suppress NA's
> set.seed(19)
> summary( runif(10,1,100) )
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  7.771  24.850  43.040  43.940  63.540  83.830

2.000 and 2 do not have equivalent meaning.
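
As a rough illustration of the special-casing asked for above, the count could be formatted separately from the floating-point statistics. The sketch below is not how summary() or print.table() is implemented; format_summary() is a hypothetical helper name, and it assumes the summary vector carries the usual "NA's" element name.

```r
# Hypothetical helper (not part of base R): format the statistics with a
# common number of digits, but format the NA count on its own as an integer.
format_summary <- function(x, digits = 4) {
  s <- unclass(summary(x))            # plain named numeric vector
  is_count <- names(s) == "NA's"      # TRUE only for the NA count, if present
  out <- character(length(s))
  names(out) <- names(s)
  out[!is_count] <- format(s[!is_count], digits = digits)
  out[is_count] <- format(as.integer(s[is_count]))
  out
}

res <- format_summary(c(NA, 1.5, 2.25, 10, NaN))
print(noquote(res))                   # the NA's entry is shown as 2, not 2.000
```

Because the count is formatted in its own call, the decimal places needed by the other entries no longer leak into it.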

efg

__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] Why does summary show number of NAs as non-integer?

2005-06-01 Thread Gabor Grothendieck
Try:

R> library(Hmisc)
R> describe( c(NA, runif(10,1,100), NaN) )
c(NA, runif(10, 1, 100), NaN)
       n missing  unique    Mean     .05     .10     .25     .50     .75     .90     .95
      10       2      10   50.99   15.24   16.82   21.14   52.70   76.35   83.52   90.79

           13.65 17.17 18.12 30.18 46.21 59.19 65.36 80.01 81.90 98.06
Frequency      1     1     1     1     1     1     1     1     1     1
%             10    10    10    10    10    10    10    10    10    10



[R] Why does summary show number of NAs as non-integer?

2005-05-31 Thread Earl F. Glynn
Example:

> set.seed(19)
> summary( c(NA, runif(10,1,100), NaN) )
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
  7.771  24.850  43.040  43.940  63.540  83.830   2.000

Why isn't the number of NA's just 2 instead of the 2.000 shown above?

efg



RE: [R] Why does summary show number of NAs as non-integer?

2005-05-31 Thread Berton Gunter
summary() is an S3 generic that, for your vector, dispatches to
summary.default(). The output of summary.default() has class "table", so
printing it calls print.table() (print() is another S3 generic). Look at the
code of print.table() to see how it formats the output.
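
The dispatch chain described above can be inspected directly. This is a minimal sketch; the exact class vector depends on the R version (newer releases may put a more specific class in front of "table"), so the code only checks inheritance rather than an exact class.

```r
set.seed(19)
x <- c(NA, runif(10, 1, 100), NaN)

s <- summary(x)               # a plain numeric vector goes to summary.default()
class(s)                      # includes "table"; other classes may precede it
is_table <- inherits(s, "table")

# The NA count is stored in the same numeric vector as the quantiles,
# so it is a double like every other element.
na_count <- unclass(s)[["NA's"]]
is.double(na_count)
```

Since the count sits in one vector with the other statistics, whatever print method handles that vector formats all of its elements together.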

-- Bert Gunter
Genentech Non-Clinical Statistics
South San Francisco, CA
 
The business of the statistician is to catalyze the scientific learning
process.  - George E. P. Box
 
 





Re: [R] Why does summary show number of NAs as non-integer?

2005-05-31 Thread Marc Schwartz
On Tue, 2005-05-31 at 17:14 -0500, Earl F. Glynn wrote:
> Example:
>
> > set.seed(19)
> > summary( c(NA, runif(10,1,100), NaN) )
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's
>   7.771  24.850  43.040  43.940  63.540  83.830   2.000
>
> Why isn't the number of NA's just 2 instead of the 2.000 shown above?
>
> efg

This is actually related to the thread on formatting numbers.

From the Details section of ?print.default:

"The same number of decimal places is used throughout a vector. This
means that digits specifies the minimum number of significant digits to
be used, and that at least one entry will be printed with that minimum
number."

'digits' in the above is the digits argument to print.default(). In this
case, it defaults to getOption("digits"), which is 7.

In the above output from summary(), note that every value has three
digits after the decimal point.

Thus:

> c(2)
[1] 2

> c(2, 3)
[1] 2 3

> c(2, 3.5)
[1] 2.0 3.5

> c(2, 3.57)
[1] 2.00 3.57

> c(2, 3.579)
[1] 2.000 3.579


Note how the output format of 2 varies depending upon how many decimal
places I use in the second element. 
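
The same effect can be reproduced with format(), which also applies a common number of decimal places across a vector. The following small sketch contrasts vector-wide and element-wise formatting; it illustrates the behaviour and is not the code print.default() itself runs.

```r
x <- c(2, 3.579)

# Formatting the whole vector at once: one common number of decimal places.
together <- format(x)                          # "2.000" "3.579"

# Formatting each element on its own: each gets only the digits it needs.
separate <- vapply(x, format, character(1))    # "2"     "3.579"

print(together)
print(separate)
```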

When you need greater control over how numeric output is formatted and
aligned, use other functions such as formatC() and/or sprintf().

For example:

> sprintf("0 decimal places: %d   3 decimal places: %4.3f", 2, 3.57911)
[1] "0 decimal places: 2   3 decimal places: 3.579"


See ?sprintf and ?formatC for more information.
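
For instance, a count and a set of statistics can be formatted with different specifications and still be aligned. This is only a sketch with made-up variable names (na_count, stats), reusing a few values from the summary() output in this thread.

```r
na_count <- 2L                      # an integer count
stats <- c(7.771, 24.85, 43.04)     # floating-point statistics

count_chr <- sprintf("%d", na_count)       # "2"
stats_chr <- sprintf("%.3f", stats)        # "7.771" "24.850" "43.040"

# formatC() can pad to a fixed width so columns line up.
aligned <- formatC(stats, format = "f", digits = 3, width = 7)
count_aligned <- formatC(na_count, format = "d", width = 7)

cat(aligned, count_aligned, "\n")
```

Here the statistics keep three decimal places while the count is printed as a whole number in a field of the same width.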

HTH,

Marc Schwartz
