Denis Chabot <chabotd <at> globetrotter.net> writes:
: The sum of a vector having at least one NA but also valid data gives NA 
: if we do not specify na.rm=T. But with na.rm=T, we are telling sum to 
: give the sum of valid data, ignoring NAs that do not tell us anything 
: about the value of a variable. I found out while getting the sum of 
: small subsets of my data (such as when subsetting by several 
: variables), sometimes a "cell" only contained NAs for my response 
: variable. I would have expected the sum to be NA in such cases, as I do 
: not have a single data point telling me the value of my response here. 
: But R tells me the sum was zero in that cell! Was this behavior 
: considered "desirable" when sum was built? If not, any hope it will be 
: fixed?

Think of it this way: if u and v are index vectors, then it's desirable that

        sum(x[u]) + sum(x[v]) == sum(x[c(u,v)])

hold for zero-length index vectors too, in which case
sum(numeric()) must be zero, not NA.
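
A small sketch of that identity, using made-up values for x, u and v (none of them come from the original post). It also shows why an all-NA cell sums to zero with na.rm=TRUE: once the NAs are dropped, what is left is a zero-length vector.

```r
# Illustrative data only; x, u and v are assumptions for this example.
x <- c(3, 1, 4, 1, 5)
u <- c(1, 2)
v <- integer(0)                  # zero-length index vector

lhs <- sum(x[u]) + sum(x[v])     # sum over u plus sum over nothing
rhs <- sum(x[c(u, v)])           # sum over the combined index

empty_sum <- sum(numeric())      # the empty sum

# An all-NA "cell": na.rm = TRUE removes every element before summing,
# so this is the empty sum again.
all_na <- c(NA_real_, NA_real_)
all_na_sum <- sum(all_na, na.rm = TRUE)
```

If sum(numeric()) were NA, lhs would be NA while rhs would still be 4, and the identity would break.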

If you want a short expression that gives NA for zero-length x (for
instance, x after its NAs have been dropped), try this:

        sum(x) + if (length(x)) 0 else NA

or define your own function, sum0, say.
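
One possible way to write such a function, as a sketch only. The name sum0 is the one suggested above; the body, the na.rm argument, and the toy data frame d are assumptions made for illustration.

```r
# sum0: like sum(), but returns NA when no values remain to be summed.
# The na.rm argument and this body are assumptions, not an established API.
sum0 <- function(x, na.rm = FALSE) {
  if (na.rm) x <- x[!is.na(x)]            # drop NAs ourselves
  if (length(x) == 0L) return(NA_real_)   # nothing left: report NA
  sum(x)
}

# Hypothetical data: cell "b" contains only NAs for the response y.
d <- data.frame(g = c("a", "a", "b", "b"),
                y = c(1, NA, NA, NA))
cell_sums <- tapply(d$y, d$g, sum0, na.rm = TRUE)
```

Here cell "a" gets the sum of its valid data and cell "b" gets NA instead of zero. Note that sum0 no longer satisfies the additivity identity above, which is the price of this behaviour.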

______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
