Hi, there is one more thing to consider: the risk of getting integer overflow. Should that be ignored? How is this handled by sum()?
I think it is good if {col|row}Sums() would return the same data type as the input object. Cheers Henrik On Thu, Jul 17, 2008 at 6:32 PM, Herve Pages <[EMAIL PROTECTED]> wrote: > Hi Bill, > > [EMAIL PROTECTED] wrote: >> >> I don't see the cost of doing so paying off. > > The cost seems rather small. For the "if" block that is tagged /* columns */ > in the do_colsum() function (src/main/array.c), that would give something > like: > > if (OP == 0 || OP == 1) { /* columns */ > int ans_type = (type == REALSXP || OP == 1) ? REALSXP : INTSXP; > int ina = type == INTSXP ? NA_INTEGER : NA_LOGICAL; > PROTECT(ans = allocVector(ans_type, p)); > for (j = 0; j < p; j++) { > double rsum = 0.0; > int isum = 0; > switch (type) { > case REALSXP: > rx = REAL(x) + n*j; > for (i = cnt = 0; i < n; i++, rx++) { > if (!ISNAN(*rx)) {cnt++; rsum += *rx;} > else if (keepNA) {rsum = NA_REAL; break;} > } > if (OP == 1) { > if (cnt == 0 || ISNAN(rsum)) > REAL(ans)[j] = NA_REAL; > else > REAL(ans)[j] = rsum / cnt; > } else > REAL(ans)[j] = rsum; > break; > case INTSXP: case LGLSXP: > ix = (type == INTSXP ? INTEGER(x) : LOGICAL(x)) + n*j; > for (i = cnt = 0; i < n; i++, ix++) { > if (*ix != ina) {cnt++; isum += *ix;} > else if (keepNA) {isum = NA_INTEGER; break;} > } > if (OP == 1) { > if (cnt == 0 || isum == NA_INTEGER) > REAL(ans)[j] = NA_REAL; > else > REAL(ans)[j] = ((double) isum) / cnt; > } else > INTEGER(ans)[j] = isum; > break; > default: > /* we checked the type above, but be sure */ > UNIMPLEMENTED_TYPEt("do_colsum", type); > } > } > } > > So now you have 42 lines instead of 37 in the current code. > Then do something similar for the "if" block tagged /* rows */. > Basically the new code would not be more complicated than the > current code. > > In the end rowSums()/colSums() will do the _right_ thing i.e. > they'll preserve the 'integer' storage mode, and, by doing so, will > behave consistently with the sum() function. > > BTW since the LGLSXP type is supported then the "'x' must be numeric" > error msg at the beginning of the do_colsum() function is inappropriate. > > Cheers, > H. > > >> >> storage.mode is really only important if you are passing arguments to >> compiled code. >> >> If you are passing to compiled code, you really need to ensure the >> storage mode is what you think it is, anyway. >> >> Bill Venables. >> >> >> -----Original Message----- >> From: [EMAIL PROTECTED] >> [mailto:[EMAIL PROTECTED] On Behalf Of Herve Pages >> Sent: Thursday, 17 July 2008 3:48 PM >> To: R-devel@r-project.org >> Subject: [Rd] rowSums()/colSums() don't preserve the 'integer' storage >> mode >> >> Hi, >> >> Wouldn't that make sense to have rowSums()/colSums() to preserve the >> storage mode? >> >> m <- matrix(1:15, nrow=5) >> >> > storage.mode(m) >> [1] "integer" >> >> > storage.mode(sum(m)) >> [1] "integer" >> >> > storage.mode(rowSums(m)) >> [1] "double" <------------------- surprising! >> >> Cheers, >> H. >> >> ______________________________________________ >> R-devel@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-devel >> >> >> > > ______________________________________________ > R-devel@r-project.org mailing list > https://stat.ethz.ch/mailman/listinfo/r-devel > ______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel