Re: [Rd] mean
Note that in

> > quantile(c("1","2","3"),p=.5)
> Error in (1 - h) * qs[i] :
>   non-numeric argument to binary operator

the default quantile type (7) does not work for non-numerics. Quantile types 1 and 3 work as expected:

    > quantile(c("1","2","3"),p=.5, type=1)
    50%
    "2"
    > quantile(c("1","2","3"),p=.5, type=3)
    50%
    "2"

Steve E

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
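To see which quantile types accept a character vector on a particular R build, a probe along the following lines may help. It is only a sketch: the name `type_works` is made up for illustration, and since the outcome per type may differ between R versions, no expected output is shown.

```r
# Probe each of the nine quantile types on a character vector.
# try() turns an error into a "try-error" object instead of stopping.
x <- c("1", "2", "3")

type_works <- vapply(1:9, function(t) {
  res <- try(quantile(x, probs = 0.5, type = t), silent = TRUE)
  !inherits(res, "try-error")
}, logical(1))
names(type_works) <- paste0("type", 1:9)

print(type_works)
```
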
Re: [Rd] mean
Peter,

Thanks for the reply.

If that were the case, then should not the following be allowed to work with ordered factors?

    > median(factor(c("1", "2", "3"), ordered = TRUE))
    Error in median.default(factor(c("1", "2", "3"), ordered = TRUE)) :
      need numeric data

At least on the surface, if you can lexically order a character vector:

    > median(c("red", "blue", "green"))
    [1] "green"

you can also order a factor, or ordered factor, and if the number of elements is odd, return a median value.

Regards,

Marc

> On Jan 9, 2020, at 10:46 AM, peter dalgaard wrote:
>
> I think median() behaves as designed: As long as the argument can be ordered,
> the "middle observation" makes sense, except when the middle falls between
> two categories, and you can't define an average of the two candidates for a
> median.
>
> The "sick man" would seem to be var(). Notice that it is also inconsistent
> with cov():
>
>> cov(c("1","2","3","4"),c("1","2","3","4") )
> Error in cov(c("1", "2", "3", "4"), c("1", "2", "3", "4")) :
>   is.numeric(x) || is.logical(x) is not TRUE
>> var(c("1","2","3","4"),c("1","2","3","4") )
> [1] 1.67
>
> -pd
Re: [Rd] mean
I think median() behaves as designed: As long as the argument can be ordered, the "middle observation" makes sense, except when the middle falls between two categories, and you can't define an average of the two candidates for a median.

The "sick man" would seem to be var(). Notice that it is also inconsistent with cov():

    > cov(c("1","2","3","4"),c("1","2","3","4") )
    Error in cov(c("1", "2", "3", "4"), c("1", "2", "3", "4")) :
      is.numeric(x) || is.logical(x) is not TRUE
    > var(c("1","2","3","4"),c("1","2","3","4") )
    [1] 1.67

-pd

> On 9 Jan 2020, at 14:49 , Marc Schwartz via R-devel wrote:
>
> Jean-Luc,
>
> Please keep the communications on the list, for the benefit of others, now
> and in the future, via the list archive. I am adding r-devel back here.
>
> I can't speak to the rationale in some of these cases. As I noted, it may be
> (is likely) due to differing authors over time, and there may have been
> relevant use cases at the time that the code was written, resulting in the
> various checks. Presumably, the additional checks were not incorporated into
> the other functions to enforce a level of consistency.
>
> We will need to wait for someone from R Core to comment.
>
> Regards,
>
> Marc

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd@cbs.dk  Priv: pda...@gmail.com
Re: [Rd] mean
Jean-Luc,

Please keep the communications on the list, for the benefit of others, now and in the future, via the list archive. I am adding r-devel back here.

I can't speak to the rationale in some of these cases. As I noted, it may be (is likely) due to differing authors over time, and there may have been relevant use cases at the time that the code was written, resulting in the various checks. Presumably, the additional checks were not incorporated into the other functions to enforce a level of consistency.

We will need to wait for someone from R Core to comment.

Regards,

Marc

> On Jan 9, 2020, at 8:34 AM, Lipatz Jean-Luc wrote:
>
> Ok, inconsistencies.
>
> The last test you wrote is a bit strange. I agree that it is useful to warn
> about a computation that makes no sense in the case of factors. But why
> test data frames? If you go that way using random structures, you can also
> try:
>
>> median(list(1,2),list(3,4),list(4,5))
> Error in if (na.rm) x <- x[!is.na(x)] else if (any(is.na(x)))
> return(x[FALSE][NA]) :
>   argument is not interpretable as logical
> In addition: Warning message:
> In if (na.rm) x <- x[!is.na(x)] else if (any(is.na(x))) return(x[FALSE][NA]) :
>   the condition has length > 1 and only the first element will be used
>
> giving a message which, despite its length, doesn't really explain the
> reason for the error.
>
> Why not a test on arguments like?
>     if (!is.numeric(x))
>         stop("need numeric data")
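The up-front check proposed in the quoted message could be sketched as a thin wrapper around median(). The name `strict_median` is hypothetical and not part of R; this is only an illustration of the idea, not a suggested patch.

```r
# Hypothetical wrapper (not part of base R): fail early with a clear message
# instead of relying on whatever error surfaces deeper in the computation.
strict_median <- function(x, na.rm = FALSE) {
  if (!is.numeric(x))
    stop("need numeric data")
  stats::median(x, na.rm = na.rm)
}

strict_median(c(3, 1, 2))

# A list is rejected up front with the short message.
msg <- tryCatch(strict_median(list(1, 2)),
                error = function(e) conditionMessage(e))
print(msg)
```

Note that such a check would also reject character vectors, for which median() currently returns a value when the length is odd.
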
Re: [Rd] mean
> On Jan 9, 2020, at 7:40 AM, Lipatz Jean-Luc wrote:
>
> Hello,
>
> Is there a reason for the following behaviour?
>> mean(c("1","2","3"))
> [1] NA
> Warning message:
> In mean.default(c("1", "2", "3")) :
>   argument is not numeric or logical: returning NA
>
> But:
>> var(c("1","2","3"))
> [1] 1
>
> And also:
>> median(c("1","2","3"))
> [1] "2"
>
> But:
>> quantile(c("1","2","3"),p=.5)
> Error in (1 - h) * qs[i] :
>   non-numeric argument to binary operator
>
> It sounds like a lack of symmetry.
> Best regards.
>
>
> Jean-Luc LIPATZ
> Insee - Directorate General
> Head of coordination for R development and the implementation of
> alternatives to SAS

Hi,

It would appear, whether by design or just inconsistent implementations, perhaps by different authors over time, that the checks for whether or not the input vector is numeric differ across the functions.

A further inconsistency is for median(), where:

    > median(c("1", "2", "3", "4"))
    [1] NA
    Warning message:
    In mean.default(sort(x, partial = half + 0L:1L)[half + 0L:1L]) :
      argument is not numeric or logical: returning NA

as a result of there being 4 elements, rather than 3, and the internal checks in the code, where in the case of the input vector having an even number of elements, mean() is used:

    if (n%%2L == 1L)
        sort(x, partial = half)[half]
    else mean(sort(x, partial = half + 0L:1L)[half + 0L:1L])

Similarly:

    > median(factor(c("1", "2", "3")))
    Error in median.default(factor(c("1", "2", "3"))) : need numeric data

because the input vector is a factor, rather than character, and the initial check has:

    if (is.factor(x) || is.data.frame(x))
        stop("need numeric data")

Regards,

Marc Schwartz
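The odd/even branch quoted above can be tried in isolation. The following is a simplified sketch, not the actual base R source: `median_sketch` is a made-up name, and the NA handling and the factor/data-frame check of the real function are left out.

```r
# Simplified sketch of the odd/even branch of median().
median_sketch <- function(x) {
  n <- length(x)
  if (n == 0L) return(x[NA_integer_])
  half <- (n + 1L) %/% 2L
  if (n %% 2L == 1L) {
    # Odd length: only ordering is needed, so character input works.
    sort(x, partial = half)[half]
  } else {
    # Even length: the two middle values are averaged, which needs arithmetic.
    mean(sort(x, partial = half + 0L:1L)[half + 0L:1L])
  }
}

odd  <- median_sketch(c("1", "2", "3"))
even <- suppressWarnings(median_sketch(c("1", "2", "3", "4")))
print(odd)
print(even)
```

With an odd number of strings the middle element is returned as is; with an even number the call falls through to mean(), which warns and returns NA for character input.
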
[Rd] mean
Hello,

Is there a reason for the following behaviour?

    > mean(c("1","2","3"))
    [1] NA
    Warning message:
    In mean.default(c("1", "2", "3")) :
      argument is not numeric or logical: returning NA

But:

    > var(c("1","2","3"))
    [1] 1

And also:

    > median(c("1","2","3"))
    [1] "2"

But:

    > quantile(c("1","2","3"),p=.5)
    Error in (1 - h) * qs[i] :
      non-numeric argument to binary operator

It sounds like a lack of symmetry.
Best regards.

Jean-Luc LIPATZ
Insee - Directorate General
Head of coordination for R development and the implementation of alternatives to SAS
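The four calls above can be compared side by side with a small probe that records whether each function returns a value, warns, or fails. This is a sketch; `probe` is a hypothetical helper, and the outcome for var() and quantile() may depend on the R version, so no output is shown.

```r
# Classify the outcome of f(x) as "value", "warning" or "error".
probe <- function(f, x) {
  warned <- FALSE
  res <- tryCatch(
    withCallingHandlers(
      f(x),
      warning = function(w) {
        warned <<- TRUE               # remember the warning, then silence it
        invokeRestart("muffleWarning")
      }
    ),
    error = function(e) e
  )
  if (inherits(res, "error")) "error" else if (warned) "warning" else "value"
}

x <- c("1", "2", "3")
outcomes <- c(
  mean     = probe(mean, x),
  var      = probe(var, x),
  median   = probe(median, x),
  quantile = probe(function(v) quantile(v, probs = 0.5), x)
)
print(outcomes)
```
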
Re: [Rd] mean(x) for ALTREP
Serguei,

The R 3.5.0 release includes the fundamental ALTREP framework but does not include many 'hooks' within R's source code to make use of methods on the ALTREP custom vector classes. I have implemented a fair number, including for mean() to use the custom Sum method when available, in the ALTREP branch, but unfortunately we did not have time to test and port them to the trunk in time for this release.

The current plan, as I understand it, is that we will continue to develop and test these, and other hooks, and then when ready they will be ported into trunk/R-devel over the course of this current development cycle for inclusion in the next release of R.

My hope is that the end-user benefits of ALTREP will really show through much more in future releases, but for now, things like mean() will behave as they always have from a user perspective.

Best,
~G

On Thu, Apr 26, 2018 at 2:31 AM, Serguei Sokol wrote:

> Hi,
>
> By looking at a doc about ALTREP
> https://svn.r-project.org/R/branches/ALTREP/ALTREP.html (by the way
> congratulations for that and for R-3.5.0 in general), I was a little bit
> surprised by the following example:
>
> > x <- 1:1e10
> > system.time(print(mean(x)))
> [1] 5e+09
>    user  system elapsed
>  38.520   0.008  38.531
>
> Taking 38.520 s to calculate a mean value of an arithmetic sequence seemed
> a lot to me. It probably means that calculations are made by running into a
> for loop while in the case of arithmetic sequence a mean value can simply
> be calculated as (b+e)/2 where b and e are the begin and end value
> respectively. Is it planned to take benefit of ALTREP for functions like
> mean(), sum(), min(), max() and some others to avoid running a for loop
> wherever possible? It seems so natural to me but after all some
> implementation details preventing this can escape to me.
>
> Best,
> Serguei.

-- 
Gabriel Becker, Ph.D
Scientist
Bioinformatics and Computational Biology
Genentech Research
[Rd] mean(x) for ALTREP
Hi,

While looking at a document about ALTREP, https://svn.r-project.org/R/branches/ALTREP/ALTREP.html (by the way, congratulations on that and on R-3.5.0 in general), I was a little surprised by the following example:

    > x <- 1:1e10
    > system.time(print(mean(x)))
    [1] 5e+09
       user  system elapsed
     38.520   0.008  38.531

Taking 38.520 s to calculate the mean of an arithmetic sequence seemed a lot to me. It probably means that the calculation runs through a for loop, whereas for an arithmetic sequence the mean can simply be calculated as (b+e)/2, where b and e are the first and last values respectively. Is it planned to take advantage of ALTREP for functions like mean(), sum(), min(), max() and some others, to avoid running a for loop wherever possible? It seems natural to me, but some implementation detail preventing this may escape me.

Best,
Serguei.
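The (b+e)/2 shortcut mentioned above is easy to check at the R level on a sequence small enough to average directly. This is a minimal sketch; `seq_mean` is a made-up name and has nothing to do with the ALTREP C-level methods.

```r
# Closed-form mean of an arithmetic sequence running from b to e.
seq_mean <- function(b, e) (b + e) / 2

n <- 1e6                      # small enough to compare against mean() quickly
x <- 1:n
agree <- isTRUE(all.equal(mean(x), seq_mean(1, n)))
print(agree)

# The same formula applied to the endpoints from the example, without a loop.
big <- seq_mean(1, 1e10)
print(big)
```
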
Re: [Rd] mean(x) != mean(rev(x)) different with x <- c(NA, NaN) for some builds
On 03/31/2017 10:14 PM, Prof Brian Ripley wrote:
> From ?NA
>
>      Numerical computations using ‘NA’ will normally result in ‘NA’: a
>      possible exception is where ‘NaN’ is also involved, in which case
>      either might result.
>
> and ?NaN
>
>      Computations involving ‘NaN’ will return ‘NaN’ or perhaps ‘NA’:
>      which of those two is not guaranteed and may depend on the R
>      platform (since compilers may re-order computations).
>
> fortunes::fortune(14) applies (yet again).

The problem is that TFM often contradicts itself e.g. in ?prod:

     If ‘na.rm’ is ‘FALSE’ an ‘NA’ value in any of the arguments will
     cause a value of ‘NA’ to be returned, otherwise ‘NA’ values are
     ignored.

which is clearly not the case (at least for me):

    > x <- c(NaN, NA)
    > prod(x)
    [1] NaN

H.

-- 
Hervé Pagès

Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M1-B514
P.O. Box 19024
Seattle, WA 98109-1024

E-mail: hpa...@fredhutch.org
Phone:  (206) 667-5791
Fax:    (206) 667-1319
Re: [Rd] mean(x) != mean(rev(x)) different with x <- c(NA, NaN) for some builds
Although help("is.nan") says:

  "Computations involving NaN will return NaN or perhaps NA: ..."

it might not be obvious that this is also why one may get:

    > mean(c(-Inf, +Inf, NA))
    [1] NaN
    > mean(c(-Inf, NA, +Inf))
    [1] NA

This is because internally the intermediate sum +Inf + -Inf is NaN in the first case.

May I propose the following patch to that help paragraph:

    Index: src/library/base/man/is.finite.Rd
    ===================================================================
    --- src/library/base/man/is.finite.Rd (revision 72462)
    +++ src/library/base/man/is.finite.Rd (working copy)
    @@ -78,6 +78,8 @@
       Computations involving \code{NaN} will return \code{NaN} or perhaps
       \code{\link{NA}}: which of those two is not guaranteed and may depend
       on the \R platform (since compilers may re-order computations).
    +  This may also apply to computations involving both \code{-Inf} and
    +  \code{+Inf} in cases where they produce an intermediate \code{NaN}.
     }
     \value{
       A logical vector of the same length as \code{x}: \code{dim},

/Henrik

On Fri, Mar 31, 2017 at 10:51 PM, Henrik Bengtsson wrote:
> Thanks; I'm often happy to have contributed to some of the fortune
> counters, but not so sure about this one. What's even worse is that
> one of my own matrixStats NEWS has an entry from a few years back which
> mentions "... incorrectly assumed that the value of prod(c(NaN, NA))
> is uniquely defined. However, as documented in help("is.nan"), it may
> be NA or NaN depending on R system/platform." I guess the joke is on
> me - it's April 1st after all.
>
> But, technically one could test for ISNA(x) for each element before
> calculating the intermediate sum, but since that is a quite expensive
> test it is not done and sum += x is performed "as is" on NA and NaN
> (and -Inf and +Inf). Is that correct?
>
> /Henrik
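The intermediate NaN described above can be observed directly. The sketch below asserts only what IEEE arithmetic and the is.na()/is.nan() documentation guarantee; the two mean() calls are left without expected output because, as the thread notes, their results may depend on the platform.

```r
# -Inf + Inf has no defined value, so the result is NaN.
s <- -Inf + Inf
print(is.nan(s))

# is.na() is TRUE for both NA and NaN; is.nan() is TRUE only for NaN.
both <- c(NA, NaN)
print(is.na(both))
print(is.nan(both))

# Platform dependent: the position of NA relative to the NaN-producing pair
# may decide which missing-value flavour survives.
print(mean(c(-Inf, +Inf, NA)))
print(mean(c(-Inf, NA, +Inf)))
```
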
Re: [Rd] mean(x) != mean(rev(x)) different with x <- c(NA, NaN) for some builds
On Fri, Mar 31, 2017 at 10:14 PM, Prof Brian Ripley wrote:
> From ?NA
>
>      Numerical computations using ‘NA’ will normally result in ‘NA’: a
>      possible exception is where ‘NaN’ is also involved, in which case
>      either might result.
>
> and ?NaN
>
>      Computations involving ‘NaN’ will return ‘NaN’ or perhaps ‘NA’:
>      which of those two is not guaranteed and may depend on the R
>      platform (since compilers may re-order computations).
>
> fortunes::fortune(14) applies (yet again).

Thanks; I'm often happy to have contributed to some of the fortune counters, but not so sure about this one. What's even worse is that one of my own matrixStats NEWS has an entry from a few years back which mentions "... incorrectly assumed that the value of prod(c(NaN, NA)) is uniquely defined. However, as documented in help("is.nan"), it may be NA or NaN depending on R system/platform." I guess the joke is on me - it's April 1st after all.

But, technically one could test for ISNA(x) for each element before calculating the intermediate sum, but since that is a quite expensive test it is not done and sum += x is performed "as is" on NA and NaN (and -Inf and +Inf). Is that correct?

/Henrik

> --
> Brian D. Ripley, rip...@stats.ox.ac.uk
> Emeritus Professor of Applied Statistics, University of Oxford
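At the R level, the per-element test discussed above could be sketched as a pre-check that lets NA win regardless of element order. `na_trumps_mean` is a hypothetical helper, not an existing function, and the extra pass over the data is exactly the cost the message refers to.

```r
# Hypothetical helper: return NA whenever a true NA is present, wherever it
# sits relative to NaN; otherwise defer to mean().
na_trumps_mean <- function(x) {
  has_true_na <- any(is.na(x) & !is.nan(x))   # NA but not NaN
  if (has_true_na) NA_real_ else mean(x)
}

x <- c(NA, NaN)
forward  <- na_trumps_mean(x)
backward <- na_trumps_mean(rev(x))
print(forward)
print(backward)
```
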
Re: [Rd] mean(x) != mean(rev(x)) different with x <- c(NA, NaN) for some builds
From ?NA

     Numerical computations using ‘NA’ will normally result in ‘NA’: a
     possible exception is where ‘NaN’ is also involved, in which case
     either might result.

and ?NaN

     Computations involving ‘NaN’ will return ‘NaN’ or perhaps ‘NA’:
     which of those two is not guaranteed and may depend on the R
     platform (since compilers may re-order computations).

fortunes::fortune(14) applies (yet again).

On 01/04/2017 04:50, Henrik Bengtsson wrote:
> In R 3.3.3, I observe the following on Ubuntu 16.04 (when building
> from source as well as for the sudo apt r-base build):
>
> > x <- c(NA, NaN)
> > mean(x)
> [1] NA
> > mean(rev(x))
> [1] NaN
>
> > rowMeans(matrix(x, nrow = 1, ncol = 2))
> [1] NA
> > rowMeans(matrix(rev(x), nrow = 1, ncol = 2))
> [1] NaN
>
> > .rowMeans(x, m = 1, n = 2)
> [1] NA
> > .rowMeans(rev(x), m = 1, n = 2)
> [1] NaN
>
> > .rowSums(x, m = 1, n = 2)
> [1] NA
> > .rowSums(rev(x), m = 1, n = 2)
> [1] NaN
>
> > rowSums(matrix(x, nrow = 1, ncol = 2))
> [1] NA
> > rowSums(matrix(rev(x), nrow = 1, ncol = 2))
> [1] NaN
>
> I'd expect NA to trump NaN in all cases (with na.rm = FALSE). sum()
> does not have this problem and returns NA in both cases (*).
>
> For the same R version build from source on RHEL 6.6 system
> (completely different architecture), I get the expected result (= NA)
> for all of the above cases, e.g.
>
> > x <- c(NA, NaN)
> > mean(x)
> [1] NA
> > mean(rev(x))
> [1] NA
> [...]
>
> Before going insane trying to troubleshoot this, I have a vague memory
> that this, or something related to this, has been discussed
> previously, but I cannot locate it.
>
> Is the above a bug in R, a FAQ, a build error, overzealous compiler
> optimization, and / or ...?
>
> Thanks,
>
> Henrik

-- 
Brian D. Ripley, rip...@stats.ox.ac.uk
Emeritus Professor of Applied Statistics, University of Oxford
[Rd] mean(x) != mean(rev(x)) different with x <- c(NA, NaN) for some builds
In R 3.3.3, I observe the following on Ubuntu 16.04 (when building from source as well as for the sudo apt r-base build):

    > x <- c(NA, NaN)
    > mean(x)
    [1] NA
    > mean(rev(x))
    [1] NaN

    > rowMeans(matrix(x, nrow = 1, ncol = 2))
    [1] NA
    > rowMeans(matrix(rev(x), nrow = 1, ncol = 2))
    [1] NaN

    > .rowMeans(x, m = 1, n = 2)
    [1] NA
    > .rowMeans(rev(x), m = 1, n = 2)
    [1] NaN

    > .rowSums(x, m = 1, n = 2)
    [1] NA
    > .rowSums(rev(x), m = 1, n = 2)
    [1] NaN

    > rowSums(matrix(x, nrow = 1, ncol = 2))
    [1] NA
    > rowSums(matrix(rev(x), nrow = 1, ncol = 2))
    [1] NaN

I'd expect NA to trump NaN in all cases (with na.rm = FALSE). sum() does not have this problem and returns NA in both cases (*).

For the same R version built from source on a RHEL 6.6 system (completely different architecture), I get the expected result (= NA) for all of the above cases, e.g.

    > x <- c(NA, NaN)
    > mean(x)
    [1] NA
    > mean(rev(x))
    [1] NA
    [...]

Before going insane trying to troubleshoot this, I have a vague memory that this, or something related to this, has been discussed previously, but I cannot locate it.

Is the above a bug in R, a FAQ, a build error, overzealous compiler optimization, and / or ...?

Thanks,

Henrik
Re: [Rd] mean(trim=, c(NA,...), na.rm=FALSE) does not return NA
On Tue, 16 Mar 2010, William Dunlap wrote:

> Both of the following should return NA, but do not in R version 2.11.0
> Under development (unstable) (2010-03-07 r51225) on 32-bit Windows:

Nor in any version of R in the last several years (e.g. 2.1.0)

>    > mean(c(1,10,100,NA), trim=.1)
>    Error in sort.int(x, partial = unique(c(lo, hi))) :
>      index 4 outside bounds
>    > mean(c(1,10,100,NA), trim=.26)
>    [1] 55
>
> With na.rm=TRUE they give the correct results.

But the fix is easy and I've done so in R-devel, thank you.

> (mean() would be so much simpler if we didn't have to worry about the
> seldom-used trim= argument.)

Only a little. I think the drawback is more conceptual: a trimmed mean needs order-able data whereas 'mean' in its usual sense does not.

> Bill Dunlap
> Spotfire, TIBCO Software

-- 
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
[Rd] mean(trim=, c(NA,...), na.rm=FALSE) does not return NA
Both of the following should return NA, but do not in R version 2.11.0 Under development (unstable) (2010-03-07 r51225) on 32-bit Windows:

    > mean(c(1,10,100,NA), trim=.1)
    Error in sort.int(x, partial = unique(c(lo, hi))) :
      index 4 outside bounds
    > mean(c(1,10,100,NA), trim=.26)
    [1] 55

With na.rm=TRUE they give the correct results.

(mean() would be so much simpler if we didn't have to worry about the seldom-used trim= argument.)

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
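The behaviour asked for in this report can be sketched as a trimmed mean that checks for NA before any sorting takes place. `trimmed_mean_sketch` is a made-up name and this is not the code that went into R-devel; it only illustrates the ordering of the checks. As an inference from the numbers rather than something the thread states, the value 55 is what one gets if the NA is dropped during sorting while the trimming indices are still computed from the original length of 4, leaving the mean of 10 and 100.

```r
# Sketch: handle NA first, then trim, so that NA input with na.rm = FALSE
# yields NA instead of an error or a silently wrong value.
trimmed_mean_sketch <- function(x, trim = 0, na.rm = FALSE) {
  if (na.rm) {
    x <- x[!is.na(x)]
  } else if (anyNA(x)) {
    return(NA_real_)
  }
  n <- length(x)
  if (trim > 0 && n > 0) {
    if (trim >= 0.5) return(stats::median(x))
    lo <- floor(n * trim) + 1     # first index kept after trimming
    hi <- n + 1 - lo              # last index kept after trimming
    x <- sort(x)[lo:hi]
  }
  mean(x)
}

x <- c(1, 10, 100, NA)
with_na    <- trimmed_mean_sketch(x, trim = 0.26)
without_na <- trimmed_mean_sketch(x, trim = 0.26, na.rm = TRUE)
print(with_na)
print(without_na)
```
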
[Rd] 'mean' is not reverted in median() as NEWS says (PR#13731)
Full_Name:
Version: 2.9.0
OS: windows, linux
Submission from: (NULL) (128.231.21.125)

In NEWS, it says

  median.default() was altered in 2.8.1 to use sum() rather than mean(), although it was still documented to use mean(). This caused problems for POSIXt objects, for which mean() but not sum() makes sense, so the change has been reverted.

But it's not reverted yet.
Re: [Rd] 'mean' is not reverted in median() as NEWS says (PR#13731)
zheng...@mail.nih.gov wrote:

> Full_Name:
> Version: 2.9.0
> OS: windows, linux
> Submission from: (NULL) (128.231.21.125)
>
> In NEWS, it says
>
>   median.default() was altered in 2.8.1 to use sum() rather than mean(),
>   although it was still documented to use mean(). This caused problems
>   for POSIXt objects, for which mean() but not sum() makes sense, so the
>   change has been reverted.
>
> But it's not reverted yet.

That text is not in the NEWS file for 2.9.0. And the NEWS file that it is in is not for 2.9.0, and does not list that change under CHANGES IN R VERSION 2.9.0.

--
Peter Dalgaard, Øster Farimagsgade 5, Entr.B
Dept. of Biostatistics, PO Box 2099, 1014 Cph. K
University of Copenhagen, Denmark
Ph: (+45) 35327918, FAX: (+45) 35327907
(p.dalga...@biostat.ku.dk)
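The point in the NEWS entry, that mean() makes sense for POSIXt objects while sum() does not, can be checked with a short sketch. The dates are arbitrary illustrative values, and the results assume a reasonably current R in which median.default() averages the two middle values with mean().

```r
# Two arbitrary time points; their midpoint is 2020-01-02 (UTC)
d   <- as.POSIXct(c("2020-01-01", "2020-01-03"), tz = "UTC")
mid <- as.POSIXct("2020-01-02", tz = "UTC")

mean(d) == mid     # averaging two time points is well defined
median(d) == mid   # even-length median: the mean of the two middle values

# Adding two time points has no meaning, so sum() refuses date-times
sum_fails <- inherits(try(sum(d), silent = TRUE), "try-error")
sum_fails
```

A median built on sum(x)/2 would therefore fail for even-length date-time vectors, which is presumably the problem the NEWS entry refers to.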
[Rd] mean (PR#10864)
Full_Name: Paul PONCET
Version: 2.6.0
OS: Windows 2000
Submission from: (NULL) (83.137.240.218)

Function 'mean.default' calls function 'stats::median' if 'trim = 0.5'. In that case the call should be 'stats::median(x, na.rm = na.rm)' instead of 'stats::median(x, na.rm = FALSE)'.
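A quick way to see whether this matters in practice is to compare the two calls directly. The sketch below assumes a reasonably recent R, where (as far as I can tell) mean.default() drops missing values before it reaches the trim branch, so the hard-coded na.rm = FALSE in the inner median() call never sees an NA. The behaviour of 2.6.0 itself is not verified here.

```r
x <- c(1, 2, NA, 10)

# With na.rm = TRUE, a fully trimmed mean should agree with the median
m_trim <- mean(x, trim = 0.5, na.rm = TRUE)
m_med  <- median(x, na.rm = TRUE)
m_trim == m_med

# With na.rm = FALSE the missing value propagates
is.na(mean(x, trim = 0.5))
```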
Re: [Rd] mean mailing list moderator ..
There is a simple solution to this kind of problem. For my non-day-job-related software stuff, I usually subscribe under my SourceForge address. SourceForge's is a simple redirection service, so I cannot actually post from it; but I like incoming e-mails to go through SourceForge for a double spam filter. So e-mails come in through SourceForge, but any replies I send go out under my real address, and in the past those occasionally got held, depending on the mailing list policies.

The solution I found is this: subscribe both addresses, but disable delivery to the real one (this can be done by the user, no admin required). This way I can post from the real one, but receive twice-filtered mailing-list e-mails through an alias.

(For R-devel, I am receiving and posting from my day-job address, if you are wondering...)

Martin Maechler wrote:

> Hi Jari, (and interested readers)
> [...]
Re: [Rd] mean mailing list moderator ..
Hi Jari, (and interested readers)

JO == Jari Oksanen [EMAIL PROTECTED] on Wed, 07 Nov 2007 12:21:10 +0200 writes:

[..]
    [...some very good stuff...]
[..]

    JO> Cheers, Jari Oksanen

    JO> PS. Please Mr Moderator, don't treat me so mean (*): I've subscribed to
    JO> this group although you regularly reject my mail as coming from a
    JO> non-member.

More than a year ago, I had changed R-devel policy to

  1) subscribers can post freely
  2) everything else is on hold for moderator approval +)
  3) ``spam-suspicious e-mails'' are also put on hold.

Now your problem is that you are subscribed under a different e-mail address than the one you are currently sending mail from (and also use in your sig. below). To the mailing list software (mailman) this is equivalent to a non-subscriber.

+) the moderator can **manually** add non-subscriber addresses to a list which is treated as allowed to post and I could do this next time ... but my general attitude is that r-devel subscribers should make these things work...

Best regards,
Martin

    JO> (*) an extract from a classic song Mr R jumped the rabbit.
    JO> --
    JO> Jari Oksanen [EMAIL PROTECTED]
Re: [Rd] ?mean
G'day Gabor,

On Thu, 25 Jan 2007 09:53:49 -0500 Gabor Grothendieck [EMAIL PROTECTED] wrote:

> The help page for mean does not say what happens when one
> applies mean to a matrix.

Well, not directly. :-)

But the help page of mean says that one of the arguments is:

       x: An R object. Currently there are methods for numeric data
          frames, numeric vectors and dates. A complex vector is
          allowed for 'trim = 0', only.

And the `Value' section states:

     For a data frame, a named vector with the appropriate method being
     applied column by column.

     If 'trim' is zero (the default), the arithmetic mean of the values
     in 'x' is computed, as a numeric or complex vector of length one.
     If any argument is not logical (coerced to numeric), integer,
     numeric or complex, 'NA' is returned, with a warning.

Since a matrix is a vector with a dimension attribute, and not a data frame, one can deduce that the second paragraph describes the return value for `mean(x)' when x is a matrix.

As I always tell my students, reading R help pages is a bit of an art. :)

> mean and sd work in an inconsistent way on a matrix
> so that should at least be documented.

Agreed. But it is documented in the help page of sd, which clearly states:

     [...] If 'x' is a matrix or a data frame, a vector of the standard
     deviation of the columns is returned.

I guess you also want to have it documented in the mean help page? But then, should `var' also be mentioned in the mean help page? This command also works in a different and inconsistent manner to mean on matrices.

And, of course, there are other subtle inconsistencies in the language used in these help pages. Note that the mean help page talks about numeric data frames while the help pages of `var' and `sd' talk about data frames only, though all components of the data frame have to be numeric, of course.

> Also there should be a See Also to colMeans since that
> provides the missing column-wise analog to sd.

That's probably a good idea. What would you suggest should be mentioned to provide the column-wise analog of `var'?

Cheers,

Berwin
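To make the distinction concrete, here is a small sketch that sidesteps the built-in conveniences under discussion and spells out the whole-matrix and column-wise computations explicitly. The matrix is an arbitrary example. The explicit forms (colMeans(), apply(), and the diagonal of the covariance matrix) are chosen because they should not depend on how a given R version treats sd() or mean() on matrices and data frames.

```r
m <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 3)   # columns: (1,2,3) and (4,5,6)

mean(m)            # one value over all six elements: 3.5
colMeans(m)        # column-wise means: 2 5

apply(m, 2, sd)    # column-wise standard deviations: 1 1
apply(m, 2, var)   # column-wise variances: 1 1
diag(var(m))       # same variances, read off the covariance matrix
```

One possible answer to the closing question is therefore apply(x, 2, var) or diag(var(x)), since var() on a matrix returns the full covariance matrix rather than a vector.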
[Rd] ?mean
The help page for mean does not say what happens when one applies mean to a matrix.

mean and sd work in an inconsistent way on a matrix so that should at least be documented.

Also there should be a See Also to colMeans since that provides the missing column-wise analog to sd.
Re: [Rd] ?mean
Gabor == Gabor Grothendieck [EMAIL PROTECTED] on Thu, 25 Jan 2007 09:53:49 -0500 writes:

    Gabor> The help page for mean does not say what happens when one
    Gabor> applies mean to a matrix.

    Gabor> mean and sd work in an inconsistent way on a matrix
    Gabor> so that should at least be documented.

You are right (though I think this *was* documented at some point in time).

As a matter of fact, I hate the inconsistencies you've been mentioning, and I think it is very wrong from an S-pedagogical point of view both that

    sd(mat) :== apply(mat, 2, sd)

and

    mean(dfr) :== apply(dfr, 2, mean)

and it just leads to wrong ``analogy conclusions'' by useRs.

I'd vote for deprecating these ``builtin conveniences'' in order to gain consistency and clarity... Though I haven't checked how many CRAN + Bioconductor packages would break if we'd deactivate these two mis-features ...

Martin

    Gabor> Also there should be a See Also to colMeans since that
    Gabor> provides the missing column-wise analog to sd.
Re: [Rd] ?mean
Good point. Perhaps what is needed is a Note clarifying all this in ?mean (unless the software itself is reworked as Martin has discussed). Regarding var(x), one could use sd(x)^2.

On 1/25/07, Berwin A Turlach [EMAIL PROTECTED] wrote:

> G'day Gabor,
> [...]
[Rd] mean relative differences from all.equal() (PR#9276)
Full_Name: Brad Christoffersen
Version: 2.3.1
OS: Windows XP
Submission from: (NULL) (128.196.193.132)

Why is the difference between two numbers so different from the mean relative difference output from the all.equal() function? Is this an artifact of the way R stores numerics? I could not find this problem as I searched through the submitted bugs. But I am brand new to R so I apologize if there is something obvious I'm missing here.

> rm(list=ls(all=TRUE)) ## Remove all objects that could hinder w/ consistent output
> a <- 204
> b <- 203.9792
> all.equal(a,b)
[1] Mean relative difference: 0.0001019608
> a - b
[1] 0.0208

> version
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          3.1
year           2006
month          06
day            01
svn rev        38247
language       R
version.string Version 2.3.1 (2006-06-01)

Thanks,
Brad
Re: [Rd] mean relative differences from all.equal() (PR#9276)
On Thu, 2006-10-05 at 03:10 +0200, [EMAIL PROTECTED] wrote:

> Full_Name: Brad Christoffersen
> Version: 2.3.1
> OS: Windows XP
> Submission from: (NULL) (128.196.193.132)
>
> Why is the difference between two numbers so different from the mean relative
> difference output from the all.equal() function? Is this an artifact of the way
> R stores numerics? I could not find this problem as I searched through the
> submitted bugs. But I am brand new to R so I apologize if there is something
> obvious I'm missing here.
>
> rm(list=ls(all=TRUE)) ## Remove all objects that could hinder w/ consistent output
> a <- 204
> b <- 203.9792
> all.equal(a,b)
> [1] Mean relative difference: 0.0001019608
> a - b
> [1] 0.0208

Read the Details section of ?all.equal, which states:

  Numerical comparisons for scale = NULL (the default) are done by first computing the mean absolute difference of the two numerical vectors. If this is smaller than tolerance or not finite, absolute differences are used, otherwise relative differences scaled by the mean absolute difference.

  If scale is positive, absolute comparisons are made after scaling (dividing) by scale.

Thus on R version 2.4.0 (2006-10-03):

  all.equal(a, b, scale = 1)
  [1] Mean scaled difference: 0.0208

Please do not report doubts about behavior as bugs. Simply post a query on r-help first. If it is a bug, somebody will confirm it and you can then report it as such.

BTW, time to upgrade... Go Wildcats!

HTH,

Marc Schwartz
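For readers who want to reproduce the number by hand, here is a minimal sketch. It assumes the default arguments of all.equal() and scalar inputs, in which case the reported figure is the absolute difference divided by the absolute value of the first argument (the target). The names abs_diff and rel_diff are just illustrative.

```r
a <- 204
b <- 203.9792

abs_diff <- abs(a - b)          # what a - b shows: about 0.0208
rel_diff <- abs_diff / abs(a)   # scaled by the target: about 0.000102

all.equal(a, b)                 # reports the relative figure
rel_diff
```

The two figures differ by a factor of 204 simply because one is absolute and the other is relative to the target. This is unrelated to how R stores numerics.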
Re: [Rd] mean relative differences from all.equal() (PR#9276)
On Wed, 2006-10-04 at 20:22 -0500, Marc Schwartz wrote:

> On Thu, 2006-10-05 at 03:10 +0200, [EMAIL PROTECTED] wrote:
> [...]

[OFFLIST and PRIVATE]

Brad,

A couple of comments.

First, welcome to R. I hope that you enjoy it and find it of value.

If you are not used to open source software and communities (i.e. Linux, etc.), you will find that this community, unlike commercial paid support forums, tends to be direct with respect to comments. Don't take it personally.

Be aware that nobody is getting paid to support R. It is developed and supported on a voluntary basis by a large body of folks, mainly those known as R Core. Some of them have quite literally risked their academic careers and livelihood to facilitate R's existence. You will, over time, get a flavor for the nature of the community and the interchange that takes place.

As a result of the voluntary nature of the community, there is an a priori expectation that you will have put forth reasonable efforts to avail yourself of the various support resources before posting. This applies especially to bug reports, since a member of R Core has to manually manage the handling and resolution of each one.

A good place to start is to review the R Posting Guide:

  http://www.r-project.org/posting-guide.html

which covers many of these issues and how to go about getting support via the various sources provided.

That all being said, you will find that R's support mechanisms and resources are second to none and I would challenge any commercial software vendor to provide a comparable level of support and expertise.

With respect to your specific question above and how the result is obtained:

  (a - b) / a
  [1] 0.0001019608

Here, 'a' is used as the scaling factor, since you only passed single values. If these were 'vectors' of values, the scaling factor would be impacted accordingly.

As a result of R's open source nature, you have access to all of the source code that is R. You can download the source tarball (archive) from one of the CRAN mirrors, if you so desire.

In this case, the actual function that is used is called all.equal.numeric(). This is a consequence of how R uses 'dispatch methods' after a call to a 'generic' function, such as all.equal(). If you are not familiar with these terms, the available R documentation is a good place to start, if you should decide to pursue moving into that level of detail. If you have experience in other programming languages, this may be second nature already.

In many cases, R's functions are written in R itself. Others are written in FORTRAN and/or C that is compiled and linked to R via various calling mechanisms. Since R is an interpreted language, you can have easy access to many of the functions within the R console. Thus, at the R command prompt, you can type:

  all.equal.numeric

[Note without the parens]

which will then display a representation of the function's source code, enabling you to review how the function works.

If you desire to become a better R user/programmer, this approach provides a reasonable way to see how functions are coded and to investigate algorithms and techniques.

I hope that the above is helpful.

Best regards,

Marc
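The generic/method relationship described here can be explored from the console. The sketch below uses methods() and getS3method() from the utils package, which is attached by default in an interactive session. These calls are an addition for illustration and are not mentioned in the message itself.

```r
# List the S3 methods registered for the generic all.equal()
methods(all.equal)

# Fetch the numeric method as a function object; printing it shows its source
f <- getS3method("all.equal", "numeric")
is.function(f)

# Calling the generic with numeric arguments dispatches to that method
identical(all.equal(1, 1 + 1e-10), f(1, 1 + 1e-10))
```

getS3method() also works for methods that are not exported from their package, where typing the bare function name at the prompt would fail.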
Re: [Rd] mean relative differences from all.equal() (PR#9276)
On Wed, 2006-10-04 at 21:57 -0500, Marc Schwartz wrote:

> [...]

My most sincere and public apologies to Brad. The reply message above was mistakenly copied to the list.

Brad, I am sorry.

Marc Schwartz
[Rd] mean(NA) returns -(1+.Machine$integer.max) (PR#9097)
Full_Name: Benjamin Tyner
Version: 2.3.0
OS: linux-gnu (debian)
Submission from: (NULL) (71.98.75.54)

mean(NA) returns -2147483648 on my system, which is -(1+.Machine$integer.max)

> sessionInfo()
Version 2.3.0 (2006-04-24)
i686-pc-linux-gnu

attached base packages:
[1] methods   stats     graphics  grDevices utils     datasets
[7] base
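A short check of the expected behaviour, as a sketch for a current R session. A bare NA is a logical value, and its mean should itself be missing. The number in the report equals the smallest 32-bit integer, which is the bit pattern R uses internally for an integer NA, so the symptom looks like a missing value being read as an ordinary integer. That last part is my interpretation and is not stated in the report.

```r
typeof(NA)                     # "logical"
is.na(mean(NA))                # expected: TRUE

# The value from the report, reconstructed in double precision
reported <- -(1 + .Machine$integer.max)
reported == -2147483648
```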
[Rd] mean of complex vector (PR#8842)
Full_Name: John Peters
Version: 2.3.0
OS: Windows 2000, xp
Submission from: (NULL) (220.233.20.203)

In R2.3.0 on Windows 2000 and xp

> mean(c(1i))
[1] 0+2i
> mean(c(1i,1i))
[1] 0+3i
> mean(c(1i,1i,1i))
[1] 0+4i

OK in R2.2.1
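For comparison, the mean of identical complex values should be that value, whatever the length of the vector. A minimal regression-style check, which can be run on any build to see whether it is affected:

```r
z <- c(1i, 1i, 1i)

mean(z)                # expected: 0+1i
sum(z) / length(z)     # the same quantity computed by hand

# Check every prefix length, mirroring the three calls in the report
ok <- vapply(1:3, function(n) isTRUE(all.equal(mean(z[seq_len(n)]), 1i)),
             logical(1))
ok
```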