Perhaps the original question was only the syntax question about how if()
statements work. That has been answered. (But I’ll add a note that omitting {
} in this situation is a good way to introduce bugs over time, so I generally
avoid doing that.)
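To illustrate the kind of bug I mean, here is a small sketch. The functions clean_bad() and clean_good() are made up for this example and are not from the book. Suppose someone later adds a second line that is meant to run only when na.omit is TRUE. Without braces, only the first statement belongs to the if(), whatever the indentation suggests.

```r
# Hypothetical example: a second line is added under if() without braces.
clean_bad <- function(x, na.omit = FALSE) {
  if (na.omit)
    x <- x[!is.na(x)]
    x <- x[x > 0]   # indented as if conditional, but it ALWAYS runs
  x
}

# Same intent, with braces: both statements are conditional.
clean_good <- function(x, na.omit = FALSE) {
  if (na.omit) {
    x <- x[!is.na(x)]
    x <- x[x > 0]
  }
  x
}

v <- c(-1, 2, 3)
clean_bad(v)    # -1 is dropped even though na.omit = FALSE
clean_good(v)   # v comes back unchanged
```

R ignores indentation, so the two versions differ only in what the braces say.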
But in case the question is motivated by the desire to compute summary
statistics for a data set, the df_stats() function provides a simple way to do
this, and the fBasics package includes functions for skew and kurtosis.
Combining, you can do this:
library(fBasics)    # defines skewness and kurtosis
library(ggformula)  # or library(mosaic)

df_stats(~ Sepal.Length, data = iris, mean, sd, skewness, kurtosis, n = length())
##   mean_Sepal.Length sd_Sepal.Length skewness_Sepal.Length kurtosis_Sepal.Length   n
## 1          5.843333       0.8280661             0.3086407            -0.6058125 150

df_stats(Sepal.Length ~ Species, data = iris, mean, sd, skewness, kurtosis, n = length())
##      Species mean_Sepal.Length sd_Sepal.Length skewness_Sepal.Length kurtosis_Sepal.Length  n
## 1     setosa             5.006       0.3524897            0.11297784            -0.4508724 50
## 2 versicolor             5.936       0.5161711            0.09913926            -0.6939138 50
## 3  virginica             6.588       0.6358796            0.11102862            -0.2032597 50

df_stats(~ Sepal.Length, data = iris)
##   min  Q1 median  Q3 max     mean        sd   n missing
## 1 4.3 5.1    5.8 6.4 7.9 5.843333 0.8280661 150       0
There are options to control how the columns are named if the defaults are too
long for your liking. The results are returned in a data frame, so they are
suitable for downstream work such as plotting with ggformula.
—rjp
On Jun 14, 2019, at 2:37 PM, Christopher David Desjardins wrote:
Hi Chenguang Du,
This is really a better question for R-help as R-Sig-Teaching is about
teaching statistics with R. But …
This function:
mystats <- function(x, na.omit=FALSE){
  if (na.omit)
    x <- x[!is.na(x)]
  m <- mean(x)
  n <- length(x)
  s <- sd(x)
  skew <- sum((x-m)^3/s^3)/n
  kurt <- sum((x-m)^4/s^4)/n - 3
  return(c(n=n, mean=m, stdev=s, skew=skew, kurtosis=kurt))
}
is equivalent to:
mystats <- function(x, na.omit=FALSE){
  if (na.omit){
    x <- x[!is.na(x)]
  }
  m <- mean(x)
  n <- length(x)
  s <- sd(x)
  skew <- sum((x-m)^3/s^3)/n
  kurt <- sum((x-m)^4/s^4)/n - 3
  return(c(n=n, mean=m, stdev=s, skew=skew, kurtosis=kurt))
}
So, the authors of that function wanted the if() statement to apply only
to the single statement that follows it (i.e., x <- x[!is.na(x)]) and
not to the rest of the function. Without braces, if() governs exactly one
expression.
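A quick way to convince yourself is a sketch like the one below. The function f() and its labels are made up for illustration. It records which statements actually ran.

```r
# Hypothetical demo: without braces, if() governs exactly one expression.
f <- function(flag) {
  out <- character(0)
  if (flag)
    out <- c(out, "conditional")   # runs only when flag is TRUE
  out <- c(out, "always")          # runs on every call
  out
}

f(TRUE)    # both statements ran
f(FALSE)   # only the unconditional statement ran
```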
On Fri, Jun 14, 2019 at 1:19 PM Chenguang Du wrote:
I am reading the book R in Action, but I am confused by the following code:
mystats <- function(x, na.omit=FALSE){
  if (na.omit)
    x <- x[!is.na(x)]
  m <- mean(x)
  n <- length(x)
  s <- sd(x)
  skew <- sum((x-m)^3/s^3)/n
  kurt <- sum((x-m)^4/s^4)/n - 3
  return(c(n=n, mean=m, stdev=s, skew=skew, kurtosis=kurt))
}
My question is: when an if control statement is used inside the function, why
is (na.omit) not followed by { }? Why does it still work?
--
Chenguang Du
Ph.D Candidate
Educational Research and Evaluation
School of Education
Virginia Tech
[[alternative HTML version deleted]]
_______________________________________________
[email protected]<mailto:[email protected]> mailing list
https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_r-2Dsig-2Dteaching&d=DwIFaQ&c=4rZ6NPIETe-LE5i2KBR4rw&r=S6U-baLhvGcJ7iUQX_KZ6K2om1TTOeUI_-mjRpTrm00&m=3j96AZXWDZA1eiPxIVqSIQEXW9YNVKW_yfM42D6OBTE&s=jO7Yk0rUYjReb1doAvESno2PNBn6qb4iy01nuet4600&e=