Re: [R] Help with plotting and date-times for climate data

2023-09-12 Thread Rui Barradas

Às 21:50 de 12/09/2023, Kevin Zembower via R-help escreveu:

Hello,

I'm trying to calculate the mean temperature max from a file of climate
date, and plot it over a range of days in the year. I've downloaded the
data, and cleaned it up the way I think it should be. However, when I
plot it, the geom_smooth line doesn't show up. I think that's because
my x axis is characters or factors. Here's what I have so far:

library(tidyverse)

data <- read_csv("Ely_MN_Weather.csv")

start_day = yday(as_date("2023-09-22"))
end_day = yday(as_date("2023-10-15"))

d <- as_tibble(data) %>%

 select(DATE,TMAX,TMIN) %>%
 mutate(DATE = as_date(DATE),
yday = yday(DATE),
md = sprintf("%02d-%02d", month(DATE), mday(DATE))
) %>%
 filter(yday >= start_day & yday <= end_day) %>%
 mutate(md = as.factor(md))

d_sum <- d %>%
 group_by(md) %>%
 summarize(tmax_mean = mean(TMAX, na.rm=TRUE))

## Here's the filtered data:
dput(d_sum)


structure(list(md = structure(1:25, levels = c("09-21", "09-22",

"09-23", "09-24", "09-25", "09-26", "09-27", "09-28", "09-29",
"09-30", "10-01", "10-02", "10-03", "10-04", "10-05", "10-06",
"10-07", "10-08", "10-09", "10-10", "10-11", "10-12", "10-13",
"10-14", "10-15"), class = "factor"), tmax_mean = c(65,
62.2,
61.3, 63.9, 64.3, 60.1, 62.3, 60.5, 61.9,
61.2, 63.7, 59.5, 59.6, 61.6,
59.4, 58.8, 55.9, 58.125,
58, 55.7, 57, 55.4, 49.8,
48.75, 43.7)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -25L))



ggplot(data = d_sum, aes(x = md)) +
 geom_point(aes(y = tmax_mean, color = "blue")) +
 geom_smooth(aes(y = tmax_mean, color = "blue"))
=
My questions are:
1. Why isn't my geom_smooth plotting? How can I fix it?
2. I don't think I'm handling the month and day combination correctly.
Is there a way to encode month and day (but not year) as a date?
3. (Minor point) Why does my graph of tmax_mean come out red when I
specify "blue"?

Thanks for any advice or guidance you can offer. I really appreciate
the expertise of this group.

-Kevin

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Hello,

The problem is that the dates are factors, not real dates. And 
geom_smooth is not interpolating along a discrete axis (the x axis).


Paste a fake year with md, coerce to date and plot.
I have simplified the aes() calls and added a date scale in order to 
make the x axis more readable.


Without the formula and method arguments, geom_smooth will print a 
message, they are now made explicit.




suppressPackageStartupMessages({
  library(dplyr)
  library(ggplot2)
})

d_sum %>%
  mutate(md = paste("2023", md, sep = "-"),
 md = as.Date(md)) %>%
  ggplot(aes(x = md, y = tmax_mean)) +
  geom_point(color = "blue") +
  geom_smooth(
formula = y ~ x,
method = loess,
color = "blue"
  ) +
  scale_x_date(date_breaks = "7 days", date_labels = "%m-%d")



Hope this helps,

Rui Barradas

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] graph in R with grouping letters from the turkey test with agricolae package

2023-09-12 Thread Rui Barradas

Às 16:24 de 12/09/2023, Loop Vinyl escreveu:

I would like to produce the attached graph (graph1) with the R package
agricolae, could someone give me an example with the attached data (data)?

I expect an adapted graph (graph2) with the data (data)

Best regards


__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Hello,

There are no attached graphs, only data.
Can you post the code have you tried?

Hope this helps,

Rui Barradas

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help with plotting and date-times for climate data

2023-09-12 Thread Kevin Zembower via R-help
Hello,

I'm trying to calculate the mean temperature max from a file of climate
date, and plot it over a range of days in the year. I've downloaded the
data, and cleaned it up the way I think it should be. However, when I
plot it, the geom_smooth line doesn't show up. I think that's because
my x axis is characters or factors. Here's what I have so far:

library(tidyverse)

data <- read_csv("Ely_MN_Weather.csv")

start_day = yday(as_date("2023-09-22"))
end_day = yday(as_date("2023-10-15"))
   
d <- as_tibble(data) %>%
select(DATE,TMAX,TMIN) %>%
mutate(DATE = as_date(DATE),
   yday = yday(DATE),
   md = sprintf("%02d-%02d", month(DATE), mday(DATE))
   ) %>%
filter(yday >= start_day & yday <= end_day) %>%
mutate(md = as.factor(md))

d_sum <- d %>%
group_by(md) %>%
summarize(tmax_mean = mean(TMAX, na.rm=TRUE))

## Here's the filtered data:
dput(d_sum)

> structure(list(md = structure(1:25, levels = c("09-21", "09-22", 
"09-23", "09-24", "09-25", "09-26", "09-27", "09-28", "09-29", 
"09-30", "10-01", "10-02", "10-03", "10-04", "10-05", "10-06", 
"10-07", "10-08", "10-09", "10-10", "10-11", "10-12", "10-13", 
"10-14", "10-15"), class = "factor"), tmax_mean = c(65,
62.2, 
61.3, 63.9, 64.3, 60.1, 62.3, 60.5, 61.9, 
61.2, 63.7, 59.5, 59.6, 61.6, 
59.4, 58.8, 55.9, 58.125, 
58, 55.7, 57, 55.4, 49.8, 
48.75, 43.7)), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -25L))
> 
ggplot(data = d_sum, aes(x = md)) +
geom_point(aes(y = tmax_mean, color = "blue")) +
geom_smooth(aes(y = tmax_mean, color = "blue"))
=
My questions are:
1. Why isn't my geom_smooth plotting? How can I fix it?
2. I don't think I'm handling the month and day combination correctly.
Is there a way to encode month and day (but not year) as a date?
3. (Minor point) Why does my graph of tmax_mean come out red when I
specify "blue"?

Thanks for any advice or guidance you can offer. I really appreciate
the expertise of this group.

-Kevin

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] graph in R with grouping letters from the turkey test with agricolae package

2023-09-12 Thread Loop Vinyl
I would like to produce the attached graph (graph1) with the R package
agricolae, could someone give me an example with the attached data (data)?

I expect an adapted graph (graph2) with the data (data)

Best regards
TREAT   REP VAR1VAR2VAR3VAR4
t1  1   16,10,805   0,966   1,449
t1  2   14,10,705   0,846   1,269
t1  3   5,8 0,290,348   0,522
t1  4   4,9 0,245   0,294   0,441
t2  1   30,81,541,848   2,772
t2  2   29  1,451,742,61
t2  3   28,81,441,728   2,592
t2  4   16,60,830,996   1,494
t3  1   31,31,565   1,878   2,817
t3  2   29,61,481,776   2,664
t3  3   28,11,405   1,686   2,529
t3  4   19,50,975   1,171,755
t4  1   21,45   1,0725  1,287   1,9305
t4  2   23,595  1,17975 1,4157  2,12355
t4  3   25,9545 1,2977251,55727 2,335905
t4  4   28,549951,4274975   1,7129972,5694955
t5  1   27,31,365   1,638   2,457
t5  2   13,70,685   0,822   1,233
t5  3   8,1 0,405   0,486   0,729
t5  4   0   0   0   0
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] prop.trend.test

2023-09-12 Thread peter dalgaard
Argh, yes, drats, thanks. 

There will be a matter of an estimated residual error.

So

> coef(summary(ht$lmfit))["score","t value"]*sigma(ht$lmfit)
[1] -2.867913

matches the signes square root of the Chi-square.

Or, likely better (avoid 0 df cases), switch to a Gaussian glm fit and use the 
z stat

> coef(summary(ht$glmfit, dispersion = 1))
   Estimate Std. Error   z value  Pr(>|z|)
(Intercept)  1.02187141 0.03199737 31.936102 8.425506e-224
score   -0.03341563 0.01165155 -2.867913  4.131897e-03

The Estimate (-0.0334) should still make sense as the LPM estimate of the 
regression slope.

- Peter D.


> On 8 Sep 2023, at 12:53 , Rui Barradas  wrote:
> 
> Às 10:06 de 08/09/2023, peter dalgaard escreveu:
>> Yes, this was written a bit bone-headed (as I am allowed to say...)
>> If you look at the code, you will see inside:
>> a <- anova(lm(freq ~ score, data = list(freq = x/n, score = 
>> as.vector(score)),
>> weights = w))
>> and the lm() inside should give you the direction via the sign of the 
>> regression coefficient on "score".
>>  So, at least for now, you could just doctor a copy of the code for your own 
>> purposes, as in
>>  fit <- lm(freq ~ score, data = list(freq = x/n, score = as.vector(score)),
>> weights = w)
>>  a <- anova(fit)
>>  and arrange to return coef(fit)["score"] at the end. Something like 
>> structure(... estimate=c(lpm.slope=coef(fit)["score"]) )
>> (I expect that you might also extract the t-statistic from 
>> coef(summary(fit)) and find that it is the signed square root of the 
>> Chi-square, but I won't have time to test that just now.)
>> -pd
>>> On 8 Sep 2023, at 07:22 , Thomas Subia via R-help  
>>> wrote:
>>> 
>>> Colleagues,
>>> 
>>> Thanks all for the responses.
>>> 
>>> I am monitoring the daily total number of defects per sample unit.
>>> I need to know whether this daily defect proportion is trending upward (a 
>>> bad thing for a manufacturing process).
>>> 
>>> My first thought was to use either a u or a u' control chart for this.
>>> As far as I know, u or u' charts are poor to detect drifts.
>>> 
>>> This is why I chose to use prop.trend.test to detect trends in proportions.
>>> 
>>> While prop.trend.test can confirm the existence of a trend, as far as I 
>>> know, it is left to the user
>>> to determine what direction that trend is.
>>> 
>>> One way to illustrate trending is of course to plot the data and use 
>>> geom_smooth and method lm
>>> For the non-statisticians in my group, I've found that using this method 
>>> along with the p-value of prop.trend.test, makes it easier for the users to 
>>> determine the existence of trending and its direction.
>>> 
>>> If there are any other ways to do this, please let me know.
>>> 
>>> Thomas Subia
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Thursday, September 7, 2023 at 10:31:27 AM PDT, Rui Barradas 
>>>  wrote:
>>> 
>>> 
>>> 
>>> 
>>> 
>>> Às 14:23 de 07/09/2023, Thomas Subia via R-help escreveu:
 
 Colleagues
 
Consider
 smokers  <- c( 83, 90, 129, 70 )
 patients <- c( 86, 93, 136, 82 )
 
prop.trend.test(smokers, patients)
 
Output:
 
Chi-squared Test for Trend inProportions
 
data:  smokers out of patients ,
 
 using scores: 1 2 3 4
 
 X-squared = 8.2249, df = 1, p-value = 0.004132
 
# trend test for proportions indicates proportions aretrending.
 
How does one identify the direction of trending?
# prop.test indicates that the proportions are unequal but doeslittle 
 to indicate trend direction.
 All the best,
 Thomas Subia
 
 
 [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
>>> Hello,
>>> 
>>> By visual inspection it seems that there is a decreasing trend.
>>> Note that the sample estimates of prop.test and smokers/patients are equal.
>>> 
>>> 
>>> smokers  <- c( 83, 90, 129, 70 )
>>> patients <- c( 86, 93, 136, 82 )
>>> 
>>> prop.test(smokers, patients)$estimate
>>> #>prop 1prop 2prop 3prop 4
>>> #> 0.9651163 0.9677419 0.9485294 0.8536585
>>> 
>>> smokers/patients
>>> 
>>> #> [1] 0.9651163 0.9677419 0.9485294 0.8536585
>>> 
>>> plot(smokers/patients, type = "b")
>>> 
>>> 
>>> 
>>> Hope this helps,
>>> 
>>> Rui Barradas
>>> 
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
> Hello,
> 
>