Re: [R] add only the 1st of May with POSIXct
On 29/05/2024 at 07:01, Stefano Sofia wrote:
> Thank you Rui for your code. I basically understood all your suggestions.
>
> I am using an old version of R (version 3.6.3, installed on a server I am
> not allowed to control), and the new pipe operator does not work there.
> I tried to run your code without the "|>" operator, but I get an error
> when I use apply.
>
> Could you please expand your code without the pipe operator?
>
> Thank you again for your help
> Stefano
>
> Stefano Sofia PhD
> Civil Protection - Marche Region - Italy
> Meteo Section, Snow Section
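Both the native pipe `|>` and the `\(x)` function shorthand need R 4.1.0 or later, so neither parses under R 3.6.3. A minimal pipe-free sketch of the same idea, written with nested calls and `function(i)`. The names `in_period` and `in_summer`, and the choice of 00:00:00 and 23:59:59 as period boundaries, are assumptions made here for illustration and are not part of the original reply.

```r
a <- as.POSIXct("2002-11-01", format = "%Y-%m-%d", tz = "Etc/GMT-1")
b <- as.POSIXct("2004-06-01", format = "%Y-%m-%d", tz = "Etc/GMT-1")
mydf <- data.frame(data_POSIX = seq(a, b, by = "1 day"))

# all the years covered by the data
years <- as.integer(format(a, "%Y")):as.integer(format(b, "%Y"))

# period to drop in each year: May 2nd 00:00:00 to October 31st 23:59:59
from <- ISOdatetime(years, 5, 2, 0, 0, 0, tz = "Etc/GMT-1")
to   <- ISOdatetime(years, 10, 31, 23, 59, 59, tz = "Etc/GMT-1")

# one logical column per year, TRUE where a day falls inside that year's period
in_period <- sapply(seq_along(years), function(i) {
  from[i] <= mydf$data_POSIX & mydf$data_POSIX <= to[i]
})
in_summer <- rowSums(in_period) > 0

# keep everything outside the periods, which includes the 1st of May
res <- mydf[!in_summer, , drop = FALSE]
range(res$data_POSIX)
```

Indexing `from[i]` and `to[i]` keeps the values as "POSIXct"; passing a data.frame of date-times through `apply()` would first convert them to a character matrix.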
Re: [R] add only the 1st of May with POSIXct
On 28/05/2024 at 16:23, Stefano Sofia wrote:
> Dear R-list users,
>
> From an initial and a final date I create a sequence of days using POSIXct.
> If this interval covers, in whole or in part, the months from May to
> October, I need to get rid of the days from the 2nd of May to the 31st of
> October:
>
>     a <- as.POSIXct("2002-11-01", format = "%Y-%m-%d", tz="Etc/GMT-1")
>     b <- as.POSIXct("2004-06-01", format = "%Y-%m-%d", tz="Etc/GMT-1")
>     mydf <- data.frame(data_POSIX=seq(
>       as.POSIXct(paste(format(a, "%Y-%m-%d"), "09:00:00", sep=""),
>                  format="%Y-%m-%d %H:%M:%S", tz="Etc/GMT-1"),
>       as.POSIXct(paste(format(b, "%Y-%m-%d"), "09:00:00", sep=""),
>                  format="%Y-%m-%d %H:%M:%S", tz="Etc/GMT-1"),
>       by="1 day"))
>
> If I execute
>
>     as.data.frame(mydf[format(mydf$data_POSIX,"%m") %in%
>       c("11", "12", "01", "02", "03", "04"), ])
>
> the interval will be
> from 2002-11-01 09:00:00 to 2003-04-30 09:00:00 and
> from 2003-11-01 09:00:00 to 2004-04-30 09:00:00
> but I also need
> 2003-05-01 09:00:00 and
> 2004-05-01 09:00:00
>
> How can I solve this problem?
>
> Thank you for your attention and your help
> Stefano

Hello,

First of all, 'a' and 'b' are already objects of class "POSIXct", so you don't need to repeat the code creating them when creating mydf.
As for the question, see the code below.

    a <- as.POSIXct("2002-11-01", format = "%Y-%m-%d", tz = "Etc/GMT-1")
    b <- as.POSIXct("2004-06-01", format = "%Y-%m-%d", tz = "Etc/GMT-1")
    mydf <- data.frame(data_POSIX = seq(a, b, by = "1 day"))

    # get the years from the data
    years <- format(c(a, b), "%Y") |> as.integer()
    # this creates a sequence with all the years
    years <- Reduce(`:`, years)

    # start and end of the period to drop in each year, as "POSIXct"
    # (ISOdate() defaults to 12:00, so the times are set explicitly)
    from <- ISOdatetime(years, 5L, 2L, 0L, 0L, 0L, tz = "Etc/GMT-1")
    to <- ISOdatetime(years, 10L, 31L, 23L, 59L, 59L, tz = "Etc/GMT-1")

    # this logical index flags the dates from May 2nd to October 31st;
    # indexing from[i] and to[i] keeps them as "POSIXct", whereas apply()
    # on a data.frame would first convert them to character
    summer <- seq_along(years) |>
      sapply(\(i) from[i] <= mydf$data_POSIX & mydf$data_POSIX <= to[i]) |>
      rowSums() > 0L

    # keep everything else, including the 1st of May
    mydf[!summer, , drop = FALSE]

Hope this helps,

Rui Barradas

--
This e-mail has been checked for viruses by AVG antivirus software.
www.avg.com

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
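Since the rule is the same every year (drop May 2nd to October 31st), the filter can also be written on the month and day alone, with no list of years. This is an alternative sketch, not the approach of the reply above; the names `md` and `off_season` are made up for the example. It relies on zero-padded "%m%d" strings sorting in calendar order.

```r
a <- as.POSIXct("2002-11-01", format = "%Y-%m-%d", tz = "Etc/GMT-1")
b <- as.POSIXct("2004-06-01", format = "%Y-%m-%d", tz = "Etc/GMT-1")
mydf <- data.frame(data_POSIX = seq(a, b, by = "1 day"))

# month and day as a 4-character string, e.g. "0501" for the 1st of May,
# formatted in the time zone stored with the column
md <- format(mydf$data_POSIX, "%m%d")

# keep January 1st to May 1st and November 1st to December 31st
off_season <- md <= "0501" | md >= "1101"
res <- mydf[off_season, , drop = FALSE]
```

The same two lines work unchanged when the sequence carries a time of day such as 09:00:00, because only the calendar day enters the comparison.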
Re: [R] Print date on y axis with month, day, and year
On 10/05/2024 at 00:58, Sorkin, John wrote:
> I am trying to use ggplot to plot the data, and R code, below. The dates
> (jdate) are printing as Mar 01, Mar 15, etc. I want to have the date
> printed as MMM DD YYYY (or any other way that will show month, day, and
> year, e.g. mm/dd/yy). How can I accomplish this?
>
>     yyy <- structure(list(
>       jdate = structure(c(19052, 19053, 19054, 19055, 19058, 19059,
>         19060, 19061, 19062, 19063, 19065, 19066, 19067, 19068, 19069,
>         19072, 19073, 19074, 19075, 19076, 19077, 19083, 19086, 19087,
>         19088, 19089, 19090, 19093, 19094, 19095), class = "Date"),
>       Sum = c(1, 3, 9, 11, 13, 16, 18, 22, 26, 27, 30, 32, 35, 39, 41,
>         43, 48, 51, 56, 58, 59, 63, 73, 79, 81, 88, 91, 93, 96, 103)),
>       row.names = c(NA, 30L), class = "data.frame")
>     yyy
>     class(yyy$jdate)
>     ggplot(data=yyy[1:30,],aes(as.Date(jdate,format="%m-%d-%Y"),Sum)) +geom_point()
>
> Thank you
> John
>
> John David Sorkin M.D., Ph.D.

Hello,

Since class(yyy$jdate) returns "Date", you have a real date and scale_x_date can handle the printed format; there is no need for an extra as.Date in aes(). Also get rid of the format = "%m-%d-%Y" argument and let scale_x_date take care of formatting the date as you want it displayed.
Either of the two formats below is valid.

    library(ggplot2)

    ggplot(data = yyy[1:30,], aes(jdate, Sum)) +
      geom_point() +
      # scale_x_date(date_labels = "%b %d, %Y")
      scale_x_date(date_labels = "%m/%d/%Y")

Hope this helps,

Rui Barradas
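The `date_labels` strings are strftime-style format codes, the same ones `format()` accepts for a "Date" object, so a label format can be previewed in base R before drawing the plot. A small sketch using the first value of `jdate` (day 19052 counted from 1970-01-01); the variable name `d` is made up for the example.

```r
# first date of the posted data
d <- as.Date(19052, origin = "1970-01-01")

format(d, "%m/%d/%Y")   # four-digit year: "03/01/2022"
format(d, "%m/%d/%y")   # two-digit year:  "03/01/22"
format(d, "%b %d, %Y")  # abbreviated month name, which depends on the locale
```

Whatever string looks right here can be passed unchanged as `date_labels`.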
Re: [R] x[0]: Can '0' be made an allowed index in R?
On 21/04/2024 at 09:08, Rui Barradas wrote:
> I find what you are asking awkward but it can be done with S3 classes.
> Write an extraction method for the new class and in the use case below
> it works.

Sorry, forgot to also define a `[[.zerobased` method. It's probably safer.

    `[[.zerobased` <- function(x, i, ...) {
      i <- i + 1L
      NextMethod()
    }

Hope this helps,

Rui Barradas
Re: [R] x[0]: Can '0' be made an allowed index in R?
On 21/04/2024 at 08:55, Hans W wrote:
> As we all know, in R indices for vectors start with 1, i.e., x[0] is not a
> correct expression. Some algorithms, e.g. in graph theory or
> combinatorics, are much easier to formulate and code if 0 is an allowed
> index pointing to the first element of the vector.
>
> Some programming languages, for instance Julia (where the index for
> normal vectors also starts with 1), provide libraries/packages that allow
> the user to define an index range for its vectors, say 0:9 or 10:20 or
> even negative indices.
>
> Of course, this notation would only be feasible for certain specially
> defined vectors. Is there a library that provides this functionality?
> Or is there a simple trick to do this in R? The expression 'x[0]' must be
> possible; does this mean the syntax of R has to be twisted somehow?
>
> Thanks, Hans W.

Hello,

I find what you are asking awkward but it can be done with S3 classes.
Write an extraction method for the new class and in the use case below it works. The method increments the index before calling NextMethod, the usual extraction function.

    `[.zerobased` <- function(x, i, ...) {
      i <- i + 1L
      NextMethod()
    }
    as_zerobased <- function(x) {
      class(x) <- c("zerobased", class(x))
      x
    }

    x <- 1:10
    y <- as_zerobased(x)

    y[0]
    #> [1] 1
    y[1]
    #> [1] 2
    y[9]
    #> [1] 10
    y[10]
    #> [1] NA

Hope this helps,

Rui Barradas
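Extraction is only half of the picture: with an extraction method alone, `y[0] <- value` still goes through the default replacement, where index 0 selects nothing and the first element is left untouched. A sketch of a matching replacement method follows. `[<-.zerobased` is not part of the original reply, and the sketch assumes that only non-negative numeric indices are used (negative, logical, character and missing indices are not handled).

```r
as_zerobased <- function(x) {
  class(x) <- c("zerobased", class(x))
  x
}

# extraction, as in the reply: shift the index, then use the default method
`[.zerobased` <- function(x, i, ...) {
  i <- i + 1L
  NextMethod()
}

# replacement (assumption, see above): same shift before the default method
`[<-.zerobased` <- function(x, i, ..., value) {
  i <- i + 1L
  NextMethod()
}

y <- as_zerobased(1:10)
y[0] <- 100L   # first element
y[9] <- 900L   # last element
unclass(y)
```

A `[[<-.zerobased` method could be written the same way if single-element replacement with `[[` is needed.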
Re: [R] Exceptional slowness with read.csv
On 08/04/2024 at 06:47, Dave Dixon wrote:
> Greetings,
>
> I have a csv file of 76 fields and about 4 million records. I know that
> some of the records have errors - unmatched quotes, specifically.
> Reading the file with readLines and parsing the lines with
> read.csv(text = ...) is really slow. I know that the first 2459465
> records are good. So I try this:
>
>     > startTime <- Sys.time()
>     > first_records <- read.csv(file_name, nrows = 2459465)
>     > endTime <- Sys.time()
>     > cat("elapsed time = ", endTime - startTime, "\n")
>     elapsed time =  24.12598
>
>     > startTime <- Sys.time()
>     > second_records <- read.csv(file_name, skip = 2459465, nrows = 5)
>     > endTime <- Sys.time()
>     > cat("elapsed time = ", endTime - startTime, "\n")
>
> This appears to never finish. I have been waiting over 20 minutes.
>
> So why would (skip = 2459465, nrows = 5) take orders of magnitude longer
> than (nrows = 2459465)?
>
> Thanks!
>
> -dave
>
> PS: readLines(n=2459470) takes 10.42731 seconds.

Hello,

Can the following function be of help?
After reading the data with the argument quote = "" (so that quotes are not interpreted), call a function that applies gregexpr to the character columns and returns a two-column data.frame with columns

    Col - the column processed;
    Unbalanced - the rows with unbalanced double quotes.

I am assuming the quotes are double quotes. It shouldn't be difficult to adapt it to other cases: single quotes, or both.

    unbalanced_dquotes <- function(x) {
      char_cols <- sapply(x, is.character) |> which()
      lapply(char_cols, \(i) {
        y <- x[[i]]
        # count the double quotes in each string; gregexpr() returns -1
        # when there is no match, so count only the positive positions
        Unbalanced <- gregexpr('"', y) |>
          sapply(\(m) sum(m > 0L)) |>
          (\(n) (n %% 2L) == 1L)() |>
          which()
        data.frame(Col = i, Unbalanced = Unbalanced)
      }) |>
      do.call(rbind, args = _)
    }

    # read the data disregarding quoted strings
    # ('fl' is the name of the csv file)
    df1 <- read.csv(fl, quote = "")
    # determine which strings have unbalanced quotes and
    # where
    unbalanced_dquotes(df1)

Hope this helps,

Rui Barradas
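The same parity check can be run on raw lines before any parsing, which locates suspect records by line number. A minimal sketch on a made-up five-line sample; with a real file `lines` would come from `readLines()`. The names `lines`, `n_quotes` and `suspect` are illustrative. An odd count only marks a line as suspect: a quoted field that legitimately contains a line break would also show up.

```r
# made-up sample; in practice something like: lines <- readLines(file_name)
lines <- c('id,comment',
           '1,"fine"',
           '2,"broken',
           '3,plain',
           '4,"a ""quoted"" word"')

# number of double quotes on each line (0 when there are none)
n_quotes <- lengths(regmatches(lines, gregexpr('"', lines, fixed = TRUE)))

# lines with an odd number of double quotes
suspect <- which(n_quotes %% 2L == 1L)
suspect
```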
Re: [R] Exceptional slowness with read.csv
On 08/04/2024 at 19:42, Ivan Krylov via R-help wrote:
> On Sun, 7 Apr 2024 23:47:52 -0600, Dave Dixon wrote:
>
>> > second_records <- read.csv(file_name, skip = 2459465, nrows = 5)
>
> It may or may not be important that read.csv defaults to header = TRUE.
> Having skipped 2459465 lines, it may attempt to parse the next one as a
> header, so the second call to read.csv() should probably include
> header = FALSE.

This will throw an error; call read.table with sep = "," instead.

> Bert's advice to try scan() is on point, though. It's likely that the
> default-enabled header is not the most serious problem here.

Hope this helps,

Rui Barradas
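A small sketch of the read.table call suggested above, run on a made-up temporary file so that it is self-contained. The file contents and the column names `id` and `name` are assumptions for illustration. `quote` and `comment.char` are set explicitly to the values read.csv uses by default, since read.table's own defaults differ. The sketch shows only the header and skip handling and says nothing about the speed problem.

```r
# made-up file: a header line followed by four records
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,name", "1,a", "2,b", "3,c", "4,d"), tmp)

# skip the header line and the first two records, then read two records;
# with header = FALSE the first line read is treated as data
tail_rows <- read.table(tmp, sep = ",", header = FALSE,
                        skip = 3, nrows = 2,
                        quote = "\"", comment.char = "",
                        col.names = c("id", "name"),
                        stringsAsFactors = FALSE)
tail_rows
```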
Re: [R] Question regarding reservoir volume and water level
On 07/04/2024 at 13:27, javad bayat wrote:
> Dear all;
> I have a question about the water level of a reservoir when the volume
> changes or doubles.
> There is a DEM file with the highest elevation 1267 m. The lowest
> elevation is 1230 m. The current volume of the reservoir is
> 7,000,000 m3 at 1240 m.
> Now I want to know what the volume would be if the water level rises to
> 1250 m, or what the water level would be if the volume doubled
> (14,000,000 m3).
>
> Is there any way to write codes to do this in R?
> I would be more than happy if anyone could help me.
> Sincerely

Hello,

This is a simple rule of three, i.e. it takes the volume to be proportional to the level.
The function returns a volume when given a level and a level when given a volume. If you know the level l the argument doesn't need to be named, but if you know the volume v then it must be named.

    water_level <- function(l, v, level = 1240, volume = 7e6) {
      if(missing(v)) {
        volume * l / level
      } else level * v / volume
    }

    lev <- 1250
    vol <- 14e6

    water_level(l = lev)
    #> [1] 7056452
    water_level(v = vol)
    #> [1] 2480

Hope this helps,

Rui Barradas
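When the DEM itself is available, another option is to add up the water depth over the cells to get the volume at a given level, and to invert that relation numerically for the level at a given volume. The sketch below is only an illustration under stated assumptions: `elev` is a made-up vector of cell elevations, `cell_area` a made-up cell area, and `volume_at()` and `level_for()` are hypothetical helper names. Reading the real DEM into a vector is left out.

```r
# made-up DEM: 1000 cell elevations between 1230 m and 1267 m
elev <- seq(1230, 1267, length.out = 1000)
cell_area <- 900   # made-up cell area in m^2 (30 m x 30 m cells)

# volume (m^3) stored below a given water level: depth summed over wet cells
volume_at <- function(level) sum(pmax(level - elev, 0)) * cell_area

# water level (m) that holds a given volume, found by root search
level_for <- function(volume) {
  uniroot(function(l) volume_at(l) - volume,
          interval = range(elev), tol = 1e-8)$root
}

v1240 <- volume_at(1240)   # volume at the current level
volume_at(1250)            # volume if the level rises to 1250 m
level_for(2 * v1240)       # level if the volume doubles
```

Because the flooded area grows as the water rises, doubling the volume in this sketch raises the level by less than the rule of three suggests.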
Re: [R] Output of tapply function as data frame: Problem Fixed
On 29/03/2024 at 01:43, Ogbos Okike wrote:
> Dear Rui,
> Thanks again for resolving this. I have already started using the version
> that works for me.
>
> But to clarify the second part, please let me paste what I did and the
> error message:
>
>     set.seed(2024)
>     data <- data.frame(
>     +   Date = sample(seq(Sys.Date() - 5, Sys.Date(), by = "1 days"), 100L,
>     +                 TRUE),
>     +   count = sample(10L, 100L, TRUE)
>     + )
>
>     # coerce tapply's result to class "data.frame"
>     res <- with(data, tapply(count, Date, mean)) |> as.data.frame()
>     Error: unexpected '>' in "res <- with(data, tapply(count, Date, mean)) |>"
>     # assign a dates column from the row names
>     res$Date <- row.names(res)
>     Error in row.names(res) : object 'res' not found
>     # cosmetics
>     names(res)[2:1] <- names(data)
>     Error in names(res)[2:1] <- names(data) : object 'res' not found
>     # note that the row names are still tapply's names vector
>     # and that the columns order is not Date/count. Both are fixed
>     # after the calculations.
>     res
>
> You can see that the error message is on the pipe. Please, let me know
> what I am missing.
> Thanks.
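The parse error `unexpected '>'` is consistent with an R version older than 4.1.0, where the native pipe `|>` does not exist yet. A pipe-free sketch that should also run on older versions is given below. The four-row data set is made up for illustration, and the result is built with `data.frame()` from the names and values of tapply's output instead of `as.data.frame()`.

```r
# small made-up data set with the same column names as in the thread
data <- data.frame(
  Date  = as.Date(c("2024-03-22", "2024-03-22", "2024-03-23", "2024-03-23")),
  count = c(2L, 4L, 5L, 9L)
)

# step 1: group means; the result is a named one-dimensional array
AB <- with(data, tapply(count, Date, mean))

# step 2: names become the first column, values the second
res <- data.frame(Date = names(AB), count = as.vector(AB))
res
```

Each step is an ordinary assignment, so there is nothing version-specific left to parse.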
Re: [R] Output of tapply function as data frame: Problem Fixed
On 27/03/2024 at 08:58, Ogbos Okike wrote:
> Dear Rui,
> Nice to hear from you!
>
> I am sorry for the omission and I have taken note.
>
> Many thanks for responding. The second solution looks elegant as it
> quickly resolved the problem.
>
> Please, take a second look at the first solution. It refused to run.
> It looks as if the pipe is not properly positioned. Efforts to correct it
> and get it to run failed. If you can look further, it would be great. If
> time does not permit, I am fine too.
>
> But having the two solutions will certainly make the subject more
> interesting.
> Thank you so much.
> With warmest regards from
> Ogbos

Hello,

This pipe?

    with(data, tapply(count, Date, mean)) |> as.data.frame()

I am not seeing anything wrong with it. I have tried it again just now and it runs with no problems, like it had before.
A solution is not to pipe but to separate the instructions.
Re: [R] Output of tapply function as data frame
Às 04:30 de 27/03/2024, Ogbos Okike escreveu: Warm greetings to you all. Using the tapply function below: data<-read.table("FD1month",col.names = c("Dates","count")) x=data$count f<-factor(data$Dates) AB<- tapply(x,f,mean) I made a simple calculation. The result, stored in AB, is of the form below. But an effort to write AB to a file as a data frame fails. When I use the write table, it only produces the count column and strip of the first column (date). 2005-11-01 2005-12-01 2006-01-01 2006-02-01 2006-03-01 2006-04-01 2006-05-01 -4.106887 -4.259154 -5.836090 -4.756757 -4.118011 -4.487942 -4.430705 2006-06-01 2006-07-01 2006-08-01 2006-09-01 2006-10-01 2006-11-01 2006-12-01 -3.856727 -6.067103 -6.418767 -4.383031 -3.985805 -4.768196 -10.072579 2007-01-01 2007-02-01 2007-03-01 2007-04-01 2007-05-01 2007-06-01 2007-07-01 -5.342338 -4.653128 -4.325094 -4.525373 -4.574783 -3.915600 -4.127980 2007-08-01 2007-09-01 2007-10-01 2007-11-01 2007-12-01 2008-01-01 2008-02-01 -3.952150 -4.033518 -4.532878 -4.522941 -4.485693 -3.922155 -4.183578 2008-03-01 2008-04-01 2008-05-01 2008-06-01 2008-07-01 2008-08-01 2008-09-01 -4.336969 -3.813306 -4.296579 -4.575095 -4.036036 -4.727994 -4.347428 2008-10-01 2008-11-01 2008-12-01 -4.029918 -4.260326 -4.454224 But the normal format I wish to display only appears on the terminal, leading me to copy it and paste into a text file. That is, when I enter AB on the terminal, it returns a format in the form: 008-02-01 -4.183578 2008-03-01 -4.336969 2008-04-01 -3.813306 2008-05-01 -4.296579 2008-06-01 -4.575095 2008-07-01 -4.036036 2008-08-01 -4.727994 2008-09-01 -4.347428 2008-10-01 -4.029918 2008-11-01 -4.260326 2008-12-01 -4.454224 Now, my question: How do I write out two columns displayed by AB on the terminal to a file? I have tried using AB<-data.frame(AB) but it doesn't work either. Many thanks for your time. 
Ogbos [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hello, The main trick is to pipe to as.data.frame. But the result will have one column only, you must assign the dates from the df's row names. I also include an aggregate solution. # create a test data set set.seed(2024) data <- data.frame( Date = sample(seq(Sys.Date() - 5, Sys.Date(), by = "1 days"), 100L, TRUE), count = sample(10L, 100L, TRUE) ) # coerce tapply's result to class "data.frame" res <- with(data, tapply(count, Date, mean)) |> as.data.frame() # assign a dates column from the row names res$Date <- row.names(res) # cosmetics names(res)[2:1] <- names(data) # note that the row names are still tapply's names vector # and that the columns order is not Date/count. Both are fixed # after the calculations. res #> count Date #> 2024-03-22 5.416667 2024-03-22 #> 2024-03-23 5.50 2024-03-23 #> 2024-03-24 6.00 2024-03-24 #> 2024-03-25 4.476190 2024-03-25 #> 2024-03-26 6.538462 2024-03-26 #> 2024-03-27 5.20 2024-03-27 # fix the columns' order res <- res[2:1] # better all in one instruction aggregate(count ~ Date, data, mean) #> Datecount #> 1 2024-03-22 5.416667 #> 2 2024-03-23 5.50 #> 3 2024-03-24 6.00 #> 4 2024-03-25 4.476190 #> 5 2024-03-26 6.538462 #> 6 2024-03-27 5.20 Also, I'm glad to help as always but Ogbos, you have been an R-Help contributor for quite a while, please post data in dput format. Given the problem the output of the following is more than enough. dput(head(data, 20L)) Hope this helps, Rui Barradas -- Este e-mail foi analisado pelo software antivírus AVG para verificar a presença de vírus. 
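The question in this thread was how to get both columns (date and mean) into a file. A minimal sketch without the pipe operator, so it should also run on older R versions: build the two columns explicitly from the names and values of the tapply() result, then write them out. The toy data and the temporary output file are assumptions for illustration, not the poster's "FD1month" file.

```r
# Toy data standing in for the poster's input file (values are made up).
data <- data.frame(
  Dates = rep(c("2005-11-01", "2005-12-01", "2006-01-01"), each = 2L),
  count = c(-4, -5, -3, -6, -5, -7)
)

# tapply() returns a named 1-d array: the names are the dates,
# the values are the group means.
AB <- tapply(data$count, factor(data$Dates), mean)

# Build the two columns explicitly instead of relying on row names.
res <- data.frame(Dates = names(AB), count = as.vector(AB))

# Write both columns; row.names = FALSE avoids a redundant index column.
out_file <- tempfile(fileext = ".txt")   # hypothetical output location
write.table(res, out_file, row.names = FALSE, quote = FALSE)

# Read the file back to confirm date and mean sit side by side.
back <- read.table(out_file, header = TRUE)
```

The design choice here is to take the dates from names(AB) directly, which avoids the extra step of copying them out of the row names afterwards.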
Re: [R] Problem with R coding
Às 07:43 de 12/03/2024, Maria Del Mar García Zamora escreveu: Hello, This is the error that appears when I try to load library(Rcmdr). I am using R version 4.3.3. I have tried to upload the packages, uninstall them and intalling them again and nothing. Loading required package: splines Loading required package: RcmdrMisc Loading required package: car Loading required package: carData Loading required package: sandwich Loading required package: effects lattice theme set by effectsTheme() See ?effectsTheme for details. Error: package or namespace load failed for ‘Rcmdr’: .onLoad failed in loadNamespace() for 'tcltk2', details: call: file.exists("~/.Rtk2theme") error: file name conversion problem -- name too long? Once this appears I use path.expand('~') and this is R's answer: [1] "C:\\Users\\marga\\OneDrive - Fundaci\xf3n Universitaria San Pablo CEU\\Documentos" The thing is that in spanish we use accents, so this word (Fundaci\xf3n) really is Fundación, but I can't change it. I have tried to start R from CDM using: C:\Users\marga>set R_USER=C:\Users\marga\R_USER C:\Users\marga>"C:\Users\marga\Desktop\R-4.3.3\bin\R.exe" CMD Rgui At the beginning this worked but right now a message saying that this app cannot be used and that I have to ask the software company (photo attached) What should I do? Thanks, Mar [https://www.uchceu.es/img/externos/correo/ceu_uch.gif]<https://www.uchceu.es/> Maria Del Mar García Zamora Alumno UCHCEU - Universidad CEU Cardenal Herrera - Tel. 
Hello,

First of all, try running Rgui only, no R.exe CMD. Just

    Rgui.exe

or

    C:\Users\marga\Desktop\R-4.3.3\bin\Rgui.exe

Then, in Rgui, try loading Rcmdr

    library(Rcmdr)

Also, do you have R in your Windows PATH variable? The directory to put in PATH should be

    C:\Users\marga\Desktop\R-4.3.3\bin

so that Windows can find R.exe and Rgui.exe without the full path name.

Hope this helps,

Rui Barradas
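Since the poster suspects the accented character in the home directory, a quick check from inside R can tell whether a given path contains anything outside plain ASCII. This is only a diagnostic sketch; has_non_ascii is a hypothetical helper name, and whether the accent is really the cause of the tcltk2 error is not confirmed in the thread.

```r
# Returns TRUE when a string contains characters outside plain ASCII.
# iconv() yields NA for input it cannot represent in the target encoding.
has_non_ascii <- function(x) {
  is.na(iconv(enc2utf8(x), from = "UTF-8", to = "ASCII"))
}

home <- path.expand("~")
has_non_ascii(home)                       # result depends on the machine

has_non_ascii("C:/Users/marga/R_USER")    # a plain ASCII path
has_non_ascii("Fundaci\u00f3n")           # contains an accented "o"
```

If the home path turns out to be the problem, pointing R_USER at an ASCII-only directory, as the poster already tried, is one way around it.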
Re: [R] help - Package: stats - function ar.ols
Às 16:34 de 22/02/2024, Pedro Gavronski. escreveu: Hello, My name is Pedro and it is nice to meet you all. I am having trouble understanding a message that I receive when use function ar.ols from package stats, it says that "Warning message: In ar.ols(x = dtb[2:6966, ], demean = FALSE, intercept = TRUE, prewhite = TRUE) : model order: 2 singularities in the computation of the projection matrix results are only valid up to model order 1, which I do not know what it means, if someone could clarify it, I would really appreciate it. Attached to this email you will find my code and data I used to run this formula. Thanks in advance. Best regards, Pedro. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hello, Thanks for the data but the code is missing from the attachment. Can you please post your code? In an attachment or directly in the e-mail body. Rui Barradas -- Este e-mail foi analisado pelo software antivírus AVG para verificar a presença de vírus. www.avg.com __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Looping
Às 03:27 de 19/02/2024, Steven Yen escreveu: I need to read csv files repeatedly, named data1.csv, data2.csv,… data24.csv, 24 altogether. That is, data<-read.csv(“data1.csv”) … data<-read.csv(“data24.csv”) … Is there a way to do this in a loop? Thank you. Steven from iPhone [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hello, Here is a way of reading the files in a *apply loop. The file names are created by getting them from file (list.files) or by a string editing function (sprintf). # file_names_vec <- list.files(pattern = "data\\d+\\.csv") file_names_vec <- sprintf("data%d.csv", 1:24) data_list <- sapply(file_names_vec, read.csv, simplify = FALSE) # access the 1st data.frame data_list[[1L]] # same as above data_list[["data1.csv"]] # same as above data_list$data1.csv Hope this helps, Rui Barradas -- Este e-mail foi analisado pelo software antivírus AVG para verificar a presença de vírus. www.avg.com __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
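For readers who prefer an explicit loop over sapply, the same idea can be sketched with for. The temporary directory and the three small files are assumptions so that the example runs anywhere; with real data the file names would come from sprintf("data%d.csv", 1:24) as in the reply above.

```r
# Create three small CSV files in a temporary directory so the sketch runs
# anywhere; in practice the files data1.csv ... data24.csv already exist.
dir_tmp <- file.path(tempdir(), "csv_demo")   # hypothetical location
dir.create(dir_tmp, showWarnings = FALSE)
file_names_vec <- file.path(dir_tmp, sprintf("data%d.csv", 1:3))
for (i in seq_along(file_names_vec)) {
  write.csv(data.frame(id = i, value = i * 10), file_names_vec[i],
            row.names = FALSE)
}

# Read each file in an explicit for loop, storing the data frames in a list
# rather than overwriting a single 'data' object on every pass.
data_list <- vector("list", length(file_names_vec))
names(data_list) <- basename(file_names_vec)
for (i in seq_along(file_names_vec)) {
  data_list[[i]] <- read.csv(file_names_vec[i])
}

# If all files share the same columns, they can be stacked into one data frame.
all_data <- do.call(rbind, data_list)
```

Keeping the results in a named list means each file stays accessible by name, for example data_list[["data2.csv"]].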
Re: [R] Packages sometimes don't update, but no error or warning is thrown
Às 10:50 de 14/02/2024, Martin Maechler escreveu: Berwin A Turlach on Wed, 14 Feb 2024 11:47:41 +0800 writes: Berwin A Turlach on Wed, 14 Feb 2024 11:47:41 +0800 writes: > G'day Philipp, > On Tue, 13 Feb 2024 09:59:17 +0100 gernophil--- via R-help > wrote: >> this question is related to this >> (https://community.rstudio.com/t/packages-are-not-updating/166214/3), >> [...] >> To sum it up: If I am updating packages (be it via >> Bioconductor or CRAN) some packages simply don’t update, >> [...] >> I would expect any kind of message that the package will >> not be updated, since no newer binary is available or a >> prompt, if I want to compile from source. > RStudio is doing its own thing for some task, including > 'install.packages()' (and for some reasons, at least on > the platforms on which I use RStudio, RStudio calls > 'install.packages()' and not 'update.packages()' when an > update is requested via the GUI). See: RStudio> install.packages > function (...) .rs.callAs(name, hook, original, ...) > > compared to: R> install.packages > function (pkgs, lib, repos = getOption("repos"), > contriburl = contrib.url(repos, type), method, available = > NULL, destdir = NULL, dependencies = NA, type = > getOption("pkgType"), configure.args = > getOption("configure.args"), configure.vars = > getOption("configure.vars"), clean = FALSE, Ncpus = > getOption("Ncpus", 1L), verbose = getOption("verbose"), > libs_only = FALSE, INSTALL_opts, quiet = FALSE, > keep_outputs = FALSE, ...) { [...] > So if you use Install/Update in the Packages tab of > RStudio and do not experience the behaviour you are > expecting, it is something that you need to discuss with > Posit, not with R. :) >> However, the only message I get is: ``` trying URL >> '' > The package name has the version number encoded in it, so > theoretical you should be able to tell at this point > whether the package that is downloaded is the version that > is already installed, hence no update will happen. 
> Best wishes, > Berwin Yes, thank's a lot, Berwin. Indeed I've raised the fact that RStudio hides R's own install.packages() from the user and uses its own, undocumented one ... this has been the case for quite a few years. I found out during teaching --- one of the few times, I use RStudio to use R... in another case where RStudio's install.packages() behaved differently than R's. I'm pretty sure this is reason for quite a bit of confusion... Martin __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hello, From within RStudio you can always run the qualified names utils::install.packages() utils::update.packages() or run from the command line. Hope this helps, Rui Barradas -- Este e-mail foi analisado pelo software antivírus AVG para verificar a presença de vírus. www.avg.com __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
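A small check, offered as a sketch, for whether the install.packages found on the search path is the one from utils. The environment name reported for an IDE's wrapper is not something this example asserts; only the utils side is certain.

```r
# Name of the environment that owns the function found on the search path.
# In plain R this is "utils"; a front end that installs its own wrapper
# may report something else (not verified here).
environmentName(environment(install.packages))

# The namespace-qualified function always comes from the utils package.
environmentName(environment(utils::install.packages))

# TRUE when nothing on the search path masks the utils version.
identical(install.packages, utils::install.packages)
```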
Re: [R] Packages sometimes don't update, but no error or warning is thrown
Hello,

Not exactly an answer, just a thought: whenever I have problems updating or installing packages from within RStudio, I close RStudio, write a script with the install.packages() call and run it from a command window.

    R -q -f "instscript.R"

This often works better, and it also works with Bioconductor's BiocManager::install or with remotes'/devtools' install_github.

Hope this helps,

Rui Barradas
Re: [R] gathering denominator under frac
On 02/02/2024 at 10:01, Troels Ring wrote:

> Hi friends - I'm plotting a ratio of bicarbonates in ggplot2 and
>
>     ylab(expression(paste(frac("additive BIC","true BIC"))))
>
> worked OK - but now I have been asked to put the chemistry instead - so I wrote
>
>     ylab(expression(paste(frac("additive",HCO[3]^"-","true",HCO[3]^"-"))))
>
> and frac saw that as additive = numerator and HCO3- = denominator, and the rest was ignored. So how do I make frac ignore the first "," and print the fraction as I want?
>
> All best wishes
> Troels

Hello,

This seems to work. Instead of separating the two numerator strings with a comma, separate them with a tilde. The same goes for the denominator. And there is no need for double quotes around "additive" and "true".

    library(ggplot2)

    g <- ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
      geom_point()

    g + ylab(expression(paste(frac(
      additive~HCO[3]^"-",
      true~HCO[3]^"-"
    ))))

Hope this helps,

Rui Barradas
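To see why the tilde helps, it can be useful to count the arguments that frac actually receives in each version. The sketch below only inspects the unevaluated calls in base R; it draws nothing and does not need ggplot2.

```r
# With commas, frac() receives four arguments. This matches the poster's
# observation that only the first two were used.
bad <- quote(frac("additive", HCO[3]^"-", "true", HCO[3]^"-"))
length(bad) - 1L    # number of arguments passed to frac

# With tildes, each side is one expression, so frac() gets two arguments.
good <- quote(frac(additive ~ HCO[3]^"-", true ~ HCO[3]^"-"))
length(good) - 1L

# The label can then be handed to ylab() in ggplot2 or to base graphics.
lab <- as.expression(good)
```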
Re: [R] Need help testing a problem
ackages: [1] rerddap_1.1.0 loaded via a namespace (and not attached): [1] vctrs_0.6.3 cli_3.6.1 rlang_1.1.1 ncdf4_1.22 [5] crul_1.4.0generics_0.1.3jsonlite_1.8.7 data.table_1.14.8 [9] glue_1.6.2httpcode_0.3.0triebeard_0.4.1 fansi_1.0.5 [13] rappdirs_0.3.3tibble_3.2.1 hoardr_0.5.4 lifecycle_1.0.4 [17] compiler_4.3.2dplyr_1.1.3 Rcpp_1.0.12 pkgconfig_2.0.3 [21] digest_0.6.33 R6_2.5.1 tidyselect_1.2.0 utf8_1.2.4 [25] pillar_1.9.0 curl_5.2.0magrittr_2.0.3urltools_1.7.3 [29] xml2_1.3.5 > So there was an unspecified error, an error without a condition message and no call expression. I find this stranger, a call like the following is expected. tryCatch(stop("error"), error = function(e) e) |> str() List of 2 $ message: chr "error" $ call : language doTryCatch(return(expr), name, parentenv, handler) - attr(*, "class")= chr [1:3] "simpleError" "error" "condition" Function tabledap doesn't seem to be handling errors properly. Hope this helps, Rui Barradas -- Este e-mail foi analisado pelo software antivírus AVG para verificar a presença de vírus. www.avg.com __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
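As an illustration of the point about condition objects, here is a sketch of a wrapper that still reports something useful when an error arrives with an empty message. safe_run and the "run_failure" class are hypothetical names for this example and are not part of rerddap.

```r
# A well-formed error condition carries a message and (usually) a call.
cond <- tryCatch(stop("error"), error = function(e) e)
inherits(cond, "error")
conditionMessage(cond)

# A defensive wrapper: keep the message, or use a placeholder when it is empty.
safe_run <- function(expr) {
  tryCatch(expr, error = function(e) {
    msg <- conditionMessage(e)
    if (!nzchar(msg)) msg <- "<error with empty message>"
    structure(list(message = msg), class = "run_failure")
  })
}

ok  <- safe_run(sqrt(16))   # no error, the value is returned unchanged
bad <- safe_run(stop(""))   # error with an empty message
```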
Re: [R] ggplot 3-dimensions
Às 09:13 de 17/12/2023, SIBYLLE STÖCKLI via R-help escreveu: Dear R community In the meantime I made some progress: ggplot(data = Fig2b, aes(x = BFF, y = Wert, fill = Effekt))+theme_bw()+ geom_bar(stat = "identity", width = 0.95) + scale_y_continuous(limits=c(0,13), expand=c(0,0))+ facet_wrap(~Aspekt, strip.position = "bottom", scales = "free_x") + theme(panel.spacing = unit(0, "lines"), strip.background = element_blank(), strip.placement = "outside")+ theme(axis.title.x=element_blank())+ scale_fill_manual("Effekt", values = c("Neg" = "red", "Neu" = "darkgrey", "Pos" = "blue"), labels=c("Negativ", "Nicht sign.", "Positiv")) Question - Is it possible to present all the subpolots in one graph (not to "lines")? - I tried to change the angel of the x-axis. However, I was able to change the first x-axis (BB...), but not the second one (Voegel). Maybe this would solve the problem. - If not, is there another possibility to fix the number of subplots per line? Kind regards Sibylle -Original Message- From: R-help On Behalf Of SIBYLLE STÖCKLI via R-help Sent: Saturday, December 16, 2023 12:16 PM To: R-help@r-project.org Subject: [R] ggplot 3-dimensions Dear R-user Does anybody now, if ggplot allows to use two x-axis including two dimensions (similar to excel plot (picture 1 in the pdf attachmet). If yes, how should I adapt my code? The parameters are presented in the input file (attachment: Input). 
Fig2b = read.delim("BFF_Fig-2b.txt", na.strings="NA") names(Fig2b) head(Fig2b) summary(Fig2b) str(Fig2b) Fig2b$Aspekt<-factor(Fig2b$Aspekt, levels=(c("Voegel", "Kleinsaeuger", "Schnecken", "Regenwuermer_Asseln", "Pilze"))) ### Figure 2b ggplot(Fig2b,aes(Aspekt,Wert,fill=Effekt))+ geom_bar(stat="identity",position='fill')+ scale_y_continuous(limits=c(0,14), expand=c(0,0))+ labs(x="", y="Anzahl Studien pro Effekt") Kind regards Sibylle __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hello, You are posting the data as image once again, please don't do this. Paste the output of dput(Fig2b)# if small data dput(head(Fig2b, 20)) # if too big to fit in an e-mail in your mails. Here it is. Aspekt <- c("Flora", "Flora", "Flora", "Tagfalter", "Tagfalter", "Tagfalter", "Heuschre", "Heuschre", "Heuschre", "Kaefer_Sp", "Kaefer_Sp", "Kaefer_Sp", "Schwebfli", "Schwebfli", "Schwebfli", "Bienen_F", "Bienen_F", "Bienen_F") Aspekt <- c(Aspekt, Aspekt) BFF <- rep(c("BB", "SA", "NE"), times = 12) Effekt <- c(rep("Neg", times = 18), rep("Pos", times = 18)) Wert <- c(0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 1, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 2, 1, 0, 0, 1, 0, 9, 4, 6, 0, 0, 3, 0, 0, 4) Fig2b <- data.frame(Aspekt, BFF, Effekt, Wert) As for the question, you can use facet_wrap argument nrow to have all plots in one row only, see the comment before facet_wrap. I don't know if this solves the problem. Also, I define a custom theme to make the code clearer later. 
library(ggplot2) theme_sibylle <- function() { theme_bw(base_size = 10) %+replace% theme( panel.spacing = unit(0, "lines"), strip.background = element_blank(), strip.placement = "outside", # this line was added by me, remove if not wanted strip.text.x.bottom = element_text(face = "bold", size = 10), axis.title.x = element_blank() ) } ggplot(data = Fig2b, aes(x = BFF, y = Wert, fill = Effekt)) + geom_bar(stat = "identity", width = 0.95) + scale_y_continuous(limits=c(0,13), expand=c(0,0)) + # here I use nrow = 1L to put everything in one row only facet_wrap(~ Aspekt, nrow = 1L, strip.position = "bottom", scales = "free_x") + scale_fill_manual( name = "Effekt", values = c("Neg" = "red", "Neu" = "darkgrey", "Pos" = "blue"), labels = c("Negativ", "Nicht sign.", "Positiv")) + theme_sibylle() Hope this helps, Rui Barradas -- Este e-mail foi analisado pelo software antivírus AVG para verificar a presença de vírus. www.avg.com __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] ggplot2: Get the regression line with 95% confidence bands
Às 00:36 de 13/12/2023, Robert Baer escreveu: coord_cartesian also seems to work for y, and including the breaks = . How about: df=data.frame(year= c(2012,2015,2018,2022), score=c(495,493, 495, 474)) ggplot(df, aes(x = year, y = score)) + geom_point() + geom_smooth(method = "lm", formula = y ~ x) + labs(title = "Standard linear regression for France", x = "Year", y = "PISA score in mathematics") + coord_cartesian(ylim=c(470,500)) + scale_x_continuous(breaks = 2012:2022) On 12/12/2023 3:19 PM, varin sacha via R-help wrote: Dear Ben, Dear Daniel, Dear Rui, Dear Bert, Here below my R code. I really appreciate all your comments. My R code is perfectly working but there is still something I would like to improve. The X-axis is showing 2012.5 ; 2015.0 ; 2017.5 ; 2020.0 I would like to see on X-axis only the year (2012 ; 2015 ; 2017 ; 2020). How to do? # library(ggplot2) df=data.frame(year= c(2012,2015,2018,2022), score=c(495,493, 495, 474)) ggplot(df, aes(x = year, y = score)) + geom_point() + geom_smooth(method = "lm", formula = y ~ x) + labs(title = "Standard linear regression for France", x = "Year", y = "PISA score in mathematics") + scale_y_continuous(limits=c(470,500),oob=scales::squish) # Le lundi 11 décembre 2023 à 23:38:06 UTC+1, Ben Bolker a écrit : On 2023-12-11 5:27 p.m., Daniel Nordlund wrote: On 12/10/2023 2:50 PM, Rui Barradas wrote: Às 22:35 de 10/12/2023, varin sacha via R-help escreveu: Dear R-experts, Here below my R code, as my X-axis is "year", I must be missing one or more steps! I am trying to get the regression line with the 95% confidence bands around the regression line. Any help would be appreciated. Best, S. 
# library(ggplot2) df=data.frame(year=factor(c("2012","2015","2018","2022")), score=c(495,493, 495, 474)) ggplot(df, aes(x=year, y=score)) + geom_point( ) + geom_smooth(method="lm", formula = score ~ factor(year), data = df) + labs(title="Standard linear regression for France", y="PISA score in mathematics") + ylim(470, 500) # __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hello, I don't see a reason why year should be a factor and the formula in geom_smooth is wrong, it should be y ~ x, the aesthetics envolved. It still doesn't plot the CI's though. There's a warning and I am not understanding where it comes from. But the regression line is plotted. ggplot(df, aes(x = as.numeric(year), y = score)) + geom_point() + geom_smooth(method = "lm", formula = y ~ x) + labs( title = "Standard linear regression for France", x = "Year", y = "PISA score in mathematics" ) + ylim(470, 500) #> Warning message: #> In max(ids, na.rm = TRUE) : no non-missing arguments to max; returning -Inf Hope this helps, Rui Barradas After playing with this for a little while, I realized that the problem with plotting the confidence limits is the addition of ylim(470, 500). The confidence values are outside the ylim values. Remove the limits, or increase the range, and the confidence curves will plot. Hope this is helpful, Dan Or use + scale_y_continuous(limits = c(470, 500), oob = scales::squish) __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
Hello,

In the code below I don't use coord_cartesian, because setting ylim there would cut off part of the confidence intervals. To have labels only at the years present in the data set, get them from the data.

    library(ggplot2)
    df <- data.frame(year= c(2012,2015,2018,2022), score=c(495,493, 495, 47
Re: [R] ggplot2: Get the regression line with 95% confidence bands
Às 22:35 de 10/12/2023, varin sacha via R-help escreveu: Dear R-experts, Here below my R code, as my X-axis is "year", I must be missing one or more steps! I am trying to get the regression line with the 95% confidence bands around the regression line. Any help would be appreciated. Best, S. # library(ggplot2) df=data.frame(year=factor(c("2012","2015","2018","2022")), score=c(495,493, 495, 474)) ggplot(df, aes(x=year, y=score)) + geom_point( ) + geom_smooth(method="lm", formula = score ~ factor(year), data = df) + labs(title="Standard linear regression for France", y="PISA score in mathematics") + ylim(470, 500) # __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hello, I don't see a reason why year should be a factor and the formula in geom_smooth is wrong, it should be y ~ x, the aesthetics envolved. It still doesn't plot the CI's though. There's a warning and I am not understanding where it comes from. But the regression line is plotted. ggplot(df, aes(x = as.numeric(year), y = score)) + geom_point() + geom_smooth(method = "lm", formula = y ~ x) + labs( title = "Standard linear regression for France", x = "Year", y = "PISA score in mathematics" ) + ylim(470, 500) #> Warning message: #> In max(ids, na.rm = TRUE) : no non-missing arguments to max; returning -Inf Hope this helps, Rui Barradas -- Este e-mail foi analisado pelo software antivírus AVG para verificar a presença de vírus. www.avg.com __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
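The band that geom_smooth(method = "lm") draws can also be computed directly in base R, which makes it easy to compare its range with any axis limits. This is a sketch using the data from the thread with year kept numeric; the grid of years and the names newd, band and year_breaks are choices made for this example.

```r
df <- data.frame(year = c(2012, 2015, 2018, 2022),
                 score = c(495, 493, 495, 474))

fit <- lm(score ~ year, data = df)

# Predict on a grid of years, asking for 95% confidence limits of the mean.
newd <- data.frame(year = seq(min(df$year), max(df$year), by = 1))
band <- cbind(newd, predict(fit, newdata = newd, interval = "confidence",
                            level = 0.95))

# Range covered by the band, to compare against limits such as ylim(470, 500).
range(band$lwr, band$upr)

# Axis breaks taken from the data, so only observed years are labelled.
year_breaks <- sort(unique(df$year))
```

In ggplot2, year_breaks could be passed to scale_x_continuous(breaks = year_breaks).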
Re: [R] Convert character date time to R date-time variable.
Às 16:30 de 07/12/2023, Rui Barradas escreveu: Às 16:21 de 07/12/2023, Sorkin, John escreveu: Colleagues, I have a matrix of character data that represents date and time. The format of each element of the matrix is "2020-09-17_00:00:00" How can I convert the elements into a valid R date-time constant? Thank you, John John David Sorkin M.D., Ph.D. Professor of Medicine, University of Maryland School of Medicine; Associate Director for Biostatistics and Informatics, Baltimore VA Medical Center Geriatrics Research, Education, and Clinical Center; PI Biostatistics and Informatics Core, University of Maryland School of Medicine Claude D. Pepper Older Americans Independence Center; Senior Statistician University of Maryland Center for Vascular Research; Division of Gerontology and Paliative Care, 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 Cell phone 443-418-5382 __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hello, Coerce with ?as.POSIXct Don't forget the underscore in the format. as.POSIXct("2020-09-17_00:00:00", format = "%Y-%m-%d_%H:%M:%S") Hope this helps, Rui Barradas Sorry, I forgot: lubridate::ymd_hms("2020-09-17_00:00:00") Hope this helps, Rui Barradas -- Este e-mail foi analisado pelo software antivírus AVG para verificar a presença de vírus. www.avg.com __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Convert character date time to R date-time variable.
Às 16:21 de 07/12/2023, Sorkin, John escreveu: Colleagues, I have a matrix of character data that represents date and time. The format of each element of the matrix is "2020-09-17_00:00:00" How can I convert the elements into a valid R date-time constant? Thank you, John John David Sorkin M.D., Ph.D. Professor of Medicine, University of Maryland School of Medicine; Associate Director for Biostatistics and Informatics, Baltimore VA Medical Center Geriatrics Research, Education, and Clinical Center; PI Biostatistics and Informatics Core, University of Maryland School of Medicine Claude D. Pepper Older Americans Independence Center; Senior Statistician University of Maryland Center for Vascular Research; Division of Gerontology and Paliative Care, 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 Cell phone 443-418-5382 __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hello, Coerce with ?as.POSIXct Don't forget the underscore in the format. as.POSIXct("2020-09-17_00:00:00", format = "%Y-%m-%d_%H:%M:%S") Hope this helps, Rui Barradas -- Este e-mail foi analisado pelo software antivírus AVG para verificar a presença de vírus. www.avg.com __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
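Because the original question mentions a matrix of strings, here is a sketch that converts all elements at once. The sample values and the choice of tz = "UTC" are assumptions; the elements are converted as a plain vector, so the matrix shape is not carried over to the result.

```r
m <- matrix(c("2020-09-17_00:00:00", "2020-09-17_06:30:00",
              "2020-09-18_12:00:00", "2020-09-18_18:45:10"),
            nrow = 2L)

# Convert the elements as a plain vector; tz = "UTC" is an assumption.
p <- as.POSIXct(as.vector(m), format = "%Y-%m-%d_%H:%M:%S", tz = "UTC")

# Elements are in column-major order, so p[3] corresponds to m[1, 2].
format(p[3], "%Y-%m-%d %H:%M:%S")

# Date-time arithmetic now works, e.g. the gap between first and last entry.
difftime(p[4], p[1], units = "hours")
```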
Re: [R] Mann Kendall mutation package?
Às 11:58 de 01/12/2023, Nick Wray escreveu: Hello - does anyone know whether there are any packages for Mann-Kendall mutation tests in R available? The only one I could find online is this MK_mut_test: Mann-Kendall mutation test in Sibada/sibadaR: Sibada's accumulated R scripts for next probably use to avoid reinventing the wheel. (rdrr.io) <https://rdrr.io/github/Sibada/sibadaR/man/MK_mut_test.html> but there doesn't seem to be a package corresponding to this. I've tried installing various permutations of the apparent name Sibada/sibadaR but nothing comes up, so I'm not sure whether it even exists... Thanks Nick Wray [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hello, Your link points to a GitHub repository, the package can be installed with devtools::install_github(repo = "Sibada/sibadaR") Hope this helps Rui Barradas -- Este e-mail foi analisado pelo software antivírus AVG para verificar a presença de vírus. www.avg.com __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] back tick names with predict function
On 30/11/2023 at 17:57, Rui Barradas wrote:
> On 30/11/2023 at 17:38, Robert Baer wrote:
>> I am having trouble using back ticks with the R extractor function
>> 'predict' and an lm() model. I'm trying to construct some nice vectors
>> that can be used for plotting the two types of regression intervals. I
>> think it works with normal column heading names but it fails when I have
>> "special" back-tick names. Can anyone help with how I would reference
>> these? Short of renaming my columns, is there a way to accomplish this?
>>
>> Reprex
>>
>> # dataframe with dashes in column headings
>> cob = structure(list(
>>   `cob-wt` = c(212, 241, 215, 225, 250, 241, 237, 282, 206, 246,
>>                194, 241, 196, 193, 224, 257, 200, 190, 208, 224),
>>   `plant-density` = c(137, 107, 132, 135, 115, 103, 102, 65, 149, 85,
>>                       173, 124, 157, 184, 112, 80, 165, 160, 157, 119)),
>>   class = c("tbl_df", "tbl", "data.frame"), row.names = c(NA, -20L))
>>
>> # regression model works
>> mod2 = lm(`cob-wt` ~ `plant-density`, data = cob)
>>
>> # x sequence for plotting CI's
>> # Set up x points
>> x = seq(min(cob$`plant-density`), max(cob$`plant-density`), length = 1000)
>>
>> # Use predict to get CIs for a plot
>> # Add CI for regression line (y-hat uses 'c')
>> # usual trick is to assign x to actual x-var name in middle dataframe argument
>> CI.c = predict(mod2, data.frame(`plant-density` = x), interval = 'c') # fail
>>
>> # Add CI for prediction value (y-tilde uses 'p')
>> # usual trick is to assign x to actual x-var name in middle dataframe argument
>> CI.p = predict(mod2, data.frame(`plant-density` = x), interval = 'p') # fail
>
> Hello,
>
> When the new data frame is created, the default check.names = TRUE
> repairs the column name: the hyphen is replaced by a legal dot, so the
> name no longer matches the model's predictor.
>
> # check.names defaults to TRUE
> newd <- data.frame(`plant-density` = x)
> # `plant-density` is not a column name
> head(newd)
>
> # check.names set to FALSE
> newd <- data.frame(`plant-density` = x, check.names = FALSE)
> # `plant-density` becomes a column name
> head(newd)
>
> # Use predict to get CIs for a plot
> # Add CI for regression line (y-hat uses 'c')
> # usual trick is to assign x to actual x-var name in middle dataframe argument
> CI.c = predict(mod2, newdata = newd, interval = 'confidence') # fail
>
> # Add CI for prediction value (y-tilde uses 'p')
> # usual trick is to assign x to actual x-var name in middle dataframe argument
> CI.p = predict(mod2, newdata = newd, interval = 'prediction') # fail
>
> Hope this helps,
>
> Rui Barradas

Hello,

Sorry for the comments '# fail' in the last two instructions, I should have changed them.

    CI.c <- predict(mod2, newdata = newd, interval = 'confidence') # works
    CI.p <- predict(mod2, newdata = newd, interval = 'prediction') # works

Hope this helps,

Rui Barradas
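The name repair described above can be seen in isolation. This is a minimal sketch, assuming a small hypothetical vector `x_demo` in place of Robert's 1000-point sequence; only `data.frame()` and `names()` from base R are used.

```r
# Hypothetical stand-in for the x sequence in the thread
x_demo <- c(65, 100, 184)

# Default: check.names = TRUE runs the names through make.names(),
# which turns the hyphen into a dot
d_default <- data.frame(`plant-density` = x_demo)
names(d_default)   # "plant.density"

# check.names = FALSE keeps the non-syntactic name as typed
d_keep <- data.frame(`plant-density` = x_demo, check.names = FALSE)
names(d_keep)      # "plant-density"
```

Because predict() looks up the predictors by name in newdata, only the second data frame carries a column the model from the thread would recognise.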
Re: [R] ggplot with two x-axis and two dimensions
On 24/11/2023 at 10:29, sibylle.stoec...@gmx.ch wrote:
> Dear R-user
>
> Does anybody know whether ggplot allows two x-axes with two dimensions,
> similar to an Excel plot (picture 1 in the PDF attachment)? If yes, how
> should I adapt my code? The parameters are presented in the input file
> (attachment: Input).
>
> Fig2b = read.delim("BFF_Fig-2b.txt", na.strings="NA")
> names(Fig2b)
> head(Fig2b)
> summary(Fig2b)
> str(Fig2b)
> Fig2b$Aspekt<-factor(Fig2b$Aspekt, levels=(c("Voegel", "Kleinsaeuger",
>   "Schnecken", "Regenwuermer_Asseln", "Pilze")))
>
> ### Figure 2b
> ggplot(Fig2b,aes(Aspekt,Wert,fill=Effekt))+
>   geom_bar(stat="identity",position='fill')+
>   scale_y_continuous(limits=c(0,14), expand=c(0,0))+
>   labs(x="", y="Anzahl Studien pro Effekt")
>
> Kind regards
> Sibylle

Hello,

The first attached file does not match the data in the second file, but here is an answer to both this question and to your other question [1].

The trick to have a secondary axis is to compute a ratio of axis lengths. The lengths of the main and secondary axes can be computed with the functions range() and diff(), as in the code below. Then use that ratio to scale the secondary axis.

    Fig2b <- structure(list(
      Aspekt = c("Flora", "Flora", "Flora", "Tagfalter", "Tagfalter",
        "Tagfalter", "Heuschre", "Heuschre", "Heuschre", "Kaefer_Sp",
        "Kaefer_Sp", "Kaefer_Sp", "Schwebfli", "Schwebfli", "Schwebfli",
        "Bienen_F", "Bienen_F", "Bienen_F", "Flora", "Flora", "Flora",
        "Tagfalter", "Tagfalter", "Tagfalter", "Heuschre", "Heuschre",
        "Heuschre", "Kaefer_Sp", "Kaefer_Sp", "Kaefer_Sp", "Schwebfli",
        "Schwebfli", "Schwebfli", "Bienen_F", "Bienen_F", "Bienen_F"),
      BFF = c("BB", "SA", "NE", "BB", "SA", "NE", "BB", "SA", "NE",
        "BB", "SA", "NE", "BB", "SA", "NE", "BB", "SA", "NE",
        "BB", "SA", "NE", "BB", "SA", "NE", "BB", "SA", "NE",
        "BB", "SA", "NE", "BB", "SA", "NE", "BB", "SA", "NE"),
      Effekt = c("Neu", "Neu", "Neu", "Neu", "Neu", "Neu", "Neu", "Neu",
        "Neu", "Neu", "Neu", "Neu", "Neu", "Neu", "Neu", "Neu", "Neu",
        "Neu", "Pos", "Pos", "Pos", "Pos", "Pos", "Pos", "Pos", "Pos",
        "Pos", "Pos", "Pos", "Pos", "Pos", "Pos", "Pos", "Pos", "Pos",
        "Pos"),
      Wert = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 3L, 1L, 1L, 0L, 0L,
        1L, 1L, 0L, 0L, 1L, 0L, 0L, 2L, 1L, 0L, 0L, 1L, 0L, 9L, 4L, 6L,
        0L, 0L, 3L, 0L, 0L, 4L)),
      row.names = c(NA, -36L), class = "data.frame")

    library(ggplot2)

    # First y axis (0-9)
    # Second y axis (0-2500)
    # fac <- diff(range( sec axis ))/diff(range( 1st axis ))
    fac <- diff(range(0, 2500))/diff(range(0, 9))

    ggplot(Fig2b, aes(Aspekt, Wert, fill = Effekt)) +
      geom_col(position = position_dodge()) +
      scale_y_continuous(
        breaks = seq(0, 12, 2L),
        sec.axis = sec_axis(~ . * fac)
      ) +
      labs(x = "", y = "Anzahl Studien pro Effekt")

[1] https://stat.ethz.ch/pipermail/r-help/2023-November/478605.html

Hope this helps,

Rui Barradas
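The scaling ratio from the reply can be checked with plain arithmetic, without drawing anything. This is a rough sketch using the two ranges quoted in the thread (0-9 and 0-2500); the helper names `to_secondary` and `to_primary` are illustrative only and are not ggplot2 functions. It assumes both axes start at zero, so a single multiplicative factor is enough.

```r
# Ranges taken from the reply: primary axis 0-9, secondary axis 0-2500
primary_range   <- c(0, 9)
secondary_range <- c(0, 2500)
fac <- diff(range(secondary_range)) / diff(range(primary_range))

# Illustrative helpers (hypothetical names)
to_secondary <- function(v) v * fac  # same transform as sec_axis(~ . * fac)
to_primary   <- function(v) v / fac  # rescale secondary-scale data before plotting

to_secondary(9)    # the top of the primary axis lands at about 2500
to_primary(1250)   # the middle of the secondary axis lands at about 4.5
```

Any series measured on the secondary scale would be divided by `fac` before being handed to a geom, so that it is drawn in primary-axis units while the secondary axis labels show the original values.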
Re: [R] Fast way to draw mean values and 95% confidence intervals of groups with ggplot2
On 16/11/2023 at 11:59, Luigi Marongiu wrote:
> Hello,
> I have triplicate (column A) readings (column D) of samples exposed to
> different concentrations (column C) over time (column B).
> Is it possible to draw a line plot of the mean values for each
> concentration (C)? At the moment, I get a single line.
> Also, is there a simple way to draw the 95% CI around these data? I know
> I need to use a ribbon with the lower and upper limits, but is there a
> simple way for ggplot2 to calculate these values directly?
> Here is a working example:
>
> ```
> A = c(rep(1, 28), rep(2, 28), rep(3, 28))
> B = rep(c(0, 15, 30, 45, 60, 75, 90), 12)
> C = rep(c(rep(0, 7), rep(0.6, 7), rep(1.2, 7), rep(2.5,7)),3)
> D = c(731.33,761.67,730,761.67,741.67,788.67,784.33,
>       686.67,685.33,680,693.67,684,704,709.67,739,
>       731,719,767,760.67,776.67,768.67,675,671.67,
>       668.67,677.33,673.67,687,696.67,727,750.67,
>       752.67,786.67,794.67,843.33,946,732.67,737.33,
>       775.33,828,918,1063,1270,752.67,742.33,
>       735.67, 747.67,777.33,803.67,865.67,700,700.67,705.67,
>       722.67,744,779,837,748,742,754,747.67,
>       775.67,808.67,869,705.67,714.33,702.33,730,
>       710.67,731,744,686.33,687.33,670,702.33,
>       669.33,707.33,708.33,724,747,761.33,715,
>       697.67,728,728)
> df = data.frame(A, B, C, D)
> library(ggplot2)
> ggplot(data=df, aes(x=B, y=D, z=C, color =C)) +
>   geom_line(stat = "summary", fun = "mean") +
>   geom_ribbon()
> ```
>
> Thank you

Hello,

I am not sure that the code below is what you want.

The first 3 instructions create a named vector of colors. The pipe is what tries to solve the problem. It computes means and se's by groups of time and concentration, then plots the ribbon below the lines.

It is important not to set color = C in the initial call to ggplot, since it would then apply to all the subsequent layers (try it). To have one line per concentration I use group = C instead.

    suppressPackageStartupMessages({
      library(ggplot2)
      library(dplyr)
    })

    n_colors <- df$C |> unique() |> length()
    names_colors <- df$C |> unique() |> as.character()
    clrs <- setNames(palette.colors(n_colors), names_colors)

    df %>%
      mutate(C = factor(C)) %>%
      group_by(B, C) %>%
      # mean and standard error of the mean within each time/concentration group
      mutate(mean_D = mean(D), se_D = sd(D) / sqrt(n())) %>%
      ungroup() %>%
      ggplot(aes(x = B, group = C)) +
      geom_ribbon(aes(ymin = mean_D - se_D, ymax = mean_D + se_D),
                  fill = "grey", alpha = 0.5) +
      geom_line(aes(y = mean_D, color = C)) +
      geom_point(aes(y = D, color = C)) +
      scale_color_manual(name = "Concentration", values = clrs)

Hope this helps,

Rui Barradas
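For the 95% interval Luigi asked about, the ribbon limits can be computed per group before plotting. The following is a sketch only, assuming a t-based interval for the mean is what is wanted; `ci95` is a hypothetical helper name and the three readings are made up for illustration, not taken from the thread's data.

```r
# Hypothetical helper: t-based 95% confidence interval for a mean
ci95 <- function(x) {
  n <- length(x)
  m <- mean(x)
  # half-width = t quantile * standard error of the mean
  half <- qt(0.975, df = n - 1) * sd(x) / sqrt(n)
  c(mean = m, lower = m - half, upper = m + half)
}

# Made-up triplicate readings for a single time/concentration cell
readings <- c(10, 12, 14)
res <- ci95(readings)
res
```

In the dplyr pipeline of the reply, the same arithmetic could replace the `mean_D - se_D` and `mean_D + se_D` limits of the ribbon. With only three replicates per cell the t quantile is large, so the band will be much wider than one standard error.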
Re: [R] anyone having trouble accesing CRAN?
On 15/11/2023 at 19:13, Christopher W. Ryan via R-help wrote:
> at https://cran.r-project.org/ I get this error message:
>
> =
> Secure Connection Failed
>
> An error occurred during a connection to cran.r-project.org.
> PR_END_OF_FILE_ERROR
>
> Error code: PR_END_OF_FILE_ERROR
>
> The page you are trying to view cannot be shown because the
> authenticity of the received data could not be verified.
> ===
>
> Three different browsers, two different devices, two different networks.
> (The text of the error messages varies.)
>
> Anyone seeing similar?
>
> Thanks.
>
> --Chris Ryan

Hello,

Yes, CRAN is down.
I know there was an announcement last week about scheduled maintenance, but I cannot find that e-mail right now and don't remember the exact date, so I cannot say for sure that this is what is happening. But it is probably the scheduled maintenance.

Rui Barradas
Re: [R] Cryptic error for mscmt function
On 05/11/2023 at 13:35, Leu Thierry wrote:
> Hi everyone,
>
> I am trying to conduct a synthetic control analysis using the MSCMT
> package. However, when trying to run it I get a very cryptic error
> message saying "Error in lst[[nam]][intersect(tim, rownames(lst[[nam]])),
> cols, drop = FALSE]: subscript out of bounds". Does anyone know what this
> means and why I receive this error? I attached the code & dataset used in
> the attachment.
>
> Thanks a lot!
>
> Best regards
> Thierry

Hello,

No attachment came through the filters. Can you resend it in plain text or, if it was a .R file, rename it to .txt?
See [1], section General Instructions, for more on this.

[1] https://www.r-project.org/mail.html#instructions

Hope this helps,

Rui Barradas
Re: [R] Sum data according to date in sequence
Hello,

Here are two solutions.

1. Base R

Though I don't coerce the date column to class "Date", it seems to work.

    aggregate(EnergykWh ~ date, dt1, sum)
    #>        date EnergykWh
    #> 1 1/14/2016  11.98569
    #> 2 1/15/2016  32.56938
    #> 3 1/16/2016  21.29181
    #> 4 1/17/2016  22.88083
    #> 5 1/18/2016   9.05750

2. Package dplyr

First, column date is coerced from class "character" to class "Date". Then the grouped sums are computed.

    suppressPackageStartupMessages(
      library(dplyr)
    )

    dt1 %>%
      mutate(date = as.Date(date, "%m/%d/%Y")) %>%
      summarise(EnergykWh = sum(EnergykWh), .by = date)
    #>         date EnergykWh
    #> 1 2016-01-14  11.98569
    #> 2 2016-01-15  32.56938
    #> 3 2016-01-16  21.29181
    #> 4 2016-01-17  22.88083
    #> 5 2016-01-18   9.05750

As you can see, the results are the same.

Also, this exact problem is one of the most frequently asked on StackOverflow. Maybe you could try searching there for a solution. My code above is also exactly the code in [1], though I had already written this answer. I only checked afterwards :(.

[1] https://stackoverflow.com/questions/61548758/r-how-sum-values-by-group-by-date

Hope this helps,

Rui Barradas
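Since the original data set is not part of this message, here is a self-contained sketch of the base R approach. The data frame below is made up for illustration and only borrows the column names `date` and `EnergykWh` from the thread; unlike the first solution above, it coerces the dates first so that the groups sort chronologically rather than as text.

```r
# Made-up readings, several per day, with dates stored as text
dt1 <- data.frame(
  date = c("1/14/2016", "1/14/2016", "1/15/2016", "1/15/2016", "1/15/2016"),
  EnergykWh = c(1.5, 2.5, 3, 4, 5)
)

# Coerce to class "Date" so ordering is chronological, not alphabetical
dt1$date <- as.Date(dt1$date, format = "%m/%d/%Y")

# One row per day with the summed energy
daily <- aggregate(EnergykWh ~ date, data = dt1, FUN = sum)
daily
```

Without the coercion the result is still grouped correctly, but a character date such as "1/2/2016" would sort after "1/14/2016", which matters once the data span more than a few days.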
Re: [R] Missing shapes in legend with scale_shape_manual
On 30/10/2023 at 20:55, Kevin Zembower via R-help wrote:
> Hello,
>
> I'm trying to plot a graph of blood glucose versus date. I also record
> conditions, such as missing the previous night's medications, and missing
> exercise on the previous day. My data looks like:
>
> > b2[68:74,]
> # A tibble: 7 × 5
>   Date       Time     bg missed_meds no_exercise
> 1 2023-10-17 08:50   128 TRUE        FALSE
> 2 2023-10-16 06:58   144 FALSE       FALSE
> 3 2023-10-15 09:17   137 FALSE       TRUE
> 4 2023-10-14 09:04   115 FALSE       FALSE
> 5 2023-10-13 08:44   136 FALSE       TRUE
> 6 2023-10-12 08:55   122 FALSE       TRUE
> 7 2023-10-11 07:55   150 TRUE        TRUE
>
> This gets me most of the way to what I want:
>
> ggplot(data = b2, aes(x = Date, y = bg)) +
>   geom_line() +
>   geom_point(data = filter(b2, missed_meds),
>              shape = 20,
>              size = 3) +
>   geom_point(data = filter(b2, no_exercise),
>              shape = 4,
>              size = 3) +
>   geom_point(aes(x = Date, y = bg, shape = missed_meds),
>              alpha = 0) + #Invisible point layer for shape mapping
>   scale_y_continuous(name = "Blood glucose (mg/dL)",
>                      breaks = seq(100, 230, by = 20)
>                      ) +
>   geom_hline(yintercept = 130) +
>   scale_shape_manual(name = "Conditions",
>                      labels = c("Missed meds",
>                                 "Missed exercise"),
>                      values = c(20, 4),
>                      ## size = 3
>                      )
>
> However, the legend just prints an empty square in front of the labels.
> What I want is a filled circle (shape 20) in front of "Missed meds" and a
> cross (shape 4) in front of "Missed exercise."
>
> My questions are:
> 1. How can I fix my plot to show the shapes in the legend?
> 2. Can my overall plotting method be improved? Would you do it this way?
>
> Thanks so much for your advice and guidance.
>
> -Kevin

Hello,

In ggplot2 graphics, when you have more than one call to the same layer function, you can probably simplify the code. In this case you make several calls to geom_point. This can probably be avoided.

Create a new column named Conditions. Assign to it the column names wherever the values of those columns are TRUE. The simplest way of doing this is to use columns missed_meds and no_exercise as logical index columns, see the code below. Note that a row where both are TRUE ends up labelled with the last assignment, no_exercise.

This way the values are mapped to shapes in just one call to geom_point. That's what function aes() is meant for, to tell which variables define what in the plot.

    b2$Date <- as.Date(b2$Date)

    # this new column will be mapped to the shape aesthetic
    b2$Conditions <- NA_character_
    b2$Conditions[b2$missed_meds] <- names(b2)[4]
    b2$Conditions[b2$no_exercise] <- names(b2)[5]

    ggplot(data = b2, aes(x = Date, y = bg)) +
      geom_line() +
      geom_point(aes(shape = Conditions), size = 3) +
      geom_hline(yintercept = 130) +
      scale_y_continuous(
        name = "Blood glucose (mg/dL)",
        breaks = seq(100, 230, by = 20)
      ) +
      scale_shape_manual(
        #name = "Conditions",
        labels = c("Missed meds", "Missed exercise"),
        values = c(20, 4),
        na.translate = FALSE
      )

Hope this helps,

Rui Barradas
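The logical-index assignment used in the reply can be tried on its own. This is a small sketch with a made-up four-row data frame that reuses the thread's column names; it does not reproduce Kevin's data.

```r
# Made-up rows covering every combination of the two flags
b2_demo <- data.frame(
  bg          = c(128, 144, 137, 150),
  missed_meds = c(TRUE, FALSE, FALSE, TRUE),
  no_exercise = c(FALSE, FALSE, TRUE, TRUE)
)

# Start with NA everywhere, then label rows where each flag is TRUE
b2_demo$Conditions <- NA_character_
b2_demo$Conditions[b2_demo$missed_meds] <- "missed_meds"
b2_demo$Conditions[b2_demo$no_exercise] <- "no_exercise"

b2_demo$Conditions
# "missed_meds" NA "no_exercise" "no_exercise"
```

The last row has both flags set and ends up as "no_exercise", because the second assignment overwrites the first. If days with both conditions need their own symbol, a third label (for example a hypothetical "both") would have to be assigned after the other two.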
Re: [R] How to Reformat a dataframe
> want to do is, instead of having 12 observations by row, I want to have
> one observation by row. I want to have a single column with 1509
> observations instead of 126 rows with 12 columns per row.
>
> I tried the following:
>
> df = data.frame(matrix(nrow = Length, ncol = 1))
> colnames(df) = c("aportes_alajuela")
>
> for (row in 1:nrow(alajuela_df)){
>   for (col in 1:ncol(alajuela_df)){
>     df[i,1]=alajuela_df[i,j]
>   }
> }
>
> But I am not getting the data in the structure I want. Any help will be
> greatly appreciated.
>
> Best regards,
> Paul

Hello,

Here are two base R ways, with ?stack and with ?reshape.

    # 1. With stack()
    df_long <- stack(alajuela_df)[1]
    df_long <- df_long[complete.cases(df_long), , drop = FALSE]
    head(df_long)

    # 2. With reshape
    df_long <- reshape(
      alajuela_df,
      direction = "long",
      varying = names(alajuela_df),
      v.names = "x"
    )[2]

    # 1512 rows, only one column
    dim(df_long)
    # [1] 1512    1

    # there are NA's in the data
    df_long[complete.cases(df_long), , drop = FALSE] |> dim()
    # [1] 1509    1

    # keep the rows with values not NA
    df_long <- df_long[complete.cases(df_long), , drop = FALSE]

    # check the dimensions again
    dim(df_long)
    # [1] 1509    1

Hope this helps,

Rui Barradas
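A tiny version of the stack() approach makes the resulting order visible. This is a sketch with a made-up two-column data frame named `wide`; the real `alajuela_df` has 12 columns and is not reproduced here.

```r
# Made-up wide data: 3 rows, 2 columns, one missing value
wide <- data.frame(jan = c(1, 2, NA), feb = c(4, 5, 6))

# stack() concatenates the columns one after another;
# [1] keeps only the "values" column as a one-column data frame
long <- stack(wide)[1]
long <- long[complete.cases(long), , drop = FALSE]

dim(long)     # 5 rows, 1 column
long$values   # 1 2 4 5 6  (all of jan first, then all of feb)

# If the observations should instead run along each row
# (row 1 left to right, then row 2, ...), transpose before flattening
by_row <- c(t(as.matrix(wide)))
by_row <- by_row[!is.na(by_row)]
by_row        # 1 4 2 5 6
```

Which order is right depends on how the 12 columns relate to time in the original table; that detail is in the truncated part of the question, so the row-wise variant is offered only as an assumption about what may be needed.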
Re: [R] Plot for 10 years extrapolation
On 26/10/2023 at 19:23, varin sacha via R-help wrote:
> Dear R-Experts,
>
> Here below is my R code. It works, but I don't know how to finish it to
> get the final plot with the extrapolation for 10 more years.
>
> I am trying to extrapolate my data with a linear fit over the next 10
> years. So I create a date sequence for the next 10 years and store it as
> a dataframe to make the prediction possible.
> Now, I am trying to get the plot with the actual data (from year 2004 to
> 2018) and with the 10 more years of extrapolation.
>
> Thanks for your help.
>
> date <-as.Date(c("2018-12-31", "2017-12-31", "2016-12-31",
>   "2015-12-31", "2014-12-31", "2013-12-31", "2012-12-31", "2011-12-31",
>   "2010-12-31", "2009-12-31", "2008-12-31", "2007-12-31", "2006-12-31",
>   "2005-12-31", "2004-12-31"))
>
> value <-c(15348, 13136, 11733, 10737, 15674, 11098, 13721, 13209,
>   11099, 10087, 14987, 11098, 13421, 9023, 12098)
>
> model <- lm(value~date)
>
> plot(value~date ,col="grey",pch=20,cex=1.5,main="Plot")
> abline(model,col="darkorange",lwd=2)
>
> dfuture <- data.frame(date=seq(as.Date("2019-12-31"), by="1 year",
>   length.out=10))
>
> predict(model,dfuture,interval="prediction")

Hello,

Here is a way with base R graphics. It is explained in the code comments.

    date <- as.Date(c("2018-12-31", "2017-12-31", "2016-12-31",
                      "2015-12-31", "2014-12-31", "2013-12-31",
                      "2012-12-31", "2011-12-31", "2010-12-31",
                      "2009-12-31", "2008-12-31", "2007-12-31",
                      "2006-12-31", "2005-12-31", "2004-12-31"))

    value <- c(15348, 13136, 11733, 10737, 15674, 11098, 13721, 13209,
               11099, 10087, 14987, 11098, 13421, 9023, 12098)

    model <- lm(value ~ date)

    dfuture <- data.frame(date = seq(as.Date("2019-12-31"), by = "1 year",
                                     length.out = 10))

    predfuture <- predict(model, dfuture, interval = "prediction")
    dfuture <- cbind(dfuture, predfuture)

    # start the plot with the required x and y limits
    xlim <- range(c(date, dfuture$date))
    ylim <- range(c(value, dfuture$fit))

    plot(value ~ date, col = "grey", pch = 20, cex = 1.5, main = "Plot",
         xlim = xlim, ylim = ylim)

    # abline extends the fitted line past the x value (date)
    # limit making the next ten years line ugly and not even
    # completely overplotting the abline drawn line
    abline(model, col = "darkorange", lwd = 2)
    lines(fit ~ date, dfuture
          # , lty = "dashed"
          , lwd = 2
          , col = "black")

    # if lines() is used for both the interpolated and extrapolated
    # values you will have a gap between both fitted and predicted lines
    # but it is closer to what you want

    # get the fitted values first (interpolated values)
    ypred <- predict(model)

    plot(value ~ date, col = "grey", pch = 20, cex = 1.5, main = "Plot",
         xlim = xlim, ylim = ylim)

    # plot the interpolated values
    lines(ypred ~ date, col = "darkorange", lwd = 2)
    # and now the extrapolated values
    # I use normal orange to make the difference more obvious
    lines(fit ~ date, dfuture, lty = "dashed", lwd = 2, col = "orange")

Hope this helps,

Rui Barradas
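Before plotting, it can help to look at what predict() returns for the future dates. The sketch below uses four made-up yearly values (named `date_demo` and `value_demo` so as not to clash with the thread's objects) and only base R; the numbers are illustrative, not the poster's data.

```r
# Made-up yearly series, deliberately not perfectly linear
date_demo  <- as.Date(c("2015-12-31", "2016-12-31", "2017-12-31", "2018-12-31"))
value_demo <- c(10, 13, 12, 15)

model_demo <- lm(value_demo ~ date_demo)

# New data must use the same variable name as the model formula
future <- data.frame(
  date_demo = seq(as.Date("2019-12-31"), by = "1 year", length.out = 2)
)

pred <- predict(model_demo, newdata = future, interval = "prediction")
colnames(pred)   # "fit" "lwr" "upr"

# Bind predictions to their dates, as done in the reply with cbind()
future <- cbind(future, pred)
future
```

The `lwr` and `upr` columns are the prediction-interval limits. The reply's ylim only covers `fit`, so the range passed to plot() would need to include `lwr` and `upr` if the interval is to be drawn as well.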
Re: [R] Bug in print for data frames?
Hello,

Inline.

On 26/10/2023 at 13:32, Ebert,Timothy Aaron wrote:
> The "problem" goes away if you use
>
> x$C <- y[1,]

Actually, if I understand correctly, the OP wants the column:

    x$C <- y[,1]

In this case it will produce the same output because y is a df with only one row. But that is a very special case; the general case would be to extract the column.

Hope this helps,

Rui Barradas

> If you have another row in your x, say:
> x <- data.frame(A=c(1,4), B=c(2,5), C=c(3,6))
>
> then your code
> x$C <- y[1]
> returns an error.
>
> If y has the same number of rows as x$C then R has the same outcome as
> in your example.
>
> It looks like your code tells R to replace all of column C (including
> the name) with all of vector y.
>
> Maybe unexpected, but not a bug. It is consistent.
>
> -----Original Message-----
> From: R-help On Behalf Of Rui Barradas
> Sent: Thursday, October 26, 2023 6:43 AM
> To: Christian Asseburg; r-help@r-project.org
> Subject: Re: [R] Bug in print for data frames?
>
> [External Email]
>
> On 25/10/2023 at 07:18, Christian Asseburg wrote:
>> Hi! I came across this unexpected behaviour in R. First I thought it
>> was a bug in the assignment operator <- but now I think it's maybe a
>> bug in the way data frames are being printed. What do you think?
>>
>> Using R 4.3.1:
>>
>> x <- data.frame(A = 1, B = 2, C = 3)
>> y <- data.frame(A = 1)
>> x
>>   A B C
>> 1 1 2 3
>> x$B <- y$A # works as expected
>> x
>>   A B C
>> 1 1 1 3
>> x$C <- y[1] # makes C disappear
>> x
>>   A B A
>> 1 1 1 1
>> str(x)
>> 'data.frame': 1 obs. of 3 variables:
>>  $ A: num 1
>>  $ B: num 1
>>  $ C:'data.frame': 1 obs. of 1 variable:
>>   ..$ A: num 1
>>
>> Why does the print(x) not show "C" as the name of the third element? I
>> did mess up the data frame (and this was a mistake on my part), but
>> finding the bug was harder because print(x) didn't show the C any
>> longer.
>>
>> Thanks. With best wishes -
>>
>> . . . Christian
Re: [R] Bug in print for data frames?
Às 07:18 de 25/10/2023, Christian Asseburg escreveu: Hi! I came across this unexpected behaviour in R. First I thought it was a bug in the assignment operator <- but now I think it's maybe a bug in the way data frames are being printed. What do you think? Using R 4.3.1: x <- data.frame(A = 1, B = 2, C = 3) y <- data.frame(A = 1) x A B C 1 1 2 3 x$B <- y$A # works as expected x A B C 1 1 1 3 x$C <- y[1] # makes C disappear x A B A 1 1 1 1 str(x) 'data.frame': 1 obs. of 3 variables: $ A: num 1 $ B: num 1 $ C:'data.frame': 1 obs. of 1 variable: ..$ A: num 1 Why does the print(x) not show "C" as the name of the third element? I did mess up the data frame (and this was a mistake on my part), but finding the bug was harder because print(x) didn't show the C any longer. Thanks. With best wishes - . . . Christian __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hello, To expand on the good answers already given, I will present two other example data sets. Example 1. Imagine that instead of assigning just one column from y to x$C you assign two columns. The result is a data.frame column. See what is displayed as the columns names. And unlike what happens with `[`, when asssigning columns 1:2, the operator `[[` doesn't work. You will have to extract the columns y$A and y$B one by one. x <- data.frame(A = 1, B = 2, C = 3) y <- data.frame(A = 1, B = 4) str(y) #> 'data.frame':1 obs. of 2 variables: #> $ A: num 1 #> $ B: num 4 x$C <- y[1:2] x #> A B C.A C.B #> 1 1 2 1 4 str(x) #> 'data.frame':1 obs. of 3 variables: #> $ A: num 1 #> $ B: num 2 #> $ C:'data.frame': 1 obs. of 2 variables: #> ..$ A: num 1 #> ..$ B: num 4 x[[1:2]] # doesn't work #> Error in .subset2(x, i, exact = exact): subscript out of bounds Example 2. 
Sometimes it is usefull to get a result like this first and then correct the resulting df. For instance, when computing more than one summary statistics. str(agg) below shows that the result summary stats is a matrix, so you have a column-matrix. And once again the displayed names reflect that. The trick to make the result a df is to extract all but the last column as a sub-df, extract the last column's values as a matrix (which it is) and then cbind the two together. cbind is a generic function. Since the first argument to cbind is a sub-df, the method called is cbind.data.frame and the result is a df. df1 <- data.frame(A = rep(c("a", "b", "c"), 5L), X = 1:30) # the anonymous function computes more than one summary statistics # note that it returns a named vector agg <- aggregate(X ~ A, df1, \(x) c(Mean = mean(x), S = sd(x))) agg #> AX.Mean X.S #> 1 a 14.50 9.082951 #> 2 b 15.50 9.082951 #> 3 c 16.50 9.082951 # similar effect as in the OP, The difference is that the last # column is a matrix, not a data.frame str(agg) #> 'data.frame':3 obs. of 2 variables: #> $ A: chr "a" "b" "c" #> $ X: num [1:3, 1:2] 14.5 15.5 16.5 9.08 9.08 ... #> ..- attr(*, "dimnames")=List of 2 #> .. ..$ : NULL #> .. ..$ : chr [1:2] "Mean" "S" # nc is just a convenience, avoids repeated calls to ncol nc <- ncol(agg) cbind(agg[-nc], agg[[nc]]) #> A MeanS #> 1 a 14.5 9.082951 #> 2 b 15.5 9.082951 #> 3 c 16.5 9.082951 # all is well cbind(agg[-nc], agg[[nc]]) |> str() #> 'data.frame':3 obs. of 3 variables: #> $ A : chr "a" "b" "c" #> $ Mean: num 14.5 15.5 16.5 #> $ S : num 9.08 9.08 9.08 If the anonymous function hadn't returned a named vetor, the new column names would have been "1". "2", try it. Hope this helps, Rui Barradas -- Este e-mail foi analisado pelo software antivírus AVG para verificar a presença de vírus. 
Re: [R] by function does not separate output from function with mulliple parts
---
#> mydata$StepType: Second
#> lm model parameter contrast
#>
#>   Contrast     S.E.     Lower    Upper     t df Pr(>|t|)
#> 1   -2.435 1.819421 -6.198759 1.328759 -1.34 23   0.1939

Hope this helps,

Rui Barradas
Re: [R] running crossvalidation many times MSE for Lasso regression
>> >> }
>> >> mean(unlist(lst))
>> >> ##
> --
> Jin Li, PhD
> Founder, Data2action, Australia
> https://www.researchgate.net/profile/Jin_Li32
> https://scholar.google.com/citations?user=Jeot53EJ=en
Hello,

In your OP, the following two code lines are where that error comes from.

predictLasso=predict(cv_model, newx=test1)
ypred=predict(predictLasso,newdata=test1)

predictLasso already holds the predictions; it is the output of predict. So when you run the 2nd line above you are passing it a matrix, not a fitted model, and the error is thrown.

After the several suggestions in this thread, don't you want something like this instead of your for loop?

library(glmnet)

# make the results reproducible
set.seed(2023)

# this is better than what you had
z <- TT[c("x1", "x2")] |> as.matrix()
y <- TT[["y"]]

cv_model <- cv.glmnet(z, y, alpha = 1, type.measure = "mse")
best_lambda <- cv_model$lambda.min
best_lambda

# these two values should be the same, and they are
# index to minimum mse
(i <- cv_model$index[1])
which(cv_model$lambda == cv_model$lambda.min)

# these two values should be the same, and they are
# value of minimum mse
cv_model$cvm[i]
min(cv_model$cvm)

plot(cv_model)

Hope this helps,

Rui Barradas
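The failure described above (calling predict on something that is already a prediction) can be reproduced without glmnet. The sketch below uses lm from base R as a stand-in, with toy data invented for the illustration; the exact error text for a glmnet prediction matrix will differ, but the mechanism is the same: predict is a generic and has no method for plain numeric output.

```r
# Toy data, invented for this illustration only
set.seed(1)
d <- data.frame(x = 1:10, y = 2 * (1:10) + rnorm(10))
fit <- lm(y ~ x, data = d)

# First call: a fitted model goes in, predictions come out
pred <- predict(fit, newdata = d)
class(pred)

# Second call: predictions go in, and no predict method applies to them
res <- tryCatch(predict(pred, newdata = d), error = function(e) e)
inherits(res, "error")
conditionMessage(res)
```

Keeping the fitted object and its predictions in clearly different variable names (for example `cv_model` versus `pred`) makes this kind of slip easier to spot.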
Re: [R] Best way to test for numeric digits?
On 18/10/2023 at 19:35, Leonard Mada wrote:

Dear Rui,

On 10/18/2023 8:45 PM, Rui Barradas wrote:

split_chem_elements <- function(x, rm.digits = TRUE) {
  regex <- "(?<=[A-Z])(?![a-z]|$)|(?<=.)(?=[A-Z])|(?<=[a-z])(?=[^a-z])"
  if(rm.digits) {
    stringr::str_replace_all(mol, regex, "#") |>
      strsplit("#|[[:digit:]]") |>
      lapply(\(x) x[nchar(x) > 0L])
  } else {
    strsplit(x, regex, perl = TRUE)
  }
}

split.symbol.character = function(x, rm.digits = TRUE) {
  # Perl is partly broken in R 4.3, but this works:
  regex <- "(?<=[A-Z])(?![a-z]|$)|(?<=.)(?=[A-Z])|(?<=[a-z])(?=[^a-z])"
  s <- strsplit(x, regex, perl = TRUE)
  if(rm.digits) {
    s <- lapply(s, \(x) x[grep("[[:digit:]]+", x, invert = TRUE)])
  }
  s
}

You have a glitch (mol is hardcoded) in the code of the first function. The times are similar, after correcting for that glitch.

Note:
- grep("[[:digit:]]", ...) is almost twice as slow as grep("[0-9]", ...)!
- corrected results below;

Sincerely,

Leonard

###

split_chem_elements <- function(x, rm.digits = TRUE) {
  regex <- "(?<=[A-Z])(?![a-z]|$)|(?<=.)(?=[A-Z])|(?<=[a-z])(?=[^a-z])"
  if(rm.digits) {
    stringr::str_replace_all(x, regex, "#") |>
      strsplit("#|[[:digit:]]") |>
      lapply(\(x) x[nchar(x) > 0L])
  } else {
    strsplit(x, regex, perl = TRUE)
  }
}

split.symbol.character = function(x, rm.digits = TRUE) {
  # Perl is partly broken in R 4.3, but this works:
  regex <- "(?<=[A-Z])(?![a-z]|$)|(?<=.)(?=[A-Z])|(?<=[a-z])(?=[^a-z])"
  s <- strsplit(x, regex, perl = TRUE)
  if(rm.digits) {
    s <- lapply(s, \(x) x[grep("[0-9]", x, invert = TRUE)])
  }
  s
}

mol <- c("CCl3F", "Li4Al4H16", "CCl2CO2AlPO4SiO4Cl")
mol1 <- rep(mol, 1)

system.time( split_chem_elements(mol1) )
#   user  system elapsed
#   0.58    0.00    0.58

system.time( split.symbol.character(mol1) )
#   user  system elapsed
#   0.67    0.00    0.67

Hello,

You are right, sorry for the blunder :(. In the code below I have replaced stringr::str_replace_all by the package stringi function stri_replace_all_regex and the improvement is significant.
split_chem_elements <- function(x, rm.digits = TRUE) {
  regex <- "(?<=[A-Z])(?![a-z]|$)|(?<=.)(?=[A-Z])|(?<=[a-z])(?=[^a-z])"
  if(rm.digits) {
    # argument order is (str, pattern, replacement)
    stringi::stri_replace_all_regex(x, regex, "#") |>
      strsplit("#|[0-9]") |>
      lapply(\(x) x[nchar(x) > 0L])
  } else {
    strsplit(x, regex, perl = TRUE)
  }
}

# system.time(
#   split_chem_elements(mol1)
# )
#   user  system elapsed
#   0.06    0.00    0.09

# system.time(
#   split.symbol.character(mol1)
# )
#   user  system elapsed
#   0.25    0.00    0.28

Hope this helps,

Rui Barradas
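On the question that started this thread (an element-wise test for digits that does not raise warnings), a base R sketch is to match each string against an anchored pattern with grepl. It assumes that "numeric" here means a run of the digits 0-9 only, with no sign, decimal point or exponent; the helper name `is_digits` is hypothetical.

```r
# TRUE where the whole string consists of digits, FALSE otherwise
is_digits <- function(x) grepl("^[0-9]+$", x)

tokens <- c("Li", "Na", "K", "2", "Rb", "Ca", "3")
is_digits(tokens)

# keep only the non-digit tokens, with no suppressWarnings() needed
tokens[!is_digits(tokens)]
```

Unlike as.numeric, this never coerces anything, so there is nothing to warn about; the price is that strings such as "1e3" or "2.5" are treated as non-numeric.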
Re: [R] Best way to test for numeric digits?
On 18/10/2023 at 17:24, Leonard Mada wrote:

Dear Rui,

Thank you for your reply.

I actually do have access to the chemical symbols: I have started to refactor and enhance the Rpdb package, see Rpdb::elements:
https://github.com/discoleo/Rpdb

However, the regex that you have constructed is quite heavy, as it needs to iterate through all chemical symbols (in decreasing nchar). Elements like C, and especially O, P or S, appear late in the regex expression - but are quite common in chemistry.

The alternative regex is (in this respect) simpler. It actually works (once you know about the workaround).

Q: My question was whether there is anything like is.numeric, but applied to each element of a vector.

Sincerely,

Leonard

On 10/18/2023 6:53 PM, Rui Barradas wrote:

On 18/10/2023 at 15:59, Leonard Mada via R-help wrote:

Dear List members,

What is the best way to test for numeric digits?

suppressWarnings(as.double(c("Li", "Na", "K", "2", "Rb", "Ca", "3")))
# [1] NA NA NA 2 NA NA 3

The above requires the use of the suppressWarnings function. Are there any better ways?

I was working to extract chemical elements from a formula, something like this:

split.symbol.character = function(x, rm.digits = TRUE) {
  # Perl is partly broken in R 4.3, but this works:
  regex = "(?<=[A-Z])(?![a-z]|$)|(?<=.)(?=[A-Z])|(?<=[a-z])(?=[^a-z])";
  # stringi::stri_split(x, regex = regex);
  s = strsplit(x, regex, perl = TRUE);
  if(rm.digits) {
    s = lapply(s, function(s) {
      isNotD = is.na(suppressWarnings(as.numeric(s)));
      s = s[isNotD];
    });
  }
  return(s);
}

split.symbol.character(c("CCl3F", "Li4Al4H16", "CCl2CO2AlPO4SiO4Cl"))

Sincerely,

Leonard

Note:

# works:
regex = "(?<=[A-Z])(?![a-z]|$)|(?<=.)(?=[A-Z])|(?<=[a-z])(?=[^a-z])";
strsplit(c("CCl3F", "Li4Al4H16", "CCl2CO2AlPO4SiO4Cl"), regex, perl = T)

# broken in R 4.3.1
# only slightly "erroneous" with stringi::stri_split
regex = "(?<=[A-Z])(?![a-z]|$)|(?=[A-Z])|(?<=[a-z])(?=[^a-z])";
strsplit(c("CCl3F", "Li4Al4H16", "CCl2CO2AlPO4SiO4Cl"), regex, perl = T)

Hello,

If you want to extract chemical element symbols, the following might work. It uses the periodic table in GitHub package chemr and a package stringr function.

devtools::install_github("paleolimbot/chemr")

split_chem_elements <- function(x) {
  data(pt, package = "chemr", envir = environment())
  el <- pt$symbol[order(nchar(pt$symbol), decreasing = TRUE)]
  pat <- paste(el, collapse = "|")
  stringr::str_extract_all(x, pat)
}

mol <- c("CCl3F", "Li4Al4H16", "CCl2CO2AlPO4SiO4Cl")
split_chem_elements(mol)
#> [[1]]
#> [1] "C"  "Cl" "F"
#>
#> [[2]]
#> [1] "Li" "Al" "H"
#>
#> [[3]]
#>  [1] "C"  "Cl" "C"  "O"  "Al" "P"  "O"  "Si" "O"  "Cl"

It is also possible to rewrite the function without calls to non base packages but that will take some more work.

Hope this helps,

Rui Barradas

Hello,

You and Avi are right, my function's performance is terrible. The following is much faster.

As for how to not have digits throw warnings, the lapply in the version of your function below solves it by setting grep argument invert = TRUE. This will get all strings where digits do not occur.

split_chem_elements <- function(x, rm.digits = TRUE) {
  regex <- "(?<=[A-Z])(?![a-z]|$)|(?<=.)(?=[A-Z])|(?<=[a-z])(?=[^a-z])"
  if(rm.digits) {
    stringr::str_replace_all(mol, regex, "#") |>
      strsplit("#|[[:digit:]]") |>
      lapply(\(x) x[nchar(x) > 0L])
  } else {
    strsplit(x, regex, perl = TRUE)
  }
}

split.symbol.character = function(x, rm.digits = TRUE) {
  # Perl is partly broken in R 4.3, but this works:
  regex <- "(?<=[A-Z])(?![a-z]|$)|(?<=.)(?=[A-Z])|(?<=[a-z])(?=[^a-z])"
  s <- strsplit(x, regex, perl = TRUE)
  if(rm.digits) {
    s <- l
Re: [R] Best way to test for numeric digits?
On 18/10/2023 at 15:59, Leonard Mada via R-help wrote:

Dear List members,

What is the best way to test for numeric digits?

suppressWarnings(as.double(c("Li", "Na", "K", "2", "Rb", "Ca", "3")))
# [1] NA NA NA 2 NA NA 3

The above requires the use of the suppressWarnings function. Are there any better ways?

I was working to extract chemical elements from a formula, something like this:

split.symbol.character = function(x, rm.digits = TRUE) {
  # Perl is partly broken in R 4.3, but this works:
  regex = "(?<=[A-Z])(?![a-z]|$)|(?<=.)(?=[A-Z])|(?<=[a-z])(?=[^a-z])";
  # stringi::stri_split(x, regex = regex);
  s = strsplit(x, regex, perl = TRUE);
  if(rm.digits) {
    s = lapply(s, function(s) {
      isNotD = is.na(suppressWarnings(as.numeric(s)));
      s = s[isNotD];
    });
  }
  return(s);
}

split.symbol.character(c("CCl3F", "Li4Al4H16", "CCl2CO2AlPO4SiO4Cl"))

Sincerely,

Leonard

Note:

# works:
regex = "(?<=[A-Z])(?![a-z]|$)|(?<=.)(?=[A-Z])|(?<=[a-z])(?=[^a-z])";
strsplit(c("CCl3F", "Li4Al4H16", "CCl2CO2AlPO4SiO4Cl"), regex, perl = T)

# broken in R 4.3.1
# only slightly "erroneous" with stringi::stri_split
regex = "(?<=[A-Z])(?![a-z]|$)|(?=[A-Z])|(?<=[a-z])(?=[^a-z])";
strsplit(c("CCl3F", "Li4Al4H16", "CCl2CO2AlPO4SiO4Cl"), regex, perl = T)

Hello,

If you want to extract chemical element symbols, the following might work. It uses the periodic table in GitHub package chemr and a package stringr function.

devtools::install_github("paleolimbot/chemr")

split_chem_elements <- function(x) {
  data(pt, package = "chemr", envir = environment())
  el <- pt$symbol[order(nchar(pt$symbol), decreasing = TRUE)]
  pat <- paste(el, collapse = "|")
  stringr::str_extract_all(x, pat)
}

mol <- c("CCl3F", "Li4Al4H16", "CCl2CO2AlPO4SiO4Cl")
split_chem_elements(mol)
#> [[1]]
#> [1] "C"  "Cl" "F"
#>
#> [[2]]
#> [1] "Li" "Al" "H"
#>
#> [[3]]
#>  [1] "C"  "Cl" "C"  "O"  "Al" "P"  "O"  "Si" "O"  "Cl"

It is also possible to rewrite the function without calls to non base packages but that will take some more work.

Hope this helps,

Rui Barradas
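As a rough idea of what a base-only version could look like, here is a sketch that extracts symbols with regmatches and gregexpr instead of a periodic-table lookup. It is not the rewrite the message alludes to: it assumes every symbol is one uppercase letter followed by zero or more lowercase letters, and it does not check that a match is a real element. The function name `split_chem_elements_base` is hypothetical.

```r
# Extract runs of "uppercase letter + optional lowercase letters"
split_chem_elements_base <- function(x) {
  # perl = TRUE keeps the character ranges ASCII-only, independent of locale
  regmatches(x, gregexpr("[A-Z][a-z]*", x, perl = TRUE))
}

mol <- c("CCl3F", "Li4Al4H16", "CCl2CO2AlPO4SiO4Cl")
res <- split_chem_elements_base(mol)
res
```

Because digits never match the pattern, they are dropped without any call to as.numeric, so no warnings arise. A formula written with non-standard capitalisation would be split incorrectly, which the lookup-based version avoids.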
Re: [R] creating a time series
On 16/10/2023 at 11:12, ahmet varlı wrote:

Hello everyone,

I have 15-minute data from 2017-11-02 13:30:00 to 2022-11-26 23:45:00 and the number of records is 177647.

I would like to ask why my time series is shorter than I expected.

baslangic <- as.POSIXct("2017-11-02 13:30:00", tz = "CET")
bitis <- as.POSIXct("2022-11-26 23:45:00", tz = "CET")
#
zaman_seti <- seq.POSIXt(from = baslangic, to = bitis, by = 60 * 15)
length(zaman_seti)
[1] 177642

but it has to be 177647

And secondly, I have times in this format (2.11.2017 13:30 / DD-MM-YYYY HH:MM:SS)

su_seviyeleri_data <- as.POSIXct(su_seviyeleri_data$kayit_zaman, format = "%Y-%m-%d %H:%M:%S")

I am using this code to change the format but it gives NA as the result.

How can I solve this problem?

Bests,

Hello,

Given your date format, try

format = "%d.%m.%Y %H:%M"

Test with your date time:

x <- "2.11.2017 13:30"
as.POSIXct(x, format = "%d.%m.%Y %H:%M")
#> [1] "2017-11-02 13:30:00 WET"

as.POSIXct(su_seviyeleri_data$kayit_zaman, format = "%d.%m.%Y %H:%M")

Hope this helps,

Rui Barradas
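The first part of the question (177642 versus 177647) was not taken up in the reply. As a sketch only, the length of a regular 15-minute grid between the two end points can be checked by arithmetic. The code below uses tz = "UTC" instead of the poster's "CET" to keep it portable; both end points fall outside daylight saving time, so the elapsed time is the same. The names `t_start`, `t_end`, `n_expected` and `grid` are illustrative.

```r
t_start <- as.POSIXct("2017-11-02 13:30:00", tz = "UTC")
t_end   <- as.POSIXct("2022-11-26 23:45:00", tz = "UTC")

# number of 15-minute steps between the end points, plus the starting point
n_expected <- as.numeric(difftime(t_end, t_start, units = "mins")) / 15 + 1
n_expected

# the same count obtained by building the grid
grid <- seq(t_start, t_end, by = "15 min")
length(grid)

# parsing the day.month.year format from the second part of the question
parsed <- as.POSIXct("2.11.2017 13:30", format = "%d.%m.%Y %H:%M", tz = "UTC")
parsed
```

So a gap-free, duplicate-free 15-minute series over this span has 177642 points, which matches what seq returned. If the data hold 177647 records, it may be worth checking them for repeated time stamps, for example with anyDuplicated(); that is a guess, since the data were not posted.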
Re: [R] if-else that returns vector
On 12/10/2023 at 21:22, Christofer Bogaso wrote:

Hi,

Following expression returns only the first element

ifelse(T, c(1,2,3), c(5,6))

However I am looking for some one-liner expression like above which will return the entire vector.

Is there any way to achieve this?

Hello,

I don't like it but

ifelse(rep(T, length(c(1,2,3))), c(1,2,3), c(5,6))

maybe you should use

max(length(c(1, 2, 3)), length(c(5, 6)))

instead, but it's still ugly.

Hope this helps,

Rui Barradas
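For a single TRUE/FALSE condition, an alternative not mentioned in the reply is the plain if ... else expression, which returns whichever branch is chosen in full. A minimal sketch, with `cond` and `res` as illustrative names:

```r
cond <- TRUE

# if/else is an expression in R, so its value can be assigned directly
res <- if (cond) c(1, 2, 3) else c(5, 6)
res

# for comparison: ifelse() shapes its result like the test, here length 1
ifelse(cond, c(1, 2, 3), c(5, 6))
```

ifelse is meant for element-wise choices over a vector of conditions; with a scalar condition, if/else says what is meant and needs no rep().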
Re: [R] Text showing when R is launched
On 11/10/2023 at 19:21, George Loftus wrote:

Hi,

Thank you for your response

<https://1drv.ms/i/s!AkfoLX--ikbqkweYckSQiXYKXJuR>
Screenshot 2023-10-11 at 19.19.48.png

However this is all that exists in Users/Admin

There were a couple of R files in there which I have since deleted but I am still getting the same issue

Thank you,
George

From: Rui Barradas
Sent: 10 October 2023 12:06
To: George Loftus ; r-help@r-project.org
Subject: Re: [R] Text showing when R is launched

On 09/10/2023 at 23:56, George Loftus wrote:

Good Evening,

I was wondering if you were able to help. I am running R on MacOS; it is the 2020 model Mac, so I have installed the Intel build of R, which I believe is correct.

However, when I launch R or resume the R window after using a different programme, the following text appears. I have also copied and pasted it for ease:

1  HIToolbox         0x7ff82142e0c2 _ZN15MenuBarInstance22RemoveAutoShowObserverEv + 30
2  HIToolbox         0x7ff82146a638 _ZL17BroadcastInternaljPvh + 167
3  SkyLight          0x7ff81c70f23d _ZN12_GLOBAL__N_123notify_datagram_handlerEj15CGSDatagramTypePvmS1_ + 1030
4  SkyLight          0x7ff81ca2205a _ZN21CGSDatagramReadStream26dispatchMainQueueDatagramsEv + 202
5  SkyLight          0x7ff81ca21f81 ___ZN21CGSDatagramReadStream15mainQueueWakeupEv_block_invoke + 18
6  libdispatch.dylib 0x7ff8178867fb _dispatch_call_block_and_release + 12
7  libdispatch.dylib 0x7ff817887a44 _dispatch_client_callout + 8
8  libdispatch.dylib 0x7ff8178947b9 _dispatch_main_queue_drain + 952
9  libdispatch.dylib 0x7ff8178943f3 _dispatch_main_queue_callback_4CF + 31
10 CoreFoundation    0x7ff817b215f0 __CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__ + 9
11 CoreFoundation    0x7ff817ae1b70 __CFRunLoopRun + 2454
12 CoreFoundation    0x7ff817ae0b60 CFRunLoopRunSpecific + 560
13 HIToolbox         0x7ff82142e766 RunCurrentEventLoopInMode + 292
14 HIToolbox         0x7ff82142e576 ReceiveNextEventCommon + 679
15 HIToolbox         0x7ff82142e2b3 _BlockUntilNextEventMatchingListInModeWithFilter + 70
16 AppKit            0x7ff81ac31293 _DPSNextEvent + 909
17 AppKit            0x7ff81ac30114 -[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:] + 1219
18 R                 0x000103d60c76 -[RController doProcessEvents:] + 166
19 R                 0x000103d5b295 -[RController handleReadConsole:] + 149
20 R                 0x000103d6466f Re_ReadConsole + 175
21 libR.dylib        0x000104442154 R_ReplDLLdo1 + 148
22 R                 0x000103d71c47 run_REngineRmainloop + 263
23 R                 0x000103d66d5f -[REngine runREPL] + 143
24 R                 0x000103d56718 main + 792
25 dyld              0x7ff8176d4310 start + 2432

1  HIToolbox         0x7ff8214a1726 _ZN15MenuBarInstance22EnsureAutoShowObserverEv + 102
2  HIToolbox         0x7ff82146a638 _ZL17BroadcastInternaljPvh + 167
3  SkyLight          0x7ff81c70f23d _ZN12_GLOBAL__N_123notify_datagram_handlerEj15CGSDatagramTypePvmS1_ + 1030
4  SkyLight          0x7ff81ca2205a _ZN21CGSDatagramReadStream26dispatchMainQueueDatagramsEv + 202
5  SkyLight          0x7ff81ca21f81 ___ZN21CGSDatagramReadStream15mainQueueWakeupEv_block_invoke + 18
6  libdispatch.dylib 0x7ff8178867fb _dispatch_call_block_and_release + 12
7  libdispatch.dylib 0x7ff817887a44 _dispatch_client_callout + 8
8  libdispatch.dylib 0x7ff8178947b9 _dispatch_main_queue_drain + 952
9  libdispatch.dylib 0x7ff8178943f3 _dispatch_main_queue_callback_4CF + 31
10 CoreFoundation    0x7ff817b215f0 __CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__ + 9
11 CoreFoundation    0x7ff817ae1b70 __CFRunLoopRun + 2454
12 CoreFoundation    0x7ff817ae0b60 CFRunLoopRunSpecific + 560
13 HIT
Re: [R] Text showing when R is launched
0x000103d71c47 run_REngineRmainloop + 263
23 R                 0x000103d66d5f -[REngine runREPL] + 143
24 R                 0x000103d56718 main + 792
25 dyld              0x7ff8176d4310 start + 2432

Are you able to inform me what is causing this? I can't seem to find any online help regarding this

Thank you in advance,
George Loftus

Hello,

Try deleting file

/Users/admin/.RData

It is restoring the previous session and this is often a source of problems.

Hope this helps,

Rui Barradas
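A cautious variant of the advice above is to rename the saved workspace instead of deleting it, so it can be restored if needed. The sketch below does this with base R file functions; the helper name `set_aside_rdata` and the ".RData.bak" backup name are made up for the illustration, and the demonstration runs in a temporary directory, not in the real home directory.

```r
# Hypothetical helper: move <dir>/.RData to <dir>/.RData.bak if it exists
set_aside_rdata <- function(dir = path.expand("~")) {
  rdata <- file.path(dir, ".RData")
  if (!file.exists(rdata)) return(invisible(FALSE))
  invisible(file.rename(rdata, file.path(dir, ".RData.bak")))
}

# Demonstration on a throwaway directory
demo_dir <- file.path(tempdir(), "rdata_demo")
dir.create(demo_dir, showWarnings = FALSE)
x <- 1
save(x, file = file.path(demo_dir, ".RData"))

set_aside_rdata(demo_dir)
list.files(demo_dir, all.files = TRUE, no.. = TRUE)
```

For the case in this thread the call would be set_aside_rdata("/Users/admin"), run from a session in which the problem does not prevent typing commands.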
Re: [R] Is it possible to get a downward pointing solid triangle plotting symbol in R?
On 06/10/2023 at 10:09, Chris Evans via R-help wrote:

The reason I am asking is that I would like to mark areas on a plot using geom_polygon() and aes(fill = variable) to fill various polygons forming the background of a plot with different colours. Then I would like to overlay that with points representing direction of change: improved, no reliable change, deteriorated. The obvious symbols to use for those three directions are an upward arrow, a circle or square and a downward pointing arrow.

There is a solid upward pointing triangle symbol in R (pch = 17) and there are both upward and downward pointing open triangle symbols (pch 24 and 25) but to fill those with a solid colour so they will be visible over the background requires that I use a fill aesthetic and that gets me a mess with the legend as I will have used a different fill mapping to fill the polygons.

This silly reprex shows the issue I think.

library(tidyverse)

tibble(x = 2:9, y = 2:9, c = c(rep("A", 5), rep("B", 3))) -> tmpTibPoints
tibble(x = c(1, 5, 5, 1), y = c(1, 1, 5, 5), a = rep("a", 4)) -> tmpTibArea1
tibble(x = c(5, 10, 10, 5), y = c(1, 1, 5, 5), a = rep("b", 4)) -> tmpTibArea2
tibble(x = c(1, 5, 5, 1), y = c(5, 5, 10, 10), a = rep("c", 4)) -> tmpTibArea3
tibble(x = c(5, 10, 10, 5), y = c(5, 5, 10, 10), a = rep("d", 4)) -> tmpTibArea4
bind_rows(tmpTibArea1, tmpTibArea2, tmpTibArea3, tmpTibArea4) -> tmpTibAreas

ggplot(data = tmpTibAreas, aes(x = x, y = y)) +
  geom_polygon(data = tmpTibAreas, aes(x = x, y = y, fill = a)) +
  geom_point(data = tmpTibPoints, aes(x = x, y = y, fill = c), pch = 24, size = 6)

Does anyone know a way to create a solid downward pointing symbol? Or another workaround?

TIA,

Chris

Hello,

Maybe you can solve the problem with unicode characters. See the two scale_*_manual at the end of the plot.
# Unicode characters for black up- and down-pointing triangles
pts_shapes <- c("\U25B2", "\U25BC") |> setNames(c("A", "B"))
pts_colors <- c("blue", "red") |> setNames(c("A", "B"))

ggplot(data = tmpTibAreas, aes(x = x, y = y)) +
  geom_polygon(data = tmpTibAreas, aes(x = x, y = y, fill = a)) +
  geom_point(data = tmpTibPoints, aes(x = x, y = y, color = c, shape = c), size = 6) +
  scale_shape_manual(values = pts_shapes) +
  scale_color_manual(values = pts_colors)
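For readers working in base graphics instead of ggplot2, a solid downward triangle is available directly: symbols 21 to 25 take a border colour from col and a fill colour from bg, and pch = 25 points down. The sketch below is an illustration only, with made-up coordinates and a hypothetical mapping of the three outcomes to symbols; it draws to a null PDF device so that it runs without opening a window.

```r
# Hypothetical mapping of the three outcomes to fillable symbols
shapes <- c(improved = 24, no_change = 21, deteriorated = 25)

x <- 1:3
y <- c(2, 2, 2)

pdf(NULL)  # null device: nothing is written to disk
plot(x, y, pch = shapes, col = "black", bg = "black", cex = 2,
     ylim = c(1, 3), xlab = "", ylab = "")
invisible(dev.off())
```

This does not remove the legend conflict described for ggplot2, where fill is already mapped for the polygons; it only shows that the symbol itself exists in R.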
Re: [R] R issue / No buffer space available
On 04/10/2023 at 21:28, Ohad Oren, MD wrote:

Hello,

I keep getting the following message about 'no buffer space available'. I am using R studio via connection to server. I verified that the connection to the server is good.

2023-10-04T20:26:25.698193Z [rsession-oo968] ERROR system error 105 (No buffer space available) [host: localhost, uri: /log_message, path: /var/run/rstudio-server/rstudio-rserver/rserver-monitor.socket]; OCCURRED AT void rstudio::core::http::LocalStreamAsyncClient::handleConnect(const rstudio_boost::system::error_code&) src/cpp/session/SessionModuleContext.cpp:124

Will appreciate your help!

Ohad

Hello,

RStudio is an IDE for R, not R itself. That is an RStudio error and RStudio technical support [1] is better suited to solve your problem.

[1] https://community.rstudio.com/

Hope this helps,

Rui Barradas
Re: [R] annotate
On 04/10/2023 at 20:34, Subia Thomas OI-US-LIV5 wrote:

Colleagues,

I wish to create y-data labels which meet a criterion. Here is my reproducible code.

library(dplyr)
library(ggplot2)
library(cowplot)

above_92 <- filter(faithful,waiting>92)

ggplot(faithful,aes(x=eruptions,y=waiting))+
  geom_point(shape=21,size=3,fill="orange")+
  theme_cowplot()+
  geom_hline(yintercept = 92)+
  annotate(geom="text",x=above_92$eruptions,y=above_92$waiting+2,label=above_92$waiting)

A bit of trial and error is required to figure out what number to add or subtract to above_92$waiting.

Is there a more efficient way to do this?

Thomas Subia
Lean Six Sigma Senior Practitioner
DRÄXLMAIER Group
DAA Draexlmaier Automotive of America LLC
mailto:thomas.su...@draexlmaier.com
http://www.draexlmaier.com

"In God we trust. All others must bring data." Edward Deming

Public: All rights reserved. Distribution to third parties allowed.

Hello,

Yes, there is an automatic way of doing this. Use a new data set in geom_text or annotate. Below I use geom_text. Then vjust will take care of the label placement.

library(dplyr)
library(ggplot2)
library(cowplot)

above_92 <- filter(faithful, waiting > 92)

ggplot(faithful, aes(x = eruptions, y = waiting)) +
  geom_point(shape=21,size=3,fill="orange") +
  geom_hline(yintercept = 92) +
  # use a new data argument here
  geom_text(
    data = above_92,
    mapping = aes(x = eruptions, y = waiting, label = waiting),
    vjust = -1
  ) +
  theme_cowplot()

Hope this helps,

Rui Barradas
Re: [R] Jim Lemon RIP
My sympathies for your loss.

Jim Lemon was a dedicated contributor to the R community and his answers were always welcome. Jim will be missed.

Rui Barradas

On 04/10/2023 at 23:36, Jim Lemon wrote:

Hello,

I am very sad to let you know that my husband Jim died on 18th September. I apologise for not letting you know earlier but I had trouble finding the password for his phone.

Kind regards,
Juel
Re: [R] Grouping by Date and showing count of failures by date
On 29/09/2023 at 21:29, Paul Bernal wrote:

Dear friends,

Hope you are doing great. I am attaching the dataset I am working with because, when I tried to dput() it, I was not able to copy the entire result from dput(), so I apologize in advance for that.

I am interested in creating a column named Failure_Date_Period that has the FAILDATE but formatted as YYYY_MM. Then I want to count the number of failures (given by column WONUM) and just have a dataframe that has the FAILDATE and the count of WONUM.

I tried this:

pt <- PivotTable$new()
pt$addData(failuredf)
pt$addColumnDataGroups("FAILDATE")
pt <- PivotTable$new()
pt$addData(failuredf)
pt$addColumnDataGroups("FAILDATE")
pt$defineCalculation(calculationName = "FailCounts", summariseExpression="n()")
pt$renderPivot()

but I was not successful. Bottom line, I need to create a new dataframe that has the number of failures by FAILDATE, but in YYYY-MM format.

Any help and/or guidance will be greatly appreciated.

Kind regards,
Paul

Hello,

No data is attached. Maybe try

dput(head(failuredf, 30))

?

And where can we find non-base PivotTable? Please start the scripts with calls to library() when using non-base functionality.

Hope this helps,

Rui Barradas
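Since no data were attached, the following is only a sketch of how the year-month count could be done in base R. The column names FAILDATE and WONUM and the data frame name failuredf come from the question; the five rows of data are invented for the illustration, and FAILDATE is assumed to be (or to be convertible to) a Date.

```r
# Invented toy data standing in for the real failuredf
failuredf <- data.frame(
  WONUM = c("W001", "W002", "W003", "W004", "W005"),
  FAILDATE = as.Date(c("2023-01-05", "2023-01-20", "2023-02-11",
                       "2023-02-12", "2023-02-28"))
)

# year-month label for each failure
failuredf$Failure_Date_Period <- format(failuredf$FAILDATE, "%Y-%m")

# one row per period with the number of work orders in it
fail_counts <- aggregate(WONUM ~ Failure_Date_Period, data = failuredf, FUN = length)
names(fail_counts)[2] <- "FailCounts"
fail_counts
```

If the real FAILDATE column is character or POSIXct, it would need as.Date() (possibly with a format argument) before the format() call; that depends on data not shown in the thread.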
Re: [R] predict function type class vs. prob
Às 11:12 de 22/09/2023, Milbert, Sabine (LGL) escreveu: Dear R Help Team, My research group and I use R scripts for our multivariate data screening routines. During routine use, we encountered some inconsistencies within the predict() function of the R Stats Package. Through internal research, we were unable to find the reason for this and have decided to contact your help team with the following issue: The predict() function is used once to predict the class membership of a new sample (type = "class") on a trained linear SVM model for distinguishing two classes (using the caret package). It is then used to also examine the probability of class membership (type = "prob"). Both are then presented in an R shiny output. Within the routine, we noticed two samples (out of 100+) where the class prediction and probability prediction did not match. The prediction probabilities of one class (52%) did not match the class membership within the predict function. We use the same seed and the discrepancy is reproducible in this sample. The same problem did not occur in other trained models (lda, random forest, radial SVM...). Is there a weighing of classes within the prediction function or is the classification limit not at 50%/a majority vote? Or do you have another explanation for this discrepancy, please let us know. PS: If this is an issue based on the model training function of the caret package and therefore not your responsibility, please let us know. Thank you in advance for your support! Yours sincerely, Sabine Milbert [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hello, I cannot tell what is going on but I would like to make a correction to your post. 
predict() is a generic function with methods for objects of several classes in many packages. In base package stats you will find methods for objects (fits) of class lm, glm and others, see ?predict. The method you are asking about is predict.train, defined in package caret, not in package stats. To see which predict method is being called, check

class(your_fit)

Hope this helps,

Rui Barradas
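
As a small illustration of the dispatch described above, the sketch below uses an lm fit on the built-in cars data, because it needs nothing beyond base R. The object names are made up; for a caret model the same inspection should point to a method from caret (predict.train, as noted in the reply) instead of one from stats.

```r
# A plain lm fit, used only to illustrate S3 dispatch
fit <- lm(dist ~ speed, data = cars)

# predict() looks for predict.<class> for each element of class(fit), in order
class(fit)

# Retrieve the method that would be called and the package that defines it
found <- getS3method("predict", class(fit)[1], optional = TRUE)
environmentName(environment(found))
```

Running the same lines on the trained model shows which package's predict method is actually used, which is the first thing to establish before reporting a discrepancy.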
Re: [R] Hadamard transformation
On 18/09/2023 at 18:45, mohan radhakrishnan wrote:

Hello,

I am attempting to port the R code which is an answer to https://codegolf.stackexchange.com/questions/194229/implement-the-2d-hadamard-transform

function(M){for(i in 1:log2(nrow(M)))T=T%x%matrix(1-2*!3:0,2)/2; print(T); T%*%M%*%T}

The code, 3 inputs and the corresponding outputs are shown in https://tio.run/##PYyxCsIwFEX3fkUcAu@VV7WvcSl2dOwi8QNqNSXQJhAqrYjfHoOIwz3D4XBDNOJYiGgerp@td9Diy/gAVlgnynr0A4MLfkkeUTdarnLq5mBXKAvON1W9J8YdZ1rmsk3T72jgV/TAVBHTAROYrs/00@jz5YSY/aOSFKmvGP1yD9sk4Wa7ARSSRowf

These are the inputs.

f(matrix(c(2,3,2,5),2,2,byrow=TRUE))
f(matrix(1,4,4))
f(lower.tri(diag(4),T))

My attempt to port this R code to another framework (Tensorflow) was only partially successful because I didn't fully understand the cryptic R code. The second input shown above works after hacking Tensorflow for a long time.

My question is this. Can anyone code this in a clear way so that I can understand? I understand the Kronecker product and matrix multiplication and can port that code, but I am missing something as the same ported code does not work for all inputs.

Thanks,
Mohan

Hello,

Is this what you want? (I have changed the notation a bit.)
H <- function(M){
  H0 <- 1
  Transf <- matrix(c(1, 1, 1, -1), 2L)
  for(i in 1:log2(nrow(M))) {
    H0 <- H0 %x% Transf/2
  }
  H0 %*% M %*% H0
}

x <- matrix(c(2, 3, 2, 5), 2, 2, byrow = TRUE)
y <- matrix(1, 4, 4)
z <- lower.tri(diag(4), TRUE)
z[] <- apply(z, 2, as.integer)

H(x)
H(y)
H(z)

Hope this helps,

Rui Barradas
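
For anyone porting this to another framework, the following sketch spells out the two golfed pieces and adds a few checks. It is an illustration and not part of the original reply: the name hadamard2d is made up, and seq_len() replaces 1:log2(nrow(M)) as a defensive choice so that a 1 x 1 input does not loop over c(1, 0).

```r
# The golfed expression 1-2*!3:0 is just the vector (1, 1, 1, -1):
# !(3:0) is FALSE FALSE FALSE TRUE, i.e. 0 0 0 1 in arithmetic.
stopifnot(identical(1 - 2 * (!(3:0)), c(1, 1, 1, -1)))

# Hypothetical helper name; same computation as H() in the reply.
hadamard2d <- function(M) {
  n <- nrow(M)
  k <- round(log2(n))
  stopifnot(n == ncol(M), 2^k == n)          # square, size a power of two
  H2 <- matrix(c(1, 1, 1, -1), 2L) / 2       # scaled 2 x 2 building block
  Hn <- matrix(1, 1, 1)
  for (i in seq_len(k)) Hn <- Hn %x% H2      # k Kronecker products
  Hn %*% M %*% Hn
}

x <- matrix(c(2, 3, 2, 5), 2, 2, byrow = TRUE)
y <- matrix(1, 4, 4)

hadamard2d(x)
hadamard2d(y)
```

Two points are easy to lose in a port. First, %x% binds tighter than /, so each Kronecker step divides the whole product by 2. Second, with this scaling the transform is not its own inverse: applying it twice returns the input divided by n^2, where n is the matrix size.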
Re: [R] Help with plotting and date-times for climate data
On 12/09/2023 at 21:50, Kevin Zembower via R-help wrote:

Hello,

I'm trying to calculate the mean temperature max from a file of climate data, and plot it over a range of days in the year. I've downloaded the data, and cleaned it up the way I think it should be. However, when I plot it, the geom_smooth line doesn't show up. I think that's because my x axis is characters or factors. Here's what I have so far:

library(tidyverse)

data <- read_csv("Ely_MN_Weather.csv")

start_day = yday(as_date("2023-09-22"))
end_day = yday(as_date("2023-10-15"))

d <- as_tibble(data) %>%
  select(DATE,TMAX,TMIN) %>%
  mutate(DATE = as_date(DATE),
         yday = yday(DATE),
         md = sprintf("%02d-%02d", month(DATE), mday(DATE))
  ) %>%
  filter(yday >= start_day & yday <= end_day) %>%
  mutate(md = as.factor(md))

d_sum <- d %>%
  group_by(md) %>%
  summarize(tmax_mean = mean(TMAX, na.rm=TRUE))

## Here's the filtered data:
dput(d_sum)

structure(list(md = structure(1:25, levels = c("09-21", "09-22", "09-23", "09-24", "09-25", "09-26", "09-27", "09-28", "09-29", "09-30", "10-01", "10-02", "10-03", "10-04", "10-05", "10-06", "10-07", "10-08", "10-09", "10-10", "10-11", "10-12", "10-13", "10-14", "10-15"), class = "factor"), tmax_mean = c(65, 62.2, 61.3, 63.9, 64.3, 60.1, 62.3, 60.5, 61.9, 61.2, 63.7, 59.5, 59.6, 61.6, 59.4, 58.8, 55.9, 58.125, 58, 55.7, 57, 55.4, 49.8, 48.75, 43.7)), class = c("tbl_df", "tbl", "data.frame" ), row.names = c(NA, -25L))

ggplot(data = d_sum, aes(x = md)) +
  geom_point(aes(y = tmax_mean, color = "blue")) +
  geom_smooth(aes(y = tmax_mean, color = "blue"))

My questions are:
1. Why isn't my geom_smooth plotting? How can I fix it?
2. I don't think I'm handling the month and day combination correctly. Is there a way to encode month and day (but not year) as a date?
3. (Minor point) Why does my graph of tmax_mean come out red when I specify "blue"?

Thanks for any advice or guidance you can offer. I really appreciate the expertise of this group.
-Kevin

Hello,

The problem is that the dates are factors, not real dates, and geom_smooth does not interpolate along a discrete axis (here the x axis).

Paste a fake year with md, coerce to date and plot. I have simplified the aes() calls and added a date scale in order to make the x axis more readable.

Without the formula and method arguments, geom_smooth prints a message, so they are now made explicit.

suppressPackageStartupMessages({
  library(dplyr)
  library(ggplot2)
})

d_sum %>%
  mutate(md = paste("2023", md, sep = "-"),
         md = as.Date(md)) %>%
  ggplot(aes(x = md, y = tmax_mean)) +
  geom_point(color = "blue") +
  geom_smooth(
    formula = y ~ x,
    method = loess,
    color = "blue"
  ) +
  scale_x_date(date_breaks = "7 days", date_labels = "%m-%d")

Hope this helps,

Rui Barradas
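
The "fake year" step can be tried on its own in base R. The sketch below is only an illustration with a handful of made-up labels; the year 2023 is the arbitrary one used in the reply, and a leap year might be a safer choice if a 02-29 label can occur in the data.

```r
# Month-day labels like the ones in d_sum$md
md <- c("09-21", "09-22", "10-01", "10-15")

# Prepend an arbitrary year and coerce: the result is a real Date vector,
# so it sorts and spaces correctly on a continuous axis
d <- as.Date(paste("2023", md, sep = "-"))

class(d)
diff(d)                 # gaps in days between consecutive labels
format(d, "%m-%d")      # the labels can be recovered for display
```

This also bears on question 2 in the post: base R has no date class without a year, so a placeholder year plus a "%m-%d" display format is a common workaround.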
Re: [R] graph in R with grouping letters from the turkey test with agricolae package
On 12/09/2023 at 16:24, Loop Vinyl wrote:

I would like to produce the attached graph (graph1) with the R package agricolae. Could someone give me an example with the attached data (data)? I expect an adapted graph (graph2) with the data (data).

Best regards

Hello,

There are no attached graphs, only data. Can you post the code you have tried?

Hope this helps,

Rui Barradas
Re: [R] prop.trend.test
Às 10:06 de 08/09/2023, peter dalgaard escreveu: Yes, this was written a bit bone-headed (as I am allowed to say...) If you look at the code, you will see inside: a <- anova(lm(freq ~ score, data = list(freq = x/n, score = as.vector(score)), weights = w)) and the lm() inside should give you the direction via the sign of the regression coefficient on "score". So, at least for now, you could just doctor a copy of the code for your own purposes, as in fit <- lm(freq ~ score, data = list(freq = x/n, score = as.vector(score)), weights = w) a <- anova(fit) and arrange to return coef(fit)["score"] at the end. Something like structure(... estimate=c(lpm.slope=coef(fit)["score"]) ) (I expect that you might also extract the t-statistic from coef(summary(fit)) and find that it is the signed square root of the Chi-square, but I won't have time to test that just now.) -pd On 8 Sep 2023, at 07:22 , Thomas Subia via R-help wrote: Colleagues, Thanks all for the responses. I am monitoring the daily total number of defects per sample unit. I need to know whether this daily defect proportion is trending upward (a bad thing for a manufacturing process). My first thought was to use either a u or a u' control chart for this. As far as I know, u or u' charts are poor to detect drifts. This is why I chose to use prop.trend.test to detect trends in proportions. While prop.trend.test can confirm the existence of a trend, as far as I know, it is left to the user to determine what direction that trend is. One way to illustrate trending is of course to plot the data and use geom_smooth and method lm For the non-statisticians in my group, I've found that using this method along with the p-value of prop.trend.test, makes it easier for the users to determine the existence of trending and its direction. If there are any other ways to do this, please let me know. 
Thomas Subia

On Thursday, September 7, 2023 at 10:31:27 AM PDT, Rui Barradas wrote:

On 07/09/2023 at 14:23, Thomas Subia via R-help wrote:

Colleagues

Consider

smokers <- c( 83, 90, 129, 70 )
patients <- c( 86, 93, 136, 82 )
prop.trend.test(smokers, patients)

Output:

Chi-squared Test for Trend in Proportions

data: smokers out of patients ,
 using scores: 1 2 3 4
X-squared = 8.2249, df = 1, p-value = 0.004132

# trend test for proportions indicates proportions are trending.

How does one identify the direction of trending?

# prop.test indicates that the proportions are unequal but does little to indicate trend direction.

All the best,
Thomas Subia

Hello,

By visual inspection it seems that there is a decreasing trend. Note that the sample estimates of prop.test and smokers/patients are equal.

smokers <- c( 83, 90, 129, 70 )
patients <- c( 86, 93, 136, 82 )

prop.test(smokers, patients)$estimate
#>    prop 1    prop 2    prop 3    prop 4
#> 0.9651163 0.9677419 0.9485294 0.8536585

smokers/patients
#> [1] 0.9651163 0.9677419 0.9485294 0.8536585

plot(smokers/patients, type = "b")

Hope this helps,

Rui Barradas

Hello,

Actually, the t-statistic is not the signed square root of the X-squared test statistic. I have edited the function, assigned the lm fit and returned it as is. (print.htest won't print this new list member so the output is not cluttered with irrelevant noise.)
smokers <- c( 83, 90, 129, 70 )
patients <- c( 86, 93, 136, 82 )

edit(prop.trend.test, file = "ptt.R")
source("ptt.R")

# stats::prop.trend.test edited to include the results
# of the lm fit and saved under a new name
ptt <- function (x, n, score = seq_along(x))
{
  method <- "Chi-squared Test for Trend in Proportions"
  dname <- paste(deparse1(substitute(x)), "out of", deparse1(substitute(n)),
                 ",\n using scores:", paste(score, collapse = " "))
  x <- as.vector(x)
  n <- as.vector(n)
  p <- sum(x)/sum(n)
  w <- n/p/(1 - p)
  a <- anova(fit <- lm(freq ~ score, data = list(freq = x/n, score = as.vector(score)), weights = w))
  chisq <- c(`X-squared` = a["score", "Sum Sq"])
  structure(list(statistic =
Re: [R] prop.trend.test
On 07/09/2023 at 14:23, Thomas Subia via R-help wrote:

Colleagues

Consider

smokers <- c( 83, 90, 129, 70 )
patients <- c( 86, 93, 136, 82 )
prop.trend.test(smokers, patients)

Output:

Chi-squared Test for Trend in Proportions

data: smokers out of patients ,
 using scores: 1 2 3 4
X-squared = 8.2249, df = 1, p-value = 0.004132

# trend test for proportions indicates proportions are trending.

How does one identify the direction of trending?

# prop.test indicates that the proportions are unequal but does little to indicate trend direction.

All the best,
Thomas Subia

Hello,

By visual inspection it seems that there is a decreasing trend. Note that the sample estimates of prop.test and smokers/patients are equal.

smokers <- c( 83, 90, 129, 70 )
patients <- c( 86, 93, 136, 82 )

prop.test(smokers, patients)$estimate
#>    prop 1    prop 2    prop 3    prop 4
#> 0.9651163 0.9677419 0.9485294 0.8536585

smokers/patients
#> [1] 0.9651163 0.9677419 0.9485294 0.8536585

plot(smokers/patients, type = "b")

Hope this helps,

Rui Barradas
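
Besides visual inspection, the direction can be read from the sign of the slope in the weighted regression that prop.trend.test runs internally (the lm() call quoted from its source elsewhere in this thread). The sketch below reproduces that fit outside the function; the variable names score, fit and slope are chosen here for illustration only.

```r
smokers  <- c(83, 90, 129, 70)
patients <- c(86, 93, 136, 82)
score    <- seq_along(smokers)          # default scores 1, 2, 3, 4

# Same weights and model as inside stats::prop.trend.test
p <- sum(smokers) / sum(patients)
w <- patients / p / (1 - p)
fit <- lm(freq ~ score,
          data = list(freq = smokers / patients, score = score),
          weights = w)

slope <- coef(fit)[["score"]]
sign(slope)     # -1 means the proportions decrease with the score

# The test statistic is the "score" sum of squares of this fit
anova(fit)["score", "Sum Sq"]
```

A negative slope together with a small p-value from prop.trend.test supports a downward trend; a positive slope would indicate an upward one.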
Re: [R] Regarding error in RStudio
On 05/09/2023 at 17:59, Sukriti Sood wrote:

Hi,

I am Sukriti Sood, a research analyst at Woodstock Institute <https://woodstockinst.org/>. I use RStudio extensively for our analysis. I have been facing two issues for a while:

1. I am unable to copy from RStudio and paste into any other program, or vice versa.
2. I am facing some kind of a conversion error (screenshot attached).

I tried looking online but could not find a resolution to these issues. Could I please get some help with this urgently?

Thanks!

Best,
Sukriti Sood

Sukriti Sood | Research Analyst
Woodstock Institute
Pronouns: She/Her/Hers
67 East Madison, Suite 2108 | Chicago, Illinois 60603
O (312) 368-0310 x2029 | C (610) 604-6708
www.woodstockinst.org | ss...@woodstockinst.org

Hello,

You should post RStudio questions to the RStudio support service; they answer quickly and the answers are generally good.

It is written at the bottom of the attached image that the workspace was loaded from the file C:/WSI/.RData. Close RStudio, remove this file and restart. See if that solves it.

Hope this helps,

Rui Barradas
Re: [R] Merge and replace data
On 05/09/2023 at 09:55, roslinazairimah zakaria wrote:

Hi all,

I have these data

x1 <- c(116,0,115,137,127,0,0)
x2 <- c(0,159,0,0,0,159,127)

I want:

xx <- c(116,115,137,127,159, 127)

I would like to merge these data into one column. Whenever a value is 0, it should be replaced by the value in the other column, which is non-zero. I tried append and merge but failed to get what I want.

Hello,

That's a case for ?pmax:

x1 <- c(116,0,115,137,127,0,0)
x2 <- c(0,159,0,0,0,159,127)
pmax(x1, x2)
#> [1] 116 159 115 137 127 159 127

Hope this helps,

Rui Barradas
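
pmax() works here because every non-zero value is positive. If negative values were possible, a more literal translation of "take the other column where this one is zero" may be safer; the following sketch is an alternative under that assumption, not something from the original reply.

```r
x1 <- c(116, 0, 115, 137, 127, 0, 0)
x2 <- c(0, 159, 0, 0, 0, 159, 127)

# Where x1 is zero take x2, otherwise keep x1
xx <- ifelse(x1 == 0, x2, x1)
xx

# For these data both approaches agree
identical(xx, pmax(x1, x2))
```

Note that both versions return seven values, one per position, whereas the desired vector written in the question lists only six.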
Re: [R] aggregate formula - differing results
Às 12:51 de 04/09/2023, Ivan Calandra escreveu: Thanks Rui for your help; that would be one possibility indeed. But am I the only one who finds that behavior of aggregate() completely unexpected and confusing? Especially considering that dplyr::summarise() and doBy::summaryBy() deal with NAs differently, even though they all use mean(na.rm = TRUE) to calculate the group stats. Best wishes, Ivan On 04/09/2023 13:46, Rui Barradas wrote: Às 10:44 de 04/09/2023, Ivan Calandra escreveu: Dear useRs, I have just stumbled across a behavior in aggregate() that I cannot explain. Any help would be appreciated! Sample data: my_data <- structure(list(ID = c("FLINT-1", "FLINT-10", "FLINT-100", "FLINT-101", "FLINT-102", "HORN-10", "HORN-100", "HORN-102", "HORN-103", "HORN-104"), EdgeLength = c(130.75, 168.77, 142.79, 130.1, 140.41, 121.37, 70.52, 122.3, 71.01, 104.5), SurfaceArea = c(1736.87, 1571.83, 1656.46, 1247.18, 1177.47, 1169.26, 444.61, 1791.48, 461.15, 1127.2), Length = c(44.384, 29.831, 43.869, 48.011, 54.109, 41.742, 23.854, 32.075, 21.337, 35.459), Width = c(45.982, 67.303, 52.679, 26.42, 25.149, 33.427, 20.683, 62.783, 26.417, 35.297), PLATWIDTH = c(38.84, NA, 15.33, 30.37, 11.44, 14.88, 13.86, NA, NA, 26.71), PLATTHICK = c(8.67, NA, 7.99, 11.69, 3.3, 16.52, 4.58, NA, NA, 9.35), EPA = c(78, NA, 78, 54, 72, 49, 56, NA, NA, 56), THICKNESS = c(10.97, NA, 9.36, 6.4, 5.89, 11.05, 4.9, NA, NA, 10.08), WEIGHT = c(34.3, NA, 25.5, 18.6, 14.9, 29.5, 4.5, NA, NA, 23), RAWMAT = c("FLINT", "FLINT", "FLINT", "FLINT", "FLINT", "HORNFELS", "HORNFELS", "HORNFELS", "HORNFELS", "HORNFELS")), row.names = c(1L, 2L, 3L, 4L, 5L, 111L, 112L, 113L, 114L, 115L), class = "data.frame") 1) Simple aggregation with 2 variables: aggregate(cbind(Length, Width) ~ RAWMAT, data = my_data, FUN = mean, na.rm = TRUE) 2) Using the dot notation - different results: aggregate(. 
~ RAWMAT, data = my_data[-1], FUN = mean, na.rm = TRUE) 3) Using dplyr, I get the same results as #1: group_by(my_data, RAWMAT) %>% summarise(across(c("Length", "Width"), ~ mean(.x, na.rm = TRUE))) 4) It gets weirder: using all columns in #1 give the same results as in #2 but different from #1 and #3 aggregate(cbind(EdgeLength, SurfaceArea, Length, Width, PLATWIDTH, PLATTHICK, EPA, THICKNESS, WEIGHT) ~ RAWMAT, data = my_data, FUN = mean, na.rm = TRUE) So it seems it is not only due to the notation (cbind() vs. dot). Is it a bug? A peculiar thing in my dataset? I tend to think this could be due to some variables (or their names) as all notations seem to agree when I remove some variables (although I haven't found out which variable(s) is (are) at fault), e.g.: my_data2 <- structure(list(ID = c("FLINT-1", "FLINT-10", "FLINT-100", "FLINT-101", "FLINT-102", "HORN-10", "HORN-100", "HORN-102", "HORN-103", "HORN-104"), EdgeLength = c(130.75, 168.77, 142.79, 130.1, 140.41, 121.37, 70.52, 122.3, 71.01, 104.5), SurfaceArea = c(1736.87, 1571.83, 1656.46, 1247.18, 1177.47, 1169.26, 444.61, 1791.48, 461.15, 1127.2), Length = c(44.384, 29.831, 43.869, 48.011, 54.109, 41.742, 23.854, 32.075, 21.337, 35.459), Width = c(45.982, 67.303, 52.679, 26.42, 25.149, 33.427, 20.683, 62.783, 26.417, 35.297), RAWMAT = c("FLINT", "FLINT", "FLINT", "FLINT", "FLINT", "HORNFELS", "HORNFELS", "HORNFELS", "HORNFELS", "HORNFELS")), row.names = c(1L, 2L, 3L, 4L, 5L, 111L, 112L, 113L, 114L, 115L), class = "data.frame") aggregate(cbind(EdgeLength, SurfaceArea, Length, Width) ~ RAWMAT, data = my_data2, FUN = mean, na.rm = TRUE) aggregate(. ~ RAWMAT, data = my_data2[-1], FUN = mean, na.rm = TRUE) group_by(my_data2, RAWMAT) %>% summarise(across(where(is.numeric), ~ mean(.x, na.rm = TRUE))) Thank you in advance for any hint. Best wishes, Ivan *LEIBNIZ-ZENTRUM* *FÜR ARCHÄOLOGIE* *Dr. 
Ivan CALANDRA* **Head of IMPALA (IMaging Platform At LeizA) *MONREPOS* Archaeological Research Centre, Schloss Monrepos 56567 Neuwied, Germany T: +49 2631 9772 243 T: +49 6131 8885 543 ivan.calan...@leiza.de leiza.de <http://www.leiza.de/> <http://www.leiza.de/> ORCID <https://orcid.org/-0003-3816-6359> ResearchGate <https://www.researchgate.net/profile/Ivan_Calandra> LEIZA is a foundation under public law of the State of Rhineland-Palatinate and the City of Mainz. Its headquarters are in Mainz. Supervision is carried out by the Ministry of Science and Health of the State of Rhineland-Palatinate. LEIZA is a research museum of the Leibniz Association. __
Re: [R] aggregate formula - differing results
s in at least one column and the results are the same. However, this will not give the mean values of the other numeric columns, just of those two.

# define a vector of columns of interest
cols <- c("Length", "Width", "RAWMAT")

# 1) Simple aggregation with 2 variables, select cols:
aggregate(cbind(Length, Width) ~ RAWMAT, data = my_data[cols], FUN = mean, na.rm = TRUE)

# 2) Using the dot notation - if cols are selected, equal results:
aggregate(. ~ RAWMAT, data = my_data[cols], FUN = mean, na.rm = TRUE)

# 3) Using dplyr, the results are now the same as in #1 and #2:
my_data %>%
  select(all_of(cols)) %>%
  group_by(RAWMAT) %>%
  summarise(across(c("Length", "Width"), ~ mean(.x, na.rm = TRUE)))

Hope this helps,

Rui Barradas
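
The behaviour discussed in this thread can be reproduced on a toy data set. The formula method of aggregate() has an na.action argument that defaults to na.omit, so any row with an NA in a variable used by the formula is dropped before FUN is called; na.rm = TRUE then has nothing left to remove. The data frame below is made up for illustration, and na.action = na.pass is one possible way to get per-column NA handling.

```r
df <- data.frame(g = c("a", "a", "b", "b"),
                 x = c(1, 3, 5, 7),
                 y = c(10, NA, 30, 50))

# Only x in the formula: no row is dropped, group means of x are 2 and 6
r1 <- aggregate(x ~ g, data = df, FUN = mean, na.rm = TRUE)

# Dot notation pulls in y: row 2 (NA in y) is removed for ALL columns,
# so the mean of x in group "a" becomes 1
r2 <- aggregate(. ~ g, data = df, FUN = mean, na.rm = TRUE)

# Keep all rows and let mean(na.rm = TRUE) handle NAs column by column
r3 <- aggregate(. ~ g, data = df, FUN = mean, na.rm = TRUE,
                na.action = na.pass)

r1; r2; r3
```

With na.pass, the x means match the two-variable call again, which is also what the dplyr version computes.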
Re: [R] Error in analysis of Rasch using eRm package.
On 21/08/2023 at 16:49, nor azila wrote:

Dear R users,

I am using the eRm package to analyse my polytomous data, as below:

Respondents = 277 people
Items = 30 questions

The data consist of 0, 1, 2, 3 responses/answers. I am having a problem writing the code below because I do not know what I should put in each of the arguments.

data.frame(..., row.names = NULL, check.rows = FALSE, check.names = TRUE, fix.empty.names = TRUE, stringsAsFactors = FALSE)

Thank you very much for any help given.

Azi.

Hello,

It seems that your data, in tabular form, have one column per answer, so you would end up with 30 columns, maybe with an extra id column.

Can you post sample data? If not, make up the answers and post the answers of the first 6 individuals or so.

Hope this helps,

Rui Barradas
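
To make the request for sample data concrete, here is one way made-up responses of that shape could be generated and shared. Everything in it is hypothetical (the object name resp, the column names Item1 to Item30, the random answers); it only illustrates the layout of one row per respondent and one column per item.

```r
set.seed(1)                       # reproducible made-up answers
n_resp <- 6                       # first 6 individuals, as suggested
n_item <- 30                      # 30 questions

resp <- as.data.frame(
  matrix(sample(0:3, n_resp * n_item, replace = TRUE),
         nrow = n_resp,
         dimnames = list(NULL, paste0("Item", seq_len(n_item))))
)

dim(resp)          # 6 rows, 30 columns
dput(resp)         # text that can be pasted into a plain-text e-mail
```

The signature quoted in the question is simply the help-page usage of data.frame(); when the data are read from a file with read.csv() or similar, those arguments normally do not need to be filled in by hand.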
Re: [R] Questions about R
On 17/08/2023 at 12:10, Shaun Parr wrote:

Hi there,

My name is Shaun and I work in an organisation where one of our users wishes to install the R software, and our process is to assess the safety of any software prior to authorisation. I can't seem to locate all the information that we require on the webpage, so could someone kindly advise me of the following information please?

1. Please can you confirm what user information the software collects (e.g. name, password, e-mail address, any Personally Identifiable Information etc.)?
2. If any is collected, please can you confirm if the information collected by the software stays locally on the device or if it is transferred anywhere. If it is transferred, could you please advise where it is transferred to (e.g. your own servers, or a third-party data centre such as Amazon Web Services or Azure)?
3. Are there any third-party components installed within the software and, if so, are these also kept up-to-date?

If you could kindly advise this information, it would be really appreciated, thank you

Shaun

Hello,

1. R itself? None. Download from CRAN and install. There are OS-related installation issues, namely authorization, but that information is neither asked for nor recorded by R.
2. The answer to "If any is collected" is already given above.
3. I am not sure I understand this point. R comes with third-party components and their developers try to keep them up-to-date. This has nothing to do with PII.

CRAN is the main official repository for contributed packages. From [1]:

Available Packages
Currently, the CRAN package repository features 19955 available packages.
and the R instruction

available.packages() |> nrow()
# [1] 19931

gives a number close to that one. Those packages are developed and contributed by volunteers, and it is impossible for the CRAN maintainers to check exactly what those packages do, but the packages' source code must be submitted and anyone willing to check it can.

[1] https://cran.r-project.org/web/packages/

Hope this helps,

Rui Barradas
Re: [R] geom_smooth
On 12/08/2023 at 05:17, Thomas Subia via R-help wrote:

Colleagues,

Here is my reproducible code for a graph using geom_smooth

set.seed(55)
scatter_data <- tibble(x_var = runif(100, min = 0, max = 25)
                       ,y_var = log2(x_var) + rnorm(100))

library(ggplot2)
library(cowplot)

ggplot(scatter_data,aes(x=x_var,y=y_var))+
  geom_point()+
  geom_smooth(se=TRUE,fill="blue",color="black",linetype="dashed")+
  theme_cowplot()

I'd like to add a black boundary around the shaded area. I suspect this can be done with geom_ribbon but I cannot figure this out. Some advice would be welcome.

Thanks!

Thomas Subia

Hello,

Here is a solution. You must access the computed variables, which you can do with ?ggplot_build. Then pass them in the data argument.

p <- ggplot(scatter_data,aes(x=x_var,y=y_var)) +
  geom_point()+
  geom_smooth(se=TRUE,fill="blue",color="black",linetype="dashed")+
  theme_cowplot()

# this is a data.frame, relevant columns are x, ymin and ymax
fit <- ggplot_build(p)$data[[2]]

p +
  geom_line(data = fit, aes(x, ymin), linetype = "dashed", linewidth = 1) +
  geom_line(data = fit, aes(x, ymax), linetype = "dashed", linewidth = 1)

Hope this helps,

Rui Barradas
Re: [R] Use generic functions, e.g. print, without UseMethod?
On 11/08/2023 at 08:20, Sigbert Klinke wrote:

Hello,

I have defined a function 'equations(...)' which returns an object with class 'equations'. I also defined a function 'print.equations' which prints the object. But I did not use 'equations <- function(x, ...) UseMethod("equations")'.

Two questions:
1.) Is this a sensible approach?
2.) If yes, are there any pitfalls I could run into later?

Thanks
Sigbert

Hello,

You have to ask yourself what kind of objects you are passing to 'equations(...)'. Do you need to have

'equations.double(...)'
'equations.character(...)'
'equations.formula(...)'
'equations.matrix(...)'
[...]

specifically written for objects of class

numeric
character
formula
matrix
[...]

respectively? These methods would act on the respective class, process those objects somewhat differently because they are of different classes, and output an object of class "equations". (If so, it is recommended to write an 'equations.default(...)' too.)

Methods such as print.equations or summary.equations are written when you want your new class to have functionality that its users are already familiar with. If, for instance, autoprint is on, as it frequently is, users can see their "equations" object by typing its name at a prompt. print.equations would display the object in a way relevant to that new class.

But this does not mean that the function that *creates* the object needs to be generic. You only need a new generic when you want methods that process inputs of different classes in ways specific to those classes.

Hope this helps,

Rui Barradas
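
A minimal sketch of the setup described in the question: an ordinary (non-generic) constructor plus a print method for the class it returns. The internals are invented for illustration, since the post does not show what an 'equations' object contains; here it is assumed to hold a character vector.

```r
# Plain constructor: no UseMethod() needed, it just builds the object
equations <- function(...) {
  eqs <- c(...)
  stopifnot(is.character(eqs))
  structure(list(eqs = eqs), class = "equations")
}

# Method for the existing generic print(); dispatch happens on class(x)
print.equations <- function(x, ...) {
  cat("<equations>", length(x$eqs), "equation(s)\n")
  cat(paste0("  ", seq_along(x$eqs), ": ", x$eqs), sep = "\n")
  cat("\n")
  invisible(x)
}

e <- equations("y = a + b*x", "z = c*y")
e          # autoprint calls print.equations()
```

One thing to keep in mind when this moves into a package: the method has to be registered, for example with an S3method(print, equations) directive in NAMESPACE, or dispatch will not find it from outside the package.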
Re: [R] unused argument(s) (Header = 1) help!
On 09/08/2023 at 05:30, Andreas Noviyanto wrote:

Dear Daniel,

I used this script to run replicateBE in R and it worked. When I use the same script with similar data (xlsx), I get the error messages shown below. Do you have any suggestions? Thanks anyway.

My script:

library(replicateBE)
path.in <- "Z:/Personil Omega"
path.out <- path.in
method.A(path.in=path.in, path.out=path.out, file="lans", set="01", ext="xlsx", header=1, ola=TRUE)
method.A(path.in=path.in, path.out=path.out, file="lans", set="02", ext="xlsx", header=1)
ABE(path.in=path.in, path.out=path.out, file="lans", set="01", ext="xlsx", header=1)
ABE(path.in=path.in, path.out=path.out, file="lans", set="02", ext="xlsx", header=1)

Result:

> library(replicateBE)
> path.in <- "Z:/Personil Omega"
> path.out <- path.in
> method.A(path.in=path.in, path.out=path.out, file="lans",
+ set="01", ext="xlsx", header=1, ola=TRUE)
Error in method.A(path.in = path.in, path.out = path.out, file = "lans", : unused argument (header = 1)
> method.A(path.in=path.in, path.out=path.out, file="lans",
+ set="02", ext="xlsx", header=1)
Error in method.A(path.in = path.in, path.out = path.out, file = "lans", : unused argument (header = 1)
> ABE(path.in=path.in, path.out=path.out, file="lans",
+ set="01", ext="xlsx", header=1)
Error in ABE(path.in = path.in, path.out = path.out, file = "lans", : unused argument (header = 1)
> ABE(path.in=path.in, path.out=path.out, file="lans",
+ set="02", ext="xlsx", header=1)
Error in ABE(path.in = path.in, path.out = path.out, file = "lans", : unused argument (header = 1)

Warm Regards,
Andreas

Hello,

That error message means that function method.A has no argument named 'header'.
Simply remove it and you should be fine.

Hope this helps,

Rui Barradas
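
Whether a function accepts a given argument can be checked before calling it. The sketch below uses a made-up function f whose signature is purely hypothetical; it is not the signature of method.A or ABE. With the package loaded, the same check would be names(formals(method.A)) or args(method.A).

```r
# Hypothetical function with no 'header' argument and no '...'
f <- function(path.in, path.out, file, set = "", ext = "csv") NULL

# List the accepted argument names and test for one of them
names(formals(f))
"header" %in% names(formals(f))

# Passing an unknown argument reproduces the error from the post
res <- tryCatch(f(path.in = ".", header = 1),
                error = function(e) conditionMessage(e))
res
```

A script that worked before and now fails this way often means that the installed package version differs from the one the script was written for, so comparing the formals against the script is a quick first check.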
Re: [R] Stacking matrix columns
On 06/08/2023 at 01:15, Iris Simmons wrote:

You could also do

dim(x) <- c(length(x), 1)

On Sat, Aug 5, 2023, 20:12 Steven Yen wrote:

I wish to stack columns of a matrix into one column. The following matrix command does it. Any other ways? Thanks.

> x<-matrix(1:20,5,4)
> x
     [,1] [,2] [,3] [,4]
[1,]    1    6   11   16
[2,]    2    7   12   17
[3,]    3    8   13   18
[4,]    4    9   14   19
[5,]    5   10   15   20
> matrix(x,ncol=1)
      [,1]
 [1,]    1
 [2,]    2
 [3,]    3
 [4,]    4
 [5,]    5
 [6,]    6
 [7,]    7
 [8,]    8
 [9,]    9
[10,]   10
[11,]   11
[12,]   12
[13,]   13
[14,]   14
[15,]   15
[16,]   16
[17,]   17
[18,]   18
[19,]   19
[20,]   20
>

Hello,

Yet another solution.

t(t(c(x)))

or

x |> c() |> t() |> t()

At first I liked it, but it is the slowest of the three: the OP's, Iris' (the fastest) and this one.

Hope this helps,

Rui Barradas
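
The three approaches from this thread can be put side by side to confirm that they produce the same 20 x 1 column. This is only a consistency check with made-up object names; it does not reproduce the timing comparison mentioned in the reply.

```r
x <- matrix(1:20, 5, 4)

a <- matrix(x, ncol = 1)            # original post
b <- x
dim(b) <- c(length(b), 1)           # Iris Simmons: reset the dim attribute
d <- t(t(c(x)))                     # Rui Barradas: vector -> row -> column

# Same shape and same values, in column-major order
dim(a); dim(b); dim(d)
all(a == b) && all(a == d)
```

The dim<- version works on a copy here so that x stays intact; applied to x directly it changes the matrix in place.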
Re: [R] Multiply
Às 19:03 de 04/08/2023, Val escreveu: Thank you, Avi and Ivan. Worked for this particular Example. Yes, I am looking for something with a more general purpose. I think Ivan's suggestion works for this. multiplication=as.matrix(dat1[,-1]) %*% as.matrix(dat2[match(dat1[,1], dat2[,1]),-1]) Res=data.frame(ID = dat1[,1], Index = multiplication) On Fri, Aug 4, 2023 at 10:59 AM wrote: Val, A data.frame is not quite the same thing as a matrix. But as long as everything is numeric, you can convert both data.frames to matrices, perform the computations needed and, if you want, convert it back into a data.frame. BUT it must be all numeric and you violate that requirement by having a character column for ID. You need to eliminate that temporarily: dat1 <- read.table(text="ID, x, y, z A, 10, 34, 12 B, 25, 42, 18 C, 14, 20, 8 ",sep=",",header=TRUE,stringsAsFactors=F) mat1 <- as.matrix(dat1[,2:4]) The result is: mat1 x y z [1,] 10 34 12 [2,] 25 42 18 [3,] 14 20 8 Now do the second matrix, perhaps in one step: mat2 <- as.matrix(read.table(text="ID, weight, weiht2 A, 0.25, 0.35 B, 0.42, 0.52 C, 0.65, 0.75",sep=",",header=TRUE,stringsAsFactors=F)[,2:3]) Do note some people use read.csv() instead of read.table, albeit it simply calls read.table after setting some parameters like the comma. The result is what you asked for, including spelling weight wrong once.: mat2 weight weiht2 [1,] 0.25 0.35 [2,] 0.42 0.52 [3,] 0.65 0.75 Now you wanted to multiply as in matrix multiplication. mat1 %*% mat2 weight weiht2 [1,] 24.58 30.18 [2,] 35.59 44.09 [3,] 17.10 21.30 Of course, you wanted different names for the columns and you can do that easily enough: result <- mat1 %*% mat2 colnames(result) <- c("index1", "index2") But this is missing something: result index1 index2 [1,] 24.58 30.18 [2,] 35.59 44.09 [3,] 17.10 21.30 Do you want a column of ID numbers on the left? 
If numeric, you can keep it in a matrix in one of many ways but if you want to go back to the data.frame format and re-use the ID numbers, there are again MANY ways. But note mixing characters and numbers can inadvertently convert everything to characters. Here is one solution. Not the only one nor the best one but reasonable: recombined <- data.frame(index=dat1$ID, index1=result[,1], index2=result[,2]) recombined index index1 index2 1 A 24.58 30.18 2 B 35.59 44.09 3 C 17.10 21.30 If for some reason you need a more general purpose way to do this for arbitrary conformant matrices, you can write a function that does this in a more general way but perhaps a better idea might be a way to store your matrices in files in a way that can be read back in directly or to not include indices as character columns but as row names. -Original Message- From: R-help On Behalf Of Val Sent: Friday, August 4, 2023 10:54 AM To: r-help@R-project.org (r-help@r-project.org) Subject: [R] Multiply Hi all, I want to multiply two data frames as shown below, dat1 <-read.table(text="ID, x, y, z A, 10, 34, 12 B, 25, 42, 18 C, 14, 20, 8 ",sep=",",header=TRUE,stringsAsFactors=F) dat2 <-read.table(text="ID, weight, weiht2 A, 0.25, 0.35 B, 0.42, 0.52 C, 0.65, 0.75",sep=",",header=TRUE,stringsAsFactors=F) Desired result ID Index1 Index2 1 A 24.58 30.18 2 B 35.59 44.09 3 C 17.10 21.30 Here is my attempt, but did not work dat3 <- data.frame(ID = dat1[,1], Index = apply(dat1[,-1], 1, FUN= function(x) {sum(x*dat2[,2:ncol(dat2)])} ), stringsAsFactors=F) Any help? Thank you, __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
Hello,

Slightly simpler:

    multiplication <- as.matrix(dat1[,-1]) %*% as.matrix(dat2[match(dat1[,1], dat2[,1]),-1])
    Res <- data.frame(ID = dat1[,1], Index = multiplication)

    # this is what I find simpler
    # the method being called is cbind.data.frame
    Res2 <- cbind(dat1[1], Index = multiplication)
    identical(Res, Res2)
    #> [1] TRUE

Hope this helps,
Rui Barradas
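Avi's suggestion of keeping the IDs as row names rather than as a character column can be sketched as follows. The helper to_matrix is hypothetical (it is not from the thread), and the data frames are rebuilt with data.frame() so the example is self-contained.

```r
dat1 <- data.frame(ID = c("A", "B", "C"),
                   x = c(10, 25, 14), y = c(34, 42, 20), z = c(12, 18, 8))
dat2 <- data.frame(ID = c("A", "B", "C"),
                   weight = c(0.25, 0.42, 0.65), weiht2 = c(0.35, 0.52, 0.75))

# Hypothetical helper: numeric columns become the matrix,
# the first column (the IDs) becomes the row names
to_matrix <- function(df) {
  m <- as.matrix(df[-1])
  rownames(m) <- df[[1]]
  m
}

m1 <- to_matrix(dat1)
m2 <- to_matrix(dat2)

# order the rows of m2 by the IDs of m1, then multiply;
# the IDs travel along as row names of the result
res <- m1 %*% m2[rownames(m1), , drop = FALSE]
colnames(res) <- c("Index1", "Index2")
res
```

If a data frame is needed at the end, data.frame(ID = rownames(res), res) brings the IDs back as a column.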
Re: [R] Problems with facets in ggplot2
Às 11:08 de 04/08/2023, Nick Wray escreveu: Hello I am wrestling with ggplot – I have produced a facetted plot of flows under various metrics but I can’t find info on the net which tells me how to do three things I have created some simplified mock data to illustrate (and using a colour-blind palette): library(ggplot2) library(forcats) cb8<- c("#00", "#E69F00", "#56B4E9", "#009E73","#F0E442", "#0072B2", "#D55E00", "#CC79A7") set.seed<-(040823) mock<- set.seed<-(040823) mock<-as.data.frame(cbind(rep((1990:1995),8),round(rnorm(48,50,10),3),rep(c(rep("Tweed",6),rep("Tay",6)),4),rep(c("AMAX","Mean","AMIN","Median"),each=12))) colnames(mock)<-c("Year","Flow","Stat","Metric") mock ggplot(mock, aes(Year,Flow, group = factor(Stat), colour = factor(Stat)))+ coord_cartesian(ylim = c(0, 100)) + geom_line(size=1)+ scale_color_manual(name = "Stat", values = cb8[4:7])+ scale_y_discrete(breaks=c(0,25,50,75,100),labels=c(0,25,50,75,100))+ facet_wrap(vars(Metric),nrow=2,ncol=2)+ ylab("Flow") 1)This gives me a facetted plot but I can’t work out why I’m not getting a labelled y scale 2)Why are plots down at the bottom of the facets rather than in the middle? 3)And also I’d like the plots to be in the order (top left to bottom right) of AMAX MEAN AMIN MEDIAN but if I add in the line facet_grid(~fct_relevel(Metric,"AMAX","Mean","AMIN","Median")) before the line ylab it disrupts the 2x2 layout Can anyone tell me how to resolve these problems? Thanks Nick Wray [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hello, The main problem is the way ou create the data set. cbind defaults to creating a matrix and since some of the vectors are of class "character" all others will be coerced to character too. And Year and Flow will no longer be numeric. 
You can coerce those two columns to numeric manually or you can build the data set with data.frame() directly, instead of as.data.frame(cbind(...)). Because Flow was character, you were using scale_y_discrete when it should be scale_y_continuous. Corrected below. As for the relevel, I'm not getting any errors.

    library(ggplot2)
    library(forcats)

    cb8 <- c("#000000", "#E69F00", "#56B4E9", "#009E73",
             "#F0E442", "#0072B2", "#D55E00", "#CC79A7")

    set.seed(040823)

    # the right way of creating the data set
    mock <- data.frame(
      Year = rep((1990:1995),8),
      Flow = round(rnorm(48,50,10),3),
      Stat = rep(c(rep("Tweed",6), rep("Tay",6)),4),
      Metric = rep(c("AMAX","Mean","AMIN","Median"),each=12)
    )

    ggplot(mock, aes(Year,Flow, group = factor(Stat), colour = factor(Stat))) +
      coord_cartesian(ylim = c(0, 100)) +
      geom_line(size=1) +
      scale_color_manual(name = "Stat", values = cb8[4:5]) +
      scale_y_continuous(breaks=c(0, 25, 50, 75, 100),
                         labels=c(0, 25, 50, 75, 100)) +
      facet_wrap(~ fct_relevel(Metric,"AMAX","Mean","AMIN","Median"),
                 nrow = 2, ncol = 2) +
      ylab("Flow")

Hope this helps,
Rui Barradas
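The coercion described in this thread can be seen in base R alone, without plotting anything. The following is a minimal sketch with a three-row toy data set of my own; the object names m, bad, good and metric are illustrative only.

```r
# cbind() on plain vectors builds a matrix, and a matrix has a single type,
# so the years are coerced to character
m <- cbind(Year = 1990:1992, Stat = c("Tweed", "Tay", "Tweed"))
is.character(m)

# as.data.frame() cannot recover the lost type
bad <- as.data.frame(m)
sapply(bad, class)

# data.frame() keeps each column's own type
good <- data.frame(Year = 1990:1992, Stat = c("Tweed", "Tay", "Tweed"))
sapply(good, class)

# repairing an already coerced column; as.character() first also covers
# old R versions where the column would be a factor
bad$Year <- as.numeric(as.character(bad$Year))

# base R alternative to fct_relevel(): set the facet order via factor levels
metric <- factor(c("Mean", "AMAX", "Median", "AMIN"),
                 levels = c("AMAX", "Mean", "AMIN", "Median"))
levels(metric)
```

A factor built this way can be used directly in facet_wrap(), which lays the panels out in level order.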
Re: [R] Choosing colours for lines in ggplot2
Às 18:10 de 02/08/2023, Nick Wray escreveu: Hello - I am trying to plot flows in a number of rivers within the same plot, and need to colour the lines differently, using a colour-blind palette. Code beneath works but has colours assigned by the program I have made some simple dummy data: ## code 1 cb8<- c("#00", "#E69F00", "#56B4E9", "#009E73","#F0E442", "#0072B2", "#D55E00", "#CC79A7") ## this is the colour-blind palette set.seed(020823) df<-as.data.frame(cbind(rep(1980:1991,2),c(10*runif(12),10*runif(12)),c(rep(1,12),rep(2,12 colnames(df)<-c("Year","Flow","Stat") df ggplot(df,aes(Year,Flow,group=Stat,colour=Stat))+ coord_cartesian(ylim = c(0, 10)) + geom_line()+ geom_point() ## this works ## BUT: ## code 2 col.2<-cb8[4:5] ggplot(df,aes(Year,Flow,group=Stat,colour=Stat))+ coord_cartesian(ylim = c(0, 10)) + geom_line()+ geom_point()+ scale_color_manual(values =cb8[4:5])+ theme_bw() ## this gives error message Error: Continuous value supplied to discrete scale ## However this example using code from the net does work so I don't understand why my ## second code doesn't work. ## code 3 df.1 <- data.frame(store=c('A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'), week=c(1, 2, 3, 1, 2, 3, 1, 2, 3), sales=c(9, 12, 15, 7, 9, 14, 10, 16, 19)) ggplot(df.1, aes(x=week, y=sales, group=store, color=store)) + geom_line(size=2) + #scale_color_manual(values=c('orange', 'pink', 'red')) scale_color_manual(values=cb8[4:6]) Can anyone help? Thanks Nick Wray [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hello, Your Stat column is numeric, therefore, ggplot sees it as continuous. To make it work, coerce to factor. Here are two ways. 
    ## 1st way: coerce Stat to factor in the ggplot code.
    ## This means you will have to set the legend name
    ## manually in scale_color_manual.
    ## code 2, now it works
    col.2 <- cb8[4:5]
    ggplot(df, aes(Year, Flow, group = factor(Stat), colour = factor(Stat))) +
      coord_cartesian(ylim = c(0, 10)) +
      geom_line() +
      geom_point() +
      scale_color_manual(name = "Stat", values = cb8[4:5]) +
      theme_bw()

    ## 2nd way: since you are using ggplot2, a tidyverse package,
    ## coerce to factor in a pipe before the ggplot call.
    ## This is done with dplyr::mutate and R's native pipe operator
    ## (could also be magrittr's pipe).
    ## I have left name = "Stat" like above though it's no
    ## longer needed.
    df |>
      dplyr::mutate(Stat = factor(Stat)) |>
      ggplot(aes(Year, Flow, group = Stat, colour = Stat)) +
      coord_cartesian(ylim = c(0, 10)) +
      geom_line() +
      geom_point() +
      scale_color_manual(name = "Stat", values = cb8[4:5]) +
      theme_bw()

Hope this helps,
Rui Barradas
Re: [R] Plotting Fitted vs Observed Values in Logistic Regression Model
Às 14:57 de 01/08/2023, Paul Bernal escreveu: Dear friends, I hope this email finds you all well. This is the dataset I am working with: dput(random_mod12_data2) structure(list(Index = c(1L, 5L, 11L, 3L, 2L, 8L, 9L, 4L), x = c(5, 13, 25, 9, 7, 19, 21, 11), n = c(500, 500, 500, 500, 500, 500, 500, 500), r = c(100, 211, 391, 147, 122, 310, 343, 176), ratio = c(0.2, 0.422, 0.782, 0.294, 0.244, 0.62, 0.686, 0.352)), row.names = c(NA, -8L), class = "data.frame") A brief description of the dataset: Index: is just a column that shows the ID of each observation (row) x: is a column which gives information on the discount rate of the coupon n: is the sample or number of observations r: is the count of redeemed coupons ratio: is just the ratio of redeemed coupons to n (total number of observations) #Fitting a logistic regression model to response variable y for problem 13.4 logistic_regmod2 <- glm(formula = ratio~x, family = binomial(logit), data = random_mod12_data2) I would like to plot the value of r (in the y-axis) vs x (the different discount rates) and then superimpose the logistic regression fitted values all in the same plot. How could I accomplish this? Any help and/or guidance will be greatly appreciated. Kind regards, Paul [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hello, Here is another way with ggplot2. It doesn't give you the fitted values but it plots the fitted line. 
    library(ggplot2)

    ggplot(random_mod12_data2, aes(x, ratio)) +
      geom_point() +
      stat_smooth(
        formula = y ~ x,
        method = glm,
        method.args = list(family = binomial),
        se = FALSE
      )

Hope this helps,
Rui Barradas
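If the fitted values themselves are wanted, on the scale of the counts r as in the original question, one possible base R sketch is below. It is my assumption, not something stated in the thread, that the model is better specified with the two-column response cbind(r, n - r), which lets glm() know the number of trials behind each ratio; the data frame d copies the posted columns x, n and r.

```r
d <- data.frame(
  x = c(5, 13, 25, 9, 7, 19, 21, 11),
  n = rep(500, 8),
  r = c(100, 211, 391, 147, 122, 310, 343, 176)
)

# binomial GLM on successes and failures instead of the bare ratio
fit <- glm(cbind(r, n - r) ~ x, family = binomial, data = d)

# fitted probabilities and the corresponding fitted counts
d$fitted_ratio <- fitted(fit)
d$fitted_r <- d$n * d$fitted_ratio
d[order(d$x), c("x", "r", "fitted_r")]

# observed counts as points, fitted counts as a line (interactive use only)
if (interactive()) {
  ord <- order(d$x)
  plot(d$x, d$r, xlab = "Discount rate (x)", ylab = "Redeemed coupons (r)")
  lines(d$x[ord], d$fitted_r[ord])
}
```

The rows are sorted by x before drawing the line because the data are not in increasing order of x.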
Re: [R] Downloading a directory of text files into R
Às 23:06 de 25/07/2023, Bob Green escreveu: Hello, I am seeking advice as to how I can download the 833 files from this site:"http://home.brisnet.org.au/~bgreen/Data/; I want to be able to download them to perform a textual analysis. If the 833 files, which are in a Directory with two subfolders were on my computer I could read them through readtext. Using readtext I get the error: > x = readtext("http://home.brisnet.org.au/~bgreen/Data/*;) Error in download_remote(file, ignore_missing, cache, verbosity) : Remote URL does not end in known extension. Please download the file manually. > x = readtext("http://home.brisnet.org.au/~bgreen/Data/Dir/()") Error in download_remote(file, ignore_missing, cache, verbosity) : Remote URL does not end in known extension. Please download the file manually. Any suggestions are appreciated. Bob __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hello, The following code downloads all files in the posted link. 
    suppressPackageStartupMessages({
      library(rvest)
    })

    # destination directory, change this at will
    dest_dir <- "~/Temp"

    # first get the two subfolders from the Data webpage
    link <- "http://home.brisnet.org.au/~bgreen/Data/"
    page <- read_html(link)
    page %>%
      html_elements("a") %>%
      html_text() %>%
      grep("/$", ., value = TRUE) -> sub_folder

    # create relevant disk sub-directories, if
    # they do not exist yet
    for(subf in sub_folder) {
      d <- file.path(dest_dir, subf)
      if(!dir.exists(d)) {
        success <- dir.create(d)
        msg <- paste("created directory", d, "-", success)
        message(msg)
      }
    }

    # prepare to download the files
    dest_dir <- file.path(dest_dir, sub_folder)
    source_url <- paste0(link, sub_folder)

    success <- mapply(\(src, dest) {
      # read each Data subfolder
      # and get the file names therein
      # then lapply 'download.file' to each filename
      pg <- read_html(src)
      pg %>%
        html_elements("a") %>%
        html_text() %>%
        grep("\\.txt$", ., value = TRUE) %>%
        lapply(\(x) {
          s <- paste0(src, x)
          d <- file.path(dest, x)
          tryCatch(
            download.file(url = s, destfile = d),
            warning = function(w) w,
            error = function(e) e
          )
        })
    }, source_url, dest_dir)

    lengths(success)
    # http://home.brisnet.org.au/~bgreen/Data/Hanson1/
    #                                               84
    # http://home.brisnet.org.au/~bgreen/Data/Hanson2/
    #                                              749

    # matches the question's number
    sum(lengths(success))
    # [1] 833

Hope this helps,
Rui Barradas
Re: [R] Off-topic: ChatGPT Code Interpreter
Às 00:23 de 18/07/2023, Jim Lemon escreveu: I haven't really focused on the statistical capabilities of AI, that marriage of massive memory and associative learning. I am impressed by its ability to perform text-to-image conversion, something I have recently needed. My artistic ability is that of the average three year old, yet I can employ AI to translate my mental images into realistic pictures. Perhaps we really are learning about how we think. As far as I am aware, it just does what we tell it to do. Like other tools, it is as good or bad as the user. Jim __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hello, Also off-topic but the date is fun: A system is not a head. Furniture is not people. All processes and all devices, will be useless for organizations, if the heads of the individuals who employ them, are not properly organized. And these heads will be organized, if the same part of the boss's body that directs them is properly organized. Just like you can write nonsense with a latest model typewriter, nonsense can also be done with the most perfect systems and devices meant to help you not to. Systems, processes, furniture, machines, are purely auxiliary elements. The real process is to think. The fundamental machine is intelligence. Fernando Pessoa, 1926 Revista de Comércio e Contabilidade, nº 4. Lisboa, 25-4-1926. (Magazine of Commerce and Accounting, nº 4. Lisbon, 25-4-1926) Rui Barradas __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] nlmixr2 installation problems
Às 12:55 de 16/07/2023, Troels Ring escreveu: Hi friends - Trying to install nlmixr2 caused problems. I'm on windows with R4.3.1 so made sure to have rtools 4.3 and also reinstalled R and then ran install.packages("nlmixr2",dependencies = TRUE)and got the responseInstalling package into ‘C:/Users/Admin/AppData/Local/R/win-library/4.3’ (as ‘lib’ is unspecified) also installing the dependencies ‘fs’, ‘rappdirs’, ‘bit’, ‘prettyunits’, ‘rematch’, ‘askpass’, ‘sass’, ‘commonmark’, ‘proxy’, ‘bit64’, ‘progress’, ‘rootSolve’, ‘lmom’, ‘cellranger’, ‘jsonlite’, ‘mime’, ‘openssl’, ‘htmlwidgets’, ‘ellipsis’, ‘bslib’, ‘fontawesome’, ‘jquerylib’, ‘tinytex’, ‘curl’, ‘markdown’, ‘jpeg’, ‘xml2’, ‘fastmap’, ‘e1071’, ‘generics’, ‘tidyselect’, ‘clipr’, ‘hms’, ‘vroom’, ‘cpp11’, ‘tzdb’, ‘stringi’, ‘purrr’, ‘mvtnorm’, ‘expm’, ‘rstudioapi’, ‘Exact’, ‘gld’, ‘readxl’, ‘httr’, ‘gridExtra’, ‘htmlTable’, ‘viridis’, ‘htmltools’, ‘base64enc’, ‘rmarkdown’, ‘Formula’, ‘bitops’, ‘evaluate’, ‘highr’, ‘xfun’, ‘yaml’, ‘numDeriv’, ‘lazyeval’, ‘optextras’, ‘dparser’, ‘RcppEigen’, ‘StanHeaders’, ‘sitmo’, ‘gridtext’, ‘cachem’, ‘RcppParallel’, ‘RApiSerialize’, ‘stringfish’, ‘classInt’, ‘dplyr’, ‘readr’, ‘stringr’, ‘tidyr’, ‘assertthat’, ‘binom’, ‘Deriv’, ‘DescTools’, ‘Hmisc’, ‘minpack.lm’, ‘pander’, ‘png’, ‘RCurl’, ‘backports’, ‘checkmate’, ‘knitr’, ‘lbfgsb3c’, ‘minqa’, ‘n1qn1’, ‘Rcpp’, ‘rex’, ‘Rvmmin’, ‘symengine’, ‘BH’, ‘RcppArmadillo’, ‘rxode2parse’, ‘rxode2random’, ‘data.table’, ‘digest’, ‘ggtext’, ‘PreciseSums’, ‘inline’, ‘memoise’, ‘sys’, ‘rxode2ll’, ‘rxode2et’, ‘qs’, ‘vpc’, ‘xgxr’, ‘nlmixr2data’, ‘nlmixr2est’, ‘nlmixr2extra’, ‘rxode2’, ‘lotri’, ‘nlmixr2plot’, ‘crayon’ There are binary versions available but the source versions are later: binary source needs_compilation sass 0.4.6 0.4.7 TRUE openssl 2.0.6 2.1.0 TRUE - and nothing more happened But when trying to install sass and openssl individually the same announcement of later source versions appeared without making it possible to ask for 
recompilation. All best wishes Troels

Hello,

Maybe this [1] is relevant.

[1] https://community.rstudio.com/t/meaning-of-common-message-when-install-a-package-there-are-binary-versions-available-but-the-source-versions-are-later/2431

Hope this helps,
Rui Barradas
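One thing that may be worth trying, as an assumption on my part and not something taken from the linked page, is to tell R explicitly to stay with the available binaries. The option install.packages.compile.from.source and the type = "binary" argument are both documented in ?install.packages; the wrapper install_binary is a hypothetical helper that is defined here but not called.

```r
# never compile a newer source version when a binary is available
old <- options(install.packages.compile.from.source = "never")
current <- getOption("install.packages.compile.from.source")
current

# hypothetical helper that asks for binaries explicitly (Windows and macOS)
install_binary <- function(pkgs, ...) {
  install.packages(pkgs, type = "binary", dependencies = TRUE, ...)
}
# install_binary("nlmixr2")   # uncomment to run the installation

# restore the previous setting
options(old)
```

The slightly older binary versions of sass and openssl are normally fine as dependencies, so skipping the compilation should not be a problem in itself.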
Re: [R] How to add error bars to a line plot with ggplot2?
Às 17:33 de 14/07/2023, Luigi Marongiu escreveu: Hello, I am measuring a certain variable at given time intervals and different concentrations of a reagent. I would like to make a scatter plot of the values, joined by a line to highlight the temporal measure. I can plot this all right. Now, since I have more than one replicate, I would like to add he error bars. I prepared a dataframe with the mean measures and a column with the standard deviations, but when I run the code, I get the error: ``` Error in `check_aesthetics()`: ! Aesthetics must be either length 1 or the same as the data (20): colour Run `rlang::last_trace()` to see where the error occurred. ``` I am missing something, but what? Thank you WORKING EXAMPLE ``` measTime= c(1,2,4,24,48,1,2,4,24 ,48,1,2,4,24,48,1,2,4,24,48) conc= c(0.25,0.25,0.25,0.25,0.25,1.12,1.12 ,1.12,1.12,1.12,2.5,2.5,2.5,2.5,2.5 ,25,25,25,25,25) varbl= c(0.0329,0.27,0.0785,0.1015 ,-0.193,0.048,0.113,0.1695,-0.775,0.464,-0.257 ,-0.154,-0.3835,-1.23,-0.513,1.3465,1.276 ,1.128,-2.56,-1.813) stdDev=c(0.646632301492381,0,1.77997087991162 ,0.247683265482349,0,0.282901631902917,0 ,0.273086677326693,1.03807578400295,0,0.912213425319609 ,0,1.64371621638287,2.23203614068709,0,0.2615396719429 ,0,0.54039985196149,2.15236180353893,0) df = data.frame(Time=measTime, mM=conc, ddC=varbl, SD=stdDev) library(ggplot2) COLS = c("green", "red", "blue", "yellow") ggplot(df, aes(x=Time, y=ddC, colour=mM, group=mM)) + geom_line(aes(x=Time, y=ddC, colour=mM, group=mM)) + geom_errorbar(aes(x=Time, ymin=ddC-SD, ymax=ddC+SD, colour=mM, group=mM), width=.1, colour=COLS) + geom_point(size=6) + scale_colour_manual(values = COLS) + ggtitle("Exposure") + xlab(expression(bold("Time (h)"))) + ylab(expression(bold("Value"))) + geom_hline(aes(yintercept=0)) + theme_classic() ``` __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and 
provide commented, minimal, self-contained, reproducible code.

Hello,

Two notes:

1. If you want to use a discrete colours vector, your 'colour' aesthetic must be mapped to a discrete variable. The most frequent cases are character or factor columns.
2. If you start the plot with certain aesthetics set, you don't have to repeat them in subsequent layers: geom_line can be called with no aes() and geom_errorbar doesn't need x = Time again.

As for the main error, the colours vector COLS should be removed from geom_errorbar.

    df <- data.frame(Time = measTime,
                     mM = factor(conc), # this must be a factor
                     ddC = varbl,
                     SD = stdDev)

    library(ggplot2)

    COLS = c("green", "red", "blue", "yellow")

    ggplot(df, aes(x = Time, y = ddC, colour = mM, group = mM)) +
      geom_line() +
      geom_errorbar(aes(ymin = ddC - SD, ymax = ddC + SD), width = 0.1) +
      geom_point(size = 6) +
      geom_hline(aes(yintercept = 0)) +
      scale_colour_manual(values = COLS) +
      ggtitle("Exposure") +
      xlab(expression(bold("Time (h)"))) +
      ylab(expression(bold("Value"))) +
      theme_classic()

Hope this helps,
Rui Barradas
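The thread starts from a data frame that already holds the means and standard deviations. For completeness, here is a minimal sketch of how such a summary could be computed from raw replicates in base R. The raw data frame and its numbers are made up for illustration; only the column names Time, mM, ddC and SD follow the thread.

```r
# Hypothetical raw replicate measurements (made-up numbers):
# 3 replicates at each of 3 time points and 2 concentrations
raw <- data.frame(
  Time  = rep(c(1, 2, 4), each = 3, times = 2),
  mM    = rep(c(0.25, 1.12), each = 9),
  value = c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.2, 0.2, 0.5,
            1.0, 1.2, 1.4, 0.9, 1.1, 1.3, 2.0, 2.5, 3.0)
)

# mean and standard deviation per Time and concentration
means <- aggregate(value ~ Time + mM, data = raw, FUN = mean)
sds   <- aggregate(value ~ Time + mM, data = raw, FUN = sd)
names(means)[3] <- "ddC"
names(sds)[3]   <- "SD"

# one row per Time/mM combination, ready for geom_errorbar
df <- merge(means, sds, by = c("Time", "mM"))
df$mM <- factor(df$mM)   # discrete, so a manual colour scale can be used
df
```

Note that sd() returns NA for a group with a single replicate, so such groups simply get no error bar drawn.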
Re: [R] textual analysis - transforming several pdf to txt - naming the files
On 05/07/2023 at 11:12, Cecília Carmo wrote:

    convertpdf2txt <- function(dirpath){
      files <- list.files(dirpath, pattern = "Consoli.*\\.pdf$", full.names = TRUE)
      files <- chartr("\\", "/", files)
      x <- lapply(files, function(x){
        pdftools::pdf_text(x) %>%
          paste0(collapse = " ") %>%
          stringr::str_squish()
      })
      new_names <- tools::file_path_sans_ext(files)
      new_names <- paste(new_names, "txt", sep = ".")
      setNames(x, new_names)
    }

    # apply function
    # note that my test files are in "~/Temp"
    txts <- convertpdf2txt(here::here("~", "Temp"))
    names(txts)

Thank you very much, but the following error appeared:

    Error: unexpected '}' in "}"

Cecília Carmo
Universidade de Aveiro

Hello,

I had tested the code with a couple of PDFs and it ran with no errors or warnings. That error means that a "}" is unbalanced, but in my code they are all balanced; RStudio checks this automatically. Can you check your copy of the code in an editor with syntax highlighting?

Hope this helps,
Rui Barradas
Re: [R] textual analysis - transforming several pdf to txt - naming the files
Às 10:14 de 05/07/2023, Cecília Carmo escreveu: I am taking my first steps in textual analysis with R. I have pdf files consisting of company reports for several years (1 file corresponds to 1 company and 1 year). My idea is to start by transforming all my pdf files into txt files for further treatment and analysis (this will allow me to group the files by company or by year, depending on the future analysis to be performed). I do not have in-depth knowledge of programming in R. I just adapt codes that I find, to my needs. Here goes the first doubt in a code I'm adapting: My pdf files are in one directory named "pdfs". The names of my files are, for example, SONAE2020FS.pdf, EDP2021GS.pdf I want to convert them to txt and give the same names as in the pdf files: SOANE2020FS.txt, EDP2021GS.txt I'm running the following scrip, but the names of txt files that I obtain are: pdftext1, pdftext2, pdftext3... What do I need to change? Thank you very much, Cec�lia Carmo Universidade de Aveiro - Portugal dirpath <- ("/Users/ceciliacarmo/documents/RTextualAnalysis/data/pdfs") library(pdftools) library(dplyr) convertpdf2txt <- function(dirpath){ files <- list.files(dirpath, full.names = T) x <- sapply(files, function(x){ x <- pdftools::pdf_text(x) %>% paste0(collapse = " ") %>% stringr::str_squish() return(x) }) } # apply function txts <- convertpdf2txt(here::here("data", "pdf/")) # add names to txt files names(txts) <- paste0(here::here("data","pdftext"), 1:length(txts), sep = "") [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hello, Try the following. The corrected function convertpdf2txt assigns names based on the files variable. 
It uses tools::file_path_sans_ext to keep the filename without extension and pastes the new extension to it. In the end there is no need to call here::here again, since the list is already a named list.

    # %>% needs dplyr (or magrittr) to be loaded
    convertpdf2txt <- function(dirpath){
      # the pattern matches my test files; change it
      # (e.g. to "\\.pdf$") to match yours
      files <- list.files(dirpath, pattern = "Consoli.*\\.pdf$", full.names = TRUE)
      files <- chartr("\\", "/", files)
      x <- lapply(files, function(x){
        pdftools::pdf_text(x) %>%
          paste0(collapse = " ") %>%
          stringr::str_squish()
      })
      new_names <- tools::file_path_sans_ext(files)
      new_names <- paste(new_names, "txt", sep = ".")
      setNames(x, new_names)
    }

    # apply function
    # note that my test files are in "~/Temp"
    txts <- convertpdf2txt(here::here("~", "Temp"))
    names(txts)

Hope this helps,
Rui Barradas
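The function above returns a named list but does not write anything to disk. A possible next step, sketched here under the assumption that the list names are the intended output paths, is to write each element to the file named by its list name. The two texts and the temporary directory are made up so the example runs without any PDF; the file names are the ones from the question.

```r
# Stand-in for the named list returned by convertpdf2txt();
# the texts are made up and the files go to a temporary directory
out_dir <- file.path(tempdir(), "txt_demo")
dir.create(out_dir, showWarnings = FALSE)

txts <- list("text of the first report", "text of the second report")
names(txts) <- file.path(out_dir, c("SONAE2020FS.txt", "EDP2021GS.txt"))

# write each text to the file given by its name
invisible(Map(writeLines, txts, names(txts)))

# check the result
file.exists(names(txts))
basename(names(txts))
first_back <- readLines(names(txts)[1])
first_back
```

Map() pairs each text with its name, so writeLines() receives the text as its first argument and the path as its second.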
Re: [R] Create matrix with column names wiht the same prefix xxxx and that end in 1, 2
Às 20:55 de 03/07/2023, Rui Barradas escreveu: Às 20:26 de 03/07/2023, Sorkin, John escreveu: Jeff, Again my thanks for your guidance. I replaced dimnames(myvalues)<-list(NULL,c(zzz)) with colnames(myvalues)<-zzz and get the same error, Error in dimnames(x) <- dn : length of 'dimnames' [2] not equal to array extent It appears that I am creating the string zzz in a manner that is not compatable with either dimnames(myvalues)<-list(NULL,c(zzz)) or colnames(myvalues)<-zzz I think I need to modify the way I create the string zzz. # create variable names xxx1 and xxx2. string="" for (j in 1:2){ name <- paste("xxx",j,sep="") string <- paste(string,name) print(string) } # Creation of xxx1 and xxx2 works string # Create matrix myvalues <- matrix(nrow=2,ncol=4) head(myvalues,1) # Add "j" and "k" to the string of column names zzz <- paste("j","k",string) zzz # assign column names, j, k, xxx1, xxx2 to the matrix # create column names, j, k, xxx1, xxx2. dimnames(myvalues)<-list(NULL,c(zzz)) colnames(myvalues)<-zzz From: Jeff Newmiller Sent: Monday, July 3, 2023 2:45 PM To: Sorkin, John Cc: r-help@r-project.org Subject: Re: [R] Create matrix with column names wiht the same prefix and that end in 1, 2 I really think you should read that help page. colnames() accesses the second element of dimnames() directly. On July 3, 2023 11:39:37 AM PDT, "Sorkin, John" wrote: Jeff, Thank you for your reply. I should have said with dim names not column names. I want the Mateix to have dim names, no row names, dim names j, k, xxx1, xxx2. John John David Sorkin M.D., Ph.D. 
Professor of Medicine Chief, Biostatistics and Informatics University of Maryland School of Medicine Division of Gerontology and Geriatric Medicine Baltimore VA Medical Center 10 North Greene Street GRECC (BT/18/GR) Baltimore, MD 21201-1524 (Phone) 410-605-7119 (Fax) 410-605-7913 (Please call phone number above prior to faxing) On Jul 3, 2023, at 2:11 PM, Jeff Newmiller wrote: ?colnames On July 3, 2023 11:00:32 AM PDT, "Sorkin, John" wrote: I am trying to create an array, myvalues, having 2 rows and 4 columns, where the column names are j,k,xxx1,xxx2. The code below fails, with the following error, "Error in dimnames(myvalues) <- list(NULL, zzz) : length of 'dimnames' [2] not equal to array extent" Please help me get the code to work. Thank you, John # create variable names xxx1 and xxx2. string="" for (j in 1:2){ name <- paste("xxx",j,sep="") string <- paste(string,name) print(string) } # Creation of xxx1 and xxx2 works string # Create matrix myvalues <- matrix(nrow=2,ncol=4) head(myvalues,1) # Add "j" and "k" to the string of column names zzz <- paste("j","k",string) zzz # assign column names, j, k, xxx1, xxx2 to the matrix # create column names, j, k, xxx1, xxx2. dimnames(myvalues)<-list(NULL,zzz) __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Sent from my phone. Please excuse my brevity. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.r-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Sent from my phone. Please excuse my brevity. 
Hello,

I should have pointed out in my answer that you are indeed creating the names vector in a (very) wrong way. When you paste string and name in the loop, you create one vector of length 1. When the loop ends, you have " xxx1 xxx2", not two names.

string=""
for (j in 1:2){
  name <- paste("xxx",j,sep="")
  string <- paste(string,name)
  print(string)
}
#> [1] " xxx1"
#> [1] " xxx1 xxx2"

# Creation of xxx1 and xxx2 works
string
#> [1] " xxx1 xxx2"

Quoting the comment above,

Creation of xxx1 and xxx2 works

No, it does not! And then you paste again, adding two extra letters to that one string

zzz <- paste("j","k",string)

This zzz is also of length 1, check it. With a loop the right way would be any of

# 1. concatenate the current
Re: [R] Create matrix with column names with the same prefix xxxx and that end in 1, 2
frequent
for (j in 1:2){
  name <- paste("xxx",j,sep="")
  string <- c(string, name)
  print(string)
}
#> [1] "xxx1"
#> [1] "xxx1" "xxx2"

# Now creation of xxx1 and xxx2 does work
string
#> [1] "xxx1" "xxx2"

# 2. create a vector of the appropriate length beforehand, my preferred
string <- character(2)
for (j in 1:2){
  string[j] <- paste0("xxx",j)
  print(string)
}
#> [1] "xxx1" ""
#> [1] "xxx1" "xxx2"

# Creation of xxx1 and xxx2 works
string
#> [1] "xxx1" "xxx2"

But the vectorized way is still the better one.

Hope this helps,

Rui Barradas
Re: [R] Create matrix with column names with the same prefix xxxx and that end in 1, 2
At 19:00 on 03/07/2023, Sorkin, John wrote:

I am trying to create an array, myvalues, having 2 rows and 4 columns, where the column names are j,k,xxx1,xxx2. The code below fails, with the following error,

"Error in dimnames(myvalues) <- list(NULL, zzz) : length of 'dimnames' [2] not equal to array extent"

Please help me get the code to work. Thank you, John

# create variable names xxx1 and xxx2.
string=""
for (j in 1:2){
  name <- paste("xxx",j,sep="")
  string <- paste(string,name)
  print(string)
}
# Creation of xxx1 and xxx2 works
string
# Create matrix
myvalues <- matrix(nrow=2,ncol=4)
head(myvalues,1)
# Add "j" and "k" to the string of column names
zzz <- paste("j","k",string)
zzz
# assign column names, j, k, xxx1, xxx2 to the matrix
# create column names, j, k, xxx1, xxx2.
dimnames(myvalues)<-list(NULL,zzz)

Hello,

You don't need so many calls to paste, one is enough. And you don't need the for loop at all, paste and paste0 are vectorized.

myvalues <- matrix(nrow=2,ncol=4)
cnames <- paste0("xxx", 1:2)
cnames
# [1] "xxx1" "xxx2"
colnames(myvalues) <- c("j", "k", cnames)

Hope this helps,

Rui Barradas
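To tie this back to the error message: dimnames() expects one name per column, so the names have to form a character vector of length 4 rather than a single pasted string. A minimal base R sketch, reusing the myvalues matrix from the question (the names one_string and four_names are made up for illustration):

```r
# One pasted string has length 1; c() keeps the names as separate elements
one_string <- paste("j", "k", "xxx1", "xxx2")
four_names <- c("j", "k", paste0("xxx", 1:2))

length(one_string)
length(four_names)

myvalues <- matrix(nrow = 2, ncol = 4)

# NULL leaves the row names unset; the second list element names the columns
dimnames(myvalues) <- list(NULL, four_names)
colnames(myvalues)
```

Checking length() of the names vector against ncol() of the matrix before assigning is a quick way to catch this kind of mismatch.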
Re: [R] How to plot both lines and points by group on ggplot2
At 19:20 on 01/07/2023, Luigi Marongiu wrote:

Hello, I have a dataframe with measurements stratified by the concentration of a certain substance. I would like to plot the points of the measures and connect the points within each series of concentrations. When I launch ggplot2 I get the error

```
geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?
```

and no lines are drawn. Where am I going wrong? Thank you Luigi

```
df = data.frame(Conc = c(rep(1, 3), rep(2, 3), rep(5, 3)),
                Time = rep(1:3, 3),
                Value = c(0.91, 0.67, 0.71, 0.91, 0.65, 0.74, 0.95, 0.67, 0.67))
df$Time <- as.factor(df$Time)
levels(df$Time) = c(1, 4, 24)
df$Conc <- as.factor(df$Conc)
levels(df$Conc) = c(1, 2, 5)
library(ggplot2)
ggplot(df, aes(x=Time, y=Value, colour=Conc)) +
  geom_point(size=6) +
  geom_line(aes(x=Time, y=Value, colour=Conc)) +
  scale_colour_manual(values = c("darkslategray3", "darkslategray4", "deepskyblue4")) +
  ggtitle("Working example") +
  xlab(expression(bold("Time (h)"))) +
  ylab(expression(bold("Concentration (mM)")))
```

Hello,

Here are two solutions. I have removed the redundant aes() from geom_line in both plots.

1. If you do not coerce Time to factor, the x axis will be continuous. The plot will be as expected but you will have to include a scale_x_continuous to get the wanted labels.

df = data.frame(Conc = c(rep(1, 3), rep(2, 3), rep(5, 3)),
                Time = rep(1:3, 3),
                Value = c(0.91, 0.67, 0.71, 0.91, 0.65, 0.74, 0.95, 0.67, 0.67))
library(ggplot2)
df$Conc <- factor(df$Conc, levels = c(1, 2, 5))
ggplot(df, aes(x=Time, y=Value, colour=Conc)) +
  geom_point(size=6) +
  geom_line() +
  scale_colour_manual(values = c("darkslategray3", "darkslategray4", "deepskyblue4")) +
  scale_x_continuous(breaks = 1:3, labels = c(1, 4, 24)) +
  ggtitle("Working example") +
  xlab(expression(bold("Time (h)"))) +
  ylab(expression(bold("Concentration (mM)")))

2. Time is coerced to factor. Then, tell geom_line the data is grouped by Conc. This is probably the solution you should use.

df$Time <- factor(df$Time, labels = c(1, 4, 24))
ggplot(df, aes(x=Time, y=Value, colour=Conc)) +
  geom_point(size=6) +
  geom_line(aes(group = Conc)) +
  scale_colour_manual(values = c("darkslategray3", "darkslategray4", "deepskyblue4")) +
  ggtitle("Working example") +
  xlab(expression(bold("Time (h)"))) +
  ylab(expression(bold("Concentration (mM)")))

Hope this helps,

Rui Barradas
Re: [R] Issue with crammed Y axis
Às 00:07 de 17/06/2023, Ana Marija escreveu: Hi, I have a data frame like this: dput(df) structure(list(ID = 1:8, Type = c("gmx mdrun -ntmpi 8 -ntomp 1 -s benchPEP.tpr -nsteps 1 -resethway", "gmx mdrun -ntmpi 8 -ntomp 1 -s benchPEP.tpr -nsteps 1 -resethway", "gmx mdrun -ntmpi 8 -s benchPEP.tpr -nsteps 4000 -resetstep 3000", "gmx mdrun -ntmpi 8 -s benchPEP.tpr -nsteps 4000 -resetstep 3000", "gmx mdrun -ntmpi 8 -s benchPEP.tpr -nsteps -1 -maxh 1.0 -resethway", "gmx mdrun -ntmpi 8 -s benchPEP.tpr -nsteps -1 -maxh 1.0 -resethway", "gmx mdrun -ntmpi 8 -ntomp 1 -s benchPEP.tpr -nsteps -1 -maxh 1.0 -resethway -noconfout", "gmx mdrun -ntmpi 8 -ntomp 1 -s benchPEP.tpr -nsteps -1 -maxh 1.0 -resethway -noconfout" ), Annee = c("SYCL", "CUDA", "SYCL", "CUDA", "SYCL", "CUDA", "SYCL", "CUDA"), Domain.decomp. = c("2. 1", "2", "2. 1", "2. 1", "2.1", "2", "2. 1", "2"), DD.com..load = c(0, 0, 0, 0, 3.7, 3, 0, 0), Neighbor.search = c("3.7", "3. 1", "3.7", "3.9", "0. 1", "O. 1", "3.5", "3. 1"), Launch.PP.GPU.ops. = c("0. 1", "0", "0.2", "0", "1 .6", "1 . 5", "0.2", "0. 1"), Comm..coord. = c("1 .6", "1 .0", "1 .5", "1 .3", "1 .5", "1 .3", "1 . 5", "1 .6"), Force = c("1 . 5", "1 .2", "1 .4", "1 .2", "1 .3", "1 . 1", "1 .5", "1 .2"), Wait...Comm..F = c("1 .3", "1 .7", "1 .2", "1 .0", "66.7", "68.8", "1 .2", "1 .2"), PIE.mesh = c("65.6", "70.9", "61 .0", "61 .4", "0", "0", "67.6", "69.2"), Wait.Bonded.GPU = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), wait.GPU.NB.nonloc. = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), Wait.GPU.NB.local = c(0, 0, 0, 0, 7.4, 5.7, 0, 0), NB.X.F.buffer.ops. = c("7.3", "4.4", "6. 7", "5", "0. 1", "0. 1", "7.2", "5.5"), Write.traje = c("0.3", "0.3", "1 .2", "1 .3", "6.4", "6. 1", "O. 1", "0. 1"), Update = c(6.3, 4.3, 5.7, 4.9, 8.2, 9.5, 6.2, 5.6), Constraints = c("8.9", "9.7", "1 1 .6", "13.3", "0.3", "0.4", "8. 1", "9.5"), Comm..energies = c("0.9", "0.9", "3.3", "3.9", "8.4", "8. 5", "0.3", "0.4"), PIE.redist..X.F = c("8. 1", "8.7", "7.9", "7.4", "29.9", "30.1", "8. 1", "8. 
1"), PIE.spread = c("29.7", "30.6", "27.2", "29.6", "20.3", "20.2", "30. 1", "30.4"), PIE.gather = c("19.9", "21 .3", "18.7", "19", "6.4", "8.4", "20", "20.6"), PIE.3D.FFT = c("6", "8.6", "5.7", "4.3", "1 .0", "1 .1", "7.6", "8.4"), PIE.3D.FFT.comm. = c("1 .2", "1 .0", "0.9", "0.7", "1 .2", "0. 5", "1 .0", "1 .1"), PIE.solve.Elec = c(0.7, 0.5, 0.6, 0.3, 0.7, 0.5, 0.7, 0.5)), class = "data.frame", row.names = c(NA, -8L)) I am plotting this data with: library(reshape2) library(ggplot2) df <- read.csv("/Users/anamaria/Downloads/B5.csv", stringsAsFactors=FALSE, header=TRUE) df.long<-melt(df,id.vars=c("ID","Type","Annee")) myplot =ggplot(df.long,aes(variable,value,fill=as.factor(Annee)))+ geom_bar(position="dodge",stat="identity")+ ylab("Simulation Progress (%)") + facet_wrap(~Type,nrow=3) myplot + theme(panel.grid.major = element_blank(), legend.title=element_blank(), panel.grid.minor = element_blank(), panel.background = element_blank(), axis.title.x = element_blank(), axis.text.x = element_text(angle = 90, vjust = 0.5, hjust=1), axis.line = element_line(colour = "black")) My issue is that Y axis is crammed. How it can be cleaned up and say feature only say these values: 0, 10, 20,30, ...80. I tried using: scale_y_continuous(breaks = breaks_width(10))+ But I got this error: Error in breaks_width(10) : could not find function "breaks_width" Also can anything be done about the subtitle of the top left plot, which is not quite fitting in that gray box: " gmx mdrun -ntmpi 8 -ntomp 1 -s benchPEP.tpr -nsteps 1 -resethway" Thanks Ana [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. Hello, The problem seems to be that df.long$value is character and that it has spaces and "O" (upper case letter O) in it. 
Try, before plotting

df.long$value <- gsub(" ", "", df.long$value)
df.long$value <- sub("O", "0", df.long$value)
df.long$value <- as.numeric(df.long$value)

For me, this solved the problem.

As for breaks_width, that's a function in package scales, so if the above doesn't solve it, qualify the function name:

scale_y_continuous(breaks = scales::breaks_width(10)) +

Hope this helps,

Rui Barradas
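One detail that may matter here: sub() replaces only the first match in each string, whereas gsub() replaces every match. A small sketch of the same clean-up applied to a few values copied from the dput() output in the question (the vector names raw, clean and value are made up for illustration):

```r
# Values as they appear in the posted data: stray spaces and a letter "O"
raw <- c("2. 1", "O. 1", "1 1 .6", "61 .0", "1 . 5")

# fixed = TRUE treats the pattern as a literal string, not a regular expression
clean <- gsub(" ", "", raw, fixed = TRUE)
clean <- gsub("O", "0", clean, fixed = TRUE)

value <- as.numeric(clean)
value

# Any value that still fails to parse would show up as NA
anyNA(value)
```

If anyNA() returns TRUE on the real column, inspecting the entries that became NA should show which other stray characters remain.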
Re: [R] Problem with filling dataframe's column
At 17:18 on 13/06/2023, javad bayat wrote:

Dear Rui; Hi. I used your code, but it seems it didn't work for me.

pat <- c("_esmdes|_Des Section|0")
dim(data2)
[1] 281549 9
grep(pat, data2$Layer)
dim(data2)
[1] 281549 9

What does the grep function do? I expected the function to remove 3 rows of the dataframe. I do not know the reason.
Re: [R] Problem with filling dataframe's column
At 23:13 on 12/06/2023, javad bayat wrote:

Dear Rui; Many thanks for the email. I tried your code and found that the "Values" and "Names" vectors must have the same length, otherwise the results will not be useful. For some of the values in the Layer column that I do not need to be filled in the LU column, I used "NA". But I need to delete some of the rows from the table as they are useless for me. I tried this code to delete the rows of the dataframe which contain these three values in the Layer column. It gave me the following error.

data3 = data2[-grep(c("_esmdes","_Des Section","0"), data2$Layer),]
Warning message:
In grep(c("_esmdes", "_Des Section", "0"), data2$Layer) :
  argument 'pattern' has length > 1 and only the first element will be used

data3 = data2[!grepl(c("_esmdes","_Des Section","0"), data2$Layer),]
Warning message:
In grepl(c("_esmdes", "_Des Section", "0"), data2$Layer) :
  argument 'pattern' has length > 1 and only the first element will be used

How can I do this? Sincerely

Hello,

Please cc the r-help list, R-Help is threaded and this can in the future be helpful to others.

You can combine several patterns like this:

pat <- c("_esmdes|_Des Section|0")
grep(pat, data2$Layer)

or, programmatically,

pat <- paste(c("_esmdes","_Des Section","0"), collapse = "|")

Hope this helps,

Rui Barradas
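Note that grep() only returns the positions of the matching elements; it does not modify the data frame, so its result still has to be used for subsetting. A minimal sketch with a made-up five-row data2 (the real one has 281549 rows; drop_values is a hypothetical name), which also shows that the unanchored pattern "0" matches any Layer containing a zero:

```r
data2 <- data.frame(
  Layer = c("Level 1", "_esmdes", "Level 10", "_Des Section", "0"),
  stringsAsFactors = FALSE
)

drop_values <- c("_esmdes", "_Des Section", "0")
pat <- paste(drop_values, collapse = "|")

# Regular-expression match: also drops "Level 10" because it contains "0".
# grepl() with ! is safer than -grep(): with no matches, -grep() selects zero rows.
data3 <- data2[!grepl(pat, data2$Layer), , drop = FALSE]
data3$Layer

# Exact match on whole values: keeps "Level 10"
data4 <- data2[!(data2$Layer %in% drop_values), , drop = FALSE]
data4$Layer
```

Which of the two is appropriate depends on whether the three strings are meant as complete Layer values or as fragments to search for.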
Re: [R] Problem with filling dataframe's column
At 13:18 on 11/06/2023, Rui Barradas wrote:

Hello,

You don't need to repeat the same instruction 100+ times, there is a way of assigning all new LU values at the same time with match(). This assumes that you have the new values in a vector.

Sorry, this is not clear. I mean: this assumes that you have the new values in a vector, the vector Names below. The vector of values to be matched is created from the data.

Rui Barradas

Values <- sort(unique(data2$Layer))
Names <- c("Park", "Agri", "GS")
i <- match(data2$Layer, Values)
data2$LU <- Names[i]

Hope this helps,

Rui Barradas
Re: [R] Problem with filling dataframe's column
At 22:54 on 11/06/2023, javad bayat wrote:

Dear Rui; Many thanks for your email. I used one of your codes, "data2$LU[which(data2$Layer == "Level 12")] <- "Park"", and it works correctly for me. Actually I need to expand the code so as to consider all "Levels" in the "Layer" column. There are more than a hundred levels in the Layer column. If I use your provided code, I have to write it a hundred times as below:

data2$LU[which(data2$Layer == "Level 1")] <- "Park";
data2$LU[which(data2$Layer == "Level 2")] <- "Agri";
... ... ... .

Is there any other way to expand the code in order to consider all of the levels simultaneously? Like the below code:

data2$LU[which(data2$Layer == c("Level 1","Level 2", "Level 3", ...))] <- c("Park", "Agri", "GS", ...)

Sincerely

Hello,

You don't need to repeat the same instruction 100+ times, there is a way of assigning all new LU values at the same time with match(). This assumes that you have the new values in a vector.

Values <- sort(unique(data2$Layer))
Names <- c("Park", "Agri", "GS")
i <- match(data2$Layer, Values)
data2$LU <- Names[i]

Hope this helps,

Rui Barradas
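The lookup may be easier to audit when the old and new values are written side by side rather than taken from the sort order of unique(), since sort() orders character values alphabetically and would typically place "Level 10" before "Level 2". A sketch with made-up data; the pairing of levels and names follows the example in the question, and old_values and new_values are hypothetical names:

```r
data2 <- data.frame(
  Layer = c("Level 2", "Level 1", "Level 3", "Level 1", "Level 9"),
  stringsAsFactors = FALSE
)

# Explicit lookup table: position k of old_values maps to position k of new_values
old_values <- c("Level 1", "Level 2", "Level 3")
new_values <- c("Park", "Agri", "GS")
stopifnot(length(old_values) == length(new_values))

# match() gives, for every row, the position of its Layer in old_values;
# a Layer without an entry ("Level 9" here) yields NA
data2$LU <- new_values[match(data2$Layer, old_values)]
data2
```

Rows whose Layer has no entry in the table end up with NA in LU, which makes them easy to find or to drop afterwards.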
Re: [R] Problem with filling dataframe's column
At 21:05 on 11/06/2023, javad bayat wrote:

Dear R users; I am trying to fill a column based on a specific value in another column of a dataframe, but it seems there is a problem with the code! The "Layer" and the "LU" are two different columns of the dataframe. How can I fix this? Sincerely

for (i in 1:nrow(data2$Layer)){
  if (data2$Layer == "Level 12") {
    data2$LU == "Park"
  }
}

Hello,

There are three bugs in your code, 1) the index i is not used in the loop 2) the assignment operator is `<-`, not `==` 3) nrow() of a single column is NULL, so the loop has to run over seq_along(data2$Layer). Here is the loop corrected.

for (i in seq_along(data2$Layer)){
  if (data2$Layer[i] == "Level 12") {
    data2$LU[i] <- "Park"
  }
}

But R is a vectorized language, the following two ways are the idiomatic ways of doing what you want to do.

i <- data2$Layer == "Level 12"
data2$LU[i] <- "Park"

# equivalent one-liner
data2$LU[data2$Layer == "Level 12"] <- "Park"

If there are NAs in data2$Layer it's probably safer to wrap the logical index in which(), to get a numeric one.

i <- which(data2$Layer == "Level 12")
data2$LU[i] <- "Park"

# equivalent one-liner
data2$LU[which(data2$Layer == "Level 12")] <- "Park"

Hope this helps,

Rui Barradas
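if() stops with an error when its condition is NA, which is one more reason to prefer the vectorized forms. A minimal sketch on a made-up four-row data2 (the loop runs over seq_along() because nrow() of a single column is NULL), checking that a guarded loop and the which() assignment give the same result:

```r
data2 <- data.frame(
  Layer = c("Level 12", "Level 3", NA, "Level 12"),
  LU = NA_character_,
  stringsAsFactors = FALSE
)

# Loop version: is.na() guards the comparison so if() never sees NA
by_loop <- data2
for (i in seq_along(by_loop$Layer)) {
  if (!is.na(by_loop$Layer[i]) && by_loop$Layer[i] == "Level 12") {
    by_loop$LU[i] <- "Park"
  }
}

# Vectorized version: which() drops the NA from the logical comparison
vectorized <- data2
vectorized$LU[which(vectorized$Layer == "Level 12")] <- "Park"

identical(by_loop, vectorized)
```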
Re: [R] Recombining Mon and Year values
At 21:29 on 16/05/2023, Jeff Reichman wrote:

R Help

I have a data.frame where I've broken out the year and an ordered month value. But I need to recombine them so I can graph mon-year in order, and when I recombine them I lose the month order and the results are plotted alphabetically.

Year  month  mon_year
2021  Mar    Mar-2021
2021  Jan    Jan-2021
2021  Apr    Apr-2021

So do I need to convert the months back to an integer and then recombine to plot?

Jeff Reichman

Hello,

You can use function as.yearmon in package zoo to get the correct month/year order.

df1 <- data.frame(Year = c(2021, 2021, 2021), Mon = c("Mar", "Jan", "Apr"))
df1$mon_year <- zoo::as.yearmon(paste(df1$Mon, df1$Year))
sort(df1$mon_year)
#> [1] "Jan 2021" "Mar 2021" "Apr 2021"

Hope this helps,

Rui Barradas
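If the combined label has to stay in the "Mon-Year" text form for plotting, another option is a factor whose levels are put in calendar order. A base R sketch, assuming English month abbreviations as in the built-in constant month.abb (the names mon_num, labels and ord are made up for illustration):

```r
df1 <- data.frame(Year = c(2021, 2021, 2021),
                  Mon = c("Mar", "Jan", "Apr"),
                  stringsAsFactors = FALSE)

# Month number from the abbreviation, then the calendar order of the rows
mon_num <- match(df1$Mon, month.abb)
ord <- order(df1$Year, mon_num)

# Text labels as in the question; the factor levels follow calendar order
labels <- paste(df1$Mon, df1$Year, sep = "-")
df1$mon_year <- factor(labels, levels = unique(labels[ord]))

levels(df1$mon_year)
```

Plotting functions that respect factor levels then draw the categories in level order instead of alphabetically.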
Re: [R] Newbie: Drawing fitted lines on subset of data
At 15:29 on 16/05/2023, Kevin Zembower via R-help wrote:

Hello,

I'm still working with my tsibble of weight data for the last 20 years. In addition to drawing an overall trend line, using lm, for the whole data set, I'd like to draw short lines that would recompute lm and draw it, say, just for the years from 2010:2015. Here's a short example that I think illustrates what I'm trying to do. The commented out sections show what I've tried so far:

## Short example to test segments:
w <- tsibble(
  date = as.Date("2022-01-01") + 0:99,
  value = rnorm(100)
)
ggplot(data = w, mapping = aes(date, value)) +
  geom_smooth(method = "lm", se = FALSE) +
  geom_point()
  ## Below gives error about ignoring data
  ## geom_abline( data = w$date[25:75] )
  ## Gives error ''data' must be in '
  ## geom_smooth(data = w$date[25:35],
  ##             method = lm,
  ##             color = "black",
  ##             se = FALSE)

I'm thinking that this is probably easily done, but I'm struggling with how to subset the data in the middle of the pipeline. Thanks for any advice and help.

-Kevin

Hello,

Try the following. In the 2nd geom_smooth you need a subset of the data, not of just one of its columns.

suppressPackageStartupMessages({
  library(tsibble)
  library(dplyr)
  library(ggplot2)
  library(lubridate)
})

ggplot(data = w, mapping = aes(date, value)) +
  geom_smooth(formula = y ~ x, method = "lm", se = FALSE) +
  geom_point() +
  geom_smooth(
    data = w %>% filter(year(date) >= 2010, year(date) <= 2015),
    mapping = aes(date, value),
    formula = y ~ x,
    method = lm,
    color = "black",
    se = FALSE
  )

Other ways to subset the data are

# dplyr
data = w %>% filter(year(date) %in% 2010:2015)
# base R
data = subset(w, year(date) %in% 2010:2015)

Hope this helps,

Rui Barradas
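The example w only covers 2022, so a filter on the years 2010 to 2015 returns no rows for it. A sketch of the same idea with a date window inside the example data, using a plain data frame so that it runs without tsibble; the window limits and the names from, to and w_sub are made up, and the plot object is only built if ggplot2 is installed:

```r
set.seed(42)
w <- data.frame(date = as.Date("2022-01-01") + 0:99, value = rnorm(100))

# Subset whole rows by date range (not a single column)
from <- as.Date("2022-01-25")
to <- as.Date("2022-03-16")
w_sub <- w[w$date >= from & w$date <= to, ]
nrow(w_sub)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Overall trend from the full data, black trend line from the subset only
  p <- ggplot(w, aes(date, value)) +
    geom_point() +
    geom_smooth(formula = y ~ x, method = "lm", se = FALSE) +
    geom_smooth(data = w_sub, formula = y ~ x, method = "lm",
                color = "black", se = FALSE)
  # print(p) draws the plot
}
```

Because the second layer has its own data argument, its fitted line only spans the dates present in w_sub.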
Re: [R] Error message when using 'optim' for numerical maximum likelihood
At 06:28 on 14/05/2023, iguodala edwin via R-help wrote:

Good morning, How can I resolve the error message New_X with convergence 1? Thanks

Hello,

Please include data and the code you tried in your questions to R-Help. We'll be glad to help, but without them it is not possible to do so.

Rui Barradas
Re: [R] aggregate wind direction data with wind speed required
At 15:51 on 13/05/2023, Stefano Sofia wrote:

Dear list users,

I have to aggregate wind direction data (wd) using a function that also requires a second input variable, wind speed (ws). This is the function that I need to use:

my_fun <- function(wd1, ws1){
  u_component <- -ws1*sin(2*pi*wd1/360)
  v_component <- -ws1*cos(2*pi*wd1/360)
  mean_u <- mean(u_component, na.rm=T)
  mean_v <- mean(v_component, na.rm=T)
  mean_wd <- (atan2(mean_u, mean_v) * 360/2/pi) + 180
  result <- mean_wd
  result
}

Does the aggregate function work only with functions with a single input variable (the one that I want to aggregate), or can its use be extended to functions with two input variables?

Here is a simple example (which is meaningless; the important thing is the concept behind it):

df <- data.frame(day=c(1, 1, 1, 2, 2, 2, 3, 3),
                 month=c(1, 1, 2, 2, 2, 2, 2, 2),
                 wd=c(45, 90, 90, 135, 180, 270, 270, 315),
                 ws=c(7, 7, 8, 3, 2, 7, 14, 13))

aggregate(wd ~ day + month, data=df, FUN = my_fun)

cannot work, because ws is not taken into consideration. I got lost. Any hint, any help? I hope to have been able to explain my problem.

Thank you for your attention,
Stefano

Hello,

Use the dots argument to pass any number of named arguments to your aggregation function. In this case, ws1 = ws at the end of the aggregate call.

aggregate(wd ~ day + month, data=df, FUN = my_fun, ws1 = ws)

You can also give the user the option to remove NA's or not by adding a na.rm argument:

my_fun <- function(wd1, ws1, na.rm = FALSE) {
  [...]
  mean_u <- mean(u_component, na.rm = na.rm)
  mean_v <- mean(v_component, na.rm = na.rm)
  [...]
}

aggregate(wd ~ day + month, data=df, FUN = my_fun, ws1 = ws, na.rm = TRUE)

Hope this helps,
Rui Barradas

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
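A note on the call above: arguments supplied through the dots are handed to FUN unchanged for every group, so ws1 = ws is not split by day and month along with wd (and the call fails if no object named ws is visible outside df). The following is only a sketch of one possible alternative, not part of the original reply: it splits whole rows of df so that wd and ws stay aligned within each group. The na.rm default and the names groups and res are assumptions made for the illustration.

```r
# Vector-averaged wind direction (function from the thread, with na.rm
# exposed as an argument; the default TRUE is an assumption).
my_fun <- function(wd1, ws1, na.rm = TRUE) {
  u_component <- -ws1 * sin(2 * pi * wd1 / 360)
  v_component <- -ws1 * cos(2 * pi * wd1 / 360)
  mean_u <- mean(u_component, na.rm = na.rm)
  mean_v <- mean(v_component, na.rm = na.rm)
  (atan2(mean_u, mean_v) * 360 / 2 / pi) + 180
}

df <- data.frame(day   = c(1, 1, 1, 2, 2, 2, 3, 3),
                 month = c(1, 1, 2, 2, 2, 2, 2, 2),
                 wd    = c(45, 90, 90, 135, 180, 270, 270, 315),
                 ws    = c(7, 7, 8, 3, 2, 7, 14, 13))

# Split whole rows by day/month so that wd and ws are subset together.
groups <- split(df, list(df$day, df$month), drop = TRUE)

# Apply the two-argument function to each group and bind the results.
res <- do.call(rbind, lapply(groups, function(g) {
  data.frame(day = g$day[1], month = g$month[1], wd = my_fun(g$wd, g$ws))
}))
rownames(res) <- NULL
res
```

With the example data this yields one row per day/month combination; a group with a single observation simply returns that observation's direction.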
Re: [R] Newbie: Controlling legends in graphs
At 14:24 on 12/05/2023, Kevin Zembower via R-help wrote:

Hello,

I'm trying to create a line graph with a legend, but have had no success controlling the legend. Since nothing I've tried seems to work, I must be doing something systematically wrong. Can anyone point this out to me?

Here's my data:

> weights
# A tibble: 1,246 × 3
   Date           J     K
 1 2000-02-13   133   188
 2 2000-02-20   134   185
 3 2000-02-27   135   187
 4 2000-03-05   135   185
 5 2000-03-12    NA   184
 6 2000-03-19    NA   184.
 7 2000-03-26   136   184.
 8 2000-04-02   134   185
 9 2000-04-09   133   186
10 2000-04-16    NA   186
# ℹ 1,236 more rows
# ℹ Use `print(n = ...)` to see more rows
>

Here are my attempts. You can see some of the things I've tried in the commented-out sections:

weights %>%
  group_by(year(Date)) %>%
  summarize(
    m_K = mean(K, na.rm = TRUE),
    m_J = mean(J, na.rm = TRUE),
  ) %>%
  ggplot(aes(x = `year(Date)`)) +
  geom_point(aes(y = m_K, color = "red")) +
  geom_smooth(aes(y = m_K, color = "red")) +
  geom_point(aes(y = m_J, color = "blue")) +
  geom_smooth(aes(y = m_J, color = "blue")) +
  guides(size = "legend", shape = "legend")
  ## scale_shape_discrete(name="Person",
  ##                      breaks=c("m_K", "m_J"),
  ##                      labels=c("K", "J"))
  ## theme(legend.title=element_blank())

When this runs, the blue line for "K" is above the red line for "J", as I expect, but in the legend, the red is shown first, and labeled "blue." I'd like to be able to create a legend where the first entry shows a blue line and is labeled "K" and the second is red and labeled "J".

On a different but related topic, I'd welcome any advice or suggestions on my methodology in this example. Is this the correct way to summarize with a mean? Do I need the two sets of geom_point and geom_line clauses to create this graph, or is there a better way?

Thanks for all your advice and guidance.

-Kevin

Hello,

This is mainly a data reshaping problem. Instead of plotting two variables, J and K, if the data is in the long format you will map the column with these variables' names to the color aesthetic and call each geom_* only once. Then, assign the colors you want.

As for placing K above J, note that ggplot places them in alphabetical order unless you coerce to factor with the levels in the order you want.

Also, if you want to compute aggregate statistics for several columns, use ?across. See the code below.

Here is a complete example. I have augmented your data set in order to have more years to plot.

# packages are loaded first: years() below comes from lubridate
library(ggplot2)
library(dplyr)
library(tidyr)
library(lubridate)

# augment the data set
weights <- "
Date J K
1 2000-02-13 133 188
2 2000-02-20 134 185
3 2000-02-27 135 187
4 2000-03-05 135 185
5 2000-03-12 NA 184
6 2000-03-19 NA 184.
7 2000-03-26 136 184.
8 2000-04-02 134 185
9 2000-04-09 133 186
10 2000-04-16 NA 186"
weights <- read.table(text = weights, header = TRUE)
weights$Date <- as.Date(weights$Date)

tmp <- weights
tmp <- lapply(1:10, \(y) {
  tmp$Date <- years(y) + tmp$Date
  tmp$J <- tmp$J + sample(-10:10, nrow(weights), TRUE)
  tmp$K <- tmp$K + sample(-10:10, nrow(weights), TRUE)
  tmp
})
weights <- do.call(rbind, tmp)

#---
# plot code
weights %>%
  mutate(Year = year(Date)) %>%
  group_by(Year) %>%
  summarize(across(J:K, \(x) mean(x, na.rm = TRUE))) %>%
  # now reshape the data
  pivot_longer(-Year) %>%
  # uncomment the next line if you want K
  # to show up on top in the legend
  # mutate(name = factor(name, levels = c("K", "J"))) %>%
  ggplot(aes(Year, value, color = name)) +
  geom_smooth(
    formula = y ~ x,
    method = lm,
    se = FALSE
  ) +
  geom_point() +
  scale_color_manual(values = c(J = "red", K = "blue"))

Hope this helps,
Rui Barradas
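The reshaping and factor-ordering idea in the reply can be seen without any plotting package. The sketch below uses base R only and a small made-up table of yearly means (the numbers and the names yearly and long are hypothetical, chosen just for illustration): the wide columns J and K are stacked into one value column plus a name column, and the factor levels of name fix the order in which groups are listed.

```r
# Hypothetical yearly means in wide format (one column per person).
yearly <- data.frame(Year = 2000:2002,
                     J = c(134, 133, 135),
                     K = c(186, 185, 187))

# Long format: one row per Year/person pair.
long <- data.frame(
  Year  = rep(yearly$Year, times = 2),
  name  = rep(c("J", "K"), each = nrow(yearly)),
  value = c(yearly$J, yearly$K)
)

# Alphabetical order would put J first; explicit levels put K first.
long$name <- factor(long$name, levels = c("K", "J"))

levels(long$name)
long
```

A plotting layer that maps name to colour would then list K before J, which is what the commented-out mutate() line in the reply achieves.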
Re: [R] data.frame with a column containing an array
At 11:52 on 08/05/2023, Georg Kindermann wrote:

Dear list members,

when I create a data.frame containing an array, I had expected to get a similar result when subsetting it as when having a matrix in a data.frame. But instead I get only the first element and not all values of the remaining dimensions. There are differences already when creating the data.frame: I can use `I` in the case of a matrix, but for an array I am only able to insert it in a second step.

DFA <- data.frame(id = 1:2)
DFA[["ar"]] <- array(1:8, c(2,2,2))

DFA[1,]
#  id ar
#1  1  1

DFM <- data.frame(id = 1:2, M = I(matrix(1:4, 2)))

DFM[1,]
#  id M.1 M.2
#1  1   1   3

The same happens when trying to use merge, where only the first value is kept.

merge(DFA, data.frame(id = 1))
#  id ar
#1  1  1

merge(DFM, data.frame(id = 1))
#  id M.1 M.2
#1  1   1   3

Is there a way to use an array in a data.frame like I can use a matrix in a data.frame?

I am using R version 4.3.0.

Kind regards,
Georg

Hello,

Are you looking for something like this?

DFA <- data.frame(id = 1:2)
DFA[["ar"]] <- array(1:8, c(2,2,2))

DFA$ar[1, , ]
#>      [,1] [,2]
#> [1,]    1    5
#> [2,]    3    7

Hope this helps,
Rui Barradas
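Building on the reply, the sketch below shows one way to select rows by a condition on another column while keeping all three dimensions of the array. It indexes the array column directly rather than going through the data frame's row subsetting; the names rows and slice are only illustrative, and this is a workaround under those assumptions, not a way to make DFA[1, ] itself return the full slice.

```r
DFA <- data.frame(id = 1:2)
DFA[["ar"]] <- array(1:8, c(2, 2, 2))

# Logical row selector built from an ordinary column.
rows <- DFA$id == 1

# Index the array column itself; drop = FALSE keeps the row dimension,
# so the result is still a 3-d array with one "row".
slice <- DFA$ar[rows, , , drop = FALSE]

dim(slice)
slice[1, , ]
```

Keeping drop = FALSE makes the result shape predictable whether one or several rows match.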
Re: [R] grubbs test to detect all outliers
At 14:01 on 29/04/2023, AbouEl-Makarim Aboueissa wrote:

Hi Rui:

How about this dataset? I included a few outliers in each column, as you can see in the printed dataset below.

Once again, thank you very much, and sorry if I bothered you all.

abou

dput(datafortest)
structure(list(factor1 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L, NA, NA, NA, NA), levels = c("1", "2", "3"), class = "factor"),
X = c(994455.077, 4348.031, .789, 3813.139, 12.65, 5642.667,
876684.386, 5165.731, NA, 3259.241, 8.383, 1997.878, 0.608,
2655.977, 9.49, 1826.851, 4386.002, 883295.091, 2120.902, NA,
2056.123, 5.088, NA, 92539.873, NA, NA, NA, NA),
Y = c(76888L, 333L, 618L, 10L, 344L, NA, 3L, 86999L, 265L, 557L,
7L, 383L, NA, NA, 8L, 287L, 352L, 308L, 999526L, 489L, 2L, 444L,
9L, 333L, NA, NA, NA, NA),
factor2 = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
3L), levels = c("1", "2", "3"), class = "factor"),
Z = c(54999L, 475L, 15L, 603L, 442L, 79486L, 927L, 971L, 388L,
888L, 514L, 409L, 546L, 523L, 313L, 296L, 320L, 388L, 7L, 677L,
555L, NA, 479L, 257L, 313L, 21L, 320L, 4L),
U = c(NA, NA, 1.5, 332, 216, 217, 1000, 10, , 444, NA, 5, 327, 5,
456, 412, 251, 6, 398, 438, 428, 15, NA, 406, 334, 465, 180, 88999),
V = c(12, 240, 9000, 265, NA, 9, 1, 562, 13, 777, 322, NA, 99988,
653, 450, 576, NA, 396.5, 91888, 5, 219, NA, 321, 417, 409, 99,
523, 10)), row.names = c(NA, -28L), class = "data.frame")

datafortest
   factor1          X      Y factor2     Z       U       V
1        1 994455.077  76888       1 54999      NA    12.0
2        1   4348.031    333       1   475      NA   240.0
3        1       .789    618       1    15     1.5  9000.0
4        1   3813.139     10       1   603   332.0   265.0
5        1     12.650    344       1   442   216.0      NA
6        1   5642.667     NA       1 79486   217.0     9.0
7        1 876684.386      3       1   927  1000.0     1.0
8        2   5165.731  86999       1   971    10.0   562.0
9        2         NA    265       1   388      .0    13.0
10       2   3259.241    557       2   888   444.0   777.0
11       2      8.383      7       2   514      NA   322.0
12       2   1997.878    383       2   409     5.0      NA
13       2      0.608     NA       2   546   327.0 99988.0
14       2   2655.977     NA       2   523     5.0   653.0
15       3      9.490      8       2   313   456.0   450.0
16       3   1826.851    287       2   296   412.0   576.0
17       3   4386.002    352       2   320   251.0      NA
18       3 883295.091    308       2   388     6.0   396.5
19       3   2120.902 999526       3     7   398.0 91888.0
20       3         NA    489       3   677   438.0     5.0
21       3   2056.123      2       3   555   428.0   219.0
22       3      5.088    444       3    NA    15.0      NA
23       3         NA      9       3   479      NA   321.0
24       3  92539.873    333       3   257   406.0   417.0
25    <NA>         NA     NA       3   313   334.0   409.0
26    <NA>         NA     NA       3    21   465.0    99.0
27    <NA>         NA     NA       3   320   180.0   523.0
28    <NA>         NA     NA       3     4 88999.0    10.0

with many thanks
abou

__
*AbouEl-Makarim Aboueissa, PhD*
*Professor, Mathematics and Statistics*
*Graduate Coordinator*
*Department of Mathematics and Statistics*
*University of Southern Maine*

On Sat, Apr 29, 2023 at 8:05 AM Rui Barradas wrote:

At 14:09 on 28/04/2023, AbouEl-Makarim Aboueissa wrote:

R: Grubbs test to detect all outliers per group for all columns in a data frame

Dear All: good morning

I have a dataset (as an example) with two factor columns (factor1 and factor2) and 5 numerical columns (X, Y, Z, U, V). The X and Y columns have the same length as factor1; Z, U, and V have the same length as factor2. The dataset is copied below. Please note that all dataset columns have NA values.

*Need help on this:* Can we use the grubbs.test() function to detect all outliers and replace them with NA in the X and Y columns per group in factor1, and in the Z, U, and V columns per group in factor2?

Columns in the data frame have different lengths, but when I read the .csv file, R added NA values for the shorter columns. If you need the .csv data file, please let me know.

Thank you very much for your help in advance.

install.packages("outliers")
library(outliers)
datafortest<-read.csv("G:/data_for_test.csv", header=TRUE)
datafortest
datafortest<-data.frame(datafortest)
datafortest$factor1<-as.factor(datafortest$factor1)
datafortest$factor2<-as.fact
Re: [R] grubbs test to detect all outliers
At 14:09 on 28/04/2023, AbouEl-Makarim Aboueissa wrote:

R: Grubbs test to detect all outliers per group for all columns in a data frame

Dear All: good morning

I have a dataset (as an example) with two factor columns (factor1 and factor2) and 5 numerical columns (X, Y, Z, U, V). The X and Y columns have the same length as factor1; Z, U, and V have the same length as factor2. The dataset is copied below. Please note that all dataset columns have NA values.

*Need help on this:* Can we use the grubbs.test() function to detect all outliers and replace them with NA in the X and Y columns per group in factor1, and in the Z, U, and V columns per group in factor2?

Columns in the data frame have different lengths, but when I read the .csv file, R added NA values for the shorter columns. If you need the .csv data file, please let me know.

Thank you very much for your help in advance.

install.packages("outliers")
library(outliers)
datafortest<-read.csv("G:/data_for_test.csv", header=TRUE)
datafortest
datafortest<-data.frame(datafortest)
datafortest$factor1<-as.factor(datafortest$factor1)
datafortest$factor2<-as.factor(datafortest$factor2)
str(datafortest)

# tried to use grubbs.test() on a single column of the dataframe, but still not working
tests.for.outliers.X<- grubbs.test(datafortest$X, na.rm = TRUE, type=11)

*grubbs.test() on a single dataset: but this can only detect if the min and the max are outliers.*

xx999<-c(0.088,1,2,3,4,5,6,7,8,9,88,98,99)
grubbs.test(xx999, type=11)

With many thanks
Abou

factor1 X Y factor2 Z U V
1 4455.077 888 1 999 NA 999
1 4348.031 333 1 475 NA 240
1 .789 618 1 507 252 394
1 3813.139 417 1 603 332 265
1 7512.65 344 1 442 216 NA
1 5642.667 NA 1 486 217 275
1 6684.386 341 1 927 698 479
2 5165.731 999 1 971 311 562
2 NA 265 1 388 999 512
2 3259.241 557 2 888 444 777
2 3288.383 234 2 514 NA 322
2 1997.878 383 2 409 311 NA
2 0.61 NA 2 546 327 728
2 2655.977 NA 2 523 228 653
3 3189.49 2 313 456 450
3 1826.851 287 2 296 412 576
3 4386.002 352 2 320 251 NA
3 3295.091 308 2 388 888 396.5
3 2120.902 526 3 398 888
3 NA 489 3 677 438 307
3 2056.123 291 3 555 428 219
3 1995.088 444 3 NA 319 NA
3 NA 349 3 479 NA 321
3 2539.873 333 3 257 406 417
3 313 334 409
3 296 465 546
3 320 180 523
3 388 999 313

Hello,

With the data file you have attached I cannot reproduce any errors; all went well at the first try.

library(outliers)

fl <- "~/data_for_test.csv"
datafortest <- read.csv(fl)

# these are not needed to run the test
datafortest$factor1 <- as.factor(datafortest$factor1)
datafortest$factor2 <- as.factor(datafortest$factor2)

str(datafortest)
#> 'data.frame': 28 obs. of 7 variables:
#>  $ factor1: Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 2 2 2 ...
#>  $ X      : num 4455 4348 1 3813 7513 ...
#>  $ Y      : int 888 333 618 417 344 NA 341 999 265 557 ...
#>  $ factor2: Factor w/ 3 levels "1","2","3": 1 1 1 1 1 1 1 1 1 2 ...
#>  $ Z      : int 999 475 507 603 442 486 927 971 388 888 ...
#>  $ U      : int NA NA 252 332 216 217 698 311 999 444 ...
#>  $ V      : num 999 240 394 265 NA 275 479 562 512 777 ...

head(datafortest)
#>   factor1        X   Y factor2   Z   U   V
#> 1       1 4455.077 888       1 999  NA 999
#> 2       1 4348.031 333       1 475  NA 240
#> 3       1     .789 618       1 507 252 394
#> 4       1 3813.139 417       1 603 332 265
#> 5       1 7512.650 344       1 442 216  NA
#> 6       1 5642.667  NA       1 486 217 275

# tried to use grubbs.test() on a single column of the dataframe, but
# still not working
grubbs.test(datafortest$X, type = 11)
#>
#>  Grubbs test for two opposite outliers
#>
#> data: datafortest$X
#> G = 4.6640014, U = 0.0091756, p-value = 0.02867
#> alternative hypothesis: 1826.851 and 0.608 are outliers

Hope this helps,
Rui Barradas
Re: [R] grubbs test to detect all outliers
At 14:09 on 28/04/2023, AbouEl-Makarim Aboueissa wrote:

R: Grubbs test to detect all outliers per group for all columns in a data frame

Dear All: good morning

I have a dataset (as an example) with two factor columns (factor1 and factor2) and 5 numerical columns (X, Y, Z, U, V). The X and Y columns have the same length as factor1; Z, U, and V have the same length as factor2. The dataset is copied below. Please note that all dataset columns have NA values.

*Need help on this:* Can we use the grubbs.test() function to detect all outliers and replace them with NA in the X and Y columns per group in factor1, and in the Z, U, and V columns per group in factor2?

Columns in the data frame have different lengths, but when I read the .csv file, R added NA values for the shorter columns. If you need the .csv data file, please let me know.

Thank you very much for your help in advance.

install.packages("outliers")
library(outliers)
datafortest<-read.csv("G:/data_for_test.csv", header=TRUE)
datafortest
datafortest<-data.frame(datafortest)
datafortest$factor1<-as.factor(datafortest$factor1)
datafortest$factor2<-as.factor(datafortest$factor2)
str(datafortest)

# tried to use grubbs.test() on a single column of the dataframe, but still not working
tests.for.outliers.X<- grubbs.test(datafortest$X, na.rm = TRUE, type=11)

*grubbs.test() on a single dataset: but this can only detect if the min and the max are outliers.*

xx999<-c(0.088,1,2,3,4,5,6,7,8,9,88,98,99)
grubbs.test(xx999, type=11)

With many thanks
Abou

factor1 X Y factor2 Z U V
1 4455.077 888 1 999 NA 999
1 4348.031 333 1 475 NA 240
1 .789 618 1 507 252 394
1 3813.139 417 1 603 332 265
1 7512.65 344 1 442 216 NA
1 5642.667 NA 1 486 217 275
1 6684.386 341 1 927 698 479
2 5165.731 999 1 971 311 562
2 NA 265 1 388 999 512
2 3259.241 557 2 888 444 777
2 3288.383 234 2 514 NA 322
2 1997.878 383 2 409 311 NA
2 0.61 NA 2 546 327 728
2 2655.977 NA 2 523 228 653
3 3189.49 2 313 456 450
3 1826.851 287 2 296 412 576
3 4386.002 352 2 320 251 NA
3 3295.091 308 2 388 888 396.5
3 2120.902 526 3 398 888
3 NA 489 3 677 438 307
3 2056.123 291 3 555 428 219
3 1995.088 444 3 NA 319 NA
3 NA 349 3 479 NA 321
3 2539.873 333 3 257 406 417
3 313 334 409
3 296 465 546
3 320 180 523
3 388 999 313

Hello,

Please post the output of

dput(datafortest)

Your data is difficult to read into an R session.

Hope this helps,
Rui Barradas
Re: [R] grDevices::hcl.colors using two colours: Bug or Feature?
At 11:07 on 28/04/2023, Achim Zeileis wrote:

This was introduced in 4.3.0 (hence Rui cannot reproduce it in 4.2.3). It's a bug and was introduced when fixing this other bug:

https://bugs.R-project.org/show_bug.cgi?id=18476
https://hypatia.math.ethz.ch/pipermail/r-help/2023-February/476960.html

Apparently, it only affects the case with n = 2 for diverging and divergingx palettes. The culprit is this line:

i <- if(n2 == 1L) 0 else seq.int(1, by = -2/(n - 1), length.out = n2)

I think n2 == 1L is not the right condition and we need to distinguish n = 1 and n = 2. Will have a closer look...

Thanks for reporting this!
Achim

On Fri, 28 Apr 2023, Rui Barradas wrote:

At 06:01 on 28/04/2023, Stevie Pederson wrote:

Hi,

I'm not sure if this is a bug or a feature, but after updating to R v4.3, if requesting two colours from hcl.colors() you now get the same colour twice. This occurs for all palettes I've tried. My reprex:

hcl.colors(2, "Vik")
[1] "#F1F1F1" "#F1F1F1"

As I have multiple workflows I run repeatedly with A vs B comparisons, this has just broken the visualisations in many of them. Obviously a workaround is hcl.colors(3, "Vik")[c(1, 3)] but this seems rather unintuitive.

Thanks in advance,
Stevie

sessionInfo()
R version 4.3.0 (2023-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.6 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8
 [5] LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8
 [7] LC_PAPER=en_AU.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C

time zone: Australia/Adelaide
tzcode source: system (glibc)

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] compiler_4.3.0 tools_4.3.0

Hello,

I cannot reproduce this on Windows.

hcl.colors(2, "Vik")
# [1] "#002E60" "#3E2000"

clrs <- sapply(hcl.pals(), \(p) hcl.colors(2, p))
any(apply(clrs, 2, \(x) x[1] == x[2]))
# [1] FALSE

sessionInfo()
# R version 4.2.3 (2023-03-15 ucrt)
# Platform: x86_64-w64-mingw32/x64 (64-bit)
# Running under: Windows 10 x64 (build 22621)
#
# Matrix products: default
#
# locale:
# [1] LC_COLLATE=Portuguese_Portugal.utf8  LC_CTYPE=Portuguese_Portugal.utf8
# [3] LC_MONETARY=Portuguese_Portugal.utf8 LC_NUMERIC=C
# [5] LC_TIME=Portuguese_Portugal.utf8
#
# attached base packages:
# [1] stats graphics grDevices utils datasets methods base
#
# loaded via a namespace (and not attached):
# [1] compiler_4.2.3

Hope this helps,
Rui Barradas

Hello,

Right! I ran the wrong R version; here it is with R 4.3.0. The bug is now reproducible on Windows 11.

hcl.colors(2, "Vik")
# [1] "#F1F1F1" "#F1F1F1"

clrs <- sapply(hcl.pals(), \(p) hcl.colors(2, p))
any(apply(clrs, 2, \(x) x[1] == x[2]))
# [1] TRUE
sum(apply(clrs, 2, \(x) x[1] == x[2]))
# [1] 35
which(apply(clrs, 2, \(x) x[1] == x[2]))
#      Blue-Red    Blue-Red 2    Blue-Red 3     Red-Green  Purple-Green
#            80            81            82            83            84
#  Purple-Brown   Green-Brown Blue-Yellow 2 Blue-Yellow 3  Green-Orange
#            85            86            87            88            89
#  Cyan-Magenta        Tropic          Broc          Cork           Vik
#            90            91            92            93            94
#        Berlin        Lisbon        Tofino         Earth          Fall
#            95            96            97            99           100
#        Geyser      TealRose         Temps          PuOr          RdBu
#           101           102           103           104           105
#          RdGy          PiYG          PRGn          BrBG        RdYlBu
#           106           107           108           109           110
#        RdYlGn      Spectral      Zissou 1       Cividis          Roma
#           111           112           113           114           115

sessionInfo()
# R version 4.3.0 (2023-04-21 ucrt)
# Platform: x86_64-w64
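For workflows that need two contrasting colours on an affected version, the workaround mentioned in the original post can be wrapped in a small helper. This is only a sketch: the function name two_hcl_colors is hypothetical, and it simply applies the poster's hcl.colors(3, ...)[c(1, 3)] idea, which is meant for diverging palettes such as "Vik" where the middle colour is the neutral one.

```r
# Hypothetical helper: always request three colours from a diverging
# palette and keep the two ends, skipping the neutral middle colour.
two_hcl_colors <- function(palette) {
  grDevices::hcl.colors(3, palette)[c(1, 3)]
}

cols <- two_hcl_colors("Vik")
cols
```

Because it never calls hcl.colors() with n = 2, the helper does not depend on whether the installed R version has the bug.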
Re: [R] grDevices::hcl.colors using two colours: Bug or Feature?
At 06:01 on 28/04/2023, Stevie Pederson wrote:

Hi,

I'm not sure if this is a bug or a feature, but after updating to R v4.3, if requesting two colours from hcl.colors() you now get the same colour twice. This occurs for all palettes I've tried. My reprex:

hcl.colors(2, "Vik")
[1] "#F1F1F1" "#F1F1F1"

As I have multiple workflows I run repeatedly with A vs B comparisons, this has just broken the visualisations in many of them. Obviously a workaround is hcl.colors(3, "Vik")[c(1, 3)] but this seems rather unintuitive.

Thanks in advance,
Stevie

sessionInfo()
R version 4.3.0 (2023-04-21)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.6 LTS

Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=en_AU.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_AU.UTF-8        LC_COLLATE=en_AU.UTF-8
 [5] LC_MONETARY=en_AU.UTF-8    LC_MESSAGES=en_AU.UTF-8
 [7] LC_PAPER=en_AU.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_AU.UTF-8 LC_IDENTIFICATION=C

time zone: Australia/Adelaide
tzcode source: system (glibc)

attached base packages:
[1] stats graphics grDevices utils datasets methods base

loaded via a namespace (and not attached):
[1] compiler_4.3.0 tools_4.3.0

Hello,

I cannot reproduce this on Windows.

hcl.colors(2, "Vik")
# [1] "#002E60" "#3E2000"

clrs <- sapply(hcl.pals(), \(p) hcl.colors(2, p))
any(apply(clrs, 2, \(x) x[1] == x[2]))
# [1] FALSE

sessionInfo()
# R version 4.2.3 (2023-03-15 ucrt)
# Platform: x86_64-w64-mingw32/x64 (64-bit)
# Running under: Windows 10 x64 (build 22621)
#
# Matrix products: default
#
# locale:
# [1] LC_COLLATE=Portuguese_Portugal.utf8  LC_CTYPE=Portuguese_Portugal.utf8
# [3] LC_MONETARY=Portuguese_Portugal.utf8 LC_NUMERIC=C
# [5] LC_TIME=Portuguese_Portugal.utf8
#
# attached base packages:
# [1] stats graphics grDevices utils datasets methods base
#
# loaded via a namespace (and not attached):
# [1] compiler_4.2.3

Hope this helps,
Rui Barradas
Re: [R] detect and replace outliers by the averaged
Hello,

At 09:42 on 21/04/2023, Jeff Newmiller wrote:

0

Somewhat cryptic...

Rui Barradas

On April 21, 2023 4:08:08 AM GMT+09:00, Dr Eberhard W Lisse wrote:

There is at least one outliers package on CRAN.

el

On 20/04/2023 20:43, AbouEl-Makarim Aboueissa wrote:

Dear All:

*please discard my previous email*

Re: detect and replace outliers by the average

The dataset (please see attached) contains a grouping column "factor" and two columns of data, "x1" and "x2", with some NA values. I need some help to detect the outliers and replace them and the NAs with the average within each level (0, 1, 2) for each variable "x1" and "x2".

I tried the code below, but it did not accomplish what I want to do.

data<-read.csv("G:/20-Spring_2023/Outliers/data.csv", header=TRUE)
data

replace_outlier_with_mean <- function(x) {
  replace(x, x %in% boxplot.stats(x)$out, mean(x, na.rm=TRUE))  # , na.rm=TRUE NOT working
}

data[] <- lapply(data, replace_outlier_with_mean)

Thank you all very much for your help in advance.

with many thanks
abou
Re: [R] detect and replace outliers by the average
At 19:58 on 20/04/2023, Rui Barradas wrote:

At 19:46 on 20/04/2023, AbouEl-Makarim Aboueissa wrote:

Hi Rui:

here is the dataset

factor x1 x2
0 700 700
0 700 500
0 470 470
0 710 560
0 520
0 610 720
0 710 670
0 610
1 690 620
1 580 540
1 690 690
1 NA 401
1 450 580
1 700 700
1 400
1 600
1 500 400
1 680 650
2 117 63
2 120 68
2 130 73
2 120 69
2 125 54
2 999 70
2 165 62
2 130 987
2 123 70
2 78
2 98
2 5
2 321 NA

with many thanks
abou

On Thu, Apr 20, 2023 at 2:44 PM Rui Barradas wrote:

At 19:36 on 20/04/2023, AbouEl-Makarim Aboueissa wrote:

Dear All:

Re: detect and replace outliers by the average

The dataset (please see attached) contains a grouping column "factor" and two columns of data, "x1" and "x2", with some NA values. I need some help to detect the outliers and replace them and the NAs with the average within each level (0, 1, 2) for each variable "x1" and "x2".

I tried the code below, but it did not accomplish what I want to do.

data<-read.csv("G:/20-Spring_2023/Outliers/data.csv", header=TRUE)
data

replace_outlier_with_mean <- function(x) {
  replace(x, x %in% boxplot.stats(x)$out, mean(x, na.rm=TRUE))  # , na.rm=TRUE NOT working
}

data[] <- lapply(data, replace_outlier_with_mean)

Thank you all very much for your help in advance.

with many thanks
abou

Hello,

There is no data set attached; see the posting guide on what file extensions are allowed as attachments.

As for the question, try to compute mean(x, na.rm = TRUE) first, then use this value in the replace instruction. Without data I'm just guessing.

Hope this helps,
Rui Barradas

Hello,

Here is a way. It uses ave in the function to group the data by the factor.

df1 <- "factor x1 x2
0 700 700
0 700 500
0 470 470
0 710 560
0 520
0 610 720
0 710 670
0 610
1 690 620
1 580 540
1 690 690
1 NA 401
1 450 580
1 700 700
1 400
1 600
1 500 400
1 680 650
2 117 63
2 120 68
2 130 73
2 120 69
2 125 54
2 999 70
2 165 62
2 130 987
2 123 70
2 78 NA
2 98 NA
2 5 NA
2 321 NA"
df1 <- read.table(text = df1, header = TRUE,
                  colClasses = c("factor", "numeric", "numeric"))

replace_outlier_with_mean <- function(x, f) {
  ave(x, f, FUN = \(y) {
    i <- is.na(y) | y %in% boxplot.stats(y, do.conf = FALSE)$out
    y[i] <- mean(y, na.rm = TRUE)
    y
  })
}

lapply(df1[-1], replace_outlier_with_mean, f = df1$factor)
#> $x1
#>  [1] 700. 700. 470. 710. 1258.1250 610. 710.
#>  [8] 610. 690. 580. 690. 1261.7778 450. 700.
#> [15] 400. 1261.7778 500. 680. 117. 120. 130.
#> [22] 120. 125. 194.6923 194.6923 130. 123. 194.6923
#> [29] 98. 194.6923 194.6923
#>
#> $x2
#>  [1] 700. 500. 470. 560. 520. 720. 670.
#>  [8] 1767.3750 620. 540. 690. 401. 580. 700.
#> [15] 1406.9000 600. 400. 650. 63. 68. 73.
#> [22] 69. 54. 70. 62. 168. 70. 168.
#> [29] 168. 168. 168.

Hope this helps,
Rui Barradas

Hello,

A simpler version of the same function, this time with replace(), like the OP. The results are identical().

replace_outlier_with_mean <- function(x, f) {
  ave(x, f, FUN = \(y) {
    i <- is.na(y) | y %in% boxplot.stats(y, do.conf = FALSE)$out
    replace(y, i, mean(y, na.rm = TRUE))
  })
}

Also, my copy of the data from a previous mail is wrong: there are 3 NA's in the wrong column. The following is better.

df1 <- read.table("data.txt", header = TRUE, sep = "\t",
                  colClasses = c("factor", "numeric", "numeric"))

Hope this helps,
Rui Barradas
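Since the data in this thread did not survive intact, here is a small self-contained sketch of the same ave() and replace() technique on made-up numbers (the vectors f and x below are hypothetical and chosen so the result can be checked by hand). It uses function() instead of the \() shorthand so it also runs on older R versions. Note that, as in the function above, the group mean is computed with the flagged outliers still included, which may or may not be what is wanted.

```r
# Same technique as in the thread: within each group, replace NAs and
# boxplot outliers with the group mean (outliers still count in the mean).
replace_outlier_with_mean <- function(x, f) {
  ave(x, f, FUN = function(y) {
    i <- is.na(y) | y %in% boxplot.stats(y, do.conf = FALSE)$out
    replace(y, i, mean(y, na.rm = TRUE))
  })
}

# Hypothetical data: group "a" has one extreme value and one NA,
# group "b" has one NA and no outliers.
f <- factor(rep(c("a", "b"), each = 6))
x <- c(10, 11, 12, 13, 100, NA,
       1, 2, 3, 4, 5, NA)

res <- replace_outlier_with_mean(x, f)
res
```

In group "a" the value 100 lies beyond the boxplot whiskers, so it and the NA are both replaced by the group mean of the non-missing values, 29.2; in group "b" only the NA is replaced, by 3.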
Re: [R] detect and replace outliers by the average
At 19:46 on 20/04/2023, AbouEl-Makarim Aboueissa wrote:

Hi Rui:

here is the dataset

factor x1 x2
0 700 700
0 700 500
0 470 470
0 710 560
0 520
0 610 720
0 710 670
0 610
1 690 620
1 580 540
1 690 690
1 NA 401
1 450 580
1 700 700
1 400
1 600
1 500 400
1 680 650
2 117 63
2 120 68
2 130 73
2 120 69
2 125 54
2 999 70
2 165 62
2 130 987
2 123 70
2 78
2 98
2 5
2 321 NA

with many thanks
abou

On Thu, Apr 20, 2023 at 2:44 PM Rui Barradas wrote:

At 19:36 on 20/04/2023, AbouEl-Makarim Aboueissa wrote:

Dear All:

Re: detect and replace outliers by the average

The dataset (please see attached) contains a grouping column "factor" and two columns of data, "x1" and "x2", with some NA values. I need some help to detect the outliers and replace them and the NAs with the average within each level (0, 1, 2) for each variable "x1" and "x2".

I tried the code below, but it did not accomplish what I want to do.

data<-read.csv("G:/20-Spring_2023/Outliers/data.csv", header=TRUE)
data

replace_outlier_with_mean <- function(x) {
  replace(x, x %in% boxplot.stats(x)$out, mean(x, na.rm=TRUE))  # , na.rm=TRUE NOT working
}

data[] <- lapply(data, replace_outlier_with_mean)

Thank you all very much for your help in advance.

with many thanks
abou

Hello,

There is no data set attached; see the posting guide on what file extensions are allowed as attachments.

As for the question, try to compute mean(x, na.rm = TRUE) first, then use this value in the replace instruction. Without data I'm just guessing.

Hope this helps,
Rui Barradas

Hello,

Here is a way. It uses ave in the function to group the data by the factor.

df1 <- "factor x1 x2
0 700 700
0 700 500
0 470 470
0 710 560
0 520
0 610 720
0 710 670
0 610
1 690 620
1 580 540
1 690 690
1 NA 401
1 450 580
1 700 700
1 400
1 600
1 500 400
1 680 650
2 117 63
2 120 68
2 130 73
2 120 69
2 125 54
2 999 70
2 165 62
2 130 987
2 123 70
2 78 NA
2 98 NA
2 5 NA
2 321 NA"
df1 <- read.table(text = df1, header = TRUE,
                  colClasses = c("factor", "numeric", "numeric"))

replace_outlier_with_mean <- function(x, f) {
  ave(x, f, FUN = \(y) {
    i <- is.na(y) | y %in% boxplot.stats(y, do.conf = FALSE)$out
    y[i] <- mean(y, na.rm = TRUE)
    y
  })
}

lapply(df1[-1], replace_outlier_with_mean, f = df1$factor)
#> $x1
#>  [1] 700. 700. 470. 710. 1258.1250 610. 710.
#>  [8] 610. 690. 580. 690. 1261.7778 450. 700.
#> [15] 400. 1261.7778 500. 680. 117. 120. 130.
#> [22] 120. 125. 194.6923 194.6923 130. 123. 194.6923
#> [29] 98. 194.6923 194.6923
#>
#> $x2
#>  [1] 700. 500. 470. 560. 520. 720. 670.
#>  [8] 1767.3750 620. 540. 690. 401. 580. 700.
#> [15] 1406.9000 600. 400. 650. 63. 68. 73.
#> [22] 69. 54. 70. 62. 168. 70. 168.
#> [29] 168. 168. 168.

Hope this helps,
Rui Barradas