Re: [R] Unexpected date format coercion
On Thu, 01 Jul 2021, Jeremie Juste writes:

> Hello
>
> On Thursday, 1 Jul 2021 at 08:25, PIKAL Petr wrote:
>> Hm.
>>
>> Seems to me, that both your codes are wrong but printing in Linux is
>> different from Windows.
>>
>> With
>> as.Date("20-12-2020","%Y-%m-%d")
>> you say that 20 is year (actually year 20) and 2020 is day and only the
>> first two digits are taken (but with some values the result is NA)
>>
>> I can confirm 4.0.3 in Windows behaves this way too.
>>> as.Date("20-12-2020","%Y-%m-%d")
>> [1] "0020-12-20"
>
> Many thanks for confirming this.
>
>
> On Thursday, 1 Jul 2021 at 18:22, Jim Lemon wrote:
>> Hi Jeremie,
>> Try:
>>
>> as.Date("20-12-2020","%y-%m-%d")
>> [1] "2020-12-20"
>
> Thanks for this info. I'm looking for something that produces NA if the
> date is not exactly in the specified format so that it can be
> corrected. I was relying on the format parameter of the date for that.
>
> The issue is that there can be so many variations in date format that
> for the time being I still find it easier to delegate the correction to
> the user. A particularly nasty case is when there are multiple date
> formats in the same column.
>
>
> Best regards,
> Jeremie
>

You could explicitly test whether the specified format is as expected,
perhaps with a regex such as

  s <- c("2020-01-20", "20-12-2020")
  grepl("^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$", s)

and/or by checking the components of the dates:

  valid_Date <- function(s) {
      ## split "YYYY-MM-DD" into its three components
      tmp <- strsplit(s, "[-]")

      year <- as.numeric(sapply(tmp, `[[`, 1))
      valid.year <- year < 2500 & year > 1800

      month <- as.numeric(sapply(tmp, `[[`, 2))
      valid.month <- month >= 1 & month <= 12

      day <- as.numeric(sapply(tmp, `[[`, 3))
      valid.day <- day >= 1 & day <= 31

      ## state the format explicitly, so that as.Date does not
      ## have to guess it from the first element
      ans <- as.Date(s, format = "%Y-%m-%d")
      ans[!(valid.year & valid.month & valid.day)] <- NA
      ans
  }

--
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
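
Besides a regex or a component check, one more way to get the NA asked for in this thread is a round-trip test: parse with the stated format, print the result back with the same format, and keep the date only if the text comes back unchanged. The following is a minimal sketch in base R; `strict_date` is a hypothetical helper name, not a base R function and not something proposed in the thread.

```r
## Hypothetical helper (an assumption for illustration, not base R):
## return NA unless the input matches 'fmt' exactly.
strict_date <- function(s, fmt = "%Y-%m-%d") {
    s <- as.character(s)
    d <- as.Date(s, format = fmt)
    ## A date is kept only if formatting it back reproduces the input;
    ## ignored trailing characters or a misread year break the round trip.
    bad <- is.na(d) | format(d, fmt) != s
    d[bad] <- NA
    d
}

strict_date(c("2020-01-20", "20-12-2020"))   # second element becomes NA
strict_date("20-12-2020", "%d-%m-%Y")        # parses once the format fits
```

Note that this check is deliberately strict: an input without zero padding, such as "2020-1-5", would also come back as NA, because it prints back as "2020-01-05".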
Re: [R] Unexpected date format coercion
Hi

Maybe you could look at parse_date_time() from the lubridate package.

Or see the answers here:
https://stackoverflow.com/questions/25463523/how-to-convert-variable-with-mixed-date-formats-to-one-format

Cheers
Petr

> -----Original Message-----
> From: Jeremie Juste
> Sent: Thursday, July 1, 2021 11:00 AM
> To: PIKAL Petr
> Cc: r-help
> Subject: Re: [R] Unexpected date format coercion
>
> Hello
>
> On Thursday, 1 Jul 2021 at 08:25, PIKAL Petr wrote:
> > Hm.
> >
> > Seems to me, that both your codes are wrong but printing in Linux is
> > different from Windows.
> >
> > With
> > as.Date("20-12-2020","%Y-%m-%d")
> > you say that 20 is year (actually year 20) and 2020 is day and only
> > the first two digits are taken (but with some values the result is NA)
> >
> > I can confirm 4.0.3 in Windows behaves this way too.
> >> as.Date("20-12-2020","%Y-%m-%d")
> > [1] "0020-12-20"
>
> Many thanks for confirming this.
>
>
> On Thursday, 1 Jul 2021 at 18:22, Jim Lemon wrote:
> > Hi Jeremie,
> > Try:
> >
> > as.Date("20-12-2020","%y-%m-%d")
> > [1] "2020-12-20"
>
> Thanks for this info. I'm looking for something that produces NA if the
> date is not exactly in the specified format so that it can be corrected.
> I was relying on the format parameter of the date for that.
>
> The issue is that there can be so many variations in date format that
> for the time being I still find it easier to delegate the correction to
> the user. A particularly nasty case is when there are multiple date
> formats in the same column.
>
>
> Best regards,
> Jeremie
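
For a column with mixed formats, the same idea can be sketched in base R without lubridate: try a list of candidate formats in order and keep, for each entry, the first one that matches exactly. This is only an illustration under stated assumptions. The helper names `strict_one` and `parse_mixed` and the list of candidate formats are hypothetical, and this is not how parse_date_time() works internally.

```r
## Hypothetical helpers (assumptions for illustration, not base R).

## Parse with one format; NA unless the text round-trips exactly.
strict_one <- function(s, fmt) {
    d <- as.Date(s, format = fmt)
    d[is.na(d) | format(d, fmt) != s] <- NA
    d
}

## Try each candidate format in turn on the entries still unparsed.
parse_mixed <- function(s, formats = c("%Y-%m-%d", "%d-%m-%Y", "%d/%m/%Y")) {
    s <- as.character(s)
    out <- as.Date(rep(NA_real_, length(s)), origin = "1970-01-01")
    for (fmt in formats) {
        todo <- is.na(out)
        if (!any(todo)) break
        out[todo] <- strict_one(s[todo], fmt)
    }
    out
}

parse_mixed(c("2020-01-20", "20-12-2020", "31/01/2021", "garbage"))
```

Entries that match none of the formats stay NA and can be handed back to the user. The order of `formats` matters for ambiguous input: a string such as "01-02-2020" fits both day-month-year and month-day-year layouts, and no parser can decide between them from the text alone.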
Re: [R] Unexpected date format coercion
Hello

On Thursday, 1 Jul 2021 at 08:25, PIKAL Petr wrote:
> Hm.
>
> Seems to me, that both your codes are wrong but printing in Linux is
> different from Windows.
>
> With
> as.Date("20-12-2020","%Y-%m-%d")
> you say that 20 is year (actually year 20) and 2020 is day and only the
> first two digits are taken (but with some values the result is NA)
>
> I can confirm 4.0.3 in Windows behaves this way too.
>> as.Date("20-12-2020","%Y-%m-%d")
> [1] "0020-12-20"

Many thanks for confirming this.


On Thursday, 1 Jul 2021 at 18:22, Jim Lemon wrote:
> Hi Jeremie,
> Try:
>
> as.Date("20-12-2020","%y-%m-%d")
> [1] "2020-12-20"

Thanks for this info. I'm looking for something that produces NA if the
date is not exactly in the specified format so that it can be
corrected. I was relying on the format parameter of the date for that.

The issue is that there can be so many variations in date format that
for the time being I still find it easier to delegate the correction to
the user. A particularly nasty case is when there are multiple date
formats in the same column.

Best regards,
Jeremie
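
When the correction is delegated to the user, it can help to report first which layouts actually occur in a column. A minimal sketch, assuming base R only: replace every digit by "9" so that each entry is reduced to its shape, then tabulate the shapes. The name `date_shape` and the sample vector are made up for illustration.

```r
## Hypothetical helper (an assumption for illustration):
## reduce each string to its "shape" by masking all digits.
date_shape <- function(s) gsub("[0-9]", "9", s)

s <- c("2020-01-20", "20-12-2020", "2021/03/05", "5 Jan 2021")
shapes <- date_shape(s)

table(shapes)                     # how often each layout occurs
which(shapes != "9999-99-99")     # positions that need attention
```

The shape only describes the layout; it says nothing about whether "99-99-9999" means day-month-year or month-day-year, so the ambiguous cases still have to be resolved by whoever knows the data.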
Re: [R] Unexpected date format coercion
Hm.

Seems to me, that both your codes are wrong but printing in Linux is
different from Windows.

With
as.Date("20-12-2020","%Y-%m-%d")
you say that 20 is year (actually year 20) and 2020 is day and only the
first two digits are taken (but with some values the result is NA)

I can confirm 4.0.3 in Windows behaves this way too.
> as.Date("20-12-2020","%Y-%m-%d")
[1] "0020-12-20"

Cheers
Petr

> -----Original Message-----
> From: R-help On Behalf Of Jeremie Juste
> Sent: Thursday, July 1, 2021 10:06 AM
> To: r-help
> Subject: [R] Unexpected date format coercion
>
> Hello,
>
> I have been surprised when converting a character string to a date with
> the following format,
>
> in R 4.1.0 (linux debian 10)
>
> as.Date("20-12-2020","%Y-%m-%d")
> [1] "20-12-20"
>
> in R 4.0.5 (window 10)
>
> as.Date("20-12-2020","%Y-%m-%d")
> [1] "0020-12-20"
>
>
> Here I was expecting a blunt and sharp NA, am I missing something?
>
> Best regards,
> Jeremie
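
The parsing described above can be inspected directly with strptime(), which returns the individual fields. This is a small sketch in base R; the second input string is made up to illustrate the remark that some values give NA.

```r
## "%Y" consumes "20" as the year, "%m" reads "12", and "%d" reads only
## the first two digits of "2020"; the trailing "20" is ignored.
lt <- strptime("20-12-2020", "%Y-%m-%d", tz = "UTC")
c(year = lt$year + 1900, month = lt$mon + 1, day = lt$mday)

## If the first two digits of the last field are not a valid day,
## parsing fails and the result is NA.
is.na(as.Date("20-12-9999", "%Y-%m-%d"))
```

So the result depends on what the leading digits of the last field happen to be, which is why the format argument alone cannot serve as a validity check.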
Re: [R] Unexpected date format coercion
Hi Jeremie,
Try:

as.Date("20-12-2020","%y-%m-%d")
[1] "2020-12-20"

Jim

On Thu, Jul 1, 2021 at 6:16 PM Jeremie Juste wrote:
>
> Hello,
>
> I have been surprised when converting a character string to a date with
> the following format,
>
> in R 4.1.0 (linux debian 10)
>
> as.Date("20-12-2020","%Y-%m-%d")
> [1] "20-12-20"
>
> in R 4.0.5 (window 10)
>
> as.Date("20-12-2020","%Y-%m-%d")
> [1] "0020-12-20"
>
>
> Here I was expecting a blunt and sharp NA, am I missing something?
>
> Best regards,
> Jeremie
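
A note on what "%y" does, as a small base R sketch. According to the strptime documentation, two-digit years 00 to 68 are read as 20xx and 69 to 99 as 19xx. The example strings are made up for illustration.

```r
## The century is inferred from the two-digit year.
two_digit <- c("68-01-01", "69-01-01")
format(as.Date(two_digit, "%y-%m-%d"), "%Y")

## As with "%Y", the trailing "20" of "2020" is silently ignored, so
## this reads year 20 (taken as 2020), month 12, day 20.
as.Date("20-12-2020", "%y-%m-%d")
```

The "%y" call above gives a plausible-looking date only because the year and the first two digits of the last field happen to agree. It still does not return NA for input that is not in the stated format.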
Re: [R] Unexpected date format coercion
On 01.07.2021 10:06, Jeremie Juste wrote:
> Hello,
>
> I have been surprised when converting a character string to a date with
> the following format,
>
> in R 4.1.0 (linux debian 10)
>
> as.Date("20-12-2020","%Y-%m-%d")
> [1] "20-12-20"
>
> in R 4.0.5 (window 10)
>
> as.Date("20-12-2020","%Y-%m-%d")
> [1] "0020-12-20"

Yes, it is rather strange to specify "2020" as the day and "20" as the
4-digit year, so different implementations may print the year in 2 or 4
digits.

What you want is actually

as.Date("20-12-2020","%d-%m-%Y")

Best,
Uwe Ligges

> Here I was expecting a blunt and sharp NA, am I missing something?
>
> Best regards,
> Jeremie
[R] Unexpected date format coercion
Hello,

I have been surprised when converting a character string to a date with
the following format,

in R 4.1.0 (Linux, Debian 10)

as.Date("20-12-2020","%Y-%m-%d")
[1] "20-12-20"

in R 4.0.5 (Windows 10)

as.Date("20-12-2020","%Y-%m-%d")
[1] "0020-12-20"


Here I was expecting a blunt and sharp NA, am I missing something?

Best regards,
Jeremie