Re: [R] Unexpected date format coercion

2021-07-01 Thread Enrico Schumann
On Thu, 01 Jul 2021, Jeremie Juste writes:

> Hello 
>
> On Thursday,  1 Jul 2021 at 08:25, PIKAL Petr wrote:
>> Hm.
>>
>> It seems to me that your code is wrong in both cases, but the printing
>> on Linux differs from that on Windows.
>>
>> With
>> as.Date("20-12-2020","%Y-%m-%d")
>> you say that 20 is the year (literally year 20) and 2020 is the day, of
>> which only the first two digits are read (with some values the result is NA).
>>
>> I can confirm that R 4.0.3 on Windows behaves this way too.
>>> as.Date("20-12-2020","%Y-%m-%d")
>> [1] "0020-12-20"
>
> Many thanks for confirming this.
>
>
> On Thursday,  1 Jul 2021 at 18:22, Jim Lemon wrote:
>> Hi Jeremie,
>> Try:
>>
>> as.Date("20-12-2020","%y-%m-%d")
>> [1] "2020-12-20"
>
> Thanks for this info. I'm looking for something that produces NA if the
> date is not exactly in the specified format, so that it can be
> corrected. I was relying on the format parameter of as.Date for that.
>
> The issue is that there can be so many variations in date format that,
> for the time being, I still find it easier to delegate the correction to
> the user. A particularly nasty case is when there are multiple date
> formats in the same column.
>
>
> Best regards,
> Jeremie
>

You could explicitly test whether the input matches the expected
format, perhaps with a regex such as

s <- c("2020-01-20", "20-12-2020")
grepl("^[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]$", s)

and/or by checking the components of the dates:

valid_Date <- function(s) {
    ## split each string into its year, month and day components
    tmp <- strsplit(s, "[-]")

    year <- as.numeric(sapply(tmp, `[[`, 1))
    valid.year <- year < 2500 & year > 1800

    ## months run from 1 to 12
    month <- as.numeric(sapply(tmp, `[[`, 2))
    valid.month <- month >= 1 & month <= 12

    day <- as.numeric(sapply(tmp, `[[`, 3))
    valid.day <- day >= 1 & day <= 31

    ## give the format explicitly, so that as.Date never has to guess it
    ans <- as.Date(s, format = "%Y-%m-%d")
    ans[!(valid.year & valid.month & valid.day)] <- NA
    ans
}



-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net
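
The pattern test and the conversion suggested above can also be combined into a single step. The following is only a minimal sketch, assuming the expected input is "YYYY-MM-DD"; the helper name `as_iso_date` is hypothetical and not part of base R or of the thread.

```r
# Hypothetical helper (not from the thread): convert only strings that look
# exactly like YYYY-MM-DD; everything else becomes NA.
as_iso_date <- function(s) {
  s <- as.character(s)
  looks_iso <- grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", s)
  # start from an all-NA Date vector of the right length
  out <- structure(rep(NA_real_, length(s)), class = "Date")
  # as.Date() still returns NA for impossible dates such as 2020-02-31
  out[looks_iso] <- as.Date(s[looks_iso], format = "%Y-%m-%d")
  out
}

as_iso_date(c("2020-01-20", "20-12-2020", "2020-02-31"))
```

Only the first element should survive: the second fails the pattern test, and the third passes the pattern but is not a real calendar date.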

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Unexpected date format coercion

2021-07-01 Thread PIKAL Petr
Hi

Maybe you could look at the documentation for

parse_date_time {lubridate}

from the lubridate package.

Or see the answers here:

https://stackoverflow.com/questions/25463523/how-to-convert-variable-with-mixed-date-formats-to-one-format

Cheers
Petr
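
For a column with mixed formats, the idea of trying several candidate formats in turn (which lubridate's parse_date_time supports through its `orders` argument) can be sketched in base R as well. This is a minimal sketch under stated assumptions: the helper name `parse_mixed_dates` and the default list of formats are hypothetical, and a value is accepted only if it prints back to exactly the input string.

```r
# Hypothetical helper (not from the thread or from lubridate): try each
# candidate format in turn; accept a value only if it prints back to exactly
# the input string under that same format.
parse_mixed_dates <- function(x, formats = c("%Y-%m-%d", "%d-%m-%Y", "%d/%m/%Y")) {
  x <- as.character(x)
  out <- structure(rep(NA_real_, length(x)), class = "Date")
  for (fmt in formats) {
    todo <- which(is.na(out) & !is.na(x))   # positions still unparsed
    if (length(todo) == 0L) break
    d <- as.Date(x[todo], format = fmt)
    ok <- !is.na(d) & format(d, fmt) == x[todo]
    out[todo[ok]] <- d[ok]
  }
  out
}

x <- c("2020-01-20", "20-12-2020", "05/03/2021", "not a date")
parse_mixed_dates(x)
```

Note that no parser can decide between day-first and month-first input for a string such as "05/03/2021"; here the order of `formats` decides, so that choice still has to come from knowledge of the data.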




Re: [R] Unexpected date format coercion

2021-07-01 Thread Jeremie Juste
Hello 

On Thursday,  1 Jul 2021 at 08:25, PIKAL Petr wrote:
> Hm.
>
> It seems to me that your code is wrong in both cases, but the printing
> on Linux differs from that on Windows.
>
> With
> as.Date("20-12-2020","%Y-%m-%d")
> you say that 20 is the year (literally year 20) and 2020 is the day, of
> which only the first two digits are read (with some values the result is NA).
>
> I can confirm that R 4.0.3 on Windows behaves this way too.
>> as.Date("20-12-2020","%Y-%m-%d")
> [1] "0020-12-20"

Many thanks for confirming this.


On Thursday,  1 Jul 2021 at 18:22, Jim Lemon wrote:
> Hi Jeremie,
> Try:
>
> as.Date("20-12-2020","%y-%m-%d")
> [1] "2020-12-20"

Thanks for this info. I'm looking for something that produces NA if the
date is not exactly in the specified format, so that it can be
corrected. I was relying on the format parameter of as.Date for that.

The issue is that there can be so many variations in date format that,
for the time being, I still find it easier to delegate the correction to
the user. A particularly nasty case is when there are multiple date
formats in the same column.


Best regards,
Jeremie
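
One way to get an NA whenever the string is not exactly in the specified format is a round-trip check: parse the string, print the result with the same format, and keep it only if the output equals the input. This is a minimal sketch, not an established base R facility; the helper name `strict_as_date` is hypothetical.

```r
# Hypothetical helper (not from the thread): keep a parsed date only if
# formatting it with the same format reproduces the input string exactly.
strict_as_date <- function(x, fmt = "%Y-%m-%d") {
  x <- as.character(x)
  d <- as.Date(x, format = fmt)
  same <- !is.na(d) & format(d, fmt) == x
  d[!same] <- NA
  d
}

strict_as_date(c("2020-01-20", "20-12-2020", "2020-1-5", NA))
```

Only the first element should be kept. This does not depend on how the platform prints a year below 1000: neither "20-12-20" nor "0020-12-20" equals the input "20-12-2020", so the value is rejected either way.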



Re: [R] Unexpected date format coercion

2021-07-01 Thread PIKAL Petr
Hm.

It seems to me that your code is wrong in both cases, but the printing
on Linux differs from that on Windows.

With
as.Date("20-12-2020","%Y-%m-%d")
you say that 20 is the year (literally year 20) and 2020 is the day, of
which only the first two digits are read (with some values the result is NA).

I can confirm that R 4.0.3 on Windows behaves this way too.
> as.Date("20-12-2020","%Y-%m-%d")
[1] "0020-12-20"

Cheers
Petr
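
The behaviour described above follows from two properties of the conversion: each field reads only as many characters as it needs, and anything left over after the format has been consumed is ignored. A small sketch to illustrate this (the example strings other than "20-12-2020" are made up for illustration):

```r
fmt <- "%Y-%m-%d"

# Trailing characters beyond what the format consumes are ignored.
as.Date("2020-12-20 and some text", fmt)

# "20" is accepted as the year and the first two digits of "2020" as the
# day, so this is a date in year 20 rather than NA.
is.na(as.Date("20-12-2020", fmt))

# Here the day field reads "35", which is out of range, so the result is NA.
is.na(as.Date("20-12-3520", fmt))
```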




Re: [R] Unexpected date format coercion

2021-07-01 Thread Jim Lemon
Hi Jeremie,
Try:

as.Date("20-12-2020","%y-%m-%d")
[1] "2020-12-20"

Jim
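
A caveat on the `%y` version, offered as an observation rather than a correction: it appears to return the intended date only by coincidence, because `%y` reads "20" as the year 2020 and `%d` reads just the first two digits of the final field. A short sketch (the second input string is made up for illustration):

```r
# %y reads the two-digit year "20" (giving 2020), %m reads "12", and %d
# reads only the first two digits of the last field; the rest is ignored.
d1 <- as.Date("20-12-2020", "%y-%m-%d")
d2 <- as.Date("20-12-2099", "%y-%m-%d")   # different last field, same parse

d1 == d2
format(d2, "%Y-%m-%d")
```

Both inputs are read as 20 December 2020, so this format would not flag a string whose last field is really a four-digit year.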




Re: [R] Unexpected date format coercion

2021-07-01 Thread Uwe Ligges




On 01.07.2021 10:06, Jeremie Juste wrote:
> Hello,
>
> I was surprised when converting a character string to a date with the
> following format:
>
> in R 4.1.0 (Linux, Debian 10)
>
> as.Date("20-12-2020","%Y-%m-%d")
> [1] "20-12-20"
>
> in R 4.0.5 (Windows 10)
>
> as.Date("20-12-2020","%Y-%m-%d")
> [1] "0020-12-20"


Yes, it is rather strange to specify "2020" as the day and "20" as the
4-digit year, so different implementations may print the year with 2 or 4
digits. What you want is actually


as.Date("20-12-2020","%d-%m-%Y")


Best,
Uwe Ligges
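
To illustrate the day-first format suggested above, here is a short sketch that parses with "%d-%m-%Y" and then prints the result in ISO order; the second input string is made up for illustration.

```r
# Day-first input: %d = day, %m = month, %Y = four-digit year
x <- c("20-12-2020", "01-07-2021")
d <- as.Date(x, format = "%d-%m-%Y")

# Print in ISO order, and check that the day-first form is reproduced
format(d, "%Y-%m-%d")
format(d, "%d-%m-%Y") == x
```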









> Here I was expecting a blunt and sharp NA. Am I missing something?
>
> Best regards,
> Jeremie





[R] Unexpected date format coercion

2021-07-01 Thread Jeremie Juste
Hello,

I was surprised when converting a character string to a date with the
following format:

in R 4.1.0 (Linux, Debian 10)

as.Date("20-12-2020","%Y-%m-%d")
[1] "20-12-20"

in R 4.0.5 (Windows 10)

as.Date("20-12-2020","%Y-%m-%d")
[1] "0020-12-20"


Here I was expecting a blunt and sharp NA. Am I missing something?

Best regards,
Jeremie
