Re: [Rd] R-4.3 version list.files function could not work correctly in chinese

2023-08-11 Thread Yihui Xie
Yes, I participated in the discussion. Basically dir() failed to list all
files since R 4.3.0 when filenames start with Chinese characters. I don't
have a Windows machine to test it, but this might be a minimal reproducible
example:

file.create("常用代码.R")
dir()

The OP said dir() would return "常用代码.R" in R 4.2.2 but not in R 4.3.0. In
the same discussion another person mentioned that the problem could also be
related to the file encoding, i.e., a file whose content is encoded in
UTF-8 is listed by dir(), but one encoded in ANSI is not.
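
A slightly more self-contained variant of this check, as a sketch only: it works in a throwaway directory and spells the filename with \u escapes so the script itself stays ASCII. The directory prefix and variable names are made up for illustration. On a non-UTF-8 locale file.create() may fail or the name may be altered, so no particular result is assumed here.

```r
# Work in a fresh temporary directory so no other files interfere
d <- tempfile("listfiles-check-")
dir.create(d)

# "常用代码.R" written with Unicode escapes
fname <- "\u5e38\u7528\u4ee3\u7801.R"

created <- file.create(file.path(d, fname))
found <- list.files(d)

# TRUE if the file that was just created is reported by list.files()
listed <- fname %in% found
cat("created:", created, " listed:", listed, "\n")

unlink(d, recursive = TRUE)
```

If `created` is TRUE and `listed` is FALSE, that matches the behaviour described in the report.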

Regards,
Yihui
--
https://yihui.org


On Fri, Aug 11, 2023 at 6:25 AM Ivan Krylov  wrote:

> Dear 叶月光,
>
> Thank you for your message, but please follow the posting guide in your
> future messages: https://www.r-project.org/posting-guide.html
> https://www.r-project.org/bugs.html
>
> I understand from your link that list.files() ends up skipping some
> Chinese filenames in R-4.3.1 (but not R-4.2.2) on Windows, but would you
> (or perhaps Yihui Xie who I see is also participating in the discussion)
> mind translating the rest of your findings into English? Have you been
> able to narrow down the problem to certain character ranges, for
> example?
>
> --
> Best regards,
> Ivan
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>

__
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel


[Rd] R 4.3: Change in behaviour of as.character.POSIXt for datetime values with midnight time

2023-08-11 Thread Andy Teucher
I understand that `as.character.POSIXt()` had an overhaul in R 4.3
(https://github.com/wch/r-source/commit/f6fd993f8a2f799a56dbecbd8238f155191fc31b),
and I have come across a new behaviour that I suspect may be unintended.

When calling `as.character.POSIXt()` on a vector that contains elements where 
the time component is midnight (00:00:00), it drops the time component of that 
element in the resulting character vector. Previously the time component was 
retained: 

In R 4.2.3:

```
R.version$version.string
#> [1] "R version 4.2.3 (2023-03-15)"

(t <- as.POSIXct(c("1975-01-01 00:00:00", "1975-01-01 15:27:00")))
#> [1] "1975-01-01 00:00:00 PST" "1975-01-01 15:27:00 PST"

(tc <- as.character(t))
#> [1] "1975-01-01 00:00:00" "1975-01-01 15:27:00"
```

In R 4.3.1:

```
R.version$version.string
#> [1] "R version 4.3.1 (2023-06-16)"

(t <- as.POSIXct(c("1975-01-01 00:00:00", "1975-01-01 15:27:00")))
#> [1] "1975-01-01 00:00:00 PST" "1975-01-01 15:27:00 PST"

(tc <- as.character(t))
#> [1] "1975-01-01" "1975-01-01 15:27:00"
```

This has consequences when round-tripping from POSIXt -> character -> POSIXt, 
since `as.POSIXct.character()` drops the time component from the entire vector 
if any element does not have a time component:

In R 4.2.3:

```
R.version$version.string
#> [1] "R version 4.2.3 (2023-03-15)"

(t <- as.POSIXct(c("1975-01-01 00:00:00", "1975-01-01 15:27:00")))
#> [1] "1975-01-01 00:00:00 PST" "1975-01-01 15:27:00 PST"

(tc <- as.character(t))
#> [1] "1975-01-01 00:00:00" "1975-01-01 15:27:00"

as.POSIXct(tc)
#> [1] "1975-01-01 00:00:00 PST" "1975-01-01 15:27:00 PST"
```

In R 4.3.1:

```
R.version$version.string
#> [1] "R version 4.3.1 (2023-06-16)"

(t <- as.POSIXct(c("1975-01-01 00:00:00", "1975-01-01 15:27:00")))
#> [1] "1975-01-01 00:00:00 PST" "1975-01-01 15:27:00 PST"

(tc <- as.character(t))
#> [1] "1975-01-01" "1975-01-01 15:27:00"

as.POSIXct(tc)
#> [1] "1975-01-01 PST" "1975-01-01 PST"
```

`format.POSIXt()` retains its old behaviour in R 4.3:

```
R.version$version.string
#> [1] "R version 4.2.3 (2023-03-15)"

(t <- as.POSIXct(c("1975-01-01 00:00:00", "1975-01-01 15:27:00")))
#> [1] "1975-01-01 00:00:00 PST" "1975-01-01 15:27:00 PST"

(tf <- format(t))
#> [1] "1975-01-01 00:00:00" "1975-01-01 15:27:00"

as.POSIXct(tf)
#> [1] "1975-01-01 00:00:00 PST" "1975-01-01 15:27:00 PST"
```

```
R.version$version.string
#> [1] "R version 4.3.1 (2023-06-16)"

(t <- as.POSIXct(c("1975-01-01 00:00:00", "1975-01-01 15:27:00")))
#> [1] "1975-01-01 00:00:00 PST" "1975-01-01 15:27:00 PST"

(tf <- format(t))
#> [1] "1975-01-01 00:00:00" "1975-01-01 15:27:00"

as.POSIXct(tf)
#> [1] "1975-01-01 00:00:00 PST" "1975-01-01 15:27:00 PST"
```
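
Since `format()` keeps the time component, one way to make the round trip independent of the R version is to spell out the format string in both directions. This is a minimal sketch, not a statement about what R Core recommends; the time zone "UTC" and the variable names are chosen here only to keep the example reproducible.

```r
fmt <- "%Y-%m-%d %H:%M:%S"

t <- as.POSIXct(c("1975-01-01 00:00:00", "1975-01-01 15:27:00"), tz = "UTC")

# An explicit format string always writes the time, midnight included
tc <- format(t, fmt)
tc
#> [1] "1975-01-01 00:00:00" "1975-01-01 15:27:00"

# Parse back with the same explicit format
back <- as.POSIXct(tc, format = fmt, tz = "UTC")
all.equal(as.numeric(back), as.numeric(t))
#> [1] TRUE
```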

And finally, the behaviour of `as.POSIXct.character()` has not changed (it 
previously did, and still does, drop the time component from all elements when 
any element has no time):

```
R.version$version.string
#> [1] "R version 4.2.3 (2023-03-15)"

as.POSIXct(c("1975-01-01", "1975-01-01 15:27:00"))
#> [1] "1975-01-01 PST" "1975-01-01 PST"
```

```
R.version$version.string
#> [1] "R version 4.3.1 (2023-06-16)"

as.POSIXct(c("1975-01-01", "1975-01-01 15:27:00"))
#> [1] "1975-01-01 PST" "1975-01-01 PST"
```
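
A sketch of why a single date-only element can change the whole result: as the `tryFormats` argument of `as.POSIXlt()` is documented, the character method uses the first candidate format that parses all non-NA elements. The code below imitates that selection with `strptime()`; the format list is abbreviated and the variable names are illustrative, so treat it as an approximation of the mechanism rather than the actual implementation.

```r
x <- c("1975-01-01", "1975-01-01 15:27:00")

# Abbreviated list of candidate formats, most specific first
try_formats <- c("%Y-%m-%d %H:%M:%OS", "%Y-%m-%d %H:%M", "%Y-%m-%d")

# For each format: does it parse every element?
parses_all <- vapply(
  try_formats,
  function(f) all(!is.na(strptime(x, f, tz = "UTC"))),
  logical(1)
)
parses_all

# The first format that works for all elements is the date-only one,
# so the time of the second element is ignored as trailing text
chosen <- try_formats[which(parses_all)[1]]
chosen
#> [1] "%Y-%m-%d"
```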

I don’t know if this is a bug/regression in `as.character.POSIXt()`, or 
intended behaviour. If it is intended, I think it would benefit from some more 
comprehensive documentation.

Thanks very much,
Andy Teucher



Re: [Rd] Improving user-friendliness of S4 dispatch failure when mis-naming arguments?

2023-08-11 Thread Michael Chirico via R-devel
> I'm not entirely sure about the extra call. = FALSE

My thinking is that the signature of the internal .InheritForDispatch() is
not likely to help anyone; if anything, it has the opposite effect on
beginners who are unsure how to use that info.
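
For readers unfamiliar with the argument, a minimal sketch of what `call. = FALSE` changes in `stop()`. The two helper functions are hypothetical and exist only for this comparison.

```r
# Hypothetical helpers: same message, with and without the call recorded
with_call    <- function() stop("no method found")
without_call <- function() stop("no method found", call. = FALSE)

e1 <- tryCatch(with_call(), error = identity)
e2 <- tryCatch(without_call(), error = identity)

# With the default, the condition carries the call and the printed
# error starts with "Error in with_call() : ..."
deparse(conditionCall(e1))
#> [1] "with_call()"

# With call. = FALSE there is no call and the error prints as "Error: ..."
is.null(conditionCall(e2))
#> [1] TRUE
```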

> Now I'd like to find an example that only uses packages with priority base | 
> Recommended

Sure, here are a few.

library(Matrix)
# searching for Matrix-owned generics
matrix_generics <- getGenerics(where = asNamespace("Matrix"))
matrix_generics@.Data[matrix_generics@package == "Matrix"]

# simple signature, one argument 'x'
symmpart()
# Error: unable to find an inherited method for function ‘symmpart’ for signature ‘x="missing"’

# more complicated signature, especially including ...
Cholesky(a = 1)
# Error: unable to find an inherited method for function ‘Cholesky’ for signature ‘A="missing"’
Cholesky(a = 1, perm = TRUE)
# Error: unable to find an inherited method for function ‘Cholesky’ for signature ‘A="missing"’
Cholesky(a = 1, perm = TRUE, IMult = 2)
# Error: unable to find an inherited method for function ‘Cholesky’ for signature ‘A="missing"’

---

'base' is a bit harder since stats4 just provides classes over stats
functions, so the missingness error comes from non-S4 code.

library(stats4)
coef()
# Error in coef.default() : argument "object" is missing, with no default

Defining our own generic:

setGeneric("BaseGeneric", \(a, ...) standardGeneric("BaseGeneric"))
BaseGeneric()
# Error: unable to find an inherited method for function ‘BaseGeneric’ for signature ‘a="missing"’

# getting multiple classes to show up requires a generic whose signature
# has several arguments (redefined here with 'x' and 'y'):
setGeneric("BaseGeneric", \(x, y, ...) standardGeneric("BaseGeneric"))
setMethod("BaseGeneric", signature(x = "double", y = "double"), \(x, y, ...) x + y)
BaseGeneric(X = 1, Y = 2)
# Error: unable to find an inherited method for function ‘BaseGeneric’ for signature ‘x="missing", y="missing"’


On Fri, Aug 11, 2023 at 2:26 AM Martin Maechler wrote:
>
> > Michael Chirico via R-devel
> > on Thu, 10 Aug 2023 23:56:45 -0700 writes:
>
> > Here's a trivial patch that offers some improvement:
>
> > Index: src/library/methods/R/methodsTable.R
> > ===
> > --- src/library/methods/R/methodsTable.R (revision 84931)
> > +++ src/library/methods/R/methodsTable.R (working copy)
> > @@ -752,11 +752,12 @@
> > if(length(methods) == 1L)
> > return(methods[[1L]]) # the method
> > else if(length(methods) == 0L) {
> > -cnames <- paste0("\"", vapply(classes, as.character, ""), "\"",
> > +cnames <- paste0(head(fdef@signature, length(classes)), "=\"",
> > vapply(classes, as.character, ""), "\"",
> > collapse = ", ")
> > stop(gettextf("unable to find an inherited method for function %s
> > for signature %s",
> > sQuote(fdef@generic),
> > sQuote(cnames)),
> > + call. = FALSE,
> > domain = NA)
> > }
> > else
>
> > Here's the upshot for the example on DBI:
>
> > dbGetQuery(connection = conn, query = query)
> > Error: unable to find an inherited method for function ‘dbGetQuery’
> > for signature ‘conn="missing", statement="missing"’
>
> > I don't have any confidence about edge cases / robustness of this
> > patch for generic S4 use cases (make check-all seems fine),
>
> Good you checked, but you are right that that's not all enough to be sure:
>
> Checking error *messages* is not something we do often {not
> the least because you'd need to consider message translations
> and hence ensure you only check in case of English ...}.
>
> > but I don't suppose a full patch would be dramatically different from 
> the
> > above.
>
> I agree: The patch looks to make sense to me, too,
> while I'm not entirely sure about the extra   call. = FALSE
>  (which I of course understand you'd prefer for the above example)
>
> Now I'd like to find an example that only uses packages with priority
>  base | Recommended
>
> Martin
>
> --
> Martin Maechler
> ETH Zurich  and  R Core team
>
>
> > Mike C
>
> > On Thu, Aug 10, 2023 at 12:39 PM Gabriel Becker  
> wrote:
> >>
> >> I just want to add my 2 cents that I think it would be very useful and 
> beneficial to improve S4 to surface that information as well.
> >>
> >> More information about the way that the dispatch failed would be of 
> great help in situations like the one Michael pointed out.
> >>
> >> ~G
> >>
> >> On Thu, Aug 10, 2023 at 9:59 AM Michael Chirico via R-devel 
>  wrote:
> >>>
> >>> I forwarded that along to the original reporter with positive feedback
> >>> -- including the argument names is definitely a big help for cuing
> >>> what exactly is missing.
> >>>
> >>> Would a patch to do something similar for S4 be useful?
> >>>
> >>> On Thu, Aug 10, 2023 at 6:46 AM Hadley Wickham  
> wrote:
> >>> >
> >>> > Hi Michael,
> >>> >
> >>> > I can't help with S4, but I can help to make sure this isn't a 
> pr

Re: [Rd] R-4.3 version list.files function could not work correctly in chinese

2023-08-11 Thread Ivan Krylov
Dear 叶月光,

Thank you for your message, but please follow the posting guide in your
future messages: https://www.r-project.org/posting-guide.html
https://www.r-project.org/bugs.html

I understand from your link that list.files() ends up skipping some
Chinese filenames in R-4.3.1 (but not R-4.2.2) on Windows, but would you
(or perhaps Yihui Xie who I see is also participating in the discussion)
mind translating the rest of your findings into English? Have you been
able to narrow down the problem to certain character ranges, for
example?

-- 
Best regards,
Ivan



[Rd] R-4.3 version list.files function could not work correctly in chinese

2023-08-11 Thread 叶月光
Hello,
  
[Message body garbled by an encoding error in the archive. The legible
fragments mention R-4.3, list.files, "BUG", and a thread about r4.3 and
dir on the COS forum (cosx.org).]



Re: [Rd] Improving user-friendliness of S4 dispatch failure when mis-naming arguments?

2023-08-11 Thread Martin Maechler
> Michael Chirico via R-devel 
> on Thu, 10 Aug 2023 23:56:45 -0700 writes:

> Here's a trivial patch that offers some improvement:

> Index: src/library/methods/R/methodsTable.R
> ===
> --- src/library/methods/R/methodsTable.R (revision 84931)
> +++ src/library/methods/R/methodsTable.R (working copy)
> @@ -752,11 +752,12 @@
> if(length(methods) == 1L)
> return(methods[[1L]]) # the method
> else if(length(methods) == 0L) {
> -cnames <- paste0("\"", vapply(classes, as.character, ""), "\"",
> +cnames <- paste0(head(fdef@signature, length(classes)), "=\"",
> vapply(classes, as.character, ""), "\"",
> collapse = ", ")
> stop(gettextf("unable to find an inherited method for function %s
> for signature %s",
> sQuote(fdef@generic),
> sQuote(cnames)),
> + call. = FALSE,
> domain = NA)
> }
> else

> Here's the upshot for the example on DBI:

> dbGetQuery(connection = conn, query = query)
> Error: unable to find an inherited method for function ‘dbGetQuery’
> for signature ‘conn="missing", statement="missing"’
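
To make the effect of the changed `paste0()` call concrete, here is a sketch that builds the signature string outside of the methods package. The vectors `sig_args` and `classes` are hypothetical stand-ins for `fdef@signature` and the classes seen at dispatch in the DBI example.

```r
# Stand-ins for fdef@signature and the dispatch classes
sig_args <- c("conn", "statement")
classes  <- c("missing", "missing")

# Before the patch: classes only
old_cnames <- paste0("\"", vapply(classes, as.character, ""), "\"",
                     collapse = ", ")

# After the patch: argument name paired with its class
new_cnames <- paste0(head(sig_args, length(classes)), "=\"",
                     vapply(classes, as.character, ""), "\"",
                     collapse = ", ")

cat(old_cnames, "\n")
#> "missing", "missing"
cat(new_cnames, "\n")
#> conn="missing", statement="missing"
```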

> I don't have any confidence about edge cases / robustness of this
> patch for generic S4 use cases (make check-all seems fine),

Good you checked, but you are right that that's not all enough to be sure:

Checking error *messages* is not something we do often {not
the least because you'd need to consider message translations
and hence ensure you only check in case of English ...}.
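
A sketch of what such an English-only message check could look like. The generic `DemoGeneric` is hypothetical, and setting the `LANGUAGE` environment variable inside a running session is assumed to be enough to get untranslated messages here; `Sys.setLanguage()` in recent R versions may be the more reliable route. Only the stable prefix of the message is matched, since the signature part is exactly what the patch changes.

```r
library(methods)

# Hypothetical generic with no methods, defined only for this check
setGeneric("DemoGeneric", function(a, ...) standardGeneric("DemoGeneric"))

# Ask for English messages, remembering the previous setting
old_lang <- Sys.getenv("LANGUAGE", unset = NA)
Sys.setenv(LANGUAGE = "en")

# 'b' is swallowed by '...', so 'a' is missing and dispatch fails
msg <- tryCatch(DemoGeneric(b = 1), error = conditionMessage)

# Restore the previous language setting
if (is.na(old_lang)) Sys.unsetenv("LANGUAGE") else Sys.setenv(LANGUAGE = old_lang)

has_prefix <- grepl("unable to find an inherited method for function",
                    msg, fixed = TRUE)
has_prefix
```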

> but I don't suppose a full patch would be dramatically different from the
> above.

I agree: The patch looks to make sense to me, too,
while I'm not entirely sure about the extra   call. = FALSE
 (which I of course understand you'd prefer for the above example)

Now I'd like to find an example that only uses packages with priority
 base | Recommended

Martin

--
Martin Maechler
ETH Zurich  and  R Core team


> Mike C

> On Thu, Aug 10, 2023 at 12:39 PM Gabriel Becker wrote:
>>
>> I just want to add my 2 cents that I think it would be very useful and
>> beneficial to improve S4 to surface that information as well.
>>
>> More information about the way that the dispatch failed would be of
>> great help in situations like the one Michael pointed out.
>>
>> ~G
>>
>> On Thu, Aug 10, 2023 at 9:59 AM Michael Chirico via R-devel wrote:
>>> 
>>> I forwarded that along to the original reporter with positive feedback
>>> -- including the argument names is definitely a big help for cuing
>>> what exactly is missing.
>>> 
>>> Would a patch to do something similar for S4 be useful?
>>> 
>>> On Thu, Aug 10, 2023 at 6:46 AM Hadley Wickham wrote:
>>> >
>>> > Hi Michael,
>>> >
>>> > I can't help with S4, but I can help to make sure this isn't a problem
>>> > with S7. What do you think of the current error message? Do you see
>>> > anything obvious we could do to improve?
>>> >
>>> > library(S7)
>>> >
>>> > dbGetQuery <- new_generic("dbGetQuery", c("conn", "statement"))
>>> > dbGetQuery(connection = NULL, query = NULL)
>>> > #> Error: Can't find method for generic `dbGetQuery(conn, statement)`:
>>> > #> - conn : MISSING
>>> > #> - statement: MISSING
>>> >
>>> > Hadley
>>> >
>>> > On Wed, Aug 9, 2023 at 10:02 PM Michael Chirico via R-devel
>>> >  wrote:
>>> > >
>>> > > I fielded a debugging request from a non-expert user today. At root
>>> > > was running the following:
>>> > >
>>> > > dbGetQuery(connection = conn, query = query)
>>> > >
>>> > > The problem is that they've named the arguments incorrectly -- it
>>> > > should have been [1]:
>>> > >
>>> > > dbGetQuery(conn = conn, statement = query)
>>> > >
>>> > > The problem is that the error message "looks" highly confusing to the
>>> > > untrained eye:
>>> > >
>>> > > Error in (function (classes, fdef, mtable)  :   unable to find an
>>> > > inherited method for function ‘dbGetQuery’ for signature ‘"missing",
>>> > > "missing"’
>>> > >
>>> > > In retrospect, of course, this makes sense -- the mis-named arguments
>>> > > are getting picked up by '...', leaving the required arguments
>>> > > missing.
>>> > >
>>> > > But I was left wondering how we could help users right their own ship here.
>>> > >
>>> > > Would it help to mention the argument names? To include some code
>>> > > checking for weird combinations of missing arguments? Any other
>>> > > suggestions?
>>> > >
>>> > > Mike C
>>> > >
>>> > > [1] https://github.com/r-dbi/DBI/blob/97934c885749dd87a6beb10e8ccb6a5ebea3675e/R/dbGetQuery.R#L62-L64
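
The explanation in the original report, that mis-named arguments are absorbed by `...` and leave the real arguments missing, can be checked with a plain function. The function `f` below is a hypothetical stand-in that only borrows the argument names `conn` and `statement`; it is not the DBI generic.

```r
# Hypothetical stand-in: reports which formals are missing and
# which names ended up in '...'
f <- function(conn, statement, ...) {
  list(
    conn_missing      = missing(conn),
    statement_missing = missing(statement),
    dots              = names(list(...))
  )
}

# 'connection' and 'query' match neither formal, exactly or partially
# (partial matching needs the supplied name to be a prefix of the formal)
res <- f(connection = "c", query = "q")
str(res)
#> List of 3
#>  $ conn_missing     : logi TRUE
#>  $ statement_missing: logi TRUE
#>  $ dots             : chr [1:2] "connection" "query"
```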
