Re: [Rd] Should subsetting named vector return named vector including named unmatched elements?

2024-01-18 Thread Hervé Pagès
Never been a big fan of this behavior either but maybe the intention was 
to make it easier to distinguish between 2 types of NAs in the result: 
those that were present in the original vector vs those that are 
introduced by an unmatched subscript. Like in this example:

     x <- setNames(c(101:108, NA), letters[1:9])
     x
     #   a   b   c   d   e   f   g   h   i
     # 101 102 103 104 105 106 107 108  NA

     x[c("g", "k", "a", "i")]
     #    g <NA>    a    i
     #  107   NA  101   NA

The first NA is the result of an unmatched subscript, while the second 
one comes from 'x'.

This is of limited interest though. In most real world applications I've 
worked on, we actually need to "fix" the names of the result.
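
As a minimal sketch of both points, reusing 'x' from the example above: the names of the result tell the two kinds of NA apart, and the names can be "fixed" by reassigning the subscript. The variable names idx, res, unmatched and fixed are only illustrative.

```r
x <- setNames(c(101:108, NA), letters[1:9])
idx <- c("g", "k", "a", "i")
res <- x[idx]

# An NA *name* marks an unmatched subscript; an NA value under a
# non-NA name was already present in 'x'.
unmatched <- is.na(names(res))
unmatched
# [1] FALSE  TRUE FALSE FALSE

# "Fix" the names so every element carries the name it was requested by.
fixed <- setNames(res, idx)
names(fixed)
# [1] "g" "k" "a" "i"
```

After the fix, the result no longer shows which NA came from 'x', so 'unmatched' has to be computed first if that distinction matters.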

Best,

H.

On 1/18/24 11:51, Jiří Moravec wrote:
> Subsetting vector (including lists) returns the same number of 
> elements as the subsetting vector, including unmatched elements which 
> are reported as `NA` or `NULL` (in case of lists).
>
> Consider:
>
> ```
> menu = list(
>   "bacon" = "foo",
>   "eggs" = "bar",
>   "beans" = "baz"
>   )
>
> select = c("bacon", "eggs", "spam")
>
> menu[select]
> # $bacon
> # [1] "foo"
> #
> # $eggs
> # [1] "bar"
> #
> # $<NA>
> # NULL
>
> ```
>
> Wouldn't it be more logical to return named vector/list including 
> names of unmatched elements when subsetting using names? After all, 
> the unmatched elements are already returned. I.e., the output would 
> look like this:
>
> ```
>
> menu[select]
> # $bacon
> # [1] "foo"
> #
> # $eggs
> # [1] "bar"
> #
> # $spam
> # NULL
>
> ```
>
> The simple fix `menu[select] |> setNames(select)` solves, but it feels 
> to me like something that could be a default behaviour.
>
> On slightly unrelated note, when I was asking if there is a better 
> solution, the `menu[select]` seems to allocate more memory than 
> `menu_env = list2env(menu); mget(select, envir = menu_env, ifnotfound = 
> list(NULL))`. Or the sapply solution. Is this a benchmarking artifact?
>
> https://stackoverflow.com/q/77828678/4868692
>
> __
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel

-- 
Hervé Pagès

Bioconductor Core Team
hpages.on.git...@gmail.com



Re: [Rd] Should subsetting named vector return named vector including named unmatched elements?

2024-01-18 Thread Steve Martin via R-devel
Jiří,

For your first question, the NA names make sense if you think of indexing with 
a character vector as the same as menu[match(select, names(menu))]. You're not 
indexing with "spam"; rather, "spam" becomes NA because it's not in the names 
of menu. (This is how it's documented in ?`[`: "Character vectors will be 
matched to the names of the object...")
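
A minimal sketch of that equivalence, using the menu example from the original message (pos, by_name and by_pos are illustrative names):

```r
menu <- list(bacon = "foo", eggs = "bar", beans = "baz")
select <- c("bacon", "eggs", "spam")

# Position of each requested name in names(menu); NA where unmatched.
pos <- match(select, names(menu))
pos
# [1]  1  2 NA

# Indexing by name and indexing by the matched positions agree,
# including the NA name of the unmatched element.
by_name <- menu[select]
by_pos <- menu[pos]
identical(by_name, by_pos)
# [1] TRUE
```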

Steve


On Thursday, January 18th, 2024 at 2:51 PM, Jiří Moravec wrote:


> Subsetting vector (including lists) returns the same number of elements
> as the subsetting vector, including unmatched elements which are
> reported as `NA` or `NULL` (in case of lists).
> 
> Consider:
> 
> ```
> menu = list(
> "bacon" = "foo",
> "eggs" = "bar",
> "beans" = "baz"
> )
> 
> select = c("bacon", "eggs", "spam")
> 
> menu[select]
> # $bacon
> # [1] "foo"
> #
> # $eggs
> # [1] "bar"
> #
> # $<NA>
> # NULL
> 
> ```
> 
> Wouldn't it be more logical to return named vector/list including names of
> unmatched elements when subsetting using names? After all, the unmatched
> elements are already returned. I.e., the output would look like this:
> 
> ```
> 
> menu[select]
> # $bacon
> # [1] "foo"
> #
> # $eggs
> # [1] "bar"
> #
> # $spam
> # NULL
> 
> ```
> 
> The simple fix `menu[select] |> setNames(select)` solves, but it feels
> to me like something that could be a default behaviour.
> 
> On slightly unrelated note, when I was asking if there is a better
> solution, the `menu[select]` seems to allocate more memory than
> `menu_env = list2env(menu); mget(select, envir = menu_env, ifnotfound = 
> list(NULL))`. Or the sapply solution. Is this a benchmarking artifact?
> 
> https://stackoverflow.com/q/77828678/4868692
> 


[Rd] Should subsetting named vector return named vector including named unmatched elements?

2024-01-18 Thread Jiří Moravec
Subsetting a vector (lists included) returns as many elements as the 
subsetting vector has, including unmatched elements, which are 
reported as `NA` or `NULL` (in the case of lists).


Consider:

```
menu = list(
  "bacon" = "foo",
  "eggs" = "bar",
  "beans" = "baz"
  )

select = c("bacon", "eggs", "spam")

menu[select]
# $bacon
# [1] "foo"
#
# $eggs
# [1] "bar"
#
# $<NA>
# NULL

```

Wouldn't it be more logical to return a named vector/list that includes 
the names of unmatched elements when subsetting by name? After all, the 
unmatched elements are already returned. I.e., the output would look 
like this:


```

menu[select]
# $bacon
# [1] "foo"
#
# $eggs
# [1] "bar"
#
# $spam
# NULL

```

The simple fix `menu[select] |> setNames(select)` solves this, but it 
feels to me like something that could be the default behaviour.
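
A minimal sketch of that fix wrapped in a helper. `subset_named` and its argument `nms` are hypothetical names for illustration, not an existing base R function:

```r
# Hypothetical helper: subset by name and keep the requested names,
# even for elements that were not matched.
subset_named = function(x, nms) {
  stopifnot(is.character(nms))
  setNames(x[nms], nms)
}

menu = list(bacon = "foo", eggs = "bar", beans = "baz")
select = c("bacon", "eggs", "spam")

res = subset_named(menu, select)
names(res)
# [1] "bacon" "eggs"  "spam"
is.null(res[["spam"]])
# [1] TRUE
```

The same helper works on atomic vectors, where the unmatched element is `NA` instead of `NULL`.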


On a slightly unrelated note, when I was asking whether there is a better 
solution, `menu[select]` seemed to allocate more memory than 
`menu_env = list2env(menu); mget(select, envir = menu_env, ifnotfound = 
list(NULL))`, or than the sapply solution. Is this a benchmarking artifact?
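
For reference, a sketch of the two approaches being compared. It only checks that they return the same names and values and makes no claim about their memory use; by_subset and by_mget are illustrative names.

```r
menu = list(bacon = "foo", eggs = "bar", beans = "baz")
select = c("bacon", "eggs", "spam")

# Approach 1: subset, then restore the requested names.
by_subset = menu[select] |> setNames(select)

# Approach 2: look the names up in an environment built from the list,
# returning NULL for names that are not found.
menu_env = list2env(menu)
by_mget = mget(select, envir = menu_env, ifnotfound = list(NULL))

identical(names(by_subset), names(by_mget))
# [1] TRUE
```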


https://stackoverflow.com/q/77828678/4868692
