amoeba commented on issue #39311:
URL: https://github.com/apache/arrow/issues/39311#issuecomment-1864901780
Thanks @jorisvandenbossche for diving into this.
> My understanding is that you rather want some scalar take-like kernel that
works on a list of elements and takes certain values from that list?
Yes. And the indices would be generated at execution time and would act as a
sort of mask.
I attempted to do something similar to what I want with DuckDB almost got
there except everything should lists.
<details>
<summary>Setup</summary>
```
create table ex (date INTEGER, symbol VARCHAR, value INTEGER);
insert into ex values (1, 'a', 3);
insert into ex values (1, 'b', 2);
insert into ex values (1, 'c', 1);
insert into ex values (2, 'a', 2);
insert into ex values (2, 'b', 3);
insert into ex values (2, 'c', 1);
insert into ex values (3, 'a', 1);
insert into ex values (3, 'b', 2);
insert into ex values (3, 'c', 3);
```
```
D select * from ex;
┌───────┬─────────┬───────┐
│ date │ symbol │ value │
│ int32 │ varchar │ int32 │
├───────┼─────────┼───────┤
│ 1 │ a │ 3 │
│ 1 │ b │ 2 │
│ 1 │ c │ 1 │
│ 2 │ a │ 2 │
│ 2 │ b │ 3 │
│ 2 │ c │ 1 │
│ 3 │ a │ 1 │
│ 3 │ b │ 2 │
│ 3 │ c │ 3 │
└───────┴─────────┴───────┘
```
</details>
```sql
select
date,
list(symbol) as symbol_list,
list(value) as value_list,
max(value) as value_max,
list_position(value_list, value_max) as idx,
symbol_list[idx] as symbol_out,
value_list[idx] as value_out
from ex
group by date;
```
```
┌───────┬─────────────┬────────────┬───────────┬───────┬────────────┬───────────┐
│ date │ symbol_list │ value_list │ value_max │ idx │ symbol_out │
value_out │
│ int32 │ varchar[] │ int32[] │ int32 │ int32 │ varchar │
int32 │
├───────┼─────────────┼────────────┼───────────┼───────┼────────────┼───────────┤
│ 1 │ [a, b, c] │ [3, 2, 1] │ 3 │ 1 │ a │
3 │
│ 2 │ [a, b, c] │ [2, 3, 1] │ 3 │ 2 │ b │
3 │
│ 3 │ [a, b, c] │ [1, 2, 3] │ 3 │ 3 │ c │
3 │
└───────┴─────────────┴────────────┴───────────┴───────┴────────────┴───────────┘
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]