[GitHub] [arrow] jorisvandenbossche commented on pull request #14697: ARROW-18265: [C++][Python] Support FieldRef to work with ListElement

GitBox Tue, 13 Dec 2022 05:21:55 -0800


jorisvandenbossche commented on PR #14697:
URL: https://github.com/apache/arrow/pull/14697#issuecomment-1348529294

> a purely metadata-based field lookup (the current semantics of `FieldRef`)

I am not sure this is still entirely correct, since we already use `FieldRef`
in other places in the compute engine to refer to fields (e.g. to specify the
sort key, or in projections, although there it is wrapped in an expression).
It is certainly true that right now we only support struct/union field
lookups, but those are not purely metadata-based either: looking up the actual
values (not just the type/field from a schema) may also need to compute
validity bitmaps. For list types, likewise, the lookup of the type is
metadata-only. It is only when looking up actual values that you need a
nontrivial computation.
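
To make the distinction concrete, here is a minimal pure-Python sketch of the
argument. It is a toy model, not Arrow's implementation: the `StructColumn`
and `ListColumn` classes and all function names are hypothetical, and returning
`None` for an out-of-range list index is a choice made for the sketch only.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class StructColumn:
    """Toy struct column: one validity flag per row plus named child columns."""
    validity: List[bool]                         # False: the struct row is null
    children: Dict[str, List[Optional[int]]]     # field name -> child values


@dataclass
class ListColumn:
    """Toy list column: row i spans values[offsets[i]:offsets[i + 1]]."""
    value_type: str
    offsets: List[int]
    values: List[int]


def struct_field_type(schema: Dict[str, str], name: str) -> str:
    # Type lookup for a struct field: only the schema is consulted.
    return schema[name]


def struct_field_values(col: StructColumn, name: str) -> List[Optional[int]]:
    # Value lookup: the child's nulls must be combined with the parent's,
    # which is the "calculate validity bitmaps" step.
    return [v if ok else None for v, ok in zip(col.children[name], col.validity)]


def list_element_type(col: ListColumn) -> str:
    # Type lookup for a list element: again metadata only.
    return col.value_type


def list_element_values(col: ListColumn, index: int) -> List[Optional[int]]:
    # Value lookup: each row's offsets decide which value to pick.
    out: List[Optional[int]] = []
    for start, end in zip(col.offsets, col.offsets[1:]):
        pos = start + index
        out.append(col.values[pos] if pos < end else None)
    return out


structs = StructColumn(validity=[True, False, True],
                       children={"a": [1, 2, None]})
lists = ListColumn(value_type="int64",
                   offsets=[0, 2, 2, 5],
                   values=[10, 11, 20, 21, 22])

print(struct_field_type({"a": "int64"}, "a"))   # int64
print(struct_field_values(structs, "a"))        # [1, None, None]
print(list_element_type(lists))                 # int64
print(list_element_values(lists, 0))            # [10, None, 20]
```

In both cases the type lookup touches only metadata, while the value lookup
does some work per row; the list case just does more of it.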

> Is this actually important to support?

If you want to work with list types, this is a pretty basic operation. The
operation itself is already supported (the `list_element` kernel); this PR is
only about being able to express it in a field reference. So it falls
somewhat in the "user convenience" bucket, as you can already achieve
something similar with the kernel.
Substrait, for example, also has this concept
(https://github.com/substrait-io/substrait/blob/7f272f13f22cd5f5842baea42bcf7961e6251881/proto/substrait/algebra.proto#L932-L938)

Another example of a database that supports this in queries:
https://cloud.google.com/bigquery/docs/reference/standard-sql/arrays#accessing_array_elements
