jorisvandenbossche commented on PR #14697: URL: https://github.com/apache/arrow/pull/14697#issuecomment-1348529294
> a purely metadata-based field lookup (the current semantics of `FieldRef`)

I am not sure that this is still entirely correct nowadays, since we already use `FieldRef` in other places in the compute engine to refer to fields (e.g. to specify the sort key, in projections (although wrapped in an expression there), ...).

It is certainly true that right now we only support struct/union field lookups, but even those are not purely metadata-based: looking up the actual values (not just the type/field from a schema) potentially also requires computing validity bitmaps. For list types, the lookup of the type is likewise metadata-only. It is only when looking up actual values that you need a nontrivial computation.

> Is this actually important to support?

If you want to work with list types, this is a pretty basic operation. The operation itself is already supported (the `list_element` kernel); here it is only about being able to express it in a field reference. So this falls somewhat into the bucket of "user convenience", since you can already achieve something similar with the kernel.

That said, Substrait, for example, also has this concept (https://github.com/substrait-io/substrait/blob/7f272f13f22cd5f5842baea42bcf7961e6251881/proto/substrait/algebra.proto#L932-L938).

Another example, a database that supports this in queries: https://cloud.google.com/bigquery/docs/reference/standard-sql/arrays#accessing_array_elements
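To make the metadata-vs-values distinction for list types concrete, here is a rough sketch in plain Python. It is a toy model, not Arrow's implementation: `ListArray`, `element_type`, and `element_values` are hypothetical names made up for illustration, and mapping out-of-range indices to null is a simplifying assumption that the real `list_element` kernel may not share.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ListArray:
    """Toy stand-in for a list<value_type> array (hypothetical, illustration only)."""
    value_type: str        # stands in for the type metadata, e.g. "int64"
    offsets: List[int]     # length n + 1; row i spans values[offsets[i]:offsets[i + 1]]
    values: List[int]      # flattened child values
    validity: List[bool]   # True where the list row itself is non-null


def element_type(arr: ListArray) -> str:
    # Type lookup: answered from metadata alone, no data buffers are read.
    return arr.value_type


def element_values(arr: ListArray, index: int) -> List[Optional[int]]:
    # Value lookup: has to walk the offsets, gather from the child values,
    # and derive a fresh validity for the result.
    out: List[Optional[int]] = []
    for row in range(len(arr.offsets) - 1):
        start, end = arr.offsets[row], arr.offsets[row + 1]
        if not arr.validity[row] or start + index >= end:
            # Assumption: null rows and out-of-range indices both become null.
            out.append(None)
        else:
            out.append(arr.values[start + index])
    return out


# Rows: [1, 2], null, [3, 4, 5]
arr = ListArray(
    value_type="int64",
    offsets=[0, 2, 2, 5],
    values=[1, 2, 3, 4, 5],
    validity=[True, False, True],
)

print(element_type(arr))       # metadata only
print(element_values(arr, 0))  # requires a pass over offsets and values
```

The point of the sketch is only that resolving the type of "element `i` of a list field" needs nothing beyond the type itself, while resolving the values needs a gather through the offsets, which is the nontrivial part.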
