notfilippo opened a new pull request, #12536: URL: https://github.com/apache/datafusion/pull/12536
This PR is the first step to come out of experiment #11978, which itself stems from the broader objective described in proposal #11513.

---

## Rationale for this change

This change introduces the `Scalar` type, which identifies the physical representation of a scalar value. `Scalar` replaces `ScalarValue` in the `ColumnarValue` enum. This will allow future PRs to remove some redundant variants of `ScalarValue` and to transition to a logical representation of scalar values in DataFusion.

## Design rationale

The new `Scalar` type embeds both a `ScalarValue` and a `DataType`. While the type information might seem redundant given the current implementation of `ScalarValue`, it will be beneficial once we start removing redundant variants (such as replacing `ScalarValue::Utf8View` and `ScalarValue::LargeUtf8` while keeping `ScalarValue::Utf8`), because it keeps track of the represented type during physical execution.

### Alternatives considered: [`arrow::array::Scalar`](https://docs.rs/arrow/latest/arrow/array/struct.Scalar.html)

Arrow's array crate provides a `Scalar` type which implements `Datum` and holds a reference to an Arrow array with `len() = 1`. While this type fully identifies the physical representation of a scalar value, it falls short when considering the overhead of copying a fully rendered `ArrayRef` instead of primitive types. This possibility was already discussed here: https://github.com/apache/datafusion/issues/7353#issuecomment-1739664804.

## What changes are included in this PR?

- Introduced the `Scalar` struct
- Replaced `ScalarValue` with `Scalar` in `ColumnarValue`'s `Scalar` variant
- Adapted existing code to the `Scalar` variant change

## Are these changes tested?

This change should be a no-op, as it doesn't include any behavioural change. The existing test suite is sufficient to verify this change.

## TODO

- [ ] rustdoc on new types and functions
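
## Illustrative sketch

To make the design rationale concrete, here is a minimal, self-contained sketch of a `Scalar` that pairs a value with its physical type. It is not the code from this PR. The `DataType`, `ScalarValue`, and `ColumnarValue` enums below are heavily simplified stand-ins for the real Arrow and DataFusion types. The field names, the accessor names, and the `scalar_type` helper are assumptions made for illustration.

```rust
// Simplified stand-in for arrow's `DataType`; the real enum has many more variants.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Utf8,
    LargeUtf8,
    Utf8View,
    Int64,
}

// Simplified stand-in for DataFusion's `ScalarValue`, shown as it might look
// after the redundant string variants have been removed (an assumption).
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum ScalarValue {
    Utf8(Option<String>),
    Int64(Option<i64>),
}

// Hypothetical shape of the new type: a value plus the physical type it represents.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
struct Scalar {
    value: ScalarValue,
    data_type: DataType,
}

#[allow(dead_code)]
impl Scalar {
    fn new(value: ScalarValue, data_type: DataType) -> Self {
        Self { value, data_type }
    }

    fn value(&self) -> &ScalarValue {
        &self.value
    }

    fn data_type(&self) -> &DataType {
        &self.data_type
    }
}

// Stand-in for `ColumnarValue`; `Vec<ScalarValue>` takes the place of `ArrayRef`.
#[allow(dead_code)]
enum ColumnarValue {
    Array(Vec<ScalarValue>),
    Scalar(Scalar),
}

// Hypothetical helper: returns the physical type carried by a scalar column
// value, or `None` when the value is an array.
#[allow(dead_code)]
fn scalar_type(value: &ColumnarValue) -> Option<&DataType> {
    match value {
        ColumnarValue::Scalar(scalar) => Some(scalar.data_type()),
        ColumnarValue::Array(_) => None,
    }
}
```

With this shape, two scalars can share the same `ScalarValue::Utf8` payload while still being distinguishable as `Utf8View` and `LargeUtf8` during physical execution, which is the property the design rationale relies on.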