viirya commented on PR #23033:
URL: https://github.com/apache/datafusion/pull/23033#issuecomment-4759948505
Some archaeology on the "Spark doesn't have a u64 type" point, in case it
helps decide the direction here.
You're right that Spark can't produce a `UInt64`. Spark SQL's integer types
are `ByteType`/`ShortType`/`IntegerType`/`LongType` — all signed (JVM has no
unsigned integers). This PR's own doc comment on `round_integer` reflects that:
it says it matches *"Spark's `RoundBase` behaviour for `ByteType`, `ShortType`,
`IntegerType`, and `LongType`"* — only the signed types.
So why does the code accept `UInt64` at all? It looks **incidental, not
intentional**. The signature has used
`Coercion::new_exact(TypeSignatureClass::Integer)` since the function was first
added in #21062, and `TypeSignatureClass::Integer` matches on
`NativeType::is_integer()`, which includes the unsigned types:
```rust
// datafusion/common/src/types/native.rs
pub fn is_integer(&self) -> bool {
matches!(self, UInt8 | UInt16 | UInt32 | UInt64 | Int8 | Int16 | Int32 |
Int64)
}
```
So `UInt8..UInt64` came in "for free" via that coercion class rather than by
deliberate design, and the `UInt64` dispatch arm exists only because the
signature lets those values through (e.g. via `arrow_cast(x, 'UInt64')`) —
Spark itself never generates them.
That seems to leave two coherent directions:
1. **Keep the codepath and make it correct** — what this PR (and #22697's
Spark portion) does.
2. **Narrow the signature to signed-only** and drop the `UInt*` dispatch
arms, so Spark `round` matches Spark's actual type domain — which would make
both this fix and #22697's Spark change unnecessary.
No opinion from me on which is preferable — just flagging that the unsigned
support appears to be a side effect of the coercion class rather than a
requirement, in case that informs the call.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]