viirya commented on PR #23033:
URL: https://github.com/apache/datafusion/pull/23033#issuecomment-4759948505

   Some archaeology on the "Spark doesn't have a u64 type" point, in case it 
helps decide the direction here.
   
   You're right that Spark can't produce a `UInt64`. Spark SQL's integer types 
are `ByteType`/`ShortType`/`IntegerType`/`LongType` — all signed (JVM has no 
unsigned integers). This PR's own doc comment on `round_integer` reflects that: 
it says it matches *"Spark's `RoundBase` behaviour for `ByteType`, `ShortType`, 
`IntegerType`, and `LongType`"* — only the signed types.
   
   So why does the code accept `UInt64` at all? It looks **incidental, not 
intentional**. The signature has used 
`Coercion::new_exact(TypeSignatureClass::Integer)` since the function was first 
added in #21062, and `TypeSignatureClass::Integer` matches on 
`NativeType::is_integer()`, which includes the unsigned types:
   
   ```rust
   // datafusion/common/src/types/native.rs
   pub fn is_integer(&self) -> bool {
       matches!(self, UInt8 | UInt16 | UInt32 | UInt64 | Int8 | Int16 | Int32 | 
Int64)
   }
   ```
   
   So `UInt8..UInt64` came in "for free" via that coercion class rather than by 
deliberate design, and the `UInt64` dispatch arm exists only because the 
signature lets those values through (e.g. via `arrow_cast(x, 'UInt64')`) — 
Spark itself never generates them.
   
   That seems to leave two coherent directions:
   1. **Keep the codepath and make it correct** — what this PR (and #22697's 
Spark portion) does.
   2. **Narrow the signature to signed-only** and drop the `UInt*` dispatch 
arms, so Spark `round` matches Spark's actual type domain — which would make 
both this fix and #22697's Spark change unnecessary.
   
   No opinion from me on which is preferable — just flagging that the unsigned 
support appears to be a side effect of the coercion class rather than a 
requirement, in case that informs the call.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to