scovich commented on PR #10157:
URL: https://github.com/apache/arrow-rs/pull/10157#issuecomment-4833171417

   > I'm still not on board with keeping `as_u*` APIs because they are not 
identity cast (since there's no unsigned Variant and we can't shred to 
unsigned) but let's leave it to @alamb / @scovich.
   
   From what I understand, the `as_xxx` methods serve two purposes:
   1. Shredding. Here, `as_uxx` are useless.
   2. Convenience when e.g. picking apart the fields of a specific type that 
the code expects but where the underlying data may not match exactly. Here, 
`as_uxx` are very helpful, if that's what the "parsing" code expected.
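
   To make the second purpose concrete, here is a minimal sketch. It uses a 
hypothetical `ToyVariant` enum, not the real `Variant` type or its actual 
`as_u*` signatures. The only thing it borrows from the discussion is that there 
are signed integer cases and no unsigned ones, so an unsigned accessor is 
always a value-based conversion rather than an identity cast.

```rust
/// Hypothetical stand-in for a variant value (NOT the arrow-rs `Variant`).
/// It has signed integer cases only, so there is no unsigned case to read back.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ToyVariant {
    Null,
    Int8(i8),
    Int32(i32),
    Int64(i64),
}

impl ToyVariant {
    /// Convenience accessor for code that expects an unsigned value.
    /// Returns `Some` only when the stored integer is non-negative and fits in `u32`.
    fn as_u32(&self) -> Option<u32> {
        match *self {
            ToyVariant::Int8(v) => u32::try_from(v).ok(),
            ToyVariant::Int32(v) => u32::try_from(v).ok(),
            ToyVariant::Int64(v) => u32::try_from(v).ok(),
            ToyVariant::Null => None,
        }
    }
}
```

   A caller picking apart a field it expects to be a `u32` gets `None` for 
negative or oversized values instead of a silently wrapped number.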
   
   Orthogonally, there's the question of whether to cast at all, and if so how 
aggressively. This is something we debated at length in the early days, and I 
had called out that there are four possible levels of aggressiveness:
   1. Type-based lossless (e.g. i8 -> i64 always succeeds)
   2. Value-based lossless (e.g. i64 -> i8 _may_ succeed, for values in 
-128..128)
   3. Lossy (e.g. f64 -> i64 truncates or rounds away any fractional digits)
   4. Converting (e.g. string -> i64 parses the string, i64 -> string formats 
the number)
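
   For illustration, the four levels map roughly onto plain Rust standard 
library conversions as sketched below. The function names are made up for this 
example and are not part of the variant API. Note that Rust's `as` cast from 
f64 to i64 truncates toward zero, which is only one possible lossy policy.

```rust
/// Level 1, type-based lossless: every i8 fits in an i64, so this cannot fail.
fn widen(v: i8) -> i64 {
    i64::from(v)
}

/// Level 2, value-based lossless: succeeds only when the value fits in an i8.
fn narrow(v: i64) -> Option<i8> {
    i8::try_from(v).ok()
}

/// Level 3, lossy: the `as` cast drops the fractional part (truncates toward zero).
fn lossy(v: f64) -> i64 {
    v as i64
}

/// Level 4, converting: parse a string into an integer, failing on bad input.
fn parse_i64(s: &str) -> Option<i64> {
    s.parse::<i64>().ok()
}

/// Level 4, converting: format an integer as a string.
fn render(v: i64) -> String {
    v.to_string()
}
```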
   
   I had originally argued for 1/ only, because it is the least surprising. But 
2/ is useful for shredding, and many people expect it anyway because their 
system does implicit casting. 3/ is where things start to get questionable 
(information loss happens too easily IMO). An explicit `cast` operation can 
reasonably do both 3/ and 4/, because the user explicitly opted in by 
requesting the cast.
   
   So we'll need to decide which level of aggressiveness Variant applies 
internally, vs. what requires an explicit `cast`.

