scovich commented on PR #10157: URL: https://github.com/apache/arrow-rs/pull/10157#issuecomment-4833171417
> I'm still not on board with keeping `as_u*` APIs because they are not identity cast (since there's no unsigned Variant and we can't shred to unsigned) but let's leave it to @alamb / @scovich.

From what I understand, the `as_xxx` methods serve two purposes:

1. Shredding - `as_uxx` useless
2. Convenience when e.g. picking apart the fields of a specific type that the code expects but where the underlying data may not match exactly. Here, `as_uxx` are very helpful, if that's what the "parsing" code expected.

Orthogonally, there's the question of whether to cast at all, and if so how aggressively. This is something we debated at length in the early days, and I had called out that there are four possible levels of aggressiveness:

1. Type-based lossless (e.g. i8 -> i64 always succeeds)
2. Value-based lossless (e.g. i64 -> i8 _may_ succeed, for values in -128..128)
3. Lossy (e.g. f64 -> i64 will truncate/round all trailing decimal digits)
4. Converting (e.g. string -> i64 parses a string, i64 -> string)

I had originally argued for 1/ only, because it is the least surprising. But 2/ is useful for shredding and many people anyway expect it because their system does implicit casting. 3/ is where things start to get questionable (information loss happens too easily IMO). An explicit `cast` operation can reasonably do both 3/ and 4/, because the user explicitly opted in by requesting the cast.

So we'll need to decide which level of aggressiveness variant does internally, vs. what requires a `cast`.
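To make the four levels concrete, here is a rough sketch using only plain `std` conversions. It is an illustration, not the variant crate's API: the function names (`widen`, `narrow`, `narrow_unsigned`, `lossy`, `convert`) are hypothetical and exist only for this example.

```rust
// Illustration only: plain std conversions, NOT the variant API.
// All function names below are hypothetical.

/// Level 1: type-based lossless. Every i8 fits in an i64, so this cannot fail.
fn widen(v: i8) -> i64 {
    i64::from(v)
}

/// Level 2: value-based lossless. Succeeds only when the value fits in an i8.
fn narrow(v: i64) -> Option<i8> {
    i8::try_from(v).ok()
}

/// Level 2 with an unsigned target, roughly what an `as_u8`-style helper
/// would have to do: negative or too-large values yield `None`.
fn narrow_unsigned(v: i64) -> Option<u8> {
    u8::try_from(v).ok()
}

/// Level 3: lossy. `as` truncates toward zero (and saturates at the i64
/// bounds), silently discarding the fractional part.
fn lossy(v: f64) -> i64 {
    v as i64
}

/// Level 4: converting. Parses text into a number; fails on non-numeric input.
fn convert(s: &str) -> Option<i64> {
    s.parse::<i64>().ok()
}
```

In this sketch, levels 1 and 2 never change a value that they return, which is why they seem reasonable for implicit use. `lossy(3.99)` quietly becomes `3`, which is the kind of surprise that arguably belongs behind an explicit `cast`.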
