junjunjd commented on PR #7901: URL: https://github.com/apache/arrow-datafusion/pull/7901#issuecomment-1792050626
@alamb Thanks for the review. I have added comments to the panics I removed in [scalar.rs](https://github.com/apache/arrow-datafusion/pull/7901/files#diff-3ed49bf2d745740a87441ae1fcdec44fe853f4809ee671fe294818982e020da6). To summarize, these panics can be categorized into five types in general: - [panics](https://github.com/apache/arrow-datafusion/pull/7901/files#r1378446881) generated in `iter_to_array` when the `ScalarValue`s in the iterator [are not all the same type](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1093-L1094). I believe these errors are reachable. Most of the [`build_*` macros](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L1145) defined in the function return an internal error instead of panicking. IMO it makes sense to remove these panics to align with other internal errors returned. - [`typed_cast_*` macros](https://github.com/apache/arrow-datafusion/pull/7901/files#r1379681340) called in `try_from_array`. Since `try_from_array` is a public function, downcasting the array to certain types can fail depending on what `array` value the user passes to the function. I think it makes sense to return an internal error as this error is reachable and recoverable. This aligns with how downcast error is handled in the [downcast_value](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/lib.rs#L73) macro. `try_from_array` already returns a `Result`, so this does not require any change in client code. - [panics](https://github.com/apache/arrow-datafusion/pull/7901/files#r1381174929) generated when `with_precision_and_scale` is called on decimal arrays. I think these errors are reachable because [ScalarValue::try_new_decimal128](https://github.com/apache/arrow-datafusion/blob/main/datafusion/common/src/scalar.rs#L660-L661) allows decimals with precision 0 while [arrow-array](https://github.com/apache/arrow-rs/blob/master/arrow-array/src/types.rs#L1234-L1235) does not support that. We can update `try_new_decimal128` to disallow decimal with precision 0 and establish some other invariants to [`new_list`](https://github.com/apache/arrow-datafusion/pull/7901/files#r1379688238), [`eq_array`](https://github.com/apache/arrow-datafusion/pull/7901/files#r1379698814) and [`to_array_of_size`](https://github.com/apache/arrow-datafusion/pull/7901/files#r1381221881) so that these error becomes unreachable and these functions can panic. This should remove the impacts on client code. - the unimplemented errors. In many other `datafusion` code, a `NotImplemented` error is returned instead of using `unimplemented!`. IMHO returning an error makes more sense as user can choose whether to ignore the unimplemented errors instead of panicking and exiting. Would appreciate your thoughts on this. - All the `ArrowError`s and the "Invalid dictionary keys type" errors should be unreachable. I will change these back to panics. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
