andygrove commented on a change in pull request #8794:
URL: https://github.com/apache/arrow/pull/8794#discussion_r534510261
##########
File path: rust/arrow/src/compute/kernels/cast.rs
##########
@@ -376,6 +378,27 @@ pub fn cast(array: &ArrayRef, to_type: &DataType) -> Result<ArrayRef> {
     Int64 => cast_string_to_numeric::<Int64Type>(array),
     Float32 => cast_string_to_numeric::<Float32Type>(array),
     Float64 => cast_string_to_numeric::<Float64Type>(array),
+    Date32(DateUnit::Day) => {
+        use chrono::{NaiveDate, NaiveTime};
+        let zero_time = NaiveTime::from_hms(0, 0, 0);
+        let string_array = array.as_any().downcast_ref::<StringArray>().unwrap();
+        let mut builder = PrimitiveBuilder::<Date32Type>::new(string_array.len());
+        for i in 0..string_array.len() {
+            if string_array.is_null(i) {
+                builder.append_null()?;
+            } else {
+                match NaiveDate::parse_from_str(string_array.value(i), "%Y-%m-%d")
+                {
+                    Ok(date) => builder.append_value(
+                        (date.and_time(zero_time).timestamp() / SECONDS_IN_DAY)
+                            as i32,
+                    )?,
+                    Err(_) => builder.append_null()?, // not a valid date
Review comment:
Thanks @seddonm1, this is a great point. I am actually pretty familiar
with Spark's CAST and ANSI CAST support, and I agree that Spark has a lot of
dangerous behavior that hides data errors. I wasn't really thinking about this
while reviewing this PR.
I filed https://issues.apache.org/jira/browse/ARROW-10793 to facilitate this
discussion.
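
To make the concern concrete, here is a rough sketch (not the code in this PR) of the two behaviors side by side: a lenient cast that turns an unparseable string into a null, as the hunk above does, and a strict cast that returns an error instead. It uses only the standard library, so the hand-rolled `parse_date32` is a simplified stand-in for chrono's `NaiveDate::parse_from_str` and may accept or reject slightly different inputs. `CastMode`, `parse_date32`, `days_from_civil`, and `cast_dates` are hypothetical names for illustration, not part of the arrow crate.

```rust
/// Hypothetical switch between the two behaviors under discussion.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum CastMode {
    /// Unparseable input becomes null (what the hunk above does).
    Lenient,
    /// Unparseable input is reported as an error.
    Strict,
}

/// Days since 1970-01-01 for a proleptic Gregorian date.
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    // Treat March as the first month so the leap day falls at year end.
    let y = if m <= 2 { y - 1 } else { y };
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400);
    let mp = (m + 9) % 12; // March = 0, ..., February = 11
    let doy = (153 * mp + 2) / 5 + d - 1;
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    era * 146_097 + doe - 719_468
}

fn days_in_month(y: i64, m: i64) -> i64 {
    match m {
        1 | 3 | 5 | 7 | 8 | 10 | 12 => 31,
        4 | 6 | 9 | 11 => 30,
        2 if (y % 4 == 0 && y % 100 != 0) || y % 400 == 0 => 29,
        2 => 28,
        _ => 0, // invalid month: no day can be valid
    }
}

/// Simplified "YYYY-MM-DD" parser; returns days since the Unix epoch.
/// Assumption: years are limited to 1..=9999 to keep the arithmetic small.
pub fn parse_date32(s: &str) -> Option<i32> {
    let mut parts = s.splitn(3, '-');
    let y: i64 = parts.next()?.parse().ok()?;
    let m: i64 = parts.next()?.parse().ok()?;
    let d: i64 = parts.next()?.parse().ok()?;
    if !(1..=9999).contains(&y) || d < 1 || d > days_in_month(y, m) {
        return None;
    }
    Some(days_from_civil(y, m, d) as i32)
}

/// Casts optional strings to optional day counts under the given mode.
pub fn cast_dates(
    values: &[Option<&str>],
    mode: CastMode,
) -> Result<Vec<Option<i32>>, String> {
    let mut out = Vec::with_capacity(values.len());
    for v in values {
        match v {
            // Input nulls stay null in both modes.
            None => out.push(None),
            Some(s) => match (parse_date32(s), mode) {
                (Some(days), _) => out.push(Some(days)),
                (None, CastMode::Lenient) => out.push(None),
                (None, CastMode::Strict) => {
                    return Err(format!("cannot cast '{}' to Date32", s));
                }
            },
        }
    }
    Ok(out)
}
```

In lenient mode a bad value such as "not a date" is indistinguishable from a genuine null in the output, which is exactly how data errors get hidden. Strict mode surfaces the first bad value to the caller.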