seddonm1 commented on a change in pull request #8794:
URL: https://github.com/apache/arrow/pull/8794#discussion_r534504272
##########
File path: rust/arrow/src/compute/kernels/cast.rs
##########
@@ -376,6 +378,27 @@ pub fn cast(array: &ArrayRef, to_type: &DataType) -> Result<ArrayRef> {
         Int64 => cast_string_to_numeric::<Int64Type>(array),
         Float32 => cast_string_to_numeric::<Float32Type>(array),
         Float64 => cast_string_to_numeric::<Float64Type>(array),
+        Date32(DateUnit::Day) => {
+            use chrono::{NaiveDate, NaiveTime};
+            let zero_time = NaiveTime::from_hms(0, 0, 0);
+            let string_array = array.as_any().downcast_ref::<StringArray>().unwrap();
+            let mut builder = PrimitiveBuilder::<Date32Type>::new(string_array.len());
+            for i in 0..string_array.len() {
+                if string_array.is_null(i) {
+                    builder.append_null()?;
+                } else {
+                    match NaiveDate::parse_from_str(string_array.value(i), "%Y-%m-%d") {
+                        Ok(date) => builder.append_value(
+                            (date.and_time(zero_time).timestamp() / SECONDS_IN_DAY) as i32,
+                        )?,
+                        Err(_) => builder.append_null()?, // not a valid date
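For context, here is a minimal, hypothetical usage sketch (not part of the diff) showing how this new cast arm behaves, assuming the arrow crate APIs as of this PR (`cast`, `DataType::Date32(DateUnit::Day)`, `StringArray`, `Date32Array`):

    use std::sync::Arc;
    use arrow::array::{ArrayRef, Date32Array, StringArray};
    use arrow::compute::kernels::cast::cast;
    use arrow::datatypes::{DataType, DateUnit};
    use arrow::error::Result;

    fn main() -> Result<()> {
        let strings: ArrayRef = Arc::new(StringArray::from(vec![
            Some("2020-09-08"),
            None,
            Some("not a date"),
        ]));
        let dates = cast(&strings, &DataType::Date32(DateUnit::Day))?;
        let dates = dates.as_any().downcast_ref::<Date32Array>().unwrap();
        assert!(dates.is_null(1)); // null input stays null
        assert!(dates.is_null(2)); // invalid input is silently turned into null
        Ok(())
    }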
Review comment:
@andygrove this is a fundamental question about ANSI-style SQL type support versus looser compatibility.
Apache Spark has put a lot of work into adding ANSI behaviour
(https://spark.apache.org/docs/3.0.0/sql-ref-ansi-compliance.html#type-conversion)
so that invalid values raise errors, which I believe is the correct approach,
rather than the error-suppression strategy Spark inherited from Hive SQL.
This is not the place for that discussion, but it feels like a good time to
address this kind of fundamental question.
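To make the ANSI alternative concrete, here is a hedged sketch (not something this PR implements) of what the `Err(_)` arm above could do instead, failing the whole cast on the first unparseable string; `ArrowError::ComputeError` is used purely for illustration:

    Err(_) => {
        // ANSI-style behaviour: surface the bad value as an error
        // instead of suppressing it with a null, as the legacy
        // Hive/Spark mode does.
        return Err(ArrowError::ComputeError(format!(
            "Cannot cast string '{}' to Date32",
            string_array.value(i)
        )));
    }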