alamb commented on a change in pull request #8611: URL: https://github.com/apache/arrow/pull/8611#discussion_r520546828
##########
File path: rust/arrow/src/csv/reader.rs
##########

@@ -219,6 +226,35 @@ pub fn infer_schema_from_files(
     Schema::try_merge(&schemas)
 }
 
+/// Parses a string into the specified `ArrowPrimitiveType`.
+fn parse_field<T: ArrowPrimitiveType>(s: &str) -> Result<T::Native> {
+    let from_ymd = chrono::NaiveDate::from_ymd;
+    let since = chrono::NaiveDate::signed_duration_since;
+
+    match T::DATA_TYPE {
+        DataType::Boolean => s
+            .to_lowercase()
+            .parse::<T::Native>()
+            .map_err(|_| ArrowError::ParseError("Error parsing boolean".to_string())),
+        DataType::Date32(DateUnit::Day) => {
+            let days = chrono::NaiveDate::parse_from_str(s, "%Y-%m-%d")
+                .map(|t| since(t, from_ymd(1970, 1, 1)).num_days() as i32);
+            days.map(|t| unsafe { std::mem::transmute_copy::<i32, T::Native>(&t) })

Review comment:
Another alternative would be to extend the `ArrowNativeType` trait with `from_i32` and `from_i64`, following the model of `from_usize`, and then implement those functions for `i32` and `i64` respectively (as those are the underlying native types).

I tried this approach out on a branch in case you are interested / want to take the change:

Commit with change: https://github.com/alamb/arrow/commit/cc61e7a3fd55fcbc00f400be1ae74d64d7de6d21

The branch (with your change) is here: https://github.com/alamb/arrow/tree/alamb/less-unsafe

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
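
For readers who do not want to open the linked commit, here is a rough, self-contained sketch of the idea in the review comment: give the native-type trait `from_i32` / `from_i64` methods that default to `None`, and override them only for the type that actually backs the value. The trait `NativeType` and the helper `date32_to_native` are hypothetical stand-ins written for illustration; they are not the real `ArrowNativeType` definition, and the linked commit may differ in detail.

```rust
/// Hypothetical, simplified stand-in for arrow's `ArrowNativeType`.
/// Only the two conversion hooks discussed in the review are modelled.
pub trait NativeType: Copy + std::fmt::Debug {
    /// Convert from `i32`. Types not backed by `i32` keep the default `None`.
    fn from_i32(_: i32) -> Option<Self> {
        None
    }

    /// Convert from `i64`. Types not backed by `i64` keep the default `None`.
    fn from_i64(_: i64) -> Option<Self> {
        None
    }
}

// `i32` is the native type behind Date32, so it overrides `from_i32`.
impl NativeType for i32 {
    fn from_i32(v: i32) -> Option<Self> {
        Some(v)
    }
}

// `i64` is the native type behind 64-bit date/time types, so it overrides `from_i64`.
impl NativeType for i64 {
    fn from_i64(v: i64) -> Option<Self> {
        Some(v)
    }
}

// A type with no integer backing simply inherits both defaults.
impl NativeType for f64 {}

/// Hypothetical helper showing a safe replacement for the
/// `transmute_copy::<i32, T::Native>` call in the diff: a mismatch between
/// the computed `i32` and the requested native type becomes an error value
/// instead of undefined behaviour.
pub fn date32_to_native<N: NativeType>(days: i32) -> Result<N, String> {
    N::from_i32(days).ok_or_else(|| {
        format!("cannot represent {} days as the requested native type", days)
    })
}
```

With this shape, `date32_to_native::<i32>(days)` returns the day count unchanged, while requesting a native type that is not backed by `i32` yields an `Err` rather than reinterpreting bytes.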