jorgecarleitao commented on a change in pull request #8611:
URL: https://github.com/apache/arrow/pull/8611#discussion_r520319490



##########
File path: rust/arrow/src/csv/reader.rs
##########
@@ -219,6 +226,35 @@ pub fn infer_schema_from_files(
     Schema::try_merge(&schemas)
 }
 
+/// Parses a string into the specified `ArrowPrimitiveType`.
+fn parse_field<T: ArrowPrimitiveType>(s: &str) -> Result<T::Native> {
+    let from_ymd = chrono::NaiveDate::from_ymd;
+    let since = chrono::NaiveDate::signed_duration_since;
+
+    match T::DATA_TYPE {
+        DataType::Boolean => s
+            .to_lowercase()
+            .parse::<T::Native>()
+            .map_err(|_| ArrowError::ParseError("Error parsing boolean".to_string())),
+        DataType::Date32(DateUnit::Day) => {
+            let days = chrono::NaiveDate::parse_from_str(s, "%Y-%m-%d")
+                .map(|t| since(t, from_ymd(1970, 1, 1)).num_days() as i32);
+            days.map(|t| unsafe { std::mem::transmute_copy::<i32, T::Native>(&t) })

Review comment:
       What @vertexclique said: `transmute` is one of the most unsafe operations
in Rust, and this can easily lead to undefined behavior if it overflows.
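
A minimal sketch of one way to avoid the `transmute_copy` call, assuming the native type can be built from an `i32` through the standard `TryFrom` trait. The name `days_to_native` and the `String` error are hypothetical stand-ins for illustration only; the real code would need a bound that `T::Native` actually satisfies and would presumably return `ArrowError`.

```rust
use std::convert::TryFrom; // already in the prelude on the 2021 edition

/// Hypothetical helper: converts a day count into the target native type
/// without `unsafe`. `N` stands in for `T::Native`; this is an assumption,
/// not the bound used in the pull request.
pub fn days_to_native<N: TryFrom<i32>>(days: i32) -> Result<N, String> {
    // `try_from` checks the range at runtime, so a value that does not fit
    // becomes an error instead of a reinterpretation of raw bytes.
    N::try_from(days)
        .map_err(|_| format!("{} days does not fit in the target type", days))
}
```

With this shape, `days_to_native::<i32>(18_000)` succeeds, while `days_to_native::<i8>(18_000)` returns an error rather than silently producing a wrong value.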

##########
File path: rust/arrow/src/csv/reader.rs
##########
@@ -67,6 +67,9 @@ lazy_static! {
         .case_insensitive(true)
         .build()
         .unwrap();
+    static ref DATE_RE: Regex = Regex::new(r"^\d\d\d\d-\d\d-\d\d$").unwrap();

Review comment:
       Isn't there a `\d{4}` or something like that? It may make the pattern a
bit easier to read and more expressive, IMO.





