alamb commented on code in PR #15537:
URL: https://github.com/apache/datafusion/pull/15537#discussion_r2042846106


##########
datafusion/common/src/config.rs:
##########
@@ -459,6 +459,14 @@ config_namespace! {
         /// BLOB instead.
         pub binary_as_string: bool, default = false
 
+        /// (reading) If true, parquet reader will read columns of
+        /// physical type int96 as originating from a different resolution
+        /// than nanosecond. This is useful for reading data from systems like Spark
+        /// which stores microsecond resolution timestamps in an int96 allowing it
+        /// to write values with a larger date range than 64-bit timestamps with
+        /// nanosecond resolution.
+        pub coerce_int96: Option<String>, transform = str::to_lowercase, default = None

Review Comment:
   Given how dominant Spark is and how rarely int96 is used outside the Spark
ecosystem, I was thinking that if anyone has such a file, we should most likely
treat the values as microseconds.
   
   I don't have a strong preference; I was just trying to come up with a way to
keep the code less complicated.
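
   To make the range argument concrete, here is a minimal sketch of why the
target resolution matters. It is not the DataFusion or parquet-rs
implementation: the helper names (`split_int96`, `int96_to_micros`,
`int96_to_nanos`) are hypothetical, and it assumes the common INT96 layout of
8 little-endian bytes of nanoseconds within the day followed by 4 little-endian
bytes holding the Julian day number.

```rust
/// Julian day number of the Unix epoch (1970-01-01).
const JULIAN_DAY_OF_EPOCH: i64 = 2_440_588;
const MICROS_PER_DAY: i64 = 86_400_000_000;
const NANOS_PER_DAY: i64 = 86_400_000_000_000;

/// Hypothetical helper: split a raw INT96 value into
/// (nanoseconds within the day, Julian day number).
/// Assumes 8 LE bytes of nanos followed by 4 LE bytes of Julian day.
fn split_int96(raw: [u8; 12]) -> (i64, i64) {
    let mut nanos_bytes = [0u8; 8];
    nanos_bytes.copy_from_slice(&raw[0..8]);
    let mut day_bytes = [0u8; 4];
    day_bytes.copy_from_slice(&raw[8..12]);
    (
        i64::from_le_bytes(nanos_bytes),
        i64::from(u32::from_le_bytes(day_bytes)),
    )
}

/// Microseconds since the Unix epoch, or None if the value does not fit in i64.
fn int96_to_micros(raw: [u8; 12]) -> Option<i64> {
    let (nanos_of_day, julian_day) = split_int96(raw);
    (julian_day - JULIAN_DAY_OF_EPOCH)
        .checked_mul(MICROS_PER_DAY)?
        .checked_add(nanos_of_day / 1_000)
}

/// Nanoseconds since the Unix epoch, or None if the value does not fit in i64.
/// An i64 nanosecond count only covers roughly the years 1677 to 2262.
fn int96_to_nanos(raw: [u8; 12]) -> Option<i64> {
    let (nanos_of_day, julian_day) = split_int96(raw);
    (julian_day - JULIAN_DAY_OF_EPOCH)
        .checked_mul(NANOS_PER_DAY)?
        .checked_add(nanos_of_day)
}
```

   With these helpers, a value for 3000-01-01 (Julian day 2816788) converts to
microseconds without trouble, while the nanosecond conversion overflows and
returns `None`. That is the case a microsecond default would handle for
Spark-written files.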



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

