Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/22913#discussion_r230953015
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowUtils.scala
---
@@ -71,6 +71,7 @@ object ArrowUtils {
case d: ArrowType.Decimal => DecimalType(d.getPrecision, d.getScale)
case date: ArrowType.Date if date.getUnit == DateUnit.DAY => DateType
case ts: ArrowType.Timestamp if ts.getUnit == TimeUnit.MICROSECOND => TimestampType
+ case date: ArrowType.Date if date.getUnit == DateUnit.MILLISECOND => TimestampType
--- End diff --
I think it should map to `DateType`, with any extra time truncated.
Looking at the Arrow format from
https://github.com/apache/arrow/blob/master/format/Schema.fbs
```
/// Date is either a 32-bit or 64-bit type representing elapsed time since UNIX
/// epoch (1970-01-01), stored in either of two units:
///
/// * Milliseconds (64 bits) indicating UNIX time elapsed since the epoch (no
///   leap seconds), where the values are evenly divisible by 86400000
/// * Days (32 bits) since the UNIX epoch
```
So a millisecond date is expected to be a whole number of days, with no
leftover milliseconds.
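
As a rough sketch of the truncation I have in mind (the `DateMillisToDays`
object and `millisToDays` helper are hypothetical names for illustration,
not existing code in `ArrowUtils`), the conversion to a day count could look
like this, assuming floor division is the behavior we want for values that
are not exact multiples of 86400000:

```scala
object DateMillisToDays {
  // Milliseconds in one day; per the Arrow format comment, millisecond
  // date values are expected to be evenly divisible by this.
  val MillisPerDay: Long = 86400000L

  // Hypothetical helper: convert epoch milliseconds to whole days since
  // 1970-01-01. floorDiv rounds toward negative infinity, so a pre-epoch
  // value carrying a stray time component still maps to its calendar day.
  // toIntExact throws ArithmeticException if the day count overflows Int.
  def millisToDays(millis: Long): Int =
    Math.toIntExact(Math.floorDiv(millis, MillisPerDay))
}
```

Using `Math.floorDiv` rather than `/` only matters for negative inputs that
are not whole days; for well-formed Arrow data both give the same result.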