Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/18664#discussion_r145729785
--- Diff:
sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowUtils.scala
---
@@ -42,6 +43,13 @@ object ArrowUtils {
case StringType => ArrowType.Utf8.INSTANCE
case BinaryType => ArrowType.Binary.INSTANCE
case DecimalType.Fixed(precision, scale) => new ArrowType.Decimal(precision, scale)
+ case DateType => new ArrowType.Date(DateUnit.DAY)
+ case TimestampType =>
+ timeZoneId match {
+ case Some(id) => new ArrowType.Timestamp(TimeUnit.MICROSECOND, id)
--- End diff --
OK, I understand the reason now. Setting the timezone here is better than
setting it on the Python side. My only question is: why is the timezone id optional?
---