[ https://issues.apache.org/jira/browse/HIVE-19723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16500903#comment-16500903 ]
Bryan Cutler commented on HIVE-19723:
-------------------------------------
> My understanding is that since the primary use-case for ArrowUtils is Python
> integration, some of the conversions are currently somewhat particular for
> Python. Perhaps Python/Pandas only supports MICROSECOND timestamps.
Python, with pandas and pyarrow, supports timestamps down to nanosecond
precision. The reason for using microseconds in Spark {{ArrowUtils}} is to
match Spark's internal representation, which is in microseconds. This avoids
any further conversions once the data is read into the Spark JVM.
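For reference, a minimal pandas/pyarrow sketch of that precision difference
(not from any patch here; the variable names are made up):
{code:python}
# pandas datetime64[ns] becomes Arrow timestamp[ns]; nanoseconds are preserved.
import pandas as pd
import pyarrow as pa

series = pd.Series([pd.Timestamp("2018-06-04 12:34:56.123456789")])
ns_array = pa.array(series)
print(ns_array.type)  # timestamp[ns]

# Casting to microseconds (what Spark expects) drops the last three digits;
# safe=False is needed because a safe cast refuses to truncate nanoseconds.
us_array = ns_array.cast(pa.timestamp("us"), safe=False)
print(us_array.type)  # timestamp[us]
{code}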
> Arrow serde: "Unsupported data type: Timestamp(NANOSECOND, null)"
> -----------------------------------------------------------------
>
> Key: HIVE-19723
> URL: https://issues.apache.org/jira/browse/HIVE-19723
> Project: Hive
> Issue Type: Bug
> Reporter: Teddy Choi
> Assignee: Teddy Choi
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.1.0, 4.0.0
>
> Attachments: HIVE-19723.1.patch, HIVE-19732.2.patch
>
>
> Spark's Arrow support only provides Timestamp at MICROSECOND granularity.
> Spark 2.3.0 won't accept NANOSECOND, so switch the Hive Arrow serde back to
> MICROSECOND. The unit test org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow
> will just need its assertion changed to check microseconds, and we'll need
> to add this to the documentation on supported data types.
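As a hedged illustration of what the quoted description asks for, this pyarrow
sketch contrasts the microsecond timestamp type that Spark 2.3 accepts with the
nanosecond type it rejects (the field name and time zone are illustrative only,
not Hive serde code):
{code:python}
import pyarrow as pa

# Microsecond precision is what Spark 2.3's Arrow integration expects for
# timestamp columns, so a serde emitting this unit is accepted.
accepted = pa.field("event_time", pa.timestamp("us", tz="UTC"))

# Nanosecond precision is what triggers the error in the issue title:
# "Unsupported data type: Timestamp(NANOSECOND, null)".
rejected = pa.field("event_time", pa.timestamp("ns"))

print(accepted.type)   # timestamp[us, tz=UTC]
print(rejected.type)   # timestamp[ns]
{code}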