[ https://issues.apache.org/jira/browse/HIVE-19723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16500903#comment-16500903 ]
Bryan Cutler commented on HIVE-19723:
-------------------------------------

> My understanding is that since the primary use-case for ArrowUtils is Python
> integration, some of the conversions are currently somewhat particular for
> Python. Perhaps Python/Pandas only supports MICROSECOND timestamps.

Python, with pandas and pyarrow, supports timestamps down to nanoseconds. The reason for using microseconds in Spark {{ArrowUtils}} is to match Spark's internal representation, which is in microseconds. This avoids any further conversions once the data is read into the Spark JVM.

> Arrow serde: "Unsupported data type: Timestamp(NANOSECOND, null)"
> -----------------------------------------------------------------
>
>                 Key: HIVE-19723
>                 URL: https://issues.apache.org/jira/browse/HIVE-19723
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Teddy Choi
>            Assignee: Teddy Choi
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 3.1.0, 4.0.0
>
>      Attachments: HIVE-19723.1.patch, HIVE-19732.2.patch
>
>
> Spark's Arrow support only provides Timestamp at MICROSECOND granularity.
> Spark 2.3.0 won't accept NANOSECOND. Switch it back to MICROSECOND.
> The unit test org.apache.hive.jdbc.TestJdbcWithMiniLlapArrow will just need
> to change the assertion to test microsecond. And we'll need to add this to
> documentation on supported datatypes.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)