[
https://issues.apache.org/jira/browse/HIVE-28650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17902076#comment-17902076
]
Butao Zhang commented on HIVE-28650:
------------------------------------
[~glapark] Thanks for your testing!
{code:java}
ORC 2.0.3 actually executes more S3 operations than ORC 1.9.4, which is a bit
surprising.{code}
I have not yet looked into the Hadoop Vectored IO feature in depth. But the test
result showing more GetObject calls in 2.0.3 than in 1.9.4 surprised me, too.
I found this Hadoop Vectored IO doc link
[https://docs.google.com/presentation/d/1U5QRN4etbM7gkbnGO3OW4sCfUZx9LqJN/]
which was written by [[email protected]]. It would be great if
[[email protected]] can give some guidance.
> Upgrade Apache ORC version to 2.0.3
> -----------------------------------
>
> Key: HIVE-28650
> URL: https://issues.apache.org/jira/browse/HIVE-28650
> Project: Hive
> Issue Type: Improvement
> Security Level: Public(Viewable by anyone)
> Reporter: Butao Zhang
> Priority: Major
>
> ORC 2.0.x added the Hadoop Vectored IO feature in ORC-1251.
> We can try upgrading ORC to the latest 2.0.x version to make this feature work
> in Hive.
> But ORC 2.0.x is built on JDK17+, so we need to upgrade Hive's JDK to 17+
> first. This depends on HIVE-26473, the ticket for upgrading to JDK 17.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)