[
https://issues.apache.org/jira/browse/HIVE-29419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
László Bodor updated HIVE-29419:
--------------------------------
Description:
This ticket is part of the Dockerized Hive and Tez initiative.
While a Hive Docker image was implemented in HIVE-26400, and a Tez AM image is
currently under development in TEZ-4682, it is still an open question how to
seamlessly integrate the Hive and Tez Docker containers, both at build time and
at runtime.
TEZ-4682 aims to build a generic Tez AM image, which is crucial for making Tez
a modern execution engine, but Hive has a lot of dependencies on Tez. This
makes it quite hard to develop the Hive (HiveServer2) and Tez AM Docker images
independently.
Consider the different classes used in the Tez AM.
!Screenshot 2026-01-27 at 14.51.39.png|width=462,height=227!
Every yellow class raises a separate question about "how Hive jars make their
way to an independent Tez image", and this is where this Jira could be a
game-changer. Consider the *HiveSplitGenerator* (in the hive-exec module) and
*LlapTaskCommunicator* (in the llap-tez module) classes. In the YARN world,
their localization was taken care of by YARN, but once we deploy loosely
coupled Docker containers, we can no longer rely on such a mechanism.
Hence, the proposal is to include the Tez jars in the Hive image (if they have
not already been included since HIVE-26400), and to create a Tez AM specific
entrypoint (and a separate Dockerfile if needed) that starts {*}DAGAppMaster{*}.
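To make the entrypoint idea more concrete, here is a minimal sketch of how such
an entrypoint might assemble a classpath from the Hive and Tez jars inside one
image and build the launch command for DAGAppMaster. It is only an
illustration, not the actual implementation: the directory names (/opt/hive/lib,
/opt/tez, and the conf directories) and the helper names collect_jars and
build_tez_am_command are hypothetical, and whatever environment and
configuration DAGAppMaster expects at startup is out of scope here.

```python
import glob
import os
from typing import List, Sequence

# Fully qualified name of the Tez AM main class named in the proposal.
TEZ_AM_MAIN_CLASS = "org.apache.tez.dag.app.DAGAppMaster"


def collect_jars(lib_dirs: Sequence[str]) -> List[str]:
    """Return the *.jar files directly under each directory.

    Directories are visited in the given order; jars within one
    directory are sorted by name so the classpath is deterministic.
    """
    jars: List[str] = []
    for lib_dir in lib_dirs:
        jars.extend(sorted(glob.glob(os.path.join(lib_dir, "*.jar"))))
    return jars


def build_tez_am_command(
    lib_dirs: Sequence[str],
    conf_dirs: Sequence[str] = (),
    jvm_opts: Sequence[str] = (),
) -> List[str]:
    """Build the argv list for a JVM that runs the Tez AM main class.

    Configuration directories come first on the classpath, followed by
    every jar found in lib_dirs (Hive jars such as hive-exec and
    llap-tez would sit next to the Tez jars in the same image).
    """
    classpath = os.pathsep.join(list(conf_dirs) + collect_jars(lib_dirs))
    return ["java", *jvm_opts, "-cp", classpath, TEZ_AM_MAIN_CLASS]


if __name__ == "__main__":
    # Hypothetical image layout; adjust to the real Hive image.
    command = build_tez_am_command(
        lib_dirs=["/opt/hive/lib", "/opt/tez"],
        conf_dirs=["/opt/hive/conf", "/opt/tez/conf"],
    )
    print(" ".join(command))
    # A real entrypoint would replace the current process instead of
    # printing, for example with os.execvp(command[0], command).
```

Because the Hive and Tez jars end up on the same classpath, classes like
HiveSplitGenerator would be visible to the AM without any YARN-style
localization step, which is the point of building the AM on top of the Hive
image.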
was:
This ticket is related to the Dockerized Hive and Tez initiative.
While Hive Docker was implemented in HIVE-26400, and a Tez AM image is
currently under development in TEZ-4682, there is an open question about how to
seamlessly integrate Hive and Tez docker containers (build and runtime also)
TEZ-4682 aims to build a generic Tez AM image, which is crucial for making Tez
a modern execution engine, while Hive has a lot of dependencies on Tez. This
makes the independent development of Hive (HiveServer2) and a Tez AM docker
images quite hard.
Consider the different classes used in !Screenshot 2026-01-27 at 14.51.39.png!
> Provide a Hive-specific docker image for Tez AM
> -----------------------------------------------
>
> Key: HIVE-29419
> URL: https://issues.apache.org/jira/browse/HIVE-29419
> Project: Hive
> Issue Type: Sub-task
> Reporter: László Bodor
> Priority: Major
> Attachments: Screenshot 2026-01-27 at 14.51.39.png
>
--
This message was sent by Atlassian Jira
(v8.20.10#820010)