[ https://issues.apache.org/jira/browse/HIVE-29419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

László Bodor updated HIVE-29419:
--------------------------------
    Description: 
This ticket is related to the Dockerized Hive and Tez initiative.
While Hive Docker was implemented in HIVE-26400, and a Tez AM image is 
currently under development in TEZ-4682, there is an open question about how to 
seamlessly integrate the Hive and Tez Docker containers (at both build time and 
runtime).
TEZ-4682 aims to build a generic Tez AM image, which is crucial for making Tez 
a modern execution engine, but Hive has a lot of dependencies on Tez. This 
makes developing the Hive (HiveServer2) and Tez AM Docker images independently 
quite hard.
Consider the different classes used in the Tez AM.
!Screenshot 2026-01-27 at 14.51.39.png|width=462,height=227!

Every yellow class raises a separate question of "how do Hive jars make their 
way into an independent Tez image", and this is where this Jira could be a 
game-changer. Consider the *HiveSplitGenerator* (in the hive-exec module) and 
*LlapTaskCommunicator* (in the llap-tez module) classes. In the YARN world, 
their localization was taken care of by YARN, but once we deploy loosely 
coupled Docker containers, we cannot rely on such a mechanism anymore.

Hence, the proposal is to include the Tez jars in the Hive image (if they are 
not already included since HIVE-26400), and to create a Tez AM-specific 
entrypoint (and a separate Dockerfile if needed) that starts {*}DAGAppMaster{*}.
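
As a rough illustration of the proposed entrypoint (a minimal sketch under 
stated assumptions, not an agreed design), the launch step could boil down to 
putting both the Hive and the Tez jars on one classpath and starting 
DAGAppMaster. The directory paths and the helper name build_tez_am_command 
below are hypothetical; org.apache.tez.dag.app.DAGAppMaster is the fully 
qualified name of the class mentioned above. Any environment variables or 
configuration the AM additionally needs at startup are outside this sketch.

```python
import os
import shlex

# Fully qualified name of the Tez AM main class referred to in this ticket.
TEZ_AM_MAIN_CLASS = "org.apache.tez.dag.app.DAGAppMaster"


def build_tez_am_command(hive_lib_dir, tez_lib_dir, java_bin="java", extra_opts=()):
    """Return the argv list for starting the Tez AM with Hive and Tez jars.

    hive_lib_dir and tez_lib_dir are assumed (hypothetical) locations of the
    jars inside the image. The "dir/*" form is the standard JVM classpath
    wildcard, which picks up every jar in that directory.
    """
    classpath = os.pathsep.join([
        os.path.join(hive_lib_dir, "*"),
        os.path.join(tez_lib_dir, "*"),
    ])
    return [java_bin, *extra_opts, "-cp", classpath, TEZ_AM_MAIN_CLASS]


if __name__ == "__main__":
    # Hypothetical paths; the real image layout may differ.
    cmd = build_tez_am_command("/opt/hive/lib", "/opt/tez", extra_opts=["-Xmx1g"])
    print(shlex.join(cmd))
```

A real entrypoint would most likely be a shell script that does the equivalent 
and then replaces itself with the java process; the point here is only that 
the Hive classes end up on the AM classpath without any YARN localization.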

  was:
This ticket is related to the Dockerized Hive and Tez initiative.
While Hive Docker was implemented in HIVE-26400, and a Tez AM image is 
currently under development in TEZ-4682, there is an open question about how to 
seamlessly integrate Hive and Tez docker containers (build and runtime also)
TEZ-4682 aims to build a generic Tez AM image, which is crucial for making Tez 
a modern execution engine, while Hive has a lot of dependencies on Tez. This 
makes the independent development of Hive (HiveServer2) and a Tez AM docker 
images quite hard.
Consider the different classes used in  !Screenshot 2026-01-27 at 14.51.39.png! 


> Provide a Hive-specific docker image for Tez AM
> -----------------------------------------------
>
>                 Key: HIVE-29419
>                 URL: https://issues.apache.org/jira/browse/HIVE-29419
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: László Bodor
>            Priority: Major
>         Attachments: Screenshot 2026-01-27 at 14.51.39.png
>
>
> This ticket is related to the Dockerized Hive and Tez initiative.
> While Hive Docker was implemented in HIVE-26400, and a Tez AM image is 
> currently under development in TEZ-4682, there is an open question about how 
> to seamlessly integrate the Hive and Tez Docker containers (at both build 
> time and runtime).
> TEZ-4682 aims to build a generic Tez AM image, which is crucial for making 
> Tez a modern execution engine, but Hive has a lot of dependencies on Tez. 
> This makes developing the Hive (HiveServer2) and Tez AM Docker images 
> independently quite hard.
> Consider the different classes used in the Tez AM.
> !Screenshot 2026-01-27 at 14.51.39.png|width=462,height=227!
> Every yellow class raises a separate question of "how do Hive jars make 
> their way into an independent Tez image", and this is where this Jira could 
> be a game-changer. Consider the *HiveSplitGenerator* (in the hive-exec 
> module) and *LlapTaskCommunicator* (in the llap-tez module) classes. In the 
> YARN world, their localization was taken care of by YARN, but once we deploy 
> loosely coupled Docker containers, we cannot rely on such a mechanism anymore.
> Hence, the proposal is to include the Tez jars in the Hive image (if they 
> are not already included since HIVE-26400), and to create a Tez AM-specific 
> entrypoint (and a separate Dockerfile if needed) that starts 
> {*}DAGAppMaster{*}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
