Thanks Yifeng for raising this thread!

Very glad to see the topic of Tez on k8s. Running Tez on k8s would give Tez more
possibilities and elasticity. I think many users, including me, want to be involved
in the development of Tez on k8s. I am also thinking about what Hive should do to
launch a Tez on k8s job. :)
Anyway, let's move this significant feature forward to make Tez cloud-native!




Thanks,
Butao Zhang
---- Replied Message ----
| From | YifengLu<noisymons...@163.com> |
| Date | 1/30/2024 21:33 |
| To | <dev@tez.apache.org> |
| Subject | Re:Re: Re: tez on k8s |
Thank you very much for providing us with the PR and giving us insights into 
the implementation approach. It has been of great help to us.


Actually, we have made some attempts before. Our scenario at the time may have
been somewhat unique, as we had our own client and used Tez AM with our custom
Worker. We have completed the Kubernetes adaptation of Tez AM and our custom
Worker: Tez AM uses the default K8s scheduler for resource allocation and
manages the startup and shutdown of the custom Worker. We have used it in
production, where it has served a significant number of batch processing jobs.


However, this differs significantly from the community's scenario, so we will 
carefully consider how to contribute back to the community.


Best regards,
Yifeng Lu


At 2024-01-29 16:52:50, "László Bodor" <bodorlaszlo0...@gmail.com> wrote:
Hi Yifeng!

I can say that the most significant step toward Tez on Kubernetes was the
unmanaged sessions umbrella: https://issues.apache.org/jira/browse/TEZ-3991
I can see some of those tasks are still not committed, however, unmanaged
sessions have been working for years in a Cloudera product, so we would be
happy to review these if someone picked them up and created a PR.

Once unmanaged sessions are done, we can explore other ways. Tez is still
tightly coupled with Yarn I believe, so on the road to k8s, we have to
explore the following:
1. *Tez AM* to be started by another resource manager than Yarn (which is
slightly different from the unmanaged sessions above, as unmanaged sessions
start independently of the client (e.g. Hive's hiveserver2)) <- this is not
a must-have as long as the unmanaged sessions approach is fine for the K8s
use-cases.
2. *Tez containers*: this is a huge gap: the product I know works with Hive
LLAP as executors, so similarly to the sessions, the Hive LLAP containers
start independently of Tez, so if we want to make this work in Tez out of
the box, we need to implement new things like starting a TezChild container,
or rather a group of TezChild containers, as a k8s pod for instance, just
thinking aloud.

Thanks for coming up with this topic. Let us get back to this later if there
are any plans; in the meantime, finishing the upstream contribution of TEZ-3991
is more than welcome.

Regards,
Laszlo Bodor


YifengLu <noisymons...@163.com> wrote (on Mon, Jan 29, 2024, 3:52):

Thank you very much for your response. We will continue to explore Tez on
Kubernetes, and if there are any further developments, we will reach out to
the community. Once again, thank you.
Best regards,
Yifeng Lu
