Re: [PR] HIVE-29492: Add AutoScaling to K8s operator [hive]

via GitHub Thu, 18 Jun 2026 06:38:01 -0700


zhangbutao commented on PR #6507:
URL: https://github.com/apache/hive/pull/6507#issuecomment-4742498986


   > Thanks @zhangbutao for the great insights!!!
   > 
   > You hit the nail on the head regarding the shift from "YARN-thinking" to
   > "Kubernetes-native thinking."
   > 
   > 1. Physical vs. Logical Isolation
   >    You are completely right about Workload Management (WLM). Trying to
   >    carve up a single JVM's heap and CPU cycles among competing tenants is
   >    incredibly complex and never gives you 100% true isolation. By shifting
   >    to Kubernetes, we get true physical isolation via namespaces, cgroups,
   >    and dedicated pod resources.
   > 2. How this could work technically
   >    What you are describing is entirely feasible. The LLAP instances
   >    register themselves in ZooKeeper under a specific app name (defaulting
   >    to @llap0). If we update the Operator to support an array of LLAP
   >    profiles (e.g., llap-cluster1, llap-cluster2), the Operator would spin
   >    up multiple independent StatefulSets, each registering to a different
   >    ZK path.
   > 
   > Then, exactly as you said, a user simply sets
   > hive.llap.daemon.service.hosts=@llap-cluster1 in their JDBC string or
   > session. TezAM would look up that specific ZK path, find those specific
   > pods, and route the fragments exclusively to that tenant's dedicated
   > executors.
   > 
   > 3. The Autoscaling Synergy
   >    The best part is how it ties into the autoscaling logic in this PR!
   >    Because each tenant's LLAP cluster would be its own independent K8s
   >    StatefulSet, the autoscaler would scale llap-cluster1 and llap-cluster2
   >    completely independently. If user1 isn't running queries, their
   >    dedicated LLAP cluster scales to zero, costing nothing, while user2 can
   >    comfortably stay scaled up to 100 pods.
   > 
   > This is a fantastic concept for multi-tenancy. Since the core autoscaling
   > loop and K8s operator primitives are established in this PR, building out
   > "Multi-Tenant LLAP Compute Groups" on top of it feels like a perfect
   > follow-up Jira ticket. I think it is definitely worth exploring! I will
   > definitely give it a shot :-)
   
   
   Your thoughts align completely with mine: this idea is both feasible and 
highly valuable. I came up with it because other MPP-architecture OLAP 
engines, such as StarRocks and Doris, already offer similar compute-group 
functionality that effectively isolates multi-tenant workloads. That gives us 
good reason to believe the approach is practical, so it is well worth 
exploring this capability in depth.  Thanks @ayushtkn 
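
   To make the idea concrete, here is a minimal sketch of the routing and
   independent scaling discussed above. It is only an illustration under
   assumptions: `LlapProfile`, `jdbc_url`, `desired_replicas`, the
   `queries_per_pod` ratio, and the host `hs2.example.com` are hypothetical
   names, not the operator's CRD fields or the autoscaling rule used in this
   PR. The only pieces taken from the discussion are the
   `hive.llap.daemon.service.hosts=@<name>` setting and the profile names.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class LlapProfile:
    """Hypothetical description of one tenant's LLAP compute group."""
    name: str
    min_replicas: int
    max_replicas: int

    @property
    def service_hosts(self) -> str:
        # Value a session would set for hive.llap.daemon.service.hosts.
        return f"@{self.name}"


def jdbc_url(host: str, port: int, database: str, profile: LlapProfile) -> str:
    """Build a HiveServer2 JDBC URL that pins the session to one compute group.

    Hive conf overrides follow the '?' in a jdbc:hive2 URL.
    """
    return (
        f"jdbc:hive2://{host}:{port}/{database}"
        f"?hive.llap.daemon.service.hosts={profile.service_hosts}"
    )


def desired_replicas(profile: LlapProfile, active_queries: int,
                     queries_per_pod: int) -> int:
    """Assumed scaling rule: one pod per `queries_per_pod` active queries,
    clamped to the profile's own bounds, so each group scales independently."""
    if queries_per_pod <= 0:
        raise ValueError("queries_per_pod must be positive")
    needed = math.ceil(active_queries / queries_per_pod)
    return max(profile.min_replicas, min(profile.max_replicas, needed))


cluster1 = LlapProfile("llap-cluster1", min_replicas=0, max_replicas=100)
cluster2 = LlapProfile("llap-cluster2", min_replicas=0, max_replicas=100)

# user1 is idle while user2 is busy; each group is sized on its own load.
print(jdbc_url("hs2.example.com", 10000, "default", cluster1))
print(desired_replicas(cluster1, active_queries=0, queries_per_pod=4))
print(desired_replicas(cluster2, active_queries=1000, queries_per_pod=4))
```

   With these made-up numbers the idle group is sized to zero pods while the
   busy one is capped at its own maximum, which is the behaviour described in
   point 3 of the quoted comment.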


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

