Copilot commented on code in PR #6557:
URL: https://github.com/apache/hive/pull/6557#discussion_r3447388208


##########
packaging/src/kubernetes/src/java/org/apache/hive/kubernetes/operator/reconciler/HiveWorkflowSpec.java:
##########
@@ -129,44 +109,21 @@ private static List<DependentResourceSpec> buildSpecs() {
         Set.of(METASTORE_CONFIGMAP),
         null, null, null, METASTORE_ENABLED, null));
 
+    // --- Shared Scratch PVC (Required for HS2 to TezAM communication) ---
+    specs.add(new DependentResourceSpec(
+        ScratchPvcDependent.class, SCRATCH_PVC,
+        Set.of(), null, null, null, TEZAM_ENABLED, null));
+
     specs.add(new DependentResourceSpec(
         HiveServer2DeploymentDependent.class, HIVESERVER2_DEPLOYMENT,
-        Set.of(HIVESERVER2_CONFIGMAP, HADOOP_CONFIGMAP),
+        Set.of(HIVESERVER2_CONFIGMAP, HADOOP_CONFIGMAP, SCRATCH_PVC),
         null, hs2Precondition(), null, null, null));

Review Comment:
   HiveServer2DeploymentDependent is declared to depend on the scratch PVC even 
when TezAM is disabled. Since ScratchPvcDependent is only created when 
`tezAm.enabled` is true, disabling TezAM can block HS2 reconciliation waiting 
on a dependency that will never be created. HS2 already conditionally mounts 
the PVC only when TezAM is enabled, so the workflow dependency is unnecessary.



##########
service/src/java/org/apache/hive/service/cli/session/SessionManager.java:
##########
@@ -209,6 +211,52 @@ public Integer getValue() {
     metrics.addRatio(MetricsConstant.HS2_AVG_ACTIVE_SESSION_TIME, 
activeSessionTime, activeSessionCnt);
   }
 
+  // Registers a per-LLAP-target session gauge on first use.
+  private void registerLlapTargetGaugeIfNeeded(HiveSession session) {
+    try {
+      String sanitized = extractLlapTarget(session.getSessionConf());
+      if (sanitized == null) {
+        return;
+      }
+      if (registeredLlapTargetGauges.contains(sanitized)) {
+        return; // already registered
+      }
+      if (!registeredLlapTargetGauges.add(sanitized)) {
+        return; // lost race with another thread
+      }
+      Metrics metrics = MetricsFactory.getInstance();
+      if (metrics == null) {
+        return;
+      }

Review Comment:
   `registeredLlapTargetGauges` is updated before verifying that 
`MetricsFactory.getInstance()` is available. If `metrics` is null, the method 
returns after adding the target to the set, preventing the gauge from ever 
being registered later. Also, the prior `contains()` check is redundant given 
`add()` already handles concurrency.



##########
packaging/src/kubernetes/README.md:
##########
@@ -58,8 +58,8 @@ mvn clean package -pl packaging/src/kubernetes -Pkubernetes 
-DskipTests
 ## Quick Start (Helm)
 
 The Helm chart defaults to a **Full-HA** cluster (Metastore x2, HiveServer2 x2,
-LLAP x2, TezAM x2). You only need to provide three things: database, ZooKeeper,
-and storage.
+LLAP x2, TezAM x2 — one TezAM per LLAP cluster). You only need to provide three
+things: database, ZooKeeper, and storage.
 

Review Comment:
   The PR description says there are no user-facing changes, but this PR 
changes the HiveCluster CRD/Helm values and documented configuration surface 
(e.g. `llap` → `llapClusters`, new `llapClusterRouting`, per-LLAP TezAM). 
Please update the PR description to reflect that operator users will need to 
adjust their Helm/CR specs.



##########
packaging/src/kubernetes/src/java/org/apache/hive/kubernetes/operator/model/spec/LlapSpec.java:
##########
@@ -52,23 +58,44 @@ public record LlapSpec(
     @JsonPropertyDescription("Memory in MB per LLAP daemon instance")
     @Default("1024")
     Integer memoryMb,
-    @JsonPropertyDescription("LLAP service hosts identifier for ZooKeeper 
registration")
+    @JsonPropertyDescription("LLAP service hosts identifier for ZooKeeper 
registration. "
+        + "Defaults to @{name} (e.g. @llap0).")
     String serviceHosts,
     @JsonPropertyDescription("Readiness probe configuration")
     ProbeSpec readinessProbe,
     @JsonPropertyDescription("Autoscaling configuration (operator-driven, no 
external dependencies)")
-    AutoscalingSpec autoscaling) {
+    AutoscalingSpec autoscaling,
+    @JsonPropertyDescription("Per-LLAP TezAM configuration. Each LLAP cluster 
gets its own TezAM "
+        + "with independent replica count and autoscaling.")
+    LlapTezAmSpec tezAm) {
+
+  /** Per-LLAP-cluster TezAM replica and autoscaling overrides. */
+  public record LlapTezAmSpec(
+      @JsonPropertyDescription("Max number of TezAM replicas for this LLAP 
cluster")
+      @Default("1")
+      Integer replicas,
+      @JsonPropertyDescription("Autoscaling configuration for this LLAP 
cluster's TezAM")
+      AutoscalingSpec autoscaling) {
+
+    public LlapTezAmSpec {
+      replicas = replicas != null ? replicas : 1;
+      autoscaling = autoscaling != null ? autoscaling : new AutoscalingSpec(
+          false, 0, 0, 60, 600, 120, 10, 0, 0, null);
+    }
+  }
 
   public LlapSpec {
+    Objects.requireNonNull(name, "llapClusters[].name is required");
     replicas = replicas != null ? replicas : 1;
     enabled = enabled != null ? enabled : true;
     executors = executors != null ? executors : 1;
     memoryMb = memoryMb != null ? memoryMb : 1024;
-    serviceHosts = serviceHosts != null ? serviceHosts : "@llap0";
+    serviceHosts = serviceHosts != null ? serviceHosts : "@" + name;
     extraVolumes = extraVolumes != null ? extraVolumes : List.of();

Review Comment:
   `llapClusters[].name` is used in Kubernetes resource names/labels and also 
embedded into Prometheus metric names (e.g. `hs2_llap_target_sessions_{name}`). 
Without validation, names containing characters like '-' or '.' can result in 
invalid Prometheus metric names (the JMX exporter will sanitize them, causing 
mismatches with the operator’s expected metric keys) and/or invalid Kubernetes 
resource names. Also, allowing `serviceHosts` to diverge from `@{name}` will 
break the operator’s per-target session metric lookup, since HS2 keys off 
`hive.llap.daemon.service.hosts` while the operator keys off `llapSpec.name()`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to