Copilot commented on code in PR #6557:
URL: https://github.com/apache/hive/pull/6557#discussion_r3447388208
##########
packaging/src/kubernetes/src/java/org/apache/hive/kubernetes/operator/reconciler/HiveWorkflowSpec.java:
##########
@@ -129,44 +109,21 @@ private static List<DependentResourceSpec> buildSpecs() {
Set.of(METASTORE_CONFIGMAP),
null, null, null, METASTORE_ENABLED, null));
+ // --- Shared Scratch PVC (Required for HS2 to TezAM communication) ---
+ specs.add(new DependentResourceSpec(
+ ScratchPvcDependent.class, SCRATCH_PVC,
+ Set.of(), null, null, null, TEZAM_ENABLED, null));
+
specs.add(new DependentResourceSpec(
HiveServer2DeploymentDependent.class, HIVESERVER2_DEPLOYMENT,
- Set.of(HIVESERVER2_CONFIGMAP, HADOOP_CONFIGMAP),
+ Set.of(HIVESERVER2_CONFIGMAP, HADOOP_CONFIGMAP, SCRATCH_PVC),
null, hs2Precondition(), null, null, null));
Review Comment:
HiveServer2DeploymentDependent is declared to depend on the scratch PVC even
when TezAM is disabled. Since ScratchPvcDependent is only created when
`tezAm.enabled` is true, disabling TezAM can block HS2 reconciliation waiting
on a dependency that will never be created. HS2 already conditionally mounts
the PVC only when TezAM is enabled, so the workflow dependency is unnecessary.
##########
service/src/java/org/apache/hive/service/cli/session/SessionManager.java:
##########
@@ -209,6 +211,52 @@ public Integer getValue() {
metrics.addRatio(MetricsConstant.HS2_AVG_ACTIVE_SESSION_TIME,
activeSessionTime, activeSessionCnt);
}
+ // Registers a per-LLAP-target session gauge on first use.
+ private void registerLlapTargetGaugeIfNeeded(HiveSession session) {
+ try {
+ String sanitized = extractLlapTarget(session.getSessionConf());
+ if (sanitized == null) {
+ return;
+ }
+ if (registeredLlapTargetGauges.contains(sanitized)) {
+ return; // already registered
+ }
+ if (!registeredLlapTargetGauges.add(sanitized)) {
+ return; // lost race with another thread
+ }
+ Metrics metrics = MetricsFactory.getInstance();
+ if (metrics == null) {
+ return;
+ }
Review Comment:
`registeredLlapTargetGauges` is updated before verifying that
`MetricsFactory.getInstance()` is available. If `metrics` is null, the method
returns after adding the target to the set, preventing the gauge from ever
being registered later. Also, the prior `contains()` check is redundant given
`add()` already handles concurrency.
##########
packaging/src/kubernetes/README.md:
##########
@@ -58,8 +58,8 @@ mvn clean package -pl packaging/src/kubernetes -Pkubernetes
-DskipTests
## Quick Start (Helm)
The Helm chart defaults to a **Full-HA** cluster (Metastore x2, HiveServer2 x2,
-LLAP x2, TezAM x2). You only need to provide three things: database, ZooKeeper,
-and storage.
+LLAP x2, TezAM x2 — one TezAM per LLAP cluster). You only need to provide three
+things: database, ZooKeeper, and storage.
Review Comment:
The PR description says there are no user-facing changes, but this PR
changes the HiveCluster CRD/Helm values and documented configuration surface
(e.g. `llap` → `llapClusters`, new `llapClusterRouting`, per-LLAP TezAM).
Please update the PR description to reflect that operator users will need to
adjust their Helm/CR specs.
##########
packaging/src/kubernetes/src/java/org/apache/hive/kubernetes/operator/model/spec/LlapSpec.java:
##########
@@ -52,23 +58,44 @@ public record LlapSpec(
@JsonPropertyDescription("Memory in MB per LLAP daemon instance")
@Default("1024")
Integer memoryMb,
- @JsonPropertyDescription("LLAP service hosts identifier for ZooKeeper
registration")
+ @JsonPropertyDescription("LLAP service hosts identifier for ZooKeeper
registration. "
+ + "Defaults to @{name} (e.g. @llap0).")
String serviceHosts,
@JsonPropertyDescription("Readiness probe configuration")
ProbeSpec readinessProbe,
@JsonPropertyDescription("Autoscaling configuration (operator-driven, no
external dependencies)")
- AutoscalingSpec autoscaling) {
+ AutoscalingSpec autoscaling,
+ @JsonPropertyDescription("Per-LLAP TezAM configuration. Each LLAP cluster
gets its own TezAM "
+ + "with independent replica count and autoscaling.")
+ LlapTezAmSpec tezAm) {
+
+ /** Per-LLAP-cluster TezAM replica and autoscaling overrides. */
+ public record LlapTezAmSpec(
+ @JsonPropertyDescription("Max number of TezAM replicas for this LLAP
cluster")
+ @Default("1")
+ Integer replicas,
+ @JsonPropertyDescription("Autoscaling configuration for this LLAP
cluster's TezAM")
+ AutoscalingSpec autoscaling) {
+
+ public LlapTezAmSpec {
+ replicas = replicas != null ? replicas : 1;
+ autoscaling = autoscaling != null ? autoscaling : new AutoscalingSpec(
+ false, 0, 0, 60, 600, 120, 10, 0, 0, null);
+ }
+ }
public LlapSpec {
+ Objects.requireNonNull(name, "llapClusters[].name is required");
replicas = replicas != null ? replicas : 1;
enabled = enabled != null ? enabled : true;
executors = executors != null ? executors : 1;
memoryMb = memoryMb != null ? memoryMb : 1024;
- serviceHosts = serviceHosts != null ? serviceHosts : "@llap0";
+ serviceHosts = serviceHosts != null ? serviceHosts : "@" + name;
extraVolumes = extraVolumes != null ? extraVolumes : List.of();
Review Comment:
`llapClusters[].name` is used in Kubernetes resource names/labels and also
embedded into Prometheus metric names (e.g. `hs2_llap_target_sessions_{name}`).
Without validation, names containing characters like '-' or '.' can result in
invalid Prometheus metric names (the JMX exporter will sanitize them, causing
mismatches with the operator’s expected metric keys) and/or invalid Kubernetes
resource names. Also, allowing `serviceHosts` to diverge from `@{name}` will
break the operator’s per-target session metric lookup, since HS2 keys off
`hive.llap.daemon.service.hosts` while the operator keys off `llapSpec.name()`.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]