gemini-code-assist[bot] commented on code in PR #38987:
URL: https://github.com/apache/beam/pull/38987#discussion_r3422724696


##########
runners/kafka-streams/src/main/java/org/apache/beam/runners/kafka/streams/translation/ExecutableStageProcessor.java:
##########
@@ -87,22 +98,43 @@ class ExecutableStageProcessor
   @Override
   public void init(ProcessorContext<byte[], KStreamsPayload<?>> context) {
     this.context = context;
-    ExecutableStage executableStage = ExecutableStage.fromPayload(stagePayload);
-    this.stageContext = KafkaStreamsExecutableStageContextFactory.getInstance().get(jobInfo);
-    this.stageBundleFactory = stageContext.getStageBundleFactory(executableStage);
+    // The SDK harness (stage context + bundle factory) is created lazily on the first data
+    // element, so a stage that only forwards watermarks never spins one up. This mirrors Spark's
+    // SparkExecutableStageFunction, which likewise does not build a bundle factory when there are
+    // no inputs to process.
+  }
+
+  private StageBundleFactory ensureStageBundleFactory() {
+    StageBundleFactory factory = stageBundleFactory;
+    if (factory == null) {
+      ExecutableStage executableStage = ExecutableStage.fromPayload(stagePayload);
+      ExecutableStageContext sc =
+          KafkaStreamsExecutableStageContextFactory.getInstance().get(jobInfo);
+      this.stageContext = sc;
+      factory = sc.getStageBundleFactory(executableStage);
+      this.stageBundleFactory = factory;
+    }
+    return factory;
   }
 
   @Override
   public void process(Record<byte[], KStreamsPayload<?>> record) {
     KStreamsPayload<?> payload = record.value();
     if (payload.isWatermark()) {

Review Comment:
   ![medium](https://www.gstatic.com/codereviewagent/medium-priority.svg)
   
   Defensive check: `record.value()` can be `null` (for example, when a
   tombstone record is encountered), in which case calling
   `payload.isWatermark()` throws a `NullPointerException`. Consider adding a
   null check to handle this case explicitly.
   
   ```suggestion
     public void process(Record<byte[], KStreamsPayload<?>> record) {
       KStreamsPayload<?> payload = record.value();
       if (payload == null) {
         throw new IllegalArgumentException("Received record with null payload");
       }
       if (payload.isWatermark()) {
   ```
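
As a minimal sketch of the guard suggested above, the check can be isolated and exercised without Kafka Streams. `Payload` and `NullGuardSketch` are hypothetical stand-ins introduced only for illustration; they are not types from the PR.

```java
/** Hypothetical stand-in for KStreamsPayload; models only what the guard needs. */
final class Payload {
  private final boolean watermark;

  Payload(boolean watermark) {
    this.watermark = watermark;
  }

  boolean isWatermark() {
    return watermark;
  }
}

/** Hypothetical helper showing the null check before the payload is dereferenced. */
final class NullGuardSketch {
  private NullGuardSketch() {}

  /** Returns "watermark" or "data"; rejects a null payload (e.g. a tombstone) explicitly. */
  static String classify(Payload payload) {
    if (payload == null) {
      // Fail with a descriptive message instead of an unexplained NullPointerException.
      throw new IllegalArgumentException("Received record with null payload");
    }
    return payload.isWatermark() ? "watermark" : "data";
  }
}
```

Whether to throw or to skip the record is a design choice. Throwing surfaces the problem immediately, while skipping (possibly with a log line) may be preferable if tombstones are expected on the input topic.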



##########
runners/kafka-streams/src/main/java/org/apache/beam/runners/kafka/streams/translation/KStreamsPayload.java:
##########
@@ -54,21 +56,36 @@ private enum Kind {
   private final Kind kind;
   private final @Nullable WindowedValue<T> data;
   private final long watermarkMillis;
+  private final int sourcePartition;
+  private final int totalSourcePartitions;
 
-  private KStreamsPayload(Kind kind, @Nullable WindowedValue<T> data, long watermarkMillis) {
+  private KStreamsPayload(
+      Kind kind,
+      @Nullable WindowedValue<T> data,
+      long watermarkMillis,
+      int sourcePartition,
+      int totalSourcePartitions) {
     this.kind = kind;
     this.data = data;
     this.watermarkMillis = watermarkMillis;
+    this.sourcePartition = sourcePartition;
+    this.totalSourcePartitions = totalSourcePartitions;
   }
 
   /** Returns a data payload wrapping the given {@link WindowedValue}. */
   public static <T> KStreamsPayload<T> data(WindowedValue<T> value) {
-    return new KStreamsPayload<>(Kind.DATA, value, 0L);
+    return new KStreamsPayload<>(Kind.DATA, value, 0L, 0, 0);
   }
 
-  /** Returns a watermark payload carrying the given event-time milliseconds. */
-  public static <T> KStreamsPayload<T> watermark(long watermarkMillis) {
-    return new KStreamsPayload<>(Kind.WATERMARK, null, watermarkMillis);
+  /**
+   * Returns a watermark report payload: the event-time milliseconds together with the in-band
+   * coordination fields the downstream stage's {@link WatermarkManager} needs — which source
+   * partition this report is for and how many source partitions feed the stage in total.
+   */
+  public static <T> KStreamsPayload<T> watermark(
+      long watermarkMillis, int sourcePartition, int totalSourcePartitions) {
+    return new KStreamsPayload<>(
+        Kind.WATERMARK, null, watermarkMillis, sourcePartition, totalSourcePartitions);

Review Comment:
   ![medium](https://www.gstatic.com/codereviewagent/medium-priority.svg)
   
   To keep invalid partition coordinates from causing unexpected behavior or
   index-out-of-bounds errors in the downstream `WatermarkManager`, consider
   validating that `sourcePartition` is non-negative, `totalSourcePartitions` is
   positive, and `sourcePartition` is strictly less than `totalSourcePartitions`.
   
   ```suggestion
     public static <T> KStreamsPayload<T> watermark(
         long watermarkMillis, int sourcePartition, int totalSourcePartitions) {
       if (sourcePartition < 0 || totalSourcePartitions <= 0 || sourcePartition >= totalSourcePartitions) {
         throw new IllegalArgumentException(
             String.format(
                 "Invalid partition coordinates: %d of %d", sourcePartition, totalSourcePartitions));
       }
       return new KStreamsPayload<>(
           Kind.WATERMARK, null, watermarkMillis, sourcePartition, totalSourcePartitions);
     }
   ```
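
The range check proposed above can be sketched as a standalone helper so that its boundary cases are easy to test. `PartitionCoordinates` is a hypothetical name used only for illustration; the PR itself may place the check elsewhere.

```java
/** Hypothetical helper mirroring the range check proposed for watermark(...). */
final class PartitionCoordinates {
  private PartitionCoordinates() {}

  /** Throws if sourcePartition is not a valid index into [0, totalSourcePartitions). */
  static void validate(int sourcePartition, int totalSourcePartitions) {
    if (sourcePartition < 0
        || totalSourcePartitions <= 0
        || sourcePartition >= totalSourcePartitions) {
      throw new IllegalArgumentException(
          String.format(
              "Invalid partition coordinates: %d of %d", sourcePartition, totalSourcePartitions));
    }
  }
}
```

Because `data(...)` in the diff passes `0, 0` for these fields, a check like this would need to live in the `watermark(...)` factory rather than in the shared constructor; otherwise every data payload would be rejected.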



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

