This is an automated email from the ASF dual-hosted git repository.

kabhwan pushed a commit to branch branch-3.5
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/branch-3.5 by this push:
     new a3d23fdb775b [MINOR][SS] Minor update to watermark propagation comments
a3d23fdb775b is described below

commit a3d23fdb775bee3f03c52a77b80bc0c724108e20
Author: Neil Ramaswamy <[email protected]>
AuthorDate: Wed Dec 18 15:45:36 2024 +0900

    [MINOR][SS] Minor update to watermark propagation comments
    
    ### What changes were proposed in this pull request?
    
    A few minor changes to clarify (and fix one typo) in the comments for 
watermark propagation in Structured Streaming.
    
    ### Why are the changes needed?
    
    I found some of the terminology around "simulation" confusing, and the 
current comment describes incorrect logic for output watermark calculation.
    
    ### Does this PR introduce _any_ user-facing change?
    
    No.
    
    ### How was this patch tested?
    
    N/A.
    
    ### Was this patch authored or co-authored using generative AI tooling?
    
    No
    
    Closes #49188 from neilramaswamy/nr/minor-wm-prop.
    
    Authored-by: Neil Ramaswamy <[email protected]>
    Signed-off-by: Jungtaek Lim <[email protected]>
    (cherry picked from commit 2b41131d7fa66ef5b23fbe247e057d631ee5e4f6)
    Signed-off-by: Jungtaek Lim <[email protected]>
---
 .../spark/sql/execution/streaming/WatermarkPropagator.scala       | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/WatermarkPropagator.scala
 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/WatermarkPropagator.scala
index 6f3725bebb9a..3d9325f9c98c 100644
--- 
a/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/WatermarkPropagator.scala
+++ 
b/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/WatermarkPropagator.scala
@@ -124,12 +124,14 @@ class UseSingleWatermarkPropagator extends 
WatermarkPropagator {
 /**
  * This implementation simulates propagation of watermark among operators.
  *
- * The simulation algorithm traverses the physical plan tree via post-order 
(children first) to
- * calculate (input watermark, output watermark) for all nodes.
+ * It is considered a "simulation" because watermarks are not being physically 
sent between
+ * operators, but rather propagated up the tree via post-order (children 
first) traversal of
+ * the query plan. This allows Structured Streaming to determine the new 
(input watermark, output
+ * watermark) for all nodes.
  *
  * For each node, below logic is applied:
  *
- * - Input watermark for specific node is decided by `min(input watermarks 
from all children)`.
+ * - Input watermark for specific node is decided by `min(output watermarks 
from all children)`.
  *   -- Children providing no input watermark (DEFAULT_WATERMARK_MS) are 
excluded.
  *   -- If there is no valid input watermark from children, input watermark = 
DEFAULT_WATERMARK_MS.
  * - Output watermark for specific node is decided as following:


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to