Yicong-Huang opened a new pull request, #4547:
URL: https://github.com/apache/texera/pull/4547

   ## Summary
   - restore consume-on-read semantics for Python `current_internal_marker`
   - stop clearing the marker a second time after replaying internal channel markers
   - prevent stale internal `EndChannel` markers from being observed again during reconfiguration propagation
   
   ## Root Cause
   This regression was introduced by `#4424` (`ef66190f22`). That change made `get_internal_marker()` return `current_internal_marker` without consuming it, and moved marker cleanup to the end of the main-loop replay path.
   
   For Python source operators, the internal `EndChannel` marker can stay live across the pause and reconfiguration window. When the reconfiguration ECM is processed, the stale internal marker is still visible and gets replayed again, which corrupts end-of-stream handling and hangs the workflow.
   
   This change restores the previous consume-on-read behavior so the internal marker is observed exactly once.
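
The difference between the two read styles can be illustrated with a minimal sketch. This is not the Texera implementation: the `MarkerHolder` class and `peek_internal_marker()` are hypothetical names introduced for illustration, and the marker is modeled as a plain string. Only `get_internal_marker()` and `current_internal_marker` come from the description above.

```python
from typing import Optional


class MarkerHolder:
    """Hypothetical stand-in for the object that owns the internal marker."""

    def __init__(self) -> None:
        self.current_internal_marker: Optional[str] = None

    def peek_internal_marker(self) -> Optional[str]:
        # Non-consuming read (hypothetical): the marker stays visible,
        # so every later reader sees it again.
        return self.current_internal_marker

    def get_internal_marker(self) -> Optional[str]:
        # Consume-on-read: return the marker and clear it in the same step,
        # so it can be observed at most once.
        marker = self.current_internal_marker
        self.current_internal_marker = None
        return marker


# Non-consuming reads keep returning the same stale marker.
stale = MarkerHolder()
stale.current_internal_marker = "EndChannel"
peeks = [stale.peek_internal_marker(), stale.peek_internal_marker()]

# Consume-on-read returns the marker once, then nothing.
holder = MarkerHolder()
holder.current_internal_marker = "EndChannel"
first = holder.get_internal_marker()
second = holder.get_internal_marker()
```

With the non-consuming read, a later reader (here, the second `peek_internal_marker()` call) still sees `EndChannel`, which mirrors the stale-marker replay described above. With consume-on-read, the second call returns `None`.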
   
   Fixes #4545.
   
   ## Validation
   - `WorkflowExecutionService/testOnly org.apache.texera.amber.engine.e2e.ReconfigurationSpec`
   - `WorkflowExecutionService/testOnly org.apache.texera.amber.engine.e2e.ReconfigurationSpec -- -z "be able to modify a python UDF worker in workflow"`
   - `WorkflowExecutionService/testOnly org.apache.texera.amber.engine.e2e.ReconfigurationSpec -- -z "propagate reconfiguration through a source operator in workflow"`
   - `WorkflowExecutionService/testOnly org.apache.texera.amber.engine.e2e.ReconfigurationSpec -- -z "be able to modify two python UDFs in workflow"`
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

