Tanishq1030 opened a new pull request, #37413:
URL: https://github.com/apache/beam/pull/37413

   Fixes #19711
   
   This PR fixes worker logs emitted during the `DoFn.setup()` lifecycle 
method, where the `step_id` (instruction ID) was consistently missing or 
empty.
   
   ### Rationale
   The `FnApiLogRecordHandler` relies on `statesampler` thread-local storage to 
populate the `instruction_id` in log entries. Previously, the `BundleProcessor` 
called each operation's `setup()` method *before* the thread-local context 
was initialized for that instruction, so logs emitted during setup carried 
no instruction ID.
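
The ordering problem described above can be illustrated with a minimal, standard-library-only sketch. It does not use Beam's real classes: `_context`, `ThreadLocalIdHandler`, and `run_bundle` are hypothetical stand-ins for the `statesampler` state, the log handler, and the bundle lifecycle, respectively.

```python
import logging
import threading

# Hypothetical stand-in for the statesampler thread-local context.
_context = threading.local()


def current_instruction_id():
    """Return the instruction ID bound to this thread, or None."""
    return getattr(_context, "instruction_id", None)


class ThreadLocalIdHandler(logging.Handler):
    """Collects (instruction_id, message) pairs, reading the ID from thread-local state."""

    def __init__(self):
        super().__init__()
        self.entries = []

    def emit(self, record):
        self.entries.append((current_instruction_id(), record.getMessage()))


def run_bundle(instruction_id, logger):
    # Old ordering (assumed for illustration): setup logs before the context is bound.
    logger.info("in setup")
    _context.instruction_id = instruction_id
    try:
        logger.info("in process")
    finally:
        del _context.instruction_id


handler = ThreadLocalIdHandler()
logger = logging.getLogger("setup_ordering_sketch")
logger.setLevel(logging.INFO)
logger.propagate = False
logger.addHandler(handler)

run_bundle("bundle_1", logger)
print(handler.entries)
# → [(None, 'in setup'), ('bundle_1', 'in process')]
```

The first entry has no ID because the handler reads thread-local state only at emit time, and nothing had been bound yet when the setup log was written.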
   
   ### Changes
   1. **`sdks/python/apache_beam/runners/worker/sdk_worker.py`**: Updated 
`create_bundle_processor` to pass the active `instruction_id` into the 
`BundleProcessor` constructor.
   2. **`sdks/python/apache_beam/runners/worker/bundle_processor.py`**:
       * Updated `__init__` to accept `instruction_id`.
       * Added logic to manually inject the `instruction_id` into the 
`statesampler` context specifically while iterating through operations to call 
`op.setup()`.
   3. **`sdks/python/apache_beam/runners/worker/log_handler.py`**: Updated 
`emit()` to check `record.instruction_id` before falling back to thread-local 
storage, ensuring explicitly injected IDs are respected.
   
   ### Verification
   I verified this fix locally with a reproduction script that forces a log 
record to be emitted during `setup()`.
   * **Before fix:** Logs during `setup()` had `instruction_id: None`.
   * **After fix:** Logs during `setup()` correctly display the 
`instruction_id` (e.g., `bundle_...`).
   
   ------------------------
   
   - [x] Mention the appropriate issue in your description (e.g. `fixes 
#19711`).
   - [ ] Update `CHANGES.md` with noteworthy changes.
   - [ ] If this contribution is large, please file an Apache [Individual 
Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.