[PR] [FLINK-38857][model] Add sequence ID auto-increment support for Triton inference [flink]

via GitHub Mon, 09 Feb 2026 19:12:26 -0800


featzhang opened a new pull request, #27562:
URL: https://github.com/apache/flink/pull/27562


   ## What is the purpose of the change
   
   This PR adds sequence ID auto-increment strategy support for Triton 
Inference Server integration in Flink, addressing the need for sequence 
isolation across job failovers and restarts.
   
   ## Brief change log
   
   - Add `sequence-id-auto-increment` configuration option in TritonOptions
   - Implement auto-increment logic in TritonInferenceModelFunction
     * Generate unique sequence IDs: `{sequence-id}-{subtask-index}-{counter}`
     * Initialize AtomicLong counter in open() method
     * Increment counter for each inference request
   - Add validation to ensure `sequence-id` is configured when auto-increment 
is enabled
   - Add detailed logging for debugging sequence ID generation
   
   ## Use Case
   
   This feature is particularly useful for **non-reentrant models** where:
   - Duplicate inference requests must be avoided after failover
   - Sequence batching requires strict isolation between parallel instances
   - Models maintain stateful context that cannot handle repeated sequence IDs
   
   ### Example Configuration
   
   ```sql
   CREATE MODEL my_triton_model WITH (
     'provider' = 'triton',
     'endpoint' = 'https://triton-server:8000/v2/models',
     'model-name' = 'my_stateful_model',
     'sequence-id' = 'flink-job-123',
     'sequence-id-auto-increment' = 'true',  -- Enable auto-increment
     'sequence-start' = 'true',
     'sequence-end' = 'true'
   );
   ```
   
   ### Sequence ID Format
   
   When auto-increment is enabled, sequence IDs follow this pattern:
   ```
   {base-sequence-id}-{subtask-index}-{counter}
   Example: flink-job-123-0-0, flink-job-123-0-1, flink-job-123-1-0
   ```
   
   This ensures:
   - Unique sequences across parallel subtasks (via subtask-index)
   - Monotonically increasing sequences per subtask (via counter)
   - Sequence isolation across job restarts (new counter starts from 0)
   
   ## Verifying this change
   
   This change is a trivial rework / code cleanup without any test coverage.
   
   - Manual testing with Triton Inference Server
   - Code formatting verified with Spotless
   
   ## Does this pull request potentially affect one of the following parts
   
   - Dependencies: no
   - The public API: yes (adds new configuration option)
   - The serializers: no
   - The runtime per-record code paths: no
   - Anything that affects deployment or recovery: JobManager, Checkpoint 
Coordinator, State backends, etc.: no
   - The S3 file system connector: no
   
   ## Documentation
   
   - Does this pull request introduce a new feature? yes
   - If yes, how is the feature documented? JavaDocs
   
   ## Related Issues
   
   Closes #FLINK-38857


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

[PR] [FLINK-38857][model] Add sequence ID auto-increment support for Triton inference [flink]

Reply via email to