featzhang opened a new pull request, #27562:
URL: https://github.com/apache/flink/pull/27562
## What is the purpose of the change
This PR adds sequence ID auto-increment strategy support for Triton
Inference Server integration in Flink, addressing the need for sequence
isolation across job failovers and restarts.
## Brief change log
- Add `sequence-id-auto-increment` configuration option in TritonOptions
- Implement auto-increment logic in TritonInferenceModelFunction
* Generate unique sequence IDs: `{sequence-id}-{subtask-index}-{counter}`
* Initialize AtomicLong counter in open() method
* Increment counter for each inference request
- Add validation to ensure `sequence-id` is configured when auto-increment
is enabled
- Add detailed logging for debugging sequence ID generation
## Use Case
This feature is particularly useful for **non-reentrant models** where:
- Duplicate inference requests must be avoided after failover
- Sequence batching requires strict isolation between parallel instances
- Models maintain stateful context that cannot handle repeated sequence IDs
### Example Configuration
```sql
CREATE MODEL my_triton_model WITH (
'provider' = 'triton',
'endpoint' = 'https://triton-server:8000/v2/models',
'model-name' = 'my_stateful_model',
'sequence-id' = 'flink-job-123',
'sequence-id-auto-increment' = 'true', -- Enable auto-increment
'sequence-start' = 'true',
'sequence-end' = 'true'
);
```
### Sequence ID Format
When auto-increment is enabled, sequence IDs follow this pattern:
```
{base-sequence-id}-{subtask-index}-{counter}
Example: flink-job-123-0-0, flink-job-123-0-1, flink-job-123-1-0
```
This ensures:
- Unique sequences across parallel subtasks (via subtask-index)
- Monotonically increasing sequences per subtask (via counter)
- Sequence isolation across job restarts (new counter starts from 0)
## Verifying this change
This change is a trivial rework / code cleanup without any test coverage.
- Manual testing with Triton Inference Server
- Code formatting verified with Spotless
## Does this pull request potentially affect one of the following parts
- Dependencies: no
- The public API: yes (adds new configuration option)
- The serializers: no
- The runtime per-record code paths: no
- Anything that affects deployment or recovery: JobManager, Checkpoint
Coordinator, State backends, etc.: no
- The S3 file system connector: no
## Documentation
- Does this pull request introduce a new feature? yes
- If yes, how is the feature documented? JavaDocs
## Related Issues
Closes #FLINK-38857
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]