featzhang opened a new pull request, #27561:
URL: https://github.com/apache/flink/pull/27561

   ## What is the purpose of the change?
   
   This PR adds a retry mechanism with a default-value fallback for Triton model 
inference failures, enabling robust error handling and downstream filtering.
   
   ## Brief change log
   
   ### 1. New Configuration Options (TritonOptions.java)
   - `max-retries`: Maximum retry attempts (default: 0)
   - `retry-backoff`: Initial backoff duration; the delay grows exponentially 
on each subsequent retry (default: 100ms)
   - `default-value`: Fallback value when all retries fail
   
   ### 2. Retry Logic (TritonInferenceModelFunction.java)
   - Implements exponential backoff retry strategy
   - Retries on network errors and 5xx server errors (503, 504)
   - Fails immediately on 4xx client errors (configuration issues)
   - Detailed logging for each retry attempt
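   
   The retry rules above can be sketched roughly as follows. This is a minimal, standalone illustration and not the code in `TritonInferenceModelFunction.java`: the class `RetrySketch`, the exception `HttpStatusException`, and the helpers `backoffMillis`, `isRetryable`, and `callWithRetry` are hypothetical names. It also assumes that every 5xx status is retryable and that the delay doubles on each retry.
   
   ```java
   import java.io.IOException;
   import java.util.concurrent.Callable;

   /** Hypothetical sketch of the retry rules; not the PR's actual implementation. */
   final class RetrySketch {

       /** Hypothetical exception that carries an HTTP status code. */
       static final class HttpStatusException extends Exception {
           final int status;

           HttpStatusException(int status) {
               super("HTTP " + status);
               this.status = status;
           }
       }

       /** Delay before the given retry (1-based): initial, 2x initial, 4x initial, ... */
       static long backoffMillis(long initialMillis, int retry) {
           return initialMillis << (retry - 1);
       }

       /** Assumption: network errors and any 5xx are retryable; 4xx and the rest are not. */
       static boolean isRetryable(Exception e) {
           if (e instanceof HttpStatusException) {
               int status = ((HttpStatusException) e).status;
               return status >= 500 && status < 600;
           }
           return e instanceof IOException;
       }

       /** Runs the call once, then retries up to maxRetries times on retryable errors. */
       static <T> T callWithRetry(Callable<T> call, int maxRetries, long initialBackoffMillis)
               throws Exception {
           for (int attempt = 0; ; attempt++) {
               try {
                   return call.call();
               } catch (Exception e) {
                   // Non-retryable error or retries exhausted: surface the failure.
                   if (!isRetryable(e) || attempt >= maxRetries) {
                       throw e;
                   }
                   Thread.sleep(backoffMillis(initialBackoffMillis, attempt + 1));
               }
           }
       }
   }
   ```
   
   With `max-retries = 3` and `retry-backoff = 100ms`, this sketch waits 100ms, 200ms, and 400ms before the first, second, and third retry, so a request is attempted at most four times.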
   
   ### 3. Default Value Fallback
   - Returns configured default value after exhausting all retries
   - Supports all output types: STRING, numeric, ARRAY
   - Enables downstream view-based routing for success/failure cases
   - Backward compatible: throws an exception if no default value is configured
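   
   A rough sketch of the fallback behaviour described above, under stated assumptions: `FallbackSketch` and `inferOrDefault` are hypothetical names, the inference call is modelled as a `Supplier` that throws once all retries have failed, and an empty `Optional` stands for "no `default-value` configured". The PR's real code may be structured differently.
   
   ```java
   import java.util.Optional;
   import java.util.function.Supplier;

   /** Hypothetical sketch of the default-value fallback; not the PR's actual code. */
   final class FallbackSketch {

       /**
        * Returns the inference result, or the configured default if inference fails.
        * Rethrows the failure when no default is configured (backward-compatible path).
        */
       static <T> T inferOrDefault(Supplier<T> inference, Optional<T> defaultValue) {
           try {
               return inference.get();
           } catch (RuntimeException e) {
               if (defaultValue.isPresent()) {
                   return defaultValue.get();
               }
               throw e;
           }
       }
   }
   ```
   
   For example, `FallbackSketch.inferOrDefault(call, Optional.of("FAILED"))` yields `"FAILED"` when `call` throws, while passing `Optional.empty()` propagates the exception unchanged.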
   
   ### 4. AbstractTritonModelFunction.java
   - Added fields and getters for retry configuration
   
   ## Use Cases
   
   **Scenario**: When inference still fails after N attempts, return a default 
value that downstream queries can use to route records to success/failure paths.
   
   **Example Configuration**:
   ```sql
   CREATE MODEL my_triton_model
   WITH (
     'provider' = 'triton',
     'endpoint' = 'http://triton:8000/v2/models',
     'model-name' = 'my-model',
     'max-retries' = '3',              -- Retry up to 3 times
     'retry-backoff' = '100ms',        -- 100ms, 200ms, 400ms backoff
     'default-value' = 'FAILED'        -- Return 'FAILED' on all failures
   );
   ```
   
   **Downstream Processing**:
   ```sql
   -- Route based on prediction result
   -- (`result` is a reserved keyword in Flink SQL, hence the backticks)
   INSERT INTO success_table
   SELECT * FROM predictions WHERE `result` != 'FAILED';
   
   INSERT INTO failure_table
   SELECT * FROM predictions WHERE `result` = 'FAILED';
   ```
   
   ## Verifying this change
   
   - [x] Manually tested: Compiled successfully with `mvn clean compile`
   - [x] Code follows spotless formatting standards
   - [ ] Unit tests will be added in a follow-up if needed
   
   ## Does this pull request potentially affect one of the following parts:
   
   - [ ] Dependencies (does it add or upgrade a dependency)
   - [ ] The public API
   - [x] The model configuration options
   - [ ] The build infrastructure
   - [ ] Other (please describe)
   
   ## Documentation
   
   - Configuration options are documented in TritonOptions.java
   - Detailed examples provided in commit message and PR description
   - Will add user-facing documentation in a separate PR if requested

