featzhang commented on code in PR #27385:
URL: https://github.com/apache/flink/pull/27385#discussion_r2702541741


##########
flink-models/flink-model-triton/README.md:
##########
@@ -0,0 +1,174 @@
+# Flink Triton Model Integration
+
+This module provides integration between Apache Flink and NVIDIA Triton Inference Server, enabling real-time model inference within Flink streaming applications.
+
+## Features
+
+- **REST API Integration**: Communicates with Triton Inference Server via HTTP/REST API
+- **Asynchronous Processing**: Non-blocking inference requests for high throughput
+- **Flexible Configuration**: Comprehensive configuration options for various use cases
+- **Error Handling**: Built-in retry mechanisms and error handling
+- **Resource Management**: Efficient HTTP client pooling and resource management
+
+## Configuration Options
+
+### Required Options
+
+| Option | Type | Description |
+|--------|------|-------------|
+| `endpoint` | String | Full URL of the Triton Inference Server endpoint (e.g., `http://localhost:8000/v2/models`) |
+| `model-name` | String | Name of the model to invoke on Triton server |
+| `model-version` | String | Version of the model to use (defaults to "latest") |
+
+### Optional Options
+
+| Option | Type | Default | Description |
+|--------|------|---------|-------------|
+| `timeout` | Long | 30000 | Request timeout in milliseconds |
+| `max-retries` | Integer | 3 | Maximum number of retries for failed requests |
+| `batch-size` | Integer | 1 | Batch size for inference requests |
+| `priority` | Integer | - | Request priority level (0-255, higher values = higher priority) |

Review Comment:
   Here `-` does not represent a numeric priority value.
   
   It indicates that the priority is **not set**. In this case, Flink does not send any priority field in the inference request, and Triton applies its default scheduling behavior.
   
   This is different from specifying `0` explicitly. A value of `0` would be sent to Triton as a valid priority, whereas `-` means the parameter is omitted entirely.
   
   I agree the table is ambiguous here, and I will update the documentation to clarify that `-` means “not configured / unset”, not a priority value.
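
To make the unset-versus-`0` distinction concrete, here is a minimal sketch of how the request parameters could be assembled. It is not the code from this PR: the class `PrioritySketch`, the helper `buildParameters`, and the use of `Optional<Integer>` are hypothetical names chosen for illustration. It also assumes that the priority travels as a `priority` entry in the request-level parameters. The 0-255 range check mirrors the README table.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

public class PrioritySketch {

    /**
     * Hypothetical helper (not from the PR): builds the request-level
     * parameters map. The "priority" key is added only when a value is
     * configured, so an unset option is omitted instead of defaulting to 0.
     */
    static Map<String, Object> buildParameters(Optional<Integer> priority) {
        Map<String, Object> params = new LinkedHashMap<>();
        priority.ifPresent(p -> {
            // Range taken from the README table (0-255).
            if (p < 0 || p > 255) {
                throw new IllegalArgumentException("priority must be in [0, 255]: " + p);
            }
            params.put("priority", p);
        });
        return params;
    }

    public static void main(String[] args) {
        // Unset: no priority key at all, the server applies its default scheduling.
        System.out.println(buildParameters(Optional.empty())); // prints {}
        // Explicit 0: the key is present and 0 is sent as a real value.
        System.out.println(buildParameters(Optional.of(0)));   // prints {priority=0}
    }
}
```

Modelling the option as an `Optional` (or a nullable `Integer`) rather than a primitive `int` is what keeps "unset" distinguishable from `0` all the way to serialization.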
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
