davidradl commented on code in PR #27385:
URL: https://github.com/apache/flink/pull/27385#discussion_r2668511015
##########
docs/content.zh/docs/connectors/models/triton.md:
##########
@@ -0,0 +1,422 @@
+---
+title: "Triton"
+weight: 2
+type: docs
+---
+<!--
+Licensed to the Apache Software Foundation (ASF) under one
+or more contributor license agreements. See the NOTICE file
+distributed with this work for additional information
+regarding copyright ownership. The ASF licenses this file
+to you under the Apache License, Version 2.0 (the
+"License"); you may not use this file except in compliance
+with the License. You may obtain a copy of the License at
+  http://www.apache.org/licenses/LICENSE-2.0
+Unless required by applicable law or agreed to in writing,
+software distributed under the License is distributed on an
+"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+KIND, either express or implied. See the License for the
+specific language governing permissions and limitations
+under the License.
+-->
+
+# Triton
+
+The Triton Model Function allows Flink SQL to call [NVIDIA Triton Inference Server](https://github.com/triton-inference-server/server) for real-time model inference tasks.
+
+## Overview
+
+The function supports calling remote Triton Inference Server via Flink SQL for prediction/inference tasks. Triton Inference Server is a high-performance inference serving solution that supports multiple machine learning frameworks including TensorFlow, PyTorch, ONNX, and more.
+
+Key features:
+* **High Performance**: Optimized for low-latency and high-throughput inference
+* **Multi-Framework Support**: Works with models from various ML frameworks
+* **Asynchronous Processing**: Non-blocking inference requests for better resource utilization
+* **Flexible Configuration**: Comprehensive configuration options for different use cases
+* **Resource Management**: Efficient HTTP client pooling and automatic resource cleanup
+
+## Usage Examples
+
+The following example creates a Triton model for text classification and uses it to analyze sentiment in movie reviews.
+
+First, create the Triton model with the following SQL statement:
+
+```sql
+CREATE MODEL triton_sentiment_classifier
+INPUT (`input` STRING)
+OUTPUT (`output` STRING)
+WITH (
+  'provider' = 'triton',
+  'endpoint' = 'http://localhost:8000/v2/models',
+  'model-name' = 'text-classification',
+  'model-version' = '1',
+  'timeout' = '10000',
+  'max-retries' = '3'
+);
+```
+
+Suppose the following data is stored in a table named `movie_reviews`, and the prediction result is to be stored in a table named `classified_reviews`:
+
+```sql
+CREATE TEMPORARY VIEW movie_reviews(id, movie_name, user_review, actual_sentiment)
+AS VALUES
+  (1, 'Great Movie', 'This movie was absolutely fantastic! Great acting and storyline.', 'positive'),

Review Comment:
   nit: I wonder whether -1, 0 and +1 would be more intuitive values.
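For illustration only, a minimal sketch of what the sample data could look like with the numeric labels the nit suggests. Only row 1 comes from the PR (where the label is the string 'positive'); the extra rows and the -1/0/+1 mapping are hypothetical, and the model's STRING `output` would need to be mapped onto the same scale before comparing against `actual_sentiment`:

```sql
-- Sketch only: expected sentiment encoded as -1 (negative), 0 (neutral), +1 (positive).
-- Rows 2 and 3 are invented for illustration; they are not part of the PR.
CREATE TEMPORARY VIEW movie_reviews(id, movie_name, user_review, actual_sentiment)
AS VALUES
  (1, 'Great Movie', 'This movie was absolutely fantastic! Great acting and storyline.', 1),
  (2, 'Average Film', 'It was fine, but nothing I would watch twice.', 0),
  (3, 'Long Sequel', 'A tedious and disappointing follow-up.', -1);
```

Whether numeric labels are clearer than strings likely depends on how the classifier's string output is post-processed in the rest of the example, so this is a suggestion rather than a required change.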
