corgy-w commented on code in PR #7534:
URL: https://github.com/apache/seatunnel/pull/7534#discussion_r1740838060


##########
docs/en/transform-v2/embedding.md:
##########
@@ -0,0 +1,366 @@
+# Embedding
+
+> Embedding Transform Plugin
+
+## Description
+
+The `Embedding` transform plugin leverages embedding models to convert text data into vectorized representations. This
+transformation can be applied to various fields. The plugin supports multiple model providers and can be integrated with
+different API endpoints.
+
+## Options
+
+| Name                           | Type   | Required | Default Value | Description |
+|--------------------------------|--------|----------|---------------|-------------|
+| model_provider                 | enum   | yes      | -             | The model provider for embedding. Options may include `QIANFAN`, `OPENAI`, etc. |
+| api_key                        | string | yes      | -             | The API key required to authenticate with the embedding service. |
+| secret_key                     | string | yes      | -             | The secret key required for additional authentication with the embedding service. |
+| single_vectorized_input_number | int    | no       | 1             | The number of inputs vectorized in one request. Default is 1. |
+| vectorization_fields           | map    | yes      | -             | A mapping between input fields and their corresponding output vector fields. |
+| model                          | string | yes      | -             | The specific model to use for embedding (e.g: `text-embedding-3-small` for OPENAI). |
+| api_path                       | string | no       | -             | The API endpoint for the embedding service. Typically provided by the model provider. |
+| oauth_path                     | string | no       | -             | The API endpoint for the oauth service. |
+| custom_config                  | map    | no       |               | Custom configurations for the model. |
+| custom_response_parse          | string | no       |               | Specifies how to parse the response from the model using JsonPath. Example: `$.choices[*].message.content`. |
+| custom_request_headers         | map    | no       |               | Custom headers for the request to the model. |
+| custom_request_body            | map    | no       |               | Custom body for the request. Supports placeholders like `${model}`, `${input}`, `${prompt}`. |
+
+### model_provider
+
+The model provider to use for generating embeddings. Common options might include `QIANFAN`, `OPENAI`, etc. Depending on

Review Comment:
   get


