yuxiqian commented on code in PR #3642:
URL: https://github.com/apache/flink-cdc/pull/3642#discussion_r1799126412
##########
flink-cdc-runtime/src/main/java/org/apache/flink/cdc/runtime/functions/SystemFunctionUtils.java:
##########
@@ -77,6 +80,53 @@ public static int currentDate(long epochTime, String timezone) {
return timestampMillisToDate(localtimestamp(epochTime, timezone).getMillisecond());
}
+ private static final String DEFAULT_MODEL_NAME = "text-embedding-ada-002";
+ private static OpenAiEmbeddingModel embeddingModel;
+
+ public static void initializeOpenAiEmbeddingModel(String apiKey, String baseUrl) {
+ embeddingModel =
+ OpenAiEmbeddingModel.builder()
+ .apiKey(apiKey)
+ .baseUrl(baseUrl)
+ .modelName(DEFAULT_MODEL_NAME)
+ .timeout(Duration.ofSeconds(30))
+ .maxRetries(3)
+ .build();
+ }
+
+ public static String getEmbedding(String input, String apiKey, String model) {
+ if (input == null || input.trim().isEmpty()) {
+ LOG.debug("Empty or null input provided for embedding.");
+ return "";
+ }
+
+ try {
+ // Ensure the OpenAiEmbeddingModel has been initialized
+ if (embeddingModel == null) {
+ initializeOpenAiEmbeddingModel(apiKey, "https://api.openai.com/v1/");
Review Comment:
Is the endpoint hard-coded here? Why do we still need this function, and why
is the API key passed in manually? Shouldn't these be configured in the
`models:` rule block?
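
A rough sketch of the direction this question points at, assuming the model options reach the runtime as a plain string map. The class name `EmbeddingModelSettings` and the keys `host`, `api-key` and `model` are hypothetical and only illustrate the idea; they are not existing Flink CDC or LangChain4j names.

```java
import java.util.Map;

/** Hypothetical holder for embedding-model settings read from configuration. */
final class EmbeddingModelSettings {
    final String baseUrl;
    final String apiKey;
    final String modelName;

    private EmbeddingModelSettings(String baseUrl, String apiKey, String modelName) {
        this.baseUrl = baseUrl;
        this.apiKey = apiKey;
        this.modelName = modelName;
    }

    /** Builds settings from user-supplied parameters instead of hard-coded values. */
    static EmbeddingModelSettings fromParameters(Map<String, String> params) {
        // Endpoint and key must come from configuration (hypothetical key names).
        String baseUrl = require(params, "host");
        String apiKey = require(params, "api-key");
        // Fall back to the default model name used in the diff above.
        String modelName = params.getOrDefault("model", "text-embedding-ada-002");
        return new EmbeddingModelSettings(baseUrl, apiKey, modelName);
    }

    private static String require(Map<String, String> params, String key) {
        String value = params.get(key);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalArgumentException("Missing required model parameter: " + key);
        }
        return value;
    }
}
```

The resulting values could then feed the existing builder chain (`.apiKey(...)`, `.baseUrl(...)`, `.modelName(...)`), so neither the endpoint nor the key would have to appear in `SystemFunctionUtils`.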
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.