utkarsharma2 commented on code in PR #35023:
URL: https://github.com/apache/airflow/pull/35023#discussion_r1376207422


##########
airflow/providers/openai/operators/openai.py:
##########
@@ -0,0 +1,73 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+
+from __future__ import annotations
+
+from functools import cached_property
+from typing import TYPE_CHECKING, Any, Sequence
+
+from airflow.models import BaseOperator
+from airflow.providers.openai.hooks.openai import OpenAIHook
+
+if TYPE_CHECKING:
+    from airflow.utils.context import Context
+
+
+class OpenAIEmbeddingOperator(BaseOperator):
+    """
+    Operator that accepts input text to generate OpenAI embeddings using the specified model.
+
+    .. seealso::
+        For more information on how to use this operator, take a look at the guide:
+        :ref:`howto/operator:OpenAIEmbeddingOperator`
+
+    :param conn_id: The OpenAI connection.
+    :param input_text: The text to generate OpenAI embeddings on. Either input_text or input_callable
+        should be provided.
+    :param input_callable: The callable that provides the input text to generate OpenAI embeddings.
+        Either input_text or input_callable should be provided.
+    :param input_callable_args: The list of arguments to be passed to ``input_callable``
+    :param input_callable_kwargs: The kwargs to be passed to ``input_callable``
+    :param model: The OpenAI model to be used for generating the embeddings.
+    """
+
+    template_fields: Sequence[str] = ("input_text",)
+
+    def __init__(
+        self,
+        conn_id: str,
+        input_text: str | list[Any],
+        model: str = "text-embedding-ada-002",
+        **kwargs: Any,
+    ):
+        self.embedding_params = kwargs.pop("embedding_params", {})
+        self.hook_params = kwargs.pop("hook_params", {})

Review Comment:
   @hussein-awala I agree that we should be more explicit, but here we are
just passing the `kwargs` to BaseHook via `super().__init__(*args, **kwargs)`,
not to the OpenAI client. That said, I'm happy to remove
`self.hook_params = kwargs.pop("hook_params", {})` and the related logic if
that is not the ideal way of doing it.

   Regarding `self.embedding_params = kwargs.pop("embedding_params", {})`: if
we explicitly list every parameter supported by the `openai.Embedding.create`
method, we may have to do a new release every time those parameters change in
the OpenAI SDK.

   Maybe we can take a middle ground: make `embedding_params` an explicit
operator parameter and document its possible options by pointing to
`https://platform.openai.com/docs/api-reference/embeddings`, as OpenAI does in
their SDK. That way the operator stays immune to parameter changes in the
OpenAI SDK. WDYT?
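
   To make the middle-ground idea concrete, here is a minimal sketch. It is
not the code in this PR: the class name `EmbeddingOperatorSketch`, the
`create_fn` argument, and `fake_create` are hypothetical stand-ins so the
example runs without Airflow or the OpenAI SDK. The point is only that
`embedding_params` becomes a named, documented argument whose contents are
forwarded unchanged.

```python
from __future__ import annotations

from typing import Any, Callable


class EmbeddingOperatorSketch:
    """Hypothetical stand-in for the operator (illustration only).

    :param input_text: The text to generate embeddings on.
    :param model: The model to be used for generating the embeddings.
    :param embedding_params: Extra keyword arguments forwarded unchanged to the
        embeddings call. For the accepted options, see
        https://platform.openai.com/docs/api-reference/embeddings
    :param create_fn: Assumed injection point standing in for the real
        client call, so this sketch has no external dependencies.
    """

    def __init__(
        self,
        input_text: str | list[Any],
        model: str = "text-embedding-ada-002",
        embedding_params: dict[str, Any] | None = None,
        create_fn: Callable[..., Any] | None = None,
    ) -> None:
        self.input_text = input_text
        self.model = model
        # Copy so that later changes to the caller's dict do not leak in.
        self.embedding_params = dict(embedding_params or {})
        self._create_fn = create_fn

    def execute(self) -> Any:
        if self._create_fn is None:
            raise ValueError("create_fn must be provided in this sketch")
        # The operator does not enumerate the API's options; it passes them on.
        return self._create_fn(
            input=self.input_text, model=self.model, **self.embedding_params
        )


def fake_create(**kwargs: Any) -> dict[str, Any]:
    """Hypothetical fake client call that echoes what it received."""
    return kwargs


op = EmbeddingOperatorSketch(
    input_text="hello",
    embedding_params={"user": "airflow"},  # example option only
    create_fn=fake_create,
)
result = op.execute()
```

   With this shape, a new option in the embeddings API would only need the
linked reference to change, not the operator signature.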


