Taragolis commented on code in PR #34891:
URL: https://github.com/apache/airflow/pull/34891#discussion_r1362910782
##########
airflow/providers/postgres/operators/postgres.py:
##########
@@ -80,3 +86,60 @@ def __init__(
AirflowProviderDeprecationWarning,
stacklevel=2,
)
+
+
+class PgVectorIngestOperator(BaseOperator):
+ """
+ Operator for ingesting text and embeddings into a PostgreSQL database using the pgvector library.
+
+ :param conn_id: The connection ID for the postgresql database.
+ :param input_data: Tuple containing the string input content and corresponding list of float vector
+ embeddings.
+ :param input_callable: A callable that returns a tuple containing the string input content and
+ corresponding list of float vector embeddings, if ``input_data`` is not provided.
+ :param input_callable_args: Positional arguments for the 'input_callable'.
+ :param input_callable_kwargs: Keyword arguments for the 'input_callable'.
+ :param kwargs: Additional keyword arguments for the BaseOperator.
+ """
+
+ def __init__(
+ self,
+ conn_id: str,
+ input_data: tuple[str, list[float]] | None = None,
+ input_callable: Callable[[Any], Any] | None = None,
+ input_callable_args: Collection[Any] | None = None,
+ input_callable_kwargs: Mapping[str, Any] | None = None,
Review Comment:
I want to raise the same question I asked in
https://github.com/apache/airflow/pull/34921#discussion_r1358525838
What benefit do these arguments provide? To me, they could be replaced by either of the following:
1. Make the input data a templated field and provide it through XCom from an upstream task.
2. Use a combination of TaskFlow (or PythonOperator) + PostgresHook
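```python
from airflow.decorators import task
from airflow.providers.postgres.hooks.postgres import PostgresHook


@task
def awesome_task(conn_id: str):
    hook = PostgresHook(postgres_conn_id=conn_id)
    ...
    hook.ingest_embedding(
        table="foo",
        input_data=...,
        vector_size=42,
    )
```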
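To illustrate why the three callable arguments look redundant to me, here is a minimal, Airflow-free sketch. `resolve_input` and `embed` are hypothetical names; the sketch assumes the operator simply calls `input_callable(*args, **kwargs)` when `input_data` is missing, which is what the docstring suggests, and the actual operator code may differ. Under that assumption, evaluating the callable upstream and passing the result as `input_data` produces the same value:

```python
from __future__ import annotations

from typing import Any, Callable, Collection, Mapping


def resolve_input(
    input_data: tuple[str, list[float]] | None = None,
    input_callable: Callable[..., Any] | None = None,
    input_callable_args: Collection[Any] | None = None,
    input_callable_kwargs: Mapping[str, Any] | None = None,
) -> tuple[str, list[float]]:
    """Hypothetical helper: what the operator presumably does with its arguments."""
    if input_data is not None:
        return input_data
    if input_callable is None:
        raise ValueError("Either input_data or input_callable must be provided")
    return input_callable(*(input_callable_args or ()), **(input_callable_kwargs or {}))


def embed(text: str, scale: float = 1.0) -> tuple[str, list[float]]:
    """Toy stand-in for an embedding function (not a real model)."""
    return text, [len(text) * scale, 0.5 * scale]


# Variant A: let the operator call the callable with its args and kwargs.
via_callable = resolve_input(
    input_callable=embed,
    input_callable_args=("hello",),
    input_callable_kwargs={"scale": 2.0},
)

# Variant B: compute the value upstream (e.g. in a @task) and pass it directly.
via_data = resolve_input(input_data=embed("hello", scale=2.0))

assert via_callable == via_data
```

If both variants always agree, a single `input_data` argument (templated, so it can come from XCom) would cover the same use cases with a smaller operator signature.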