xintongsong commented on code in PR #150:
URL: https://github.com/apache/flink-agents/pull/150#discussion_r2338378062


##########
python/flink_agents/api/embedding_models/embedding_model.py:
##########
@@ -0,0 +1,107 @@
+################################################################################
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+#################################################################################
+from abc import ABC, abstractmethod
+from typing import Any, Dict
+
+from pydantic import Field
+from typing_extensions import override
+
+from flink_agents.api.resource import Resource, ResourceType
+
+
+class BaseEmbeddingModelConnection(Resource, ABC):
+    """Base abstract class for text embedding model connection.
+
+    Responsible for managing model service connection configurations.
+    Specific implementations can add their own connection parameters like:
+    - Service address (base_url) for remote services
+    - API key (api_key) for authenticated services
+    - Authentication information, timeouts, etc.
+
+    Provides the basic embedding interface for direct communication with model services.
+
+    One connection can be shared by multiple embedding model setups.
+    """
+
+    @classmethod
+    @override
+    def resource_type(cls) -> ResourceType:
+        """Return resource type of class."""
+        return ResourceType.EMBEDDING_MODEL_CONNECTION
+
+    @abstractmethod
+    def embed_query(self, text: str, **kwargs: Any) -> list[float]:

Review Comment:
   Minor: for simplicity
   ```suggestion
       def embed(self, text: str, **kwargs: Any) -> list[float]:
   ```



##########
python/flink_agents/api/decorators.py:
##########
@@ -87,6 +87,41 @@ def chat_model_setup(func: Callable) -> Callable:
     return func
 
 
+def embedding_model_connection(func: Callable) -> Callable:
+    """Decorator for marking a function declaring an embedding model connection.
+
+    Parameters
+    ----------
+    func : Callable
+        Function to be decorated.
+
+    Returns:
+    -------
+    Callable
+        Decorator function that marks the target function as declaring an
+        embedding model connection.
+    """
+    func._is_embedding_model_connection = True
+    return func
+
+
+def embedding_model_setup(func: Callable) -> Callable:
+    """Decorator for marking a function declaring an embedding model setup.
+
+    Parameters
+    ----------
+    func : Callable
+        Function to be decorated.
+
+    Returns:
+    -------
+    Callable
+        Decorator function that marks the target function as declaring an
+        embedding model setup.
+    """
+    func._is_embedding_model_setup = True
+    return func
+
+

Review Comment:
   In order for the decorated functions to be accessed as embedding model connections and setups, we also need to convert the functions into resource providers and put them into the agent plan. See `AgentPlan._get_resource_providers()`.
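   For illustration, a rough sketch of how those decorator markers could be collected into resource providers, following the pattern used for chat models. The function name and the `ResourceType` stub here are stand-ins for the real `flink_agents` definitions, not the actual API:

   ```python
from enum import Enum
from typing import Any, Callable, Dict, Tuple


class ResourceType(Enum):
    """Stand-in for flink_agents.api.resource.ResourceType."""
    EMBEDDING_MODEL_CONNECTION = "embedding_model_connection"
    EMBEDDING_MODEL = "embedding_model"


def get_embedding_resource_providers(
    agent: Any,
) -> Dict[Tuple[str, ResourceType], Callable]:
    """Collect agent members marked by the embedding model decorators."""
    providers: Dict[Tuple[str, ResourceType], Callable] = {}
    for name in dir(agent):
        member = getattr(agent, name, None)
        if not callable(member):
            continue
        # The marker attributes match the ones set in decorators.py.
        if getattr(member, "_is_embedding_model_connection", False):
            providers[(name, ResourceType.EMBEDDING_MODEL_CONNECTION)] = member
        elif getattr(member, "_is_embedding_model_setup", False):
            providers[(name, ResourceType.EMBEDDING_MODEL)] = member
    return providers
   ```

   The actual wiring would live in `AgentPlan._get_resource_providers()` alongside the chat model handling.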



##########
python/flink_agents/api/embedding_models/embedding_model.py:
##########
@@ -0,0 +1,107 @@
+################################################################################
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+#################################################################################
+from abc import ABC, abstractmethod
+from typing import Any, Dict
+
+from pydantic import Field
+from typing_extensions import override
+
+from flink_agents.api.resource import Resource, ResourceType
+
+
+class BaseEmbeddingModelConnection(Resource, ABC):
+    """Base abstract class for text embedding model connection.
+
+    Responsible for managing model service connection configurations.
+    Specific implementations can add their own connection parameters like:
+    - Service address (base_url) for remote services
+    - API key (api_key) for authenticated services
+    - Authentication information, timeouts, etc.
+
+    Provides the basic embedding interface for direct communication with model services.
+
+    One connection can be shared by multiple embedding model setups.
+    """
+
+    @classmethod
+    @override
+    def resource_type(cls) -> ResourceType:
+        """Return resource type of class."""
+        return ResourceType.EMBEDDING_MODEL_CONNECTION
+
+    @abstractmethod
+    def embed_query(self, text: str, **kwargs: Any) -> list[float]:
+        """Generate embedding vector for a single text query.
+
+        Converts the input text into a high-dimensional vector representation
+        suitable for semantic similarity search and retrieval operations.
+
+        Args:
+            text: The text string to convert into an embedding vector.
+            **kwargs: Additional parameters passed to the embedding model.
+
+        Returns:
+            A list of floating-point numbers representing the embedding vector.
+            The dimension of the vector depends on the specific embedding model used.
+        """
+
+
+class BaseEmbeddingModelSetup(Resource, ABC):
+    """Base abstract class for text embedding model setup.
+
+    Responsible for managing embedding model configurations, such as:
+    - Connection to embedding model service (connection_name)
+    - Model name (model_name)
+
+    Provides the basic embedding interface for generating embeddings from text inputs.
+    """
+
+    connection_name: str = Field(description="Name of the referenced connection.")

Review Comment:
   Minor:
   ```suggestion
       connection: str = Field(description="Name of the referenced connection.")
   ```



##########
python/flink_agents/api/embedding_models/embedding_model.py:
##########
@@ -0,0 +1,107 @@
+################################################################################
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+#################################################################################
+from abc import ABC, abstractmethod
+from typing import Any, Dict
+
+from pydantic import Field
+from typing_extensions import override
+
+from flink_agents.api.resource import Resource, ResourceType
+
+
+class BaseEmbeddingModelConnection(Resource, ABC):
+    """Base abstract class for text embedding model connection.
+
+    Responsible for managing model service connection configurations.
+    Specific implementations can add their own connection parameters like:
+    - Service address (base_url) for remote services
+    - API key (api_key) for authenticated services
+    - Authentication information, timeouts, etc.
+
+    Provides the basic embedding interface for direct communication with model services.
+
+    One connection can be shared by multiple embedding model setups.
+    """
+
+    @classmethod
+    @override
+    def resource_type(cls) -> ResourceType:
+        """Return resource type of class."""
+        return ResourceType.EMBEDDING_MODEL_CONNECTION
+
+    @abstractmethod
+    def embed_query(self, text: str, **kwargs: Any) -> list[float]:
+        """Generate embedding vector for a single text query.
+
+        Converts the input text into a high-dimensional vector representation
+        suitable for semantic similarity search and retrieval operations.
+
+        Args:
+            text: The text string to convert into an embedding vector.
+            **kwargs: Additional parameters passed to the embedding model.
+
+        Returns:
+            A list of floating-point numbers representing the embedding vector.
+            The dimension of the vector depends on the specific embedding model used.
+        """
+
+
+class BaseEmbeddingModelSetup(Resource, ABC):
+    """Base abstract class for text embedding model setup.
+
+    Responsible for managing embedding model configurations, such as:
+    - Connection to embedding model service (connection_name)
+    - Model name (model_name)
+
+    Provides the basic embedding interface for generating embeddings from text inputs.
+    """
+
+    connection_name: str = Field(description="Name of the referenced connection.")
+    model_name: str = Field(description="Name of the embedding model to use.")

Review Comment:
   Minor:
   ```suggestion
       model: str = Field(description="Name of the embedding model to use.")
   ```



##########
python/flink_agents/api/embedding_models/embedding_model.py:
##########
@@ -0,0 +1,107 @@
+################################################################################
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+#################################################################################
+from abc import ABC, abstractmethod
+from typing import Any, Dict
+
+from pydantic import Field
+from typing_extensions import override
+
+from flink_agents.api.resource import Resource, ResourceType
+
+
+class BaseEmbeddingModelConnection(Resource, ABC):
+    """Base abstract class for text embedding model connection.
+
+    Responsible for managing model service connection configurations.
+    Specific implementations can add their own connection parameters like:
+    - Service address (base_url) for remote services
+    - API key (api_key) for authenticated services
+    - Authentication information, timeouts, etc.
+
+    Provides the basic embedding interface for direct communication with model services.
+
+    One connection can be shared by multiple embedding model setups.
+    """
+
+    @classmethod
+    @override
+    def resource_type(cls) -> ResourceType:
+        """Return resource type of class."""
+        return ResourceType.EMBEDDING_MODEL_CONNECTION
+
+    @abstractmethod
+    def embed_query(self, text: str, **kwargs: Any) -> list[float]:
+        """Generate embedding vector for a single text query.
+
+        Converts the input text into a high-dimensional vector representation
+        suitable for semantic similarity search and retrieval operations.
+
+        Args:
+            text: The text string to convert into an embedding vector.
+            **kwargs: Additional parameters passed to the embedding model.
+
+        Returns:
+            A list of floating-point numbers representing the embedding vector.
+            The dimension of the vector depends on the specific embedding model used.
+        """
+
+
+class BaseEmbeddingModelSetup(Resource, ABC):
+    """Base abstract class for text embedding model setup.
+
+    Responsible for managing embedding model configurations, such as:
+    - Connection to embedding model service (connection_name)
+    - Model name (model_name)
+
+    Provides the basic embedding interface for generating embeddings from text inputs.
+    """
+
+    connection_name: str = Field(description="Name of the referenced connection.")
+    model_name: str = Field(description="Name of the embedding model to use.")

Review Comment:
   It seems we have multiple ways to specify which model to use.
   - For ollama and tongyi chat model, it's specified in the connection.
   - For anthropic and openai chat model, it's specified in the setup.
   - For both chat model connection and setup, `model` is an implementation-specific parameter, rather than commonly defined in the abstraction.
   - For this embedding model abstraction, it's defined as part of the setup abstraction.
   
   I think we'd better have a unified way, so that users don't get confused. IMO, having it in the setup abstraction (like in this PR) might make more sense. @wenjin272, WDYT?
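   To make the proposed convention concrete, a toy sketch with hypothetical dataclasses (not the actual `Resource` classes): the connection carries only transport-level settings, each setup names the `model`, so several setups can share one connection:

   ```python
from dataclasses import dataclass


@dataclass
class EmbeddingConnection:
    """Transport configuration only -- no model choice here."""
    base_url: str
    request_timeout: float = 30.0


@dataclass
class EmbeddingSetup:
    """References a connection by name and picks the model."""
    connection: str
    model: str


# One shared connection, two setups using different models on it.
conn = EmbeddingConnection(base_url="http://localhost:11434")
small = EmbeddingSetup(connection="ollama", model="all-minilm:22m")
large = EmbeddingSetup(connection="ollama", model="nomic-embed-text")
   ```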



##########
python/flink_agents/integrations/embedding_models/local/tests/start_ollama_server.sh:
##########
@@ -0,0 +1,32 @@
+################################################################################
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+################################################################################
+
+# only works on linux
+os=$(uname -s)
+echo $os
+if [[ $os == "Linux" ]]; then
+  curl -fsSL https://ollama.com/install.sh | sh
+  ret=$?
+  if [ "$ret" != "0" ]
+  then
+    exit $ret
+  fi
+  ollama serve
+  ollama pull all-minilm:22m
+  ollama run all-minilm:22m
+fi

Review Comment:
   There's a ci failure related to this.
   
   We have 2 test cases, for chat and embedding models respectively, that rely on the ollama server. We start the ollama server with this script, but never shut it down. So whichever test case is executed later, the script will fail at `ollama serve` because the port is already taken. And we did not check whether the script fails in the test case. So the test case continues and connects to the server started by the previous test case, where the required model is not running.



##########
python/flink_agents/integrations/embedding_models/local/ollama_embedding_model.py:
##########
@@ -0,0 +1,165 @@
+################################################################################
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+################################################################################
+from typing import Any, Dict, Optional, Union
+
+from ollama import Client
+from pydantic import Field
+
+from flink_agents.api.embedding_models.embedding_model import (
+    BaseEmbeddingModelConnection,
+    BaseEmbeddingModelSetup,
+)
+
+DEFAULT_REQUEST_TIMEOUT = 30.0
+
+
+class OllamaEmbeddingModelConnection(BaseEmbeddingModelConnection):
+    """Ollama Embedding Model Connection which manages connection to Ollama server.
+
+    Visit https://ollama.com/ to download and install Ollama.
+
+    Run `ollama serve` to start a server.
+
+    Run `ollama pull <model_name>` to download an embedding model
+    (e.g. nomic-embed-text, all-minilm, etc) to run.
+
+    Attributes:
+    ----------
+    base_url : str
+        Base url the Ollama server is hosted under.
+    model : str
+        Embedding model name to use.
+    request_timeout : float
+        The timeout for making HTTP requests to the Ollama API server.
+    """
+
+    model: str = Field(description="Embedding model name to use.")

Review Comment:
   This also confuses me. We have a `model` in the `OllamaEmbeddingModelConnection`, and also a `model_name` in `BaseEmbeddingModelSetup`. What is the relationship between these two?



##########
python/flink_agents/integrations/embedding_models/local/tests/start_ollama_server.sh:
##########
@@ -0,0 +1,32 @@
+################################################################################
+#  Licensed to the Apache Software Foundation (ASF) under one
+#  or more contributor license agreements.  See the NOTICE file
+#  distributed with this work for additional information
+#  regarding copyright ownership.  The ASF licenses this file
+#  to you under the Apache License, Version 2.0 (the
+#  "License"); you may not use this file except in compliance
+#  with the License.  You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+#  Unless required by applicable law or agreed to in writing, software
+#  distributed under the License is distributed on an "AS IS" BASIS,
+#  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+#  See the License for the specific language governing permissions and
+# limitations under the License.
+################################################################################
+
+# only works on linux
+os=$(uname -s)
+echo $os
+if [[ $os == "Linux" ]]; then
+  curl -fsSL https://ollama.com/install.sh | sh
+  ret=$?
+  if [ "$ret" != "0" ]
+  then
+    exit $ret
+  fi
+  ollama serve
+  ollama pull all-minilm:22m
+  ollama run all-minilm:22m
+fi

Review Comment:
   I see several improvements needed for these test cases:
   
   1. We might want to reuse the script, passing in the required models as arguments.
   2. In the script, we may check whether the server is already started. If true, we can just start the required model on the server, if not already running.
   3. We should check whether the script fails in the test cases.
   4. We'd better shut down the server at some point. We may rely on GHA to clear the environment, but when executed locally this will leave a running process.
   
   cc @wenjin272 
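   As a starting point for 1-4, a hypothetical Python-side test helper (not part of this PR): models are passed in as arguments, an already-running server is reused, failures surface via `check=True`, and the server is only shut down by the caller when this helper started it. The `/api/version` probe assumes Ollama's default port 11434, and the injectable callables exist so the helper can be exercised without a real server:

   ```python
import subprocess
import urllib.error
import urllib.request
from typing import Callable, List, Optional


def server_running(base_url: str = "http://localhost:11434") -> bool:
    """Return True if an Ollama server already answers on base_url."""
    try:
        urllib.request.urlopen(f"{base_url}/api/version", timeout=2)
        return True
    except (urllib.error.URLError, OSError):
        return False


def ensure_ollama(
    models: List[str],
    is_running: Callable[[], bool] = server_running,
    run: Callable[..., None] = lambda *cmd: subprocess.run(cmd, check=True),
    start: Callable[[], object] = lambda: subprocess.Popen(["ollama", "serve"]),
) -> Optional[object]:
    """Start Ollama if needed and pull the required models.

    Returns the server process if we started one (the test teardown must
    terminate it), or None if an existing server was reused.
    """
    proc = None
    if not is_running():
        proc = start()  # point 2: only start when not already up
    for model in models:
        run("ollama", "pull", model)  # point 3: failures raise via check=True
    return proc  # point 4: caller terminates proc in teardown if not None
   ```

   A pytest fixture could call `ensure_ollama([...])` in setup and terminate the returned process (when not None) in teardown.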



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
