Hi everyone,
I’d like to propose an enhancement to the Text to Vector (language-models) module to support pluggable/custom embedding model implementations.

At the moment, SolrTextToVectorModel is tightly coupled to LangChain4j’s EmbeddingModel interface. This effectively limits support to the bundled LangChain4j providers (HuggingFace, OpenAI, etc.). If someone wants to integrate a custom embedding endpoint, they currently need to implement the full LangChain4j EmbeddingModel interface, including its builder conventions, even if they don’t otherwise use LangChain4j. There is also no Solr-native abstraction for text-to-vector conversion today.

My proposal is to introduce a Solr-native TextToVectorModel interface and decouple the module from LangChain4j. For backward compatibility, we could add a Langchain4jModelAdapter that implements TextToVectorModel by wrapping a LangChain4j EmbeddingModel, so existing configurations would continue to work unchanged. With this approach, users could implement TextToVectorModel in their own JAR, drop it onto Solr’s classpath, and register it via the existing REST API without taking on a LangChain4j dependency.

The change would involve:
- adding TextToVectorModel
- adding Langchain4jModelAdapter
- updating the SolrTextToVectorModel factory logic to support both paths

I’d appreciate feedback on whether this direction makes sense. I’m happy to open a JIRA and put together a draft PR for discussion. I have a working implementation locally that demonstrates the approach; a rough sketch of the key pieces is included below.
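To make this concrete, here is a rough sketch of what the interface and the adapter could look like. The names, packages, and method signatures are illustrative only and not a proposed final API; in particular I’m assuming a simple vectorise(String) -> float[] contract:

    /** Solr-native abstraction for turning a piece of text into a dense vector. */
    public interface TextToVectorModel {
      float[] vectorise(String text);
    }

    /**
     * Backward-compatibility adapter: wraps a LangChain4j EmbeddingModel so that
     * existing LangChain4j-based configurations keep working unchanged.
     */
    public class Langchain4jModelAdapter implements TextToVectorModel {

      private final dev.langchain4j.model.embedding.EmbeddingModel delegate;

      public Langchain4jModelAdapter(dev.langchain4j.model.embedding.EmbeddingModel delegate) {
        this.delegate = delegate;
      }

      @Override
      public float[] vectorise(String text) {
        // LangChain4j returns Response<Embedding>; unwrap it to a plain float[]
        return delegate.embed(text).content().vector();
      }
    }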
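And an example of what a user-supplied implementation might look like against the sketch above. The endpoint URL and the wire format here are made up purely for illustration; the point is that no LangChain4j classes are needed on the classpath:

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    /** Example custom implementation calling an in-house embedding service. */
    public class MyHttpEmbeddingModel implements TextToVectorModel {

      private final HttpClient client = HttpClient.newHttpClient();
      private final String endpoint; // e.g. an internal embedding service URL (placeholder)

      public MyHttpEmbeddingModel(String endpoint) {
        this.endpoint = endpoint;
      }

      @Override
      public float[] vectorise(String text) {
        try {
          HttpRequest request = HttpRequest.newBuilder(URI.create(endpoint))
              .header("Content-Type", "text/plain")
              .POST(HttpRequest.BodyPublishers.ofString(text))
              .build();
          // Assume the service replies with a comma-separated list of floats,
          // e.g. "0.12,0.98,...": an illustrative wire format, not a real one.
          String body = client.send(request, HttpResponse.BodyHandlers.ofString()).body();
          String[] parts = body.split(",");
          float[] vector = new float[parts.length];
          for (int i = 0; i < parts.length; i++) {
            vector[i] = Float.parseFloat(parts[i].trim());
          }
          return vector;
        } catch (Exception e) {
          throw new RuntimeException("Embedding request failed", e);
        }
      }
    }

The intent is that registering such a model via the existing REST API would keep the same general shape as today (class name plus params), with the factory choosing the adapter path or the custom path based on the configured class.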
Thanks,
Prathmesh Deshmukh