Thanks David and Eric for the pointers.
David - I checked out that PR you linked. The custom model @renatoh added
(extending LangChain4j’s EmbeddingModel) should still work exactly the same
with this change, since the LC4j path is untouched and everything continues
to run through the adapter.
Eric - thanks for the links to the PR on the ONNX/HuggingFace integration. I’m
not deeply familiar with that part of the code yet, so I’ll spend some time
going through it. Longer term I agree it would be great if users could drop
in smaller local models with minimal config.
I’ve opened a PR with the proposed changes for review -
https://github.com/apache/solr/pull/4161
I didn’t add any custom implementation of TextToVectorModel in this PR
beyond the interface itself. I figured the actual model implementations
will vary a lot depending on users’ setups, so the PR just provides the
interface, the LC4j adapter, and the initialization changes needed to
support both paths.
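For concreteness, here is a rough, self-contained sketch of the shape of the interface and adapter. All names and signatures below are illustrative stand-ins (including the stand-in for LangChain4j's EmbeddingModel, so the sketch compiles on its own), not the actual code in the PR:

```java
// Illustrative sketch only; names and signatures are stand-ins, not the PR's code.

// Solr-native abstraction: implementations turn text into a dense vector.
interface TextToVectorModel {
    float[] vectorise(String text);
}

// Minimal stand-in for LangChain4j's EmbeddingModel, so this sketch
// compiles without a LangChain4j dependency.
interface EmbeddingModel {
    float[] embed(String text);
}

// Adapter that keeps existing LangChain4j-based configurations working:
// it satisfies the Solr-native interface by delegating to LangChain4j.
final class Langchain4jModelAdapter implements TextToVectorModel {
    private final EmbeddingModel delegate;

    Langchain4jModelAdapter(EmbeddingModel delegate) {
        this.delegate = delegate;
    }

    @Override
    public float[] vectorise(String text) {
        return delegate.embed(text);
    }
}

public class Demo {
    public static void main(String[] args) {
        // Trivial fake model (vector = [text length]) just to show the wiring;
        // a real setup would wrap an actual embedding provider here.
        TextToVectorModel model =
            new Langchain4jModelAdapter(t -> new float[] { t.length() });
        System.out.println(model.vectorise("hello")[0]); // prints 5.0
    }
}
```

The point of the adapter is that the factory can hand either path (a native TextToVectorModel or a wrapped LangChain4j model) to the rest of the module through one interface.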
Feedback is very welcome!

Thanks,
Prathmesh Deshmukh


On Fri, Feb 20, 2026 at 7:13 AM David Eric Pugh via dev <[email protected]>
wrote:

>  More of a comment, but I would someday like to see us be able to just
> drop one of the smaller LLMs like a Qwen3 model into Solr and immediately
> get some value out of it.  Without all the plumbing and effort that we have
> today with various APIs and setups.
> I want to draw your attention to some work done in
> https://issues.apache.org/jira/browse/SOLR-17023 and especially
> https://github.com/apache/solr/pull/1999....
> Right now you can bring in models from HuggingFace through ONNX, and
> reference them in the update pipeline.   We also have the text to vector
> approach.   I could imagine that all that complexity is someday hidden from
> the end user and you just define a field type that handles it all!
>     On Thursday, February 19, 2026 at 08:32:24 PM EST, David Smiley <
> [email protected]> wrote:
>
>  Sounds great!  See https://github.com/apache/solr/pull/3476 where
> @renatoh
> is faced with this conundrum
>
> On Thu, Feb 19, 2026 at 6:32 AM Prathmesh Deshmukh <[email protected]>
> wrote:
>
> > Hi everyone,
> >
> >
> > I’d like to propose an enhancement to the Text to Vector (language-models)
> > module to support pluggable/custom embedding model implementations.
> >
> > At the moment, SolrTextToVectorModel is tightly coupled to LangChain4j’s
> > EmbeddingModel interface. This effectively limits support to the bundled
> > LangChain4j providers (HuggingFace, OpenAI, etc.). If someone wants to
> > integrate a custom embedding endpoint, they currently need to implement
> the
> > full LangChain4j EmbeddingModel interface, including its builder
> > conventions — even if they don’t otherwise use LangChain4j.
> >
> > There’s also no Solr-native abstraction for text-to-vector conversion
> > today.
> >
> > My proposal is to introduce a Solr-native TextToVectorModel interface and
> > decouple the module from LangChain4j. For backward compatibility, we
> could
> > add a Langchain4jModelAdapter that implements TextToVectorModel by
> wrapping
> > a LangChain4j EmbeddingModel. That way, existing configurations would
> > continue to work unchanged.
> >
> > With this approach, users could implement TextToVectorModel in their own
> > JAR, drop it into Solr’s classpath, and register it via the existing REST
> > API without taking on a LangChain4j dependency.
> >
> > The change would involve:
> >
> >    - adding TextToVectorModel
> >    - adding Langchain4jModelAdapter
> >    - updating SolrTextToVectorModel factory logic to support both paths
> >
> > I’d appreciate feedback on whether this direction makes sense. I’m happy
> to
> > open a JIRA and put together a draft PR for discussion. I have a working
> > implementation locally that demonstrates the approach.
> >
> >
> > Thanks,
> >
> > Prathmesh Deshmukh
> >
>
