prathyand opened a new pull request, #4161: URL: https://github.com/apache/solr/pull/4161
https://issues.apache.org/jira/browse/SOLR-18127

# Description

This PR adds a Solr-native `TextToVectorModel` interface so that users can plug in their own embedding models without needing to implement LangChain4j classes.

Right now, `SolrTextToVectorModel` is tied directly to LangChain4j’s `EmbeddingModel`, which makes it awkward for anyone who has a custom embedding service: they have to implement the whole LC4j API even if they don’t use it anywhere else.

The goal here is to give Solr a simple internal abstraction for “text → vector” while keeping all existing LangChain4j providers working the same as before.

# Solution

The change is broken up into a few pieces:

1. **New `TextToVectorModel` interface**
   A small interface that provides the basic methods Solr needs.

2. **`LangChain4jModelAdapter`**
   An adapter that wraps an LC4j `EmbeddingModel` and exposes it through the new interface. This keeps everything backwards-compatible so existing configs keep working without changes.

3. **Updates to the model initialization workflow**
   `SolrTextToVectorModel`'s `getInstance()` now recognizes either:
   - a native `TextToVectorModel` implementation, or
   - a LangChain4j model wrapped using the adapter.
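For reviewers who want a picture of how the three pieces fit together, here is a minimal, self-contained sketch. It is not the code in this PR. The method name `vectorise`, the `ThirdPartyEmbedder` and `ThirdPartyEmbedding` stand-in types (used in place of the real LC4j classes so the example compiles without dependencies), and the `resolve` helper are all assumptions made for illustration.

```java
// Assumed shape of the Solr-native abstraction; the method name is hypothetical.
interface TextToVectorModel {
  float[] vectorise(String text);
}

// Stand-in for a third-party embedding API such as LC4j's EmbeddingModel.
// This is NOT the real LC4j type, only a placeholder with a similar role.
interface ThirdPartyEmbedder {
  ThirdPartyEmbedding embed(String text);
}

// Stand-in for the third-party result wrapper.
final class ThirdPartyEmbedding {
  private final float[] values;

  ThirdPartyEmbedding(float[] values) {
    this.values = values;
  }

  float[] values() {
    return values;
  }
}

// Adapter: exposes the third-party model through the native interface.
final class ThirdPartyModelAdapter implements TextToVectorModel {
  private final ThirdPartyEmbedder delegate;

  ThirdPartyModelAdapter(ThirdPartyEmbedder delegate) {
    this.delegate = java.util.Objects.requireNonNull(delegate, "delegate");
  }

  @Override
  public float[] vectorise(String text) {
    return delegate.embed(text).values();
  }
}

// Hypothetical dispatch, loosely mirroring the "native or wrapped" decision.
final class TextToVectorModels {
  private TextToVectorModels() {}

  static TextToVectorModel resolve(Object candidate) {
    if (candidate instanceof TextToVectorModel) {
      return (TextToVectorModel) candidate; // native implementation, used as-is
    }
    if (candidate instanceof ThirdPartyEmbedder) {
      return new ThirdPartyModelAdapter((ThirdPartyEmbedder) candidate); // wrapped
    }
    throw new IllegalArgumentException(
        "Unsupported model type: " + (candidate == null ? "null" : candidate.getClass().getName()));
  }
}
```

With this shape, a custom embedding service only has to implement the single-method interface, while an existing third-party model goes through the adapter and callers see the same `TextToVectorModel` type in both cases.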
### AI Assist Disclosure

The `ModelConfigUtils.convertValue` helper method was written with some assistance from the AI tool **Windsurf**. It’s a small utility that converts parsed JSON model config values into the types expected by constructors.

# Tests

- Added a test to `TextToVectorUpdateProcessorTest` that loads a custom `DummyTextToVectorModel`.
- This verifies the new interface works end-to-end and that the updated initialization logic can discover and load custom models correctly.
- Existing LC4j-based tests continue to work through the adapter.

# Checklist

- [x] I have reviewed the Contributing Guide and tried to follow the conventions as closely as possible.
- [x] I have created a Jira issue and referenced it in the PR title.
- [x] I have given Solr maintainers access to my PR branch.
- [x] I have developed this change against `main`.
- [x] I have run `./gradlew check`.
- [x] I have added tests for the new functionality.
- [ ] I have added documentation to the Reference Guide.
- [ ] I have added a changelog entry.
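To illustrate the kind of conversion the `ModelConfigUtils.convertValue` helper is described as doing (parsed JSON values into constructor parameter types), here is a rough sketch. It is not the helper from this PR: the class name `ConfigValueSketch`, the signature, and the set of supported types are assumptions, and it also assumes the JSON parser hands back integers as `Long` and decimals as `Double`.

```java
// Hypothetical illustration only; not the ModelConfigUtils code from this PR.
final class ConfigValueSketch {
  private ConfigValueSketch() {}

  // Coerces a parsed JSON value to the type a constructor parameter expects.
  static Object convertValue(Object value, Class<?> targetType) {
    if (value == null) {
      return null;
    }
    if (targetType.isInstance(value)) {
      return value; // already the right reference type
    }
    if (value instanceof Number) {
      Number n = (Number) value;
      if (targetType == int.class || targetType == Integer.class) return n.intValue();
      if (targetType == long.class || targetType == Long.class) return n.longValue();
      if (targetType == float.class || targetType == Float.class) return n.floatValue();
      if (targetType == double.class || targetType == Double.class) return n.doubleValue();
    }
    if (value instanceof Boolean && targetType == boolean.class) {
      return value; // boxed Boolean is acceptable for a primitive boolean parameter
    }
    if (value instanceof String) {
      String s = (String) value;
      if (targetType == boolean.class || targetType == Boolean.class) return Boolean.parseBoolean(s);
      if (targetType == int.class || targetType == Integer.class) return Integer.parseInt(s);
      if (targetType == long.class || targetType == Long.class) return Long.parseLong(s);
      if (targetType == double.class || targetType == Double.class) return Double.parseDouble(s);
    }
    throw new IllegalArgumentException(
        "Cannot convert " + value.getClass().getName() + " to " + targetType.getName());
  }
}
```

A loader could call such a helper once per constructor argument before invoking the constructor reflectively, so that a JSON `Long` can satisfy an `int` parameter and unsupported combinations fail with a clear message.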
