[
https://issues.apache.org/jira/browse/SOLR-17023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18020067#comment-18020067
]
ASF subversion and git services commented on SOLR-17023:
--------------------------------------------------------
Commit 08c07309d878b9d2367484bfeb542fb31a28b407 in solr's branch
refs/heads/main from Eric Pugh
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=08c07309d87 ]
SOLR-17023: Use Modern NLP Models via ONNX and Apache OpenNLP with Solr (#1999)
* first version w bats test
* not positive I need this
* new name, and OpenNLP is kind of an implementation detail ;-).
* baby steps, found that packagesstore blows up, killing Solr, when I post a 600 MB
file.
* remove resolutionStrategy force from gradle build.
* match name in the underlying OpenNLP project. bikeshedding!
* tidy
* reorder params
* log formatting
* regenerate...
* dynamically grab the models from hugging face.
* use logging structure for stack traces
* download the correct jar, and document the work to remove this need in the
OpenNLP project
* Upgrade to OpenNLP 2.3.1 and related dependencies.
* no longer need workarounds for gpu/cpu issues with updated OpenNLP.
* We cleaned up the name ;-)
* prompted to update the locks
* Add in required license files
* lint
* precommit warning
* upgrade OpenNLP from 2.3.1 to 2.3.2 (to match Lucene main branch)
* tentative: minimum Java17 for this PR
* Update gradle-precommit.yml - Java 11 --> 17
* Update solrj-test.yml - Java 11 --> 17
* Update docker-test.yml - Java 11 --> 17
* Update bin-solr-test.yml - Java 11 --> 17
* undo 'tentative: minimum Java17 for this PR' -- see PR 1510 instead
* Migrate to new way of specifying dependency versions
* New home for FILESTORE_DIRECTORY constant
* lint
* Add Onnx to list of dependencies
* Update test for recent Solr changes.
* Satisfy linking warning
* Update for latest file pattern
* Update to latest OpenNLP, also used by Lucene.
* One more dependency declaration.
* Bump dependencies to match what OpenNLP 2.5.3 uses;
OpenNLP 2.5.3 is what Lucene 10.2 ships with.
* Update licenses
* we require numpy 1, not numpy 2 to run the transformer
* Remove use of java.io.File to pass precommit
* Using onnx model instead of converting.
* Align version of OpenNLP
* Update NOTICE files
* Change to nightlies.a.o for file downloading to avoid toggling if files
change on huggingface
* Revamp steps to take advantage of cached models.
* fix precommit
* Basic unit test for setting up.
* add license files
* lint
---------
Co-authored-by: Christine Poerschke
Co-authored-by: jzonthemtn
Co-authored-by: Richard Zowalla
> Use Modern NLP Models from Apache OpenNLP with Solr
> ---------------------------------------------------
>
> Key: SOLR-17023
> URL: https://issues.apache.org/jira/browse/SOLR-17023
> Project: Solr
> Issue Type: New Feature
> Reporter: Eric Pugh
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 11h 40m
> Remaining Estimate: 0h
>
> During the 2023 Halifax Community over Code event we had a hackathon.
> [~jzemerick] and I experimented with code he wrote a year ago and blogged
> about at
> https://opensourceconnections.com/blog/2022/06/27/using-modern-nlp-models-from-apache-opennlp-with-solr/.
> This is to experiment a bit more with this and start getting some feedback
> from the community on ideas.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)