[ 
https://issues.apache.org/jira/browse/SOLR-17023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18020067#comment-18020067
 ] 

ASF subversion and git services commented on SOLR-17023:
--------------------------------------------------------

Commit 08c07309d878b9d2367484bfeb542fb31a28b407 in solr's branch 
refs/heads/main from Eric Pugh
[ https://gitbox.apache.org/repos/asf?p=solr.git;h=08c07309d87 ]

SOLR-17023: Use Modern NLP Models via ONNX and Apache OpenNLP with Solr (#1999)

* first version w bats test

* not positive I need this

* new name, and OpenNLP is kind of an implementation detail ;-).

* baby steps, found that packagesstore blows up killing solr when i post 600 mb 
file.

* remove resolutionStrategy force from gradle build..

* match name in the underlying OpenNLP project.  bikeshedding!

* tidy

* reorder params

* log formatting

* regenerate...

* dynamically grab the models from hugging face.

* use logging structure for stack traces

* download the correct jar, and document the work to remove this need in the 
OpenNLP project

* Upgrade to OpenNLP 2.3.1 and related dependencies.

* no longer need workarounds for gpu/cpu issues with updated OpenNLP.

* We cleaned up the name ;-)

* prompted to update the locks

* Add in required license files

* lint

* precommit warning

* upgrade OpenNLP from 2.3.1 to 2.3.2 (to match Lucene main branch)

* tentative: minimum Java17 for this PR

* Update gradle-precommit.yml - Java 11 --> 17

* Update solrj-test.yml - Java 11 --> 17

* Update docker-test.yml - Java 11 --> 17

* Update bin-solr-test.yml - Java 11 --> 17

* undo 'tentative: minimum Java17 for this PR' -- see PR 1510 instead

* Migrate to new way of specifying dependency versions

* New home for FILESTORE_DIRECTORY constant

* lint

* Add Onnx to list of dependencies

* Update test for recent Solr changes.

* Satisfy linking warning

* Update for latest file pattern

* Update to latest OpenNLP, also used by Lucene.

* One more dependency declaration.

* Bump dependencies to match what opennlp 2.5.3,

opennlp 2.5.3 is what lucene 10.2 ships with...

* Update licenses

* we require numpy 1, not numpy 2 to run the transformer

* Remove use of java.io.File to pass precommit

* Using onnx model instead of converting.

* Align version of OpenNLP

* Update NOTICE files

* Change to nightlies.a.o for file downloading to avoid toggling if files 
change on huggingface

* Revamp steps to take advantage of cached models.

* fix precommit

* Basic unit test for setting up.

* add license files

* lint

---------

Co-authored-by: Christine Poerschke <[email protected]>
Co-authored-by: jzonthemtn <[email protected]>
Co-authored-by: Richard Zowalla <[email protected]>

> Use Modern NLP Models from Apache OpenNLP with Solr
> ---------------------------------------------------
>
>                 Key: SOLR-17023
>                 URL: https://issues.apache.org/jira/browse/SOLR-17023
>             Project: Solr
>          Issue Type: New Feature
>            Reporter: Eric Pugh
>            Priority: Minor
>              Labels: pull-request-available
>          Time Spent: 11h 40m
>  Remaining Estimate: 0h
>
> During the 2023 Halifax Community over Code event we had a hackathon.   
> [~jzemerick] and I experimented with code he wrote a year ago and blogged 
> about at 
> https://opensourceconnections.com/blog/2022/06/27/using-modern-nlp-models-from-apache-opennlp-with-solr/.
> This is to experiment a bit more with this and start getting some feedback 
> from the community on ideas.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
