cpoerschke commented on code in PR #2120:
URL: https://github.com/apache/solr/pull/2120#discussion_r1417572475
##########
solr/modules/analysis-extras/src/test/org/apache/solr/update/processor/TestOpenNLPExtractNamedEntitiesUpdateProcessorFactory.java:
##########
@@ -34,6 +34,13 @@ public static void beforeClass() throws Exception {
"solrconfig-opennlp-extract.xml", "schema-opennlp-extract.xml",
testHome.getAbsolutePath());
}
+ @Test
+ public void testTextToVector() throws Exception {
+ SolrInputDocument doc =
+ processAdd("text-to-vector", doc(f("id", "42"), f("name", "Hello World")));
+ assertEquals("TODO", "", doc.getFieldValue("film_vector"));
+ }
+
Review Comment:
`./gradlew -p solr/modules/analysis-extras test --tests TestOpenNLPExtractNamedEntitiesUpdateProcessorFactory.testTextToVector`
currently fails for me locally; I have not looked into it further yet.
```
...
Caused by:
> java.security.AccessControlException: access denied ("java.lang.RuntimePermission" "loadLibrary./Users/cpoerschke/solr/solr/modules/analysis-extras/build/tmp/tests-tmp/onnxruntime-java2654437641082471485/libonnxruntime.dylib")
> at java.base/java.security.AccessControlContext.checkPermission(AccessControlContext.java:485)
> at java.base/java.security.AccessController.checkPermission(AccessController.java:1068)
> at java.base/java.lang.SecurityManager.checkPermission(SecurityManager.java:416)
> at java.base/java.lang.SecurityManager.checkLink(SecurityManager.java:703)
> at java.base/java.lang.Runtime.load0(Runtime.java:748)
> at java.base/java.lang.System.load(System.java:1953)
> at ai.onnxruntime.OnnxRuntime.load(OnnxRuntime.java:369)
> at ai.onnxruntime.OnnxRuntime.init(OnnxRuntime.java:160)
> at ai.onnxruntime.OrtEnvironment.<clinit>(OrtEnvironment.java:31)
...
```
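For reference, a minimal sketch of the check behind that trace: `SecurityManager.checkLink` asks for a `RuntimePermission` named `loadLibrary.` plus the library path, so a grant has to imply that name. The class and method names below are hypothetical, the path is a placeholder, and whether a wildcard grant is acceptable for the test security policy is an assumption left open here.

```java
import java.security.Permission;

public class LoadLibraryPermissionSketch {

  // Hypothetical helper: builds the permission that SecurityManager.checkLink
  // requests when System.load(libraryPath) is called.
  public static Permission permissionFor(String libraryPath) {
    return new RuntimePermission("loadLibrary." + libraryPath);
  }

  // Returns true if the granted permission would cover loading libraryPath.
  public static boolean covers(Permission granted, String libraryPath) {
    return granted.implies(permissionFor(libraryPath));
  }

  public static void main(String[] args) {
    // Placeholder path; the real one contains a random temp directory name.
    String lib = "/tmp/tests-tmp/onnxruntime-java123/libonnxruntime.dylib";

    // The name that shows up in the "access denied" message.
    System.out.println(permissionFor(lib).getName());

    // A wildcard grant implies any loadLibrary.<path> permission.
    System.out.println(covers(new RuntimePermission("loadLibrary.*"), lib));

    // An unrelated RuntimePermission does not.
    System.out.println(covers(new RuntimePermission("accessDeclaredMembers"), lib));
  }
}
```

Because the ONNX runtime extracts the native library into a randomly named temp directory, an exact-name grant would presumably not match from run to run, which is why the sketch checks a wildcard.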
##########
solr/modules/analysis-extras/src/java/org/apache/solr/update/processor/TextToVectorProcessor.java:
##########
@@ -0,0 +1,119 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.solr.update.processor;
+
+import ai.onnxruntime.OrtException;
+import java.io.File;
+import java.io.IOException;
+import java.lang.invoke.MethodHandles;
+import opennlp.dl.vectors.SentenceVectorsDL;
+import org.apache.solr.common.SolrInputDocument;
+import org.apache.solr.common.params.SolrParams;
+import org.apache.solr.request.SolrQueryRequest;
+import org.apache.solr.response.SolrQueryResponse;
+import org.apache.solr.update.AddUpdateCommand;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public class TextToVectorProcessor extends UpdateRequestProcessor {
+
+ private static final Logger log = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass());
+
+ private static final String INPUT_FIELD_PARAM = "inputField";
+ private static final String OUTPUT_FIELD_PARAM = "outputField";
+ private static final String MODEL_FILE_NAME_PARAM = "model";
+ private static final String VOCAB_FILE_NAME_PARAM = "vocab";
+
+ private static final String DEFAULT_INPUT_FIELDNAME = "name";
+ private static final String DEFAULT_OUTPUT_FIELDNAME = "film_vector";
+ private static final String DEFAULT_MODEL_FILE_NAME =
+ "/Users/cpoerschke/opennlp-dataonnx/sentence-transformers/model.onnx";
+ private static final String DEFAULT_VOCAB_FILE_NAME =
+ "/Users/cpoerschke/opennlp-dataonnx/sentence-transformers/vocab.txt";
Review Comment:
Temporarily using the OpenNLP test models here, i.e. the ones used by
https://github.com/apache/opennlp/blob/main/opennlp-dl/src/test/java/opennlp/dl/vectors/SentenceVectorsDLEval.java,
which are available as per https://github.com/apache/opennlp/pull/560/files
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]