[ https://issues.apache.org/jira/browse/OPENNLP-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17707404#comment-17707404 ]

ASF GitHub Bot commented on OPENNLP-1442:
-----------------------------------------

jzonthemtn commented on code in PR #523:
URL: https://github.com/apache/opennlp/pull/523#discussion_r1154812833


##########
opennlp-dl/src/main/java/opennlp/dl/vectors/SentenceVectorsDL.java:
##########
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package opennlp.dl.vectors;
+
+import java.io.File;
+import java.io.IOException;
+import java.nio.LongBuffer;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.Map;
+
+import ai.onnxruntime.OnnxTensor;
+import ai.onnxruntime.OrtEnvironment;
+import ai.onnxruntime.OrtException;
+import ai.onnxruntime.OrtSession;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import opennlp.dl.AbstractDL;
+import opennlp.dl.Tokens;
+import opennlp.tools.tokenize.Tokenizer;
+import opennlp.tools.tokenize.WordpieceTokenizer;
+
+/**
+ * Facilitates the generation of sentence vectors using
+ * a sentence-transformers model converted to ONNX.
+ */
+public class SentenceVectorsDL extends AbstractDL {
+
+  private static final Logger logger = LoggerFactory.getLogger(SentenceVectorsDL.class);
+
+  /**
+   * Creates an instance of the class.
+   * @param model The file name of a sentence vectors ONNX model.
+   * @param vocabulary The file name of the vocabulary file for the model.
+   * @throws OrtException Thrown if the model cannot be loaded.
+   * @throws IOException Thrown if the vocabulary file cannot be loaded.
+   */
+  public SentenceVectorsDL(final File model, final File vocabulary)
+      throws OrtException, IOException {
+
+    env = OrtEnvironment.getEnvironment();
+    session = env.createSession(model.getPath(), new OrtSession.SessionOptions());
+    vocab = loadVocab(new File(vocabulary.getPath()));
+    tokenizer = new WordpieceTokenizer(vocab.keySet());

Review Comment:
   Not a stupid question at all 

> Use ONNX Runtime to support sentence-transformers
> -------------------------------------------------
>
>                 Key: OPENNLP-1442
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-1442
>             Project: OpenNLP
>          Issue Type: Task
>          Components: Deep Learning
>            Reporter: Jeff Zemerick
>            Assignee: Jeff Zemerick
>            Priority: Major
>
> Use ONNX Runtime to support sentence-transformers. OpenNLP should be able to 
> generate embeddings using an ONNX model.



