A parser to output universal sentence encodings to text. This uses Tensorflow Java APIs, currently have added tests only to verify its abilities. In tests, I mainly shows, how this parser can be used to output sentence embeddings for multiple sentences all at once. Once the embeddings are generated, I calculate cosine similarities between each and every sentence embedding and simply prints out the sentence couples that have high correlations.
[ Full content available at: https://github.com/apache/tika/pull/248 ] This message was relayed via gitbox.apache.org for [email protected]
