[ 
https://issues.apache.org/jira/browse/TIKA-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Allison updated TIKA-2720:
------------------------------
    Fix Version/s:     (was: 2.0.0)
                   2.0.0-BETA

> A parser to output universal sentence encodings to text
> -------------------------------------------------------
>
>                 Key: TIKA-2720
>                 URL: https://issues.apache.org/jira/browse/TIKA-2720
>             Project: Tika
>          Issue Type: New Feature
>          Components: tika-dl
>            Reporter: Thejan Wijesinghe
>            Priority: Major
>             Fix For: 2.0.0-BETA
>
>
> This parser encodes text into high-dimensional vectors that can be used for 
> text classification, semantic similarity, clustering, and other natural 
> language tasks. The model is trained and optimized for greater-than-word-length 
> text, such as sentences, phrases, or short paragraphs. It is trained on a 
> variety of data sources and tasks, with the aim of dynamically accommodating 
> a wide range of natural language understanding tasks. The input is 
> variable-length English text and the output is a 512-dimensional vector.
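
The description above mentions semantic similarity as one use of the 512-dimensional output. As a rough illustration only, the sketch below compares two such vectors with cosine similarity. It does not call Tika or the sentence-encoder model: the helper name `cosine_similarity` and the placeholder vectors `u` and `v` are assumptions made for this example, standing in for whatever the parser would actually emit.

```python
import math

EMBEDDING_DIM = 512  # output size stated in the issue description


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Placeholder vectors (hypothetical, NOT real sentence encodings).
u = [1.0] * EMBEDDING_DIM
v = [1.0 if i % 2 == 0 else -1.0 for i in range(EMBEDDING_DIM)]

same = cosine_similarity(u, u)       # a vector compared with itself
different = cosine_similarity(u, v)  # two orthogonal placeholder vectors
```

With real encodings, values closer to 1 would suggest more similar sentences. Whether cosine similarity is the best measure for this parser's output is not stated in the issue.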



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
