[
https://issues.apache.org/jira/browse/HIVE-27743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sreenath reassigned HIVE-27743:
-------------------------------
Assignee: Sreenath
> Semantic Search In Hive
> -----------------------
>
> Key: HIVE-27743
> URL: https://issues.apache.org/jira/browse/HIVE-27743
> Project: Hive
> Issue Type: Wish
> Environment: *
> Reporter: Sreenath
> Assignee: Sreenath
> Priority: Major
>
> _Semantic search is the technology that powers *vector databases*, and we
> can have the same power in Hive._
> Semantic search is a way for computers to understand the meaning behind words
> and phrases when you're searching for something. Instead of just looking for
> exact matches of keywords, it tries to figure out what you're really asking
> and provides results that are more relevant and meaningful to your question.
> It's like having a search engine that can understand what you mean, not just
> what you say, making it easier to find the information you're looking for.
> This ticket is a wish to have Semantic search in Hive.
> On the implementation side, semantic search uses an embedding model together
> with a similarity (or distance) function.
> My proposal is to implement functions for on-the-fly calculation of
> similarity distance between two values. Once we have them we could easily do
> semantic search as part of a where clause.
> * E.g. (using a cosine similarity function): "WHERE cos_dist(region,
> 'europe') > 0.9". It could return records with regions like Scandinavia,
> Nordic, Baltic, etc.
> * We could have functions that accept values as text or as vector
> embeddings.
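
To make the proposal above a bit more concrete, here is a minimal sketch of the vector-embedding variant of such a function, written in Python only for illustration (an actual Hive function would presumably be a Java UDF). The names `cosine_similarity` and `semantic_filter`, the sample rows, and the 2-dimensional vectors are all hypothetical; the vectors are hand-picked and do not come from any real embedding model. The text variant would additionally need an embedding model to turn strings into vectors, which is not shown here.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric vectors, in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: a zero vector is treated as similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)


def semantic_filter(rows, query_vec, threshold=0.9):
    """Mimic "WHERE cos_dist(col, query) > threshold" over (label, vector) rows."""
    return [label for label, vec in rows
            if cosine_similarity(vec, query_vec) > threshold]


# Hand-made toy embeddings (hypothetical, not produced by a real model).
ROWS = [
    ("Scandinavia", [0.95, 0.10]),
    ("Baltic", [0.90, 0.30]),
    ("Sahara", [0.10, 0.90]),
]
EUROPE = [1.0, 0.0]

matches = semantic_filter(ROWS, EUROPE, threshold=0.9)
```

With these toy vectors, `matches` contains the rows for Scandinavia and Baltic but not Sahara, which is the behaviour the WHERE clause example describes.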