rahil-c opened a new issue, #18079:
URL: https://github.com/apache/hudi/issues/18079

   ### Feature Description
   
   **What the feature achieves:**
   This feature provides the ability to perform a native **vector similarity 
search** on Hudi tables.
   
   **Why this feature is needed:**
   Building on 
[RFC-100](https://github.com/apache/hudi/pull/13924/files?short_path=a945f8d#diff-f05ae69c4f41edc32aabfbfc016a12ad1af72917314844f8ae52671234508c56)
 (unstructured data storage in Hudi), Hudi tables would contain unstructured 
content (e.g., images, video, documents) as well as the related *embeddings* 
for those contents. The next natural requirement for AI/ML workloads on Hudi is 
to **search these embeddings efficiently**.
   
   
   
   ### User Experience
   
   **How users will use this feature:**
   The initial scope of this feature is to let Spark users perform vector 
search through a new `vector_search` SQL table-valued function, similar to the 
other table-valued functions Hudi already provides. See the proposed RFC for 
more details: https://github.com/apache/hudi/pull/14218/changes
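
For readers unfamiliar with the term, the sketch below illustrates what a vector similarity search computes conceptually: rank stored embeddings by their similarity to a query embedding and return the top `k`. It is a brute-force, plain-Python illustration only. The function names, the `(row_id, embedding)` row shape, and the choice of cosine similarity are assumptions made for this example. It is not the proposed `vector_search` function, whose signature and semantics are defined in the RFC.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Treat a zero vector as having no similarity to anything.
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_similar(
    query: Sequence[float],
    rows: Sequence[Tuple[str, Sequence[float]]],
    k: int,
) -> List[Tuple[str, float]]:
    """Score every (row_id, embedding) pair and return the k best matches.

    Hypothetical helper for illustration: a full scan over all rows,
    with no index or approximation.
    """
    scored = [(row_id, cosine_similarity(query, emb)) for row_id, emb in rows]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


if __name__ == "__main__":
    # Toy "table" of embeddings keyed by a made-up row id.
    table = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
    for row_id, score in top_k_similar([1.0, 0.0], table, k=2):
        print(row_id, round(score, 4))
```

A full scan like this costs time proportional to the number of rows for every query, which is why searching embeddings efficiently at table scale is the harder part of the problem.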
   
   
   ### Hudi RFC Requirements
   
   **RFC PR link:** https://github.com/apache/hudi/pull/14218/changes
   
   **Why RFC is/isn't needed:**
   - Does this change public interfaces/APIs? (Yes/No)
   - Does this change storage format? (Yes/No)
   - Justification:
   


