ColdL opened a new issue, #7011:
URL: https://github.com/apache/paimon/issues/7011

   ### Search before asking
   
   - [x] I searched in the [issues](https://github.com/apache/paimon/issues) 
and found nothing similar.
   
   
   ### Motivation
   
   
[PIP-40](https://cwiki.apache.org/confluence/display/PAIMON/PIP-40%3A+Introduce+a+new+Vector+data+type)
   
   ### Solution
   
   As discussed in 
[PIP-40](https://cwiki.apache.org/confluence/display/PAIMON/PIP-40%3A+Introduce+a+new+Vector+data+type),
 we propose introducing a dedicated vector data type in Paimon to better 
support storage and retrieval of vector data for AI workloads.
   
   PIP-40 can be roughly split into two parts:
   (1) introducing the vector data type itself;
   (2) allowing users to specify the file format for vector data, to further 
optimize storage/access efficiency in mixed workloads.
   
   For Part (1), a basic implementation is already available and includes:
    - Introducing a new vector type. To avoid confusion with the existing term 
"Vector" in the codebase, the new type is named VecType (defined in VecType.java).
    - Providing a ColumnVector implementation for the vector type, with support 
in the paimon-arrow module, so Arrow-related file formats (e.g., Lance) can map 
FixedSizeList to VecType.
    - Adding Flink-side compatibility: via configuration, a Flink Array can be 
stored as Paimon VecType.
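
   To illustrate why a fixed-length vector type lines up with Arrow's 
FixedSizeList, here is a minimal sketch in plain Java. It is not the draft 
implementation: `FixedVecSketch` and its methods are hypothetical names invented 
for this example, and only float elements are assumed. The point is that when 
every row has the same dimension, values can be stored back to back and the 
offset of row i is simply i * dimension, so no per-row offsets are needed.

```java
import java.util.Arrays;

/** Hypothetical sketch, NOT Paimon's VecType: a column of fixed-length float vectors. */
final class FixedVecSketch {
    private final int dimension; // number of elements in every vector
    private float[] values;      // all vectors stored back to back
    private int rowCount;

    FixedVecSketch(int dimension) {
        if (dimension <= 0) {
            throw new IllegalArgumentException("dimension must be positive: " + dimension);
        }
        this.dimension = dimension;
        this.values = new float[dimension * 4]; // room for four rows initially
    }

    /** Appends one vector; rejects arrays whose length differs from the declared dimension. */
    void append(float[] vector) {
        if (vector.length != dimension) {
            throw new IllegalArgumentException(
                    "expected " + dimension + " elements but got " + vector.length);
        }
        int offset = rowCount * dimension;
        if (offset + dimension > values.length) {
            values = Arrays.copyOf(values, values.length * 2); // grow the flat buffer
        }
        System.arraycopy(vector, 0, values, offset, dimension);
        rowCount++;
    }

    /** Returns a copy of the given row; its offset is computed, not looked up. */
    float[] get(int row) {
        if (row < 0 || row >= rowCount) {
            throw new IndexOutOfBoundsException("row " + row);
        }
        int offset = row * dimension;
        return Arrays.copyOfRange(values, offset, offset + dimension);
    }

    int rowCount() {
        return rowCount;
    }
}
```

   A general Array column would instead have to track a length or offset per 
row, which is the overhead a dedicated fixed-size type avoids.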
   
   For Part (2) (specifying the file format for vector data), work is still in 
progress.
   
   Although the code is still in a draft state, it changes some basic 
interfaces (e.g., DataGetters), so I'd like to discuss it early. Any comments 
on this, @JingsongLi?
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit a PR?
   
   - [x] I'm willing to submit a PR!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
