[
https://issues.apache.org/jira/browse/CALCITE-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
krooswu reassigned CALCITE-7353:
--------------------------------
Assignee: krooswu
> Support Milvus Calcite Adapter
> ---------------------------------
>
> Key: CALCITE-7353
> URL: https://issues.apache.org/jira/browse/CALCITE-7353
> Project: Calcite
> Issue Type: New Feature
> Reporter: wgcn007
> Assignee: krooswu
> Priority: Major
> Fix For: 1.42.0
>
>
> h2. Background
> The exponential growth of AI and large model applications has driven a surge
> in demand for vector similarity search. Databases like PostgreSQL, Redis,
> Doris, and Elasticsearch have already added vector retrieval support. Milvus,
> as a high-performance, cloud-native vector database designed for scalable
> Approximate Nearest Neighbor (ANN) search, has been widely adopted. The
> goal of this Jira is to make Milvus more accessible by creating a full SQL
> abstraction layer.
> h2. Implementation
> I have completed a demo implementation that validates feasibility in [my
> repository|https://github.com/wg1026688210/calcite/commits/dev/]. It builds a
> Calcite adapter with operator push-down so that computation executes on the
> Milvus side. Key capabilities include:
> * Vector Search Push-down:
> The MilvusVectorSearchRule identifies Sort→Project→[Filter]→Scan patterns.
> When ORDER BY contains a vector distance function, it fuses the entire query
> (filtering, projection, vector search, sorting, LIMIT) into a single
> MilvusVectorSearch operator pushed to Milvus, with SQL Hint support for
> search parameters:
> {code:java}
> SELECT book_name, l2_distance(VectorFieldAutoTest, '[0.1, 0.2, 0.3, 0.4]') AS d
> FROM milvus.test_vector_search /*+ MILVUS_OPTIONS(nprobe='100000') */
> WHERE book_name <> '小王子'
> ORDER BY d
> LIMIT 3 {code}
>
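To make the fusion concrete, here is a rough sketch of the pieces of the example query above that a single fused search operator would have to carry. The class FusedVectorSearchSketch, its fields, and the "!=" filter syntax are assumptions made for illustration; they are not taken from the adapter code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical value object; the class and field names are illustrative only.
final class FusedVectorSearchSketch {
  final String collection;                 // from the FROM clause
  final String vectorField;                // first argument of the distance function
  final List<Float> queryVector;           // second argument of the distance function
  final String filterExpr;                 // from WHERE
  final List<String> outputFields;         // from the SELECT list
  final int topK;                          // from LIMIT
  final Map<String, String> searchParams;  // from the MILVUS_OPTIONS hint

  FusedVectorSearchSketch(String collection, String vectorField,
      List<Float> queryVector, String filterExpr, List<String> outputFields,
      int topK, Map<String, String> searchParams) {
    this.collection = collection;
    this.vectorField = vectorField;
    this.queryVector = queryVector;
    this.filterExpr = filterExpr;
    this.outputFields = outputFields;
    this.topK = topK;
    this.searchParams = searchParams;
  }

  /** Hand-built equivalent of the example query; no SQL parsing happens here. */
  static FusedVectorSearchSketch fromExampleQuery() {
    Map<String, String> params = new LinkedHashMap<>();
    params.put("nprobe", "100000");
    return new FusedVectorSearchSketch(
        "test_vector_search",
        "VectorFieldAutoTest",
        Arrays.asList(0.1f, 0.2f, 0.3f, 0.4f),
        // The book title from the example, written with Unicode escapes.
        "book_name != \"\u5c0f\u738b\u5b50\"",
        Collections.singletonList("book_name"),
        3,
        params);
  }

  public static void main(String[] args) {
    FusedVectorSearchSketch r = fromExampleQuery();
    System.out.println(r.collection + " topK=" + r.topK
        + " fields=" + r.outputFields + " params=" + r.searchParams);
  }
}
```

The point of the sketch is only that filter, projection, query vector, limit, and hint options all travel in one request instead of being evaluated by separate operators.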
> * Predicate Push-down:
> Converts scalar field comparison operators (=, <>, >, <, >=, <=, LIKE) and
> logical operators (AND, OR, NOT) into Milvus expression strings for
> server-side filtering:
> {code:java}
> SELECT * FROM milvus.vector_table WHERE id > 1 {code}
>
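The conversion described above can be sketched with a small expression tree that renders itself as a filter string. This is an illustrative sketch, not the adapter's code: MilvusFilterSketch and its nested types are hypothetical, and the target syntax (==, !=, like, and, or, not) is an assumption about Milvus boolean expressions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical mini expression tree; names are illustrative only.
final class MilvusFilterSketch {
  interface Expr {
    String toMilvus();
  }

  /** A scalar comparison such as id > 1 or book_name <> 'x'. */
  static final class Compare implements Expr {
    final String field;
    final String sqlOp;
    final Object value;

    Compare(String field, String sqlOp, Object value) {
      this.field = field;
      this.sqlOp = sqlOp;
      this.value = value;
    }

    public String toMilvus() {
      return field + " " + milvusOp(sqlOp) + " " + literal(value);
    }
  }

  /** AND / OR over two or more operands. */
  static final class Logical implements Expr {
    final String sqlOp; // "AND" or "OR"
    final List<Expr> operands;

    Logical(String sqlOp, List<Expr> operands) {
      this.sqlOp = sqlOp;
      this.operands = operands;
    }

    public String toMilvus() {
      List<String> parts = new ArrayList<>();
      for (Expr e : operands) {
        parts.add(e.toMilvus());
      }
      String joiner = "AND".equals(sqlOp) ? " and " : " or ";
      return "(" + String.join(joiner, parts) + ")";
    }
  }

  static final class Not implements Expr {
    final Expr inner;

    Not(Expr inner) {
      this.inner = inner;
    }

    public String toMilvus() {
      return "not (" + inner.toMilvus() + ")";
    }
  }

  /** Maps a SQL comparison operator to the assumed Milvus spelling. */
  static String milvusOp(String sqlOp) {
    switch (sqlOp) {
      case "=": return "==";
      case "<>": return "!=";
      case ">": case "<": case ">=": case "<=": return sqlOp;
      case "LIKE": return "like";
      default:
        throw new IllegalArgumentException("Unsupported operator: " + sqlOp);
    }
  }

  /** Numbers are emitted as-is; everything else becomes a quoted string. */
  static String literal(Object value) {
    if (value instanceof Number) {
      return value.toString();
    }
    String s = String.valueOf(value);
    return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
  }

  public static void main(String[] args) {
    Expr idFilter = new Compare("id", ">", 1);
    Expr nameFilter = new Compare("book_name", "<>", "x");
    System.out.println(idFilter.toMilvus()); // prints: id > 1
    System.out.println(new Logical("AND", Arrays.asList(idFilter, nameFilter)).toMilvus());
  }
}
```

Rejecting unknown operators with an exception mirrors the fallback idea in this issue: anything the translator cannot express would stay outside the pushed-down filter.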
> * Projection Push-down:
> Supports constant and column name projection push-down to minimize data
> transfer:
> {code:java}
> SELECT book_name, 'content' FROM milvus.vector_table {code}
>
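One plausible way to handle a mixed projection like the one above is to request only the referenced columns from Milvus and re-attach constants on the client. The issue does not say how the adapter does this, so the following is a sketch under that assumption; ProjectionSplitSketch and Item are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of the adapter described in this issue.
final class ProjectionSplitSketch {
  /** One projected expression: either a column reference or a constant. */
  static final class Item {
    final String column;   // non-null for a column reference
    final Object constant; // used only when column is null

    private Item(String column, Object constant) {
      this.column = column;
      this.constant = constant;
    }

    static Item column(String name) {
      return new Item(name, null);
    }

    static Item constant(Object value) {
      return new Item(null, value);
    }
  }

  /** Distinct column names to request from the server, in projection order. */
  static List<String> outputFields(List<Item> projection) {
    List<String> fields = new ArrayList<>();
    for (Item item : projection) {
      if (item.column != null && !fields.contains(item.column)) {
        fields.add(item.column);
      }
    }
    return fields;
  }

  /** Rebuilds one output row from the fetched field values plus constants. */
  static List<Object> buildRow(List<Item> projection, Map<String, Object> fetched) {
    List<Object> row = new ArrayList<>();
    for (Item item : projection) {
      row.add(item.column != null ? fetched.get(item.column) : item.constant);
    }
    return row;
  }
}
```

For the query above, only book_name would be requested as an output field, and the literal 'content' would be filled in per row without any extra data transfer.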
> * Vector UDF Fallback:
> Added distance calculation UDFs: L2_DISTANCE(), COSINE_DISTANCE(), and
> INNER_PRODUCT(). For complex operators (JOIN/UNION/AGG) or non-pushdown
> scenarios (unsupported UDFs, least-similar vector queries), vector
> computation falls back to in-memory execution, ensuring complete
> functionality. Example:
>
> {code:java}
> SELECT book_name, l2_distance(vector_field, '[...]') d
> FROM milvus.vector_table
> ORDER BY d DESC LIMIT 5 -- Find least similar Top N {code}
>
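For the in-memory fallback, the distance computations themselves are simple. The sketch below assumes plain Euclidean distance for L2 and cosine similarity for the cosine function; the issue does not state the exact definitions the UDFs use (for example, whether L2 is squared), and VectorDistanceSketch is a hypothetical name.

```java
// Hypothetical stand-in for the vector UDF computations; definitions are assumed.
final class VectorDistanceSketch {
  /** Parses a literal such as "[0.1, 0.2, 0.3]" into a float array. */
  static float[] parseVector(String literal) {
    String body = literal.trim();
    if (!body.startsWith("[") || !body.endsWith("]")) {
      throw new IllegalArgumentException("Expected [v1, v2, ...]: " + literal);
    }
    body = body.substring(1, body.length() - 1).trim();
    if (body.isEmpty()) {
      return new float[0];
    }
    String[] parts = body.split(",");
    float[] vector = new float[parts.length];
    for (int i = 0; i < parts.length; i++) {
      vector[i] = Float.parseFloat(parts[i].trim());
    }
    return vector;
  }

  private static void checkSameLength(float[] a, float[] b) {
    if (a.length != b.length) {
      throw new IllegalArgumentException(
          "Dimension mismatch: " + a.length + " vs " + b.length);
    }
  }

  /** Euclidean distance (not squared). */
  static double l2Distance(float[] a, float[] b) {
    checkSameLength(a, b);
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      double diff = (double) a[i] - b[i];
      sum += diff * diff;
    }
    return Math.sqrt(sum);
  }

  /** Dot product of the two vectors. */
  static double innerProduct(float[] a, float[] b) {
    checkSameLength(a, b);
    double sum = 0.0;
    for (int i = 0; i < a.length; i++) {
      sum += (double) a[i] * b[i];
    }
    return sum;
  }

  /** Cosine similarity; NaN if either vector has zero length. */
  static double cosineSimilarity(float[] a, float[] b) {
    double norms = Math.sqrt(innerProduct(a, a)) * Math.sqrt(innerProduct(b, b));
    return innerProduct(a, b) / norms;
  }

  public static void main(String[] args) {
    float[] query = parseVector("[0.0, 0.0]");
    float[] row = parseVector("[3.0, 4.0]");
    System.out.println(l2Distance(query, row)); // prints: 5.0
  }
}
```

A "least similar Top N" query like the one above would then compute the distance for every row returned by the scan, sort descending in memory, and keep the first N rows.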
> Core Components:
> * Milvus Metadata Layer:
> ** MilvusSchema: Manages all collections in a Milvus database and is
> responsible for automatic collection discovery
> ** MilvusTranslatableTable (Core): Table abstraction bridging Milvus
> collections and Calcite tables; provides field metadata and type mapping,
> and creates MilvusTableScan nodes in toRel()
> * Milvus SQL Operators & Rules Layer:
> ** MilvusTableScan: Handles Milvus data scanning and SQL Hint storage
> ** MilvusFilter / MilvusFilterRule: Implements predicate push-down,
> converting SQL conditions to Milvus expressions
> ** MilvusProject / MilvusProjectRule: Supports constant and column name
> projection push-down
> ** MilvusVectorSearch / MilvusVectorSearchRule (Core Optimization Rule):
> Identifies vector query patterns, validates the sort direction against the
> distance type (ASC for L2; DESC for IP and COSINE), and fuses the entire
> query into a single operator for push-down
> ** MilvusToEnumerableConverterRule / MilvusToEnumerableConverter: Code
> generation layer that converts Milvus physical operators to executable Java
> code, generating table.vectorSearch() or table.scan() calls
> ** MilvusRel: Defines the Milvus adapter calling convention, the unified
> interface for all Milvus operators
> * Milvus Query Execution Layer:
> ** MilvusSearchEnumerator / MilvusQueryEnumerator: Correspond to the Search
> (vector retrieval) and Query (paginated scanning) operations in Milvus,
> respectively
> * Vector UDF:
> ** MilvusOperatorTable / MilvusVectorFunction / MilvusVectorUdfs: Implement
> registration, declaration, and computation logic for the L2/Cosine/IP
> distance functions
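The sort-direction check that MilvusVectorSearchRule is described as performing can be sketched as a small predicate. This is a guess at the logic based only on the description above (L2 with ASC, IP and COSINE with DESC); SearchDirectionSketch, Metric, and canPushDown are hypothetical names.

```java
import java.util.Locale;

// Hypothetical sketch of the direction check; not the rule's actual code.
final class SearchDirectionSketch {
  enum Metric { L2, IP, COSINE }

  /** Maps the SQL function names listed in this issue to a metric. */
  static Metric metricOf(String functionName) {
    switch (functionName.toUpperCase(Locale.ROOT)) {
      case "L2_DISTANCE": return Metric.L2;
      case "INNER_PRODUCT": return Metric.IP;
      case "COSINE_DISTANCE": return Metric.COSINE;
      default:
        throw new IllegalArgumentException("Not a vector function: " + functionName);
    }
  }

  /**
   * True when the ORDER BY direction asks for the most similar rows first,
   * which is the only ordering a nearest-neighbor search can serve directly.
   */
  static boolean canPushDown(Metric metric, boolean descending) {
    // For L2 a smaller value means closer; for IP and COSINE a larger one does.
    boolean smallerIsCloser = metric == Metric.L2;
    return smallerIsCloser != descending;
  }

  public static void main(String[] args) {
    // ORDER BY l2_distance(...) ASC: nearest neighbors, eligible for push-down.
    System.out.println(canPushDown(metricOf("l2_distance"), false)); // prints: true
    // ORDER BY l2_distance(...) DESC: least similar, falls back to in-memory.
    System.out.println(canPushDown(metricOf("l2_distance"), true)); // prints: false
  }
}
```

When the check fails, the query would keep the UDF-based in-memory plan rather than being rewritten into a MilvusVectorSearch.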
> h2. Use Cases:
> * Build a Milvus SQL Gateway Service: Provides a standard SQL interface for
> Milvus data read/write and collection management, significantly improving
> usability and compatibility with existing SQL ecosystem tools.
> * Integration with an Internally Developed Multimodal Compute Engine: Serves
> as a vector search execution engine integrated into internal compute
> platforms, extending the engine's functionality and improving the
> performance of vector retrieval.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)