rahil-c commented on code in PR #18867:
URL: https://github.com/apache/hudi/pull/18867#discussion_r3319033517
##########
website/docs/lance_file_format.md:
##########
@@ -87,7 +119,45 @@ All Hudi table services work with Lance-backed tables:
 - **Compaction** — merges log files into Lance base files
 - **Clustering** — reorganizes Lance files for better data locality
 - **Cleaning** — removes old Lance file versions
-- **Metadata indexing** — column stats and bloom filters work across Lance files
+- **Metadata indexing** — bloom filters work across Lance files; column stats and partition stats are
+  **automatically disabled** for Lance tables
+
+## VECTOR Storage on Lance
+
+VECTOR columns are stored natively in Lance as `FixedSizeList<Float32/Float64, dim>` — Lance's own
+vector column encoding. This unlocks Lance's built-in IVF-PQ approximate nearest neighbor (ANN) index

Review Comment:
   I think our file format integration with Lance does NOT give us any native support for their index. As far as I know, the index is handled at their table format level, whereas we are only using the file format.

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
