JingsongLi commented on PR #37: URL: https://github.com/apache/paimon-vector-index/pull/37#issuecomment-4688343729
Potential performance concern: this PR adds finite-f32 validation while decoding persisted list payloads. For IVF-FLAT in particular, `read_inverted_list()` is on the search path and runs once per probed list (`search`) or once per unique probed list (`search_batch`). The new `bytes_to_f32_vec_checked()` first decodes the whole vector payload into a `Vec<f32>` and then scans all values again with `is_finite()`, so each query pays for an extra pass over the candidate vectors, costing O(sum of count * d over the probed lists).

Could we either merge the finite check into the bytes-to-f32 decode loop to avoid the second pass, or validate or cache each list only once, if the reader can assume the underlying index bytes are immutable? The one-time checks in `ensure_loaded()` look fine; the main concern is the repeated per-search validation of list vectors.
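
For illustration, a minimal sketch of the first option: decoding and finite-checking in one loop, so no second scan over the decoded `Vec<f32>` is needed. The function name `decode_f32_checked`, the `DecodeError` type, and the little-endian layout are assumptions made for this sketch, not the PR's actual code or on-disk format.

```rust
/// Hypothetical error type for this sketch (not the PR's actual error type).
#[derive(Debug, PartialEq)]
enum DecodeError {
    /// Payload length in bytes is not a multiple of 4.
    BadLength(usize),
    /// The element at this index decoded to NaN or an infinity.
    NonFinite(usize),
}

/// Decode a byte payload into f32 values, rejecting non-finite values
/// in the same loop that performs the decode.
/// Assumption: values are stored as little-endian f32.
fn decode_f32_checked(bytes: &[u8]) -> Result<Vec<f32>, DecodeError> {
    if bytes.len() % 4 != 0 {
        return Err(DecodeError::BadLength(bytes.len()));
    }
    let mut out = Vec::with_capacity(bytes.len() / 4);
    for (i, chunk) in bytes.chunks_exact(4).enumerate() {
        let v = f32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
        // Check each value as it is decoded instead of rescanning afterwards.
        if !v.is_finite() {
            return Err(DecodeError::NonFinite(i));
        }
        out.push(v);
    }
    Ok(out)
}
```

This still checks every value on every read, so it removes only the second pass. The caching alternative would skip repeated validation entirely, but it is sound only if the index bytes really are immutable.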
