On Tue, Feb 18, 2025 at 1:49 PM R, Rakshit <rakshi...@intel.com> wrote: > > Hi All, > > Thank you very much for your feedback! We investigated the recommendations as > we developed the current patch. Please refer to the comments below. > > > I'd suggest targeting pg_qsort() directly, instead of list_sort(). > > list_sort() is not used in very performance critical parts. > Using x86-simd-sort would require a new pg_qsort variant taking an extractor > function. The new API would still have to bubble up to functions like > list_sort, which internally call pg_qsort. We targeted list_sort first, as we > were aware of at least one use case that benefits from SIMD sorting > (pgvector). As we target more use cases (such as tuple sort), we may need to > move the SIMD-enabled implementation to a common location, and a new variant > of pg_qsort may be a good candidate. > > > I'd suggest proposing this to the pgvector project directly, since pgvector > > would immediately benefit. > Yes, we’ll share the performance gains on HNSW index build time with the > pgvector community. However, other users of list_sort (e.g., other > extensions) may benefit from the simd-sort version of list_sort as well. As > it is not a pgvector-specific optimization, we propose it for PostgreSQL.
I don't think "another extension might use it someday" makes a very strong case, particularly for something that requires a new dependency. > > If you could use this to speed up tuple sorting, that would be much more > > interesting for PostgreSQL itself. > Enabling x86-simd-sort for tuple sorting is in development. We observed > promising results, but there is still work to do (e.g., to ensure there are > no regressions in different conditions). We are planning to share another > patch for tuple sort building on this one. Note that our current implemention is highly optimized for low-cardinality inputs. This is needed for aggregate queries. I found this write-up of a couple scalar and vectorized sorts, and they show this library doing very poorly on very-low cardinality inputs. I would look into that before trying to get something in shape to share. https://github.com/Voultapher/sort-research-rs/blob/main/writeup/intel_avx512/text.md There is also the question of hardware support. It seems AVX-512 is not supported well on client side, where most developers work. And availability of any flavor is not guaranteed on server either. Something to keep in mind. -- John Naylor Amazon Web Services