On Tue, Feb 18, 2025 at 1:49 PM R, Rakshit <rakshi...@intel.com> wrote:
>
> Hi All,
>
> Thank you very much for your feedback! We investigated the recommendations as 
> we developed the current patch. Please refer to the comments below.
>
> > I'd suggest targeting pg_qsort() directly, instead of list_sort(). 
> > list_sort() is not used in very performance critical parts.
> Using x86-simd-sort would require a new pg_qsort variant taking an extractor 
> function. The new API would still have to bubble up to functions like 
> list_sort, which internally call pg_qsort. We targeted list_sort first, as we 
> were aware of at least one use case that benefits from SIMD sorting 
> (pgvector). As we target more use cases (such as tuple sort), we may need to 
> move the SIMD-enabled implementation to a common location, and a new variant 
> of pg_qsort may be a good candidate.
>
> > I'd suggest proposing this to the pgvector project directly, since pgvector 
> > would immediately benefit.
> Yes, we’ll share the performance gains on HNSW index build time with the 
> pgvector community. However, other users of list_sort (e.g., other 
> extensions) may benefit from the simd-sort version of list_sort as well. As 
> it is not a pgvector-specific optimization, we propose it for PostgreSQL.

I don't think "another extension might use it someday" makes a very
strong case, particularly for something that requires a new
dependency.

> > If you could use this to speed up tuple sorting, that would be much more 
> > interesting for PostgreSQL itself.
> Enabling x86-simd-sort for tuple sorting is in development. We observed 
> promising results, but there is still work to do (e.g., to ensure there are 
> no regressions in different conditions). We are planning to share another 
> patch for tuple sort building on this one.

Note that our current implemention is highly optimized for
low-cardinality inputs. This is needed for aggregate queries. I found
this write-up of a couple scalar and vectorized sorts, and they show
this library doing very poorly on very-low cardinality inputs. I would
look into that before trying to get something in shape to share.

https://github.com/Voultapher/sort-research-rs/blob/main/writeup/intel_avx512/text.md

There is also the question of hardware support. It seems AVX-512 is
not supported well on client side, where most developers work. And
availability of any flavor is not guaranteed on server either.
Something to keep in mind.

-- 
John Naylor
Amazon Web Services


Reply via email to