> "Looks like"?

I cannot find the reference, but I read a while back that a well-known 
company from Redwood uses it for their in-memory columnar storage. That might 
have just been a rumor, or it might have been research only; I'm not sure. It 
does not really matter anyway.

> SortTuples are currently 24 bytes, and supported vector registers are 16 
> bytes, so not sure how you think that would work.

The thought was to logically group multiple sort tuples together and then 
create a vectorized version of that group containing just the primitive-type 
sort key plus a small index/offset into the sort group, which is later used to 
swap the corresponding sort tuple into place. The sorting network would allow 
us to do branchless, register-based sorting for a particular sort run. I guess 
this idea is moot, given ...

> More broadly, the more invasive a change is, the more risk is involved, and 
> the more effort to test and evaluate. If you're serious about trying to 
> improve insertion sort performance, the simple idea we discussed at the start 
> of the thread is a much more modest step that has a good chance of justifying 
> the time put into it. That is not to say it's easy, however, because testing 
> is a non-trivial amount of work.

I absolutely agree. Let's concentrate on improving things incrementally.
Please excuse my asking: given that you contributed some of the recent 
vectorization work and have obviously experimented a lot with the sort code, I 
wondered whether you might already have tried something along those lines or 
researched the subject. It is definitely a very interesting topic.
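
For the archives, here is a rough sketch of the key-plus-index idea described 
above, so it is clear what was meant. It is only an illustration under stated 
assumptions: DemoSortTuple is a hypothetical 24-byte stand-in (not the real 
SortTuple), the key is assumed to fit in an int32, the group size is fixed at 
four, and plain scalar mask arithmetic stands in for actual vector 
instructions. All names are made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in, 24 bytes on typical 64-bit platforms. */
typedef struct DemoSortTuple
{
    void       *tuple;
    int64_t     datum1;     /* assumed to hold an int32 sort key */
    int32_t     isnull1;
    int32_t     srctape;
} DemoSortTuple;

/*
 * Branchless compare-exchange on (key, index) pairs: after the call,
 * keys[a] <= keys[b], and the indexes have travelled with their keys.
 */
static inline void
cmp_exchange(int32_t *keys, uint8_t *idx, int a, int b)
{
    int32_t     ka = keys[a];
    int32_t     kb = keys[b];
    uint8_t     ia = idx[a];
    uint8_t     ib = idx[b];
    int32_t     mask = -(int32_t) (ka > kb);    /* all ones if swapping */
    uint8_t     imask = (uint8_t) mask;

    keys[a] = ka ^ ((ka ^ kb) & mask);
    keys[b] = kb ^ ((ka ^ kb) & mask);
    idx[a] = (uint8_t) (ia ^ ((ia ^ ib) & imask));
    idx[b] = (uint8_t) (ib ^ ((ia ^ ib) & imask));
}

/* Sort a group of exactly four tuples by key, ascending (not stable). */
static void
sort_group4(DemoSortTuple *group)
{
    int32_t         keys[4];
    uint8_t         idx[4];
    DemoSortTuple   tmp[4];

    /* Extract the compact (key, index) representation of the group. */
    for (int i = 0; i < 4; i++)
    {
        keys[i] = (int32_t) group[i].datum1;
        idx[i] = (uint8_t) i;
    }

    /* Five-comparator sorting network for four elements. */
    cmp_exchange(keys, idx, 0, 1);
    cmp_exchange(keys, idx, 2, 3);
    cmp_exchange(keys, idx, 0, 2);
    cmp_exchange(keys, idx, 1, 3);
    cmp_exchange(keys, idx, 1, 2);

    /* Move the full tuples once, following the sorted indexes. */
    memcpy(tmp, group, sizeof(tmp));
    for (int i = 0; i < 4; i++)
        group[i] = tmp[idx[i]];
}
```

The point of the layout is that the network only touches the small keys and 
indexes, and each 24-byte tuple is moved once at the end. Whether that would 
beat the existing insertion sort is untested here.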

Cheers, Ben

