[Impala-CR](cdh5-trunk) Use AVX2 operations to speedup Bloom filters by 10-100%.

Jim Apple (Code Review) Mon, 13 Jun 2016 14:16:42 -0700

Jim Apple has posted comments on this change.

Change subject: Use AVX2 operations to speedup Bloom filters by 10-100%.
......................................................................



Patch Set 5:

> It just occurred to me that this could give incorrect results
 > running on a mixed cluster of avx2/non-avx2 machines.
 > 
 > Would it make sense to just use the avx2-optimised layout for the
 > non-avx2 case?

Good point, Tim!

Using the same layout is certainly possible. Using the same hash functions, 
however, would slow down the non-avx2 code.

The reason is that, between PS4 and PS5, I started using the vpmulld instruction 
to rehash the 32-bit value by multiplying it by 8 different odd 32-bit 
constants and taking the top 5 bits of each. In the serial code, I multiply by 
two different 64-bit constants using Rehash32to64, add other 64-bit constants, 
then take the top 32 bits of each. Switching to eight 32-bit 
multiplications would be a good bit slower, I suspect.
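
For illustration, the eight-multiplication rehash described above can be 
sketched in scalar C++ roughly as follows. This is only a sketch under 
assumptions: the function name RehashToBitIndexes and the eight constants are 
hypothetical and are not taken from the patch; the only properties relied on 
are that each constant is odd and 32 bits wide. The AVX2 version would do all 
eight multiplications with a single vpmulld.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical odd 32-bit multipliers, chosen only for illustration.
// The constants used in the actual patch may differ.
constexpr std::array<uint32_t, 8> kOddConstants = {
    0x47b6137bU, 0x44974d91U, 0x8824ad5bU, 0xa2b7289dU,
    0x705495c7U, 0x2df1424bU, 0x9efc4947U, 0x5c6bfb31U};

// Rehashes one 32-bit hash into eight 5-bit values (each in [0, 32)).
// Each value is the top 5 bits of hash * constant, where the product
// wraps modulo 2^32 (unsigned overflow is well defined in C++).
std::array<uint32_t, 8> RehashToBitIndexes(uint32_t hash) {
  std::array<uint32_t, 8> result{};
  for (std::size_t i = 0; i < result.size(); ++i) {
    assert((kOddConstants[i] & 1U) == 1U);  // multipliers must be odd
    const uint32_t product = hash * kOddConstants[i];
    result[i] = product >> 27;  // keep the top 5 of the 32 bits
  }
  return result;
}
```

A caller could then, for example, treat each 5-bit value as a bit position 
within one 32-bit word of the filter. That use is an assumption here, not 
something this message spells out.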

This could be alleviated using pmulld, which can perform 4 32-bit 
multiplications with one instruction, but that was added in SSE4.1.

I see two options:

1. Leave some performance on the table with this commit by moving back to PS4.

2. Take a regression for pre-sse4.1 machines (ended in 2008ish for Intel, 
2012ish for AMD, if I'm reading correctly) and a bigger speedup for more modern 
machines.

I have another change I've already started testing that increases the gap 
between #1 and #2 by another 50-100%. 

Tim, Dan: what do you think is the right choice?

-- 
To view, visit http://gerrit.cloudera.org:8080/3338
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I6fef4f6652876f8fd7e3f0e41431702380418c98
Gerrit-PatchSet: 5
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Jim Apple <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Jim Apple <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: No
