Tim Armstrong has posted comments on this change.

Change subject: IMPALA-5266 Impala ABM / LZCNT support
......................................................................


Patch Set 8:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/5821/8/be/src/benchmarks/bit-intrinsics-benchmark.cc
File be/src/benchmarks/bit-intrinsics-benchmark.cc:

Line 33: //                   PopcountSlowNoHw               78.8     79.6     
80.3    0.0782X    0.0788X    0.0787X
This should be the baseline for the popcount functions right? I think you'll 
get the right relative timings for those if you divide this up into multiple 
suites, one per function.


http://gerrit.cloudera.org:8080/#/c/5821/8/be/src/codegen/CMakeLists.txt
File be/src/codegen/CMakeLists.txt:

Line 67:   COMMAND ${LLVM_CLANG_EXECUTABLE} ${CLANG_IR_CXX_FLAGS} "-msse4.2" 
"-mlzcnt" ${CLANG_INCLUDE_FLAGS} ${IR_INPUT_FILES} -o ${IR_SSE_TMP_OUTPUT_FILE}
I don't think this is safe - this tells clang it's safe to emit lzcnt anywhere. 
Ivy Bridge has SSE4.2 but not LZCNT from what I can tell and we only check for 
SSE4.2 before loading this version of the module (in 
LlvmCodeGen::CreateFromMemory()).

Currently we have two ways of dealing with these architectural differences - 
this compile-time approach where we generate different versions of the module, 
some of which have the unsupported instructions removed at compile-time (see 
be/src/util/sse-util.h) - and a run-time approach where we apply a target 
attribute to the individual function and have a runtime check before calling 
the function. I think both approaches have downsides so I'm not sure what's 
right to do here. For the compile-time approach we need an extra IR module for 
every configuration.

I wish we had a better solution for cases like this.


http://gerrit.cloudera.org:8080/#/c/5821/8/be/src/util/bit-util.h
File be/src/util/bit-util.h:

Line 323:   static inline int Log2FloorNonZero(uint32_t n) {
> This actually ended up unused, as the 32-bit version of the RoundUpToPowerO
Can we remove it?


Line 387:   static inline int32_t RoundUpToPowerOfTwo(int32_t v) {
Do we need the 32-bit version? I think in most cases we should be doing the 
kind of calculations that use this (buffer sizes, etc) in 64-bit arithmetic - 
having the overloads could lead to subtle bugs.


http://gerrit.cloudera.org:8080/#/c/5821/8/be/src/util/fixed-size-hash-table.h
File be/src/util/fixed-size-hash-table.h:

Line 53:     DCHECK_EQ(capacity_, 
BitUtil::RoundUpToPowerOfTwo(static_cast<int64_t>(capacity_)));
> Uses of unsigned integers actually get caught now.
This should really be DCHECK(BitUtil::IsPowerOfTwo()) - I think this was 
written before that function existed.


http://gerrit.cloudera.org:8080/#/c/5821/8/be/src/util/sse-util.h
File be/src/util/sse-util.h:

Line 190: #define POPCNT_popcnt_u32 _mm_popcnt_u32
I mentioned this elsewhere but I don't think these are always available when 
SSE4.2 is present, so it's not necessarilysafe to have these under the SSE4.2 
ifdef.

It looks like POPCNT is (mostly?) available when SSE42 is available: 
https://en.wikipedia.org/wiki/SSE4#Supporting_CPUs but LZCNT definitely isn't.

I've also seen some cases where VMs report weird CPU info that doesn't match 
any real processor (e.g. "haswell" without AVX support), so we should be a 
little careful about what we assume. I can't think of a reason why SSE4.2 would 
be enabled but POPCNT disabled but who knows.


-- 
To view, visit http://gerrit.cloudera.org:8080/5821
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9f6a465ab4a9ee4f582847f8e211a779bdede3d2
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden <[email protected]>
Gerrit-Reviewer: Attila Jeges <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Jim Apple <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Zach Amsden <[email protected]>
Gerrit-HasComments: Yes

Reply via email to