Tim Armstrong has posted comments on this change. Change subject: IMPALA-3286: Prefetching for PHJ probing. ......................................................................
Patch Set 1: (10 comments) http://gerrit.cloudera.org:8080/#/c/2959/1/be/src/exec/hash-table.inline.h File be/src/exec/hash-table.inline.h: Line 136: inline void HashTable::PrefetchBucketData(int64_t bucket_idx) { It seems to me like the extra overhead here might hurt in some cases - there's a few branches and computations. Anyway, we'll see what the benchmark results say. Line 140: UNLIKELY It seems like this is workload-dependent rather than always unlikely. E.g. for some joins there would be many duplicates. Line 153: inline void HashTable::PrefetchBucket(uint32_t hash) { With multiple overloads only differentiated by integer type, it's too easy to call the wrong one. Maybe rename PrefetchBucketForHash() or something like that? http://gerrit.cloudera.org:8080/#/c/2959/1/be/src/exec/partitioned-hash-join-node-ir.cc File be/src/exec/partitioned-hash-join-node-ir.cc: Line 321: // Prefetch the hash table buckets. If it turns out that this slows down probing of small hash tables, we could probably add a check here to skip prefetching in that case. Line 323: hash_tbl_iterator_.PrefetchBucket<true>(); Maybe a brief comment to explain why it's necessary to prefetch this separately. Line 343: if (LIKELY(hash_tbl != NULL)) hash_tbl->PrefetchBucketData<true>(hash); We talked about this a little offline, but prefetching the bucket data is a little tricky since the join key isn't necessarily in the first cache line of the tuple. Either the slot isn't in the first 64 bytes, or the key is a string stored after the fixed tuple. For string keys we'd still see a benefit if we prefetched the fixed-length StringValue slot since we need to get StringValue::ptr from there first. IIRC the planner often puts StringValue slots at the back of the tuple because they're large, so we might even not be prefetching the StringValue if there are a bunch of smaller slots in the tuple. I guess we maybe also need to check the null indicators at the beginning of the tuple, so it may be tricky in some cases to prefetch all the right bits given our typical tuple layouts. http://gerrit.cloudera.org:8080/#/c/2959/1/be/src/exec/partitioned-hash-join-node.h File be/src/exec/partitioned-hash-join-node.h: Line 251: uint32_t** cur_hash_value, int* remaining_capacity, Mention that this increments cur_hash_value to line up with probe_batch_iterator. I think this aspect is a little tricky. Line 417: /// computed during prefetching. Document that these are sized to the RuntimeState's batch_size Line 418: std::vector<uint32_t> hash_values_; Mention that the indices line up with current_probe_batch_ (I think that's correct?) Line 421: boost::scoped_ptr<Bitmap> null_bitmap_; Same here. -- To view, visit http://gerrit.cloudera.org:8080/2959 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib42b93d99d09c833571e39d20d58c11ef73f3cc0 Gerrit-PatchSet: 1 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Michael Ho <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
