Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3286: Prefetching for PHJ probing.
......................................................................


Patch Set 1:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/2959/1/be/src/exec/hash-table.inline.h
File be/src/exec/hash-table.inline.h:

Line 136: inline void HashTable::PrefetchBucketData(int64_t bucket_idx) {
It seems to me like the extra overhead here might hurt in some cases - there's 
a few branches and computations. Anyway, we'll see what the benchmark results 
say.


Line 140: UNLIKELY
It seems like this is workload-dependent rather than always unlikely. E.g. for 
some joins there would be many duplicates.


Line 153: inline void HashTable::PrefetchBucket(uint32_t hash) {
With multiple overloads only differentiated by integer type, it's too easy to 
call the wrong one. Maybe rename PrefetchBucketForHash() or something like that?


http://gerrit.cloudera.org:8080/#/c/2959/1/be/src/exec/partitioned-hash-join-node-ir.cc
File be/src/exec/partitioned-hash-join-node-ir.cc:

Line 321:   // Prefetch the hash table buckets.
If it turns out that this slows down probing of small hash tables, we could 
probably add a check here to skip prefetching in that case.


Line 323:     hash_tbl_iterator_.PrefetchBucket<true>();
Maybe a brief comment to explain why it's necessary to prefetch this separately.


Line 343:       if (LIKELY(hash_tbl != NULL)) 
hash_tbl->PrefetchBucketData<true>(hash);
We talked about this a little offline, but prefetching the bucket data is a 
little tricky since the join key isn't necessarily in the first cache line of 
the tuple. Either the slot isn't in the first 64 bytes, or the key is a string 
stored after the fixed tuple.

For string keys we'd still see a benefit if we prefetched the fixed-length 
StringValue slot since we need to get StringValue::ptr from there first. IIRC 
the planner often puts StringValue slots at the back of the tuple because 
they're large, so we might even not be prefetching the StringValue if there are 
a bunch of smaller slots in the tuple.

I guess we maybe also need to check the null indicators at the beginning of the 
tuple, so it may be tricky in some cases to prefetch all the right bits given 
our typical tuple layouts.


http://gerrit.cloudera.org:8080/#/c/2959/1/be/src/exec/partitioned-hash-join-node.h
File be/src/exec/partitioned-hash-join-node.h:

Line 251:       uint32_t** cur_hash_value, int* remaining_capacity,
Mention that this increments cur_hash_value to line up with 
probe_batch_iterator. I think this aspect is a little tricky.


Line 417:   /// computed during prefetching.
Document that these are sized to the RuntimeState's batch_size


Line 418:   std::vector<uint32_t> hash_values_;
Mention that the indices line up with current_probe_batch_ (I think that's 
correct?)


Line 421:   boost::scoped_ptr<Bitmap> null_bitmap_;
Same here.


-- 
To view, visit http://gerrit.cloudera.org:8080/2959
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib42b93d99d09c833571e39d20d58c11ef73f3cc0
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Michael Ho <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to