Tim Armstrong has posted comments on this change.

Change subject: IMPALA-3286: Software prefetching for hash table build.
......................................................................


Patch Set 1:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/2896/1//COMMIT_MSG
Commit Message:

Line 21: with lineitem reduces by more than half (going from 10.5s to 4.5s).
Very nice!


http://gerrit.cloudera.org:8080/#/c/2896/1/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

Line 864: 12
Hmm this is the index of the field in the class?

I think it's difficult to avoid hardcoding it somewhere, but we should make it 
clear in the class definition that the index matters and needs to be updated if 
members are rearranged. E.g. make this a constant in the class, next to that 
member?


http://gerrit.cloudera.org:8080/#/c/2896/1/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

Line 295:   /// for matches. It's not evaluated if probing doesn't call 
Equals().
There seems to be some logic that if this is NULL, it won't be reevaluated - 
could you explain this in the comment?


Line 296:   TupleRow* expr_values_row_;
> Thanks for the pointer to that patch. That will be very helpful. My prefere
I'm ok with moving ahead with this code so long as it's not the permanent 
solution.


http://gerrit.cloudera.org:8080/#/c/2896/1/be/src/exec/hash-table.inline.h
File be/src/exec/hash-table.inline.h:

Line 122: PrefetchBucket
PrefetchBucketForWrite()? That would make it clear that the flags passed to 
__builtin_prefetch aren't quite right if you're just reading the bucket, e.g. 
when probing.


http://gerrit.cloudera.org:8080/#/c/2896/1/be/src/exec/partitioned-hash-join-node-ir.cc
File be/src/exec/partitioned-hash-join-node-ir.cc:

Line 313:   FOREACH_ROW(prefetch_batch, 0, row) {
Prefetching 1024 rows at a time seems suboptimal. If the cache line is 64 bytes 
and the L1 cache is 64KB, there will be a significant number of conflict misses 
between buckets and with other data if you try to prefetch 1024 buckets.

I'm ok with this for now, since it's obviously still very beneficial - 
everything should fit in L2 and much of it will fit in L1 - but it might be 
worth a comment.

I think this relates to my other comment about processing subsets of the batch.


Line 329:       ht_ctx->SetRowToEval(row);
The logic with SetRowToEval() is kind of tricky - maybe it deserves a comment 
explaining that it must be called before Insert().


http://gerrit.cloudera.org:8080/#/c/2896/1/be/src/exec/partitioned-hash-join-node.cc
File be/src/exec/partitioned-hash-join-node.cc:

Line 337:  hash_values_.reset(new uint32_t[state->batch_size()]);
> I tried implementing that too but that's a nested loop inside the FOREACH_R
1024 should be the default max. There are some places where larger batches are 
created but they shouldn't be passed between operators. Also I think this batch 
is actually sized by this operator so we could apply a max there and keep that 
local to this node.

I believe Mostafa had looked at increasing the row batch size and it has 
benefits for many queries. I don't know if anyone actually changes the batch 
size in production.


-- 
To view, visit http://gerrit.cloudera.org:8080/2896
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib85e7fc162ad25c849b9e716b629e226697cd940
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Michael Ho <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes
