Michael Ho has posted comments on this change. Change subject: IMPALA-3286: Prefetching for PHJ probing. ......................................................................
Patch Set 6: (18 comments) http://gerrit.cloudera.org:8080/#/c/2959/6/be/src/exec/hash-table.h File be/src/exec/hash-table.h: Line 316: values > value (since "each") Done Line 346: value's > values (plural not possessive) and could delete "buffer" Done http://gerrit.cloudera.org:8080/#/c/2959/6/be/src/exec/partitioned-hash-join-node-ir.cc File be/src/exec/partitioned-hash-join-node-ir.cc: Line 373: bool has_probe_rows = current_probe_row_ != NULL || !probe_batch_iterator.AtEnd(); > I find the positive conditionals easier to reason about, but if you or Tim Hmm....I prefer to keep it as is as I find it easier to understand. I added a comment about the loop condition. Line 383: Prefetching if enabled > Prefetching, if enabled, Done Line 387: In this case, prefetching may not be effective as there aren't : // enough cycles for prefetched items to trickle in. > this sentence is still confusing. I think what you mean is that we shouldn' Done Line 410: has_probe_rows = current_probe_row_ != NULL; > maybe add: Yes, that's the invariant. DCHECK added. Line 413: probe_batch_->num_tuples_per_row(); > just to check my understanding, if we're AtEnd(), then probe_batch_pos_ = b Yes. Line 475: cur_row > how did the old code work? The old code goes through the entire batch to computes all the hash values and then goes through the entire batch again for insertion. This new code only goes through as many rows as the capacity of ExprValsCache. Note that cur_row is incremented in the second loop below and it tracks our location into the build batch. Line 494: if (UNLIKELY(!hash_tbl_->Insert(ht_ctx, row_idx, row))) { > can combine these two if-stmts Done http://gerrit.cloudera.org:8080/#/c/2959/6/be/src/exec/partitioned-hash-join-node.cc File be/src/exec/partitioned-hash-join-node.cc: Line 677: // 'probe_exp_ctxs_' should have made no local allocations in this function. > is there a simple DCHECK we could write to verify that? Not with existing interfaces. Added some new interfaces in ExprContext and ExprContextImpl for this purpose. Line 1032: DCHECK(probe_batch_pos_ == -1 || ht_ctx_->expr_values_cache()->AtEnd()); > why can't this just be: We may not call ResetProbeBatch() if we reach eos but the ExprValueCache should have reached the end at this point so I suppose it should be okay to not check for probe_batch_pos_ == -1. Line 1754: DCHECK(process_probe_batch_fn->getLinkage() == GlobalValue::WeakODRLinkage) > out of curiosity why does this end up weak now but not before? Probably because it's being instantiated differently now. I believe we used to be indirectly instantiating them via the interpreted version of ProcessProbeBatch() http://gerrit.cloudera.org:8080/#/c/2959/6/be/src/exec/partitioned-hash-join-node.h File be/src/exec/partitioned-hash-join-node.h: Line 252: probing > probe Done Line 254: > in Done Line 255: free space > doesn't this always reset the cache? "free space" seems to imply it won't b Done Line 579: > Document the return value. Done Line 579: constant. > a constant Done http://gerrit.cloudera.org:8080/#/c/2959/6/be/src/runtime/row-batch.h File be/src/runtime/row-batch.h: Line 409: . Will iterate up till the number : /// of rows in the given row batch if '_start_row_idx + _limit' is beyond the last row. > I was just being dumb when I didn't notice how limit works. I think you sho Done -- To view, visit http://gerrit.cloudera.org:8080/2959 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: Ib42b93d99d09c833571e39d20d58c11ef73f3cc0 Gerrit-PatchSet: 6 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Michael Ho <[email protected]> Gerrit-Reviewer: Dan Hecht <[email protected]> Gerrit-Reviewer: Huaisi Xu <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Mostafa Mokhtar <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
