Michael Ho has posted comments on this change.

Change subject: IMPALA-3286: Prefetching for PHJ probing.
......................................................................


Patch Set 6:

(18 comments)

http://gerrit.cloudera.org:8080/#/c/2959/6/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

Line 316: values
> value (since "each")
Done


Line 346: value's
> values (plural not possessive) and could delete "buffer"
Done


http://gerrit.cloudera.org:8080/#/c/2959/6/be/src/exec/partitioned-hash-join-node-ir.cc
File be/src/exec/partitioned-hash-join-node-ir.cc:

Line 373:   bool has_probe_rows = current_probe_row_ != NULL || 
!probe_batch_iterator.AtEnd();
> I find the positive conditionals easier to reason about, but if you or Tim 
Hmm....I prefer to keep it as is as I find it easier to understand. I added a 
comment about the loop condition.


Line 383: Prefetching if enabled
> Prefetching, if enabled,
Done


Line 387: In this case, prefetching may not be effective as there aren't
        :     // enough cycles for prefetched items to trickle in.
> this sentence is still confusing. I think what you mean is that we shouldn'
Done


Line 410:     has_probe_rows = current_probe_row_ != NULL;
> maybe add:
Yes, that's the invariant. DCHECK added.


Line 413:         probe_batch_->num_tuples_per_row();
> just to check my understanding, if we're AtEnd(), then probe_batch_pos_ = b
Yes.


Line 475: cur_row
> how did the old code work?
The old code goes through the entire batch to computes all the hash values and 
then goes through the entire batch again for insertion.

This new code only goes through as many rows as the capacity of ExprValsCache. 
Note that cur_row is incremented in the second loop below and it tracks our 
location into the build batch.


Line 494:         if (UNLIKELY(!hash_tbl_->Insert(ht_ctx, row_idx, row))) {
> can combine these two if-stmts
Done


http://gerrit.cloudera.org:8080/#/c/2959/6/be/src/exec/partitioned-hash-join-node.cc
File be/src/exec/partitioned-hash-join-node.cc:

Line 677:     // 'probe_exp_ctxs_' should have made no local allocations in 
this function.
> is there a simple DCHECK we could write to verify that?
Not with existing interfaces. Added some new interfaces in ExprContext and 
ExprContextImpl for this purpose.


Line 1032:     DCHECK(probe_batch_pos_ == -1 || 
ht_ctx_->expr_values_cache()->AtEnd());
> why can't this just be:
We may not call ResetProbeBatch() if we reach eos but the ExprValueCache should 
have reached the end at this point so I suppose it should be okay to not check 
for probe_batch_pos_ == -1.


Line 1754:   DCHECK(process_probe_batch_fn->getLinkage() == 
GlobalValue::WeakODRLinkage)
> out of curiosity why does this end up weak now but not before?
Probably because it's being instantiated differently now. I believe we used to 
be indirectly instantiating them via the interpreted version of 
ProcessProbeBatch()


http://gerrit.cloudera.org:8080/#/c/2959/6/be/src/exec/partitioned-hash-join-node.h
File be/src/exec/partitioned-hash-join-node.h:

Line 252: probing
> probe
Done


Line 254:  
> in
Done


Line 255: free space
> doesn't this always reset the cache? "free space" seems to imply it won't b
Done


Line 579:  
> Document the return value.
Done


Line 579: constant. 
> a constant
Done


http://gerrit.cloudera.org:8080/#/c/2959/6/be/src/runtime/row-batch.h
File be/src/runtime/row-batch.h:

Line 409: . Will iterate up till the number
        : /// of rows in the given row batch if  '_start_row_idx + _limit' is 
beyond the last row.
> I was just being dumb when I didn't notice how limit works. I think you sho
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/2959
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib42b93d99d09c833571e39d20d58c11ef73f3cc0
Gerrit-PatchSet: 6
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Michael Ho <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Huaisi Xu <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to