Dan Hecht has posted comments on this change.

Change subject: Impala-3286: Prefetching For Phj Probing.
......................................................................


Patch Set 4:

(81 comments)

Looking good. I think we need to think through the FreeLocalAllocations stuff a 
bit more. Otherwise just comments, names and minor changes.

http://gerrit.cloudera.org:8080/#/c/2959/4//COMMIT_MSG
Commit Message:

Line 8: 
> The cases of the words were somehow messed up by my editor. Will update tha
lol


http://gerrit.cloudera.org:8080/#/c/2959/4/be/src/exec/hash-table.cc
File be/src/exec/hash-table.cc:

Line 126: int32_t
be careful when you merge with Tim's recent patch that fixes this to uint32_t. 
might as well do that now since you moved this code.


Line 231: ;
delete


Line 241: state->batch_size()
this is where the comment about finding a better maximum prefetch group size 
should be moved to.


Line 281: num_build_exprs
num_exprs?


Line 293:   return &cur_expr_values_null_[expr_idx];
cur_expr_values_null_ + expr_idx

to be consistent with line 289 style


Line 312: void HashTableCtx::ExprValuesCache::PrepareForRead() {
// Remember for AtEnd().


http://gerrit.cloudera.org:8080/#/c/2959/4/be/src/exec/hash-table.h
File be/src/exec/hash-table.h:

Line 140: IR_ALWAYS_INLINE
why not ALWAYS_INLINE even for non-IR?


Line 156: returning in *hash
update indicating what the side effects are


Line 185: .
by the exec nodes.  (to make it clear that the hash table itself doesn't do 
pipelining but exposes infrastructure for doing so).


Line 186: Rows
A set of rows of a row batch
(since it's not all the rows at once).


Line 187: all
        :   /// the evaluated values, null byte and hash values to avoid 
re-evaluating the rows
        :   /// again during probing
the results of expression evaluation for the rows in the prefetch set, to avoid 
re-evaluating the rows during probing.


Line 191: values
expression evaluation results


Line 196: evaluated
each evaluated expression in each row.


Line 202: nsertion or parsing
for performing a write pass followed by a read pass

to make it more explicit the state transitions.
also, only one read pass is currently allowed, is that right?


Line 203: or most of the data
        :   /// structures
delete this (see comment around SetRowNull()


Line 206: pointers
iterator
(from outside the abstraction the 'pointers' can be an implementation detail).


Line 207: pointers
iterator


Line 208: consumed
this doesn't quite make sense for "write" mode.


Line 211:   ///   row respectively.
This comment can be removed since it seems redundant with the one for each of 
those fields below.


Line 215: .
(e.g. if an expression value is a StringValue).


Line 231: Resets
Sets the iterator to the beginning to start a write pass by resetting


Line 234: Resets
Sets the iterator to the beginning to start a read pass by resetting the 
pointers (except for the end pointer) ...


Line 237: pointers
iterator


Line 237:  
row, by moving the pointers to the next entries  ...


Line 256:     int IR_ALWAYS_INLINE CurIdx() const {
make this private (see comments below).


Line 260: all cached values have been consumed
the read pass is complete.
(to signal that this is only valid during the read pass).


Line 265:     /// Resets the ExprValuesCache to indicate it is empty.
I think this should say:

Sets the iterator to the end.

The fact that it resets the state is just how you chose to implement it, no? 
couldn't you alternatively set the cur pointers to end?


Line 266:     void IR_ALWAYS_INLINE SetAtEnd();
> Where is this defined?
in the .cc file.

Now I think we should call this Reset() or something since SetAtEnd() kind of 
implies we could still make another read pass, but we can't. it's destructive.


Line 271:       return null_bitmap_.Get<false>(row_idx);
to stick with the iterator interface (rather than this random access), how 
about not passing row_idx but instead just using CurIdx() inside of here? 
that'd be consistent with SetExprValuesHash(). and then update the comment.


Line 275:     void IR_ALWAYS_INLINE SetRowNull(int row_idx) {
same comment


Line 290: expressions
should this be singular?


Line 291: i
might be nice to change these 'i' to 'expr_idx' for clarity here and below


Line 293: values
singular?


Line 294: expressions
singular and no apostrophe


Line 298: expressions'
expression's


Line 302:     friend class HashTableCtx;
does the HashTableCtx access these private fields directly? okay to leave it if 
so.


Line 305: ResetPointers
how about calling it ResetIterator() to distinguish that it doesn't reset the 
"end" pointer.


Line 307: memory
in bytes?


Line 310: capacity
number of rows worth of expression evaluation state


Line 314: substitution
constant substitution


Line 318: num_build_exprs_
maybe just call it num_exprs_ since it's also used by probe exprs?


Line 320:  
current row's cached expression values.


Line 324: cached nullness of the
        :     /// expressions' values of the current row
current row's cached nullness of each expression value.


Line 328: cached expressions' hash values
        :     /// of the current row.
cached hash value of the current row's expression values.


Line 332: cached expressions'
        :     /// hash values.
cached hash values.


Line 336: rows'
rows worth of


Line 340: bits
booleans (otherwise it's too easy to think this is a bitset).


Line 340: rows'
rows worth of


Line 346: rows
rows worth of hash values.


Line 354: Stores the offsets into the expression values buffers of expressions' 
results.
Maps from expression index to byte offset in a row's expression values buffer 
(i.e. this gives the offset relative to cur_expr_values_ for the i-th 
expression's value).


Line 358: .
for the row.


Line 364: IR_ALWAYS_INLINE
why not ALWAYS_INLINE?


Line 432:  
expression evaluation


Line 539: 'row'
what is this?


Line 541: values in 'ht_ctx'
this seems state


Line 736:   /// this function.
this could use an updated explanation


http://gerrit.cloudera.org:8080/#/c/2959/4/be/src/exec/hash-table.inline.h
File be/src/exec/hash-table.inline.h:

Line 320:     DCHECK_EQ((bucket_idx_ & ~(table_->num_buckets_ - 1)), 0);
okay, how about a comment explaining this then:
// HashTable::PrefetchBucket() takes a hash value to index into the bucket 
array. bucket_idx_ is sufficient for that.


http://gerrit.cloudera.org:8080/#/c/2959/4/be/src/exec/partitioned-hash-join-node-ir.cc
File be/src/exec/partitioned-hash-join-node-ir.cc:

Line 254:   int idx = expr_vals_cache->CurIdx();
delete this (see hash-table.h comments)


Line 330: Batch
see header. this doesn't do the whole batch so let's rename.


Line 344:       if (LIKELY(hash_tbl != NULL)) 
hash_tbl->PrefetchBucket<true>(hash);
should this have a prefetch_mode check?


Line 369: current_probe_row_ != NULL || !probe_batch_iterator.AtEnd();
why are both of these needed? why isn't the AtEnd() check sufficient? or 
really, why can't the has_probe_rows loop condition be replaced with !AtEnd()?


Line 381:     // prefetching to be effective.
I don't understand what this comment is trying to say. first, the probe batch 
being complete and the values cache being empty aren't equivalent, right? also, 
do we really avoid it because prefetching won't be effective or is it to avoid 
re-evaluation of the expressions that are already cached? I think we should 
slim down this comment.


Line 502: batch_size
cache_size or prefetch_size or something. batch is too overloaded.


Line 509:           // processor dependent so we may need calibration at Impala 
startup time.
the better place for this comment seems to be when we're calculating the cache 
capacity.


Line 527:       }
shouldn't we terminate this loop when we hit expr_vals_cache->AtEnd()? or did 
you intend to loop through the empty rows but take the IsRowNull() path?


http://gerrit.cloudera.org:8080/#/c/2959/4/be/src/exec/partitioned-hash-join-node.cc
File be/src/exec/partitioned-hash-join-node.cc:

Line 147:   AddExprCtxsToFree(build_expr_ctxs_);
it's pretty confusing now that we free this one explicitly sometimes. let's 
clean this up in 2399.


Line 652: 
if keeping this, please comment with something like:
// Collect the BufferedTupleStream pointers to occupy less cache space.


Line 912: during
from


Line 913: when probing the hash tables
delete this - it's confusing and Equals() already tells you what you need to 
know.


Line 913: We need to hold on to the local allocation of the
        :     // evaluated probe rows until this probe batch is done.
This sentence is confusing because it's talking about what we need to do for 
probe exprs but we don't do anything for probe exprs here. How about:

We can't free probe expr local allocations until after the ExprValuesCache is 
reset.


Line 915: build_expr_ctxs_
also, what about all the other ctxs QueryMaintenance frees? Instead of special 
casing build_expr_ctxs, wouldn't it make more sense to special case 
probe_expr_ctxs?  Oh, I guess you wanted to keep probe_expr_ctxs registered as 
a local allocation?  Maybe it'd be cleaner to not do that.


Line 989: probe batch
is this what's important? isn't it only safe once we reset the ExprValuesCache?


http://gerrit.cloudera.org:8080/#/c/2959/4/be/src/exec/partitioned-hash-join-node.h
File be/src/exec/partitioned-hash-join-node.h:

Line 167: build_rows
comment for this.


Line 259:   /// that 'prefetch_mode' will be substituted with constants during 
codegen time.
this comment needs updating. also it needs clarification that it only evaluates 
as many rows as can fit in the cache.


Line 260: EvalAndHashProbeBatch
Since this doesn't evaulate the whole probe batch, I think we should call it:

EvalAndHashProbeCache() or EvalAndHashProbePrefetchGroup() or 
FillProbePrefetchCache() or FillProbePrefetchGroup()


Line 615: build_rows
can this use the array syntax to match the function decl above?


http://gerrit.cloudera.org:8080/#/c/2959/4/be/src/runtime/row-batch.h
File be/src/runtime/row-batch.h:

Line 151: FIXED_LEN_BUFFER_LIMIT
I don't think this is always the upper bound on row-batches. Tim, can you 
comment?


Line 155: + limit
also i don't see why we'd add this to row_idx. why does it matter where you 
start?

what is the 'limit' trying to do?


http://gerrit.cloudera.org:8080/#/c/2959/4/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

Line 197: 1
can't we use the enum value here?


http://gerrit.cloudera.org:8080/#/c/2959/4/common/thrift/Types.thrift
File common/thrift/Types.thrift:

Line 128:   HT_BUCKET
what other modes will we have? prefetch the hash table data?


-- 
To view, visit http://gerrit.cloudera.org:8080/2959
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib42b93d99d09c833571e39d20d58c11ef73f3cc0
Gerrit-PatchSet: 4
Gerrit-Project: Impala
Gerrit-Branch: cdh5-trunk
Gerrit-Owner: Michael Ho <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Mostafa Mokhtar <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-HasComments: Yes

Reply via email to