Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4864 Speed up single slot predicates with dictionaries
......................................................................


Patch Set 16:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6726/16/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

Line 430:             if (!dictionary_results_.Get(dict_index)) {
I suspect this branch lengthened the critical path through this function in
the non-selective case, since the two loads (the dictionary_results_ lookup
and the dictionary value fetch) can't necessarily happen in parallel. It's
hard to know whether that's the source of the regression without profiling.
We could try something like this so that the two loads can happen in parallel:
  
  void* slot = tuple->GetSlot(tuple_offset_);
  T* val_ptr = reinterpret_cast<T*>(slot);
  dict_decoder_.GetValue(dict_index, val_ptr);
  if (!dictionary_results_.Get(dict_index)) {
    filtered_rows_->Set(*num_values + val_count, true);
  }

Or maybe even change Set() so that it is branch-free, and then do something
like:

  filtered_rows_->Set(*num_values + val_count, !dictionary_results_.Get(dict_index));
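
To make the branch-free idea concrete, here is a minimal standalone sketch. It assumes a simple 64-bit-word bitmap; SketchBitmap and MarkFilteredRows are hypothetical names for illustration, not Impala's actual Bitmap class or the column reader code. Set() clears the target bit and ORs in the new value, so there is no conditional at the source level (whether the compiler emits branch-free machine code would still need to be checked).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the bitmaps discussed above (not Impala's Bitmap).
class SketchBitmap {
 public:
  explicit SketchBitmap(size_t num_bits)
    : num_bits_(num_bits), words_((num_bits + 63) / 64, 0) {}

  bool Get(size_t index) const {
    assert(index < num_bits_);
    return (words_[index >> 6] >> (index & 63)) & 1;
  }

  // Writes 'value' to bit 'index' without a conditional: clear the bit,
  // then OR in the new value shifted into position.
  void Set(size_t index, bool value) {
    assert(index < num_bits_);
    uint64_t& word = words_[index >> 6];
    const uint64_t shift = index & 63;
    word = (word & ~(uint64_t{1} << shift))
        | (static_cast<uint64_t>(value) << shift);
  }

 private:
  size_t num_bits_;
  std::vector<uint64_t> words_;
};

// Marks row i as filtered when its dictionary entry failed the predicate.
// The result of Get() feeds Set() directly, so the loop body has no branch.
inline void MarkFilteredRows(const std::vector<uint32_t>& dict_indices,
    const SketchBitmap& dictionary_results, SketchBitmap* filtered_rows) {
  for (size_t i = 0; i < dict_indices.size(); ++i) {
    filtered_rows->Set(i, !dictionary_results.Get(dict_indices[i]));
  }
}
```

Unlike a Set() that only ever sets bits, this version also overwrites a previously set bit with false, so it does not rely on filtered_rows_ having been cleared beforehand.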


-- 
To view, visit http://gerrit.cloudera.org:8080/6726
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054
Gerrit-PatchSet: 16
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Zach Amsden <[email protected]>
Gerrit-HasComments: Yes
