Tim Armstrong has posted comments on this change.

Change subject: IMPALA-4864 Speed up single slot predicates with dictionaries
......................................................................

Patch Set 16:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6726/16/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

Line 430: if (!dictionary_results_.Get(dict_index)) {
I think this branch lengthened the critical path through this function for the non-selective case, since the two loads (the dictionary_results_ lookup and the dictionary value read) can't necessarily happen in parallel. It's hard to know whether that's the source of the regression without profiling.

We could try something like this so that the two loads can happen in parallel:

  void* slot = tuple->GetSlot(tuple_offset_);
  T* val_ptr = reinterpret_cast<T*>(slot);
  dict_decoder_.GetValue(dict_index, val_ptr);
  if (!dictionary_results_.Get(dict_index)) {
    filtered_rows_->Set(*num_values + val_count, true);
  }

Or maybe even change Set() so that it uses branch-free code and do something like:

  filtered_rows_->Set(*num_values + val_count, !dictionary_results_.Get(dict_index));

-- 
To view, visit http://gerrit.cloudera.org:8080/6726
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054
Gerrit-PatchSet: 16
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden <[email protected]>
Gerrit-Reviewer: Joe McDonnell <[email protected]>
Gerrit-Reviewer: Marcel Kornacker <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Zach Amsden <[email protected]>
Gerrit-HasComments: Yes
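
For reference, here is a rough standalone sketch of the branch-free Set() idea from the comment above. It is not Impala code: SketchBitmap and DecodeAndFilter are hypothetical stand-ins, the real Bitmap and dictionary decoder interfaces may differ, and whether this actually removes the regression would still need profiling. The value is written unconditionally and the filter bit is computed without a branch, so neither load has to wait on the other.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a bitmap class; not Impala's actual Bitmap.
class SketchBitmap {
 public:
  explicit SketchBitmap(size_t num_bits) : words_((num_bits + 63) / 64, 0) {}

  // Branch-free set: clear the target bit, then OR in the new value.
  void Set(size_t bit_index, bool v) {
    uint64_t& word = words_[bit_index >> 6];
    const unsigned shift = static_cast<unsigned>(bit_index & 63);
    word = (word & ~(uint64_t{1} << shift)) |
           (static_cast<uint64_t>(v) << shift);
  }

  bool Get(size_t bit_index) const {
    return ((words_[bit_index >> 6] >> (bit_index & 63)) & 1) != 0;
  }

 private:
  std::vector<uint64_t> words_;
};

// Hypothetical decode loop. dict_results has bit i set when dictionary
// entry i passes the predicate. Returns the number of filtered rows.
size_t DecodeAndFilter(const std::vector<int32_t>& dict,
                       const SketchBitmap& dict_results,
                       const std::vector<uint32_t>& indices,
                       std::vector<int32_t>* out,
                       SketchBitmap* filtered_rows) {
  size_t num_filtered = 0;
  for (size_t i = 0; i < indices.size(); ++i) {
    const uint32_t dict_index = indices[i];
    // Load 1: the dictionary value, written even if the row is filtered.
    (*out)[i] = dict[dict_index];
    // Load 2: the predicate result; independent of load 1.
    const bool drop = !dict_results.Get(dict_index);
    filtered_rows->Set(i, drop);
    num_filtered += drop;
  }
  return num_filtered;
}
```

The trade-off is that every row now pays for a read-modify-write of the bitmap word, including rows that pass the predicate, so the selective case could get slower even if the non-selective case improves.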
