Tim Armstrong has posted comments on this change. Change subject: IMPALA-3838: Codegen EvalRuntimeFilters(). ......................................................................
Patch Set 1: (13 comments) Did another more detailed pass. http://gerrit.cloudera.org:8080/#/c/4833/1/be/src/exec/filter-context.cc File be/src/exec/filter-context.cc: Line 82: // TODO: codegen expr evaluation and hashing Do we have a JIRA for this? PS1, Line 151: expr_ctx->root() Maybe we should factor this out into a local Expr* variable since we're going to start avoiding using ExprContext's for codegen. I'm ok with this either way. Line 158: // Load 'expr_ctx' from 'this_arg' FilterContext object. We should document next to the class members that we're depending on the order for codegen. PS1, Line 179: CreatePointerCast How does this differ to BitCast? Should we be using this in some places instead? Line 195: Value* filter_ptr = builder.CreateStructGEP(NULL, this_arg, 1, "filter_ptr"); We should document next to the class members that we're depending on the order for codegen. http://gerrit.cloudera.org:8080/#/c/4833/1/be/src/exec/hdfs-parquet-scanner-ir.cc File be/src/exec/hdfs-parquet-scanner-ir.cc: Line 73: if (!filter_ctxs_[i]->Eval(row)) { The logic already did this, but it's weird that we count it as rejected when the filter hasn't arrived yet. I think it might be a bug. Asking Henry about it but wanted to flag the issue. http://gerrit.cloudera.org:8080/#/c/4833/1/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: Line 273: stats->IncrCounters(FilterStats::ROWS_KEY, local.considered, local.considered, local.rejected); Long line Line 617: ++scratch_batches_processed_; This doesn't seem to be particularly related to TransferScratchTuples(). Maybe just make it a separate function and call it from GetNext()? PS1, Line 720: bail_out_block Maybe "filter_rejected_block" or "return_false_block" or something. bail_out sounds kind-of like there was an error. Line 721: if (num_filters > 0) { Might be simpler to just handle the 0-filter case up the top by generating a "return true". Line 725: if (filter_ctxs[i].filter->AlwaysTrue()) continue; I don't think this makes sense - we don't expect the filters to arrive at this point, so we shouldn't know whether they're true or not. http://gerrit.cloudera.org:8080/#/c/4833/1/be/src/runtime/raw-value-ir.cc File be/src/runtime/raw-value-ir.cc: Line 111: //The choice of hash function needs to be consistent across all hosts of the cluster. Missing space http://gerrit.cloudera.org:8080/#/c/4833/1/tests/query_test/test_tpch_queries.py File tests/query_test/test_tpch_queries.py: Line 39: v.get_value('table_format').file_format in ['text', 'parquet', 'kudu']) > Does this provide meaningful coverage? It doesn't seem like it's worth runn I guess because we don't have the same runtime filter support for text. -- To view, visit http://gerrit.cloudera.org:8080/4833 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I27114869840e268d17e91d6e587ef811628e3837 Gerrit-PatchSet: 1 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Michael Ho <[email protected]> Gerrit-Reviewer: Michael Ho <[email protected]> Gerrit-Reviewer: Mostafa Mokhtar <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-HasComments: Yes
