Zach Amsden has posted comments on this change. Change subject: IMPALA-4864 Speed up single slot predicates with dictionaries ......................................................................
Patch Set 7: (17 comments) Thanks for looking at this! http://gerrit.cloudera.org:8080/#/c/6726/7//COMMIT_MSG Commit Message: Line 7: IMPALA-4864 Speed up single slot predicates with dictionaries > The JIRA seems to be talking about doing something slightly different: taki There is a bit of a disconnect with the JIRA as specified and the intention behind the JIRA, which had me initially working down the wrong path. I realized that (x OP k) type predicates can only be evaluated for OP = or <=>, and started working on a way to identify such predicates in sub-expressions of the conjuncts. It all seemed quite speculative and limited in applicability. Once can imagine expanding this to ordering predicates with sorted dictionaries, but as we don't have that yet, it would be building without a foundation. After clarification from Marcel, it became apparent that what was intended was to be able to evaluate arbitrary single slot predicates against the dictionary (as we do this anyway for row-group filtering) and save those results to eliminate the cost of re-evaluating the same conjuncts again later. The central difficulty of this problem is that the set of conjuncts which can be dictionary evaluated is not known until row group time. This is the best proposal that I have come up with so far at attempting to solve that problem. I agree we should get the design right on this one. Doing this wrong could be costly in terms of either performance or complexity. http://gerrit.cloudera.org:8080/#/c/6726/7/be/src/exec/hdfs-parquet-scanner-ir.cc File be/src/exec/hdfs-parquet-scanner-ir.cc: Line 40 > I believe this code was deliberately written like this to avoid the indirec I have no problem reverting this. Originally I had the two functions merged into one, but I didn't like imposing the overhead of checking the bitmap on this version. While the code evolved, I had to put an interface on ScratchBatch just for sanity, as too many pieces of code were playing with its internals. I think the interface is actually broad enough now to go back to the original code (minus the interference with internals). I'll simply do this at the end: scratch_batch_->Advance((scratch_tuple - scratch_tuple_start) / tuple_size); Line 77: tuple_index = scratch_batch_->NextValidTuple()) { > It seems like this might be slow for the case when many or all of the tuple Your intuition and mine are similar. I did my best to make the "simple" version of NextSetBit() as fast as possible, but I believe intrinsics might actually be warranted here. Wanted to get the core ideas sound before going in that direction though. Oh, and for reference, I also tried various other bitmap implementations - boost::dynamic_bitset and std::vector<bool>. I didn't like the boost dependency (or the code too much) and the std::vector<bool> (specialized bit-vector) implementation compiled to assembler sent me running for the hills. Branches, branches everywhere... http://gerrit.cloudera.org:8080/#/c/6726/7/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: Line 935 > I'm not sure, but this hoisting may have been deliberate because the compil It certainly shouldn't hurt to re-hoist Line 208: dict_filters_active_.reset(new bool[scanner_conjunct_ctxs_->size()]); > This probably works fine in practice but afaik scoped_ptr will call free in Will check PS7, Line 1051: _ > nit: extraneous _ Originally a class member, until I realized this function had no access. Will fix. http://gerrit.cloudera.org:8080/#/c/6726/7/be/src/exec/parquet-column-readers.cc File be/src/exec/parquet-column-readers.cc: Line 317: bool ReadValueBatch(ScratchTupleBatch& scratch, int* num_values) { > I think it would be best to avoid using the ScratchTupleBatch abstraction i My guess is that inlining the functions from the header should get about as optimal as possible. Maybe hoisting 'max = scratch.capacity()' to tell the compiler it is unchanged throughout the loop, but other than that, I don't see much room for improvement that has been lost. Let's see what perf results look like when that is ready. I don't want to pass the bitmap directly, as I only want the bitmap to be touched (or even known about) by those pieces of code which actually are running in a filtered column reader. Line 408: if (IS_FILTERED && IS_DICT_ENCODED && > This patch pushes the predicate evaluation down into ReadFilteredSlot() so Perhaps there is a way to do this, but I don't see a clear path to that. We have some choices, but they are all problematic: Materialize codewords and values. Where do the codewords go? There is no space in the tuple. Materialize codewords only. What do we do when there is no dictionary? Materialize codewords and values conditionally. Now we have to write code to deal with either of the two - and to go back inside the dictionary and convert codewords back into values, while the entire functionality to do that was hidden inside the private details of the dict<T> decoder and not exposed to the scanner at all. In any case, doing predicate evaluation columnwise is certainly interesting (SIMD combined with a lazy value lookup cache seems especially promising), but it completely wreaks havoc with expression evaluation order, and I think dealing with the ramifications of that is probably beyond the scope of this diff. Here, we are just trying to make use of our pre-computed dictionary filter results as efficiently as possible. PS7, Line 408: IS_FILTERED > I think IS_FILTERED could just be a function template parameter similar to The behavior is different as a class, not just at this function level, so I don't see how that would work. Line 417: if (IS_FILTERED) batch.InvalidateTuple(val_index); > This and setting the null indicator can be mutually exclusive, right? Sin True. (Was not so in original code). There's actually a bug here in our decoding, but I don't want to open that right now. We don't throw any kind of error on getting a NULL decoding for a non-NULLable column. PS7, Line 468: !column_has_match_ > Maybe rename the member variable to match the function name. I think the cu Done Line 609: bool true_on_null_; > Unused? It does seem like we need this for correctness in some cases, right I tried to do some optimization based on this and allowed passing in predicates that were true on null in the last diff, but it looked very speculative and ended up being a lot of work and complexity. I'll get rid of this. http://gerrit.cloudera.org:8080/#/c/6726/7/be/src/exec/parquet-column-readers.h File be/src/exec/parquet-column-readers.h: Line 24: #include "exec/parquet-scratch-tuple-batch.h" > I think we only need a forward declaration in this header - not a big deal Will do the bare minimum I can get away with. Unfortunately very familiar with the compile time problem. PS7, Line 374: aller must pass a tuple that can be used to : // materialize temporary values from the dictionary. > I think this needs updating - there's a callsite that passes in nullptr. Done Line 391: // Returns true if the row group has no columns which pass filtering conjuncts, > The function name and comment aren't too enlightening for me. Done http://gerrit.cloudera.org:8080/#/c/6726/7/be/src/exec/parquet-scratch-tuple-batch.h File be/src/exec/parquet-scratch-tuple-batch.h: Line 115: Bitmap valid_bitmap_; > We should consider if it's better to use a bitmap or an array of bools. The I think our bitmap implementation is quite good - if we manage to get this compiled down to a `testb M, bit`, we should be in good shape. Again lets wait for benchmarks to tell us the answer. http://gerrit.cloudera.org:8080/#/c/6726/7/testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test File testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test: Line 25: parquet dictionary predicates: int_col IS NULL, int_col > 1 Oops, this needs reverting since I removed support for true on NULL predicates. -- To view, visit http://gerrit.cloudera.org:8080/6726 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I65981c89e5292086809ec1268f5a273f4c1fe054 Gerrit-PatchSet: 7 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Zach Amsden <[email protected]> Gerrit-Reviewer: Joe McDonnell <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Zach Amsden <[email protected]> Gerrit-HasComments: Yes
