Hi,

Glancing at the code a bit, this seems like a reasonable optimization. The SelectionVectorView was originally added as an optimization for decoders that are able to fully evaluate predicates, but that isn't to say it can't also be used by other decoders as a means of avoiding unnecessary copying. As you suggest, it'd be particularly helpful when materializing large rows for which the predicates are very selective (not many rows returned).
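
To make the idea concrete, here is a rough standalone sketch of the difference between copying every value and copying only the rows that are still selected. This is not Kudu code: `Selection`, `CopyAll`, and `CopySelected` are hypothetical names made up for illustration, and a plain `std::vector<bool>` stands in for the real selection vector.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a selection vector: sel[i] is true if row i is
// still selected after the predicate columns have been evaluated.
using Selection = std::vector<bool>;

// Copies every value regardless of the selection, which is the behaviour
// described in this thread for non-predicate columns.
// Returns the number of values copied.
std::size_t CopyAll(const std::vector<std::string>& src,
                    std::vector<std::string>* dst) {
  dst->assign(src.begin(), src.end());
  return src.size();
}

// Copies only the rows that are still selected. Deselected rows are left as
// empty strings in dst, since nothing downstream should read them.
// Returns the number of values copied.
std::size_t CopySelected(const std::vector<std::string>& src,
                         const Selection& sel,
                         std::vector<std::string>* dst) {
  assert(sel.size() == src.size());
  dst->assign(src.size(), std::string());
  std::size_t copied = 0;
  for (std::size_t i = 0; i < src.size(); ++i) {
    if (!sel[i]) continue;  // Row was filtered out by an earlier predicate.
    (*dst)[i] = src[i];
    ++copied;
  }
  return copied;
}
```

With four rows of which only one survives the predicate, `CopyAll` copies four values while `CopySelected` copies one. The saving grows with the size of each value and with how selective the predicate is.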

Hope this helped,
Andrew

On Thu, Jun 14, 2018 at 12:20 AM, helifu <[email protected]> wrote:

> Hi all,
>
> I read the code of ‘CFileIterator::Scan’, and found that it would be
> better to pass ‘remaining_sel’ to the function ‘CopyNextValues’ to skip
> copying unnecessary data for the columns that are not in predicates. In
> other words, the decoder currently copies all the data of the columns
> that are not in predicates.
>
> CFileIterator::Scan:
>
>   for (const auto& col_pred : (ctx->DecoderEvalNotDisabled()) {
>     RETURN_NOT_OK(pb->dblk_->CopyNextAndEval(&this_batch, ctx, &remaining_sel, &remaining_dst));
>   } else {
>     RETURN_NOT_OK(pb->dblk_->CopyNextValues(&this_batch, &remaining_dst));  <-- Here
>   }
>
> For example: select column_a, column_b from table where column_c='c';
>
> In function ‘MaterializingIterator::MaterializeBlock’:
>
> Because column_c is in a predicate, it is in ‘col_idx_predicates_’.
> The decoder will then evaluate the predicate and either copy the
> matching data or set the row's entry in the ‘SelectionVector’ to false.
>
> Next, column_a and column_b are not in predicates, so they are in
> ‘non_predicate_column_indexes_’. This time, the decoder will copy the
> data directly, even for rows that have already been set to false.
>
> I just want to make sure I am interpreting this correctly. Thanks in
> advance.
>
> 何李夫
>
> 2017-04-10 16:06:24

--
Andrew Wong
