Dan Burkert has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/8869 )
Change subject: KUDU-2231: sparse column predicate can cause excessive data-block reads
......................................................................

KUDU-2231: sparse column predicate can cause excessive data-block reads

When scanning with a sparsely-matching predicate, the CFileIterator can
repeatedly materialize non-predicate column blocks multiple times. The
result is huge amounts of CPU wasted in block decoding and poor
performance. The root cause is that CFileIterator::SeekToOrdinal does
not check whether the currently materialized data block contains the
ordinal index being seeked to. Instead, it throws away the currently
prepared blocks (in CFileIterator::PrepareForNewSeek), and
re-materializes the blocks again.

This commit is a very targeted fix. Since I've had some time to get
familiar with this codepath in the past few days, I've found some things
that I think we could improve and simplify in follow-up commits, which
I've filed as KUDU-2243.

Change-Id: I8eb3be4a809f882ccd80c48612099b2071306ff7
Reviewed-on: http://gerrit.cloudera.org:8080/8869
Tested-by: Kudu Jenkins
Reviewed-by: Dan Burkert <[email protected]>
---
M src/kudu/cfile/cfile_reader.cc
M src/kudu/tablet/tablet-pushdown-test.cc
2 files changed, 107 insertions(+), 24 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Dan Burkert: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/8869
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I8eb3be4a809f882ccd80c48612099b2071306ff7
Gerrit-Change-Number: 8869
Gerrit-PatchSet: 6
Gerrit-Owner: Dan Burkert <[email protected]>
Gerrit-Reviewer: Dan Burkert <[email protected]>
Gerrit-Reviewer: David Ribeiro Alves <[email protected]>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <[email protected]>
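
For readers who have not looked at this codepath, the root cause described in the commit message can be illustrated with a toy model. This is only a sketch and is not the code in cfile_reader.cc: ToyBlockIterator, CountDecodes, the reuse_block flag, and the fixed rows-per-block layout are all hypothetical names and simplifications chosen for illustration. The model counts how often a block would be decoded when every seek discards the prepared block, compared with first checking whether the prepared block already contains the target ordinal.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a column iterator. It only models which block
// is currently "materialized" and how many times a block was decoded.
class ToyBlockIterator {
 public:
  ToyBlockIterator(size_t num_rows, size_t rows_per_block, bool reuse_block)
      : num_rows_(num_rows),
        rows_per_block_(rows_per_block),
        reuse_block_(reuse_block) {}

  // Positions the iterator at `ordinal`. When reuse_block_ is false, the
  // prepared block is always discarded and decoded again, which mirrors the
  // behavior the commit message describes. When true, the block is kept if
  // it already covers `ordinal`.
  void SeekToOrdinal(size_t ordinal) {
    assert(ordinal < num_rows_);
    const size_t first = (ordinal / rows_per_block_) * rows_per_block_;
    const bool covered = has_block_ && first == block_first_;
    if (!(reuse_block_ && covered)) {
      block_first_ = first;  // simulate decoding the containing block
      has_block_ = true;
      ++decode_count_;
    }
    position_ = ordinal;
  }

  size_t decode_count() const { return decode_count_; }
  size_t position() const { return position_; }

 private:
  const size_t num_rows_;
  const size_t rows_per_block_;
  const bool reuse_block_;
  bool has_block_ = false;
  size_t block_first_ = 0;
  size_t position_ = 0;
  size_t decode_count_ = 0;
};

// Simulates a sparse predicate that matches every 10th row of a 1000-row
// column stored in blocks of 100 rows, and returns the number of decodes.
inline size_t CountDecodes(bool reuse_block) {
  ToyBlockIterator iter(/*num_rows=*/1000, /*rows_per_block=*/100, reuse_block);
  for (size_t ordinal = 0; ordinal < 1000; ordinal += 10) {
    iter.SeekToOrdinal(ordinal);
  }
  return iter.decode_count();
}
```

In this toy setup, CountDecodes(false) performs one decode per seek (100 in total), while CountDecodes(true) performs one decode per block (10 in total). The real iterator has to handle variable-sized blocks and more state, so the numbers here only show the shape of the problem.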
