GitHub user saketj opened a pull request:
https://github.com/apache/incubator-quickstep/pull/101
Optimize PackedRowStoreValueAccessor & BasicColumnStoreValueAccessor by
removing redundant computations and clearly exposing a strided memory access
pattern
This PR proposes to optimize the way value accessors read values from a
storage block. Profiling showed that roughly 10-15% of execution time was
spent on redundant method calls to CatalogRelationSchema to look up
properties that stay constant while iterating over a given column. Simply
caching these values could give significant improvements. Code analysis also
showed that more opportunities for compiler optimization and vectorization
open up if the strided memory access pattern is exposed to the compiler and
the runtime. After all, most iterations over a column in a relation can be
expressed by an equation as simple as `base_address + tuple_id * offset`,
where base_address and offset are constant. Exposing these semantics enables
optimizations that the existing design of ValueAccessors obscures. Rewriting
the entire set of ValueAccessors is a mammoth task. Therefore, this PR and a
few subsequent PRs will do this incrementally, making sure that each PR
delivers end-to-end functionality for some subset of features.
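To illustrate the idea, here is a minimal sketch of a column accessor that
caches the constant values once and then resolves each value as
`base_address + tuple_id * offset`. It is not the code in this PR: PackedRows,
StridedColumn, SumInt32Column and MakeDemoRows are hypothetical names invented
for this example, and the layout (fixed-width tuples stored back to back) is
an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a packed row store: fixed-width tuples stored
// back to back in one byte buffer. Not a Quickstep class.
struct PackedRows {
  std::vector<unsigned char> bytes;
  std::size_t tuple_length;               // bytes per tuple
  std::vector<std::size_t> attr_offsets;  // byte offset of each attribute
  std::size_t num_tuples() const { return bytes.size() / tuple_length; }
};

// Hypothetical accessor for a single column. The base address and stride are
// looked up once in the constructor instead of once per value.
class StridedColumn {
 public:
  StridedColumn(const PackedRows &rows, std::size_t attr_id)
      : base_(rows.bytes.data() + rows.attr_offsets[attr_id]),
        stride_(rows.tuple_length),
        num_tuples_(rows.num_tuples()) {}

  // base_address + tuple_id * offset, with no schema lookup on the hot path.
  const void* at(std::size_t tid) const {
    assert(tid < num_tuples_);
    return base_ + tid * stride_;
  }

  std::size_t size() const { return num_tuples_; }

 private:
  const unsigned char *base_;
  std::size_t stride_;
  std::size_t num_tuples_;
};

// Sums a 32-bit integer column. The loop body depends only on two constants,
// which gives the compiler a simple strided pattern to work with.
std::int64_t SumInt32Column(const StridedColumn &col) {
  std::int64_t sum = 0;
  for (std::size_t tid = 0; tid < col.size(); ++tid) {
    std::int32_t value;
    std::memcpy(&value, col.at(tid), sizeof(value));
    sum += value;
  }
  return sum;
}

// Builds four tuples of (int32 id, int32 value) with value = id * 10.
PackedRows MakeDemoRows() {
  PackedRows rows{{}, 8, {0, 4}};
  for (std::int32_t i = 0; i < 4; ++i) {
    const std::int32_t fields[2] = {i, i * 10};
    const unsigned char *p = reinterpret_cast<const unsigned char*>(fields);
    rows.bytes.insert(rows.bytes.end(), p, p + sizeof(fields));
  }
  return rows;
}
```

The accessor keeps a raw pointer into the buffer, so the PackedRows object
must outlive it. For a column store the same class would apply with the
stride equal to the width of the attribute.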
This PR specifically optimizes PackedRowStoreValueAccessor &
BasicColumnStoreValueAccessor only. To demonstrate the end-to-end benefit of
this optimization, predicate evaluation involving relational operators, regular
expressions & literal comparisons has been refactored to take advantage of
these optimizations.
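As a rough sketch of how a literal comparison can use the strided pattern,
the function below filters a column given only a base pointer, a stride and a
tuple count. FilterStridedColumn, Row and DemoFilter are hypothetical names
for illustration and do not reflect the actual predicate evaluation code in
Quickstep.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical helper: returns the ids of all tuples whose column value
// satisfies cmp(value, literal). The column is described only by its base
// address and stride, both constant for the whole scan.
template <typename T, typename Comparator>
std::vector<std::size_t> FilterStridedColumn(const void *base,
                                             std::size_t stride,
                                             std::size_t num_tuples,
                                             const T &literal,
                                             Comparator cmp) {
  assert(stride >= sizeof(T));  // Values must not overlap.
  std::vector<std::size_t> matches;
  const unsigned char *cursor = static_cast<const unsigned char*>(base);
  for (std::size_t tid = 0; tid < num_tuples; ++tid, cursor += stride) {
    T value;
    std::memcpy(&value, cursor, sizeof(T));
    if (cmp(value, literal)) {
      matches.push_back(tid);
    }
  }
  return matches;
}

// Illustrative row layout: the qty column has a stride of sizeof(Row).
struct Row {
  std::int32_t key;
  std::int32_t qty;
};

// Evaluates "qty < 10" over four rows and returns the matching tuple ids.
std::vector<std::size_t> DemoFilter() {
  const Row rows[] = {{1, 5}, {2, 50}, {3, 7}, {4, 90}};
  return FilterStridedColumn<std::int32_t>(
      &rows[0].qty, sizeof(Row), 4, 10, std::less<std::int32_t>());
}
```

Passing the comparator as a template parameter lets the compiler inline the
comparison into the loop, so the same scan serves every relational operator
without a virtual call per tuple.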
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/saketj/incubator-quickstep optimize-value-accessor
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-quickstep/pull/101.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #101
commit 88f810c0eac4a248a81a7cca614389cfd94aa1dd
Author: Saket Saurabh
Date: 2016-09-21T08:17:19Z
Optimize PackedRowStoreValueAccessor & BasicColumnStoreValueAccessor by
removing redundant computations and clearly exposing a strided memory access
pattern
---