GitHub user saketj opened a pull request: https://github.com/apache/incubator-quickstep/pull/101
Optimize PackedRowStoreValueAccessor & BasicColumnStoreValueAccessor by removing redundant computations and clearly exposing a strided memory access pattern

This PR proposes to optimize the way value accessors are currently used to read values from a storage block. Code profiling revealed that roughly 10-15% of the execution time was being spent on redundant method calls to CatalogRelationSchema to look up properties that are always constant while iterating over a given column. Simply caching these values could give significant improvements.

Moreover, code analysis showed that there are further opportunities for compiler-driven optimization and vectorization if the simple semantics of a strided memory access pattern are exposed to the compiler and the runtime. After all, most iterations over a column in a relation can be expressed by an equation as simple as `base_address + tuple_id * offset`, where base_address and offset are constant. Exposing these semantics opens up a number of interesting opportunities that are obscured by the existing design of ValueAccessors.

Rewriting the entire set of ValueAccessors is a mammoth task. Therefore, this PR and a few subsequent PRs propose to do this incrementally, always making sure that each PR addresses end-to-end functionality for some subset of features. This PR specifically optimizes PackedRowStoreValueAccessor & BasicColumnStoreValueAccessor only. To demonstrate the end-to-end benefit of this optimization, predicate evaluation involving relational operators, regular expressions, and literal comparison has been refactored to take advantage of the strided access pattern.
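As a rough illustration of the `base_address + tuple_id * offset` idea, here is a minimal C++ sketch. It is not the code in this PR: `StridedColumn`, `CountLessThan`, and `MakeRows` are hypothetical names invented for this example, and the layout (fixed-width, non-nullable `int32` attributes) is an assumption. The base address and the per-tuple offset (called `stride` here) are computed once, so the inner loop makes no per-tuple schema lookups.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper (not a Quickstep class): one fixed-width column
// described by a base pointer and a constant stride between tuples.
struct StridedColumn {
  const char *base_address;  // address of the attribute in tuple 0
  std::size_t stride;        // bytes between the attribute in adjacent tuples
  std::size_t num_tuples;    // number of tuples in the block

  // Address of the attribute value for tuple 'tuple_id'.
  const void* at(std::size_t tuple_id) const {
    assert(tuple_id < num_tuples);
    return base_address + tuple_id * stride;
  }
};

// Counts the int32 values in 'column' that are less than 'literal'.
// The loop body depends only on constants and the loop index, which is
// the kind of access pattern a compiler may be able to vectorize.
std::size_t CountLessThan(const StridedColumn &column, std::int32_t literal) {
  std::size_t matches = 0;
  for (std::size_t tid = 0; tid < column.num_tuples; ++tid) {
    std::int32_t value;
    std::memcpy(&value, column.at(tid), sizeof(value));  // alignment-safe load
    matches += (value < literal) ? 1 : 0;
  }
  return matches;
}

// Builds a toy packed row store of (int32 key, int32 payload) rows,
// with every payload set to zero.
std::vector<char> MakeRows(const std::vector<std::int32_t> &keys) {
  const std::size_t row_width = 2 * sizeof(std::int32_t);
  std::vector<char> storage(keys.size() * row_width, 0);
  for (std::size_t i = 0; i < keys.size(); ++i) {
    std::memcpy(storage.data() + i * row_width, &keys[i], sizeof(keys[i]));
  }
  return storage;
}
```

In this sketch the same loop serves both layouts. For a packed row store the stride is the full tuple width (8 bytes here), and for a column store it is the attribute width (4 bytes), with the base pointer set to the start of the column.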
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/saketj/incubator-quickstep optimize-value-accessor

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-quickstep/pull/101.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #101

----
commit 88f810c0eac4a248a81a7cca614389cfd94aa1dd
Author: Saket Saurabh <ssaur...@cs.wisc.edu>
Date:   2016-09-21T08:17:19Z

    Optimize PackedRowStoreValueAccessor & BasicColumnStoreValueAccessor by removing redundant computations and clearly exposing a strided memory access pattern

----