[GitHub] incubator-quickstep pull request #101: Optimize PackedRowStoreValueAccessor ...

2016-10-11 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/incubator-quickstep/pull/101


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-quickstep pull request #101: Optimize PackedRowStoreValueAccessor ...

2016-09-21 Thread saketj
GitHub user saketj opened a pull request:

https://github.com/apache/incubator-quickstep/pull/101

Optimize PackedRowStoreValueAccessor & BasicColumnStoreValueAccessor by 
removing redundant computations and clearly exposing a strided memory access 
pattern

This PR proposes to optimize how value accessors read values from a storage 
block. Profiling showed that roughly 10%-15% of execution time was spent on 
redundant calls to CatalogRelationSchema to look up properties that are 
constant while iterating over a given column. Simply caching these values could 
give a significant improvement. Code analysis also showed further opportunities 
for compiler-driven optimization and vectorization if the strided memory access 
pattern is exposed to the compiler and the runtime. After all, most iterations 
over a column in a relation can be expressed by an equation as simple as 
`base_address + tuple_id * offset`, where base_address and offset are constant. 
Exposing these semantics opens up many opportunities that the existing design 
of ValueAccessors obscures. Rewriting the entire set of ValueAccessors is a 
mammoth task. Therefore, this PR and a few subsequent PRs will do this 
incrementally, making sure that each PR addresses end-to-end functionality for 
some subset of features.
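
As a rough illustration of the strided pattern described above (not the code 
in this PR), the sketch below caches a base address and a stride once and then 
resolves each value as `base_address + tuple_id * stride`. The names 
`StridedColumn` and `SumInt32Column` are hypothetical and do not come from 
Quickstep; the sketch assumes a fixed-width, non-nullable int32 attribute.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a value accessor over one fixed-width column.
// The base address and stride are looked up once, outside the scan loop,
// instead of being recomputed from the schema for every tuple.
class StridedColumn {
 public:
  StridedColumn(const void *base_address, std::size_t stride)
      : base_(static_cast<const char*>(base_address)), stride_(stride) {
    assert(stride > 0);
  }

  // Address of the value for tuple_id: base_address + tuple_id * stride.
  const void* at(std::size_t tuple_id) const {
    return base_ + tuple_id * stride_;
  }

 private:
  const char *base_;
  std::size_t stride_;
};

// Sums an int32 attribute. The loop body is plain address arithmetic,
// which is the form a compiler can reason about most easily.
std::int64_t SumInt32Column(const StridedColumn &column,
                            std::size_t num_tuples) {
  std::int64_t sum = 0;
  for (std::size_t tid = 0; tid < num_tuples; ++tid) {
    std::int32_t value;
    std::memcpy(&value, column.at(tid), sizeof(value));
    sum += value;
  }
  return sum;
}
```

For a packed row store the stride would be the row width and the base address 
would point at the attribute inside the first row; for a column store the 
stride would simply be the attribute width.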

This PR specifically optimizes only PackedRowStoreValueAccessor and 
BasicColumnStoreValueAccessor. To demonstrate the end-to-end benefit, predicate 
evaluation involving relational operators, regular expressions, and literal 
comparisons has been refactored to take advantage of these optimizations.
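
To make the predicate-evaluation case concrete, here is a minimal sketch of a 
literal comparison evaluated over a strided column. It is an assumption about 
the general shape of such a loop, not the refactored Quickstep code; 
`SelectLessThan` and its parameters are hypothetical names, and the attribute 
is assumed to be a non-nullable int32.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: returns the ids of all tuples whose int32 attribute
// is strictly less than `literal`. The attribute of tuple `tid` is assumed
// to live at base_address + tid * stride.
std::vector<std::size_t> SelectLessThan(const char *base_address,
                                        std::size_t stride,
                                        std::size_t num_tuples,
                                        std::int32_t literal) {
  assert(stride >= sizeof(std::int32_t));
  std::vector<std::size_t> matches;
  for (std::size_t tid = 0; tid < num_tuples; ++tid) {
    std::int32_t value;
    // No per-tuple schema lookup: only address arithmetic and a compare.
    std::memcpy(&value, base_address + tid * stride, sizeof(value));
    if (value < literal) {
      matches.push_back(tid);
    }
  }
  return matches;
}
```

The same function would serve a row store and a column store; only the stride 
passed by the caller differs.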


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/saketj/incubator-quickstep optimize-value-accessor

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/incubator-quickstep/pull/101.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #101


commit 88f810c0eac4a248a81a7cca614389cfd94aa1dd
Author: Saket Saurabh 
Date:   2016-09-21T08:17:19Z

Optimize PackedRowStoreValueAccessor & BasicColumnStoreValueAccessor by 
removing redundant computations and clearly exposing a strided memory access 
pattern



