GitHub user cramja opened a pull request: https://github.com/apache/incubator-quickstep/pull/109

Refactored bulk insertion to the SplitRow store

The inner loop of the insert algorithm has been changed to reduce function calls to only those that are absolutely necessary. We also merge adjacent copies when the source is another row store, which speeds up insertion.

This change also adds support for 'partial inserts'. A partial insert writes only a subset of the columns at a time. Partial inserts will be used in a later commit.

*Testing*
Unit tests have been updated. The old bulkInsert tests needed to be modified because a block is no longer always filled completely; it is now filled only up to a threshold value. This reduces the runtime of the costly inner loop at the cost of a few tuples of capacity per block.

*Performance*
I had a [similar PR-100 open](https://github.com/apache/incubator-quickstep/pull/100) last week. I ran TPCH SF100 queries 1-17 with this branch and with the branch from PR-100. The two performed within a 1% margin of each other, so it is safe to say that this branch is as fast as the PR-100 branch (which was 2x the base).

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cramja/incubator-quickstep splitrow_insert_refactor

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-quickstep/pull/109.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #109

----
commit 4ce5acf046e0d5fce320efcae7aea648549e98e9
Author: cramja <marc.spehlm...@gmail.com>
Date:   2016-10-05T21:40:30Z

Refectored bulk insertion to the SplitRow store

The inner loop of the insert algorithm has been changed to reduce function calls to only those that are absolutely necessary. Also, we merge copies which come from other rowstore source, speeding up insertion time.

Also adds support for the idea of 'partial inserts'. Partial inserts are when you are only inserting a subset of the columns at a time.
Partial inserts will be used in a later commit.

----
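
For readers unfamiliar with the two ideas in the description, merging copies from a row-store source and partial inserts, the following is a minimal sketch of how they might fit together. It is not the code from this pull request: `CopyRun`, `mergeContiguousRuns`, and `copyRows` are hypothetical names invented for illustration, and the sketch assumes fixed-width attributes stored at known byte offsets in each row.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical description of one attribute copy: byte offsets inside a
// source row and a destination row, plus the number of bytes to copy.
struct CopyRun {
  std::size_t src_offset;
  std::size_t dst_offset;
  std::size_t bytes;
};

// Merge neighbouring runs that are contiguous in BOTH the source and the
// destination row, so that several per-attribute copies become one memcpy.
std::vector<CopyRun> mergeContiguousRuns(const std::vector<CopyRun> &runs) {
  std::vector<CopyRun> merged;
  for (const CopyRun &run : runs) {
    if (!merged.empty()) {
      CopyRun &last = merged.back();
      if (last.src_offset + last.bytes == run.src_offset &&
          last.dst_offset + last.bytes == run.dst_offset) {
        last.bytes += run.bytes;  // Extend the previous run instead of adding one.
        continue;
      }
    }
    merged.push_back(run);
  }
  return merged;
}

// Copy num_rows rows using precomputed runs. The runs may cover only a subset
// of the destination columns (a "partial insert" in the sense of this sketch);
// destination bytes not covered by any run are left untouched.
void copyRows(const char *src, std::size_t src_row_bytes,
              char *dst, std::size_t dst_row_bytes,
              std::size_t num_rows, const std::vector<CopyRun> &runs) {
  assert(src != nullptr && dst != nullptr);
  for (std::size_t row = 0; row < num_rows; ++row) {
    const char *src_row = src + row * src_row_bytes;
    char *dst_row = dst + row * dst_row_bytes;
    for (const CopyRun &run : runs) {
      std::memcpy(dst_row + run.dst_offset, src_row + run.src_offset, run.bytes);
    }
  }
}
```

In this sketch the run list is computed once per bulk insert, so the inner loop contains only `memcpy` calls. A source row of three 4-byte attributes copied into destination columns 0, 1, and 3 of a four-column row collapses from three runs to two, and destination column 2 is left for a later partial insert.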