[jira] [Commented] (QUICKSTEP-59) Improve performance of BitVector in Quickstep by eliminating branches

2016-10-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/QUICKSTEP-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15615587#comment-15615587
 ] 

ASF GitHub Bot commented on QUICKSTEP-59:
-

Github user saketj commented on the issue:

https://github.com/apache/incubator-quickstep/pull/114
  
Merged in master


> Improve performance of BitVector in Quickstep by eliminating branches
> -
>
> Key: QUICKSTEP-59
> URL: https://issues.apache.org/jira/browse/QUICKSTEP-59
> Project: Apache Quickstep
>  Issue Type: Improvement
>  Components: Query Execution, Utility
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
> Attachments: QUICKSTEP-59.01.patch
>
>
> The {{setBitRegularVersion()}} of {{BitVector.hpp}} is a critical function 
> that is called in various tight loop iterations over storage blocks 
> throughout the Quickstep code. This function has a simple purpose of setting 
> a bit value in a BitVector to true/false given a boolean argument. However, 
> it has an expensive if-else branch that can add a significant penalty at 
> runtime due to branch mis-predictions. 
> This short PR completely removes branching from the 
> {{setBitRegularVersion()}} by replacing the same functionality with a set of 
> bitwise arithmetic operations. Given that a branch mis-prediction costs about 
> 10 cycles, the branchless code is expected to save those precious 10 cycles 
> at the slight expense of 4 additional bitwise operations (an additional 2-4 
> cycles only, given hyper-threading).
> *Tests:*
> The existing unit tests should already cover the changes introduced by this 
> PR. Correctness also verified by comparing TPC-H query output results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (QUICKSTEP-59) Improve performance of BitVector in Quickstep by eliminating branches

2016-10-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/QUICKSTEP-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15615586#comment-15615586
 ] 

ASF GitHub Bot commented on QUICKSTEP-59:
-

Github user saketj closed the pull request at:

https://github.com/apache/incubator-quickstep/pull/114


> Improve performance of BitVector in Quickstep by eliminating branches
> -
>
> Key: QUICKSTEP-59
> URL: https://issues.apache.org/jira/browse/QUICKSTEP-59
> Project: Apache Quickstep
>  Issue Type: Improvement
>  Components: Query Execution, Utility
>Reporter: Saket Saurabh
>Assignee: Saket Saurabh
> Attachments: QUICKSTEP-59.01.patch
>
>
> The {{setBitRegularVersion()}} of {{BitVector.hpp}} is a critical function 
> that is called in various tight loop iterations over storage blocks 
> throughout the Quickstep code. This function has a simple purpose of setting 
> a bit value in a BitVector to true/false given a boolean argument. However, 
> it has an expensive if-else branch that can add a significant penalty at 
> runtime due to branch mis-predictions. 
> This short PR completely removes branching from the 
> {{setBitRegularVersion()}} by replacing the same functionality with a set of 
> bitwise arithmetic operations. Given that a branch mis-prediction costs about 
> 10 cycles, the branchless code is expected to save those precious 10 cycles 
> at the slight expense of 4 additional bitwise operations (an additional 2-4 
> cycles only, given hyper-threading).
> *Tests:*
> The existing unit tests should already cover the changes introduced by this 
> PR. Correctness also verified by comparing TPC-H query output results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (QUICKSTEP-59) Improve performance of BitVector in Quickstep by eliminating branches

2016-10-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/QUICKSTEP-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15599695#comment-15599695
 ] 

ASF GitHub Bot commented on QUICKSTEP-59:
-

Github user hbdeshmukh commented on the issue:

https://github.com/apache/incubator-quickstep/pull/114
  
@saketj You may have to close this PR manually, as it was created from your 
personal fork of Quickstep. 


> Improve performance of BitVector in Quickstep by eliminating branches
> -
>
> Key: QUICKSTEP-59
> URL: https://issues.apache.org/jira/browse/QUICKSTEP-59
> Project: Apache Quickstep
>  Issue Type: Improvement
>  Components: Query Execution, Utility
>Reporter: Saket Saurabh
> Attachments: QUICKSTEP-59.01.patch
>
>
> The {{setBitRegularVersion()}} of {{BitVector.hpp}} is a critical function 
> that is called in various tight loop iterations over storage blocks 
> throughout the Quickstep code. This function has a simple purpose of setting 
> a bit value in a BitVector to true/false given a boolean argument. However, 
> it has an expensive if-else branch that can add a significant penalty at 
> runtime due to branch mis-predictions. 
> This short PR completely removes branching from the 
> {{setBitRegularVersion()}} by replacing the same functionality with a set of 
> bitwise arithmetic operations. Given that a branch mis-prediction costs about 
> 10 cycles, the branchless code is expected to save those precious 10 cycles 
> at the slight expense of 4 additional bitwise operations (an additional 2-4 
> cycles only, given hyper-threading).
> *Tests:*
> The existing unit tests should already cover the changes introduced by this 
> PR. Correctness also verified by comparing TPC-H query output results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (QUICKSTEP-59) Improve performance of BitVector in Quickstep by eliminating branches

2016-10-22 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/QUICKSTEP-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15598252#comment-15598252
 ] 

ASF GitHub Bot commented on QUICKSTEP-59:
-

Github user cramja commented on the issue:

https://github.com/apache/incubator-quickstep/pull/114
  
@saketj so what's the final analysis on why the branchless code takes 
longer on these queries? Is it because if the branch is predicted correctly, 
then the old code is slightly faster than the bit-fiddling code in the new 
method?


> Improve performance of BitVector in Quickstep by eliminating branches
> -
>
> Key: QUICKSTEP-59
> URL: https://issues.apache.org/jira/browse/QUICKSTEP-59
> Project: Apache Quickstep
>  Issue Type: Improvement
>  Components: Query Execution, Utility
>Reporter: Saket Saurabh
> Attachments: QUICKSTEP-59.01.patch
>
>
> The {{setBitRegularVersion()}} of {{BitVector.hpp}} is a critical function 
> that is called in various tight loop iterations over storage blocks 
> throughout the Quickstep code. This function has a simple purpose of setting 
> a bit value in a BitVector to true/false given a boolean argument. However, 
> it has an expensive if-else branch that can add a significant penalty at 
> runtime due to branch mis-predictions. 
> This short PR completely removes branching from the 
> {{setBitRegularVersion()}} by replacing the same functionality with a set of 
> bitwise arithmetic operations. Given that a branch mis-prediction costs about 
> 10 cycles, the branchless code is expected to save those precious 10 cycles 
> at the slight expense of 4 additional bitwise operations (an additional 2-4 
> cycles only, given hyper-threading).
> *Tests:*
> The existing unit tests should already cover the changes introduced by this 
> PR. Correctness also verified by comparing TPC-H query output results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (QUICKSTEP-59) Improve performance of BitVector in Quickstep by eliminating branches

2016-10-20 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/QUICKSTEP-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15593830#comment-15593830
 ] 

ASF GitHub Bot commented on QUICKSTEP-59:
-

Github user saketj commented on the issue:

https://github.com/apache/incubator-quickstep/pull/114
  
It turns out that these queries have very few scan (selection) operations 
as compared to other queries and mostly involve joins where BitVectors might 
not be aggressively used. Also, if the selectivity is high, branch predictions 
will anyway be more-or-less accurate. (more so after the LIP filter 
optimization).
However, another interesting thing is that for these queries 09, 20, 21, 
the differences are actually not high across different trials. A cold cache may 
be playing some tricks, when we report our aggregate numbers.


| Query 9  | w/o PR (ms) | w/ PR (ms) |
|--|-||
| Trial 1  | 110251  | 42551  |
| Trial 2  | 13763   | 14386  |
| Trial 3  | 10302   | 10129  |
| Trial 4  | 11783   | 10616  |
| Trial 5  | 10070   | 9243   |
|  | ||
|  | ||
| Query 20 | w/o PR (ms) | w/ PR (ms) |
| Trial 1  | 85988   | 104532 |
| Trial 2  | 56639   | 58389  |
| Trial 3  | 50041   | 51712  |
| Trial 4  | 54269   | 50740  |
| Trial 5  | 49688   | 50723  |
|  | ||
|  | ||
| Query 21 | w/o PR (ms) | w/ PR (ms) |
| Trial 1  | 126323  | 142742 |
| Trial 2  | 120334  | 119632 |
| Trial 3  | 126304  | 115668 |
| Trial 4  | 115185  | 114618 |
| Trial 5  | 114663  | 113232 |


> Improve performance of BitVector in Quickstep by eliminating branches
> -
>
> Key: QUICKSTEP-59
> URL: https://issues.apache.org/jira/browse/QUICKSTEP-59
> Project: Apache Quickstep
>  Issue Type: Improvement
>  Components: Query Execution, Utility
>Reporter: Saket Saurabh
> Attachments: QUICKSTEP-59.01.patch
>
>
> The {{setBitRegularVersion()}} of {{BitVector.hpp}} is a critical function 
> that is called in various tight loop iterations over storage blocks 
> throughout the Quickstep code. This function has a simple purpose of setting 
> a bit value in a BitVector to true/false given a boolean argument. However, 
> it has an expensive if-else branch that can add a significant penalty at 
> runtime due to branch mis-predictions. 
> This short PR completely removes branching from the 
> {{setBitRegularVersion()}} by replacing the same functionality with a set of 
> bitwise arithmetic operations. Given that a branch mis-prediction costs about 
> 10 cycles, the branchless code is expected to save those precious 10 cycles 
> at the slight expense of 4 additional bitwise operations (an additional 2-4 
> cycles only, given hyper-threading).
> *Tests:*
> The existing unit tests should already cover the changes introduced by this 
> PR. Correctness also verified by comparing TPC-H query output results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (QUICKSTEP-59) Improve performance of BitVector in Quickstep by eliminating branches

2016-10-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/QUICKSTEP-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15590351#comment-15590351
 ] 

ASF GitHub Bot commented on QUICKSTEP-59:
-

Github user saketj commented on the issue:

https://github.com/apache/incubator-quickstep/pull/114
  
@pateljm Sure Jignesh, I have rebased this PR now with the master.

Thanks @hbdeshmukh for helping in re-running the performance numbers for 
this PR over the updated master after recent commits. Here are the updated 
numbers for the entire TPC-H suite of queries now:

|  | Execution time w/o PR (in ms) | Execution time w/ PR (in 
ms) |

|--|---|--|
| Query 01.sql | 16,405| 14,901 
  |
| Query 02.sql | 5,603 | 5,582  
  |
| Query 03.sql | 8,267 | 6,018  
  |
| Query 04.sql | 4,823 | 2,741  
  |
| Query 05.sql | 5,210 | 4,592  
  |
| Query 06.sql | 396   | 402
  |
| Query 07.sql | 23,907| 23,154 
  |
| Query 08.sql | 3,324 | 3,268  
  |
| Query 09.sql | 9,641 | 15,749 
  |
| Query 10.sql | 15,215| 14,823 
  |
| Query 11.sql | 2,141 | 2,160  
  |
| Query 12.sql | 2,479 | 2,121  
  |
| Query 13.sql | 34,369| 34,159 
  |
| Query 14.sql | 795   | 822
  |
| Query 15.sql | 4,696 | 5,109  
  |
| Query 16.sql | 9,442 | 9,040  
  |
| Query 17.sql | 149,132   | 142,230
  |
| Query 18.sql | 81,042| 76,434 
  |
| Query 19.sql | 1,453 | 1,564  
  |
| Query 20.sql | 53,268| 59,553 
  |
| Query 21.sql | 119,326   | 124,905
  |
| Query 22.sql | 6,812 | 7,187  
  |
| Total| 557745| 556515 
  |

Overall, there is certainly some benefit to be gained by introducing a 
branchless codepath. Most queries seem to improve in general. However, the 
results for Query 09, 20, and 21 are definitely surprising. I will spend some 
time understanding why this is the case.


> Improve performance of BitVector in Quickstep by eliminating branches
> -
>
> Key: QUICKSTEP-59
> URL: https://issues.apache.org/jira/browse/QUICKSTEP-59
> Project: Apache Quickstep
>  Issue Type: Improvement
>  Components: Query Execution, Utility
>Reporter: Saket Saurabh
> Attachments: QUICKSTEP-59.01.patch
>
>
> The {{setBitRegularVersion()}} of {{BitVector.hpp}} is a critical function 
> that is called in various tight loop iterations over storage blocks 
> throughout the Quickstep code. This function has a simple purpose of setting 
> a bit value in a BitVector to true/false given a boolean argument. However, 
> it has an expensive if-else branch that can add a significant penalty at 
> runtime due to branch mis-predictions. 
> This short PR completely removes branching from the 
> {{setBitRegularVersion()}} by replacing the same functionality with a set of 
> bitwise arithmetic operations. Given that a branch mis-prediction costs about 
> 10 cycles, the branchless code is expected to save those precious 10 cycles 
> at the slight expense of 4 additional bitwise operations (an additional 2-4 
> cycles only, given hyper-threading).
> *Tests:*
> The existing unit tests should already cover the changes introduced by this 
> PR. Correctness also verified by comparing TPC-H query output results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (QUICKSTEP-59) Improve performance of BitVector in Quickstep by eliminating branches

2016-10-19 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/QUICKSTEP-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15589300#comment-15589300
 ] 

ASF GitHub Bot commented on QUICKSTEP-59:
-

Github user pateljm commented on the issue:

https://github.com/apache/incubator-quickstep/pull/114
  
Sweet @saketj. Can your rebase? I can merge this. 


> Improve performance of BitVector in Quickstep by eliminating branches
> -
>
> Key: QUICKSTEP-59
> URL: https://issues.apache.org/jira/browse/QUICKSTEP-59
> Project: Apache Quickstep
>  Issue Type: Improvement
>  Components: Query Execution, Utility
>Reporter: Saket Saurabh
> Attachments: QUICKSTEP-59.01.patch
>
>
> The {{setBitRegularVersion()}} of {{BitVector.hpp}} is a critical function 
> that is called in various tight loop iterations over storage blocks 
> throughout the Quickstep code. This function has a simple purpose of setting 
> a bit value in a BitVector to true/false given a boolean argument. However, 
> it has an expensive if-else branch that can add a significant penalty at 
> runtime due to branch mis-predictions. 
> This short PR completely removes branching from the 
> {{setBitRegularVersion()}} by replacing the same functionality with a set of 
> bitwise arithmetic operations. Given that a branch mis-prediction costs about 
> 10 cycles, the branchless code is expected to save those precious 10 cycles 
> at the slight expense of 4 additional bitwise operations (an additional 2-4 
> cycles only, given hyper-threading).
> *Tests:*
> The existing unit tests should already cover the changes introduced by this 
> PR. Correctness also verified by comparing TPC-H query output results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (QUICKSTEP-59) Improve performance of BitVector in Quickstep by eliminating branches

2016-10-18 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/QUICKSTEP-59?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15587163#comment-15587163
 ] 

ASF GitHub Bot commented on QUICKSTEP-59:
-

Github user saketj commented on the issue:

https://github.com/apache/incubator-quickstep/pull/114
  
@cramja  Thanks Marc for reviewing this.


> Improve performance of BitVector in Quickstep by eliminating branches
> -
>
> Key: QUICKSTEP-59
> URL: https://issues.apache.org/jira/browse/QUICKSTEP-59
> Project: Apache Quickstep
>  Issue Type: Improvement
>  Components: Query Execution, Utility
>Reporter: Saket Saurabh
> Attachments: QUICKSTEP-59.01.patch
>
>
> The {{setBitRegularVersion()}} of {{BitVector.hpp}} is a critical function 
> that is called in various tight loop iterations over storage blocks 
> throughout the Quickstep code. This function has a simple purpose of setting 
> a bit value in a BitVector to true/false given a boolean argument. However, 
> it has an expensive if-else branch that can add a significant penalty at 
> runtime due to branch mis-predictions. 
> This short PR completely removes branching from the 
> {{setBitRegularVersion()}} by replacing the same functionality with a set of 
> bitwise arithmetic operations. Given that a branch mis-prediction costs about 
> 10 cycles, the branchless code is expected to save those precious 10 cycles 
> at the slight expense of 4 additional bitwise operations (an additional 2-4 
> cycles only, given hyper-threading).
> *Tests:*
> The existing unit tests should already cover the changes introduced by this 
> PR. Correctness also verified by comparing TPC-H query output results.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)