[
https://issues.apache.org/jira/browse/HIVE-10179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16251015#comment-16251015
]
liyunzhang edited comment on HIVE-10179 at 11/14/17 7:28 AM:
-------------------------------------------------------------
[~teddy.choi]: I would like to ask a question about
[DoubleColAddRepeatingDoubleColumnBench|https://github.com/apache/hive/blob/master/itests/hive-jmh/src/main/java/org/apache/hive/benchmark/vectorization/VectorizedArithmeticBench.java#L53].
Why do we need to test {{DoubleColAddRepeatingDoubleColumnBench}}? As I understand it,
this benchmark computes col1 + col2 where all elements of col2 are the same. Is there any
difference between {{DoubleColAddRepeatingDoubleColumnBench}} and
{{DoubleColAddDoubleColumnBench}} in terms of SIMD instructions?
I added the following code to VectorizedArithmeticBench.java:
{code}
public static class DoubleColAddDoubleColumnBench extends AbstractExpression {
  @Override
  public void setup() {
    rowBatch = buildRowBatch(new DoubleColumnVector(), 2,
        getDoubleColumnVector(),
        getDoubleColumnVector());
    expression = new DoubleColAddDoubleColumn(0, 1, 2);
  }
}
{code}
After running both {{DoubleColAddDoubleColumnBench}} and
{{DoubleColAddRepeatingDoubleColumnBench}}, I got the following results:
|| Benchmark || AVX1 || AVX2 || Perf improvement ||
| DoubleColAddDoubleColumnBench | 150709 | 159073 | 5% |
| DoubleColAddRepeatingDoubleColumnBench | 111093 | 95520 | 14% |
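For reference, the percentages in the table can be reproduced as a signed relative change from the AVX1 score to the AVX2 score. This is only a small sketch: the class and method names are hypothetical, and the table does not say whether the scores are throughput or time per operation, so whether a positive or a negative change counts as an improvement is left open here.

```java
// Hypothetical helper, not part of VectorizedArithmeticBench.java.
final class RelativeChangeSketch {

    // Signed relative change from the AVX1 score to the AVX2 score, in percent.
    static double relativeChangePercent(double avx1, double avx2) {
        return (avx2 - avx1) / avx1 * 100.0;
    }

    public static void main(String[] args) {
        // Scores copied from the table above.
        System.out.printf("DoubleColAddDoubleColumnBench: %+.1f%%%n",
                relativeChangePercent(150709, 159073));
        System.out.printf("DoubleColAddRepeatingDoubleColumnBench: %+.1f%%%n",
                relativeChangePercent(111093, 95520));
    }
}
```

With the numbers above this gives roughly +5.5% for the first row and -14.0% for the second, so the two benchmarks move in opposite directions between AVX1 and AVX2.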
Interestingly, there is a large improvement on
{{DoubleColAddRepeatingDoubleColumnBench}} but no obvious improvement on
{{DoubleColAddDoubleColumnBench}}.
I guess the goal of adding {{DoubleColAddRepeatingDoubleColumnBench}} is to test
whether SIMD instructions bring a benefit when a constant value is added to
every element of a vector. Is my understanding right?
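To make the two cases in the question concrete, here is a minimal sketch of the two loop shapes being compared. It is not Hive's generated {{DoubleColAddDoubleColumn}} code: plain {{double[]}} arrays stand in for {{DoubleColumnVector}}, and the class and method names are hypothetical. It only illustrates that, for a repeating column, the single value at index 0 can be read once before the loop, so the loop reads one input array instead of two.

```java
import java.util.Arrays;

// Hypothetical stand-alone sketch; plain arrays stand in for DoubleColumnVector.
final class ColumnAddSketch {

    // col1 + col2: both inputs are read element by element.
    static double[] addColCol(double[] a, double[] b, int n) {
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            out[i] = a[i] + b[i];
        }
        return out;
    }

    // col1 + repeating col2: only b[0] is meaningful, so it is read once
    // before the loop and the loop streams a single input array.
    static double[] addColRepeating(double[] a, double[] b, int n) {
        double[] out = new double[n];
        double scalar = b[0];
        for (int i = 0; i < n; i++) {
            out[i] = a[i] + scalar;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] a = {1.0, 2.0, 3.0, 4.0};
        double[] b = {10.0, 20.0, 30.0, 40.0};
        double[] repeating = {10.0}; // a repeating column keeps its value at index 0
        System.out.println(Arrays.toString(addColCol(a, b, a.length)));
        System.out.println(Arrays.toString(addColRepeating(a, repeating, a.length)));
    }
}
```

Whether the JIT compiles these two loops to different SIMD instructions on a given CPU is exactly the open question here, so the sketch makes no claim about that.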
was (Author: kellyzly):
[~teddy.choi]: I would like to ask a question about
[DoubleColAddRepeatingDoubleColumnBench|https://github.com/apache/hive/blob/master/itests/hive-jmh/src/main/java/org/apache/hive/benchmark/vectorization/VectorizedArithmeticBench.java#L53].
Why do we need to test {{DoubleColAddRepeatingDoubleColumnBench}}? As I understand it,
this benchmark computes col1 + col2 where all elements of col2 are the same. Is there any
difference between {{DoubleColAddRepeatingDoubleColumnBench}} and
{{DoubleColAddDoubleColumnBench}} in terms of SIMD instructions?
I added the following code to VectorizedArithmeticBench.java:
{code}
public static class DoubleColAddDoubleColumnBench extends AbstractExpression {
  @Override
  public void setup() {
    rowBatch = buildRowBatch(new DoubleColumnVector(), 2,
        getDoubleColumnVector(),
        getDoubleColumnVector());
    expression = new DoubleColAddDoubleColumn(0, 1, 2);
  }
}
{code}
After running both {{DoubleColAddDoubleColumnBench}} and
{{DoubleColAddRepeatingDoubleColumnBench}}, I got the following results:
|| Benchmark || AVX1 || AVX2 || Perf improvement ||
| DoubleColAddDoubleColumnBench | 159588 | 158131 | 0.9% |
| DoubleColAddRepeatingDoubleColumnBench | 111093 | 95520 | 14% |
Interestingly, there is a large improvement on
{{DoubleColAddRepeatingDoubleColumnBench}} but no obvious improvement on
{{DoubleColAddDoubleColumnBench}}.
I guess the goal of adding {{DoubleColAddRepeatingDoubleColumnBench}} is to test
whether SIMD instructions bring a benefit when a constant value is added to
every element of a vector. Is my understanding right?
> Optimization for SIMD instructions in Hive
> ------------------------------------------
>
> Key: HIVE-10179
> URL: https://issues.apache.org/jira/browse/HIVE-10179
> Project: Hive
> Issue Type: Improvement
> Reporter: Chengxiang Li
> Assignee: Chengxiang Li
> Labels: optimization
>
> [SIMD|http://en.wikipedia.org/wiki/SIMD] instructions are available in most
> current CPUs, such as Intel's SSE2, SSE3, SSE4.x, AVX and AVX2, and they
> would help Hive perform better if we can vectorize the mathematical
> manipulation part of Hive. This umbrella JIRA may contain, but is not limited to,
> subtasks such as:
> # Code schema adaptation: the current JVM is quite strict about the code schema
> that can be transformed into SIMD instructions during execution.
> # A new implementation of the mathematical manipulation part of Hive, designed
> to be optimized for SIMD instructions.
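As an illustration of the "code schema" point in the issue description, the sketch below contrasts two loop shapes over plain arrays. It rests on an assumption that is not stated in this issue: that a JIT auto-vectorizer is most likely to handle a simple counted loop with unit-stride array accesses, and less likely to handle a loop that reaches its elements through an index array. The class and method names are hypothetical, and whether a given JVM emits SIMD instructions for either loop would need to be verified, for example by inspecting the generated assembly.

```java
import java.util.Arrays;

// Hypothetical sketch of two loop shapes; not code from Hive.
final class LoopShapeSketch {

    // Counted loop, unit stride, no branches or calls in the body.
    static void addDense(double[] a, double[] b, double[] out, int n) {
        for (int i = 0; i < n; i++) {
            out[i] = a[i] + b[i];
        }
    }

    // Same arithmetic, but every access goes through the index array sel,
    // so consecutive iterations need not touch consecutive elements.
    static void addSelected(double[] a, double[] b, double[] out, int[] sel, int n) {
        for (int j = 0; j < n; j++) {
            int i = sel[j];
            out[i] = a[i] + b[i];
        }
    }

    public static void main(String[] args) {
        double[] a = {1.0, 2.0, 3.0, 4.0};
        double[] b = {10.0, 20.0, 30.0, 40.0};
        double[] dense = new double[4];
        double[] selected = new double[4];
        addDense(a, b, dense, 4);
        addSelected(a, b, selected, new int[] {1, 3}, 2);
        System.out.println(Arrays.toString(dense));
        System.out.println(Arrays.toString(selected));
    }
}
```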