[ 
https://issues.apache.org/jira/browse/STORM-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14970164#comment-14970164
 ] 

Haohui Mai commented on STORM-1105:
-----------------------------------

Revisited the design of using LLVM as the intermediate representation in 
StormSQL.

Pros:
* Mature optimization from LLVM
* Auto vectorization

Cons:
* Nontrivial to compile LLVM back to Java source code, especially while 
preserving the effects of vectorization 
* Native code

The main question is whether the LLVM-based approach can deliver better 
performance. To answer it, I wrote a benchmark that filters and counts a 
continuous stream of 32-bit integers. The integers are grouped into batches 
of 8.

I ran three experiments:

* Native: the filter is compiled to native code with vectorization turned on. 
The worker runs the code via JNI.
* Without boxing: the filter reads the integers by accessing specific offsets 
of the byte array. There is no deserialization overhead.
* With boxing: the filter first deserializes the byte array to {{Values}}. The 
filter casts each {{Object}} to {{Integer}} during filtering.
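
For illustration, the two pure-Java variants could look roughly like the sketch below. This is not the benchmark code itself: the class name {{FilterSketch}}, the helper {{encode}}, the greater-than-threshold predicate, and the big-endian layout are all assumptions, and a plain {{List<Object>}} stands in for Storm's {{Values}}.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class FilterSketch {
    // Integers per batch, as described in the benchmark setup.
    static final int BATCH_SIZE = 8;

    // "Without boxing": read each int directly at its offset in the byte array.
    // The predicate (value > threshold) is a hypothetical filter.
    public static int countWithoutBoxing(byte[] batch, int threshold) {
        ByteBuffer buf = ByteBuffer.wrap(batch); // big-endian by default (assumed layout)
        int count = 0;
        for (int i = 0; i < BATCH_SIZE; i++) {
            if (buf.getInt(i * Integer.BYTES) > threshold) {
                count++;
            }
        }
        return count;
    }

    // Stand-in for deserializing to Values: every int is boxed into an Object.
    public static List<Object> deserialize(byte[] batch) {
        ByteBuffer buf = ByteBuffer.wrap(batch);
        List<Object> values = new ArrayList<>(BATCH_SIZE);
        for (int i = 0; i < BATCH_SIZE; i++) {
            values.add(Integer.valueOf(buf.getInt(i * Integer.BYTES)));
        }
        return values;
    }

    // "With boxing": deserialize first, then cast each Object to Integer.
    public static int countWithBoxing(byte[] batch, int threshold) {
        int count = 0;
        for (Object o : deserialize(batch)) {
            if ((Integer) o > threshold) {
                count++;
            }
        }
        return count;
    }

    // Hypothetical helper that packs ints into a byte array for the demo.
    public static byte[] encode(int... ints) {
        ByteBuffer buf = ByteBuffer.allocate(ints.length * Integer.BYTES);
        for (int v : ints) {
            buf.putInt(v);
        }
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] batch = encode(1, 2, 3, 4, 5, 6, 7, 8);
        System.out.println(countWithoutBoxing(batch, 5));
        System.out.println(countWithBoxing(batch, 5));
    }
}
```

Both variants return the same count; the difference is that the boxed path allocates a list and an {{Integer}} per element before filtering.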

I ran the experiment on an early-2013 MacBook Pro, filtering 20 million 
integers 10 times for each configuration. Here are the results:

{noformat}
native
Time: 3037
Time: 3007
Time: 3178
Time: 3194
Time: 3258
Time: 3170
Time: 3092
Time: 3238
Time: 3125
Time: 3127
without boxing
Time: 2272
Time: 2243
Time: 2275
Time: 2084
Time: 2101
Time: 2159
Time: 2176
Time: 2116
Time: 2055
Time: 2174
with boxing
Time: 3132
Time: 2934
Time: 2915
Time: 2902
Time: 2731
Time: 2815
Time: 2834
Time: 2842
Time: 2793
Time: 2849
{noformat}

The pure-Java approach without boxing delivers the best performance even though 
there is no vectorization in the filter. Therefore it might make sense to 
delegate the optimization problem to the JIT directly and not involve LLVM.
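
To make the comparison easier to read, the ten timings per configuration can be averaged. The sketch below does only that, using the numbers listed above; the class name {{BenchmarkSummary}} is made up for illustration, and the time unit is whatever the benchmark printed (it is not stated above).

```java
import java.util.Arrays;

public class BenchmarkSummary {
    // Timings copied from the results above, in the unit the benchmark printed.
    static final int[] NATIVE =
        {3037, 3007, 3178, 3194, 3258, 3170, 3092, 3238, 3125, 3127};
    static final int[] WITHOUT_BOXING =
        {2272, 2243, 2275, 2084, 2101, 2159, 2176, 2116, 2055, 2174};
    static final int[] WITH_BOXING =
        {3132, 2934, 2915, 2902, 2731, 2815, 2834, 2842, 2793, 2849};

    // Arithmetic mean of the runs; NaN if the array is empty.
    public static double mean(int[] times) {
        return Arrays.stream(times).average().orElse(Double.NaN);
    }

    public static void main(String[] args) {
        System.out.println("native:         " + mean(NATIVE));
        System.out.println("without boxing: " + mean(WITHOUT_BOXING));
        System.out.println("with boxing:    " + mean(WITH_BOXING));
    }
}
```

The means come out to about 3143 (native), 2166 (without boxing), and 2875 (with boxing), so the unboxed Java path takes roughly 31% less time than the native path and roughly 25% less than the boxed path.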

> Compile the logical plans into LLVM functions
> ---------------------------------------------
>
>                 Key: STORM-1105
>                 URL: https://issues.apache.org/jira/browse/STORM-1105
>             Project: Apache Storm
>          Issue Type: New Feature
>          Components: storm-sql
>            Reporter: Haohui Mai
>            Assignee: Haohui Mai
>
> This jira tracks the effort on compiling the stages of the logical plans into 
> LLVM  functions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
