[ https://issues.apache.org/jira/browse/TAJO-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13728496#comment-13728496 ]
Henry Saputra commented on TAJO-104:
------------------------------------
Hi Hyunsik, could you elaborate on what this umbrella ticket is for? Will Tajo
get a new data ingest flow that stores target data in columnar storage?
Or is this similar to Impala, where Tajo would run queries directly
against the target data source?
> JIT Query Compilation and Vectorized Engine (Umbrella)
> ------------------------------------------------------
>
> Key: TAJO-104
> URL: https://issues.apache.org/jira/browse/TAJO-104
> Project: Tajo
> Issue Type: New Feature
> Components: physical operator, worker
> Reporter: Hyunsik Choi
>
> These days, the advantages of columnar stores and vectorized processing for
> analytic workloads hardly need explaining. These approaches are well known
> as state-of-the-art techniques in the database community and are also
> accepted in practice.
> Since we started the Tajo project in 2010, we have planned a new engine
> that uses both JIT query compilation and vectorized execution. My colleagues
> and I have surveyed columnar stores, vectorized processing, cache-conscious
> techniques, and query compilation.
> In this issue, we will design and implement the new engine. The key points
> of the implementation plan are as follows:
> * Implemented in C++
> * Vectorization primitives will be generated by LLVM.
> * Two or more primitives can be combined using JIT, depending on the
> situation.
> This is an umbrella issue, and we will create lots of subtasks for this issue.
> The design references are as follows:
> * DSM vs. NSM: CPU Performance Tradeoffs in Block-Oriented Query Processing.
> * Efficiently Compiling Efficient Query Plans for Modern Hardware
> * Just-in-time Compilation in Vectorized Query Execution
> * MonetDB/X100: Hyper-Pipelining Query Execution
> * Column-Stores vs. Row-Stores: How Different Are They Really?
> * Balancing vectorized query execution with bandwidth-optimized storage
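
The implementation plan quoted above mentions vectorization primitives and combining two or more of them via JIT. As a rough illustration only, here is a hand-written C++ sketch of that idea for the expression a * b + c. It is not Tajo code: the names map_mul, map_add, eval_unfused and eval_fused are hypothetical, and the fused loop is written by hand where the plan would have LLVM generate it at run time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical primitive: out[i] = a[i] * b[i] over one vector (block) of values.
void map_mul(const std::vector<double>& a, const std::vector<double>& b,
             std::vector<double>& out) {
    assert(a.size() == b.size());
    out.resize(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] * b[i];
}

// Hypothetical primitive: out[i] = a[i] + b[i].
void map_add(const std::vector<double>& a, const std::vector<double>& b,
             std::vector<double>& out) {
    assert(a.size() == b.size());
    out.resize(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] + b[i];
}

// Unfused evaluation of a * b + c: two primitive calls, with the
// intermediate product materialized in a temporary vector.
std::vector<double> eval_unfused(const std::vector<double>& a,
                                 const std::vector<double>& b,
                                 const std::vector<double>& c) {
    std::vector<double> tmp, out;
    map_mul(a, b, tmp);
    map_add(tmp, c, out);
    return out;
}

// Fused evaluation: a single loop, no intermediate vector. A JIT-based
// engine could emit a loop like this at run time instead of having it
// compiled ahead of time as it is here.
std::vector<double> eval_fused(const std::vector<double>& a,
                               const std::vector<double>& b,
                               const std::vector<double>& c) {
    assert(a.size() == b.size() && b.size() == c.size());
    std::vector<double> out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] * b[i] + c[i];
    return out;
}
```

Both functions compute the same values. The fused form avoids writing and re-reading the temporary vector, which is the kind of saving that combining primitives is usually meant to deliver.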
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira