[ https://issues.apache.org/jira/browse/KYLIN-679?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566629#comment-14566629 ]
Henry Saputra commented on KYLIN-679:
-------------------------------------
Integration should be done through an abstraction layer that supports different
runtime engines, including Kylin's native one.
Tight coupling with Spark would leave the project unable to leverage other runtime
engines.
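
To make the suggestion concrete, here is a minimal sketch of what such an abstraction could look like. All names in it (CubeBuildEngine, NativeEngine, SparkEngine, EngineRegistry) are hypothetical and are not existing Kylin classes; the build methods only return strings where a real engine would submit a job.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative sketch only: every type here is an assumption, not Kylin API.
public class EngineAbstractionSketch {

    /** The contract the rest of the system would code against. */
    interface CubeBuildEngine {
        String name();
        String buildCube(String cubeName);
    }

    /** Stand-in for the engine that ships with the project. */
    static final class NativeEngine implements CubeBuildEngine {
        public String name() { return "native"; }
        public String buildCube(String cubeName) {
            return "native build of " + cubeName;
        }
    }

    /** Stand-in for a Spark-backed engine; a real one would submit a Spark job. */
    static final class SparkEngine implements CubeBuildEngine {
        public String name() { return "spark"; }
        public String buildCube(String cubeName) {
            return "spark build of " + cubeName;
        }
    }

    /** Looks engines up by name, so callers never reference a concrete engine. */
    static final class EngineRegistry {
        private final Map<String, CubeBuildEngine> engines = new LinkedHashMap<>();

        void register(CubeBuildEngine engine) {
            engines.put(engine.name(), engine);
        }

        CubeBuildEngine get(String name) {
            CubeBuildEngine engine = engines.get(name);
            if (engine == null) {
                throw new IllegalArgumentException("Unknown engine: " + name);
            }
            return engine;
        }
    }

    public static void main(String[] args) {
        EngineRegistry registry = new EngineRegistry();
        registry.register(new NativeEngine());
        registry.register(new SparkEngine());

        // The engine name would normally come from configuration.
        String configured = args.length > 0 ? args[0] : "native";
        System.out.println(registry.get(configured).buildCube("sales_cube"));
    }
}
```

With this shape, adding another runtime engine means registering one more implementation; nothing that calls the interface has to change.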
> Adding Spark Support to Apache Kylin
> ------------------------------------
>
> Key: KYLIN-679
> URL: https://issues.apache.org/jira/browse/KYLIN-679
> Project: Kylin
> Issue Type: New Feature
> Components: General
> Reporter: Luke Han
> Fix For: Future
>
>
> Challenges in current architecture:
> High latency when reading data from Hive
> --Several hours to fetch data when joining big tables
> --Route to SQL-on-Hadoop turned off due to performance issues
> Time-to-Market of data latency
> --Huge IO & Network traffic with MR jobs
> Streaming
> --Streaming process and pre-calculate cubes
> Where Spark could bring benefits to Kylin:
> Integrating with Spark SQL:
> --Option I: Read data from SparkSQL instead of Hive
> --Option II: Route unsupported queries to SparkSQL
> --Option III: Kylin to be OLAP source of SparkSQL
> Spark Cube Build Engine
> --Efficient cube build engine with Spark
> Spark Streaming
> --Leverage SparkStreaming for StreamingOLAP
> HBase?
> --Any idea?