Luke Han created KYLIN-679:
------------------------------
Summary: Adding Spark Support to Apache Kylin
Key: KYLIN-679
URL: https://issues.apache.org/jira/browse/KYLIN-679
Project: Kylin
Issue Type: New Feature
Components: General
Reporter: Luke Han
Fix For: Future
Challenges in current architecture:
High latency when reading data from Hive
--Several hours to fetch data when joining big tables
--Routing to SQL-on-Hadoop is turned off due to performance issues
Data latency (time to market)
--Huge I/O and network traffic with MapReduce (MR) jobs
Streaming
--Process streaming data and pre-calculate cubes
Where Spark could bring benefits to Kylin:
Integrating with Spark SQL:
--Option I: Read data from Spark SQL instead of Hive
--Option II: Route unsupported queries to Spark SQL
--Option III: Expose Kylin as an OLAP data source for Spark SQL
Spark Cube Build Engine
--Efficient cube generation engine built on Spark
Spark Streaming
--Leverage Spark Streaming for streaming OLAP
HBase?
--Any ideas?
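
For the cube build engine item, a minimal sketch of what "pre-calculating a cube" involves may help frame the discussion: aggregate a measure over every combination of dimensions (each combination is one cuboid). This is plain Python for illustration only, not Kylin or Spark code; `build_cuboids`, the sample rows, and the column names are all hypothetical. The assumption is that a Spark-based engine would run each of these group-by aggregations as a distributed job rather than as an in-memory loop.

```python
from collections import defaultdict
from itertools import combinations


def build_cuboids(rows, dimensions, measure):
    """Sum `measure` for every subset of `dimensions` (one cuboid per subset).

    Hypothetical helper for illustration; not part of Kylin or Spark.
    """
    cuboids = {}
    for k in range(len(dimensions) + 1):
        for dims in combinations(dimensions, k):
            totals = defaultdict(int)
            for row in rows:
                # Group key is the tuple of this cuboid's dimension values.
                key = tuple(row[d] for d in dims)
                totals[key] += row[measure]
            cuboids[dims] = dict(totals)
    return cuboids


# Made-up sample fact rows.
rows = [
    {"country": "US", "year": 2014, "sales": 10},
    {"country": "US", "year": 2015, "sales": 20},
    {"country": "CN", "year": 2015, "sales": 5},
]

cube = build_cuboids(rows, ["country", "year"], "sales")
```

With N dimensions this produces 2^N cuboids, each requiring a pass over the data in this naive form, which is one way to see why the build step is expensive and why a more efficient engine is attractive.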