Luke Han created KYLIN-679:
------------------------------

             Summary: Adding Spark Support to Apache Kylin
                 Key: KYLIN-679
                 URL: https://issues.apache.org/jira/browse/KYLIN-679
             Project: Kylin
          Issue Type: New Feature
          Components: General
            Reporter: Luke Han
             Fix For: Future


Challenges in current architecture:

High latency when reading data from Hive
--Fetching data takes several hours when joining big tables
--Routing to SQL-on-Hadoop is turned off due to performance issues

Data latency hurts time-to-market
--MR jobs generate huge IO and network traffic

Streaming
--Process streaming data and pre-calculate cubes



Where Spark could bring benefits to Kylin:

Integrating with Spark SQL:
--Option I: Read data from Spark SQL instead of Hive
--Option II: Route unsupported queries to Spark SQL
--Option III: Expose Kylin as an OLAP source for Spark SQL

Spark Cube Build Engine
--Efficient cube build engine on Spark
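
For discussion, here is a minimal sketch of what a cube build has to compute, in plain Python rather than Spark so that it runs anywhere. It pre-aggregates one measure for every combination of dimensions (every cuboid). The function name build_cube, the sample rows, and the column names are hypothetical and for illustration only; this is not Kylin's actual build algorithm. On Spark, each cuboid would presumably map to a distributed group-by.

```python
from collections import defaultdict
from itertools import combinations


def build_cube(rows, dimensions, measure):
    """Sum `measure` for every subset of `dimensions` (one subset = one cuboid)."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(int)
            for row in rows:
                # The cuboid key is the tuple of this row's values for the chosen dimensions.
                key = tuple(row[d] for d in dims)
                totals[key] += row[measure]
            cube[dims] = dict(totals)
    return cube


# Hypothetical fact rows, for illustration only.
rows = [
    {"country": "US", "year": 2014, "sales": 10},
    {"country": "US", "year": 2015, "sales": 20},
    {"country": "CN", "year": 2015, "sales": 5},
]

cube = build_cube(rows, ["country", "year"], "sales")
print(cube[("country",)])  # {('US',): 30, ('CN',): 5}
print(cube[()])            # {(): 35}
```

The sketch rescans the input once per cuboid, so the work grows with 2^n for n dimensions. That repeated scanning and shuffling is the cost an efficient build engine would need to reduce.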

Spark Streaming
--Leverage Spark Streaming for streaming OLAP

HBase?
--Any ideas?
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)