There have been wide discussions around Spark and Kylin when we talk to different organizations, teams and individuals. Leveraging the Spark ecosystem is always on our minds; it could bring some benefits for the current challenges Kylin has:

*High latency when reading data from Hive*
--Several hours to fetch data when joining big tables
--Routing to SQL-on-Hadoop turned off due to performance issues
*Time-to-Market of data latency*
--Huge IO & network traffic with MR jobs
*Streaming*
--Streaming processing and pre-calculation of cubes

Leveraging Spark, there are some options:

*Integrating with Spark SQL:*
--Option I: Read data from SparkSQL instead of Hive
--Option II: Route unsupported queries to SparkSQL
--Option III: Kylin to be OLAP source of SparkSQL
*Spark Cube Build Engine*
--Efficient cube generation engine with Spark
*Spark Streaming*
--Leverage SparkStreaming for StreamingOLAP (TBD)
*HBase?*
--Any ideas?

After great meetings with Huawei, Paypal and others, I think it's time to bring this to the design and architecture phase now.

Here are the epic and features for tracking:
https://issues.apache.org/jira/browse/KYLIN-679
KYLIN-741 <https://issues.apache.org/jira/browse/KYLIN-741> Read data from SparkSQL
KYLIN-742 <https://issues.apache.org/jira/browse/KYLIN-742> Route unsupported queries to SparkSQL
KYLIN-743 <https://issues.apache.org/jira/browse/KYLIN-743> Kylin to be OLAP source of SparkSQL
KYLIN-744 <https://issues.apache.org/jira/browse/KYLIN-744> Spark Cube Build Engine

The initial efforts will focus on SparkSQL and then the Spark Cube Engine.

Please keep general discussion here, but discuss technical details, design and architecture under each JIRA (the mailing list will get all updates automatically).

If you have any ideas or are willing to contribute, please feel free to let us know or add comments in each ticket.

PS. As previously planned, Spark-related work will be managed under the v0.9.x version:
https://issues.apache.org/jira/browse/KYLIN-577

Thanks.

Best Regards!
---------------------
Luke Han
