I believe we do not care about Spark client APIs for the distributed execution engine, so I would also recommend taking a look at Apache Flink [1].
Similar to Spark, it has an execution engine that can run standalone or on YARN as a DAG. Since we want to focus mostly on the backend, it has some special features like a built-in iteration operator, heap memory management, and a cost optimizer for the execution plan. - Henry

[1] http://flink.apache.org/

On Mon, Jan 12, 2015 at 10:17 PM, Li Yang <[email protected]> wrote:

> Agree. We shall proceed to refactor the job engine. It needs to be more
> extensible and friendly to adding new jobs and steps. This is a prerequisite
> for Kylin to explore other opportunities for faster cube builds, like Spark
> and
>
> Please update with finer designs.
>
> On Fri, Jan 9, 2015 at 10:07 AM, 周千昊 <[email protected]> wrote:
>
>> Currently Kylin has its own Job Engine to schedule the cubing process.
>> However, there are some drawbacks:
>> 1. It is too tightly coupled with the cubing process, so it cannot easily
>> support other kinds of jobs
>> 2. It is hard to extend or to integrate with other technologies (for
>> example, Spark)
>> Thus I have proposed a refactoring of the current job engine.
>> Below is the wiki page on GitHub:
>> https://github.com/KylinOLAP/Kylin/wiki/%5BProposal%5D-New-Job-Engine
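To illustrate the kind of decoupling the proposal asks for, here is a minimal sketch of a job engine where the engine only knows about generic jobs and steps, not about cubing. All names here (Step, Job, JobEngineSketch) are hypothetical illustrations, not Kylin's actual API:

```java
import java.util.ArrayList;
import java.util.List;

// A step is any unit of work; the engine does not care whether it is
// a MapReduce stage, a Spark stage, or anything else. (Hypothetical interface.)
interface Step {
    boolean execute(); // true on success, false on failure
}

// A job is just an ordered list of steps. New job types are added by
// composing steps, without touching the engine itself.
class Job {
    private final String name;
    private final List<Step> steps = new ArrayList<>();

    Job(String name) { this.name = name; }

    Job addStep(Step step) {
        steps.add(step);
        return this;
    }

    // Run steps in order, stopping at the first failure.
    boolean run() {
        for (Step s : steps) {
            if (!s.execute()) return false;
        }
        return true;
    }

    String getName() { return name; }
}

public class JobEngineSketch {
    public static void main(String[] args) {
        // A cubing job and, say, a future Spark-based job would both be
        // expressed through the same Job/Step abstraction.
        Job cubing = new Job("build-cube")
                .addStep(() -> { System.out.println("extract source data"); return true; })
                .addStep(() -> { System.out.println("build cuboids"); return true; });
        System.out.println(cubing.getName() + " succeeded: " + cubing.run());
    }
}
```

The point of the sketch is only the separation of concerns: the scheduler sees `Job` and `Step`, so adding a new kind of job (e.g. Spark-backed) does not require changing the engine.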
