Actually, we did a Spark cubing POC last year. With the same algorithm, the performance has not been very good so far, especially on huge datasets.
Do not expect Spark cubing to be faster than MR cubing for now, but I think we should look into Spark 2.0, which may bring some benefits here. Thanks.
Luke

Best Regards!
---------------------
Luke Han

On Fri, Jul 15, 2016 at 4:29 PM, wangfeng <[email protected]> wrote:
> Hello, when I used Kylin 1.5.0 to build the cube, I saw that the MR jobs
> took a lot of time. As I understand it, the results of MapReduce jobs in
> Hadoop are saved to HDFS, so the jobs read and write HDFS frequently.
> Spark, however, keeps intermediate results in memory, which could save
> time.
> I want to know whether I can use Spark to replace Hadoop when I use
> Kylin. If it is possible, please tell me how to do this. Thanks.
>
> --
> View this message in context:
> http://apache-kylin.74782.x6.nabble.com/Can-kylin-build-cube-based-on-spark-instead-of-hadoop-tp5330.html
> Sent from the Apache Kylin mailing list archive at Nabble.com.
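The reasoning in the question (each MapReduce job persists its output to HDFS before the next job reads it back, while Spark can pass intermediates along in memory) can be sketched with a toy two-stage pipeline. This is plain Python, not actual Kylin or Spark code; the stage functions are hypothetical stand-ins for cubing steps:

```python
import json
import os
import tempfile

def stage1(records):
    # Hypothetical first stage: map each record to a (key, value) pair.
    return [(r % 3, r) for r in records]

def stage2(pairs):
    # Hypothetical second stage: aggregate values per key.
    agg = {}
    for k, v in pairs:
        agg[k] = agg.get(k, 0) + v
    return agg

records = list(range(10))

# MR-style: stage 1 writes its result to disk (standing in for HDFS),
# and stage 2 reads it back, paying serialization and I/O costs.
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "stage1_output.json")
    with open(path, "w") as f:
        json.dump(stage1(records), f)
    with open(path) as f:
        pairs = [tuple(p) for p in json.load(f)]
    mr_result = stage2(pairs)

# Spark-style: the intermediate result stays in memory and feeds
# the next stage directly (as with a cached RDD).
spark_result = stage2(stage1(records))

assert mr_result == spark_result  # same answer; only intermediate handling differs
```

The point of the sketch is that both paths compute the same result; the difference is where the intermediate data lives between stages, which is the source of the speedup the question anticipates (and which, per the reply, did not materialize in the POC for large datasets).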
