[jira] [Comment Edited] (KYLIN-4321) Create fact distinct columns using spark by default when build engine is spark

weibin0516 (Jira) Tue, 07 Jan 2020 16:51:04 -0800


    [ 
https://issues.apache.org/jira/browse/KYLIN-4321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17010215#comment-17010215
 ]


weibin0516 edited comment on KYLIN-4321 at 1/8/20 12:49 AM:
------------------------------------------------------------

Past experience and a large amount of test data show that Spark performs 
significantly better than Hive (MapReduce).

The following screenshots show the test results for Spark and Hive on TPC-DS:
 !screenshot-2.png! 
 !screenshot-1.png! 

Currently, when a cube is built with the Spark engine, the `Create fact 
distinct columns` step still uses MapReduce by default. We want this step to 
run on the Spark engine by default, that is, change the default value of 
`kylin.engine.spark-fact-distinct` to true.
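
Until the default changes, the same behaviour can presumably be enabled by hand. A minimal sketch, assuming the property is set in the instance-wide `kylin.properties` (the file location and the comments are assumptions, not part of this issue; only the property name and value come from the description above):

```
# kylin.properties (assumed location: $KYLIN_HOME/conf/kylin.properties)
# Run the `Create fact distinct columns` step on Spark instead of MapReduce
# when the cube's build engine is Spark.
kylin.engine.spark-fact-distinct=true
```

The setting only matters for cubes whose build engine is Spark; cubes built with the MapReduce engine should be unaffected.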


was (Author: codingforfun):
Past experience and a large amount of test data show that Spark's performance 
is significantly better than Hive(MapReduce).
 !screenshot-2.png! 
 !screenshot-1.png! 
Currently, when the cube is built with the spark engine, the `Create fact 
distinct columns` step uses mapreduce by default. Here we want to use the spark 
engine to perform this step by default, that is, modify the 
`kylin.engine.spark-fact-distinct` value to true.

> Create fact distinct columns using spark by default when build engine is spark
> ------------------------------------------------------------------------------
>
>                 Key: KYLIN-4321
>                 URL: https://issues.apache.org/jira/browse/KYLIN-4321
>             Project: Kylin
>          Issue Type: Improvement
>            Reporter: weibin0516
>            Assignee: weibin0516
>            Priority: Major
>             Fix For: v3.1.0
>
>         Attachments: screenshot-1.png, screenshot-2.png
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
