Sidonet opened a new pull request #487: KYLIN-3714 Register kryo for spark spilling process.
URL: https://github.com/apache/kylin/pull/487

After applying the patch in my environment, I tested it. In my test case I set these properties:

kylin.engine.spark-conf.spark.driver.memory=2G
kylin.engine.spark-conf.spark.executor.memory=512M

I then launched a build of a huge cube, with no mapreduce.input.fileinputformat.split setting at all. The largest share of the data went to executor 20 (4.6G), and the Spark spilling process ran there:

2019-02-26 17:55:13 INFO ShuffleBlockFetcherIterator:54 - Getting 3898 non-empty blocks out of 3898 blocks
2019-02-26 17:55:13 INFO ShuffleBlockFetcherIterator:54 - Started 6 remote fetches in 39 ms
2019-02-26 17:55:15 INFO ExternalAppendOnlyMap:54 - Thread 34 spilling in-memory map of 97.5 MB to disk (1 time so far)
2019-02-26 17:55:40 INFO ExternalAppendOnlyMap:54 - Thread 34 spilling in-memory map of 97.5 MB to disk (2 times so far)
2019-02-26 17:56:02 INFO ExternalAppendOnlyMap:54 - Thread 34 spilling in-memory map of 97.6 MB to disk (3 times so far)
2019-02-26 17:56:25 INFO ExternalAppendOnlyMap:54 - Thread 34 spilling in-memory map of 97.5 MB to disk (4 times so far)
2019-02-26 17:56:54 INFO ExternalAppendOnlyMap:54 - Thread 34 spilling in-memory map of 99.2 MB to disk (5 times so far)
2019-02-26 17:57:18 INFO ExternalAppendOnlyMap:54 - Thread 34 spilling in-memory map of 97.5 MB to disk (6 times so far)
2019-02-26 17:57:42 INFO ExternalAppendOnlyMap:54 - Thread 34 spilling in-memory map of 98.4 MB to disk (7 times so far)
2019-02-26 17:58:09 INFO ExternalAppendOnlyMap:54 - Thread 34 spilling in-memory map of 99.2 MB to disk (8 times so far)
2019-02-26 17:58:34 INFO ExternalAppendOnlyMap:54 - Thread 34 spilling in-memory map of 97.5 MB to disk (9 times so far)
2019-02-26 17:58:54 INFO ExternalAppendOnlyMap:54 - Thread 34 spilling in-memory map of 97.5 MB to disk (10 times so far)
2019-02-26 17:59:17 INFO ExternalAppendOnlyMap:54 - Thread 34 spilling in-memory map of 99.2 MB to disk (11 times so far)
2019-02-26 17:59:46 INFO ExternalAppendOnlyMap:54 - Thread 34 spilling in-memory map of 97.5 MB to disk (12 times so far)
2019-02-26 18:00:13 INFO AbstractHadoopJob:511 - KylinConfig cached for : kylin_metadata@hdfs,path=hdfs://apachai1.apm.local:8020/kylin/kylin_metadata/kylin-5774d00a-bd56-ac28-e867-f9f5cb5d24f3/Test_Cube_2_clone/metadata
2019-02-26 18:00:13 INFO SparkFactDistinct:707 - Partition 19 handling column DEFAULT.SIDA_CASHIER_1.CASHIER_NAME, buildDictInReducer=true
2019-02-26 18:00:13 INFO SparkFactDistinct:716 - Received value: Николенко Наталья Николае

The step finished successfully.

![spilling](https://user-images.githubusercontent.com/39062077/53426681-79399400-39f8-11e9-96ab-868cd3712234.png)
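To make the amount of spilling in the log above easier to judge, here is a minimal Python sketch that totals the ExternalAppendOnlyMap spill lines. It is not part of the patch or of Kylin. The function name `summarize_spills` and the regular expression are illustrative assumptions based only on the line format shown in this log, and the embedded sample is copied from the twelve spill lines above.

```python
import re
from datetime import datetime

# The twelve spill lines copied from the executor log above.
LOG = """\
2019-02-26 17:55:15 INFO ExternalAppendOnlyMap:54 - Thread 34 spilling in-memory map of 97.5 MB to disk (1 time so far)
2019-02-26 17:55:40 INFO ExternalAppendOnlyMap:54 - Thread 34 spilling in-memory map of 97.5 MB to disk (2 times so far)
2019-02-26 17:56:02 INFO ExternalAppendOnlyMap:54 - Thread 34 spilling in-memory map of 97.6 MB to disk (3 times so far)
2019-02-26 17:56:25 INFO ExternalAppendOnlyMap:54 - Thread 34 spilling in-memory map of 97.5 MB to disk (4 times so far)
2019-02-26 17:56:54 INFO ExternalAppendOnlyMap:54 - Thread 34 spilling in-memory map of 99.2 MB to disk (5 times so far)
2019-02-26 17:57:18 INFO ExternalAppendOnlyMap:54 - Thread 34 spilling in-memory map of 97.5 MB to disk (6 times so far)
2019-02-26 17:57:42 INFO ExternalAppendOnlyMap:54 - Thread 34 spilling in-memory map of 98.4 MB to disk (7 times so far)
2019-02-26 17:58:09 INFO ExternalAppendOnlyMap:54 - Thread 34 spilling in-memory map of 99.2 MB to disk (8 times so far)
2019-02-26 17:58:34 INFO ExternalAppendOnlyMap:54 - Thread 34 spilling in-memory map of 97.5 MB to disk (9 times so far)
2019-02-26 17:58:54 INFO ExternalAppendOnlyMap:54 - Thread 34 spilling in-memory map of 97.5 MB to disk (10 times so far)
2019-02-26 17:59:17 INFO ExternalAppendOnlyMap:54 - Thread 34 spilling in-memory map of 99.2 MB to disk (11 times so far)
2019-02-26 17:59:46 INFO ExternalAppendOnlyMap:54 - Thread 34 spilling in-memory map of 97.5 MB to disk (12 times so far)
"""

# Assumed line format, inferred from the log sample only.
SPILL_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) INFO ExternalAppendOnlyMap:\d+ - "
    r"Thread \d+ spilling in-memory map of ([\d.]+) MB to disk "
    r"\(\d+ times? so far\)$"
)


def summarize_spills(log_text):
    """Return spill count, total spilled MB, and seconds between first and last spill."""
    spills = []
    for line in log_text.splitlines():
        match = SPILL_RE.match(line.strip())
        if match:
            timestamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            spills.append((timestamp, float(match.group(2))))
    if not spills:
        return {"count": 0, "total_mb": 0.0, "elapsed_s": 0.0}
    total_mb = sum(size for _, size in spills)
    elapsed_s = (spills[-1][0] - spills[0][0]).total_seconds()
    return {"count": len(spills), "total_mb": total_mb, "elapsed_s": elapsed_s}


if __name__ == "__main__":
    summary = summarize_spills(LOG)
    print(f"{summary['count']} spills, {summary['total_mb']:.1f} MB, "
          f"{summary['elapsed_s']:.0f} s")
```

For the twelve lines above this reports 12 spills totalling about 1176.1 MB over 271 seconds, which shows how much of the work went to disk with a 512M executor.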