Hope this may help: http://kylin.apache.org/docs/tutorial/cube_spark.html
Jon Shoberg <[email protected]> wrote on Tue, Dec 18, 2018 at 2:34 AM:

> Is there a good/favorite article for tuning Spark settings within Kylin?
>
> I finally have Spark (2.1.3 as distributed with Kylin 2.5.2) running on my
> systems.
>
> My small data set (35M records) runs well with the default settings.
>
> My medium data set (4B records, 40GB compressed source file, 5 measures, 6
> dimensions with low cardinality) often dies at Step 3 (Extract Fact Table
> Distinct Columns) with out-of-memory errors.
>
> After using exceptionally large memory settings the job completed, but I'm
> trying to see if there is an optimization possible.
>
> Any suggestions or ideas? I've searched/read on Spark tuning in general
> but otherwise feel I'm not making much progress on optimizing with
> the settings I've tried.
>
> Thanks!
> J

--
Regards!

Aron Tao
