Can I have the specification for these properties?

KYLIN_JOB_MAPREDUCE_DEFAULT_REDUCE_COUNT_RATIO = "kylin.job.mapreduce.default.reduce.count.ratio";
KYLIN_JOB_MAPREDUCE_DEFAULT_REDUCE_INPUT_MB = "kylin.job.mapreduce.default.reduce.input.mb";
KYLIN_JOB_MAPREDUCE_MAX_REDUCER_NUMBER = "kylin.job.mapreduce.max.reducer.number";

Thanks!

On Sun, Jun 14, 2015 at 11:59 PM, Vineet Mishra <[email protected]> wrote:

> Hi Shi,
>
> It's alright!
> So I was wondering: my source Hive table is around 3 GB, and although the
> table is partitioned and holds around 50-70 MB per partition, only a
> single Mapper and Reducer are being spawned. The amount of data that is
> being processed in the M/R is nothing as expected, but it takes a hell of
> a lot of time.
>
> As mentioned in the trailing mail, the job is getting very slow; the
> Build Base Cuboid Data step itself takes around 50 mins to complete.
>
> I can tweak the reducer parameter you mentioned, but do you think that
> will make a difference, since the mapper is where most of the time is
> spent?
>
> Can you share your thoughts on performance tuning for the cube build?
>
> Thanks!
>
> On Sun, Jun 14, 2015 at 7:26 PM, Shi, Shaofeng <[email protected]> wrote:
>
>> Hi, sorry, a busy weekend;
>>
>> Usually Kylin will request a proper number of mappers and reducers. If you
>> see a single mapper/reducer, how large are your input and output? If your
>> cube is quite small, a single mapper/reducer is possible.
>>
>> The number of mappers is decided by the FileInputFormat, but the number of
>> reducers is set by Kylin; see:
>>
>> https://github.com/apache/incubator-kylin/blob/master/job/src/main/java/org/apache/kylin/job/hadoop/cube/CuboidJob.java#L141
>>
>> On 6/14/15, 5:25 PM, "Vineet Mishra" <[email protected]> wrote:
>>
>> >Urgent call, any follow up on this?
>> >
>> >On Fri, Jun 12, 2015 at 6:46 PM, Vineet Mishra <[email protected]> wrote:
>> >
>> >> Why is org.apache.kylin.job.hadoop.cube.CuboidReducer running a single
>> >> Mapper/Reducer for the job? Can I have the reasoning behind running it
>> >> as a single mapper/reducer?
>> >>
>> >> Thanks!
>> >>
>> >> On Fri, Jun 12, 2015 at 6:30 PM, Vineet Mishra <[email protected]> wrote:
>> >>
>> >>> Hi All,
>> >>>
>> >>> I am building a cube using Kylin and I can see that the job is running
>> >>> with a single Mapper and Reducer for some of the intermediate steps,
>> >>> such as
>> >>>
>> >>> Extract Fact Table Distinct Columns
>> >>> Build Dimension Dictionary
>> >>> Build N-Dimension Cuboid
>> >>>
>> >>> I am not sure what the reason is for running the job with a single M/R.
>> >>> Is it really necessary, or is it some default config which can be
>> >>> tweaked? It's been 70 mins and the job status is 25%!
>> >>>
>> >>> Urgent call!
>> >>>
>> >>> Thanks!
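
To make the question at the top concrete, here is my guess at how the three properties might combine into a reducer count. This is only a sketch inferred from the property names: the formula, the class and method names, and the sample values (500 MB per reducer, ratio 1.0, cap of 5000) are my assumptions and are not confirmed anywhere in this thread.

```java
// Hypothetical sketch: the formula and the sample values below are
// assumptions inferred from the property names, not confirmed Kylin behaviour.
final class ReducerEstimate {

    /**
     * Guess at the reducer count:
     * round(totalInputMb / inputMbPerReducer * countRatio), clamped to [1, maxReducers].
     *
     * inputMbPerReducer ~ kylin.job.mapreduce.default.reduce.input.mb
     * countRatio        ~ kylin.job.mapreduce.default.reduce.count.ratio
     * maxReducers       ~ kylin.job.mapreduce.max.reducer.number
     */
    static int estimateReducers(double totalInputMb, double inputMbPerReducer,
                                double countRatio, int maxReducers) {
        if (inputMbPerReducer <= 0) {
            throw new IllegalArgumentException("inputMbPerReducer must be positive");
        }
        long raw = Math.round(totalInputMb / inputMbPerReducer * countRatio);
        // Never fewer than one reducer, never more than the configured cap.
        long clamped = Math.max(1L, Math.min((long) maxReducers, raw));
        return (int) clamped;
    }

    public static void main(String[] args) {
        // Assumed sample values: 500 MB per reducer, ratio 1.0, cap 5000.
        System.out.println(estimateReducers(3072, 500, 1.0, 5000)); // ~3 GB input
        System.out.println(estimateReducers(60, 500, 1.0, 5000));   // one small partition
    }
}
```

If the real logic is anything like this, a step whose input is only 50-70 MB would round down to a single reducer, which would match what I am seeing. Please correct me if the actual specification differs.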
