Re: Jobs Running with Single Mapper/Reducer

Vineet Mishra Sun, 14 Jun 2015 11:53:17 -0700

Could I get the specification for these properties?

KYLIN_JOB_MAPREDUCE_DEFAULT_REDUCE_COUNT_RATIO =
"kylin.job.mapreduce.default.reduce.count.ratio";
KYLIN_JOB_MAPREDUCE_DEFAULT_REDUCE_INPUT_MB =
"kylin.job.mapreduce.default.reduce.input.mb";
KYLIN_JOB_MAPREDUCE_MAX_REDUCER_NUMBER =
"kylin.job.mapreduce.max.reducer.number";


Thanks!
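
For readers following the thread, here is a minimal sketch of one way three settings like these could bound a reducer count, given the note in the quoted reply below that the number of reducers is set by Kylin. The formula (total reduce input divided by a per-reducer target, scaled by a ratio, then clamped between 1 and a maximum), the helper name estimateReducers, and the sample values 500, 1.0 and 5000 are assumptions for illustration only and are not confirmed anywhere in this thread. The linked CuboidJob.java remains the authoritative source.

```java
// Illustrative sketch only; not taken from the Kylin code base.
class ReducerCountSketch {

    /**
     * Hypothetical helper: estimates a reducer count from the total reduce
     * input size and three tuning values that would correspond to
     *   kylin.job.mapreduce.default.reduce.input.mb    (reduceInputMb)
     *   kylin.job.mapreduce.default.reduce.count.ratio (reduceCountRatio)
     *   kylin.job.mapreduce.max.reducer.number         (maxReducerNumber)
     */
    public static int estimateReducers(double totalReduceInputMb,
                                       double reduceInputMb,
                                       double reduceCountRatio,
                                       int maxReducerNumber) {
        // Target one reducer per reduceInputMb of input, scaled by the ratio.
        int n = (int) Math.round(totalReduceInputMb / reduceInputMb * reduceCountRatio);
        // Never go below one reducer.
        n = Math.max(1, n);
        // Never exceed the configured ceiling.
        return Math.min(maxReducerNumber, n);
    }

    public static void main(String[] args) {
        // Sample values (500 MB per reducer, ratio 1.0, max 5000) are assumed.
        System.out.println(estimateReducers(3072, 500, 1.0, 5000)); // prints 6
        System.out.println(estimateReducers(60, 500, 1.0, 5000));   // prints 1
    }
}
```

Under these assumed values, an input the size of a single 50-70 MB partition rounds down to one reducer, while the full 3 GB would ask for several. Lowering the per-reducer input target or raising the ratio would increase the count in this sketch.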

On Sun, Jun 14, 2015 at 11:59 PM, Vineet Mishra <[email protected]>
wrote:

> Hi Shi,
>
> It's alright!
> My source Hive table is around 3 GB. Even though the table is partitioned,
> with around 50-70 MB of data per partition, only a single Mapper and a
> single Reducer are spawned. The amount of data being processed in the M/R
> job is small, as expected, but it takes a hell of a lot of time.
>
> As mentioned in the trailing mail, the job is very slow: the Build Base
> Cuboid Data step alone takes around 50 minutes to complete.
>
> I can tweak the reducer parameters you mentioned, but do you think that
> will make a difference, since the mapper is where most of the time is
> spent?
>
> Can you share your thoughts on performance tuning for the cube build?
>
> Thanks!
>
> On Sun, Jun 14, 2015 at 7:26 PM, Shi, Shaofeng <[email protected]> wrote:
>
>> Hi, sorry, a busy weekend;
>>
>> Usually Kylin will request a proper number of mappers and reducers. If you
>> see a single mapper/reducer, how large are your input and output? If your
>> cube is quite small, a single mapper/reducer is possible.
>>
>> The number of mappers is decided by the FileInputFormat, but the number of
>> reducers is set by Kylin; see:
>>
>> https://github.com/apache/incubator-kylin/blob/master/job/src/main/java/org/apache/kylin/job/hadoop/cube/CuboidJob.java#L141
>>
>>
>>
>>
>> On 6/14/15, 5:25 PM, "Vineet Mishra" <[email protected]> wrote:
>>
>> >Urgent call, any follow up on this?
>> >
>> >On Fri, Jun 12, 2015 at 6:46 PM, Vineet Mishra <[email protected]>
>> >wrote:
>> >
>> >>
>> >> Why is org.apache.kylin.job.hadoop.cube.CuboidReducer running with a
>> >> single Mapper/Reducer for the job? What is the reason for running it
>> >> as a single mapper/reducer?
>> >>
>> >> Thanks!
>> >>
>> >> On Fri, Jun 12, 2015 at 6:30 PM, Vineet Mishra <[email protected]>
>> >> wrote:
>> >>
>> >>> Hi All,
>> >>>
>> >>> I am building a cube using Kylin and I can see that the job is running
>> >>> with a single Mapper and Reducer for some of the intermediate steps,
>> >>> such as:
>> >>>
>> >>> Extract Fact Table Distinct Columns
>> >>> Build Dimension Dictionary
>> >>> Build N-Dimension Cuboid
>> >>>
>> >>> I am not sure what the reason is for running the job with a single M/R.
>> >>> Is it really necessary, or is it some default config that can be
>> >>> tweaked? It has been 70 minutes and the job status is at 25%!
>> >>>
>> >>> Urgent Call!
>> >>>
>> >>> Thanks!
>> >>>
>> >>
>> >>
>>
>>
>
