Hi Yiming,

The "mapreduce.job.reduces" parameter needs to be set at runtime: its value
is calculated from the size of the user's tables, so it can't be pre-configured.
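
To illustrate why the value has to be computed per build, here is a minimal
sketch of that kind of calculation. It is not the actual code in
CreateFlatHiveTableStep: the class name, the 256 MB-per-reducer ratio and the
upper bound of 500 are all assumptions made up for the example.

```java
public class ReducerEstimate {
    // Hypothetical sizing rule (an assumption, not Kylin's real formula):
    // one reducer per 256 MB of input, clamped to the range [1, 500].
    static final long BYTES_PER_REDUCER = 256L * 1024 * 1024;
    static final int MAX_REDUCERS = 500;

    static int estimateReducers(long tableSizeBytes) {
        if (tableSizeBytes <= 0) {
            return 1;
        }
        // Ceiling division so a partially filled chunk still gets a reducer.
        long n = (tableSizeBytes + BYTES_PER_REDUCER - 1) / BYTES_PER_REDUCER;
        return (int) Math.max(1, Math.min(MAX_REDUCERS, n));
    }

    // Builds the Hive statement that is issued before the flat table query.
    static String reducerSetting(long tableSizeBytes) {
        return "set mapreduce.job.reduces=" + estimateReducers(tableSizeBytes) + ";";
    }

    public static void main(String[] args) {
        long oneGb = 1024L * 1024 * 1024;
        System.out.println(reducerSetting(oneGb)); // prints "set mapreduce.job.reduces=4;"
    }
}
```

Since the input size is only known when the build starts, a fixed value in a
conf file could not replace this.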

The "hive.merge.mapredfiles=false" setting can be externalized to the conf
file. The Hive merge has not been needed since 1.5.3; I set it in code to
ensure it is not enabled (config files before 1.5.3 have this param set to true).
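
A minimal sketch of that "conf file first, code override last" idea, assuming
the settings from the conf file arrive as a simple key/value map. The class
and method names are hypothetical and do not correspond to Kylin's real
classes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class HiveInitStatements {
    // Turns externalized settings into Hive "set" statements, then forces
    // hive.merge.mapredfiles=false regardless of what the conf file says.
    static String build(Map<String, String> fromConfFile) {
        Map<String, String> effective = new LinkedHashMap<>(fromConfFile);
        // Override in code: older conf files may still carry "true" here.
        effective.put("hive.merge.mapredfiles", "false");

        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : effective.entrySet()) {
            sb.append("set ").append(e.getKey()).append('=')
              .append(e.getValue()).append(";\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("dfs.replication", "2");
        conf.put("hive.merge.mapredfiles", "true"); // stale value from an old conf file
        System.out.print(build(conf));
    }
}
```

With the stale "true" in the input, the generated statements still end up
with hive.merge.mapredfiles=false.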

I think the other parameters are optional, but it is better to keep them
because they help performance, e.g. dfs.replication=2, compress.codec, etc.

Usually, in a Hadoop cluster, Apache Kylin should be treated as a
privileged user (rather than a normal user such as an analyst) that can
execute the necessary Hadoop/HDFS/HBase/Hive actions (mkdir, create htable,
etc.). To achieve this, the administrator needs to do some configuration and
authorization. What we need to do is write a document listing these
privileges. What's your opinion?

Thanks for the comment!


2016-07-30 14:03 GMT+08:00 Yiming Liu <[email protected]>:

> Hi Kylin dev,
>
> The first step in building a cube is CreateFlatHiveTable. It calls a
> few Hive configuration commands, such as those at
> CreateFlatHiveTableStep lines 78 and 79:
> set mapreduce.job.reduces=numReduces
> set hive.merge.mapredfiles=false
>
> Are these commands necessary for cube building? Could we configure them
> in files? I have met some cases where HiveServer would say "Configuration
> is not allowed to modify at runtime". That breaks the build.
>
> Maybe there are still some other hard-coded Hadoop commands. It would be
> friendlier if they could be turned off on demand.
>
> --
> With Warm regards
>
> Yiming Liu (刘一鸣)
>



-- 
Best regards,

Shaofeng Shi
