Hi, I am running a 10-node HDP 2.2 cluster, using Tez and YARN. The Hive version is 0.14.
I have a 90 million row table stored as a plain-text CSV file of about 10 GB. I am trying to insert it into an ORC partitioned table with the statement "insert overwrite table 2h2 partition (dt) select *,TIME_STAMP from 2h_tmp;", where dt is the dynamic partition key. Tez allocates only one reducer to the job, which results in a 6-hour run. I expect about 120 partitions to be created. How can I increase the number of reducers to speed up this job?

Is this related to https://issues.apache.org/jira/browse/HIVE-7158? It is marked as resolved for Hive 0.14. I am running with the default values:

hive.tez.auto.reducer.parallelism (Default Value: false, Added In: Hive 0.14.0 with HIVE-7158)
hive.tez.max.partition.factor (Default Value: 2, Added In: Hive 0.14.0 with HIVE-7158)
hive.tez.min.partition.factor (Default Value: 0.25, Added In: Hive 0.14.0 with HIVE-7158)

and with:

hive.exec.dynamic.partition=true;
hive.exec.dynamic.partition.mode=nonstrict;
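For reference, this is roughly what my session looks like end to end. It is only a sketch: the table and column names (2h2, 2h_tmp, TIME_STAMP, dt) are quoted from my schema as above, and the commented-out Tez settings are the ones I have left at their 0.14 defaults.

    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;

    -- Left at the Hive 0.14 defaults listed above:
    -- SET hive.tez.auto.reducer.parallelism=false;
    -- SET hive.tez.max.partition.factor=2;
    -- SET hive.tez.min.partition.factor=0.25;

    -- 2h_tmp: ~90 million rows, plain-text CSV, ~10 GB
    -- 2h2: ORC table partitioned by dt (~120 distinct values expected)
    INSERT OVERWRITE TABLE 2h2 PARTITION (dt)
    SELECT *, TIME_STAMP FROM 2h_tmp;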
