[ https://issues.apache.org/jira/browse/HIVE-14633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15448941#comment-15448941 ]
Hanu commented on HIVE-14633:
-----------------------------
Yes, this is expected and Hive will manage it. But it increases the number of
small files, which in turn increases the number of mappers, so more resources
are used unnecessarily.
Please correct me if I am wrong.
> #.of Files in a partition ! = #.Of buckets in a partitioned,bucketed table
> --------------------------------------------------------------------------
>
> Key: HIVE-14633
> URL: https://issues.apache.org/jira/browse/HIVE-14633
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.2.1
> Environment: HDP 2.3.2
> Reporter: Hanu
>
> Ideally, the number of files should equal the number of buckets declared in
> the table DDL. This holds after an initial insert and after every INSERT
> OVERWRITE. However, INSERT INTO on a bucketed Hive table creates extra
> files.
> Example:
> # of Buckets = 4
> No. of files after Initial insert --> 4
> No. of files after 2nd insert --> 8
> No. of files after 3rd insert --> 12
> No. of files after nth insert --> n * # of Buckets
> First insert list:
> -rwxrwxrwx 3 hvallur hdfs 49 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000000_0
> -rwxrwxrwx 3 hvallur hdfs 49 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000001_0
> -rwxrwxrwx 3 hvallur hdfs 308 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000002_0
> -rwxrwxrwx 3 hvallur hdfs 49 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000003_0
> 2nd insert list:
> -rwxrwxrwx 3 hvallur hdfs 49 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000000_0
> -rwxrwxrwx 3 hvallur hdfs 49 2016-08-25 12:47 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000000_0_copy_1
> -rwxrwxrwx 3 hvallur hdfs 49 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000001_0
> -rwxrwxrwx 3 hvallur hdfs 49 2016-08-25 12:47 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000001_0_copy_1
> -rwxrwxrwx 3 hvallur hdfs 308 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000002_0
> -rwxrwxrwx 3 hvallur hdfs 302 2016-08-25 12:47 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000002_0_copy_1
> -rwxrwxrwx 3 hvallur hdfs 49 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000003_0
> -rwxrwxrwx 3 hvallur hdfs 49 2016-08-25 12:47 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000003_0_copy_1
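
To make the file-versus-bucket distinction in the listings above concrete, here is a minimal Python sketch that groups the listed file names by their leading bucket number. The naming pattern (a six-digit bucket id, an attempt suffix, and an optional `_copy_N` suffix) is only inferred from the listings in this issue, not taken from Hive documentation, and `files_per_bucket` is a hypothetical helper name.

```python
import re
from collections import Counter

# Assumption: names look like <6-digit bucket id>_<attempt>[_copy_<k>],
# as seen in the listings in this issue. This is not a documented contract.
BUCKET_FILE = re.compile(r"^(\d{6})_\d+(?:_copy_\d+)?$")


def files_per_bucket(paths):
    """Count files per bucket id, looking only at the last path component."""
    counts = Counter()
    for path in paths:
        name = path.rstrip("/").rsplit("/", 1)[-1]
        match = BUCKET_FILE.match(name)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# Partition directory and file names copied from the second-insert listing.
PARTITION = (
    "hdfs://dshdp-dev-cluster/apps/hive/warehouse/"
    "upsert_testing.db/test3/lname=vr/"
)
second_insert = [
    PARTITION + name
    for name in (
        "000000_0", "000000_0_copy_1",
        "000001_0", "000001_0_copy_1",
        "000002_0", "000002_0_copy_1",
        "000003_0", "000003_0_copy_1",
    )
]

counts = files_per_bucket(second_insert)
total_files = sum(counts.values())   # number of files in the partition
distinct_buckets = len(counts)       # number of distinct bucket ids

print(total_files, distinct_buckets, dict(counts))
```

For the second-insert listing this reports 8 files spread over 4 distinct bucket ids, two files per bucket, which matches the n * (# of buckets) growth described in the issue.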