Hanu created HIVE-14633:
---------------------------

Summary: #.of Files in a partition ! = #.Of buckets in a partitioned,bucketed table
Key: HIVE-14633
URL: https://issues.apache.org/jira/browse/HIVE-14633
Project: Hive
Issue Type: Bug
Components: Hive
Affects Versions: 1.2.1
Environment: HDP 2.3.2
Reporter: Hanu

Ideally, the number of files in a partition should equal the number of buckets declared in the table DDL. This holds after the initial insert and after every INSERT OVERWRITE. However, INSERT INTO on a bucketed Hive table creates extra files.

ex:
# of buckets = 4
No. of files after initial insert --> 4
No. of files after 2nd insert --> 8
No. of files after 3rd insert --> 12
No. of files after nth insert --> n * # of buckets

First insert:
-rwxrwxrwx 3 hvallur hdfs 49 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000000_0
-rwxrwxrwx 3 hvallur hdfs 49 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000001_0
-rwxrwxrwx 3 hvallur hdfs 308 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000002_0
-rwxrwxrwx 3 hvallur hdfs 49 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000003_0

2nd insert:
-rwxrwxrwx 3 hvallur hdfs 49 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000000_0
-rwxrwxrwx 3 hvallur hdfs 49 2016-08-25 12:47 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000000_0_copy_1
-rwxrwxrwx 3 hvallur hdfs 49 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000001_0
-rwxrwxrwx 3 hvallur hdfs 49 2016-08-25 12:47 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000001_0_copy_1
-rwxrwxrwx 3 hvallur hdfs 308 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000002_0
-rwxrwxrwx 3 hvallur hdfs 302 2016-08-25 12:47 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000002_0_copy_1
-rwxrwxrwx 3 hvallur hdfs 49 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000003_0
-rwxrwxrwx 3 hvallur hdfs 49 2016-08-25 12:47 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000003_0_copy_1

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
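
The file-count pattern in the report can be checked with a small script. This is only a rough sketch, not part of the original report. It assumes the leading number in each file name (000000, 000001, ...) identifies the bucket, which matches the listings but is an assumption here. The helper name files_per_bucket is made up for illustration, and the paths are copied from the 2nd-insert listing.

```python
import re
from collections import Counter

# Assumed naming scheme, inferred from the listings in the report:
# <bucket index>_<attempt>, optionally followed by _copy_<k> for later inserts.
BUCKET_FILE = re.compile(r"^(\d+)_\d+(?:_copy_\d+)?$")


def files_per_bucket(paths):
    """Count how many files in a partition directory map to each bucket index."""
    counts = Counter()
    for path in paths:
        name = path.rsplit("/", 1)[-1]  # keep only the file name
        match = BUCKET_FILE.match(name)
        if match:  # ignore anything that does not look like a bucket file
            counts[int(match.group(1))] += 1
    return counts


partition = ("hdfs://dshdp-dev-cluster/apps/hive/warehouse/"
             "upsert_testing.db/test3/lname=vr/")
names_after_second_insert = [
    "000000_0", "000000_0_copy_1",
    "000001_0", "000001_0_copy_1",
    "000002_0", "000002_0_copy_1",
    "000003_0", "000003_0_copy_1",
]
counts = files_per_bucket(partition + n for n in names_after_second_insert)

num_buckets = 4  # from the report: # of buckets = 4
total_files = sum(counts.values())
print(total_files)             # 8, i.e. 2 inserts * 4 buckets
print(sorted(counts.items()))  # [(0, 2), (1, 2), (2, 2), (3, 2)]
print(total_files == num_buckets)  # False: more files than buckets
```

Under that assumption, each bucket index has one file per insert, so the total grows as n * # of buckets, as described in the report.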