[ 
https://issues.apache.org/jira/browse/HIVE-14633?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15448941#comment-15448941
 ] 

Hanu commented on HIVE-14633:
-----------------------------

Yes, this is expected, and Hive manages the extra files. But it increases the
number of small files, which in turn increases the number of mappers, so more
resources are used unnecessarily.
Please correct me if I am wrong.
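
To quantify the file growth, here is a minimal sketch in Python (not part of
Hive) that groups the file names from the second listing quoted below by bucket
id. The helper name files_per_bucket and the regular expression are
illustrative assumptions based only on the naming pattern visible in that
listing (NNNNNN_0 and NNNNNN_0_copy_N); other Hive versions may name files
differently.

```python
import re
from collections import Counter

# Assumed naming pattern, taken from the listing in this issue:
# 000002_0 (first insert) or 000002_0_copy_1 (later inserts).
BUCKET_FILE = re.compile(r"^(\d{6})_\d+(?:_copy_\d+)?$")


def files_per_bucket(paths):
    """Count how many files belong to each bucket id in one partition."""
    counts = Counter()
    for path in paths:
        name = path.rsplit("/", 1)[-1]  # keep only the file name
        match = BUCKET_FILE.match(name)
        if match:
            counts[int(match.group(1))] += 1
    return counts


# File names copied from the listing after the 2nd insert.
listing = [
    "test3/lname=vr/000000_0",
    "test3/lname=vr/000000_0_copy_1",
    "test3/lname=vr/000001_0",
    "test3/lname=vr/000001_0_copy_1",
    "test3/lname=vr/000002_0",
    "test3/lname=vr/000002_0_copy_1",
    "test3/lname=vr/000003_0",
    "test3/lname=vr/000003_0_copy_1",
]

counts = files_per_bucket(listing)
print(dict(counts))          # {0: 2, 1: 2, 2: 2, 3: 2}
print(sum(counts.values()))  # 8
```

Each bucket id still maps to its own set of files, so the bucket count itself
is unchanged; only the number of files per bucket grows with each insert.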

> #.of Files in a partition ! = #.Of buckets in a partitioned,bucketed table
> --------------------------------------------------------------------------
>
>                 Key: HIVE-14633
>                 URL: https://issues.apache.org/jira/browse/HIVE-14633
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 1.2.1
>         Environment: HDP 2.3.2
>            Reporter: Hanu
>
> Ideally, the number of files in a partition should equal the number of
> buckets declared in the table DDL. This holds after the initial insert and
> after every INSERT OVERWRITE. However, INSERT INTO on a Hive bucketed table
> creates extra files.
> Example:
> No. of buckets = 4
> No. of files after initial insert --> 4
> No. of files after 2nd insert --> 8
> No. of files after 3rd insert --> 12
> No. of files after nth insert --> n * (no. of buckets)
> Listing after the first insert:
> -rwxrwxrwx   3 hvallur hdfs         49 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000000_0
> -rwxrwxrwx   3 hvallur hdfs         49 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000001_0
> -rwxrwxrwx   3 hvallur hdfs        308 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000002_0
> -rwxrwxrwx   3 hvallur hdfs         49 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000003_0
> Listing after the 2nd insert:
> -rwxrwxrwx   3 hvallur hdfs         49 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000000_0
> -rwxrwxrwx   3 hvallur hdfs         49 2016-08-25 12:47 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000000_0_copy_1
> -rwxrwxrwx   3 hvallur hdfs         49 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000001_0
> -rwxrwxrwx   3 hvallur hdfs         49 2016-08-25 12:47 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000001_0_copy_1
> -rwxrwxrwx   3 hvallur hdfs        308 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000002_0
> -rwxrwxrwx   3 hvallur hdfs        302 2016-08-25 12:47 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000002_0_copy_1
> -rwxrwxrwx   3 hvallur hdfs         49 2016-08-25 12:42 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000003_0
> -rwxrwxrwx   3 hvallur hdfs         49 2016-08-25 12:47 hdfs://dshdp-dev-cluster/apps/hive/warehouse/upsert_testing.db/test3/lname=vr/000003_0_copy_1



