Rui Li created HIVE-7956:
----------------------------

             Summary: When inserting into a bucketed table, all data goes to a single bucket [Spark Branch]
                 Key: HIVE-7956
                 URL: https://issues.apache.org/jira/browse/HIVE-7956
             Project: Hive
          Issue Type: Bug
          Components: Spark
            Reporter: Rui Li


I created a bucketed table:
{code}
create table testBucket(x int,y string) clustered by(x) into 10 buckets;
{code}
Then I ran a query like:
{code}
set hive.enforce.bucketing = true;
insert overwrite table testBucket select intCol,stringCol from src;
{code}
Here {{src}} is a simple, non-bucketed textfile-based table containing 
40,000,000 records. The query launches 10 reduce tasks, but all the data goes 
to only one of them.
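
One possible way to confirm the skew from the Hive side is to count rows per underlying file of the target table. This is only a sketch: it assumes the {{INPUT__FILE__NAME}} virtual column is available in the build being tested, and that each reduce task writes one file. Files that received no rows will not appear in the result.
{code}
-- Count the rows stored in each file of the bucketed table.
-- INPUT__FILE__NAME is a Hive virtual column holding the path of the
-- file each row was read from.
select INPUT__FILE__NAME, count(*)
from testBucket
group by INPUT__FILE__NAME;
{code}
If bucketing works as intended, this should list roughly ten files with comparable counts, provided {{intCol}} has well-spread values. With the behavior described above, a single file would account for all the rows.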



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
