[ https://issues.apache.org/jira/browse/CARBONDATA-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Indhumathi resolved CARBONDATA-4322.
------------------------------------
    Fix Version/s: 2.3.0
       Resolution: Fixed

> Insert into local sort partition table select * from text table launch thousands tasks
> --------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-4322
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-4322
>             Project: CarbonData
>          Issue Type: Bug
>            Reporter: SHREELEKHYA GAMPA
>            Priority: Major
>             Fix For: 2.3.0
>
>          Time Spent: 7h 50m
>  Remaining Estimate: 0h
>
> [Reproduce steps]
> # CREATE TABLE partitionthree1 (empno int, doj Timestamp,
>   workgroupcategoryname String, deptno int, deptname String, projectcode int,
>   projectjoindate Timestamp, projectenddate Timestamp, attendance int,
>   utilization int, salary int, empname String, designation String)
>   PARTITIONED BY (workgroupcategory int) STORED AS carbondata
>   tblproperties('sort_scope'='local_sort', 'sort_columns'='deptname,empname');
> # CREATE TABLE partitionthree2 (empno int, doj Timestamp,
>   workgroupcategoryname String, deptno int, deptname String, projectcode int,
>   projectjoindate Timestamp, projectenddate Timestamp, attendance int,
>   utilization int, salary int, empname String, designation String)
>   PARTITIONED BY (workgroupcategory int);
> # LOAD DATA local inpath 'hdfs://hacluster/user/data.csv' INTO TABLE
>   partitionthree1 OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '"',
>   'TIMESTAMPFORMAT'='dd-MM-yyyy');
> # set hive.exec.dynamic.partition.mode=nonstrict;
> # insert into partitionthree2 select * from partitionthree1;
>   insert into partitionthree2 select * from partitionthree1;
>   insert into partitionthree2 select * from partitionthree1;
>   insert into partitionthree2 select * from partitionthree1;
>   insert into partitionthree2 select * from partitionthree1;
>   insert into partitionthree2 select * from partitionthree1;
>   insert into partitionthree2 select * from partitionthree1;
>   insert into partitionthree2 select * from partitionthree1;
>   insert into partitionthree2 select * from partitionthree1;
>   insert into partitionthree2 select * from partitionthree1;
>   insert into partitionthree2 select * from partitionthree1;
>   insert into partitionthree2 select * from partitionthree1;
>   insert into partitionthree2 select * from partitionthree1;
>   insert into partitionthree2 select * from partitionthree1;
>   insert into partitionthree2 select * from partitionthree1;
>   insert into partitionthree2 select * from partitionthree1;
> # insert into partitionthree1 select * from partitionthree2;
>
> [Expected Result]
> Step 6 launches only as many tasks as there are nodes.
>
> [Current Behavior]
> The number of tasks is far larger than the number of nodes.
>
> [Impact]
> At several product sites, query performance is significantly impacted.
>
> [Initial analysis]
> Inserting into a non-partitioned local sort table launches a number of tasks
> equal to the number of nodes. Inserting into a partitioned local sort table
> should behave the same way.

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
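
The expected behavior described in the report (as many tasks as nodes, rather than one task per input block) can be pictured with a small sketch. This is only an illustration in Python, not CarbonData's actual code: the function name, block ids, and node names are all hypothetical, and each block is assumed to live on exactly one node.

```python
from collections import defaultdict


def one_task_per_node(block_locations):
    """Group input blocks by hosting node, so the task count equals the node count.

    block_locations: list of (block_id, node) pairs (hypothetical names).
    Returns a dict mapping each node to the list of blocks its single task reads.
    """
    tasks = defaultdict(list)
    for block_id, node in block_locations:
        tasks[node].append(block_id)
    return dict(tasks)


# Repeated inserts leave many small files; model them as 1000 made-up blocks
# spread round-robin over 3 made-up nodes.
blocks = [(f"block-{i}", f"node-{i % 3}") for i in range(1000)]
tasks = one_task_per_node(blocks)
print(len(blocks), "blocks ->", len(tasks), "tasks")  # 1000 blocks -> 3 tasks
```

With one task per block the same input would need 1000 tasks, which is the kind of gap the report describes between the current and expected behavior.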