[
https://issues.apache.org/jira/browse/CARBONDATA-4042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Venugopal Reddy K updated CARBONDATA-4042:
------------------------------------------
Summary: Insert into select and CTAS launches fewer tasks (task count
limited to number of nodes in cluster) even when target table is of no_sort
(was: Insert into select and CTAS launches fewer tasks (limited to max nodes)
even when target table is of no_sort)
> Insert into select and CTAS launches fewer tasks (task count limited to number
> of nodes in cluster) even when target table is of no_sort
> ---------------------------------------------------------------------------------------------------------------------------------------
>
> Key: CARBONDATA-4042
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4042
> Project: CarbonData
> Issue Type: Improvement
> Components: data-load, spark-integration
> Reporter: Venugopal Reddy K
> Priority: Major
>
> *Issue:*
> At present, when we run insert into table select from or create table as
> select from, we launch a single task per node. In contrast, a simple
> select * from table query launches as many tasks as there are carbondata
> files (the CARBON_TASK_DISTRIBUTION default is CARBON_TASK_DISTRIBUTION_BLOCK).
> This slows down load performance in the insert into select and CTAS cases.
> Refer to the [community discussion regarding task
> launch|http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/Discussion-Query-Regarding-Task-launch-mechanism-for-data-load-operations-tt98711.html]
>
> *Suggestion:*
> Launch the same number of tasks as in a select query for the insert into select
> and CTAS cases when the target table is no_sort.
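
To make the difference in task counts concrete, here is a minimal Python sketch of the two distribution strategies described above. It is not CarbonData code: the function names, the (file, node) block representation and the sort-scope check are all hypothetical and only model the behaviour stated in this issue.

```python
from collections import defaultdict


def tasks_per_node(blocks):
    """Model of the behaviour reported in this issue: one task per node,
    each task handling every block located on that node."""
    by_node = defaultdict(list)
    for block, node in blocks:
        by_node[node].append(block)
    return [by_node[node] for node in sorted(by_node)]


def tasks_per_block(blocks):
    """Model of block-level distribution: one task per carbondata file."""
    return [[block] for block, _node in blocks]


def plan_insert_tasks(blocks, sort_scope):
    """Hypothetical planner for the suggestion: use block-level tasks
    when the target table is no_sort, otherwise keep one task per node."""
    if sort_scope.upper() == "NO_SORT":
        return tasks_per_block(blocks)
    return tasks_per_node(blocks)


# Illustrative input (assumed): 12 carbondata files spread over 3 nodes.
blocks = [(f"part-{i}.carbondata", f"node-{i % 3}") for i in range(12)]

print(len(tasks_per_node(blocks)))                # 3 tasks, one per node
print(len(plan_insert_tasks(blocks, "NO_SORT")))  # 12 tasks, one per file
```

With 12 files on 3 nodes, the per-node strategy yields 3 tasks while the per-block strategy yields 12, which is the gap in parallelism this issue proposes to close for no_sort targets.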