Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/2909 Since this PR uses CSV splits, there are more tasks and each task is smaller, so we get better parallelism and each task requires fewer resources to run.
---