Github user ravipesala commented on the issue:
https://github.com/apache/carbondata/pull/1825
@xuchuanyin There are two reasons why we copy the file instead of writing
directly to HDFS.
1. We make sure that each complete carbondata file goes into exactly one HDFS
block: when copying it from local disk to HDFS, we specify the block size.
Otherwise, query performance is impacted a lot.
2. It removes the overhead of writing to HDFS directly (which internally writes
to the replicas as well): the copy starts in a different thread, so the main
loading flow is not blocked.
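
The two points above can be sketched roughly as follows. This is only an illustration, not the actual CarbonData code: the class `BlockSizedCopy` and its methods are hypothetical names, and a plain `Files.copy` stands in for the HDFS write so that the example runs without a cluster. In a real setup the copy would presumably go through Hadoop's `FileSystem.create(path, overwrite, bufferSize, replication, blockSize)`, passing the computed block size. The rounding to a multiple of 512 bytes assumes the default HDFS checksum chunk size.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BlockSizedCopy {

    // Assumption: HDFS expects the block size to be a multiple of the
    // checksum chunk size (512 bytes by default).
    public static final long CHECKSUM_CHUNK = 512L;

    // Smallest multiple of checksumChunk that holds the whole file,
    // so the file fits into a single block.
    public static long blockSizeFor(long fileLength, long checksumChunk) {
        if (fileLength <= 0) {
            return checksumChunk;
        }
        long chunks = (fileLength + checksumChunk - 1) / checksumChunk;
        return chunks * checksumChunk;
    }

    // Starts the copy on a worker thread and returns immediately, so the
    // caller (the loading flow) is not blocked while the file is written.
    public static Future<Long> copyAsync(ExecutorService pool, Path local, Path target) {
        return pool.submit(() -> {
            long blockSize = blockSizeFor(Files.size(local), CHECKSUM_CHUNK);
            // Stand-in for the HDFS write; a real implementation would pass
            // blockSize to the Hadoop FileSystem when creating the target.
            Files.copy(local, target, StandardCopyOption.REPLACE_EXISTING);
            return blockSize;
        });
    }

    public static void main(String[] args) throws Exception {
        Path local = Files.createTempFile("part-0-", ".carbondata");
        Path target = Files.createTempFile("copied-", ".carbondata");
        Files.write(local, new byte[1000]);

        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Future<Long> pending = copyAsync(pool, local, target);
            // The loading flow could keep producing the next file here.
            long blockSize = pending.get(); // wait only when the result is needed
            System.out.println("copied " + Files.size(target)
                + " bytes, block size " + blockSize);
        } finally {
            pool.shutdown();
            Files.deleteIfExists(local);
            Files.deleteIfExists(target);
        }
    }
}
```

The data is first written to local disk at local-disk speed; the slower, replicated write happens in the background, and choosing the block size at copy time keeps each file in one block.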
---