Github user QiangCai commented on a diff in the pull request:
https://github.com/apache/carbondata/pull/1898#discussion_r165247013
--- Diff: docs/configuration-parameters.md ---
@@ -60,6 +60,7 @@ This section provides the details of all the
configurations required for CarbonD
| carbon.options.is.empty.data.bad.record | false | If false, then empty ("" or '' or ,,) data will not be considered a bad record, and vice versa. | |
| carbon.options.bad.record.path | | Specifies the HDFS path where bad records are stored. By default the value is Null. This path must be configured by the user if the bad record logger is enabled or the bad record action is set to redirect. | |
| carbon.enable.vector.reader | true | This parameter improves the performance of select queries as it fetches a columnar batch of 4*1024 rows instead of fetching data row by row. | |
+| carbon.task.distribution | merge_small_files | Setting this parameter to *merge_small_files* merges all the small files up to a size of 128 MB. During data loading, all the small CSV files are combined into a map task to reduce the number of read tasks, which enhances performance. | |
--- End diff ---
1. carbon.task.distribution applies only to queries; it is not used by data
loading.
Global_Sort loading always merges small CSV files and does not require this
configuration.
2. It would be better to list all the valid values of carbon.task.distribution:
custom, block (default), blocklet, merge_small_files
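For illustration, a minimal sketch of how this property might be set in carbon.properties (the property name and value list come from the diff and the comment above; the exact merge-size semantics are an assumption from the proposed doc text):

```properties
# carbon.properties -- sketch, not a verified configuration
# Controls how input splits are distributed to tasks during queries.
# Candidate values per the discussion: custom, block (default), blocklet, merge_small_files
carbon.task.distribution=merge_small_files
```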
---