ravipesala commented on a change in pull request #3115: Document Update URL: https://github.com/apache/carbondata/pull/3115#discussion_r252180869
########## File path: docs/configuration-parameters.md ########## @@ -61,7 +61,7 @@ This section provides the details of all the configurations required for the Car | carbon.number.of.cores.while.loading | 2 | Number of cores to be used while loading data. This also determines the number of threads to be used to read the input files (csv) in parallel.**NOTE:** This configured value is used in every data loading step to parallelize the operations. Configuring a higher value can lead to increased early thread pre-emption by OS and there by reduce the overall performance. | | enable.unsafe.sort | true | CarbonData supports unsafe operations of Java to avoid GC overhead for certain operations. This configuration enables to use unsafe functions in CarbonData. **NOTE:** For operations like data loading, which generates more short lived Java objects, Java GC can be a bottle neck. Using unsafe can overcome the GC overhead and improve the overall performance. | | enable.offheap.sort | true | CarbonData supports storing data in off-heap memory for certain operations during data loading and query. This helps to avoid the Java GC and thereby improve the overall performance. This configuration enables using off-heap memory for sorting of data during data loading.**NOTE:** ***enable.unsafe.sort*** configuration needs to be configured to true for using off-heap | -| carbon.load.sort.scope | LOCAL_SORT | CarbonData can support various sorting options to match the balance between load and query performance. LOCAL_SORT:All the data given to an executor in the single load is fully sorted and written to carbondata files. Data loading performance is reduced a little as the entire data needs to be sorted in the executor. BATCH_SORT:Sorts the data in batches of configured size and writes to carbondata files. Data loading performance increases as the entire data need not be sorted. But query performance will get reduced due to false positives in block pruning and also due to more number of carbondata files written. Due to more number of carbondata files, if identified blocks > cluster parallelism, query performance and concurrency will get reduced. GLOBAL SORT:Entire data in the data load is fully sorted and written to carbondata files. Data loading performance would get reduced as the entire data needs to be sorted. But the query performance increases significantly due to very less false positives and concurrency is also improved. **NOTE:** when BATCH_SORT is configured, it is recommended to keep ***carbon.load.batch.sort.size.inmb*** > ***carbon.blockletgroup.size.in.mb*** | +| carbon.load.sort.scope | LOCAL_SORT | CarbonData can support various sorting options to match the balance between load and query performance. LOCAL_SORT:All the data given to an executor in the single load is fully sorted and written to carbondata files. Data loading performance is reduced a little as the entire data needs to be sorted in the executor. BATCH_SORT:Sorts the data in batches of configured size and writes to carbondata files. Data loading performance increases as the entire data need not be sorted. But query performance will get reduced due to false positives in block pruning and also due to more number of carbondata files written. Due to more number of carbondata files, if identified blocks > cluster parallelism, query performance and concurrency will get reduced. GLOBAL SORT:Entire data in the data load is fully sorted and written to carbondata files. Data loading performance would get reduced as the entire data needs to be sorted. But the query performance increases significantly due to very less false positives and concurrency is also improved. **NOTE 1:** This property will be taken into account when SORT COLUMNS are specified explicitely, otherwise it is always NO SORT **NOTE 2:** When BATCH_SORT is configured, it is recommended to keep ***carbon.load.batch.sort.size.inmb*** > ***carbon.blockletgroup.size.in.mb***.| Review comment: Please correct it like when SORT_COLUMNS are specified during carbon table creation ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services