[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-25 Thread xuchuanyin
Github user xuchuanyin closed the pull request at:

https://github.com/apache/carbondata/pull/1177


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-25 Thread xuchuanyin
Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r129289344
  
--- Diff: processing/src/main/java/org/apache/carbondata/processing/newflow/sort/impl/ParallelReadMergeSorterWithBucketingImpl.java ---
@@ -185,12 +188,17 @@ private boolean processRowToNextStep(SortDataRows[] sortDataRows, SortParameters
   }
 
   private void setTempLocation(SortParameters parameters) {
-    String carbonDataDirectoryPath = CarbonDataProcessorUtil
+    String[] carbonDataDirectoryPath = CarbonDataProcessorUtil
         .getLocalDataFolderLocation(parameters.getDatabaseName(),
             parameters.getTableName(), parameters.getTaskNo(),
             parameters.getPartitionID(), parameters.getSegmentId(), false);
-    parameters.setTempFileLocation(
-        carbonDataDirectoryPath + File.separator + CarbonCommonConstants.SORT_TEMP_FILE_LOCATION);
+    String[] tmpLocs = new String[carbonDataDirectoryPath.length];
--- End diff --

:+1: all fixed
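The quoted hunk is truncated at the new tmpLocs array. As a hedged illustration only, this is a minimal sketch of how the per-directory sort-temp paths could be derived from the String to String[] change; the constant value and helper name here are assumptions, not CarbonData's actual code:

```java
import java.io.File;

public class TempLocationSketch {
    // assumed stand-in value for CarbonCommonConstants.SORT_TEMP_FILE_LOCATION
    static final String SORT_TEMP_FILE_LOCATION = "sortrowtmp";

    // Build one sort-temp subdirectory per local data folder,
    // mirroring the String -> String[] change in the diff.
    static String[] toTempLocations(String[] carbonDataDirectoryPath) {
        String[] tmpLocs = new String[carbonDataDirectoryPath.length];
        for (int i = 0; i < carbonDataDirectoryPath.length; i++) {
            tmpLocs[i] = carbonDataDirectoryPath[i] + File.separator + SORT_TEMP_FILE_LOCATION;
        }
        return tmpLocs;
    }

    public static void main(String[] args) {
        for (String loc : toTempLocations(new String[] {"/tmp/disk0", "/tmp/disk1"})) {
            System.out.println(loc);
        }
    }
}
```

On a POSIX system this prints one sortrowtmp path per configured local directory.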




[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-25 Thread xuchuanyin
Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r129289336
  
--- Diff: processing/src/main/java/org/apache/carbondata/processing/newflow/sort/impl/UnsafeParallelReadMergeSorterWithBucketingImpl.java ---
@@ -168,11 +171,15 @@ private boolean processRowToNextStep(UnsafeSortDataRows[] sortDataRows, SortPara
   }
 
   private void setTempLocation(SortParameters parameters) {
-    String carbonDataDirectoryPath = CarbonDataProcessorUtil
+    String[] carbonDataDirectoryPath = CarbonDataProcessorUtil
         .getLocalDataFolderLocation(parameters.getDatabaseName(), parameters.getTableName(),
             parameters.getTaskNo(), parameters.getPartitionID(), parameters.getSegmentId(), false);
-    parameters.setTempFileLocation(
-        carbonDataDirectoryPath + File.separator + CarbonCommonConstants.SORT_TEMP_FILE_LOCATION);
+    String[] tmpLoc = new String[carbonDataDirectoryPath.length];
--- End diff --

:+1: all fixed




[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-25 Thread xuchuanyin
Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r129286013
  
--- Diff: processing/src/main/java/org/apache/carbondata/processing/newflow/steps/DataWriterBatchProcessorStepImpl.java ---
@@ -58,12 +58,14 @@ public DataWriterBatchProcessorStepImpl(CarbonDataLoadConfiguration configuratio
     child.initialize();
   }
 
-  private String getStoreLocation(CarbonTableIdentifier tableIdentifier, String partitionId) {
-    String storeLocation = CarbonDataProcessorUtil
+  private String[] getStoreLocation(CarbonTableIdentifier tableIdentifier, String partitionId) {
+    String[] storeLocation = CarbonDataProcessorUtil
         .getLocalDataFolderLocation(tableIdentifier.getDatabaseName(),
             tableIdentifier.getTableName(), String.valueOf(configuration.getTaskNo()), partitionId,
             configuration.getSegmentId() + "", false);
-    new File(storeLocation).mkdirs();
+    for (String loc : storeLocation) {
--- End diff --

:+1: all related references fixed




[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-25 Thread xuchuanyin
Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r129281827
  
--- Diff: processing/src/main/java/org/apache/carbondata/processing/merger/RowResultMergerProcessor.java ---
@@ -57,12 +57,15 @@
       LogServiceFactory.getLogService(RowResultMergerProcessor.class.getName());
 
   public RowResultMergerProcessor(String databaseName,
-      String tableName, SegmentProperties segProp, String tempStoreLocation,
+      String tableName, SegmentProperties segProp, String[] tempStoreLocation,
       CarbonLoadModel loadModel, CompactionType compactionType) {
     this.segprop = segProp;
-    if (!new File(tempStoreLocation).mkdirs()) {
-      LOGGER.error("Error while new File(tempStoreLocation).mkdirs() ");
+    for (String temLoc : tempStoreLocation) {
+      if (!new File(temLoc).mkdirs()) {
+        LOGGER.error("Error while new File(tempStoreLocation).mkdirs() ");
--- End diff --

:+1: fixed




[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-25 Thread xuchuanyin
Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r129281674
  
--- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ---
@@ -1310,6 +1310,18 @@
   public static final String CARBON_LEASE_RECOVERY_RETRY_INTERVAL =
       "carbon.lease.recovery.retry.interval";
 
+  /**
+   * whether to use multi directories when loading data,
+   * the main purpose is to avoid single-disk-hot-spot
+   */
+  @CarbonProperty
+  public static final String CARBON_USING_MULTI_TEMP_DIR = "carbon.use.multiple.temp.dir";
--- End diff --

:+1: fixed




[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-25 Thread xuchuanyin
Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r129281496
  
--- Diff: processing/src/main/java/org/apache/carbondata/processing/newflow/steps/CarbonRowDataWriterProcessorStepImpl.java ---
@@ -112,9 +115,11 @@ private String getStoreLocation(CarbonTableIdentifier tableIdentifier, String pa
       isNoDictionaryDimensionColumn =
           CarbonDataProcessorUtil.getNoDictionaryMapping(configuration.getDataFields());
       measureDataType = configuration.getMeasureDataType();
+      //choose a tmp location randomly
--- End diff --

It's just a temporary variable; I'll make the function call in the argument list instead.
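For context, "choose a tmp location randomly" is the load-balancing idea behind this PR: each task picks one of the configured temp directories at random so sort-temp writes spread across disks instead of hammering one. A hedged sketch under that assumption; the names here are illustrative, not the PR's code:

```java
import java.util.Arrays;
import java.util.Random;

public class RandomTempDirSketch {
    // Pick one temp directory at random so concurrent load tasks
    // spread their sort-temp I/O across all local disks.
    static String chooseRandomly(String[] tmpDirs, Random rnd) {
        return tmpDirs[rnd.nextInt(tmpDirs.length)];
    }

    public static void main(String[] args) {
        String[] dirs = {"/data1/tmp", "/data2/tmp", "/data3/tmp"};
        String chosen = chooseRandomly(dirs, new Random());
        // whatever is chosen must be one of the configured dirs
        System.out.println(Arrays.asList(dirs).contains(chosen));
    }
}
```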




[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-25 Thread xuchuanyin
Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r129280832
  
--- Diff: processing/src/main/java/org/apache/carbondata/processing/merger/RowResultMergerProcessor.java ---
@@ -57,12 +57,15 @@
       LogServiceFactory.getLogService(RowResultMergerProcessor.class.getName());
 
   public RowResultMergerProcessor(String databaseName,
-      String tableName, SegmentProperties segProp, String tempStoreLocation,
+      String tableName, SegmentProperties segProp, String[] tempStoreLocation,
       CarbonLoadModel loadModel, CompactionType compactionType) {
     this.segprop = segProp;
-    if (!new File(tempStoreLocation).mkdirs()) {
-      LOGGER.error("Error while new File(tempStoreLocation).mkdirs() ");
+    for (String temLoc : tempStoreLocation) {
+      if (!new File(temLoc).mkdirs()) {
+        LOGGER.error("Error while new File(tempStoreLocation).mkdirs() ");
--- End diff --

:+1: fixed




[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-25 Thread xuchuanyin
Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r129280724
  
--- Diff: docs/useful-tips-on-carbondata.md ---
@@ -231,5 +231,6 @@ scenarios. After the completion of POC, some of the configurations impacting the
 | spark.executor.instances/spark.executor.cores/spark.executor.memory | spark/conf/spark-defaults.conf | Querying | The number of executors, CPU cores, and memory used for CarbonData query. | In the bank scenario, we provide the 4 CPUs cores and 15 GB for each executor which can get good performance. This 2 value does not mean more the better. It needs to be configured properly in case of limited resources. For example, In the bank scenario, it has enough CPU 32 cores each node but less memory 64 GB each node. So we cannot give more CPU but less memory. For example, when 4 cores and 12GB for each executor. It sometimes happens GC during the query which impact the query performance very much from the 3 second to more than 15 seconds. In this scenario need to increase the memory or decrease the CPU cores. |
 | carbon.detail.batch.size | spark/carbonlib/carbon.properties | Data loading | The buffer size to store records, returned from the block scan. | In limit scenario this parameter is very important. For example your query limit is 1000. But if we set this value to 3000 that means we get 3000 records from scan but spark will only take 1000 rows. So the 2000 remaining are useless. In one Finance test case after we set it to 100, in the limit 1000 scenario the performance increase about 2 times in comparison to if we set this value to 12000. |
 | carbon.use.local.dir | spark/carbonlib/carbon.properties | Data loading | Whether use YARN local directories for multi-table load disk load balance | If this is set it to true CarbonData will use YARN local directories for multi-table load disk load balance, that will improve the data load performance. |
+| carbon.use.multiple.temp.dir | spark/carbonlib/carbon.properties | Data loading | Whether to use multiple YARN local directories during table data loading for disk load balance | After enabling 'carbon.use.local.dir', if this is set to true, CarbonData will use YARN local directories during data load for disk load balance, that will improve the data load performance. Please enable this property especially when you encounter disk hotspot problem during data loading. |
--- End diff --

:+1: fixed
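For reference, the two related settings from the docs table above can be combined in carbon.properties; a minimal fragment (the true values shown are the documented opt-ins, not the defaults):

```properties
# use YARN local directories for multi-table load disk load balance
carbon.use.local.dir=true
# additionally spread sort temp files across all YARN local directories
carbon.use.multiple.temp.dir=true
```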




[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-25 Thread xuchuanyin
Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r129280640
  
--- Diff: docs/useful-tips-on-carbondata.md ---
@@ -231,5 +231,6 @@ scenarios. After the completion of POC, some of the configurations impacting the
 | spark.executor.instances/spark.executor.cores/spark.executor.memory | spark/conf/spark-defaults.conf | Querying | The number of executors, CPU cores, and memory used for CarbonData query. | In the bank scenario, we provide the 4 CPUs cores and 15 GB for each executor which can get good performance. This 2 value does not mean more the better. It needs to be configured properly in case of limited resources. For example, In the bank scenario, it has enough CPU 32 cores each node but less memory 64 GB each node. So we cannot give more CPU but less memory. For example, when 4 cores and 12GB for each executor. It sometimes happens GC during the query which impact the query performance very much from the 3 second to more than 15 seconds. In this scenario need to increase the memory or decrease the CPU cores. |
 | carbon.detail.batch.size | spark/carbonlib/carbon.properties | Data loading | The buffer size to store records, returned from the block scan. | In limit scenario this parameter is very important. For example your query limit is 1000. But if we set this value to 3000 that means we get 3000 records from scan but spark will only take 1000 rows. So the 2000 remaining are useless. In one Finance test case after we set it to 100, in the limit 1000 scenario the performance increase about 2 times in comparison to if we set this value to 12000. |
 | carbon.use.local.dir | spark/carbonlib/carbon.properties | Data loading | Whether use YARN local directories for multi-table load disk load balance | If this is set it to true CarbonData will use YARN local directories for multi-table load disk load balance, that will improve the data load performance. |
+| carbon.use.multiple.temp.dir | spark/carbonlib/carbon.properties | Data loading | Whether to use multiple YARN local directories during table data loading for disk load balance | After enabling 'carbon.use.local.dir', if this is set to true, CarbonData will use YARN local directories during data load for disk load balance, that will improve the data load performance. Please enable this property especially when you encounter disk hotspot problem during data loading. |
--- End diff --

:+1: fixed




[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-25 Thread xuchuanyin
Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r129280458
  
--- Diff: core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java ---
@@ -892,6 +892,23 @@ public int getNoDeleteDeltaFilesThresholdForIUDCompaction() {
   }
 
   /**
+   * Returns whether to use multi temp dirs
+   * @return boolean
+   */
+  public boolean isUseMultiTempDir() {
+    String usingMultiDirStr = getProperty(CarbonCommonConstants.CARBON_USING_MULTI_TEMP_DIR,
+        CarbonCommonConstants.CARBON_USING_MULTI_TEMP_DIR_DEFAULT);
+    boolean validateBoolean = CarbonUtil.validateBoolean(usingMultiDirStr);
+    if (!validateBoolean) {
+      LOGGER.info("The using multi temp dir value \"" + usingMultiDirStr
--- End diff --

:+1: fixed




[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-25 Thread xuchuanyin
Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r129279731
  
--- Diff: processing/src/main/java/org/apache/carbondata/processing/merger/RowResultMergerProcessor.java ---
@@ -57,12 +57,15 @@
       LogServiceFactory.getLogService(RowResultMergerProcessor.class.getName());
 
   public RowResultMergerProcessor(String databaseName,
-      String tableName, SegmentProperties segProp, String tempStoreLocation,
+      String tableName, SegmentProperties segProp, String[] tempStoreLocation,
       CarbonLoadModel loadModel, CompactionType compactionType) {
     this.segprop = segProp;
-    if (!new File(tempStoreLocation).mkdirs()) {
-      LOGGER.error("Error while new File(tempStoreLocation).mkdirs() ");
+    for (String temLoc : tempStoreLocation) {
+      if (!new File(temLoc).mkdirs()) {
+        LOGGER.error("Error while new File(tempStoreLocation).mkdirs() ");
--- End diff --

:+1:  fixed




[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-25 Thread xuchuanyin
Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r129279394
  
--- Diff: processing/src/main/java/org/apache/carbondata/processing/newflow/sort/unsafe/UnsafeSortDataRows.java ---
@@ -306,7 +309,9 @@ private void writeData(UnsafeCarbonRowPage rowPage, File file)
    * This method will be used to delete sort temp location is it is exites
    */
   public void deleteSortLocationIfExists() {
-    CarbonDataProcessorUtil.deleteSortLocationIfExists(parameters.getTempFileLocation());
+    for (String loc : parameters.getTempFileLocation()) {
--- End diff --

:+1:  fixed




[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-25 Thread sraghunandan
Github user sraghunandan commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r129198419
  
--- Diff: docs/useful-tips-on-carbondata.md ---
@@ -231,5 +231,6 @@ scenarios. After the completion of POC, some of the configurations impacting the
 | spark.executor.instances/spark.executor.cores/spark.executor.memory | spark/conf/spark-defaults.conf | Querying | The number of executors, CPU cores, and memory used for CarbonData query. | In the bank scenario, we provide the 4 CPUs cores and 15 GB for each executor which can get good performance. This 2 value does not mean more the better. It needs to be configured properly in case of limited resources. For example, In the bank scenario, it has enough CPU 32 cores each node but less memory 64 GB each node. So we cannot give more CPU but less memory. For example, when 4 cores and 12GB for each executor. It sometimes happens GC during the query which impact the query performance very much from the 3 second to more than 15 seconds. In this scenario need to increase the memory or decrease the CPU cores. |
 | carbon.detail.batch.size | spark/carbonlib/carbon.properties | Data loading | The buffer size to store records, returned from the block scan. | In limit scenario this parameter is very important. For example your query limit is 1000. But if we set this value to 3000 that means we get 3000 records from scan but spark will only take 1000 rows. So the 2000 remaining are useless. In one Finance test case after we set it to 100, in the limit 1000 scenario the performance increase about 2 times in comparison to if we set this value to 12000. |
 | carbon.use.local.dir | spark/carbonlib/carbon.properties | Data loading | Whether use YARN local directories for multi-table load disk load balance | If this is set it to true CarbonData will use YARN local directories for multi-table load disk load balance, that will improve the data load performance. |
+| carbon.use.multiple.temp.dir | spark/carbonlib/carbon.properties | Data loading | Whether to use multiple YARN local directories during table data loading for disk load balance | After enabling 'carbon.use.local.dir', if this is set to true, CarbonData will use YARN local directories during data load for disk load balance, that will improve the data load performance. Please enable this property especially when you encounter disk hotspot problem during data loading. |
--- End diff --

Can remove the word 'especially'.




[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-25 Thread sraghunandan
Github user sraghunandan commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r129204565
  
--- Diff: processing/src/main/java/org/apache/carbondata/processing/newflow/steps/CarbonRowDataWriterProcessorStepImpl.java ---
@@ -112,9 +115,11 @@ private String getStoreLocation(CarbonTableIdentifier tableIdentifier, String pa
       isNoDictionaryDimensionColumn =
           CarbonDataProcessorUtil.getNoDictionaryMapping(configuration.getDataFields());
       measureDataType = configuration.getMeasureDataType();
+      //choose a tmp location randomly
--- End diff --

Why did we move this to a local variable?




[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-25 Thread sraghunandan
Github user sraghunandan commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r129195193
  
--- Diff: core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java ---
@@ -892,6 +892,23 @@ public int getNoDeleteDeltaFilesThresholdForIUDCompaction() {
   }
 
   /**
+   * Returns whether to use multi temp dirs
+   * @return boolean
+   */
+  public boolean isUseMultiTempDir() {
+    String usingMultiDirStr = getProperty(CarbonCommonConstants.CARBON_USING_MULTI_TEMP_DIR,
+        CarbonCommonConstants.CARBON_USING_MULTI_TEMP_DIR_DEFAULT);
+    boolean validateBoolean = CarbonUtil.validateBoolean(usingMultiDirStr);
+    if (!validateBoolean) {
+      LOGGER.info("The using multi temp dir value \"" + usingMultiDirStr
--- End diff --

Suggested wording: "carbon.use.multiple.temp.dir configuration value is invalid. Configured value: usingMultiDirStr. Data load will not use multiple temp directories."
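A small sketch of that validation with the reviewer's suggested message wording. The property key matches the PR, but this stand-alone method is a simplified stand-in for CarbonProperties.isUseMultiTempDir() and CarbonUtil.validateBoolean(), not the actual implementation:

```java
public class MultiTempDirFlagSketch {
    static final String KEY = "carbon.use.multiple.temp.dir";
    static final String DEFAULT_VALUE = "false";

    // Fall back to the default (and say so) when the configured
    // value is not a valid boolean string.
    static boolean isUseMultiTempDir(String configured) {
        String value = (configured == null) ? DEFAULT_VALUE : configured;
        if (!"true".equalsIgnoreCase(value) && !"false".equalsIgnoreCase(value)) {
            System.out.println(KEY + " configuration value is invalid. Configured value: "
                + value + ". Data load will not use multiple temp directories.");
            return false;
        }
        return Boolean.parseBoolean(value);
    }

    public static void main(String[] args) {
        System.out.println(isUseMultiTempDir("true"));
        System.out.println(isUseMultiTempDir("not-a-boolean"));
    }
}
```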




[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-25 Thread sraghunandan
Github user sraghunandan commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r129219109
  
--- Diff: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ---
@@ -1310,6 +1310,18 @@
   public static final String CARBON_LEASE_RECOVERY_RETRY_INTERVAL =
       "carbon.lease.recovery.retry.interval";
 
+  /**
+   * whether to use multi directories when loading data,
+   * the main purpose is to avoid single-disk-hot-spot
+   */
+  @CarbonProperty
+  public static final String CARBON_USING_MULTI_TEMP_DIR = "carbon.use.multiple.temp.dir";
--- End diff --

change to CARBON_USE_MULTI_TEMP_DIR




[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-25 Thread sraghunandan
Github user sraghunandan commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r129197372
  
--- Diff: docs/useful-tips-on-carbondata.md ---
@@ -231,5 +231,6 @@ scenarios. After the completion of POC, some of the configurations impacting the
 | spark.executor.instances/spark.executor.cores/spark.executor.memory | spark/conf/spark-defaults.conf | Querying | The number of executors, CPU cores, and memory used for CarbonData query. | In the bank scenario, we provide the 4 CPUs cores and 15 GB for each executor which can get good performance. This 2 value does not mean more the better. It needs to be configured properly in case of limited resources. For example, In the bank scenario, it has enough CPU 32 cores each node but less memory 64 GB each node. So we cannot give more CPU but less memory. For example, when 4 cores and 12GB for each executor. It sometimes happens GC during the query which impact the query performance very much from the 3 second to more than 15 seconds. In this scenario need to increase the memory or decrease the CPU cores. |
 | carbon.detail.batch.size | spark/carbonlib/carbon.properties | Data loading | The buffer size to store records, returned from the block scan. | In limit scenario this parameter is very important. For example your query limit is 1000. But if we set this value to 3000 that means we get 3000 records from scan but spark will only take 1000 rows. So the 2000 remaining are useless. In one Finance test case after we set it to 100, in the limit 1000 scenario the performance increase about 2 times in comparison to if we set this value to 12000. |
 | carbon.use.local.dir | spark/carbonlib/carbon.properties | Data loading | Whether use YARN local directories for multi-table load disk load balance | If this is set it to true CarbonData will use YARN local directories for multi-table load disk load balance, that will improve the data load performance. |
+| carbon.use.multiple.temp.dir | spark/carbonlib/carbon.properties | Data loading | Whether to use multiple YARN local directories during table data loading for disk load balance | After enabling 'carbon.use.local.dir', if this is set to true, CarbonData will use YARN local directories during data load for disk load balance, that will improve the data load performance. Please enable this property especially when you encounter disk hotspot problem during data loading. |
--- End diff --

Say "will use all YARN local directories".




[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-25 Thread sraghunandan
Github user sraghunandan commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r129202230
  
--- Diff: processing/src/main/java/org/apache/carbondata/processing/merger/RowResultMergerProcessor.java ---
@@ -57,12 +57,15 @@
       LogServiceFactory.getLogService(RowResultMergerProcessor.class.getName());
 
   public RowResultMergerProcessor(String databaseName,
-      String tableName, SegmentProperties segProp, String tempStoreLocation,
+      String tableName, SegmentProperties segProp, String[] tempStoreLocation,
       CarbonLoadModel loadModel, CompactionType compactionType) {
     this.segprop = segProp;
-    if (!new File(tempStoreLocation).mkdirs()) {
-      LOGGER.error("Error while new File(tempStoreLocation).mkdirs() ");
+    for (String temLoc : tempStoreLocation) {
+      if (!new File(temLoc).mkdirs()) {
+        LOGGER.error("Error while new File(tempStoreLocation).mkdirs() ");
--- End diff --

Remove "new File(tempStoreLocation).mkdirs()" from the message and log the temLoc instead.




[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-24 Thread bill1208
Github user bill1208 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r129195768
  
--- Diff: processing/src/main/java/org/apache/carbondata/processing/merger/RowResultMergerProcessor.java ---
@@ -57,12 +57,15 @@
       LogServiceFactory.getLogService(RowResultMergerProcessor.class.getName());
 
   public RowResultMergerProcessor(String databaseName,
-      String tableName, SegmentProperties segProp, String tempStoreLocation,
+      String tableName, SegmentProperties segProp, String[] tempStoreLocation,
       CarbonLoadModel loadModel, CompactionType compactionType) {
     this.segprop = segProp;
-    if (!new File(tempStoreLocation).mkdirs()) {
-      LOGGER.error("Error while new File(tempStoreLocation).mkdirs() ");
+    for (String temLoc : tempStoreLocation) {
+      if (!new File(temLoc).mkdirs()) {
+        LOGGER.error("Error while new File(tempStoreLocation).mkdirs() ");
--- End diff --

The log should be LOGGER.error("Error while new File(temLoc).mkdirs() ");




[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-24 Thread sraghunandan
Github user sraghunandan commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r129061078
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/newflow/sort/unsafe/UnsafeSortDataRows.java
 ---
@@ -306,7 +309,9 @@ private void writeData(UnsafeCarbonRowPage rowPage, File file)
    * This method will be used to delete sort temp location is it is exites
    */
   public void deleteSortLocationIfExists() {
-    CarbonDataProcessorUtil.deleteSortLocationIfExists(parameters.getTempFileLocation());
+    for (String loc : parameters.getTempFileLocation()) {
--- End diff --

The method signature of
CarbonDataProcessorUtil.deleteSortLocationIfExists could have been changed to take a list.
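A minimal sketch of the suggested signature change. The class and method names mirror the diff, but the deletion behavior is assumed here; the real `CarbonDataProcessorUtil` may differ:

```java
import java.io.File;

public class SortTempDirUtil {
  // Existing single-location variant (behavior assumed: delete the
  // directory tree if it exists).
  public static void deleteSortLocationIfExists(String location) {
    deleteRecursively(new File(location));
  }

  // Suggested overload: accept all temp locations at once, so call sites
  // such as UnsafeSortDataRows no longer need a per-location loop.
  public static void deleteSortLocationIfExists(String[] locations) {
    for (String loc : locations) {
      deleteSortLocationIfExists(loc);
    }
  }

  private static void deleteRecursively(File file) {
    File[] children = file.listFiles();
    if (children != null) {
      for (File child : children) {
        deleteRecursively(child);
      }
    }
    file.delete();  // no-op if the path does not exist
  }
}
```

With such an overload the caller collapses back to a single line, e.g. `CarbonDataProcessorUtil.deleteSortLocationIfExists(parameters.getTempFileLocation());`.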


---


[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-24 Thread sraghunandan
Github user sraghunandan commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r129062284
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/newflow/steps/DataWriterBatchProcessorStepImpl.java
 ---
@@ -58,12 +58,14 @@ public DataWriterBatchProcessorStepImpl(CarbonDataLoadConfiguration configuratio
     child.initialize();
   }
 
-  private String getStoreLocation(CarbonTableIdentifier tableIdentifier, String partitionId) {
-    String storeLocation = CarbonDataProcessorUtil
+  private String[] getStoreLocation(CarbonTableIdentifier tableIdentifier, String partitionId) {
+    String[] storeLocation = CarbonDataProcessorUtil
         .getLocalDataFolderLocation(tableIdentifier.getDatabaseName(),
             tableIdentifier.getTableName(), String.valueOf(configuration.getTaskNo()), partitionId,
             configuration.getSegmentId() + "", false);
-    new File(storeLocation).mkdirs();
+    for (String loc : storeLocation) {
--- End diff --

Can create a utility method; the same operations repeat in multiple functions.
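A hedged sketch of such a shared helper. The class and method names (`StoreLocationUtil`, `createLocations`) are made up for illustration; the error handling mirrors the per-location `mkdirs()` loops in the diff:

```java
import java.io.File;

public class StoreLocationUtil {
  // Hypothetical shared helper replacing the repeated loops: create every
  // store location, returning false if any directory could not be created.
  public static boolean createLocations(String[] storeLocations) {
    boolean ok = true;
    for (String loc : storeLocations) {
      File dir = new File(loc);
      // mkdirs() returns false when the directory already exists,
      // so treat an existing directory as success too.
      if (!dir.mkdirs() && !dir.isDirectory()) {
        ok = false;
      }
    }
    return ok;
  }
}
```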


---


[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-24 Thread sraghunandan
Github user sraghunandan commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r129057078
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/newflow/sort/impl/ParallelReadMergeSorterWithBucketingImpl.java
 ---
@@ -185,12 +188,17 @@ private boolean processRowToNextStep(SortDataRows[] sortDataRows, SortParameters
   }
 
   private void setTempLocation(SortParameters parameters) {
-    String carbonDataDirectoryPath = CarbonDataProcessorUtil
+    String[] carbonDataDirectoryPath = CarbonDataProcessorUtil
         .getLocalDataFolderLocation(parameters.getDatabaseName(),
             parameters.getTableName(), parameters.getTaskNo(),
             parameters.getPartitionID(), parameters.getSegmentId(), false);
-    parameters.setTempFileLocation(
-        carbonDataDirectoryPath + File.separator + CarbonCommonConstants.SORT_TEMP_FILE_LOCATION);
+    String[] tmpLocs = new String[carbonDataDirectoryPath.length];
--- End diff --

Can move to a common utility function.
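Such a utility might look like the sketch below. The method name and the literal used for `SORT_TEMP_FILE_LOCATION` are assumptions for illustration, not the actual CarbonData code:

```java
import java.io.File;

public class SortTempLocationUtil {
  // Stand-in for CarbonCommonConstants.SORT_TEMP_FILE_LOCATION
  // (the actual literal is assumed here).
  static final String SORT_TEMP_FILE_LOCATION = "sortrowtmp";

  // Hypothetical common helper: append the sort-temp sub-directory to every
  // local data folder, as both setTempLocation implementations do inline.
  public static String[] toSortTempLocations(String[] dataFolderLocations) {
    String[] tmpLocs = new String[dataFolderLocations.length];
    for (int i = 0; i < dataFolderLocations.length; i++) {
      tmpLocs[i] = dataFolderLocations[i] + File.separator + SORT_TEMP_FILE_LOCATION;
    }
    return tmpLocs;
  }
}
```

Both `setTempLocation` variants would then reduce to a single call such as `parameters.setTempFileLocation(toSortTempLocations(carbonDataDirectoryPath));`.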


---


[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-24 Thread sraghunandan
Github user sraghunandan commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r129057384
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/newflow/sort/impl/UnsafeParallelReadMergeSorterWithBucketingImpl.java
 ---
@@ -168,11 +171,15 @@ private boolean processRowToNextStep(UnsafeSortDataRows[] sortDataRows, SortPara
   }
 
   private void setTempLocation(SortParameters parameters) {
-    String carbonDataDirectoryPath = CarbonDataProcessorUtil
+    String[] carbonDataDirectoryPath = CarbonDataProcessorUtil
         .getLocalDataFolderLocation(parameters.getDatabaseName(), parameters.getTableName(),
             parameters.getTaskNo(), parameters.getPartitionID(), parameters.getSegmentId(), false);
-    parameters.setTempFileLocation(
-        carbonDataDirectoryPath + File.separator + CarbonCommonConstants.SORT_TEMP_FILE_LOCATION);
+    String[] tmpLoc = new String[carbonDataDirectoryPath.length];
--- End diff --

Can move to a utility function.


---


[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-20 Thread xuchuanyin
Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r128666796
  
--- Diff: 
integration/spark-common-test/src/test/scala/org/apache/spark/util/SparkUtil4Test.scala
 ---
@@ -0,0 +1,33 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.util
+
+import org.apache.spark.SparkConf
+
+/**
+ * This class is for accessing utils in spark package for tests
+ */
+object SparkUtil4Test {
--- End diff --

CANNOT.

The class `SparkUtil4Test` was created to access functions provided by the
class `Utils` in Spark, whose effective scope is package private, so the
helper must live in a Spark package (`org.apache.spark.util`).

Since the new class `SparkUtil4Test` and the test class are in different
packages, they cannot co-exist in one file.


---


[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-20 Thread xuchuanyin
Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r128665149
  
--- Diff: docs/useful-tips-on-carbondata.md ---
@@ -231,5 +231,6 @@ scenarios. After the completion of POC, some of the 
configurations impacting the
 | spark.executor.instances/spark.executor.cores/spark.executor.memory | 
spark/conf/spark-defaults.conf | Querying | The number of executors, CPU cores, 
and memory used for CarbonData query. | In the bank scenario, we provide the 4 
CPUs cores and 15 GB for each executor which can get good performance. This 2 
value does not mean more the better. It needs to be configured properly in case 
of limited resources. For example, In the bank scenario, it has enough CPU 32 
cores each node but less memory 64 GB each node. So we cannot give more CPU but 
less memory. For example, when 4 cores and 12GB for each executor. It sometimes 
happens GC during the query which impact the query performance very much from 
the 3 second to more than 15 seconds. In this scenario need to increase the 
memory or decrease the CPU cores. |
 | carbon.detail.batch.size | spark/carbonlib/carbon.properties | Data 
loading | The buffer size to store records, returned from the block scan. | In 
limit scenario this parameter is very important. For example your query limit 
is 1000. But if we set this value to 3000 that means we get 3000 records from 
scan but spark will only take 1000 rows. So the 2000 remaining are useless. In 
one Finance test case after we set it to 100, in the limit 1000 scenario the 
performance increase about 2 times in comparison to if we set this value to 
12000. |
 | carbon.use.local.dir | spark/carbonlib/carbon.properties | Data loading 
| Whether use YARN local directories for multi-table load disk load balance | 
If this is set it to true CarbonData will use YARN local directories for 
multi-table load disk load balance, that will improve the data load 
performance. |
+| carbon.use.multiple.temp.dir | spark/carbonlib/carbon.properties | Data 
loading | Whether to use multiple YARN local directories during table data 
loading for disk load balance | After enabling 'carbon.use.local.dir', if this 
is set to true, CarbonData will use YARN local directories during data load for 
disk load balance, that will improve the data load performance. Please enable 
this property especially when you encounter disk hotspot problem during data 
loading. |
--- End diff --

1. The default value is false, the same as for the property
`carbon.use.local.dir`.

2. Currently it only works for `LOAD DATA INPATH` (that's my use case). For
other use cases, such as update/merge/load-using-global-sort, I haven't
studied or changed the related code, so leaving this property at its default
will not change the existing behavior.


---


[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-20 Thread xuchuanyin
Github user xuchuanyin commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r128664527
  
--- Diff: 
integration/spark-common-test/src/test/resources/multi_dir/data.csv ---
@@ -0,0 +1,10 @@
+1,name1
+2,name2
--- End diff --

OK, I'll find a suitable one.


---


[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-20 Thread chenliang613
Github user chenliang613 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r128560327
  
--- Diff: 
integration/spark-common-test/src/test/scala/org/apache/spark/util/SparkUtil4Test.scala
 ---
@@ -0,0 +1,33 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.util
+
+import org.apache.spark.SparkConf
+
+/**
+ * This class is for accessing utils in spark package for tests
+ */
+object SparkUtil4Test {
--- End diff --

Can the functions in SparkUtil4Test.scala be merged into
TestLoadDataWithYarnLocalDirs.scala? They are only used by
TestLoadDataWithYarnLocalDirs.scala.


---


[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-20 Thread chenliang613
Github user chenliang613 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r128558208
  
--- Diff: 
integration/spark-common-test/src/test/resources/multi_dir/data.csv ---
@@ -0,0 +1,10 @@
+1,name1
+2,name2
--- End diff --

Can you reuse an existing test CSV file? We don't suggest adding a new CSV file.


---


[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-20 Thread chenliang613
Github user chenliang613 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r128556528
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/newflow/sort/unsafe/UnsafeSortDataRows.java
 ---
@@ -345,9 +350,12 @@ public DataSorterAndWriter(UnsafeCarbonRowPage rowPage) {
         }
         if (rowPage.isSaveToDisk()) {
           // create a new file every time
+          // create a new file and pick a temp directory randomly every time
--- End diff --

Please merge the two comments into one.


---


[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-20 Thread chenliang613
Github user chenliang613 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r128555999
  
--- Diff: docs/useful-tips-on-carbondata.md ---
@@ -231,5 +231,6 @@ scenarios. After the completion of POC, some of the 
configurations impacting the
 | spark.executor.instances/spark.executor.cores/spark.executor.memory | 
spark/conf/spark-defaults.conf | Querying | The number of executors, CPU cores, 
and memory used for CarbonData query. | In the bank scenario, we provide the 4 
CPUs cores and 15 GB for each executor which can get good performance. This 2 
value does not mean more the better. It needs to be configured properly in case 
of limited resources. For example, In the bank scenario, it has enough CPU 32 
cores each node but less memory 64 GB each node. So we cannot give more CPU but 
less memory. For example, when 4 cores and 12GB for each executor. It sometimes 
happens GC during the query which impact the query performance very much from 
the 3 second to more than 15 seconds. In this scenario need to increase the 
memory or decrease the CPU cores. |
 | carbon.detail.batch.size | spark/carbonlib/carbon.properties | Data 
loading | The buffer size to store records, returned from the block scan. | In 
limit scenario this parameter is very important. For example your query limit 
is 1000. But if we set this value to 3000 that means we get 3000 records from 
scan but spark will only take 1000 rows. So the 2000 remaining are useless. In 
one Finance test case after we set it to 100, in the limit 1000 scenario the 
performance increase about 2 times in comparison to if we set this value to 
12000. |
 | carbon.use.local.dir | spark/carbonlib/carbon.properties | Data loading 
| Whether use YARN local directories for multi-table load disk load balance | 
If this is set it to true CarbonData will use YARN local directories for 
multi-table load disk load balance, that will improve the data load 
performance. |
+| carbon.use.multiple.temp.dir | spark/carbonlib/carbon.properties | Data 
loading | Whether to use multiple YARN local directories during table data 
loading for disk load balance | After enabling 'carbon.use.local.dir', if this 
is set to true, CarbonData will use YARN local directories during data load for 
disk load balance, that will improve the data load performance. Please enable 
this property especially when you encounter disk hotspot problem during data 
loading. |
--- End diff --

So, by default, do you propose to set it to "true" or "false"?


---


[GitHub] carbondata pull request #1177: [CARBONDATA-1281] Support multiple temp dirs ...

2017-07-18 Thread chenliang613
Github user chenliang613 commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1177#discussion_r128147381
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 ---
@@ -1308,6 +1308,18 @@
   public static final String CARBON_LEASE_RECOVERY_RETRY_INTERVAL =
       "carbon.lease.recovery.retry.interval";
 
+  /**
+   * whether to use multi directories when loading data,
+   * the main purpose is to avoid single-disk-hot-spot
+   */
+  @CarbonProperty
+  public static final String CARBON_USING_MULTI_TEMP_DIR = "carbon.using.multi.temp.dir";
+
+  /**
+   * default value for multi temp dir
+   */
+  public static final String CARBON_SUING_MULTI_TEMP_DIR_DEFAULT = "false";
--- End diff --

CARBON_SUING_MULTI_TEMP_DIR_DEFAULT or CARBON_USING_MULTI_TEMP_DIR_DEFAULT?
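For reference, a sketch of the two constants with the `SUING` typo corrected to `USING`, matching the property-key constant above it. The wrapping class name is made up, and the intended final naming is for the PR author to confirm:

```java
public class MultiTempDirConstants {
  // Property key, as written in the diff above.
  public static final String CARBON_USING_MULTI_TEMP_DIR = "carbon.using.multi.temp.dir";

  // Default-value constant, renamed from CARBON_SUING_... to CARBON_USING_...
  public static final String CARBON_USING_MULTI_TEMP_DIR_DEFAULT = "false";
}
```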


---