[GitHub] [carbondata] kunal642 commented on a change in pull request #3995: [CARBONDATA-4043] Fix data load failure issue for columns added in legacy store

2020-10-28 Thread GitBox


kunal642 commented on a change in pull request #3995:
URL: https://github.com/apache/carbondata/pull/3995#discussion_r513983814



##
File path: processing/src/main/java/org/apache/carbondata/processing/util/CarbonDataProcessorUtil.java
##
@@ -424,6 +440,38 @@ public static boolean isHeaderValid(String tableName, String[] csvHeader,
 return noDicSortColMapping;
   }
 
+  /**
+   * Get the sort/no_sort column map based on schema order.
+   * This will be used in the final sort step to find the index of sort column, to compare the
+   * intermediate row data based on schema.
+   */
+  public static Map> getSortColSchemaOrderMapping(CarbonTable carbonTable) {

Review comment:
   Please add unit tests for all the new methods.
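
   For illustration, a minimal JUnit sketch of the kind of test being requested; the real method takes a CarbonTable, so the mapping logic below is a simplified stand-in and the column names are hypothetical:

```java
import java.util.*;
import org.junit.Test;
import static org.junit.Assert.assertEquals;

public class SortColSchemaOrderMappingTest {
  // Simplified stand-in for getSortColSchemaOrderMapping: flags each
  // schema-ordered column as sort/no_sort, preserving schema order.
  static Map<String, Boolean> sortColSchemaOrderMapping(
      List<String> schemaOrderColumns, Set<String> sortColumns) {
    Map<String, Boolean> mapping = new LinkedHashMap<>();
    for (String column : schemaOrderColumns) {
      mapping.put(column, sortColumns.contains(column));
    }
    return mapping;
  }

  @Test
  public void shouldFlagSortColumnsInSchemaOrder() {
    Map<String, Boolean> mapping = sortColSchemaOrderMapping(
        Arrays.asList("id", "name", "city"),
        new HashSet<>(Collections.singletonList("name")));
    // schema order is preserved, and only "name" is a sort column
    assertEquals(Arrays.asList("id", "name", "city"), new ArrayList<>(mapping.keySet()));
    assertEquals(Boolean.TRUE, mapping.get("name"));
    assertEquals(Boolean.FALSE, mapping.get("city"));
  }
}
```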









[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3999: [CARBONDATA-4044] Fix dirty data in indexfile while IUD with stale data in segment folder

2020-10-28 Thread GitBox


CarbonDataQA1 commented on pull request #3999:
URL: https://github.com/apache/carbondata/pull/3999#issuecomment-718352690


   Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4719/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3999: [CARBONDATA-4044] Fix dirty data in indexfile while IUD with stale data in segment folder

2020-10-28 Thread GitBox


CarbonDataQA1 commented on pull request #3999:
URL: https://github.com/apache/carbondata/pull/3999#issuecomment-718352364


   Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2962/
   







[GitHub] [carbondata] QiangCai commented on a change in pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-28 Thread GitBox


QiangCai commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r513891217



##
File path: core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentStatusManager.java
##
@@ -1136,7 +1137,8 @@ public static void deleteLoadsAndUpdateMetadata(CarbonTable carbonTable, boolean
   if (updateCompletionStatus) {
 DeleteLoadFolders
 .physicalFactAndMeasureMetadataDeletion(carbonTable, newAddedLoadHistoryList,
-isForceDeletion, partitionSpecs);
+isForceDeletion, partitionSpecs, String.valueOf(new Timestamp(System

Review comment:
   If this method is used by clean files only, can we move it to CleanFilesUtil?









[GitHub] [carbondata] QiangCai commented on a change in pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-28 Thread GitBox


QiangCai commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r513889604



##
File path: integration/spark/src/main/scala/org/apache/carbondata/cleanfiles/CleanFilesUtil.scala
##
@@ -0,0 +1,259 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.cleanfiles
+
+import java.util.concurrent.{Executors, ScheduledExecutorService, TimeUnit}
+
+import scala.collection.JavaConverters._
+
+import org.apache.hadoop.fs.permission.{FsAction, FsPermission}
+
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile
+import org.apache.carbondata.core.datastore.impl.FileFactory
+import org.apache.carbondata.core.indexstore.PartitionSpec
+import org.apache.carbondata.core.locks.{CarbonLockUtil, ICarbonLock, LockUsage}
+import org.apache.carbondata.core.metadata.{AbsoluteTableIdentifier, CarbonMetadata, SegmentFileStore}
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.mutate.CarbonUpdateUtil
+import org.apache.carbondata.core.statusmanager.{SegmentStatus, SegmentStatusManager}
+import org.apache.carbondata.core.util.{CarbonProperties, CarbonUtil}
+import org.apache.carbondata.core.util.path.CarbonTablePath
+
+object CleanFilesUtil {
+  private val LOGGER = LogServiceFactory.getLogService(this.getClass.getCanonicalName)
+
+  /**
+   * The method deletes all data if forceTableClean is true and cleans garbage segments
+   * (in MARKED_FOR_DELETE state) if forceTableClean is false.
+   *
+   * @param dbName : Database name
+   * @param tableName  : Table name
+   * @param tablePath  : Table path
+   * @param carbonTable: CarbonTable object (can be null in case of force clean)
+   * @param forceTableClean: for force clean it will delete all data;
+   *otherwise it will clean garbage segments (MARKED_FOR_DELETE state)
+   * @param currentTablePartitions : Hive partition details
+   */
+  def cleanFiles(
+  dbName: String,
+  tableName: String,
+  tablePath: String,
+  carbonTable: CarbonTable,
+  forceTableClean: Boolean,
+  currentTablePartitions: Option[Seq[PartitionSpec]] = None,
+  truncateTable: Boolean = false): Unit = {
+var carbonCleanFilesLock: ICarbonLock = null
+val absoluteTableIdentifier = if (forceTableClean) {
+  AbsoluteTableIdentifier.from(tablePath, dbName, tableName, tableName)
+} else {
+  carbonTable.getAbsoluteTableIdentifier
+}
+try {
+  val errorMsg = "Clean files request is failed for " +
+s"$dbName.$tableName" +
+". Not able to acquire the clean files lock due to another clean files " +
+"operation is running in the background."
+  // in case of force clean the lock is not required
+  if (forceTableClean) {
+FileFactory.deleteAllCarbonFilesOfDir(
+  FileFactory.getCarbonFile(absoluteTableIdentifier.getTablePath))
+  } else {
+carbonCleanFilesLock =
+  CarbonLockUtil
+.getLockObject(absoluteTableIdentifier, LockUsage.CLEAN_FILES_LOCK, errorMsg)
+if (truncateTable) {
+  SegmentStatusManager.truncateTable(carbonTable)

Review comment:
   truncateTable is a function, maybe we need it.









[GitHub] [carbondata] QiangCai commented on a change in pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-28 Thread GitBox


QiangCai commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r513889893



##
File path: integration/spark/src/main/scala/org/apache/carbondata/cleanfiles/CleanFilesUtil.scala
##
@@ -0,0 +1,400 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.cleanfiles
+
+import java.util
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable.ListBuffer
+
+import org.apache.spark.sql.{AnalysisException, CarbonEnv, Row, SparkSession}
+import org.apache.spark.sql.index.CarbonIndexUtil
+
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile
+import org.apache.carbondata.core.datastore.impl.FileFactory
+import org.apache.carbondata.core.exception.ConcurrentOperationException
+import org.apache.carbondata.core.indexstore.PartitionSpec
+import org.apache.carbondata.core.locks.{CarbonLockFactory, CarbonLockUtil, ICarbonLock, LockUsage}
+import org.apache.carbondata.core.metadata.{AbsoluteTableIdentifier, CarbonMetadata, SegmentFileStore}
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.mutate.CarbonUpdateUtil
+import org.apache.carbondata.core.statusmanager.{LoadMetadataDetails, SegmentStatus, SegmentStatusManager}
+import org.apache.carbondata.core.util.{CarbonProperties, CarbonUtil}
+import org.apache.carbondata.core.util.path.{CarbonTablePath, TrashUtil}
+import org.apache.carbondata.processing.loading.TableProcessingOperations
+
+object CleanFilesUtil {
+  private val LOGGER = LogServiceFactory.getLogService(this.getClass.getCanonicalName)
+
+  /**
+   * The method deletes all data if forceTableClean is true and cleans garbage segments
+   * (in MARKED_FOR_DELETE state) if forceTableClean is false.
+   *
+   * @param dbName : Database name
+   * @param tableName  : Table name
+   * @param tablePath  : Table path
+   * @param carbonTable: CarbonTable object (can be null in case of force clean)
+   * @param forceTableClean: for force clean it will delete all data;
+   *otherwise it will clean garbage segments (MARKED_FOR_DELETE state)
+   * @param currentTablePartitions : Hive partition details
+   */
+  def cleanFiles(
+dbName: String,
+tableName: String,
+tablePath: String,
+carbonTable: CarbonTable,
+forceTableClean: Boolean,
+currentTablePartitions: Option[Seq[PartitionSpec]] = None,
+truncateTable: Boolean = false): Unit = {
+var carbonCleanFilesLock: ICarbonLock = null
+val absoluteTableIdentifier = if (forceTableClean) {
+  AbsoluteTableIdentifier.from(tablePath, dbName, tableName, tableName)
+} else {
+  carbonTable.getAbsoluteTableIdentifier
+}
+try {
+  val errorMsg = "Clean files request is failed for " +
+s"$dbName.$tableName" +
+". Not able to acquire the clean files lock due to another clean files " +
+"operation is running in the background."
+  // in case of force clean the lock is not required
+  if (forceTableClean) {
+FileFactory.deleteAllCarbonFilesOfDir(
+  FileFactory.getCarbonFile(absoluteTableIdentifier.getTablePath))
+  } else {
+carbonCleanFilesLock =
+  CarbonLockUtil
+.getLockObject(absoluteTableIdentifier, LockUsage.CLEAN_FILES_LOCK, errorMsg)
+if (truncateTable) {
+  SegmentStatusManager.truncateTable(carbonTable)
+}
+SegmentStatusManager.deleteLoadsAndUpdateMetadata(
+  carbonTable, true, currentTablePartitions.map(_.asJava).orNull)
+CarbonUpdateUtil.cleanUpDeltaFiles(carbonTable, true)

Review comment:
   Is there a conflict with update/delete/merge?
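
   A self-contained sketch of the guard this question is about: if clean files and IUD operations contend for the same table-level lock, delta-file cleanup cannot race a concurrent update/delete/merge. The lock and method names below are illustrative, not CarbonData's API:

```java
import java.util.concurrent.locks.ReentrantLock;

public class CleanFilesLockSketch {
  // one table-level lock shared by clean files and update/delete/merge
  private static final ReentrantLock TABLE_IUD_LOCK = new ReentrantLock();

  static void cleanUpDeltaFiles(Runnable cleanup) {
    if (!TABLE_IUD_LOCK.tryLock()) {
      // an update/delete/merge holds the lock: fail fast instead of
      // deleting delta files the concurrent operation still needs
      throw new IllegalStateException("IUD operation in progress, retry clean files later");
    }
    try {
      cleanup.run();
    } finally {
      TABLE_IUD_LOCK.unlock();
    }
  }

  public static void main(String[] args) {
    cleanUpDeltaFiles(() -> System.out.println("removing stale delete-delta files"));
  }
}
```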









[GitHub] [carbondata] QiangCai commented on a change in pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-28 Thread GitBox


QiangCai commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r513889155



##
File path: integration/spark/src/main/scala/org/apache/carbondata/cleanfiles/CleanFilesUtil.scala
##
@@ -0,0 +1,259 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.cleanfiles
+
+import java.util.concurrent.{Executors, ScheduledExecutorService, TimeUnit}
+
+import scala.collection.JavaConverters._
+
+import org.apache.hadoop.fs.permission.{FsAction, FsPermission}
+
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile
+import org.apache.carbondata.core.datastore.impl.FileFactory
+import org.apache.carbondata.core.indexstore.PartitionSpec
+import org.apache.carbondata.core.locks.{CarbonLockUtil, ICarbonLock, LockUsage}
+import org.apache.carbondata.core.metadata.{AbsoluteTableIdentifier, CarbonMetadata, SegmentFileStore}
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.mutate.CarbonUpdateUtil
+import org.apache.carbondata.core.statusmanager.{SegmentStatus, SegmentStatusManager}
+import org.apache.carbondata.core.util.{CarbonProperties, CarbonUtil}
+import org.apache.carbondata.core.util.path.CarbonTablePath
+
+object CleanFilesUtil {
+  private val LOGGER = LogServiceFactory.getLogService(this.getClass.getCanonicalName)
+
+  /**
+   * The method deletes all data if forceTableClean is true and cleans garbage segments
+   * (in MARKED_FOR_DELETE state) if forceTableClean is false.
+   *
+   * @param dbName : Database name
+   * @param tableName  : Table name
+   * @param tablePath  : Table path
+   * @param carbonTable: CarbonTable object (can be null in case of force clean)
+   * @param forceTableClean: for force clean it will delete all data;
+   *otherwise it will clean garbage segments (MARKED_FOR_DELETE state)
+   * @param currentTablePartitions : Hive partition details
+   */
+  def cleanFiles(
+  dbName: String,
+  tableName: String,
+  tablePath: String,
+  carbonTable: CarbonTable,
+  forceTableClean: Boolean,
+  currentTablePartitions: Option[Seq[PartitionSpec]] = None,
+  truncateTable: Boolean = false): Unit = {
+var carbonCleanFilesLock: ICarbonLock = null
+val absoluteTableIdentifier = if (forceTableClean) {
+  AbsoluteTableIdentifier.from(tablePath, dbName, tableName, tableName)
+} else {
+  carbonTable.getAbsoluteTableIdentifier
+}
+try {
+  val errorMsg = "Clean files request is failed for " +
+s"$dbName.$tableName" +
+". Not able to acquire the clean files lock due to another clean files " +
+"operation is running in the background."
+  // in case of force clean the lock is not required
+  if (forceTableClean) {

Review comment:
   If we do not use it, we can remove it.









[GitHub] [carbondata] QiangCai commented on a change in pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-28 Thread GitBox


QiangCai commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r513884891



##
File path: core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java
##
@@ -1105,28 +1109,79 @@ public static void cleanSegments(CarbonTable table, List partitio
* @throws IOException
*/
   public static void deleteSegment(String tablePath, Segment segment,
-  List partitionSpecs,
-  SegmentUpdateStatusManager updateStatusManager) throws Exception {
+  List partitionSpecs, SegmentUpdateStatusManager updateStatusManager,
+  SegmentStatus segmentStatus, Boolean isPartitionTable, String timeStamp)
+  throws Exception {
 SegmentFileStore fileStore = new SegmentFileStore(tablePath, segment.getSegmentFileName());
 List indexOrMergeFiles = fileStore.readIndexFiles(SegmentStatus.SUCCESS, true,
 FileFactory.getConfiguration());
+List filesToDelete = new ArrayList<>();
 Map> indexFilesMap = fileStore.getIndexFilesMap();
 for (Map.Entry> entry : indexFilesMap.entrySet()) {
-  FileFactory.deleteFile(entry.getKey());
+  // Move the file to the trash folder in case the segment status is insert in progress
+  if (segmentStatus == SegmentStatus.INSERT_IN_PROGRESS) {
+if (!isPartitionTable) {
+  TrashUtil.moveDataToTrashFolderByFile(tablePath, entry.getKey(), timeStamp +
+  CarbonCommonConstants.FILE_SEPARATOR + CarbonCommonConstants.LOAD_FOLDER + segment
+  .getSegmentNo());
+} else {
+  TrashUtil.moveDataToTrashFolderByFile(tablePath, entry.getKey(), timeStamp +

Review comment:
   ok









[GitHub] [carbondata] QiangCai commented on a change in pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-28 Thread GitBox


QiangCai commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r513883725



##
File path: core/src/main/java/org/apache/carbondata/core/util/DeleteLoadFolders.java
##
@@ -192,11 +208,17 @@ private static boolean checkIfLoadCanBeDeleted(LoadMetadataDetails oneLoad,
   }
 
   private static boolean checkIfLoadCanBeDeletedPhysically(LoadMetadataDetails oneLoad,
-  boolean isForceDelete) {
+  boolean isForceDelete, AbsoluteTableIdentifier absoluteTableIdentifier) {
 // Check if the segment is added externally and path is set then do not delete it
 if ((SegmentStatus.MARKED_FOR_DELETE == oneLoad.getSegmentStatus()
-|| SegmentStatus.COMPACTED == oneLoad.getSegmentStatus()) && (oneLoad.getPath() == null
+|| SegmentStatus.COMPACTED == oneLoad.getSegmentStatus() || SegmentStatus
+.INSERT_IN_PROGRESS == oneLoad.getSegmentStatus()) && (oneLoad.getPath() == null

Review comment:
   About clean files: because cleaning different load statuses carries different risk levels, in my opinion each load status should have its own expiration time for keeping data at the original place.
   
   1. MARKED_FOR_DELETE and COMPACTED => MaxQueryTimeout is enough
   
   2. INSERT_IN_PROGRESS and INSERT_OVERWRITE_IN_PROGRESS => more than the data processing time (maybe 3 days)
  If loading or compaction takes longer than LockingTimeout and MaxQueryTimeout, cleaning in-progress segments carries a big risk.
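
   A minimal sketch of the status-dependent expiration this comment proposes; the status names mirror the diff above, but the enum, timeouts, and method are simplified assumptions:

```java
import java.util.concurrent.TimeUnit;

public class RetentionSketch {
  enum Status { MARKED_FOR_DELETE, COMPACTED, INSERT_IN_PROGRESS, INSERT_OVERWRITE_IN_PROGRESS }

  // illustrative timeouts: low-risk statuses expire after the max query
  // timeout, in-progress statuses only after a much longer processing window
  static final long MAX_QUERY_TIMEOUT_MS = TimeUnit.HOURS.toMillis(1);
  static final long PROCESSING_WINDOW_MS = TimeUnit.DAYS.toMillis(3);

  static boolean canBeCleaned(Status status, long loadModifiedTimeMs, long nowMs) {
    long retention = (status == Status.MARKED_FOR_DELETE || status == Status.COMPACTED)
        ? MAX_QUERY_TIMEOUT_MS
        : PROCESSING_WINDOW_MS;
    return nowMs - loadModifiedTimeMs > retention;
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    // a 2-hour-old MARKED_FOR_DELETE load can go; a 2-hour-old in-progress load cannot
    System.out.println(canBeCleaned(Status.MARKED_FOR_DELETE, now - TimeUnit.HOURS.toMillis(2), now));
    System.out.println(canBeCleaned(Status.INSERT_IN_PROGRESS, now - TimeUnit.HOURS.toMillis(2), now));
  }
}
```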









[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-28 Thread GitBox


CarbonDataQA1 commented on pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#issuecomment-717883448


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2960/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3999: [CARBONDATA-4044] Fix dirty data in indexfile while IUD with stale data in segment folder

2020-10-28 Thread GitBox


CarbonDataQA1 commented on pull request #3999:
URL: https://github.com/apache/carbondata/pull/3999#issuecomment-717883291


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2957/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update

2020-10-28 Thread GitBox


CarbonDataQA1 commented on pull request #3986:
URL: https://github.com/apache/carbondata/pull/3986#issuecomment-717880662


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2959/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update

2020-10-28 Thread GitBox


CarbonDataQA1 commented on pull request #3986:
URL: https://github.com/apache/carbondata/pull/3986#issuecomment-717880461


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4716/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-28 Thread GitBox


CarbonDataQA1 commented on pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#issuecomment-717879695


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4717/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3995: [CARBONDATA-4043] Fix data load failure issue for columns added in legacy store

2020-10-28 Thread GitBox


CarbonDataQA1 commented on pull request #3995:
URL: https://github.com/apache/carbondata/pull/3995#issuecomment-717879374


   Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2958/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3995: [CARBONDATA-4043] Fix data load failure issue for columns added in legacy store

2020-10-28 Thread GitBox


CarbonDataQA1 commented on pull request #3995:
URL: https://github.com/apache/carbondata/pull/3995#issuecomment-717879514


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4715/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3999: [CARBONDATA-4044] Fix dirty data in indexfile while IUD with stale data in segment folder

2020-10-28 Thread GitBox


CarbonDataQA1 commented on pull request #3999:
URL: https://github.com/apache/carbondata/pull/3999#issuecomment-717879622


   Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4714/
   







[GitHub] [carbondata] Indhumathi27 commented on pull request #3995: [CARBONDATA-4043] Fix data load failure issue for columns added in legacy store

2020-10-28 Thread GitBox


Indhumathi27 commented on pull request #3995:
URL: https://github.com/apache/carbondata/pull/3995#issuecomment-717788093


   retest this please







[GitHub] [carbondata] akashrn5 commented on a change in pull request #3988: [CARBONDATA-4037] Improve the table status and segment file writing

2020-10-28 Thread GitBox


akashrn5 commented on a change in pull request #3988:
URL: https://github.com/apache/carbondata/pull/3988#discussion_r513210534



##
File path: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
##
@@ -361,6 +361,11 @@ private CarbonCommonConstants() {
 
   public static final String CARBON_INDEX_SCHEMA_STORAGE_DATABASE = "DATABASE";
 
+  @CarbonProperty
+  public static final String CARBON_INDEXFILES_DELETETIME = "carbon.index.delete.time";

Review comment:
   1. Since this is a user-exposed property, millisecond level is not good; please change it to seconds level.
   2. 
   ```suggestion
 public static final String CARBON_INDEXFILES_DELETETIME_IN_SECONDS = "carbon.index.files.delete.time.seconds";
   ```
   3. Please add proper comments.
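
   A sketch of the seconds-level handling being requested; the property key follows the suggestion above, while the lookup mechanism and default value are illustrative assumptions:

```java
import java.util.concurrent.TimeUnit;

public class IndexDeleteTimeSketch {
  public static void main(String[] args) {
    // user-facing value is in seconds; convert once at the boundary and
    // keep milliseconds internal
    String configured = System.getProperty("carbon.index.files.delete.time.seconds", "300");
    long deleteTimeMillis = TimeUnit.SECONDS.toMillis(Long.parseLong(configured));
    System.out.println("index files older than " + deleteTimeMillis + " ms are deleted");
  }
}
```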

##
File path: core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java
##
@@ -464,6 +472,38 @@ public boolean accept(CarbonFile file) {
 return null;
   }
 
+  public static List getMergedIndexFiles(CarbonFile[] indexFiles) throws IOException {

Review comment:
   Can you please add a comment to this method with an example, so that it is clear why it is required? Right now it is a little confusing. Also add comments in the caller if required. Then we will have one more review.
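
   To illustrate the kind of comment/example being asked for, a simplified sketch of what the surrounding diffs suggest getMergedIndexFiles does: when a segment holds both .carbonindex files and a .carbonindexmerge file, the plain index files are already covered by the merge index and can be dropped. The file names and selection rule here are assumptions, not the real implementation:

```java
import java.util.*;

public class MergedIndexSketch {
  // returns the .carbonindex files that are superseded by a merge index;
  // an empty result means there is no merge index yet
  static List<String> mergedIndexFiles(List<String> indexFileNames) {
    boolean hasMergeIndex = indexFileNames.stream()
        .anyMatch(name -> name.endsWith(".carbonindexmerge"));
    if (!hasMergeIndex) {
      return Collections.emptyList();
    }
    List<String> merged = new ArrayList<>();
    for (String name : indexFileNames) {
      if (name.endsWith(".carbonindex")) {
        merged.add(name); // covered by the merge index, safe to drop
      }
    }
    return merged;
  }

  public static void main(String[] args) {
    // prints the two plain index files; the merge index itself is kept
    System.out.println(mergedIndexFiles(
        Arrays.asList("0-1.carbonindex", "1-1.carbonindex", "0.carbonindexmerge")));
  }
}
```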

##
File path: core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java
##
@@ -306,11 +306,15 @@ public static boolean writeSegmentFile(CarbonTable carbonTable, Segment segment)
   folderDetails.setStatus(SegmentStatus.SUCCESS.getMessage());
   folderDetails.setRelative(false);
   segmentFile.addPath(segment.getSegmentPath(), folderDetails);
+  List mergedIndexFiles = getMergedIndexFiles(indexFiles);

Review comment:
   I think duplicate code is present in some three places, so please try to refactor it.

##
File path: core/src/main/java/org/apache/carbondata/core/readcommitter/TableStatusReadCommittedScope.java
##
@@ -91,6 +95,19 @@ public TableStatusReadCommittedScope(AbsoluteTableIdentifier identifier,
 
segment.setSegmentMetaDataInfo(fileStore.getSegmentFile().getSegmentMetaDataInfo());
   }
 }
+List index = new ArrayList<>(indexFiles.keySet());
+CarbonFile[] carbonIndexFiles = new CarbonFile[index.size()];
+for (int i = 0; i < index.size(); i++) {
+  carbonIndexFiles[i] = FileFactory.getCarbonFile(index.get(i));
+}

Review comment:
   same as above

##
File path: integration/spark/src/main/scala/org/apache/carbondata/api/CarbonStore.scala
##
@@ -353,6 +356,79 @@ object CarbonStore {
 }
   }
 
+  /**
+   * Clean invalid and expired index files of carbon table.
+   *
+   * @param carbonTable CarbonTable

Review comment:
   Remove the variable description; it is straightforward. Just keep the method-level comment.

##
File path: integration/spark/src/main/scala/org/apache/carbondata/api/CarbonStore.scala
##
@@ -353,6 +356,79 @@ object CarbonStore {
 }
   }
 
+  /**
+   * Clean invalid and expired index files of carbon table.
+   *
+   * @param carbonTable CarbonTable
+   */
+  def cleanUpIndexFiles(carbonTable: CarbonTable): Unit = {
+val validSegments = new SegmentStatusManager(carbonTable.getAbsoluteTableIdentifier)
+  .getValidAndInvalidSegments.getValidSegments.asScala.toList
+validSegments.foreach( segment => {
+  val sfs: SegmentFileStore = new SegmentFileStore(carbonTable.getTablePath,
+segment.getSegmentFileName)
+  var indexFiles = List[CarbonFile]()
+  if (carbonTable.isHivePartitionTable) {
+val partitionSpecs = sfs.getPartitionSpecs.asScala.toList
+val segmentName = segment.getSegmentFileName.replace(CarbonTablePath.SEGMENT_EXT, "")
+for (partitionSpec <- partitionSpecs) {
+  var carbonIndexFiles = SegmentIndexFileStore
+.getCarbonIndexFiles(partitionSpec.getLocation.toString, FileFactory.getConfiguration)

Review comment:
   `SegmentIndexFileStore.getCarbonIndexFiles` lists the files at a path, which will be slow. Why do you need to use it? Can't we get the index files from `sfs`?
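
   A small sketch of the alternative the reviewer hints at: the segment file store already holds an index-file map in memory (mirroring the indexFilesMap seen in the other diffs), so the names can be read from it without a file-system listing. The map shape and names are assumptions:

```java
import java.util.*;

public class IndexFileLookupSketch {
  // index file path -> carbondata files it covers, as loaded from segment metadata
  static List<String> indexFilesFromMetadata(Map<String, List<String>> indexFilesMap) {
    // no listFiles() round trip against the store: just the in-memory keys
    return new ArrayList<>(indexFilesMap.keySet());
  }

  public static void main(String[] args) {
    Map<String, List<String>> map = new LinkedHashMap<>();
    map.put("/table/Fact/Part0/Segment_0/0-1.carbonindex",
        Collections.singletonList("part-0-0.carbondata"));
    System.out.println(indexFilesFromMetadata(map));
  }
}
```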

##
File path: core/src/main/java/org/apache/carbondata/core/metadata/SegmentFileStore.java
##
@@ -464,6 +472,38 @@ public boolean accept(CarbonFile file) {
 return null;
   }
 
+  public static List getMergedIndexFiles(CarbonFile[] indexFiles) throws IOException {
+SegmentIndexFileStore indexFileStore = new SegmentIndexFileStore();
+List mergedIndexFiles = new ArrayList<>();
+long lastModifiedTime = 0L;
+long length = 0L;
+CarbonFile validMergeIndexFile = null;
+List mergeIndexCarbonFiles = new ArrayList<>();
+for (CarbonFile file : indexFiles) {

Review comment:
   ```suggestion
   for (CarbonFile indexFile : indexFiles) {
   ```

##
File path: 

[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-28 Thread GitBox


vikramahuja1001 commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r513252427



##
File path: core/src/main/java/org/apache/carbondata/core/util/DeleteLoadFolders.java
##
@@ -138,8 +143,19 @@ public boolean accept(CarbonFile file) {
   if (filesToBeDeleted.length == 0) {
 status = true;
   } else {
-
 for (CarbonFile eachFile : filesToBeDeleted) {
+  // If the file to be deleted is a carbondata file, index file, index merge file
+  // or a delta file, copy that file to the trash folder.
+  if ((eachFile.getName().endsWith(CarbonCommonConstants.FACT_FILE_EXT) ||

Review comment:
   Changed; it's not needed anymore.









[GitHub] [carbondata] shenjiayu17 commented on a change in pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update

2020-10-28 Thread GitBox


shenjiayu17 commented on a change in pull request #3986:
URL: https://github.com/apache/carbondata/pull/3986#discussion_r513249713



##
File path: processing/src/main/java/org/apache/carbondata/processing/merger/CarbonDataMergerUtil.java
##
@@ -1138,73 +1126,43 @@ private static Boolean checkUpdateDeltaFilesInSeg(Segment seg,
   }
 
   /**
-   * Check is the segment passed qualifies for IUD delete delta compaction or not i.e.
-   * if the number of delete delta files present in the segment is more than
-   * numberDeltaFilesThreshold.
+   * Check whether the segment passed qualifies for IUD delete delta compaction or not,
+   * i.e., if the number of delete delta files present in the segment is more than
+   * numberDeltaFilesThreshold, this segment will be selected.
*
-   * @param seg
-   * @param segmentUpdateStatusManager
-   * @param numberDeltaFilesThreshold
-   * @return
+   * @param seg segment to be qualified
+   * @param segmentUpdateStatusManager segments & blocks details management
+   * @param numberDeltaFilesThreshold threshold of delete delta files
+   * @return block list of the segment
*/
-  private static boolean checkDeleteDeltaFilesInSeg(Segment seg,
+  private static List checkDeleteDeltaFilesInSeg(Segment seg,
   SegmentUpdateStatusManager segmentUpdateStatusManager, int numberDeltaFilesThreshold) {
 
+List blockLists = new ArrayList<>();
 Set uniqueBlocks = new HashSet();
 List blockNameList =
 segmentUpdateStatusManager.getBlockNameFromSegment(seg.getSegmentNo());
-
-for (final String blockName : blockNameList) {
-
-  CarbonFile[] deleteDeltaFiles =
+for (String blockName : blockNameList) {
+  List deleteDeltaFiles =
   segmentUpdateStatusManager.getDeleteDeltaFilesList(seg, blockName);
-  if (null != deleteDeltaFiles) {
+  if (null != deleteDeltaFiles && deleteDeltaFiles.size() > numberDeltaFilesThreshold) {

Review comment:
   Added test case "test IUD Horizontal Compaction when CarbonCommonConstants.DELETE_DELTAFILE_COUNT_THRESHOLD_IUD_COMPACTION >= 3" in HorizontalCompactionTestCase









[GitHub] [carbondata] shenjiayu17 commented on a change in pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update

2020-10-28 Thread GitBox


shenjiayu17 commented on a change in pull request #3986:
URL: https://github.com/apache/carbondata/pull/3986#discussion_r513249194



##
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/HorizontalCompaction.scala
##
@@ -130,6 +130,7 @@ object HorizontalCompaction {
   absTableIdentifier,
   segmentUpdateStatusManager,
   compactionTypeIUD)
+LOG.debug(s"The segment list for Horizontal Update Compaction is 
$validSegList")

Review comment:
   Done

##
File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/HorizontalCompaction.scala
##
@@ -177,6 +178,7 @@ object HorizontalCompaction {
   absTableIdentifier,
   segmentUpdateStatusManager,
   compactionTypeIUD)
+LOG.debug(s"The segment list for Horizontal Update Compaction is 
$deletedBlocksList")

Review comment:
   Done

##
File path: core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentUpdateStatusManager.java
##
@@ -415,44 +415,41 @@ public boolean accept(CarbonFile pathName) {
   }
 
   /**
-   * Return all delta file for a block.
-   * @param segmentId
-   * @param blockName
-   * @return
+   * Get all delete delta files of the block of specified segment.
+   * Actually, delete delta file name is generated from each SegmentUpdateDetails.
+   *
+   * @param seg the segment which is to find block and its delete delta files
+   * @param blockName the specified block of the segment
+   * @return delete delta file list of the block
*/
-  public CarbonFile[] getDeleteDeltaFilesList(final Segment segmentId, final String blockName) {
+  public List getDeleteDeltaFilesList(final Segment seg, final String blockName) {

Review comment:
   Done









[GitHub] [carbondata] shenjiayu17 commented on a change in pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update

2020-10-28 Thread GitBox


shenjiayu17 commented on a change in pull request #3986:
URL: https://github.com/apache/carbondata/pull/3986#discussion_r513248715



##
File path: core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentUpdateStatusManager.java
##
@@ -415,44 +415,41 @@ public boolean accept(CarbonFile pathName) {
   }
 
   /**
-   * Return all delta file for a block.
-   * @param segmentId
-   * @param blockName
-   * @return
+   * Get all delete delta files of the block of specified segment.
+   * Actually, delete delta file name is generated from each SegmentUpdateDetails.
+   *
+   * @param seg the segment which is to find block and its delete delta files
+   * @param blockName the specified block of the segment
+   * @return delete delta file list of the block
*/
-  public CarbonFile[] getDeleteDeltaFilesList(final Segment segmentId, final String blockName) {
+  public List getDeleteDeltaFilesList(final Segment seg, final String blockName) {
+List deleteDeltaFileList = new ArrayList<>();
 String segmentPath = CarbonTablePath.getSegmentPath(
-identifier.getTablePath(), segmentId.getSegmentNo());
-CarbonFile segDir =
-FileFactory.getCarbonFile(segmentPath);
+identifier.getTablePath(), seg.getSegmentNo());
+
 for (SegmentUpdateDetails block : updateDetails) {
   if ((block.getBlockName().equalsIgnoreCase(blockName)) &&
-  (block.getSegmentName().equalsIgnoreCase(segmentId.getSegmentNo()))
-  && !CarbonUpdateUtil.isBlockInvalid((block.getSegmentStatus()))) {
-final long deltaStartTimestamp =
-getStartTimeOfDeltaFile(CarbonCommonConstants.DELETE_DELTA_FILE_EXT, block);
-final long deltaEndTimeStamp =
-getEndTimeOfDeltaFile(CarbonCommonConstants.DELETE_DELTA_FILE_EXT, block);
-
-return segDir.listFiles(new CarbonFileFilter() {
-
-  @Override
-  public boolean accept(CarbonFile pathName) {
-String fileName = pathName.getName();
-if (pathName.getSize() > 0
-&& fileName.endsWith(CarbonCommonConstants.DELETE_DELTA_FILE_EXT)) {
-  String blkName = fileName.substring(0, fileName.lastIndexOf("-"));
-  long timestamp =
-  Long.parseLong(CarbonTablePath.DataFileUtil.getTimeStampFromFileName(fileName));
-  return blockName.equals(blkName) && timestamp <= deltaEndTimeStamp
-  && timestamp >= deltaStartTimestamp;
-}
-return false;
-  }
-});
+  (block.getSegmentName().equalsIgnoreCase(seg.getSegmentNo())) &&
+  !CarbonUpdateUtil.isBlockInvalid(block.getSegmentStatus())) {
+Set deltaFileTimestamps = block.getDeltaFileStamps();
+String deleteDeltaFilePrefix = segmentPath + CarbonCommonConstants.FILE_SEPARATOR +
+blockName + CarbonCommonConstants.HYPHEN;
+if (deltaFileTimestamps != null && deltaFileTimestamps.size() > 0) {
+  deltaFileTimestamps.forEach(timestamp -> deleteDeltaFileList.add(
+  deleteDeltaFilePrefix + timestamp + CarbonCommonConstants.DELETE_DELTA_FILE_EXT));

Review comment:
   Done
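
   For context, a self-contained sketch of the name-generation idea in this diff: delete-delta file paths are rebuilt from the timestamps recorded in the segment update details instead of listing the segment directory. The path layout and extension below are simplified from the constants used above:

```java
import java.util.*;

public class DeleteDeltaNameSketch {
  static List<String> deleteDeltaFilePaths(String segmentPath, String blockName,
      Set<String> deltaFileTimestamps) {
    List<String> paths = new ArrayList<>();
    String prefix = segmentPath + "/" + blockName + "-";
    for (String timestamp : deltaFileTimestamps) {
      paths.add(prefix + timestamp + ".deletedelta"); // no directory listing needed
    }
    return paths;
  }

  public static void main(String[] args) {
    System.out.println(deleteDeltaFilePaths("/table/Fact/Part0/Segment_0", "part-0-0",
        new TreeSet<>(Arrays.asList("100", "200"))));
  }
}
```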









[GitHub] [carbondata] shenjiayu17 commented on a change in pull request #3986: [CARBONDATA-4034] Improve the time-consuming of Horizontal Compaction for update

2020-10-28 Thread GitBox


shenjiayu17 commented on a change in pull request #3986:
URL: https://github.com/apache/carbondata/pull/3986#discussion_r513248146



##
File path: core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentUpdateStatusManager.java
##
@@ -415,44 +415,41 @@ public boolean accept(CarbonFile pathName) {
   }
 
   /**
-   * Return all delta file for a block.
-   * @param segmentId
-   * @param blockName
-   * @return
+   * Get all delete delta files of the block of specified segment.
+   * Actually, delete delta file name is generated from each SegmentUpdateDetails.
+   *
+   * @param seg the segment which is to find block and its delete delta files
+   * @param blockName the specified block of the segment
+   * @return delete delta file list of the block
*/
-  public CarbonFile[] getDeleteDeltaFilesList(final Segment segmentId, final String blockName) {
+  public List getDeleteDeltaFilesList(final Segment seg, final String blockName) {
+List deleteDeltaFileList = new ArrayList<>();
 String segmentPath = CarbonTablePath.getSegmentPath(
-identifier.getTablePath(), segmentId.getSegmentNo());
-CarbonFile segDir =
-FileFactory.getCarbonFile(segmentPath);
+identifier.getTablePath(), seg.getSegmentNo());
+
 for (SegmentUpdateDetails block : updateDetails) {
   if ((block.getBlockName().equalsIgnoreCase(blockName)) &&
-  (block.getSegmentName().equalsIgnoreCase(segmentId.getSegmentNo()))
-  && !CarbonUpdateUtil.isBlockInvalid((block.getSegmentStatus()))) {
-final long deltaStartTimestamp =
-getStartTimeOfDeltaFile(CarbonCommonConstants.DELETE_DELTA_FILE_EXT, block);
-final long deltaEndTimeStamp =
-getEndTimeOfDeltaFile(CarbonCommonConstants.DELETE_DELTA_FILE_EXT, block);
-
-return segDir.listFiles(new CarbonFileFilter() {
-
-  @Override
-  public boolean accept(CarbonFile pathName) {
-String fileName = pathName.getName();
-if (pathName.getSize() > 0
-&& fileName.endsWith(CarbonCommonConstants.DELETE_DELTA_FILE_EXT)) {
-  String blkName = fileName.substring(0, fileName.lastIndexOf("-"));
-  long timestamp =
-  Long.parseLong(CarbonTablePath.DataFileUtil.getTimeStampFromFileName(fileName));
-  return blockName.equals(blkName) && timestamp <= deltaEndTimeStamp
-  && timestamp >= deltaStartTimestamp;
-}
-return false;
-  }
-});
+  (block.getSegmentName().equalsIgnoreCase(seg.getSegmentNo())) &&
+  !CarbonUpdateUtil.isBlockInvalid(block.getSegmentStatus())) {
+Set deltaFileTimestamps = block.getDeltaFileStamps();
+String deleteDeltaFilePrefix = segmentPath + CarbonCommonConstants.FILE_SEPARATOR +
+blockName + CarbonCommonConstants.HYPHEN;
+if (deltaFileTimestamps != null && deltaFileTimestamps.size() > 0) {

Review comment:
   Added test case "test IUD Horizontal Compaction after updating without compaction" in HorizontalCompactionTestCase









[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-28 Thread GitBox


vikramahuja1001 commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r513243540



##
File path: integration/spark/src/main/scala/org/apache/carbondata/cleanfiles/CleanFilesUtil.scala
##
@@ -0,0 +1,327 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.cleanfiles
+
+import scala.collection.JavaConverters._
+
+import org.apache.spark.sql.{AnalysisException, CarbonEnv, Row, SparkSession}
+import org.apache.spark.sql.index.CarbonIndexUtil
+
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile
+import org.apache.carbondata.core.datastore.impl.FileFactory
+import org.apache.carbondata.core.exception.ConcurrentOperationException
+import org.apache.carbondata.core.indexstore.PartitionSpec
+import org.apache.carbondata.core.locks.{CarbonLockFactory, CarbonLockUtil, ICarbonLock, LockUsage}
+import org.apache.carbondata.core.metadata.{AbsoluteTableIdentifier, CarbonMetadata, SegmentFileStore}
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.mutate.CarbonUpdateUtil
+import org.apache.carbondata.core.statusmanager.{LoadMetadataDetails, SegmentStatus, SegmentStatusManager}
+import org.apache.carbondata.core.util.{CarbonProperties, CarbonUtil}
+import org.apache.carbondata.core.util.path.{CarbonTablePath, TrashUtil}
+import org.apache.carbondata.processing.loading.TableProcessingOperations
+import org.apache.carbondata.processing.loading.model.CarbonLoadModel
+
+object CleanFilesUtil {
+  private val LOGGER = LogServiceFactory.getLogService(this.getClass.getCanonicalName)
+
+  /**
+   * The method deletes all data if forceTableClean is true and cleans garbage segments
+   * (in MARKED_FOR_DELETE state) if forceTableClean is false.
+   *
+   * @param dbName : Database name
+   * @param tableName  : Table name
+   * @param tablePath  : Table path
+   * @param carbonTable: CarbonTable object (can be null in case of force clean)
+   * @param forceTableClean: for force clean it will delete all data;
+   *otherwise it will clean garbage segments (MARKED_FOR_DELETE state)
+   * @param currentTablePartitions : Hive partition details
+   */
+  def cleanFiles(
+  dbName: String,
+  tableName: String,
+  tablePath: String,
+  carbonTable: CarbonTable,
+  forceTableClean: Boolean,
+  currentTablePartitions: Option[Seq[PartitionSpec]] = None,
+  truncateTable: Boolean = false): Unit = {
+var carbonCleanFilesLock: ICarbonLock = null
+val absoluteTableIdentifier = if (forceTableClean) {
+  AbsoluteTableIdentifier.from(tablePath, dbName, tableName, tableName)
+} else {
+  carbonTable.getAbsoluteTableIdentifier
+}
+try {
+  val errorMsg = "Clean files request is failed for " +
+s"$dbName.$tableName" +
+". Not able to acquire the clean files lock due to another clean files " +
+"operation is running in the background."
+  // in case of force clean the lock is not required
+  if (forceTableClean) {
+FileFactory.deleteAllCarbonFilesOfDir(
+  FileFactory.getCarbonFile(absoluteTableIdentifier.getTablePath))
+  } else {
+carbonCleanFilesLock =
+  CarbonLockUtil
+.getLockObject(absoluteTableIdentifier, LockUsage.CLEAN_FILES_LOCK, errorMsg)
+if (truncateTable) {
+  SegmentStatusManager.truncateTable(carbonTable)
+}
+SegmentStatusManager.deleteLoadsAndUpdateMetadata(
+  carbonTable, true, currentTablePartitions.map(_.asJava).orNull)
+CarbonUpdateUtil.cleanUpDeltaFiles(carbonTable, true)
+currentTablePartitions match {
+  case Some(partitions) =>
+SegmentFileStore.cleanSegments(
+  carbonTable,
+  currentTablePartitions.map(_.asJava).orNull,
+  true)
+  case _ =>
+}
+  }
+} finally 

[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-28 Thread GitBox


vikramahuja1001 commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r513229197



##
File path: core/src/main/java/org/apache/carbondata/core/util/DeleteLoadFolders.java
##
@@ -113,12 +116,24 @@ private static void physicalFactAndMeasureMetadataDeletion(CarbonTable carbonTab
 SegmentUpdateStatusManager updateStatusManager =
 new SegmentUpdateStatusManager(carbonTable, currLoadDetails);
 for (final LoadMetadataDetails oneLoad : loadDetails) {
-  if (checkIfLoadCanBeDeletedPhysically(oneLoad, isForceDelete)) {
+  if (checkIfLoadCanBeDeletedPhysically(oneLoad, isForceDelete, carbonTable
+  .getAbsoluteTableIdentifier())) {
 try {
+  // if insert in progress, then move it to trash
+  if (oneLoad.getSegmentStatus() == SegmentStatus.INSERT_IN_PROGRESS && !carbonTable
+  .isHivePartitionTable()) {
+// move this segment to trash
+TrashUtil.copyDataToTrashBySegment(FileFactory.getCarbonFile(CarbonTablePath
+.getFactDir(carbonTable.getTablePath()) + "/Part0/Segment_" + oneLoad

Review comment:
   done









[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-28 Thread GitBox


vikramahuja1001 commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r513225060



##
File path: core/src/main/java/org/apache/carbondata/core/util/path/TrashUtil.java
##
@@ -0,0 +1,164 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.util.path;
+
+import java.io.File;
+import java.io.IOException;
+import java.sql.Timestamp;
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile;
+import org.apache.carbondata.core.datastore.impl.FileFactory;
+import org.apache.carbondata.core.exception.CarbonFileException;
+import org.apache.carbondata.core.util.CarbonUtil;
+
+import org.apache.commons.io.FileUtils;
+
+import org.apache.log4j.Logger;
+
+public final class TrashUtil {
+
+  private static final Logger LOGGER =
+  LogServiceFactory.getLogService(CarbonUtil.class.getName());
+
+  /**
+   * The below method copies a file to the trash folder. Provide the necessary
+   * timestamp and the segment number in the suffixToAdd variable, so that the proper folder is
+   * created in the trash folder.
+   */
+  public static void copyDataToTrashFolderByFile(String carbonTablePath, String pathOfFileToCopy,
+  String suffixToAdd) {
+String trashFolderPath = CarbonTablePath.getTrashFolderPath(carbonTablePath) +
+CarbonCommonConstants.FILE_SEPARATOR + suffixToAdd;
+try {
+  if (new File(pathOfFileToCopy).exists()) {
+FileUtils.copyFileToDirectory(new File(pathOfFileToCopy), new File(trashFolderPath));
+LOGGER.info("File: " + pathOfFileToCopy + " successfully copied to the trash folder: "
++ trashFolderPath);
+  }
+} catch (IOException e) {
+  LOGGER.error("Unable to copy " + pathOfFileToCopy + " to the trash folder", e);
+}
+}
+  }
+
+  /**
+   * The below method copies the complete segment folder to the trash folder. Provide the necessary
+   * timestamp and the segment number in the suffixToAdd variable, so that the proper folder is
+   * created in the trash folder.
+   */
+  public static void copyDataToTrashBySegment(CarbonFile path, String carbonTablePath,

Review comment:
   done
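
   A self-contained sketch of the copy-to-trash behavior implemented above, using java.nio instead of commons-io; the trash folder name and layout are illustrative:

```java
import java.io.IOException;
import java.nio.file.*;

public class TrashCopySketch {
  // copies a file into <tablePath>/.Trash/<suffix>/, creating folders as needed;
  // suffix is expected to carry the timestamp and segment number
  static void copyToTrash(Path tablePath, Path fileToCopy, String suffix) throws IOException {
    Path trashDir = tablePath.resolve(".Trash").resolve(suffix);
    Files.createDirectories(trashDir);
    Files.copy(fileToCopy, trashDir.resolve(fileToCopy.getFileName()),
        StandardCopyOption.REPLACE_EXISTING);
  }

  public static void main(String[] args) throws IOException {
    Path table = Files.createTempDirectory("table");
    Path data = Files.write(table.resolve("part-0-0.carbondata"), new byte[0]);
    copyToTrash(table, data, "1603900000000/Segment_0");
    System.out.println("copied under " + table.resolve(".Trash"));
  }
}
```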

##
File path: core/src/main/java/org/apache/carbondata/core/util/path/TrashUtil.java
##
@@ -0,0 +1,164 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.util.path;
+
+import java.io.File;
+import java.io.IOException;
+import java.sql.Timestamp;
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile;
+import org.apache.carbondata.core.datastore.impl.FileFactory;
+import org.apache.carbondata.core.exception.CarbonFileException;
+import org.apache.carbondata.core.util.CarbonUtil;
+
+import org.apache.commons.io.FileUtils;
+
+import org.apache.log4j.Logger;
+
+public final class TrashUtil {
+
+  private static final Logger LOGGER =
+  LogServiceFactory.getLogService(CarbonUtil.class.getName());
+
+  /**
+   * The below method copies a file to the trash folder. 
[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-28 Thread GitBox


vikramahuja1001 commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r513220391



##
File path: core/src/main/java/org/apache/carbondata/core/util/path/TrashUtil.java
##
@@ -0,0 +1,164 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.util.path;
+
+import java.io.File;
+import java.io.IOException;
+import java.sql.Timestamp;
+import java.util.ArrayList;
+import java.util.List;
+
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile;
+import org.apache.carbondata.core.datastore.impl.FileFactory;
+import org.apache.carbondata.core.exception.CarbonFileException;
+import org.apache.carbondata.core.util.CarbonUtil;
+
+import org.apache.commons.io.FileUtils;
+
+import org.apache.log4j.Logger;
+
+public final class TrashUtil {

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-28 Thread GitBox


vikramahuja1001 commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r513218983



##
File path: 
integration/spark/src/main/scala/org/apache/carbondata/cleanfiles/CleanFilesUtil.scala
##
@@ -0,0 +1,406 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.cleanfiles
+
+import java.sql.Timestamp
+import java.util
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable.ListBuffer
+
+import org.apache.spark.sql.{AnalysisException, CarbonEnv, Row, SparkSession}
+import org.apache.spark.sql.index.CarbonIndexUtil
+
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile
+import org.apache.carbondata.core.datastore.impl.FileFactory
+import org.apache.carbondata.core.exception.ConcurrentOperationException
+import org.apache.carbondata.core.indexstore.PartitionSpec
+import org.apache.carbondata.core.locks.{CarbonLockFactory, CarbonLockUtil, 
ICarbonLock, LockUsage}
+import org.apache.carbondata.core.metadata.{AbsoluteTableIdentifier, 
CarbonMetadata, SegmentFileStore}
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.mutate.CarbonUpdateUtil
+import org.apache.carbondata.core.statusmanager.{LoadMetadataDetails, 
SegmentStatus, SegmentStatusManager}
+import org.apache.carbondata.core.util.{CarbonProperties, CarbonUtil}
+import org.apache.carbondata.core.util.path.{CarbonTablePath, TrashUtil}
+import org.apache.carbondata.processing.loading.TableProcessingOperations
+import org.apache.carbondata.processing.loading.model.CarbonLoadModel
+
+object CleanFilesUtil {
+  private val LOGGER = 
LogServiceFactory.getLogService(this.getClass.getCanonicalName)
+
+  /**
+   * The method deletes all data if forceTableClean is true, and cleans garbage
+   * segments (MARKED_FOR_DELETE state) if forceTableClean is false
+   *
+   * @param dbName : Database name
+   * @param tableName  : Table name
+   * @param tablePath  : Table path
+   * @param carbonTable: CarbonTable object, null in case of force clean
+   * @param forceTableClean: true for force clean, it will delete all data;
+   *false, it will clean garbage segments (MARKED_FOR_DELETE state)
+   * @param currentTablePartitions : Hive Partitions details
+   */
+  def cleanFiles(
+dbName: String,
+tableName: String,
+tablePath: String,
+carbonTable: CarbonTable,
+forceTableClean: Boolean,
+currentTablePartitions: Option[Seq[PartitionSpec]] = None,
+truncateTable: Boolean = false): Unit = {
+var carbonCleanFilesLock: ICarbonLock = null
+val absoluteTableIdentifier = if (forceTableClean) {
+  AbsoluteTableIdentifier.from(tablePath, dbName, tableName, tableName)
+} else {
+  carbonTable.getAbsoluteTableIdentifier
+}
+try {
+  val errorMsg = "Clean files request is failed for " +
+s"$dbName.$tableName" +
+". Not able to acquire the clean files lock due to another clean files 
" +
+"operation is running in the background."
+  // in case of force clean the lock is not required
+  if (forceTableClean) {
+FileFactory.deleteAllCarbonFilesOfDir(
+  FileFactory.getCarbonFile(absoluteTableIdentifier.getTablePath))
+  } else {
+carbonCleanFilesLock =
+  CarbonLockUtil
+.getLockObject(absoluteTableIdentifier, 
LockUsage.CLEAN_FILES_LOCK, errorMsg)
+if (truncateTable) {
+  SegmentStatusManager.truncateTable(carbonTable)
+}
+SegmentStatusManager.deleteLoadsAndUpdateMetadata(
+  carbonTable, true, currentTablePartitions.map(_.asJava).orNull)
+CarbonUpdateUtil.cleanUpDeltaFiles(carbonTable, true)
+currentTablePartitions match {
+  case Some(partitions) =>
+SegmentFileStore.cleanSegments(
+  carbonTable,
+  currentTablePartitions.map(_.asJava).orNull,
+  true)
+  case _ =>
+}
+  }
+} finally {
+   

[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-28 Thread GitBox


vikramahuja1001 commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r513217586



##
File path: 
integration/spark/src/main/scala/org/apache/carbondata/cleanfiles/CleanFilesUtil.scala
##
@@ -0,0 +1,406 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.cleanfiles
+
+import java.sql.Timestamp
+import java.util
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable.ListBuffer
+
+import org.apache.spark.sql.{AnalysisException, CarbonEnv, Row, SparkSession}
+import org.apache.spark.sql.index.CarbonIndexUtil
+
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile
+import org.apache.carbondata.core.datastore.impl.FileFactory
+import org.apache.carbondata.core.exception.ConcurrentOperationException
+import org.apache.carbondata.core.indexstore.PartitionSpec
+import org.apache.carbondata.core.locks.{CarbonLockFactory, CarbonLockUtil, 
ICarbonLock, LockUsage}
+import org.apache.carbondata.core.metadata.{AbsoluteTableIdentifier, 
CarbonMetadata, SegmentFileStore}
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.mutate.CarbonUpdateUtil
+import org.apache.carbondata.core.statusmanager.{LoadMetadataDetails, 
SegmentStatus, SegmentStatusManager}
+import org.apache.carbondata.core.util.{CarbonProperties, CarbonUtil}
+import org.apache.carbondata.core.util.path.{CarbonTablePath, TrashUtil}
+import org.apache.carbondata.processing.loading.TableProcessingOperations
+import org.apache.carbondata.processing.loading.model.CarbonLoadModel
+
+object CleanFilesUtil {
+  private val LOGGER = 
LogServiceFactory.getLogService(this.getClass.getCanonicalName)
+
+  /**
+   * The method deletes all data if forceTableClean is true, and cleans garbage
+   * segments (MARKED_FOR_DELETE state) if forceTableClean is false
+   *
+   * @param dbName : Database name
+   * @param tableName  : Table name
+   * @param tablePath  : Table path
+   * @param carbonTable: CarbonTable object, null in case of force clean
+   * @param forceTableClean: true for force clean, it will delete all data;
+   *false, it will clean garbage segments (MARKED_FOR_DELETE state)
+   * @param currentTablePartitions : Hive Partitions details
+   */
+  def cleanFiles(
+dbName: String,
+tableName: String,
+tablePath: String,
+carbonTable: CarbonTable,
+forceTableClean: Boolean,
+currentTablePartitions: Option[Seq[PartitionSpec]] = None,
+truncateTable: Boolean = false): Unit = {
+var carbonCleanFilesLock: ICarbonLock = null
+val absoluteTableIdentifier = if (forceTableClean) {
+  AbsoluteTableIdentifier.from(tablePath, dbName, tableName, tableName)
+} else {
+  carbonTable.getAbsoluteTableIdentifier
+}
+try {
+  val errorMsg = "Clean files request is failed for " +
+s"$dbName.$tableName" +
+". Not able to acquire the clean files lock due to another clean files 
" +
+"operation is running in the background."
+  // in case of force clean the lock is not required
+  if (forceTableClean) {
+FileFactory.deleteAllCarbonFilesOfDir(
+  FileFactory.getCarbonFile(absoluteTableIdentifier.getTablePath))
+  } else {
+carbonCleanFilesLock =
+  CarbonLockUtil
+.getLockObject(absoluteTableIdentifier, 
LockUsage.CLEAN_FILES_LOCK, errorMsg)
+if (truncateTable) {
+  SegmentStatusManager.truncateTable(carbonTable)
+}
+SegmentStatusManager.deleteLoadsAndUpdateMetadata(
+  carbonTable, true, currentTablePartitions.map(_.asJava).orNull)
+CarbonUpdateUtil.cleanUpDeltaFiles(carbonTable, true)
+currentTablePartitions match {
+  case Some(partitions) =>
+SegmentFileStore.cleanSegments(
+  carbonTable,
+  currentTablePartitions.map(_.asJava).orNull,
+  true)
+  case _ =>
+}
+  }
+} finally {
+   
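The cleanFiles hunk above is cut off just inside the finally block. The shape being quoted is the usual acquire-with-retries, release-in-finally locking idiom; a self-contained sketch with a JDK lock standing in for Carbon's ICarbonLock (retry count and timeout are illustrative values, not the configured lock properties):

import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of the lockWithRetries(...) / unlock-in-finally pattern; the retry
// count and per-try timeout mirror Carbon's configurable lock properties
// only in spirit.
public final class LockRetrySketch {

  static final ReentrantLock CLEAN_FILES_LOCK = new ReentrantLock();

  static boolean lockWithRetries(ReentrantLock lock, int retries,
      long timeoutSeconds) throws InterruptedException {
    for (int i = 0; i < retries; i++) {
      if (lock.tryLock(timeoutSeconds, TimeUnit.SECONDS)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) throws InterruptedException {
    if (!lockWithRetries(CLEAN_FILES_LOCK, 3, 5)) {
      throw new IllegalStateException(
          "another clean files operation is running in the background");
    }
    try {
      System.out.println("clean files work happens here");
    } finally {
      CLEAN_FILES_LOCK.unlock();  // always release, even on failure
    }
  }
}

Releasing in finally keeps a failed cleanup from leaving the clean files lock held and blocking the next run.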

[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-28 Thread GitBox


vikramahuja1001 commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r513217211



##
File path: 
integration/spark/src/main/scala/org/apache/carbondata/cleanfiles/CleanFilesUtil.scala
##
@@ -0,0 +1,406 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.cleanfiles
+
+import java.sql.Timestamp
+import java.util
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable.ListBuffer
+
+import org.apache.spark.sql.{AnalysisException, CarbonEnv, Row, SparkSession}
+import org.apache.spark.sql.index.CarbonIndexUtil
+
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile
+import org.apache.carbondata.core.datastore.impl.FileFactory
+import org.apache.carbondata.core.exception.ConcurrentOperationException
+import org.apache.carbondata.core.indexstore.PartitionSpec
+import org.apache.carbondata.core.locks.{CarbonLockFactory, CarbonLockUtil, 
ICarbonLock, LockUsage}
+import org.apache.carbondata.core.metadata.{AbsoluteTableIdentifier, 
CarbonMetadata, SegmentFileStore}
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.mutate.CarbonUpdateUtil
+import org.apache.carbondata.core.statusmanager.{LoadMetadataDetails, 
SegmentStatus, SegmentStatusManager}
+import org.apache.carbondata.core.util.{CarbonProperties, CarbonUtil}

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3952: [CARBONDATA-4006] Fix for currentUser as NULL in getcount method during index server fallback mode

2020-10-28 Thread GitBox


CarbonDataQA1 commented on pull request #3952:
URL: https://github.com/apache/carbondata/pull/3952#issuecomment-717740899


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2956/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3952: [CARBONDATA-4006] Fix for currentUser as NULL in getcount method during index server fallback mode

2020-10-28 Thread GitBox


CarbonDataQA1 commented on pull request #3952:
URL: https://github.com/apache/carbondata/pull/3952#issuecomment-717740581


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4713/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-28 Thread GitBox


vikramahuja1001 commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r513215028



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonCleanFilesCommand.scala
##
@@ -80,40 +113,98 @@ case class CarbonCleanFilesCommand(
   }
 
   override def processData(sparkSession: SparkSession): Seq[Row] = {
-// if insert overwrite in progress, do not allow delete segment
-if (SegmentStatusManager.isOverwriteInProgressInTable(carbonTable)) {
-  throw new ConcurrentOperationException(carbonTable, "insert overwrite", 
"clean file")
+if (!isDryRun) {
+  // if insert overwrite in progress, do not allow delete segment
+  if (SegmentStatusManager.isOverwriteInProgressInTable(carbonTable)) {
+throw new ConcurrentOperationException(carbonTable, "insert 
overwrite", "clean file")
+  }
+  val operationContext = new OperationContext
+  val cleanFilesPreEvent: CleanFilesPreEvent =
+CleanFilesPreEvent(carbonTable,
+  sparkSession)

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
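On the dry-run gating quoted above: a dry run reports what clean files would remove without deleting anything, so both paths can share the same candidate scan. A generic sketch of that gating (the directory layout and the *.stale predicate are assumptions, not CarbonData's actual selection logic):

import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Sketch: when dryRun is true, only report the deletable entries;
// when false, actually delete them.
public final class DryRunSketch {

  static List<Path> cleanFiles(Path dir, boolean dryRun) throws IOException {
    List<Path> candidates = new ArrayList<>();
    try (DirectoryStream<Path> entries =
        Files.newDirectoryStream(dir, "*.stale")) {
      for (Path entry : entries) {
        candidates.add(entry);
        if (!dryRun) {
          Files.delete(entry);
        }
      }
    }
    return candidates;  // the dry run returns the same list it would delete
  }

  public static void main(String[] args) throws IOException {
    cleanFiles(Paths.get(args[0]), Boolean.parseBoolean(args[1]))
        .forEach(System.out::println);
  }
}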




[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-28 Thread GitBox


vikramahuja1001 commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r513215239



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonCleanFilesCommand.scala
##
@@ -80,40 +113,98 @@ case class CarbonCleanFilesCommand(
   }
 
   override def processData(sparkSession: SparkSession): Seq[Row] = {
-// if insert overwrite in progress, do not allow delete segment
-if (SegmentStatusManager.isOverwriteInProgressInTable(carbonTable)) {
-  throw new ConcurrentOperationException(carbonTable, "insert overwrite", 
"clean file")
+if (!isDryRun) {
+  // if insert overwrite in progress, do not allow delete segment
+  if (SegmentStatusManager.isOverwriteInProgressInTable(carbonTable)) {
+throw new ConcurrentOperationException(carbonTable, "insert 
overwrite", "clean file")
+  }
+  val operationContext = new OperationContext
+  val cleanFilesPreEvent: CleanFilesPreEvent =
+CleanFilesPreEvent(carbonTable,
+  sparkSession)
+  OperationListenerBus.getInstance.fireEvent(cleanFilesPreEvent, 
operationContext)
+  if (tableName.isDefined) {
+Checker.validateTableExists(databaseNameOp, tableName.get, 
sparkSession)
+if (forceTrashClean) {
+  CleanFilesUtil.deleteDataFromTrashFolder(carbonTable, sparkSession)
+} else {
+  // clear trash based on timestamp
+  CleanFilesUtil.deleteDataFromTrashFolderByTimeStamp(carbonTable, 
sparkSession)
+}
+if (forceTableClean) {
+  deleteAllData(sparkSession, databaseNameOp, tableName.get)
+} else {
+  cleanGarbageData(sparkSession, databaseNameOp, tableName.get)
+}
+// delete partial loads and send them to trash
+TableProcessingOperations
+  .deletePartialLoadDataIfExist(carbonTable, false)
+// clean stash in metadata folder too
+deleteStashInMetadataFolder(carbonTable)
+  } else {
+cleanGarbageDataInAllTables(sparkSession)
+  }
+  if (cleanFileCommands != null) {
+cleanFileCommands.foreach(_.processData(sparkSession))
+  }
+  val cleanFilesPostEvent: CleanFilesPostEvent =
+CleanFilesPostEvent(carbonTable, sparkSession)
+  OperationListenerBus.getInstance.fireEvent(cleanFilesPostEvent, 
operationContext)
+  Seq.empty
+} else if (isDryRun && tableName.isDefined) {
+  // dry run, do not clean anything and do not delete the trash either
+  CleanFilesUtil.cleanFilesDryRun(carbonTable, sparkSession)
+}
+else {

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
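The force versus timestamp-based trash clearing quoted above reduces to comparing each trash subfolder's name, taken as its creation timestamp, against now minus the retention window. A JDK-only sketch (the millisecond folder names and the recursive delete are assumptions about the layout, not the PR's code):

import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Comparator;
import java.util.stream.Stream;

// Sketch: delete trash subfolders whose name (a millisecond timestamp)
// is older than the retention window; force mode ignores the window.
public final class TrashExpirySketch {

  static void clearTrash(Path trashDir, long retentionMillis, boolean force)
      throws IOException {
    long cutoff = System.currentTimeMillis() - retentionMillis;
    try (DirectoryStream<Path> subDirs = Files.newDirectoryStream(trashDir)) {
      for (Path subDir : subDirs) {
        long createdAt = Long.parseLong(subDir.getFileName().toString());
        if (force || createdAt < cutoff) {
          deleteRecursively(subDir);
        }
      }
    }
  }

  static void deleteRecursively(Path root) throws IOException {
    // depth-first: children sort after parents, so reverse order deletes leaves first
    try (Stream<Path> walk = Files.walk(root)) {
      walk.sorted(Comparator.reverseOrder()).forEach(p -> {
        try {
          Files.delete(p);
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      });
    }
  }

  public static void main(String[] args) throws IOException {
    // e.g. a 3-day retention window is 3L * 24 * 60 * 60 * 1000 millis
    clearTrash(Paths.get(args[0]), Long.parseLong(args[1]), false);
  }
}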




[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-28 Thread GitBox


vikramahuja1001 commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r513214833



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonCleanFilesCommand.scala
##
@@ -80,40 +113,98 @@ case class CarbonCleanFilesCommand(
   }
 
   override def processData(sparkSession: SparkSession): Seq[Row] = {
-// if insert overwrite in progress, do not allow delete segment
-if (SegmentStatusManager.isOverwriteInProgressInTable(carbonTable)) {
-  throw new ConcurrentOperationException(carbonTable, "insert overwrite", 
"clean file")
+if (!isDryRun) {
+  // if insert overwrite in progress, do not allow delete segment
+  if (SegmentStatusManager.isOverwriteInProgressInTable(carbonTable)) {
+throw new ConcurrentOperationException(carbonTable, "insert 
overwrite", "clean file")
+  }
+  val operationContext = new OperationContext
+  val cleanFilesPreEvent: CleanFilesPreEvent =
+CleanFilesPreEvent(carbonTable,
+  sparkSession)
+  OperationListenerBus.getInstance.fireEvent(cleanFilesPreEvent, 
operationContext)
+  if (tableName.isDefined) {
+Checker.validateTableExists(databaseNameOp, tableName.get, 
sparkSession)
+if (forceTrashClean) {
+  CleanFilesUtil.deleteDataFromTrashFolder(carbonTable, sparkSession)
+} else {
+  // clear trash based on timestamp
+  CleanFilesUtil.deleteDataFromTrashFolderByTimeStamp(carbonTable, 
sparkSession)
+}
+if (forceTableClean) {
+  deleteAllData(sparkSession, databaseNameOp, tableName.get)
+} else {
+  cleanGarbageData(sparkSession, databaseNameOp, tableName.get)
+}
+// delete partial loads and send them to trash
+TableProcessingOperations
+  .deletePartialLoadDataIfExist(carbonTable, false)
+// clean stash in metadata folder too
+deleteStashInMetadataFolder(carbonTable)
+  } else {
+cleanGarbageDataInAllTables(sparkSession)
+  }
+  if (cleanFileCommands != null) {
+cleanFileCommands.foreach(_.processData(sparkSession))
+  }
+  val cleanFilesPostEvent: CleanFilesPostEvent =
+CleanFilesPostEvent(carbonTable, sparkSession)
+  OperationListenerBus.getInstance.fireEvent(cleanFilesPostEvent, 
operationContext)
+  Seq.empty
+} else if (isDryRun && tableName.isDefined) {
+  // dry run, do not clean anything and do not delete the trash either
+  CleanFilesUtil.cleanFilesDryRun(carbonTable, sparkSession)
+}
+else {
+  Seq.empty
 }
-val operationContext = new OperationContext
-val cleanFilesPreEvent: CleanFilesPreEvent =
-  CleanFilesPreEvent(carbonTable,
-sparkSession)
-OperationListenerBus.getInstance.fireEvent(cleanFilesPreEvent, 
operationContext)
-if (tableName.isDefined) {
-  Checker.validateTableExists(databaseNameOp, tableName.get, sparkSession)
-  if (forceTableClean) {
-deleteAllData(sparkSession, databaseNameOp, tableName.get)
+  }
+
+  def deleteStashInMetadataFolder(carbonTable: CarbonTable): Unit = {

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-28 Thread GitBox


vikramahuja1001 commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r513209768



##
File path: 
core/src/main/java/org/apache/carbondata/core/util/path/CarbonTablePath.java
##
@@ -47,6 +47,7 @@
   public static final String BATCH_PREFIX = "_batchno";
   private static final String LOCK_DIR = "LockFiles";
 
+  public static final String SEGMENTS_METADATA_FOLDER = "segments";

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-28 Thread GitBox


vikramahuja1001 commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r513205026



##
File path: 
processing/src/main/java/org/apache/carbondata/processing/loading/TableProcessingOperations.java
##
@@ -149,9 +132,171 @@ public static void 
deletePartialLoadDataIfExist(CarbonTable carbonTable,
   } finally {
 carbonTableStatusLock.unlock();
   }
+} else {
+
+  int retryCount = CarbonLockUtil
+  
.getLockProperty(CarbonCommonConstants.NUMBER_OF_TRIES_FOR_CONCURRENT_LOCK,
+  CarbonCommonConstants.NUMBER_OF_TRIES_FOR_CONCURRENT_LOCK_DEFAULT);
+  int maxTimeout = CarbonLockUtil
+  
.getLockProperty(CarbonCommonConstants.MAX_TIMEOUT_FOR_CONCURRENT_LOCK,
+  CarbonCommonConstants.MAX_TIMEOUT_FOR_CONCURRENT_LOCK_DEFAULT);
+  ICarbonLock carbonTableStatusLock = CarbonLockFactory
+  .getCarbonLockObj(carbonTable.getAbsoluteTableIdentifier(), 
LockUsage.TABLE_STATUS_LOCK);
+
+  try {
+if (carbonTableStatusLock.lockWithRetries(retryCount, maxTimeout)) {
+

Review comment:
   done

##
File path: 
processing/src/main/java/org/apache/carbondata/processing/loading/TableProcessingOperations.java
##
@@ -149,9 +132,171 @@ public static void 
deletePartialLoadDataIfExist(CarbonTable carbonTable,
   } finally {
 carbonTableStatusLock.unlock();
   }
+} else {
+

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-28 Thread GitBox


vikramahuja1001 commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r513205119



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonCleanFilesCommand.scala
##
@@ -80,40 +113,98 @@ case class CarbonCleanFilesCommand(
   }
 
   override def processData(sparkSession: SparkSession): Seq[Row] = {
-// if insert overwrite in progress, do not allow delete segment
-if (SegmentStatusManager.isOverwriteInProgressInTable(carbonTable)) {
-  throw new ConcurrentOperationException(carbonTable, "insert overwrite", 
"clean file")
+if (!isDryRun) {
+  // if insert overwrite in progress, do not allow delete segment
+  if (SegmentStatusManager.isOverwriteInProgressInTable(carbonTable)) {
+throw new ConcurrentOperationException(carbonTable, "insert 
overwrite", "clean file")
+  }
+  val operationContext = new OperationContext
+  val cleanFilesPreEvent: CleanFilesPreEvent =
+CleanFilesPreEvent(carbonTable,
+  sparkSession)
+  OperationListenerBus.getInstance.fireEvent(cleanFilesPreEvent, 
operationContext)
+  if (tableName.isDefined) {
+Checker.validateTableExists(databaseNameOp, tableName.get, 
sparkSession)
+if (forceTrashClean) {
+  CleanFilesUtil.deleteDataFromTrashFolder(carbonTable, sparkSession)
+} else {
+  // clear trash based on timestamp
+  CleanFilesUtil.deleteDataFromTrashFolderByTimeStamp(carbonTable, 
sparkSession)
+}
+if (forceTableClean) {
+  deleteAllData(sparkSession, databaseNameOp, tableName.get)
+} else {
+  cleanGarbageData(sparkSession, databaseNameOp, tableName.get)
+}
+// delete partial load and send them to trash
+TableProcessingOperations
+  .deletePartialLoadDataIfExist(carbonTable, false)
+// clean stash in metadata folder too
+deleteStashInMetadataFolder(carbonTable)
+  } else {
+cleanGarbageDataInAllTables(sparkSession)
+  }
+  if (cleanFileCommands != null) {
+cleanFileCommands.foreach(_.processData(sparkSession))
+  }
+  val cleanFilesPostEvent: CleanFilesPostEvent =
+CleanFilesPostEvent(carbonTable, sparkSession)
+  OperationListenerBus.getInstance.fireEvent(cleanFilesPostEvent, 
operationContext)
+  Seq.empty
+} else if (isDryRun && tableName.isDefined) {
+  // dry run, do not clean anything and do not delete trash too
+  CleanFilesUtil.cleanFilesDryRun(carbonTable, sparkSession)
+}
+else {
+  Seq.empty
 }
-val operationContext = new OperationContext
-val cleanFilesPreEvent: CleanFilesPreEvent =
-  CleanFilesPreEvent(carbonTable,
-sparkSession)
-OperationListenerBus.getInstance.fireEvent(cleanFilesPreEvent, 
operationContext)
-if (tableName.isDefined) {
-  Checker.validateTableExists(databaseNameOp, tableName.get, sparkSession)
-  if (forceTableClean) {
-deleteAllData(sparkSession, databaseNameOp, tableName.get)
+  }
+
+  def deleteStashInMetadataFolder(carbonTable: CarbonTable): Unit = {
+val tableStatusLock = CarbonLockFactory
+  .getCarbonLockObj(carbonTable.getAbsoluteTableIdentifier, 
LockUsage.TABLE_STATUS_LOCK)
+val carbonLoadModel = new CarbonLoadModel
+try {
+  if (tableStatusLock.lockWithRetries()) {
+val tableStatusFilePath = CarbonTablePath
+  .getTableStatusFilePath(carbonTable.getTablePath)
+val loadMetaDataDetails = SegmentStatusManager
+  .readTableStatusFile(tableStatusFilePath).filter(details => 
details.getSegmentStatus ==
+  SegmentStatus.SUCCESS || details.getSegmentStatus == 
SegmentStatus.LOAD_PARTIAL_SUCCESS)
+  .sortWith(_.getLoadName < _.getLoadName)
+
carbonLoadModel.setLoadMetadataDetails(loadMetaDataDetails.toList.asJava)
   } else {
-cleanGarbageData(sparkSession, databaseNameOp, tableName.get)
+throw new ConcurrentOperationException(carbonTable.getDatabaseName,
+  carbonTable.getTableName, "table status read", "clean files command")
   }
-} else {
-  cleanGarbageDataInAllTables(sparkSession)
+} finally {
+  tableStatusLock.unlock()
 }
-if (cleanFileCommands != null) {
-  cleanFileCommands.foreach(_.processData(sparkSession))
+val loadMetaDataDetails = carbonLoadModel.getLoadMetadataDetails.asScala
+val segmentFileList = loadMetaDataDetails.map(f => 
CarbonTablePath.getSegmentFilesLocation(
+  carbonTable.getTablePath) + CarbonCommonConstants.FILE_SEPARATOR + 
f.getSegmentFile)
+
+val metaDataPath = 
CarbonTablePath.getMetadataPath(carbonTable.getTablePath) +
+  CarbonCommonConstants.FILE_SEPARATOR + "segments"

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
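The deleteStashInMetadataFolder hunk above compares the segment files referenced from tablestatus with what actually sits in the Metadata/segments folder; anything unreferenced is stale. A sketch of that set difference (the example file name follows the <segmentNo>_<timestamp>.segment convention but is made up):

import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch: list the Metadata/segments folder and keep only entries that
// no tablestatus record references; those are the stale segment files.
public final class StaleSegmentFileSketch {

  static List<Path> findStale(Path segmentsDir, Set<String> referenced)
      throws IOException {
    List<Path> stale = new ArrayList<>();
    try (DirectoryStream<Path> entries = Files.newDirectoryStream(segmentsDir)) {
      for (Path entry : entries) {
        if (!referenced.contains(entry.getFileName().toString())) {
          stale.add(entry);
        }
      }
    }
    return stale;
  }

  public static void main(String[] args) throws IOException {
    // referenced names would come from the tablestatus segment file column
    Set<String> referenced =
        new HashSet<>(Arrays.asList("0_1603900000000.segment"));
    findStale(Paths.get(args[0]), referenced).forEach(System.out::println);
  }
}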

[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-28 Thread GitBox


vikramahuja1001 commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r513204008



##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/cleanfiles/TestCleanFileCommand.scala
##
@@ -0,0 +1,540 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.cleanfiles
+
+import java.io.{File, PrintWriter}
+
+import scala.io.Source
+
+import org.apache.spark.sql.{CarbonEnv, Row}
+import org.apache.spark.sql.test.util.QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.datastore.impl.FileFactory
+
+class TestCleanFileCommand extends QueryTest with BeforeAndAfterAll {
+
+  var count = 0
+
+  test("clean up table and test trash folder with In Progress segments") {
+sql("""DROP TABLE IF EXISTS CLEANTEST""")
+sql("""DROP TABLE IF EXISTS CLEANTEST1""")
+sql(
+  """
+| CREATE TABLE cleantest (name String, id Int)
+| STORED AS carbondata
+  """.stripMargin)
+sql(s"""INSERT INTO CLEANTEST SELECT "abc", 1""")
+sql(s"""INSERT INTO CLEANTEST SELECT "abc", 1""")
+sql(s"""INSERT INTO CLEANTEST SELECT "abc", 1""")
+// run a select query before deletion
+checkAnswer(sql(s"""select count(*) from cleantest"""),
+  Seq(Row(3)))
+
+val path = CarbonEnv.getCarbonTable(Some("default"), 
"cleantest")(sqlContext.sparkSession)
+  .getTablePath
+val tableStatusFilePath = path + CarbonCommonConstants.FILE_SEPARATOR + 
"Metadata" +
+  CarbonCommonConstants.FILE_SEPARATOR + "tableStatus"
+editTableStatusFile(path)
+val trashFolderPath = path + CarbonCommonConstants.FILE_SEPARATOR +
+  CarbonCommonConstants.CARBON_TRASH_FOLDER_NAME
+
+assert(!FileFactory.isFileExist(trashFolderPath))
+val dryRun = sql(s"CLEAN FILES FOR TABLE cleantest 
OPTIONS('isDryRun'='true')").count()
+// dry run shows 3 segments to move to trash
+assert(dryRun == 3)
+
+sql(s"CLEAN FILES FOR TABLE cleantest").show
+
+checkAnswer(sql(s"""select count(*) from cleantest"""),
+  Seq(Row(0)))
+assert(FileFactory.isFileExist(trashFolderPath))
+var list = getFileCountInTrashFolder(trashFolderPath)
+assert(list == 6)
+
+val dryRun1 = sql(s"CLEAN FILES FOR TABLE cleantest 
OPTIONS('isDryRun'='true')").count()
+sql(s"CLEAN FILES FOR TABLE cleantest").show
+
+count = 0
+list = getFileCountInTrashFolder(trashFolderPath)
+// no carbondata file is added to the trash
+assert(list == 6)
+
+
+val timeStamp = getTimestampFolderName(trashFolderPath)
+
+// recovering data from trash folder
+sql(
+  """
+| CREATE TABLE cleantest1 (name String, id Int)
+| STORED AS carbondata
+  """.stripMargin)
+
+val segment0Path = trashFolderPath + CarbonCommonConstants.FILE_SEPARATOR 
+ timeStamp +
+  CarbonCommonConstants.FILE_SEPARATOR + CarbonCommonConstants.LOAD_FOLDER 
+ '0'
+val segment1Path = trashFolderPath + CarbonCommonConstants.FILE_SEPARATOR 
+ timeStamp +
+  CarbonCommonConstants.FILE_SEPARATOR + CarbonCommonConstants.LOAD_FOLDER 
+ '1'
+val segment2Path = trashFolderPath + CarbonCommonConstants.FILE_SEPARATOR 
+ timeStamp +
+  CarbonCommonConstants.FILE_SEPARATOR + CarbonCommonConstants.LOAD_FOLDER 
+ '2'
+
+sql(s"alter table cleantest1 add segment options('path'='$segment0Path'," +
+  s"'format'='carbon')").show()
+sql(s"alter table cleantest1 add segment options('path'='$segment1Path'," +
+  s"'format'='carbon')").show()
+sql(s"alter table cleantest1 add segment options('path'='$segment2Path'," +
+  s"'format'='carbon')").show()
+sql(s"""INSERT INTO CLEANTEST SELECT * from cleantest1""")
+
+// test after recovering data from trash
+checkAnswer(sql(s"""select count(*) from cleantest"""),
+  Seq(Row(3)))
+
+sql(s"CLEAN FILES FOR TABLE cleantest options('force'='true')").show
+count = 0
+list = getFileCountInTrashFolder(trashFolderPath)
+// no carbondata file is added to the trash
+assert(list == 0)
+sql("""DROP TABLE IF 

[GitHub] [carbondata] vikramahuja1001 commented on a change in pull request #3917: [CARBONDATA-3978] Clean Files Refactor and support for trash folder in carbondata

2020-10-28 Thread GitBox


vikramahuja1001 commented on a change in pull request #3917:
URL: https://github.com/apache/carbondata/pull/3917#discussion_r513198165



##
File path: 
core/src/main/java/org/apache/carbondata/core/util/CarbonProperties.java
##
@@ -2116,6 +2087,26 @@ public int getMaxSIRepairLimit(String dbName, String 
tableName) {
 return Math.abs(Integer.parseInt(thresholdValue));
   }
 
+  /**
+   * The below method returns the milliseconds after which the trash folder
+   * will expire
+   */
+  public long getTrashFolderExpirationTime() {
+String configuredValue = 
getProperty(CarbonCommonConstants.CARBON_TRASH_EXPIRATION_DAYS,
+CarbonCommonConstants.CARBON_TRASH_EXPIRATION_DAYS_DEFAULT);
+Integer result = 0;
+try {
+  result = Integer.parseInt(configuredValue);
+  if (result < 0) {
+LOGGER.warn("Value of carbon.trash.expiration.days is negative, taking 
default value");
+result = Integer.parseInt(CARBON_TRASH_EXPIRATION_DAYS_DEFAULT);
+  }
+} catch (NumberFormatException e) {
+  LOGGER.error("Error happened while parsing", e);

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
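getTrashFolderExpirationTime above is defensive parse-with-fallback: a negative or malformed carbon.trash.expiration.days degrades to the default instead of failing the cleanup. A standalone sketch of the same shape (the 7-day default is an assumption; the real default lives in CarbonCommonConstants):

import java.util.concurrent.TimeUnit;

// Sketch of defensive property parsing: negative or malformed values
// fall back to the default; days are converted to milliseconds so the
// caller can compare directly against trash folder timestamps.
public final class TrashExpirationSketch {

  static final int DEFAULT_EXPIRATION_DAYS = 7;  // illustrative default

  static long expirationMillis(String configuredValue) {
    int days;
    try {
      days = Integer.parseInt(configuredValue);
      if (days < 0) {
        System.err.println("negative expiration days, taking default value");
        days = DEFAULT_EXPIRATION_DAYS;
      }
    } catch (NumberFormatException e) {
      System.err.println("invalid expiration days '" + configuredValue
          + "', taking default value");
      days = DEFAULT_EXPIRATION_DAYS;
    }
    return TimeUnit.DAYS.toMillis(days);
  }

  public static void main(String[] args) {
    System.out.println(expirationMillis("3"));    // 259200000
    System.out.println(expirationMillis("-1"));   // falls back to default
    System.out.println(expirationMillis("oops")); // falls back to default
  }
}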