[GitHub] [carbondata] akashrn5 commented on a change in pull request #3935: [CARBONDATA-3993] Remove auto data deletion in IUD processs
akashrn5 commented on a change in pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#discussion_r505226878

## File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/CarbonProjectForDeleteCommand.scala

@@ -149,14 +149,10 @@ private[sql] case class CarbonProjectForDeleteCommand(
       case e: HorizontalCompactionException =>
         LOGGER.error("Delete operation passed. Exception in Horizontal Compaction." +
                      " Please check logs. " + e.getMessage)
-        CarbonUpdateUtil.cleanStaleDeltaFiles(carbonTable, e.compactionTimeStamp.toString)
         Seq(Row(0L))
       case e: Exception =>
         LOGGER.error("Exception in Delete data operation " + e.getMessage, e)
-        // ** start clean up.

Review comment: I don't think we can remove this directly; it might create problems, as mentioned in the comments.

## File path: integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/DeleteExecution.scala

@@ -374,8 +374,6 @@ object DeleteExecution {
             blockMappingVO.getSegmentNumberOfBlockMapping)
         }
       } else {
-        // In case of failure, clean all related delete delta files
-        CarbonUpdateUtil.cleanStaleDeltaFiles(carbonTable, timestamp)

Review comment: Same as above.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
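The behavior the reviewer wants to keep — a failed delete immediately cleaning up the delta files it wrote, keyed by its own operation timestamp, rather than deferring to a later clean files command — can be sketched as follows. This is a simplified illustration, not the actual `CarbonProjectForDeleteCommand` code; `doDelete` and `cleanStaleDeltaFiles` here are hypothetical stand-ins for the real operations.

```scala
// Simplified sketch (not CarbonData's implementation) of the failure-handling
// pattern under discussion: if the delete operation fails, the command itself
// cleans up the stale delta files tagged with this operation's timestamp
// before returning, instead of leaving them for a later "clean files" run.
def runDelete(doDelete: () => Unit,
              cleanStaleDeltaFiles: String => Unit,
              timestamp: String): Boolean = {
  try {
    doDelete()
    true
  } catch {
    case _: Exception =>
      // Immediate cleanup keyed by the failed operation's timestamp;
      // on success nothing is cleaned here.
      cleanStaleDeltaFiles(timestamp)
      false
  }
}
```

Deferring the cleanup to a separate command would widen the window in which stale delta files sit next to valid data, which is the "extra data or data inconsistency" risk the reviewer raises below.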
akashrn5 commented on a change in pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#discussion_r503037126

## File path: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala

@@ -267,9 +266,8 @@ object CarbonDataRDDFactory {
           throw new Exception("Exception in compaction " + exception.getMessage)
         }
       } finally {
-        executor.shutdownNow()
         try {
-          compactor.deletePartialLoadsInCompaction()

Review comment: @Pickupolddriver We cannot remove the cleanup of stale files in the case of IUD and wait for the clean files command to clean them; we should immediately clean the stale ones in the respective command itself, as there will be chances of extra data or data inconsistency. @QiangCai We can perhaps avoid this once we implement writing the updated data to a new segment and writing only the delete delta files to the updated segment.
akashrn5 commented on a change in pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#discussion_r503037126

## File path: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala

@@ -267,9 +266,8 @@ object CarbonDataRDDFactory {
           throw new Exception("Exception in compaction " + exception.getMessage)
         }
       } finally {
-        executor.shutdownNow()
         try {
-          compactor.deletePartialLoadsInCompaction()

Review comment:
> We cannot remove the cleanup of stale files in the case of IUD and wait for the clean files command to clean them; we should immediately clean the stale ones in the respective command itself, as there will be chances of extra data or data inconsistency.
akashrn5 commented on a change in pull request #3935:
URL: https://github.com/apache/carbondata/pull/3935#discussion_r503036710

## File path: integration/spark/src/main/scala/org/apache/carbondata/spark/rdd/CarbonDataRDDFactory.scala

@@ -267,9 +266,8 @@ object CarbonDataRDDFactory {
           throw new Exception("Exception in compaction " + exception.getMessage)
         }
       } finally {
-        executor.shutdownNow()
         try {
-          compactor.deletePartialLoadsInCompaction()

Review comment: @QiangCai How is it handled now, without listing files? Why can't we list files with a timestamp filter? The timestamp is the load timestamp/fact timestamp, which we can get from the load model or somewhere similar, right?
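The list-files-with-a-timestamp-filter idea raised here can be sketched as below. This is an assumption-laden illustration, not CarbonData's implementation: it assumes delta files carry the operation timestamp in their name and end with a `.deletedelta` suffix, and it uses plain `java.io.File` listing in place of Carbon's own file abstraction. The object and method names are hypothetical.

```scala
import java.io.File

// Hypothetical sketch: find and remove stale delete-delta files under a
// segment directory by filtering the directory listing on the failed
// operation's timestamp (assumed to be embedded in the file name).
object StaleDeltaCleanup {

  // List candidate delta files whose names embed the given timestamp.
  def listStaleDeltaFiles(segmentDir: File, timestamp: String): Seq[File] = {
    Option(segmentDir.listFiles()).getOrElse(Array.empty[File])
      .filter(f => f.getName.endsWith(".deletedelta") && f.getName.contains(timestamp))
      .toSeq
  }

  // Delete the matches immediately; returns the number of files deleted.
  def cleanStaleDeltaFiles(segmentDir: File, timestamp: String): Int =
    listStaleDeltaFiles(segmentDir, timestamp).count(_.delete())
}
```

Filtering by the operation timestamp keeps the scan cheap and targeted: only files written by the failed operation match, so valid data files and delta files from other, successful operations are never touched.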