akashrn5 commented on a change in pull request #4072:
URL: https://github.com/apache/carbondata/pull/4072#discussion_r577533482



##########
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/management/CarbonCleanFilesCommand.scala
##########
@@ -37,11 +38,24 @@ case class CarbonCleanFilesCommand(
     databaseNameOp: Option[String],
     tableName: String,
     options: Map[String, String] = Map.empty,
+    dryRun: Boolean,
     isInternalCleanCall: Boolean = false)
   extends DataCommand {
 
   val LOGGER: Logger = 
LogServiceFactory.getLogService(this.getClass.getCanonicalName)
 
+  override def output: Seq[AttributeReference] = {
+    if (dryRun) {
+      Seq(
+        AttributeReference("Size Freed", LongType, nullable = false)(),
+        AttributeReference("Trash Data Remaining", LongType, nullable = 
false)())
+    } else {
+      Seq(
+        AttributeReference("Size Freed", LongType, nullable = false)(),
+        AttributeReference("Trash Data Remaining", LongType, nullable = 
false)())
+    }

Review comment:
       if else both blocks are same? i think better to give these rows only in 
case of dry run

##########
File path: 
integration/spark/src/main/scala/org/apache/carbondata/trash/DataTrashManager.scala
##########
@@ -87,13 +101,28 @@ object DataTrashManager {
     }
   }
 
-  private def checkAndCleanTrashFolder(carbonTable: CarbonTable, 
isForceDelete: Boolean): Unit = {
+  def cleanFilesDryRunOperation (
+      carbonTable: CarbonTable,
+      isForceDelete: Boolean,
+      cleanStaleInProgress: Boolean,
+      partitionSpecs: Option[Seq[PartitionSpec]] = None): Seq[Long] = {
+    // get size freed from the trash folder
+    val trashFolderSizeStats = checkAndCleanTrashFolder(carbonTable, 
isForceDelete, isDryRun = true)
+    // get size that will be deleted (MFD, COmpacted, Inprogress segments)
+    val expiredSegmentsSizeStats = dryRunOnExpiredSegments(carbonTable, 
isForceDelete,
+      cleanStaleInProgress, partitionSpecs)
+    Seq(trashFolderSizeStats.head + expiredSegmentsSizeStats.head, 
trashFolderSizeStats(1) +
+        expiredSegmentsSizeStats(1))
+  }
+
+  private def checkAndCleanTrashFolder(carbonTable: CarbonTable, 
isForceDelete: Boolean,
+      isDryRun: Boolean): Seq[Long] = {

Review comment:
       i think we are mixing the dry run option also along with forcedelete, 
and making this complex with code and combination handling, what i think is, 
when user say dry run, it should be clear that i dont take any other options 
and i just tell user in return how much and what i am going to clean, thats 
all, we will not delete or clear any files when dry run. So it will be easy to 
code and cleaner, may be new class or a new method in clean files command 
class. What you guys think @ajantha-bhat @QiangCai 




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to