Github user rahulforallp commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2128#discussion_r180079486
  
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/api/CarbonStore.scala ---
    @@ -151,13 +152,88 @@ object CarbonStore {
             }
           }
         } finally {
    +      if (currentTablePartitions.equals(None)) {
    +        cleanUpPartitionFoldersRecurssively(carbonTable, List.empty[PartitionSpec])
    +      } else {
    +        cleanUpPartitionFoldersRecurssively(carbonTable, currentTablePartitions.get.toList)
    +      }
    +
           if (carbonCleanFilesLock != null) {
             CarbonLockUtil.fileUnlock(carbonCleanFilesLock, LockUsage.CLEAN_FILES_LOCK)
           }
         }
         LOGGER.audit(s"Clean files operation is success for $dbName.$tableName.")
       }
     
    +  /**
    +   * delete partition folders recurssively
    +   *
    +   * @param carbonTable
    +   * @param partitionSpecList
    +   */
    +  def cleanUpPartitionFoldersRecurssively(carbonTable: CarbonTable,
    +      partitionSpecList: List[PartitionSpec]): Unit = {
    +    if (carbonTable != null) {
    +      val loadMetadataDetails = SegmentStatusManager
    --- End diff --
    
    1. Partition folders cannot be deleted, as there is no way to check whether a new 
data load is using them. ==> Done
    2. Should not take multiple snapshots of the file system during clean files. ==> 
Earlier we were not taking the snapshot recursively, so it is required here for 
partition folders.
    3. Partition locations will be valid for partitions inside the table path also; 
those folders should not be scanned twice. ==> Done
    4. The CarbonFile interface should be used for filesystem operations. ==> Done
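
For point 3, a minimal sketch of how partition locations under the table path could be separated out so that those folders are not scanned a second time. This is only an illustration and not the code in the PR: `externalPartitionLocations` and the sample paths are hypothetical, and it uses plain `java.nio.file` paths rather than the CarbonFile interface.

```scala
import java.nio.file.{Path, Paths}

object PartitionScanSketch {

  // Hypothetical helper: keep only the partition locations that lie outside
  // the table path. Locations under the table path are already covered by
  // the table-level scan, so scanning them again would be redundant.
  def externalPartitionLocations(
      tablePath: String,
      partitionLocations: Seq[String]): Seq[String] = {
    val root: Path = Paths.get(tablePath).normalize()
    partitionLocations
      .map(loc => Paths.get(loc).normalize())
      .filterNot(_.startsWith(root)) // component-wise prefix check, not a string prefix
      .map(_.toString)
      .distinct                      // drop duplicate external locations
  }

  def main(args: Array[String]): Unit = {
    // Sample paths are made up for illustration.
    val external = externalPartitionLocations(
      "/store/db/t1",
      Seq("/store/db/t1/country=IN", "/ext/t1/country=US", "/ext/t1/country=US"))
    external.foreach(println)
  }
}
```

Only the locations returned by the helper would need a separate scan; everything else is reached through the table path.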

