Github user gvramana commented on a diff in the pull request:

    https://github.com/apache/carbondata/pull/2128#discussion_r180034323
  
    --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/api/CarbonStore.scala ---
    @@ -151,13 +152,88 @@ object CarbonStore {
             }
           }
         } finally {
    +      if (currentTablePartitions.equals(None)) {
    +        cleanUpPartitionFoldersRecurssively(carbonTable, List.empty[PartitionSpec])
    +      } else {
    +        cleanUpPartitionFoldersRecurssively(carbonTable, currentTablePartitions.get.toList)
    +      }
    +
           if (carbonCleanFilesLock != null) {
             CarbonLockUtil.fileUnlock(carbonCleanFilesLock, LockUsage.CLEAN_FILES_LOCK)
           }
         }
         LOGGER.audit(s"Clean files operation is success for $dbName.$tableName.")
       }
     
    +  /**
    +   * delete partition folders recurssively
    +   *
    +   * @param carbonTable
    +   * @param partitionSpecList
    +   */
    +  def cleanUpPartitionFoldersRecurssively(carbonTable: CarbonTable,
    +      partitionSpecList: List[PartitionSpec]): Unit = {
    +    if (carbonTable != null) {
    +      val loadMetadataDetails = SegmentStatusManager
    --- End diff --
    
    1. Partition folders cannot be deleted, as there is no way to check whether a new data load is using them.
    2. Clean files should not take multiple snapshots of the file system.
    3. Partition locations are also set for partitions inside the table path; those folders should not be scanned twice.
    4. The CarbonFile interface should be used for filesystem operations.

