sunchao commented on a change in pull request #29387:
URL: https://github.com/apache/spark/pull/29387#discussion_r474967593



##########
File path: core/src/main/scala/org/apache/spark/util/Utils.scala
##########
@@ -269,6 +269,26 @@ private[spark] object Utils extends Logging {
     file.setExecutable(true, true)
   }
 
+  /**
+   * Move data to trash on truncate table given
+   * spark.sql.truncate.trash.enabled is true
+   */
+  def moveToTrashIfEnabled(
+      fs: FileSystem,
+      partitionPath: Path,
+      isTrashEnabled: Boolean,
+      hadoopConf: Configuration): Unit = {
+    if (isTrashEnabled && hadoopConf.getInt("fs.trash.interval", 0) > 0) {

Review comment:
       hmm, what if client side config is non-positive but server side is 
positive. In this case the server side one should be honored according to the 
Hadoop trash behavior. Perhaps we can skip the second check here and clearly 
document the behavior.

##########
File path: 
sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala
##########
@@ -513,7 +515,7 @@ case class TruncateTableCommand(
             }
           }
 
-          fs.delete(path, true)
+          Utils.moveToTrashIfEnabled(fs, path, isTrashEnabled, hadoopConf)

Review comment:
       do we need to have test cases to cover when the flag is on or off?

##########
File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
##########
@@ -2701,6 +2701,15 @@ object SQLConf {
       .booleanConf
       .createWithDefault(false)
 
+  val TRUNCATE_TRASH_ENABLED =
+    buildConf("spark.sql.truncate.trash.enabled")
+      .doc("This Configuration will decide whether move files to trash on 
truncate table given, " +

Review comment:
       nit: maybe we can rephrase this as: "This configuration decides when 
truncating table, whether data files will be moved to trash directory or 
deleted permanently. The trash retention time is controlled by 
`fs.trash.interval`, and in default, the server side configuration value takes 
precedence over the client-side one. Note that if `fs.trash.interval` is 
non-positive, this will be a no-op and log a warning message."




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to