dongjoon-hyun commented on code in PR #52493:
URL: https://github.com/apache/spark/pull/52493#discussion_r2393333446


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala:
##########
@@ -1142,6 +1142,52 @@ object SimplifyCaseConversionExpressions extends 
Rule[LogicalPlan] {
   }
 }
 
+/**
+ * Removes date and time related functions that are unnecessary.
+ */
+object SimplifyDateTimeConversions extends Rule[LogicalPlan] {
+  def apply(plan: LogicalPlan): LogicalPlan = plan.transformWithPruning(
+    _.containsPattern(DATETIME), ruleId) {
+    case q: LogicalPlan => q.transformExpressionsUpWithPruning(
+      _.containsPattern(DATETIME), ruleId) {
+      case DateFormatClass(
+          GetTimestamp(
+            e @ DateFormatClass(
+              _,
+              pattern,
+              timeZoneId),
+            pattern2,
+            TimestampType,
+            _,
+            timeZoneId2,
+            _),

Review Comment:
   Just a question. Maybe, the following is better?
   
   ```scala
          case DateFormatClass(
              GetTimestamp(
   -            e @ DateFormatClass(
   -              _,
   -              pattern,
   -              timeZoneId),
   +            e @ DateFormatClass(_, pattern, timeZoneId),
                pattern2,
                TimestampType,
                _,
   ```



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala:
##########
@@ -1142,6 +1142,52 @@ object SimplifyCaseConversionExpressions extends 
Rule[LogicalPlan] {
   }
 }
 
+/**
+ * Removes date and time related functions that are unnecessary.
+ */
+object SimplifyDateTimeConversions extends Rule[LogicalPlan] {
+  def apply(plan: LogicalPlan): LogicalPlan = plan.transformWithPruning(
+    _.containsPattern(DATETIME), ruleId) {
+    case q: LogicalPlan => q.transformExpressionsUpWithPruning(
+      _.containsPattern(DATETIME), ruleId) {
+      case DateFormatClass(
+          GetTimestamp(
+            e @ DateFormatClass(
+              _,
+              pattern,
+              timeZoneId),
+            pattern2,
+            TimestampType,
+            _,
+            timeZoneId2,
+            _),
+          pattern3,
+          timeZoneId3)
+          if pattern.semanticEquals(pattern2) && 
pattern.semanticEquals(pattern3)
+            && timeZoneId == timeZoneId2 && timeZoneId == timeZoneId3 =>
+        e
+      case GetTimestamp(
+          DateFormatClass(
+            e @ GetTimestamp(
+              _,
+              pattern,
+              TimestampType,
+              _,
+              timeZoneId,
+              _),
+            pattern2,
+            timeZoneId2),
+          pattern3,
+          TimestampType,

Review Comment:
   ditto. Maybe?
   
   ```scala
          case GetTimestamp(
              DateFormatClass(
   -            e @ GetTimestamp(
   -              _,
   -              pattern,
   -              TimestampType,
   -              _,
   -              timeZoneId,
   -              _),
   +            e @ GetTimestamp(_, pattern, TimestampType, _, timeZoneId, _),
                pattern2,
                timeZoneId2),
   ```



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/expressions.scala:
##########
@@ -1142,6 +1142,52 @@ object SimplifyCaseConversionExpressions extends 
Rule[LogicalPlan] {
   }
 }
 
+/**
+ * Removes date and time related functions that are unnecessary.

Review Comment:
   Do you think we can elaborate more on what the definition of `unnecessary` 
is as of now? An itemized list would be great because we can add more items 
when we improve `SimplifyDateTimeConversions` in the future.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/RuleIdCollection.scala:
##########
@@ -175,6 +175,7 @@ object RuleIdCollection {
       "org.apache.spark.sql.catalyst.optimizer.RewriteAsOfJoin" ::
       "org.apache.spark.sql.catalyst.optimizer.SimplifyBinaryComparison" ::
       
"org.apache.spark.sql.catalyst.optimizer.SimplifyCaseConversionExpressions" ::
+      "org.apache.spark.sql.catalyst.optimizer.SimplifyDateTimeConversions" ::

Review Comment:
   Although this order is not strict, can we move this new rule after 
`SimplifyConditionals`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to