MaxGekk commented on code in PR #56677:
URL: https://github.com/apache/spark/pull/56677#discussion_r3460524493


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/finishAnalysis.scala:
##########
@@ -119,8 +119,15 @@ object ComputeCurrentTime extends Rule[LogicalPlan] {
     val currentDates = collection.mutable.HashMap.empty[ZoneId, Literal]
     val localTimestamps = collection.mutable.HashMap.empty[ZoneId, Literal]
 
+    // The CAST bit is included so this rule can find TIME -> TIMESTAMP_NTZ 
casts (which depend on
+    // CURRENT_DATE) and stabilize them below. CAST is a broad pattern, so 
this widens the rule's
+    // traversal to most plans; the precise `Cast.isTimeToTimestampNTZ` guard 
keeps the rewrite
+    // scoped. We intentionally do not tag these casts with CURRENT_LIKE 
instead: inline-table
+    // validation treats CURRENT_LIKE as safe to defer, so tagging would let 
unrelated non-foldable
+    // NTZ-target casts (e.g. CAST(rand() AS TIMESTAMP_NTZ)) bypass that 
validation (see SPARK-57618
+    // and ResolveInlineTablesSuite).
     def transformCondition(treePatternbits: TreePatternBits): Boolean = {
-      treePatternbits.containsPattern(CURRENT_LIKE)
+      treePatternbits.containsPattern(CURRENT_LIKE) || 
treePatternbits.containsPattern(CAST)

Review Comment:
    The challenge with a dedicated Cast-tagged pattern is that to make it 
actually narrow — i.e. only TIME -> TIMESTAMP_NTZ — nodePatternsInternal() 
would have to inspect child.dataType. That's exactly the path we removed in 
51136ecb5e2: node patterns are computed eagerly at construction, before the 
child is resolved, and child.dataType throws (including on an OuterReference 
that reports resolved == true but wraps an unresolved attribute, which broke 
CREATE FUNCTION ... RETURNS TABLE). The only safe key is the always-available 
target type, which tags every cast to the NTZ family — not just TIME-sourced 
ones — so it's only modestly narrower than CAST while adding a new global 
TreePattern enum entry for a rule that runs once per query.
   
    We also can't reuse CURRENT_LIKE for the tag: inline-table validation 
treats CURRENT_LIKE as safe-to-defer, so tagging let CAST(rand() AS 
TIMESTAMP_NTZ) bypass validation (1b97171f869). Pruning on CAST with the 
precise isTimeToTimestampNTZ guard on the resolved plan is the configuration 
that's both safe and correct. If the traversal cost turns out to matter I'm 
happy to add a target-keyed dedicated pattern, but it buys little over CAST 
here — would you be OK leaving it as-is?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to