LantaoJin opened a new pull request, #48240:
URL: https://github.com/apache/spark/pull/48240

   ### What changes were proposed in this pull request?
   When the drop list of `DataFrameDropColumns` contains an 
`UnresolvedAttribute`, the current rule mistakenly resolves the column against 
the output attributes of the node's grandchildren.
   The issue is not encountered in DataFrame/Dataset API applications, because 
every entry in `dropList` there is an `AttributeReference`.
   It is encountered when a Spark `LogicalPlan` is used directly: an 
`UnresolvedAttribute` in `dropList` does not work.
   
   
   ### Why are the changes needed?
   In `ResolveDataFrameDropColumns`
   ```scala
         val dropped = d.dropList.map {
           case u: UnresolvedAttribute =>
             // Mistakenly resolves the column with its grand-children's output attributes.
             resolveExpressionByPlanChildren(u, d.child)
           case e => e
         }
   ```
   To fix it, change the call to `resolveExpressionByPlanChildren(u, d)` or 
`resolveExpressionByPlanOutput(u, d.child)`, so that the column is resolved 
against the output of `d.child`.
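
The difference in resolution scope can be sketched with a toy model. This is not Spark's Catalyst code: `Plan`, `Relation`, `Project`, `DropColumns`, `resolveByPlanChildren`, and `resolveByPlanOutput` are hypothetical stand-ins that use plain strings as columns, and the name lookup is assumed to be an exact match. It only illustrates why passing `d.child` to a "resolve by children" helper looks one level too deep.

```scala
// Toy model (an assumption for illustration, not Spark's Catalyst classes):
// each plan node exposes the columns it outputs and its child nodes.
sealed trait Plan {
  def output: Seq[String]
  def children: Seq[Plan]
}
final case class Relation(output: Seq[String]) extends Plan {
  val children: Seq[Plan] = Nil
}
final case class Project(output: Seq[String], child: Plan) extends Plan {
  val children: Seq[Plan] = Seq(child)
}
final case class DropColumns(dropList: Seq[String], child: Plan) extends Plan {
  val children: Seq[Plan] = Seq(child)
  def output: Seq[String] = child.output.filterNot(c => dropList.contains(c))
}

object ResolutionScope {
  // Hypothetical stand-in: look the name up in the outputs of `plan`'s children.
  def resolveByPlanChildren(name: String, plan: Plan): Option[String] =
    plan.children.flatMap(_.output).find(_ == name)

  // Hypothetical stand-in: look the name up in `plan`'s own output.
  def resolveByPlanOutput(name: String, plan: Plan): Option[String] =
    plan.output.find(_ == name)

  // relation(a, b) -> project(a, c) -> drop(c); `c` exists only above the relation.
  val relation: Relation = Relation(Seq("a", "b"))
  val project: Project = Project(Seq("a", "c"), relation)
  val dropNode: DropColumns = DropColumns(Seq("c"), project)

  // Scope used before the fix: children of d.child, i.e. the grandchildren of d.
  val buggy: Option[String] = resolveByPlanChildren("c", dropNode.child)
  // Either of the two proposed scopes: the output of d.child.
  val fixedViaChildren: Option[String] = resolveByPlanChildren("c", dropNode)
  val fixedViaOutput: Option[String] = resolveByPlanOutput("c", dropNode.child)
}
```

In this sketch the pre-fix scope cannot see `c`, and it would still match `b` even though `b` is no longer in the output of `d.child`. Both proposed scopes see exactly the columns that `d.child` produces.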
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Unit test added.
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

