singhpk234 commented on code in PR #52522:
URL: https://github.com/apache/spark/pull/52522#discussion_r2412012200


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PushVariantIntoScan.scala:
##########
@@ -279,6 +280,8 @@ object PushVariantIntoScan extends Rule[LogicalPlan] {
      relation @ LogicalRelationWithTable(
      hadoopFsRelation@HadoopFsRelation(_, _, _, _, _: ParquetFileFormat, _), _)) =>
        rewritePlan(p, projectList, filters, relation, hadoopFsRelation)
+      case p@PhysicalOperation(projectList, filters, relation: DataSourceV2Relation) =>
+        rewriteV2RelationPlan(p, projectList, filters, relation.output, relation)

Review Comment:
   If we are already passing the relation, do we need to pass `relation.output` separately?
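   A minimal sketch of the simplification I have in mind (hypothetical signature, assuming the helper needs nothing beyond what the relation itself exposes):
   ```scala
   // Hypothetical: drop the separate output argument; inside the helper,
   // relation.output yields the same Seq[AttributeReference] the caller
   // was threading through.
   case p@PhysicalOperation(projectList, filters, relation: DataSourceV2Relation) =>
     rewriteV2RelationPlan(p, projectList, filters, relation)
   ```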



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkOptimizer.scala:
##########
@@ -40,11 +40,11 @@ class SparkOptimizer(
       SchemaPruning,
       GroupBasedRowLevelOperationScanPlanning,
       V1Writes,
+      PushVariantIntoScan,

Review Comment:
   PushVariantIntoScan now runs before PruneFileSourcePartitions, which I believe targets v1 sources. Does this ordering matter, or had the rule simply been placed later in the list just because it was new?
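   For reference, a sketch of the resulting order in SparkOptimizer's `earlyScanPushDownRules` (the elided rules are placeholders, not the exact list):
   ```scala
   // Sketch of the batch after this change: PushVariantIntoScan now sits
   // ahead of PruneFileSourcePartitions, so it rewrites v1 file-source
   // relations before their partitions are pruned.
   override def earlyScanPushDownRules: Seq[Rule[LogicalPlan]] = Seq(
     SchemaPruning,
     GroupBasedRowLevelOperationScanPlanning,
     V1Writes,
     PushVariantIntoScan,       // moved up by this PR
     // ... other scan push-down rules ...
     PruneFileSourcePartitions  // v1 partition pruning now runs after
   )
   ```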



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

