singhpk234 commented on code in PR #52522:
URL: https://github.com/apache/spark/pull/52522#discussion_r2412012200


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PushVariantIntoScan.scala:
##########
@@ -279,6 +280,8 @@ object PushVariantIntoScan extends Rule[LogicalPlan] {
      relation @ LogicalRelationWithTable(
      hadoopFsRelation@HadoopFsRelation(_, _, _, _, _: ParquetFileFormat, _), _)) =>
        rewritePlan(p, projectList, filters, relation, hadoopFsRelation)
+      case p@PhysicalOperation(projectList, filters, relation: DataSourceV2Relation) =>
+        rewriteV2RelationPlan(p, projectList, filters, relation.output, relation)

Review Comment:
   If we are already passing the relation, do we need to pass `relation.output` separately?
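   A minimal sketch of the simplification I have in mind (hypothetical signature, assuming the helper needs nothing beyond what the relation itself exposes):
   ```scala
   // Hypothetical: drop the separate output argument; inside the helper,
   // relation.output yields the same Seq[AttributeReference] the caller
   // was threading through.
   case p@PhysicalOperation(projectList, filters, relation: DataSourceV2Relation) =>
     rewriteV2RelationPlan(p, projectList, filters, relation)
   ```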



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/SparkOptimizer.scala:
##########
@@ -40,11 +40,11 @@ class SparkOptimizer(
       SchemaPruning,
       GroupBasedRowLevelOperationScanPlanning,
       V1Writes,
+      PushVariantIntoScan,

Review Comment:
   PushVariantIntoScan now runs before PruneFileSourcePartitions, which I believe targets v1 sources. Does this ordering matter, or had the rule simply been placed later in the list just because it was new?
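   For reference, a sketch of the resulting order in SparkOptimizer's `earlyScanPushDownRules` (the elided rules are placeholders, not the exact list):
   ```scala
   // Sketch of the batch after this change: PushVariantIntoScan now sits
   // ahead of PruneFileSourcePartitions, so it rewrites v1 file-source
   // relations before their partitions are pruned.
   override def earlyScanPushDownRules: Seq[Rule[LogicalPlan]] = Seq(
     SchemaPruning,
     GroupBasedRowLevelOperationScanPlanning,
     V1Writes,
     PushVariantIntoScan,       // moved up by this PR
     // ... other scan push-down rules ...
     PruneFileSourcePartitions  // v1 partition pruning now runs after
   )
   ```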



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

