Hi,

I am trying to better understand the code for Parquet support. In particular, I got lost trying to understand ParquetRelation and ParquetRelation2. Is ParquetRelation2 the new code that should completely replace ParquetRelation? (I think there is a remark in the code indicating this.)

Assume I am using

    spark.sql.parquet.filterPushdown = true
    spark.sql.parquet.useDataSourceApi = true

I saw that the buildScan method in newParquet.scala pushes filters down into Parquet, but I also saw that there is filter and projection push-down in ParquetOperations inside SparkStrategies.scala. However, every time I debug it, the match in

    object ParquetOperations extends Strategy {
      def apply(plan: LogicalPlan): Seq[SparkPlan] = plan match {
        ..........

never reaches

    case PhysicalOperation(projectList, filters: Seq[Expression], relation: ParquetRelation) =>

In which cases will the plan match this case?

Also, where is the code for Parquet projection and filter push-down: inside ParquetOperations in SparkStrategies.scala, inside buildScan in newParquet.scala, or both? If both, I am not sure how they work together.

Thanks,
Gil.