[GitHub] [spark] maropu commented on pull request #28383: [SPARK-31590][SQL] Metadata-only queries should not include subquery in partition filters

GitBox Mon, 04 May 2020 18:25:09 -0700


maropu commented on pull request #28383:
URL: https://github.com/apache/spark/pull/28383#issuecomment-623791024



   > Applying OptimizeMetadataOnlyQuery rule will generate scalar-subquery.
   
   Is this statement true? It seems the test query itself has a subquery.
   ```
   // Analyzed plan of the test query
   Aggregate [partcol1#40], [partcol1#40, max(partcol2#41) AS partcol2#71]
   +- Filter ((partcol1#40 = scalar-subquery#70 []) AND (partcol2#41 = even))
      :  +- Aggregate [max(partcol1#40) AS max(partcol1)#73]
      :     +- SubqueryAlias spark_catalog.default.srcpart
      :        +- Relation[col1#38,col2#39,partcol1#40,partcol2#41] parquet
      +- SubqueryAlias spark_catalog.default.srcpart
         +- Relation[col1#38,col2#39,partcol1#40,partcol2#41] parquet
   ```
   I think the root cause is simply that an unsupported expression (a subquery) in `partitionFilters` is passed into `FileIndex.listFiles`.
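
   To make the point concrete, here is a minimal sketch of dropping subquery-bearing predicates before they reach `FileIndex.listFiles`. It assumes Spark 3.x with `spark-catalyst` on the classpath. The object name, the `withoutSubqueries` helper, and the column types are hypothetical and chosen only for illustration; `SubqueryExpression.hasSubquery` is the existing Catalyst utility. This is not necessarily the fix this PR should take, and any predicate dropped here would still have to be evaluated elsewhere in the plan.

   ```scala
   import org.apache.spark.sql.catalyst.expressions.{AttributeReference, EqualTo, Expression, Literal, ScalarSubquery, SubqueryExpression}
   import org.apache.spark.sql.catalyst.plans.logical.OneRowRelation
   import org.apache.spark.sql.types.{IntegerType, StringType}

   object PartitionFilterSketch {
     // Hypothetical helper: keep only predicates that contain no subquery,
     // i.e. the ones that are safe to hand to FileIndex.listFiles.
     def withoutSubqueries(filters: Seq[Expression]): Seq[Expression] =
       filters.filterNot(SubqueryExpression.hasSubquery)

     def main(args: Array[String]): Unit = {
       // Column names mirror the analyzed plan above; the types are assumptions.
       val partcol1 = AttributeReference("partcol1", IntegerType)()
       val partcol2 = AttributeReference("partcol2", StringType)()

       // partcol1 = (scalar subquery); OneRowRelation is only a stand-in plan.
       val subqueryFilter = EqualTo(partcol1, ScalarSubquery(OneRowRelation()))
       // partcol2 = 'even'
       val plainFilter = EqualTo(partcol2, Literal("even"))

       val safe = withoutSubqueries(Seq(subqueryFilter, plainFilter))
       // Only the subquery-free predicate should remain.
       assert(safe == Seq(plainFilter))
     }
   }
   ```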


