[GitHub] [spark] cloud-fan commented on a diff in pull request #39408: [SPARK-41896][SQL] Filtering by row index returns empty results

GitBox Tue, 10 Jan 2023 06:23:57 -0800


cloud-fan commented on code in PR #39408:
URL: https://github.com/apache/spark/pull/39408#discussion_r1065845336



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningAwareFileIndex.scala:
##########
@@ -76,28 +76,35 @@ abstract class PartitioningAwareFileIndex(
     // be applied to files.
     val fileMetadataFilterOpt = dataFilters.filter { f =>
       f.references.nonEmpty && f.references.forall {
-        case FileSourceMetadataAttribute(_) => true
+        case FileSourceConstantMetadataAttribute(_) => true
         case _ => false
       }
     }.reduceOption(expressions.And)
 
-    // - create a bound references for filters: put the metadata struct at 0 position for each file
-    // - retrieve the final metadata struct (could be pruned) from filters
+    // - Retrieve all required metadata attributes and put them into a sequence
+    // - Bind all file constant metadata attribute references to their respective index
+    val requiredMetadataColumnNames: mutable.Buffer[String] = mutable.Buffer.empty
     val boundedFilterMetadataStructOpt = fileMetadataFilterOpt.map { fileMetadataFilter =>
-      val metadataStruct = fileMetadataFilter.references.head.dataType
-      val boundedFilter = Predicate.createInterpreted(fileMetadataFilter.transform {
-        case _: AttributeReference => BoundReference(0, metadataStruct, nullable = true)
+      val metadataStruct = fileMetadataFilter.references.head

Review Comment:
   Hmm, I thought the attrs were already flattened, so there is no struct anymore.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


