flaming-archer commented on code in PR #4528:
URL: https://github.com/apache/kyuubi/pull/4528#discussion_r2097463026


##########
extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/read/HiveScanBuilder.scala:
##########
@@ -37,6 +37,8 @@ case class HiveScanBuilder(
       catalogTable = table,
       dataSchema = dataSchema,
       readDataSchema = readDataSchema(),
-      readPartitionSchema = readPartitionSchema())
+      readPartitionSchema = readPartitionSchema(),
+      partitionFilters = partitionFilters,
+      dataFilters = dataFilters)

Review Comment:
   > Yes, `dataFilters` has no effect on correctness.
   > 
   > KSHC does not support pushdown filters to file sources, so, in fact, both 
`dataFilters` and `pushedFilters` are useless.
   
   I understand that this PR cannot solve pushedfilters, only partitionFilters 
are used, dataFilters are still not used. Is that right? DataFilters are never 
used , see 
   ```
    override def listFiles(
         partitionFilters: Seq[Expression],
         dataFilters: Seq[Expression]): Seq[PartitionDirectory] = {
       def isNonEmptyFile(f: FileStatus): Boolean = {
         isDataPath(f.getPath) && f.getLen > 0
       }
       val selectedPartitions =
         if (partitionSpec().partitionColumns.isEmpty) {
           HiveConnectorUtils.createPartitionDirectory(
             InternalRow.empty,
             allFiles().filter(isNonEmptyFile)) :: Nil
         } else {
           if (recursiveFileLookup) {
             throw new IllegalArgumentException(
               "Datasource with partition do not allow recursive file loading.")
           }
           prunePartitions(partitionFilters, partitionSpec()).map {
             case partPath @ PartitionPath(values, path) =>
               val files: Seq[FileStatus] = leafDirToChildrenFiles.get(path) 
match {
                 case Some(existingDir) =>
                   // Directory has children files in it, return them
                   existingDir.filter(f => matchPathPattern(f) && 
isNonEmptyFile(f))
   
                 case None =>
                   // Directory does not exist, or has no children files
                   Nil
               }
               val partDir = 
HiveConnectorUtils.createPartitionDirectory(values, files)
               // Update Partition Directory -> binding Hive part map
               updatePartDirHivePartitionMapping(partDir, partPath)
   
               partDir
           }
         }
       logTrace("Selected files after partition pruning:\n\t" + 
selectedPartitions.mkString("\n\t"))
       selectedPartitions
     }
   ```
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscr...@kyuubi.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscr...@kyuubi.apache.org
For additional commands, e-mail: notifications-h...@kyuubi.apache.org

Reply via email to