PHILO-HE commented on code in PR #6672:
URL: https://github.com/apache/incubator-gluten/pull/6672#discussion_r1708740211
##########
backends-velox/src/main/scala/org/apache/gluten/backendsapi/velox/VeloxBackend.scala:
##########
@@ -73,7 +74,14 @@ object VeloxBackendSettings extends BackendSettingsApi {
format: ReadFileFormat,
fields: Array[StructField],
partTable: Boolean,
+ rootPaths: Seq[String],
paths: Seq[String]): ValidationResult = {
+ if (
+ !rootPaths.isEmpty && !VeloxFileSystemValidationJniWrapper.supportedPaths(rootPaths.toArray)
Review Comment:
Can we just extract the scheme and validate that? For example, if `rootPaths`
consists of multiple HDFS files, the native side only needs to validate the
scheme "hdfs://" once, rather than validating file by file.
In addition, since the local file system is always supported by Gluten + Velox,
maybe we can add a fast path on the Scala side that lets validation pass
directly in that case.
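To make the suggestion concrete, here is a minimal sketch of scheme-level validation with a local fast path. Everything in it is an assumption for illustration: `SchemeValidationSketch`, `schemeOf`, `supportedRootPaths`, and the `nativeCheck` function are hypothetical names, and `nativeCheck` only stands in for whatever scheme-level check the native side would expose (the existing `supportedPaths` JNI call takes full paths). Paths without a scheme are assumed to be local.

```scala
import java.util.Locale

object SchemeValidationSketch {
  // Matches a leading URI scheme such as "hdfs:" or "s3a:".
  // Note: a Windows drive letter ("C:\...") would also match; ignored here.
  private val SchemePattern = "^([A-Za-z][A-Za-z0-9+.-]*):.*".r

  // Returns the lower-cased scheme of a path; paths with no scheme
  // (e.g. "/tmp/data") are assumed to be on the local file system.
  def schemeOf(path: String): String = path match {
    case SchemePattern(scheme) => scheme.toLowerCase(Locale.ROOT)
    case _ => "file"
  }

  // Validates each distinct scheme once. "file" passes on the Scala side
  // without calling nativeCheck (hypothetical stand-in for a JNI call).
  def supportedRootPaths(rootPaths: Seq[String], nativeCheck: String => Boolean): Boolean = {
    val schemes = rootPaths.map(schemeOf).distinct
    schemes.forall(scheme => scheme == "file" || nativeCheck(scheme))
  }
}
```

With this shape, a scan over many HDFS files triggers at most one native call per distinct scheme, and a purely local scan triggers none.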
##########
backends-velox/src/test/scala/org/apache/gluten/execution/FallbackSuite.scala:
##########
@@ -263,4 +264,31 @@ class FallbackSuite extends VeloxWholeStageTransformerSuite with AdaptiveSparkPl
}
}
}
+
+ test("fallback reader with unsupported filesystem") {
+ withTempPath {
+ path =>
+ withSQLConf(GlutenConfig.NATIVE_WRITER_ENABLED.key -> "false") {
+ spark
+ .range(100)
+ .selectExpr("cast(id % 9 as int) as c1")
+ .write
+ .format("parquet")
+ .save(path.getCanonicalPath)
+ runQueryAndCompare(s"SELECT count(*) FROM `parquet`.`${path.getCanonicalPath}`") {
+ df =>
+ val plan = df.queryExecution.executedPlan
+ val fileScan = collect(plan) { case s: FileSourceScanExecTransformer => s }
+ assert(fileScan.size == 1)
+ val rootPaths = fileScan(0).getRootPathsInternal
+ assert(rootPaths.length == 1)
+ assert(rootPaths(0).startsWith("file:/"))
Review Comment:
Nit:
-> "file://"
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]