xuanyuanking commented on a change in pull request #31638:
URL: https://github.com/apache/spark/pull/31638#discussion_r585636401
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/FileStreamSink.scala
##########
@@ -40,17 +41,31 @@ object FileStreamSink extends Logging {
* be read.
*/
def hasMetadata(path: Seq[String], hadoopConf: Configuration, sqlConf:
SQLConf): Boolean = {
- path match {
- case Seq(singlePath) =>
- val hdfsPath = new Path(singlePath)
- val fs = hdfsPath.getFileSystem(hadoopConf)
- if (fs.isDirectory(hdfsPath)) {
- val metadataPath = getMetadataLogPath(fs, hdfsPath, sqlConf)
- fs.exists(metadataPath)
- } else {
- false
- }
- case _ => false
+ if (sqlConf.getConf(SQLConf.FILE_SINK_FORMAT_CHECK_ENABLED)) {
Review comment:
It's a very reasonable concern. Let me share more details about the
context:
- It's a regression after SPARK-26824 (you may notice that after that fix the
`isDirectory` and `exists` checks sit outside the try-catch block). In our use
case, the exception was thrown only with a long glob path. The same code
passed in Spark 2.4.
- The exception was thrown by the `isDirectory` check. However, we can't be
sure that this is always the behavior, since `FileSystem` has different
implementations in different systems (we hit this on a non-HDFS file
system).
So based on the two points above, I chose a safer way to fix this
issue. The current fix, a new config plus handling of the glob path exception,
should carefully restore the behavior before SPARK-26824 without introducing
new behavior changes.
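
To make the idea above concrete, here is a minimal, self-contained sketch of the shape of the fix. It is not the code in this PR. `SimpleFs`, `HasMetadataSketch`, `isGlobPath`, and the `formatCheckEnabled` flag are hypothetical stand-ins for Hadoop's `FileSystem`, `FileStreamSink.hasMetadata`, and the new config. The sketch also assumes that only exceptions raised for glob paths are swallowed.

```scala
import scala.util.control.NonFatal

// Hypothetical stand-in for Hadoop's FileSystem: only the two calls discussed above.
trait SimpleFs {
  def isDirectory(path: String): Boolean
  def exists(path: String): Boolean
}

object HasMetadataSketch {
  // Hypothetical helper: a path containing any glob character is treated as a glob path.
  def isGlobPath(path: String): Boolean =
    path.exists(c => "{}[]*?\\".indexOf(c) >= 0)

  // Assumption: `formatCheckEnabled` plays the role of the new config flag.
  def hasMetadata(paths: Seq[String], fs: SimpleFs, formatCheckEnabled: Boolean): Boolean =
    formatCheckEnabled && (paths match {
      case Seq(single) =>
        try {
          // Both checks sit inside the try block, as they did before SPARK-26824.
          fs.isDirectory(single) && fs.exists(s"$single/_spark_metadata")
        } catch {
          // A file system may reject a long glob path here; assume no metadata.
          case NonFatal(_) if isGlobPath(single) => false
        }
      case _ => false
    })
}
```

With this shape, a glob path that makes the directory check throw is treated as having no sink metadata, while a failure on a plain path still propagates to the caller.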