xkrogen commented on a change in pull request #35969:
URL: https://github.com/apache/spark/pull/35969#discussion_r838003558
##########
File path:
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSource.scala
##########
@@ -553,8 +553,12 @@ case class DataSource(
disallowWritingIntervals(data.schema.map(_.dataType),
forbidAnsiIntervals = true)
SaveIntoDataSourceCommand(data, dataSource, caseInsensitiveOptions,
mode)
case format: FileFormat =>
+ val shouldAllowEmptySchema =
+
sparkSession.sessionState.conf.getConf(SQLConf.ALLOW_EMPTY_SCHEMAS_FOR_WRITES)
disallowWritingIntervals(data.schema.map(_.dataType),
forbidAnsiIntervals = false)
- DataSource.validateSchema(data.schema)
+ if (!shouldAllowEmptySchema) {
+ DataSource.validateSchema(data.schema)
Review comment:
Completely skipping `validateSchema` looks wrong to me. If future
validations are added there, they will get skipped as well. I think we should
pass the conf to `validateSchema`, and then allow `validateSchema` to make a
decision about whether or not to check for empty schemas, like:
```
if (hasEmptySchema(schema) &&
!conf.getConf(SQLConf.ALLOW_EMPTY_SCHEMAS_FOR_WRITES)) {
throw
QueryCompilationErrors.writeEmptySchemasUnsupportedByDataSourceError()
}
```
##########
File path:
sql/core/src/test/scala/org/apache/spark/sql/FileBasedDataSourceSuite.scala
##########
@@ -128,7 +128,9 @@ class FileBasedDataSourceSuite extends QueryTest
}
allFileBasedDataSources.foreach { format =>
- test(s"SPARK-23372 error while writing empty schema files using $format") {
+ val emptySchemaValidationConf = SQLConf.ALLOW_EMPTY_SCHEMAS_FOR_WRITES.key
+ test(s"SPARK-23372 error while writing empty schema files " +
+ s"using $format and ${emptySchemaValidationConf} disabled") {
Review comment:
Maybe we should leave this test name as-is? Since disabled is the
default. It makes this code much more messy.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]