andygrove commented on code in PR #255:
URL: https://github.com/apache/arrow-datafusion-comet/pull/255#discussion_r1560132834
##########
spark/src/main/scala/org/apache/comet/CometSparkSessionExtensions.scala:
##########
@@ -90,6 +98,32 @@ class CometSparkSessionExtensions
scanExec.copy(scan = cometScan),
runtimeFilters = scanExec.runtimeFilters)
+      // unsupported parquet data source V2
+      case scanExec: BatchScanExec if scanExec.scan.isInstanceOf[ParquetScan] =>
+        val requiredSchema = scanExec.scan.asInstanceOf[ParquetScan].readDataSchema
+        val info1 = if (!isSchemaSupported(requiredSchema)) {
+          CometExplainInfo(s"Schema $requiredSchema is not supported")
+        } else {
+          CometExplainInfo.none
+        }
Review Comment:
We could use Scala's `Option` type here rather than defining our own `none`
constant. For example:
```scala
val info1 = if (!isSchemaSupported(requiredSchema)) {
  Some(CometExplainInfo(s"Schema $requiredSchema is not supported"))
} else {
  None
}
```
We would also need to update the call that passes the list of reasons to
`opWithInfo` to use `flatten`, so that only the reasons that actually apply are passed:
```scala
opWithInfo(scanExec, CometExplainInfo("SCAN", Seq(info1, info2, info3).flatten))
```
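For reference, here is a minimal, self-contained sketch of this pattern. `CometExplainInfo`, `isSchemaSupported`, and `opWithInfo` are simplified, hypothetical stand-ins here, not the actual definitions in Comet:
```scala
// A runnable sketch of the Option/flatten pattern suggested above.
object OptionFlattenSketch extends App {

  // Hypothetical: a reason node with an optional list of child reasons.
  case class CometExplainInfo(name: String, children: Seq[CometExplainInfo] = Seq.empty)

  // Hypothetical schema check; the real one inspects Spark data types.
  def isSchemaSupported(schema: String): Boolean = !schema.contains("binary")

  // Hypothetical: attach the explain info to an operator (printed here).
  def opWithInfo(op: String, info: CometExplainInfo): Unit =
    println(s"$op: ${info.name} -> ${info.children.map(_.name).mkString("; ")}")

  val requiredSchema = "struct<a: binary>"

  // Each check yields Some(reason) when unsupported and None otherwise,
  // so no sentinel CometExplainInfo.none value is needed.
  val info1 =
    if (!isSchemaSupported(requiredSchema)) {
      Some(CometExplainInfo(s"Schema $requiredSchema is not supported"))
    } else {
      None
    }
  val info2: Option[CometExplainInfo] = None // e.g. this check passed
  val info3 = Some(CometExplainInfo("pushed aggregate is not supported"))

  // flatten drops every None, leaving only the reasons that apply.
  opWithInfo("scanExec", CometExplainInfo("SCAN", Seq(info1, info2, info3).flatten))
  // prints: scanExec: SCAN -> Schema struct<a: binary> is not supported; pushed aggregate is not supported
}
```
Because `flatten` removes the `None` entries, each check stays a simple `Option`-producing expression and no placeholder value has to be filtered out later.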