andygrove commented on code in PR #255:
URL: https://github.com/apache/arrow-datafusion-comet/pull/255#discussion_r1560132834
##########
spark/src/main/scala/org/apache/comet/CometSparkSessionExtensions.scala:
##########
@@ -90,6 +98,32 @@ class CometSparkSessionExtensions
scanExec.copy(scan = cometScan),
runtimeFilters = scanExec.runtimeFilters)
+      // unsupported parquet data source V2
+      case scanExec: BatchScanExec if scanExec.scan.isInstanceOf[ParquetScan] =>
+        val requiredSchema = scanExec.scan.asInstanceOf[ParquetScan].readDataSchema
+        val info1 = if (!isSchemaSupported(requiredSchema)) {
+          CometExplainInfo(s"Schema $requiredSchema is not supported")
+        } else {
+          CometExplainInfo.none
+        }
Review Comment:
We could use Scala's `Option` type here rather than defining our own `none`
constant. For example:
```scala
val info1 = if (!isSchemaSupported(requiredSchema)) {
  Some(CometExplainInfo(s"Schema $requiredSchema is not supported"))
} else {
  None
}
```
We would also need to update the call that passes the list of reasons to
`opWithInfo` to use `flatten`, so that only the reasons that actually apply are passed:
```scala
opWithInfo(scanExec, CometExplainInfo("SCAN", Seq(info1, info2, info3).flatten))
```
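For reference, here is a minimal, self-contained sketch of this pattern. `CometExplainInfo`, `isSchemaSupported`, and `opWithInfo` are simplified, hypothetical stand-ins here, not the actual definitions in Comet:
```scala
// A runnable sketch of the Option/flatten pattern suggested above.
object OptionFlattenSketch extends App {

  // Hypothetical: a reason node with an optional list of child reasons.
  case class CometExplainInfo(name: String, children: Seq[CometExplainInfo] = Seq.empty)

  // Hypothetical schema check; the real one inspects Spark data types.
  def isSchemaSupported(schema: String): Boolean = !schema.contains("binary")

  // Hypothetical: attach the explain info to an operator (printed here).
  def opWithInfo(op: String, info: CometExplainInfo): Unit =
    println(s"$op: ${info.name} -> ${info.children.map(_.name).mkString("; ")}")

  val requiredSchema = "struct<a: binary>"

  // Each check yields Some(reason) when unsupported and None otherwise,
  // so no sentinel CometExplainInfo.none value is needed.
  val info1 =
    if (!isSchemaSupported(requiredSchema)) {
      Some(CometExplainInfo(s"Schema $requiredSchema is not supported"))
    } else {
      None
    }
  val info2: Option[CometExplainInfo] = None // e.g. this check passed
  val info3 = Some(CometExplainInfo("pushed aggregate is not supported"))

  // flatten drops every None, leaving only the reasons that apply.
  opWithInfo("scanExec", CometExplainInfo("SCAN", Seq(info1, info2, info3).flatten))
  // prints: scanExec: SCAN -> Schema struct<a: binary> is not supported; pushed aggregate is not supported
}
```
Because `flatten` removes the `None` entries, each check stays a simple `Option`-producing expression and no placeholder value has to be filtered out later.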