viirya commented on code in PR #186:
URL:
https://github.com/apache/arrow-datafusion-comet/pull/186#discussion_r1519004412
##########
spark/src/test/scala/org/apache/comet/exec/CometNativeShuffleSuite.scala:
##########
@@ -64,8 +64,9 @@ class CometNativeShuffleSuite extends CometTestBase with AdaptiveSparkPlanHelper
val path = new Path(dir.toURI.toString, "test.parquet")
makeParquetFileAllTypes(path, dictionaryEnabled = dictionaryEnabled, 1000)
var allTypes: Seq[Int] = (1 to 20)
- if (isSpark34Plus) {
- allTypes = allTypes.filterNot(Set(14, 17).contains)
+ if (!isSpark34Plus) {
+ // TODO: Remove this once after https://github.com/apache/arrow/issues/40038 is fixed
+ allTypes = allTypes.filterNot(Set(14).contains)
}
Review Comment:
Besides, after this change a Comet native operator can come after
`CometExchange` in the plan, which triggers the known Java Arrow bug on
column `_14`. I therefore exclude that column for Spark 3.2 and 3.3.
```
== Physical Plan ==
AdaptiveSparkPlan isFinalPlan=true
+- == Final Plan ==
   *(1) ColumnarToRow
   +- CometProject [_1#556], [_1#556]
      +- ShuffleQueryStage 0
         +- CometExchange hashpartitioning(_13#568, 10), REPARTITION_BY_NUM, CometNativeShuffle, [plan_id=838]
            +- CometScan parquet [_1#556,_13#568] Batched: true, DataFilters: [], Format: CometParquet, Location: InMemoryFileIndex(1 paths)[file:/Users/liangchi/repos/arrow-datafusion-comet/spark/target/tmp/spa..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<_1:boolean,_13:string>
+- == Initial Plan ==
   CometProject [_1#556], [_1#556]
   +- CometExchange hashpartitioning(_13#568, 10), REPARTITION_BY_NUM, CometNativeShuffle, [plan_id=830]
      +- CometScan parquet [_1#556,_13#568] Batched: true, DataFilters: [], Format: CometParquet, Location: InMemoryFileIndex(1 paths)[file:/Users/liangchi/repos/arrow-datafusion-comet/spark/target/tmp/spa..., PartitionFilters: [], PushedFilters: [], ReadSchema: struct<_1:boolean,_13:string>
```
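For reference, the column selection from the diff can be sketched in isolation as below. This is only an illustration: `ColumnSelection` and `columnsToTest` are hypothetical names, and `isSpark34Plus` is passed in as a plain parameter here, whereas the suite gets it from its test base.

```scala
object ColumnSelection {
  // Hypothetical helper: returns the indices of the test columns (_1 to _20)
  // that the shuffle test should cover for a given Spark version.
  def columnsToTest(isSpark34Plus: Boolean): Seq[Int] = {
    val allTypes: Seq[Int] = 1 to 20
    if (isSpark34Plus) {
      allTypes
    } else {
      // Column _14 is skipped on Spark 3.2 and 3.3 because of the Java Arrow
      // issue tracked in https://github.com/apache/arrow/issues/40038.
      allTypes.filterNot(Set(14).contains)
    }
  }
}
```

Compared with the mutable `var allTypes` in the suite, returning the filtered sequence from a small function keeps the version check in one place, which may make the TODO easier to remove later.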