baibaichen opened a new issue, #11911: URL: https://github.com/apache/gluten/issues/11911
## Backend

VL (Velox)

## Bug description

20 Structured Streaming (SS) test suites are disabled (TODO) for both Spark 4.0 and 4.1. Additionally, 3 already-enabled streaming suites have 11 excludes related to streaming API changes.

**Goal**: Enable SS test suites to run with GlutenPlugin loaded, allowing fallback to vanilla Spark where needed.

Parent issue: #11550

### Disabled suites (20)

19 suites use `GlutenSQLTestsTrait`, 1 uses `GlutenTestsCommonTrait`.

| # | Suite | Priority |
|---|-------|----------|
| 1 | GlutenFileStreamSinkV2Suite | Simple (1 failure) |
| 2 | GlutenMultiStatefulOperatorsSuite | Simple (2 failures / 10 tests) |
| 3 | GlutenStreamingQueryHashPartitionVerifySuite | Simple (1 test, needs SPARK_HOME) |
| 4 | GlutenEventTimeWatermarkSuite | Medium |
| 5 | GlutenFileStreamSourceSuite | Medium |
| 6 | GlutenStreamSuite | Medium (~66 tests) |
| 7 | GlutenStreamingAggregationSuite | Medium |
| 8 | GlutenStreamingAggregationDistributionSuite | Medium |
| 9 | GlutenStreamingDeduplicationSuite | Medium |
| 10 | GlutenStreamingDeduplicationDistributionSuite | Medium |
| 11 | GlutenStreamingInnerJoinSuite | Medium |
| 12 | GlutenStreamingOuterJoinSuite | Medium |
| 13 | GlutenStreamingSessionWindowDistributionSuite | Medium |
| 14 | GlutenStreamingStateStoreFormatCompatibilitySuite | Medium |
| 15 | GlutenFlatMapGroupsWithStateSuite | Medium |
| 16 | GlutenFlatMapGroupsWithStateDistributionSuite | Medium |
| 17 | GlutenFlatMapGroupsInPandasWithStateDistributionSuite | Complex (Python/Pandas) |
| 18 | GlutenRocksDBStateStoreFlatMapGroupsWithStateSuite | Follows suite 15 |
| 19 | GlutenRocksDBStateStoreStreamingAggregationSuite | Follows suite 7 |
| 20 | GlutenRocksDBStateStoreStreamingDeduplicationSuite | Follows suite 9 |

### Excludes in already-enabled suites (from #11400)

- GlutenStreamRealTimeModeAllowlistSuite (3 excludes)
- GlutenStreamRealTimeModeE2ESuite (7 excludes)
- GlutenStreamRealTimeModeSuite (1 exclude)
- SPARK-53942 stateful shuffle partitions (2 excludes)

### Root causes

1. **Plan assertion failures**: GlutenPlugin replaces `ShuffleExchangeExec` with `ColumnarShuffleExchangeExec`, etc. Tests that assert specific plan nodes fail.
2. **Checkpoint resource loading**: Golden files contain plan structures that don't match Gluten-transformed plans.
3. **Spark 4.1 streaming API changes**: SPARK-53941 (AQE), SPARK-53233 (package refactor), etc.

### Notes

- `GlutenStreamingQueryHashPartitionVerifySuite` uses `GlutenTestsCommonTrait` intentionally: switching to `GlutenSQLTestsTrait` causes a diamond inheritance conflict with `StreamTest`. The actual issue is that `getWorkspaceFilePath` requires `SPARK_HOME`.
- Suggested order: start with suites 1-3 (simple), then work through suites 4-16 one suite at a time. The RocksDB variants (suites 18-20) follow their parent suites.
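
To illustrate root cause 1 (plan assertion failures), here is a minimal, self-contained sketch of a plan check that accepts either the vanilla or the columnar shuffle node. `PlanNode`, `PlanAssertions`, `Demo`, and the `Aggregate`/`Scan` node names are hypothetical stand-ins invented for this example; they are not Spark or Gluten classes. Only the two shuffle node names come from the description above. A real fix would presumably match on the actual plan classes inside each suite.

```scala
// Toy model of a physical plan tree. Hypothetical stand-in, not Spark's SparkPlan.
final case class PlanNode(name: String, children: Seq[PlanNode] = Seq.empty) {
  // Pre-order traversal: collect a value for every node the partial function accepts.
  def collect[A](pf: PartialFunction[PlanNode, A]): Seq[A] =
    pf.lift(this).toSeq ++ children.flatMap(_.collect(pf))
}

object PlanAssertions {
  // Node names treated as equivalent shuffles (both taken from the issue text).
  val shuffleNames: Set[String] =
    Set("ShuffleExchangeExec", "ColumnarShuffleExchangeExec")

  // Count shuffle nodes regardless of whether the plugin rewrote them.
  def countShuffles(plan: PlanNode): Int =
    plan.collect { case n if shuffleNames.contains(n.name) => n }.size
}

object Demo {
  // Plan shape as a vanilla run might produce it (simplified).
  val vanillaPlan: PlanNode =
    PlanNode("Aggregate", Seq(PlanNode("ShuffleExchangeExec", Seq(PlanNode("Scan")))))

  // Same shape after the shuffle node has been replaced by its columnar variant.
  val glutenPlan: PlanNode =
    PlanNode("Aggregate", Seq(PlanNode("ColumnarShuffleExchangeExec", Seq(PlanNode("Scan")))))
}
```

An assertion written as `PlanAssertions.countShuffles(plan) == 1` holds for both `Demo.vanillaPlan` and `Demo.glutenPlan`, whereas an assertion that looks only for `ShuffleExchangeExec` would fail on the second plan. Whether a tolerant match or a per-test exclude is the better choice likely depends on what each test is meant to verify.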
