baibaichen opened a new issue, #11911:
URL: https://github.com/apache/gluten/issues/11911

   ## Backend
   VL (Velox)
   
   ## Bug description
   
   20 Structured Streaming test suites are currently disabled (marked TODO) for both Spark 4.0 and 4.1. In addition, 3 already-enabled streaming suites carry 11 excludes related to streaming API changes.
   
   **Goal**: Enable the Structured Streaming (SS) test suites to run with GlutenPlugin loaded, falling back to vanilla Spark where needed.
   
   Parent issue: #11550
   
   ### Disabled suites (20)
   
   19 suites use `GlutenSQLTestsTrait`, 1 uses `GlutenTestsCommonTrait`.
   
   | # | Suite | Priority |
   |---|-------|----------|
   | 1 | GlutenFileStreamSinkV2Suite | Simple (1 failure) |
   | 2 | GlutenMultiStatefulOperatorsSuite | Simple (2 failures / 10 tests) |
   | 3 | GlutenStreamingQueryHashPartitionVerifySuite | Simple (1 test, needs SPARK_HOME) |
   | 4 | GlutenEventTimeWatermarkSuite | Medium |
   | 5 | GlutenFileStreamSourceSuite | Medium |
   | 6 | GlutenStreamSuite | Medium (~66 tests) |
   | 7 | GlutenStreamingAggregationSuite | Medium |
   | 8 | GlutenStreamingAggregationDistributionSuite | Medium |
   | 9 | GlutenStreamingDeduplicationSuite | Medium |
   | 10 | GlutenStreamingDeduplicationDistributionSuite | Medium |
   | 11 | GlutenStreamingInnerJoinSuite | Medium |
   | 12 | GlutenStreamingOuterJoinSuite | Medium |
   | 13 | GlutenStreamingSessionWindowDistributionSuite | Medium |
   | 14 | GlutenStreamingStateStoreFormatCompatibilitySuite | Medium |
   | 15 | GlutenFlatMapGroupsWithStateSuite | Medium |
   | 16 | GlutenFlatMapGroupsWithStateDistributionSuite | Medium |
   | 17 | GlutenFlatMapGroupsInPandasWithStateDistributionSuite | Complex (Python/Pandas) |
   | 18 | GlutenRocksDBStateStoreFlatMapGroupsWithStateSuite | Follows suite 15 |
   | 19 | GlutenRocksDBStateStoreStreamingAggregationSuite | Follows suite 7 |
   | 20 | GlutenRocksDBStateStoreStreamingDeduplicationSuite | Follows suite 9 |
   
   ### Excludes in already-enabled suites (from #11400)
   
   - GlutenStreamRealTimeModeAllowlistSuite (3 excludes)
   - GlutenStreamRealTimeModeE2ESuite (7 excludes)
   - GlutenStreamRealTimeModeSuite (1 exclude)
   - SPARK-53942 stateful shuffle partitions (2 excludes)
   
   ### Root causes
   
   1. **Plan assertion failures**: GlutenPlugin replaces vanilla operators with columnar ones (for example, `ShuffleExchangeExec` becomes `ColumnarShuffleExchangeExec`), so tests that assert on specific plan nodes fail.
   2. **Checkpoint resource loading**: Golden files contain plan structures that don't match Gluten-transformed plans.
   3. **Spark 4.1 streaming API changes**: these include SPARK-53941 (AQE) and SPARK-53233 (package refactor), among others.
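
   To illustrate root cause 1, here is a minimal sketch of a plan assertion that tolerates both the vanilla and the Gluten-transformed shuffle node. It is not taken from the Gluten code base: `PlanNode` and `ShuffleAssertions` are hypothetical stand-ins for Spark's plan classes so the example runs without Spark on the classpath. Only the two node names come from this issue.

   ```scala
   // Hypothetical stand-in for a physical plan node (NOT Spark's SparkPlan).
   final case class PlanNode(nodeName: String, children: Seq[PlanNode] = Nil) {
     // Pre-order traversal that returns every node matching the predicate.
     def collect(p: PlanNode => Boolean): Seq[PlanNode] =
       (if (p(this)) Seq(this) else Nil) ++ children.flatMap(_.collect(p))
   }

   object ShuffleAssertions {
     // Node names as given in the issue text; extend the set as needed.
     private val shuffleNames =
       Set("ShuffleExchangeExec", "ColumnarShuffleExchangeExec")

     // Counts shuffle nodes whether or not the plan was transformed.
     def countShuffles(plan: PlanNode): Int =
       plan.collect(n => shuffleNames.contains(n.nodeName)).size
   }

   object PlanAssertionDemo {
     def main(args: Array[String]): Unit = {
       val vanilla = PlanNode("HashAggregateExec",
         Seq(PlanNode("ShuffleExchangeExec", Seq(PlanNode("Scan")))))
       val transformed = PlanNode("HashAggregateExec",
         Seq(PlanNode("ColumnarShuffleExchangeExec", Seq(PlanNode("Scan")))))

       // The same assertion holds for both plan shapes.
       assert(ShuffleAssertions.countShuffles(vanilla) == 1)
       assert(ShuffleAssertions.countShuffles(transformed) == 1)
     }
   }
   ```

   In a real suite the same idea would presumably be expressed by matching on either plan class instead of comparing node names; which form fits best depends on the individual test.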
   
   ### Notes
   
   - `GlutenStreamingQueryHashPartitionVerifySuite` uses `GlutenTestsCommonTrait` intentionally: switching to `GlutenSQLTestsTrait` causes a diamond inheritance conflict with `StreamTest`. The actual issue is that `getWorkspaceFilePath` requires `SPARK_HOME`.
   - Suggested order: start with suites 1-3 (simple), then work through suites 4-16 one suite at a time; the RocksDB variants (18-20) follow their parent suites.
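
   As a rough sketch of the `SPARK_HOME` dependency noted above, the helper below resolves a workspace path only when `SPARK_HOME` is set and returns `None` otherwise, so a caller can skip the test instead of failing. `WorkspacePaths.workspaceFilePath` is a hypothetical helper written for this illustration; it is not Spark's `getWorkspaceFilePath`, and the environment is passed in as a `Map` purely to keep the example testable.

   ```scala
   import java.nio.file.{Path, Paths}

   object WorkspacePaths {
     // Hypothetical helper (an assumption, not Spark's API): resolves
     // <SPARK_HOME>/<first>/<more...>, or None when SPARK_HOME is
     // missing or empty.
     def workspaceFilePath(
         env: Map[String, String],
         first: String,
         more: String*): Option[Path] =
       env.get("SPARK_HOME")
         .filter(_.nonEmpty)
         .map(home => Paths.get(home, (first +: more): _*))
   }
   ```

   A suite could call this with `sys.env` and cancel the test when the result is empty; whether skipping or setting `SPARK_HOME` in CI is the better fix is left open here.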


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

