schenksj commented on PR #3932:
URL: https://github.com/apache/datafusion-comet/pull/3932#issuecomment-4276098364

   ### Delta 3.3.2 full-regression status update
   
   Ran the Delta 3.3.2 full test suite against `delta-kernel-phase-1` (tip 
`25085889`) to establish a baseline.
   
   **Totals (7h 55m run):**
   - Total tests: **13,612**
   - Passed: **13,437**
   - Failed: **139** after the latest fix (175 in the run itself)
   - Canceled: 5
   - Ignored: 3,683
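
As a quick cross-check of the totals above, the sketch below recomputes them. It assumes (an inference from the arithmetic, not something the report states) that the Passed figure comes from the run itself, before the 36 `OptimizeGeneratedColumnSuite` tests were fixed, and that canceled and ignored tests are counted outside the total. The object and field names are illustrative only.

```scala
// Figures copied from the totals above; names are hypothetical.
object RegressionTotals {
  val total = 13612
  val passed = 13437
  val failedInRun = 175 // failures reported by the full run
  val fixedSince = 36   // OptimizeGeneratedColumnSuite tests fixed afterwards

  // Failures still open once the fixed tests are subtracted.
  val failedNow: Int = failedInRun - fixedSince

  // Passed plus in-run failures; equals `total` under the assumption above.
  val accountedFor: Int = passed + failedInRun
}
```

Under that assumption `failedNow` matches the 139 reported, and `accountedFor` matches the total test count.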
   
   **Progress since the run:**
   - `OptimizeGeneratedColumnSuite`: 36 parameterized partition-expression 
tests were failing because the suite has its own private 
`getPushedPartitionFilters` helper that matches only `FileSourceScanExec`, 
while Comet rewrites that node into `CometScanExec` / 
`CometDeltaNativeScanExec`. Patched the test helper in 
`dev/diffs/delta/3.3.2.diff` to unwrap both Comet variants (the same approach 
already used in `TestsStatistics.scala` and `ScanReportHelper.scala`). 
Verified in isolation: **54/54 tests pass** (previously 36 of the 54 were 
failing). Committed as `25085889`.
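
The shape of that helper change can be sketched roughly as follows. This is a minimal, self-contained illustration and not the actual patch: every type below is a stand-in that only borrows the names of the Spark/Comet nodes, filters are modelled as plain strings, and the fields of the `CometDeltaNativeScanExec` stand-in are hypothetical.

```scala
// Stand-in types only: these are NOT the real Spark/Comet classes.
sealed trait PlanNode { def children: Seq[PlanNode] }

final case class FileSourceScanExec(partitionFilters: Seq[String]) extends PlanNode {
  val children: Seq[PlanNode] = Nil
}

// Models a Comet scan that keeps the original scan in `wrapped`.
final case class CometScanExec(wrapped: FileSourceScanExec) extends PlanNode {
  val children: Seq[PlanNode] = Nil
}

// Hypothetical shape: this node's real fields are not described here.
final case class CometDeltaNativeScanExec(partitionFilters: Seq[String]) extends PlanNode {
  val children: Seq[PlanNode] = Nil
}

final case class Project(child: PlanNode) extends PlanNode {
  val children: Seq[PlanNode] = Seq(child)
}

object PushedFilters {
  // Collect partition filters from every scan in the plan, handling the
  // vanilla scan and both Comet variants instead of only the vanilla one.
  def getPushedPartitionFilters(plan: PlanNode): Seq[String] = plan match {
    case scan: FileSourceScanExec         => scan.partitionFilters
    case CometScanExec(wrapped)           => wrapped.partitionFilters
    case native: CometDeltaNativeScanExec => native.partitionFilters
    case other                            => other.children.flatMap(getPushedPartitionFilters)
  }
}
```

A helper that matched only the first case would return no filters for a Comet-rewritten plan, which is the kind of mismatch the failing assertions ran into.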
   
   **Remaining 139 fails, grouped by cluster (triaged):**
   | Cluster | Count | Triage |
   |---|---|---|
   | `IdentityColumnAdmissionScalaSuite` MERGE UPSERT (parameterized) | ~16 | Not yet investigated |
   | `DeletionVectorsWithPredicatePushdownSuite` | ~15 | Partition-count assertion; behavioral (Comet scan produces different splits) |
   | `DeltaSource* / DeltaSourceLargeLogSuite` streaming | ~14 | Streaming DataFrame returns 0 rows where 2 are expected |
   | `DeletionVectorsSuite` core DV reads + `checkpoint with DVs` | ~14 | DV path-prefix (`test%dv%prefix-…`) not applied in Comet scan path |
   | `DeltaParquetFileFormatSuite` DV metadata columns | 8 | `native_delta_compat` scan not materialising injected `__delta_internal_is_row_deleted` |
   | `OptimizeMetadataOnlyDeltaQuery*ColumnMappingSuite` joins/windows | 8 | AQE `setLogicalLinkForNewQueryStage` assertion: Comet scan replacement loses plan metadata |
   | `DeltaSuite` partition-skip (×3) + `DeltaSinkSuite` partitioned reading (×3) | 6 | `numFiles` metric lives on `CometScanExec`, not its `wrapped`, so unwrapping alone isn't enough |
   | Misc singletons (TIMESTAMP_NTZ partition, NOT NULL file-writing, DescribeDeltaHistory, metric suites, known pollution `Column DEFAULT…negative`) | ~15 | Per-suite investigations |
   | `DeltaErrorsSuite` "Validate that links to docs are correct" | 1 | External: `docs.delta.io` now serves HTTP 301; not a Comet bug, would fail on vanilla too |
   | Rest | ~42 | Mostly id-mode parameterisations of already-noted clusters + schema-evolution variants |
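
For the `numFiles` cluster, one possible direction is for the test side to read the metric from the Comet node first and only then fall back to the wrapped scan. The sketch below illustrates that lookup order under stated assumptions: both case classes are hypothetical stand-ins, metrics are modelled as a plain `Map[String, Long]` rather than Spark's metric objects, and nothing here is taken from the actual Comet or Delta code.

```scala
// Stand-in types only; names and fields are hypothetical.
final case class FileScanStub(metrics: Map[String, Long])
final case class CometScanStub(wrapped: FileScanStub, metrics: Map[String, Long])

object ScanMetrics {
  // Prefer the metric recorded on the Comet node itself; fall back to the
  // wrapped scan only when the outer node does not report it.
  def numFiles(scan: CometScanStub): Option[Long] =
    scan.metrics.get("numFiles").orElse(scan.wrapped.metrics.get("numFiles"))
}
```

Whether the fix belongs in the test diff or in Comet itself is still open; the sketch only shows why unwrapping to `wrapped` by itself would miss the value.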
   
   **Next steps:**
   1. Pick a cluster (leaning towards the DV-path-prefix cluster first — it 
affects 14+ tests and the root cause is already localised to the Comet scan 
path, so one fix should unblock several suites).
   2. Continue with the other scan-metric / AQE-link / DV-metadata-column 
clusters — these are Comet-code changes rather than test-diff changes, so each 
will get its own isolated validation run before the next full regression.
   3. Re-run the full Delta 3.3.2 regression once a non-trivial batch of 
clusters is resolved, to check for new interactions.
   
   Full log + report: `dev/regression-logs/full-20260418-113345.log` (+ 
`.report.txt`).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

