andygrove opened a new issue, #4751: URL: https://github.com/apache/datafusion-comet/issues/4751
Triage pass over the open `requires-triage` queue, per the project [Bug Triage Guide](https://github.com/apache/datafusion-comet/blob/main/docs/source/contributor-guide/bug_triage.md).

- Total issues processed: 11 (7 triaged, 4 skipped, 0 failed)
- Priority counts applied: `priority:critical` 2, `priority:medium` 5
- Guide: [docs/source/contributor-guide/bug_triage.md](https://github.com/apache/datafusion-comet/blob/main/docs/source/contributor-guide/bug_triage.md)

Labels have already been applied and `requires-triage` removed from each issue listed under "Triaged". A reviewer should spot-check the calls and close this issue when satisfied. To correct a label, edit the affected issue directly.

## Triaged

### priority:critical

- Comet produces bloated results in comparison with Spark ([#4723](https://github.com/apache/datafusion-comet/issues/4723))
  - Area labels: `area:scan`, `area:aggregation`
  - Rationale: `priority:critical` was preserved as set by the reporter; `SUM` aggregation over native Iceberg scans silently returns wrong (inflated) values and the user can compare per-row Spark vs Comet output, matching the guide's decision-tree step 1 (silent wrong results).
- [Bug] percentile: DataFusion quantizes interpolation weight to 6 decimal places ([#4719](https://github.com/apache/datafusion-comet/issues/4719))
  - Area labels: `area:expressions`, `area:aggregation`
  - Rationale: silent wrong result vs Spark for the native `percentile` / `median` / `percentile_cont` path (bounded by `(upper - lower) * 1e-6`, no fallback or warning); decision-tree step 1.
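The error bound quoted in the rationale for #4719 can be illustrated with a small sketch. It assumes the weight is truncated to six decimal places; whether the native path truncates or rounds is not stated in this summary, and the function names and sample values below are hypothetical, chosen only to make the effect visible.

```python
import math


def interpolate(lower: float, upper: float, weight: float) -> float:
    """Linear interpolation between two neighbouring sorted values."""
    return lower + (upper - lower) * weight


def quantize_weight(weight: float, places: int = 6) -> float:
    """Truncate the weight to `places` decimals (assumed behaviour, for illustration)."""
    scale = 10 ** places
    return math.floor(weight * scale) / scale


# Hypothetical neighbours and fractional rank; a wide gap makes the error visible.
lower, upper = 0.0, 1_000_000.0
weight = 0.1234567

exact = interpolate(lower, upper, weight)
quantized = interpolate(lower, upper, quantize_weight(weight))

error = abs(exact - quantized)
bound = (upper - lower) * 1e-6  # the bound quoted in the rationale

print(f"exact={exact} quantized={quantized} error={error} bound={bound}")
```

The absolute error grows with the gap between the two neighbouring values, so widely spaced data can show a visible difference even though the weight itself is off by less than `1e-6`.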
### priority:medium

- Comet Native scan in Azure fails with workload identity (ignores ABFS configs and env vars) ([#4747](https://github.com/apache/datafusion-comet/issues/4747))
  - Area labels: `area:scan`
  - Rationale: native scan drops `fs.azure.*` configs and `AZURE_*` env vars so abfss reads fail with HTTP 400 from IMDS; the user can work around it by disabling Comet's native scan, so this is a functional gap with a workaround (decision-tree step 3).
- [Bug] AVG(decimal) over a window always falls back to Spark on Spark 4.x (AvgDecimal window branch is dead) ([#4731](https://github.com/apache/datafusion-comet/issues/4731))
  - Area labels: `area:expressions`
  - Rationale: results stay correct via Spark fallback; the reporter notes this is a coverage gap, not a correctness divergence, so `priority:medium` per decision-tree step 3.
- bug: shutdown `jni_api::TOKIO_RUNTIME` on exit ([#4725](https://github.com/apache/datafusion-comet/issues/4725))
  - Area labels: `area:ffi`
  - Rationale: occasional JVM hang at shutdown because tokio runtime workers attached to the JVM remain as non-daemon threads; affects benchmark/test exit, not query results, with a kill-the-JVM workaround, so this is a functional bug with a workaround per decision-tree step 3.
- Support fully-native multi-stage (distinct-combined) collect_list / collect_set ([#4724](https://github.com/apache/datafusion-comet/issues/4724))
  - Area labels: `area:expressions`, `area:aggregation`
  - Rationale: multi-stage `collect_list`/`collect_set` with a `PartialMerge` would crash natively (`Cast error: Cannot cast LIST to non-list data type Binary`), but PR #4720 already added fallback guards that route this shape through Spark, so today this is a feature gap with a working fallback, matching decision-tree step 3.
- Table Provider API ([#4706](https://github.com/apache/datafusion-comet/issues/4706))
  - Area labels: `area:scan`
  - Rationale: enhancement to abstract scan APIs so additional table providers (Delta, etc.) can be supported; missing-feature gap with the existing per-provider integration as a workaround.

## Skipped: needs more info

- Bug triage results: 2026-06-22 ([#4705](https://github.com/apache/datafusion-comet/issues/4705))
  - Prior triage summary issue (auto-labeled `requires-triage`); meta, awaiting human review and closure, not a bug.
- Bug triage results: 2026-06-11 ([#4625](https://github.com/apache/datafusion-comet/issues/4625))
  - Prior triage summary issue (auto-labeled `requires-triage`); meta, awaiting human review and closure, not a bug.
- Bug triage results: 2026-06-01 ([#4548](https://github.com/apache/datafusion-comet/issues/4548))
  - Prior triage summary issue (auto-labeled `requires-triage`); meta, awaiting human review and closure, not a bug.
- Bug triage results: 2026-05-26 ([#4441](https://github.com/apache/datafusion-comet/issues/4441))
  - Prior triage summary issue (auto-labeled `requires-triage`); meta, awaiting human review and closure, not a bug.
