andygrove opened a new pull request, #4328:
URL: https://github.com/apache/datafusion-comet/pull/4328

   ## Which issue does this PR close?
   
   Closes #3900.
   
   ## Rationale for this change
   
   Comet provides limited benefit when its shuffle manager is not registered, because shuffle typically dominates query runtime. When users enable Comet but forget to set `spark.shuffle.manager=org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager`, they may believe Comet is accelerating their workload when most of the runtime is actually spent in Spark's default shuffle.
   
   The discussion in #3900 noted that for testing (for example, measuring scan performance in isolation) it is sometimes useful to run Comet with Spark's default shuffle manager. To preserve that workflow, the new check is gated by an opt-out config rather than applied unconditionally.
   
   ## What changes are included in this PR?
   
   - New config `spark.comet.exec.shuffle.required` (default `true`).
   - `CometSparkSessionExtensions.isCometLoaded` now returns `false` and logs a warning when `spark.comet.exec.shuffle.required=true` and `CometShuffleManager` is not registered.
   - Updated the existing `isCometLoaded` test (it now sets `spark.comet.exec.shuffle.required=false`) and added a new test covering all three states (required-but-missing → disabled, opted-out → enabled, required-and-set → enabled).
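
The three states listed above can be sketched as a small decision function. This is an illustrative Python model only, not the Scala code in this PR: the function name `is_comet_loaded` and the plain-dict config are assumptions, and the sketch covers only the shuffle-manager check, not any other conditions `isCometLoaded` evaluates.

```python
import logging

# Config keys and the class name are taken from the PR description.
SHUFFLE_MANAGER_KEY = "spark.shuffle.manager"
SHUFFLE_REQUIRED_KEY = "spark.comet.exec.shuffle.required"
COMET_SHUFFLE_MANAGER = (
    "org.apache.spark.sql.comet.execution.shuffle.CometShuffleManager"
)


def is_comet_loaded(conf: dict) -> bool:
    """Hypothetical model of the new shuffle gate (assumed name and signature).

    `conf` stands in for the Spark configuration as a plain dict of strings.
    """
    # The new config defaults to true, so the check applies unless opted out.
    required = conf.get(SHUFFLE_REQUIRED_KEY, "true").strip().lower() == "true"
    registered = conf.get(SHUFFLE_MANAGER_KEY) == COMET_SHUFFLE_MANAGER

    if required and not registered:
        # Required but missing: Comet is disabled and the user is warned.
        logging.warning(
            "Comet is disabled: %s is true but %s is not set to %s",
            SHUFFLE_REQUIRED_KEY,
            SHUFFLE_MANAGER_KEY,
            COMET_SHUFFLE_MANAGER,
        )
        return False

    # Either opted out, or required and correctly registered.
    return True
```

Under this model, an empty config disables Comet, while setting either `spark.comet.exec.shuffle.required=false` or the Comet shuffle manager keeps it enabled.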
   
   ## How are these changes tested?
   
   - `CometSparkSessionExtensionsSuite` — 7/7 pass, including the new test.
   - `CometExpressionSuite` smoke run — 127/127 pass (`CometTestBase` already 
sets the shuffle manager, so default behavior is unaffected).
   - `configs.md` is auto-generated at doc-build time, so the new config will 
appear in the user-facing config reference automatically.

