j1wonpark opened a new pull request, #56682:
URL: https://github.com/apache/spark/pull/56682

   ### What changes were proposed in this pull request?
   
   This PR makes the Spark Connect UI tab render in the Spark History Server 
(SHS). It makes two changes:
   
   1. Register `SparkConnectServerHistoryServerPlugin` via the 
`AppHistoryServerPlugin` SPI (a new 
`META-INF/services/org.apache.spark.status.AppHistoryServerPlugin` resource in 
the `connect/server` module), so that SHS discovers it through `ServiceLoader`, 
as the SQL, Streaming and Hive Thrift Server plugins already do.
   2. Make `SparkConnectServerListener` read its configuration from the 
`SparkConf` passed to its constructor instead of `SparkEnv.get.conf`.
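
As a rough illustration of the discovery mechanism behind change 1, the self-contained Java sketch below shows that `ServiceLoader` only finds a provider once a `META-INF/services` file names it. `HistoryPlugin`, `ConnectPlugin` and the temporary-directory class loader are hypothetical stand-ins chosen for this example, not Spark's classes or build layout; in this PR the resource file is a static file in the `connect/server` module.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

public class ServiceLoaderSketch {
    /** Hypothetical stand-in for an SPI such as AppHistoryServerPlugin. */
    public interface HistoryPlugin {
        String displayName();
    }

    /** Hypothetical provider; it exists on the classpath but is not registered anywhere. */
    public static class ConnectPlugin implements HistoryPlugin {
        @Override
        public String displayName() {
            return "Connect";
        }
    }

    /** Returns the display names of all providers visible through the given class loader. */
    public static List<String> discover(ClassLoader loader) {
        List<String> names = new ArrayList<>();
        for (HistoryPlugin plugin : ServiceLoader.load(HistoryPlugin.class, loader)) {
            names.add(plugin.displayName());
        }
        return names;
    }

    /** Builds a class loader whose classpath contains a META-INF/services registration. */
    public static ClassLoader loaderWithRegistration() throws IOException {
        Path root = Files.createTempDirectory("spi-sketch");
        Path services = Files.createDirectories(root.resolve("META-INF/services"));
        // File name: binary name of the interface. Content: binary name of the provider.
        Files.writeString(
            services.resolve(HistoryPlugin.class.getName()),
            ConnectPlugin.class.getName() + "\n");
        return new URLClassLoader(
            new URL[] {root.toUri().toURL()},
            ServiceLoaderSketch.class.getClassLoader());
    }

    public static void main(String[] args) throws IOException {
        // Without a registration file the implemented provider is never discovered.
        System.out.println(discover(ServiceLoaderSketch.class.getClassLoader()));
        // With the registration file on the classpath the provider is found.
        System.out.println(discover(loaderWithRegistration()));
    }
}
```

The first call is expected to find no providers as long as nothing else on the classpath registers `HistoryPlugin`; the second call finds `ConnectPlugin` only because the loader's classpath now includes the registration file.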
   
   ### Why are the changes needed?
   
   The Spark Connect UI page added in SPARK-44394 works on a live driver but 
never appears in the History Server, for two reasons:
   
   - `SparkConnectServerHistoryServerPlugin` is implemented but is not 
registered in `META-INF/services`, so SHS never loads it via `ServiceLoader`. 
The live UI works only because `SparkConnectService` registers the tab and 
listener directly.
   - Even once the plugin is registered, the listener's constructor reads 
configuration via `SparkEnv.get.conf`. There is no active `SparkEnv` during SHS 
replay, so `SparkEnv.get` returns `null` and the constructor throws an NPE when 
`FsHistoryProvider.rebuildAppStore` creates the plugin's listeners, failing the 
whole application UI with HTTP 500:
   
   ```
   java.lang.NullPointerException: Cannot invoke "org.apache.spark.SparkEnv.conf()" because the return value of "org.apache.spark.SparkEnv$.get()" is null
        at org.apache.spark.sql.connect.ui.SparkConnectServerListener.<init>(SparkConnectServerListener.scala)
        at org.apache.spark.sql.connect.ui.SparkConnectServerHistoryServerPlugin.createListeners(SparkConnectServerHistoryServerPlugin.scala)
        at org.apache.spark.deploy.history.FsHistoryProvider.rebuildAppStore(FsHistoryProvider.scala)
   ```
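
The failure mode and the shape of the fix can be sketched in plain Java. Everything here is an assumption made for illustration: `Conf` stands in for `SparkConf`, `Env` for `SparkEnv`, and the key `sketch.ui.retainedSessions` with its default of 200 is invented rather than taken from Spark. `ListenerBefore` ignores the configuration it is handed and dereferences a process-wide holder that is unset during replay; `ListenerAfter` uses only its constructor argument.

```java
import java.util.HashMap;
import java.util.Map;

public class ListenerConfSketch {
    /** Hypothetical config key, used only in this sketch. */
    public static final String RETAINED_KEY = "sketch.ui.retainedSessions";

    /** Hypothetical stand-in for SparkConf: a string-keyed settings map. */
    public static class Conf {
        private final Map<String, String> settings = new HashMap<>();

        public Conf set(String key, String value) {
            settings.put(key, value);
            return this;
        }

        public int getInt(String key, int defaultValue) {
            String value = settings.get(key);
            return value == null ? defaultValue : Integer.parseInt(value);
        }
    }

    /** Hypothetical stand-in for SparkEnv: a process-wide holder that may be unset. */
    public static class Env {
        private static Env current = null; // stays null when nothing activates an Env

        private final Conf conf;

        public Env(Conf conf) {
            this.conf = conf;
        }

        public static Env get() {
            return current;
        }

        public static void set(Env env) {
            current = env;
        }

        public Conf conf() {
            return conf;
        }
    }

    /** Before: receives a Conf but reads the global Env, so it fails when Env is unset. */
    public static class ListenerBefore {
        public final int retained;

        public ListenerBefore(Conf passedButUnused) {
            this.retained = Env.get().conf().getInt(RETAINED_KEY, 200);
        }
    }

    /** After: depends only on the Conf passed in, so it also works during replay. */
    public static class ListenerAfter {
        public final int retained;

        public ListenerAfter(Conf conf) {
            this.retained = conf.getInt(RETAINED_KEY, 200);
        }
    }

    public static void main(String[] args) {
        Conf conf = new Conf().set(RETAINED_KEY, "5");
        try {
            new ListenerBefore(conf);
        } catch (NullPointerException e) {
            System.out.println("before: NPE, no active Env");
        }
        System.out.println("after: retained = " + new ListenerAfter(conf).retained);
    }
}
```

Passing the configuration explicitly also makes the listener easier to test, since a test can construct it with any `Conf` without first setting up global state.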
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes. The Spark Connect tab (sessions / executions) now renders in the 
History Server when replaying event logs of a Spark Connect application. 
Previously the tab was never shown there. This is a change relative to released 
versions and master.
   
   ### How was this patch tested?
   
   - Added a unit test in `SparkConnectServerListenerSuite` that constructs the 
listener with no active `SparkEnv` (simulating History Server replay). It 
reproduces the NPE before the fix and passes after.
   - Updated the existing test helper to set the UI-retention configs on the 
`SparkConf` passed to the listener (previously set on `SparkEnv.get.conf`, 
which the listener no longer reads).
   - Manually verified on a History Server that the Connect tab renders from 
Spark Connect event logs.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   Generated-by: Claude Code (Claude Opus 4.8)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]