Github user andrewor14 commented on a diff in the pull request:
https://github.com/apache/spark/pull/2432#discussion_r18316503
--- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala ---
@@ -1278,8 +1298,8 @@ class SparkContext(config: SparkConf) extends Logging {
  private def postApplicationStart() {
    // Note: this code assumes that the task scheduler has been initialized and has contacted
    // the cluster manager to get an application ID (in case the cluster manager provides one).
-    listenerBus.post(SparkListenerApplicationStart(appName, taskScheduler.applicationId(),
-      startTime, sparkUser))
+    listenerBus.post(SparkListenerApplicationStart(appName,
+      Some(taskScheduler.applicationId().toString), startTime, sparkUser))
--- End diff --
I see. So the reason it has to be an `Option` now is different from the reason it
had to be an `Option` in the first place. Here we need it to default to `None`
if we load it from a log that doesn't have this field, but before this patch it
would also be `None` because the scheduler might not provide us with an ID. I guess
that means we can never change this then...
I think at the very least we should add a comment above
`SparkListenerApplicationStart` explaining why the application ID is an `Option`.
It's not at all intuitive to me why it has to be an `Option` when all live
applications will always pass in an ID.
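To make the point concrete, here is a hedged sketch of the two code paths being discussed, using a simplified stand-in for `SparkListenerApplicationStart` (the class name, field set, and helper methods below are illustrative, not the actual Spark definitions):

```scala
// Simplified stand-in for Spark's application-start listener event.
// The application ID is an Option purely for replay compatibility:
// event logs written before this field existed contain no ID, so a
// reader must be able to construct the event with appId = None.
// Live applications, by contrast, always supply Some(id).
case class AppStartEvent(
    appName: String,
    appId: Option[String], // None only when replaying an old event log
    startTime: Long,
    sparkUser: String)

object AppStartDemo {
  // Live path: the scheduler always provides an ID, so the field is
  // always Some(...) here.
  def fromLiveScheduler(name: String, schedulerAppId: String): AppStartEvent =
    AppStartEvent(name, Some(schedulerAppId), System.currentTimeMillis(), "spark")

  // Replay path: an old log may lack the field entirely, so the event
  // is reconstructed with None.
  def fromOldLog(name: String): AppStartEvent =
    AppStartEvent(name, None, 0L, "spark")

  def main(args: Array[String]): Unit = {
    assert(fromLiveScheduler("app", "app-123").appId.contains("app-123"))
    assert(fromOldLog("app").appId.isEmpty)
    println("ok")
  }
}
```

This is exactly the asymmetry above: only the replay path ever produces `None`, which is why a comment on the event class would help.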