Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3486#issuecomment-73004081
  
    Hi @ksakellis,
    
    This patch looks like it's in pretty good shape overall.
    
    When displaying the UI in the HistoryServer or the standalone Master's 
display of completed application UIs, the executor log links might not always 
work (e.g. if the worker serving the logs is dead).  This will sometimes work, 
though, since it might be common for the standalone worker or Yarn node daemons 
to outlive the application.  I don't see a great solution for this, but I'm 
open to any ideas / suggestions.  I don't think this should necessarily block 
this PR, though.
    
    I think I understand the rationale for why we use environment variables for 
passing the log viewer links: the CoarseGrainedSchedulerBackend, which posts 
the ExecutorAdded event, is used in both Yarn and Standalone modes and 
therefore can't know how to map executor ids to container log paths (since it 
doesn't know about node manager URLs).  However, each deployment mode has a 
different backend for launching executors, so we can determine the URL there 
and pass it back to the driver.  I agree with Andrew that this dataflow seems a 
bit non-intuitive, but I don't necessarily see a better way to handle this 
without modifying a bunch of other components.  I don't think that the current 
design precludes us from changing this in the future.  We _are_ kind of locking 
ourselves into exposing the logName => logUrl map in a listener event, but this 
seems reasonable (this is independent of how we get the data backing this event 
to the source where the event is posted to the bus).
    
    Two minor code changes that we need:
    
    - The `private[ui]` that I removed turns out to be necessary for MiMa 
checks to pass.  Can you add this back in and add comment explaining why this 
is needed?
    - Could you revert 04e410e3d51619af2a24d7e97fe5c876698deaf0, my patch for 
the JsonProtocol compatibility, since it turns out that we don't need this 
because we only modify new-in-1.3 features?
    - If you don't mind, could you squash my changes into a single commit so 
that the merge script attributes authorship properly by default?  I can also 
fix up the authorship attribution myself if I merge this PR.
    
    Sorry for the mess due to my earlier changes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to