Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/12248#issuecomment-207559966
  
    @srowen, I think that the main use-case for this feature is associating 
metadata with a Spark action / execution and making that metadata 
accessible in that action's tasks. 
    
    For instance, let's say that I run a Spark SQL query and want to propagate 
some metadata related to that query execution from the driver to the executors 
for use in tracing / debugging / instrumentation. Maybe I want to propagate a 
label associated with all tasks launched from the job, such as a job group 
name, and read that label in a custom log appender so that my log messages from 
those tasks contain that metadata.
    
    In this case, the actual RDD code isn't controlled by the user and they 
don't really have a place to interpose broadcast variables or other custom code 
for propagating this metadata.
    
    Even if the user's library code were to use broadcast variables and define 
thread-local variables, etc., they'd still have to worry about some subtleties 
related to Spark's internal threading model: for example, thread-locals need to 
be handled carefully to make sure that they're correctly propagated across 
thread boundaries in PythonRDD, RRDD, ScriptTransformation, PipedRDD, etc., and 
the set of places where you'd need to do that propagation corresponds exactly 
to the set of places where we already happen to be propagating the TaskContext 
thread-local.
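    
    The thread-boundary subtlety can be shown with a small, Spark-free sketch 
in plain Python. `_context`, `read_label`, `run_in_thread`, and `propagating` 
are hypothetical names used only for illustration; the point is just that a 
helper thread does not see its parent's thread-local unless someone captures 
the value and re-installs it.

```python
import threading

# Hypothetical stand-in for a task-scoped thread-local; not a Spark API.
_context = threading.local()


def read_label():
    """Return the label visible to the calling thread, or None."""
    return getattr(_context, "label", None)


def run_in_thread(fn):
    """Run fn on a fresh thread and return its result."""
    result = []
    worker = threading.Thread(target=lambda: result.append(fn()))
    worker.start()
    worker.join()
    return result[0]


def propagating(fn):
    """Capture the caller's label now and re-install it in the thread running fn."""
    captured = read_label()

    def wrapper():
        _context.label = captured
        return fn()

    return wrapper


_context.label = "job-group-42"

# Naive hand-off: the helper thread starts with an empty thread-local.
naive = run_in_thread(read_label)

# Explicit hand-off: the label is captured in the parent and restored in the child.
propagated = run_in_thread(propagating(read_label))
```

    Here `naive` ends up as `None` while `propagated` carries the label. Every 
place that hands work to another thread needs a wrapper like `propagating`, 
which is the bookkeeping a user library would otherwise have to replicate.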
    
    Given that `localProperties` is already a stable public API, I think it 
makes sense to make those properties accessible in tasks, since it seems like a 
small and logical extension of an existing API.


