GitHub user ericl commented on the pull request:

    https://github.com/apache/spark/pull/12248#issuecomment-207557823
  
    > Your change is about passing around a Properties, right? You can simply 
access such an object anywhere you need to and it will be sent around as 
needed. There is nothing to do actually, not even explicitly setting it as some 
context property.
    
    That's not true: if you access a static `Properties` object within an 
executor node, it won't have the value you set in the driver, since closures 
only capture variables in lexical scope.
    
    > Your example however seems to be about configuring some global 
per-function behavior, not sending props. In this example, why would the 
library not call setLogLevel internally, either in static initialization or as 
needed when any method is invoked -- why would the caller have to do it?
    
    It's more about configuring behavior based on some property set by some 
upstream caller of the function. The idea is that the user wants to configure 
loglevel just for this job, without impacting any other jobs potentially 
running on the cluster.
    
    > But, how is this helped by adding an additional Properties parameter?
    
    Sorry, I should have made the example more explicit. setLogLevel would be 
implemented on the driver side as `sc.setLocalProperty("mylib.loglevel", 
level)`. On the executor side the library would query 
`TaskContext.getLocalProperty("mylib.loglevel")` to determine the verbosity of 
debug logs.
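    
    To make the flow concrete, here is a minimal toy model in plain Java with no Spark dependency. It is only a sketch of the idea, not how Spark implements it: the class `LocalPropsSketch` and its methods `setLocalProperty`, `getLocalProperty` and `runJob` are hypothetical names chosen to mirror the calls above. The submitting thread's properties are snapshotted at submit time and installed around the task body on the worker thread, so the task can look the value up without naming it in its closure.

```java
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Toy model only: every name here is hypothetical, none of it is Spark API.
public class LocalPropsSketch {
    // "Driver" side: properties scoped to the thread that submits the job.
    private static final ThreadLocal<Properties> DRIVER_PROPS =
        ThreadLocal.withInitial(Properties::new);

    // "Executor" side: properties of the task currently running on this thread.
    private static final ThreadLocal<Properties> TASK_PROPS = new ThreadLocal<>();

    static void setLocalProperty(String key, String value) {
        DRIVER_PROPS.get().setProperty(key, value);
    }

    // Returns null when called outside a task or when the key was never set.
    static String getLocalProperty(String key) {
        Properties p = TASK_PROPS.get();
        return p == null ? null : p.getProperty(key);
    }

    // Snapshot the submitter's properties, run the task on a separate worker
    // thread, and make the snapshot visible there for the task's duration.
    static <T> T runJob(Supplier<T> task) throws Exception {
        Properties snapshot = new Properties();
        snapshot.putAll(DRIVER_PROPS.get());
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            return executor.submit(() -> {
                TASK_PROPS.set(snapshot);
                try {
                    return task.get();
                } finally {
                    TASK_PROPS.remove();
                }
            }).get();
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        setLocalProperty("mylib.loglevel", "DEBUG");
        // The task body never captures the property; the "library" looks it up.
        String level = runJob(() -> getLocalProperty("mylib.loglevel"));
        System.out.println(level); // prints DEBUG
    }
}
```

    Another thread that submits a job without calling `setLocalProperty` first would see `null` for the same key in this model, which is the per-job isolation the log-level example relies on.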
    
    More generally, I think this adds a mechanism for passing values 
implicitly without requiring the user (who is writing Spark code) to manually 
reference them in each of their closures. You are right that this can be achieved 
via other mechanisms, but those may not be convenient or practical for the use 
case, e.g. if you want to integrate with something like 
[X-trace](http://www.x-trace.net/wiki/doku.php) (which is out of the scope of this 
PR, but would be easy to add once we have the mechanism).


