It would be a pleasure. That said, what do you think about adding the non-inheritable feature? I think that would be a big win for everything that doesn't specifically need Inheritability.
On Friday, April 15, 2016, Reynold Xin <r...@databricks.com> wrote: > I think this was added a long time ago by me in order to make certain > things work for Shark (good old times ...). You are probably right that by > now some apps depend on the fact that this is inheritable, and changing > that could break them in weird ways. > > Do you mind documenting this, and also add a test case? > > > On Wed, Apr 13, 2016 at 6:15 AM, Marcin Tustin <mtus...@handybook.com > <javascript:_e(%7B%7D,'cvml','mtus...@handybook.com');>> wrote: > >> *Tl;dr: *SparkContext.setLocalProperty is implemented with >> InheritableThreadLocal. >> This has unexpected consequences, not least because the method >> documentation doesn't say anything about it: >> >> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SparkContext.scala#L605 >> >> I'd like to propose that we do one of: (1) document explicitly that these >> properties are inheritable; (2) stop them being inheritable; or (3) >> introduce the option to set these in a non-inheritable way. >> >> *Motivation: *This started with me investigating a last vestige of the >> leaking spark.sql.execution.id issue in Spark 1.5.2 (it's not >> reproducible under controlled conditions, and given the many and excellent >> fixes on this issue it's completely mysterious that this hangs around; the >> bug itself is largely beside the point). >> >> The specific contribution that inheritable localProperties makes to this >> problem is that if a localProperty like spark.sql.execution.id leaks >> (i.e. remains set when it shouldn't) because those properties are inherited >> by spawned threads, that pollution affects all subsequently spawned threads. >> >> This doesn't sound like a big deal - why would worker threads be spawning >> other threads? It turns out that Java's ThreadPoolExecutor has worker >> threads spawn other worker threads (it has no master dispatcher thread; the >> workers themselves run all the housekeeping). JavaDoc here: >> https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadPoolExecutor.html >> and source code here: >> http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/8u40-b25/java/util/concurrent/ThreadPoolExecutor.java#ThreadPoolExecutor >> >> Accordingly, if using Scala Futures and any kind of thread pool that >> comes built-in with Java, it's impossible to avoid localproperties >> propagating haphazardly to different threads. For localProperties >> explicitly set by user code this isn't nice, and requires work arounds like >> explicitly clearing known properties at the start of every future, or in a >> beforeExecute hook on the threadpool. For leaky properties the work around >> is pretty much the same: defensively clear them in the threadpool. >> >> *Options:* >> (0) Do nothing at all. Unattractive, because documenting this would still >> be better; >> (1) Update the scaladoc to explicitly say that localProperties are >> inherited by spawned threads and note that caution should be exercised with >> thread pools. >> (2) Switch to using ordinary, non-inheritable thread locals. I assume >> this would break something for somebody, but if not, this would be my >> preferred option. Also a very simple change to implement if no-one is >> relying on property inheritance. >> (3) Introduce a second localProperty facility which is not inherited. >> This would not break any existing code, and should not be too hard to >> implement. localProperties which need cleanup could be migrated to using >> this non-inheritable facility, helping to limit the impact of failing to >> clean up. >> The way I envisage this working is that non-inheritable localProperties >> would be checked first, then inheritable, then global properties. >> >> *Actions:* >> I'm happy to do the coding and open such Jira tickets as desirable or >> necessary. Before I do any of that, I'd like to know if there's any support >> for this, and ideally secure a committer who can help shepherd this change >> through. >> >> Marcin Tustin >> >> Want to work at Handy? Check out our culture deck and open roles >> <http://www.handy.com/careers> >> Latest news <http://www.handy.com/press> at Handy >> Handy just raised $50m >> <http://venturebeat.com/2015/11/02/on-demand-home-service-handy-raises-50m-in-round-led-by-fidelity/> >> led >> by Fidelity >> >> > -- Want to work at Handy? Check out our culture deck and open roles <http://www.handy.com/careers> Latest news <http://www.handy.com/press> at Handy Handy just raised $50m <http://venturebeat.com/2015/11/02/on-demand-home-service-handy-raises-50m-in-round-led-by-fidelity/> led by Fidelity