[ https://issues.apache.org/jira/browse/SPARK-26851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16763818#comment-16763818 ]
Bruce Robbins commented on SPARK-26851: --------------------------------------- [~maropu] [~cloud_fan] I will let this Jira marinate for a little while, since I might have missed something (misunderstanding of the code, improvements in the JMM since initial introduction). I will make a PR if needed, but I don't want to use up 4 hours of Jenkins resources for a nonsense PR. > CachedRDDBuilder only partially implements double-checked locking > ----------------------------------------------------------------- > > Key: SPARK-26851 > URL: https://issues.apache.org/jira/browse/SPARK-26851 > Project: Spark > Issue Type: Bug > Components: SQL > Affects Versions: 2.4.0, 3.0.0 > Reporter: Bruce Robbins > Priority: Minor > > In CachedRDDBuilder, {{cachedColumnBuffers}} uses double-checked locking to > lazily initialize {{_cachedColumnBuffers}}. Also, clearCache uses > double-checked locking to likely avoid synchronization when > {{_cachedColumnBuffers}} is still null. > However, the resource (in this case, {{_cachedColumnBuffers}}) is not > declared as volatile, which could cause some visibility problems in both > methods. {{cachedColumnBuffers}} may see a null reference when there is > actually a constructed RDD, and {{clearCache}} may pick up a reference to > partially constructed RDD. > From Java Concurrency in Practice by Brian Goetz et al: > {quote}Subsequent changes in the JMM (Java 5.0 and later) have enabled DCL to > work if resource is made volatile, and the performance impact of this is > small since volatile reads are usually only slightly more expensive than > nonvolatile reads. > {quote} > There are comments in other documentation that volatile is not needed if the > resource in question (again, in this case {{_cachedColumnBuffers}}) is > immutable. While an RDD is immutable from a Spark user's point of view, it > may not be from a JVM's point of view, since not all internal fields are > final. > I've marked this as minor since the race conditions are highly unlikely. -- This message was sent by Atlassian JIRA (v7.6.3#76005) --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org