srowen commented on a change in pull request #23768: [SPARK-26851][SQL] Fix double-checked locking in CachedRDDBuilder
URL: https://github.com/apache/spark/pull/23768#discussion_r259083019
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala
 ##########
 @@ -49,7 +49,7 @@ case class CachedRDDBuilder(
     storageLevel: StorageLevel,
     @transient cachedPlan: SparkPlan,
     tableName: Option[String])(
-    @transient private var _cachedColumnBuffers: RDD[CachedBatch] = null) {
+    @transient @volatile private var _cachedColumnBuffers: RDD[CachedBatch] = null) {
 
 Review comment:
   @bersprockets @cloud-fan Oops, I just noticed this causes the Scala 2.11 build to fail:
   
   ```
   [error] /home/jenkins/workspace/spark-master-test-maven-hadoop-2.7-ubuntu-scala-2.11/sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala:52: values cannot be volatile
   [error]     @transient @volatile private var _cachedColumnBuffers: RDD[CachedBatch] = null) {
   [error]   
   ```
   
   It looks like this might be a scalac bug that is only fixed in 2.12. I didn't look too hard, but ended up here:
   https://github.com/scala/bug/issues/8873
   https://github.com/scala/scala/pull/5294
   
   It might be sufficient to move this to a private field, as I don't think any caller actually sets this value. Let me try.
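
A minimal sketch of that idea, not the actual patch: the cached value is declared as a private `@volatile var` in the class body instead of in the second constructor parameter list, and is filled in with double-checked locking. `Builder`, `build`, `cached`, and `clear` are hypothetical names, and `Seq[Int]` stands in for `RDD[CachedBatch]` so the example runs without Spark.

```scala
// Hypothetical stand-in for CachedRDDBuilder; Seq[Int] replaces RDD[CachedBatch].
case class Builder(tableName: Option[String])(build: () => Seq[Int]) {

  // Declared in the class body rather than as a constructor parameter,
  // so @volatile is applied to an ordinary private var.
  @transient @volatile private var _cached: Seq[Int] = null

  // Double-checked locking: read the volatile field once without the lock,
  // and only synchronize when it has not been populated yet.
  def cached: Seq[Int] = {
    var result = _cached
    if (result == null) {
      synchronized {
        result = _cached
        if (result == null) {
          result = build()
          _cached = result
        }
      }
    }
    result
  }

  // Drop the cached value so the next call to `cached` rebuilds it.
  def clear(): Unit = synchronized {
    _cached = null
  }
}
```

With this shape no caller can pass the cached value in through the constructor, which only works if, as suggested above, nothing outside the class sets it.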

