Github user JoshRosen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/2684#discussion_r18498021
  
    --- Diff: core/src/main/scala/org/apache/spark/rdd/HadoopRDD.scala ---
    @@ -132,24 +132,12 @@ class HadoopRDD[K, V](
       // Returns a JobConf that will be used on slaves to obtain input splits for Hadoop reads.
       protected def getJobConf(): JobConf = {
         val conf: Configuration = broadcastedConf.value.value
    -    if (conf.isInstanceOf[JobConf]) {
    -      // A user-broadcasted JobConf was provided to the HadoopRDD, so always use it.
    -      conf.asInstanceOf[JobConf]
    -    } else if (HadoopRDD.containsCachedMetadata(jobConfCacheKey)) {
    -      // getJobConf() has been called previously, so there is already a local cache of the JobConf
    -      // needed by this RDD.
    -      HadoopRDD.getCachedMetadata(jobConfCacheKey).asInstanceOf[JobConf]
    -    } else {
    -      // Create a JobConf that will be cached and used across this RDD's getJobConf() calls in the
    -      // local process. The local cache is accessed through HadoopRDD.putCachedMetadata().
    -      // The caching helps minimize GC, since a JobConf can contain ~10KB of temporary objects.
    -      // Synchronize to prevent ConcurrentModificationException (Spark-1097, Hadoop-10456).
    -      HadoopRDD.CONFIGURATION_INSTANTIATION_LOCK.synchronized {
    -        val newJobConf = new JobConf(conf)
    +    HadoopRDD.CONFIGURATION_INSTANTIATION_LOCK.synchronized {
    +      val newJobConf = new JobConf(conf)
    --- End diff --
    
    `JobConf` seems to implement this constructor by calling the copy constructor of its superclass, `Configuration`.
    
    Take a look at the `git blame` for Configuration:
    
    
    https://github.com/apache/hadoop/blame/release-2.5.1/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/conf/Configuration.java#L662
    
    It looks like this constructor performs proper copying and has done so for a while (since 2009 or 2010).
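
To make the point about constructor delegation concrete, here is a minimal sketch of the pattern being described: a subclass whose constructor simply forwards to a superclass copy constructor, which in turn snapshots the source's entries while holding the source's lock. `SimpleConf` and `SimpleJobConf` are hypothetical stand-ins invented for illustration; they are not Hadoop's `Configuration` or `JobConf`, and the real classes carry considerably more state.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a configuration class (NOT Hadoop's Configuration).
class SimpleConf {
  // Each instance owns its own mutable property table.
  private val props = mutable.HashMap.empty[String, String]

  // Copy constructor: copy the other instance's entries while holding its
  // lock, so a concurrent set() on `other` cannot interleave with the copy.
  def this(other: SimpleConf) = {
    this()
    other.synchronized {
      props ++= other.props
    }
  }

  def set(key: String, value: String): Unit = synchronized {
    props(key) = value
  }

  def get(key: String): Option[String] = synchronized {
    props.get(key)
  }
}

// Hypothetical stand-in for a job-level subclass (NOT Hadoop's JobConf).
// Its constructor does nothing except delegate to the superclass copy constructor.
class SimpleJobConf(other: SimpleConf) extends SimpleConf(other)
```

Under these assumptions, `new SimpleJobConf(base)` yields an object whose entries are independent of `base`: setting a key on the copy leaves `base` unchanged. Whether any additional external locking is still needed around the real constructor depends on Hadoop's actual implementation, which this sketch does not reproduce.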


