I am trying the very same thing to configure the minimum split size with Spark
1.3.1, and I get a compilation error.
Code:
val hadoopConfiguration = new Configuration(sc.hadoopConfiguration)
hadoopConfiguration.set("mapreduce.input.fileinputformat.split.maxsize", "67108864")
sc.newAPIHadoopFile
You can indeed override the Hadoop configuration at a per-RDD level, though it
is a little more verbose. As the example below shows, you effectively need to
make a copy of the Hadoop Configuration:
import org.apache.hadoop.conf.Configuration

val thisRDDConf = new Configuration(sc.hadoopConfiguration)
thisRDDConf.set("mapred.min.split.size", "67108864") // e.g. 64 MB, as in the question above
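The copied Configuration can then be passed explicitly when creating the RDD,
so the override applies only to that RDD. A minimal sketch, assuming a
plain-text input (the path, input format, and key/value classes below are
illustrative):

import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

// Pass the per-RDD Configuration copy explicitly; sc.hadoopConfiguration
// itself is left untouched, so other RDDs keep the global settings.
val rdd = sc.newAPIHadoopFile(
  "hdfs:///path/to/input",  // illustrative path
  classOf[TextInputFormat],
  classOf[LongWritable],
  classOf[Text],
  thisRDDConf)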