Any discussion on this? I would like to hear more advice from the community before I create the PR.
An example is how a NewHadoopRDD is created. We first get a Configuration back from the Job:

    val updatedConf = job.getConfiguration
    new NewHadoopRDD(this, fClass, kClass, vClass, updatedConf)

and then a jobContext is created again from this Configuration object, in NewHadoopRDD.scala (L74):

    val jobContext = newJobContext(conf, jobId)
    val rawSplits = inputFormat.getSplits(jobContext).toArray

because inputFormat is from the mapreduce package, its methods only accept a JobContext as the parameter.

I think we should avoid introducing Configuration as the parameter, but, same as before, it will change the APIs. (A minimal sketch of the current round-trip is at the end of this message.)

Best,

--
Nan Zhu

On Wednesday, February 26, 2014 at 8:23 AM, Nan Zhu wrote:
> Hi, all
>
> I just created a JIRA: https://spark-project.atlassian.net/browse/SPARK-1139 .
> The issue discusses the following:
>
> the new Hadoop API based Spark APIs are actually a mixture of the old and new Hadoop APIs.
>
> Spark APIs still use JobConf (or Configuration) as one of the parameters, but Configuration has actually been replaced by mapreduce.Job in the new Hadoop API.
>
> For example:
> http://codesfusion.blogspot.ca/2013/10/hadoop-wordcount-with-new-map-reduce-api.html
>
> &
>
> http://www.slideshare.net/sh1mmer/upgrading-to-the-new-map-reduce-api (p10)
>
> Personally I think it is better to fix this design, but it will introduce some compatibility issues.
>
> Just bringing it here for your advice.
>
> Best,
>
> --
> Nan Zhu
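
P.S. For anyone skimming the thread, here is a minimal sketch (not the actual Spark source) of the round-trip described above. It assumes the current newAPIHadoopFile signature and a Hadoop 2.x build; the object name, local master, and input path are illustrative only.

    import org.apache.hadoop.conf.Configuration
    import org.apache.hadoop.fs.Path
    import org.apache.hadoop.io.{LongWritable, Text}
    import org.apache.hadoop.mapreduce.Job
    import org.apache.hadoop.mapreduce.lib.input.{FileInputFormat, TextInputFormat}
    import org.apache.spark.SparkContext

    object NewApiRoundTrip {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext("local", "new-api-round-trip")

        // New Hadoop API: the configuration is carried by a mapreduce.Job ...
        val job = Job.getInstance()
        FileInputFormat.addInputPath(job, new Path("hdfs:///data/input"))

        // ... but Spark's new-API method takes a Configuration, so the Job is
        // unwrapped here, only for NewHadoopRDD to wrap the Configuration into
        // a JobContext again before calling inputFormat.getSplits(jobContext).
        val updatedConf: Configuration = job.getConfiguration
        val rdd = sc.newAPIHadoopFile(
          "hdfs:///data/input",
          classOf[TextInputFormat],
          classOf[LongWritable],
          classOf[Text],
          updatedConf)

        println(rdd.count())
        sc.stop()
      }
    }

If the method took a mapreduce.Job directly, the unwrap/re-wrap step would disappear, which is exactly the API change being discussed, at the cost of the compatibility issues mentioned above.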