I think the newer Hadoop API does not expose a suggested minimum-partitions parameter the way the old one did. I believe you can instead set mapreduce.input.fileinputformat.split.{min,max}size on the Hadoop Configuration to suggest a min/max split size, and thereby bound the number of partitions you get back.
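For example, something along these lines (an untested sketch; the path and split sizes are placeholders, not values from your job):

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

val hadoopConf = new Configuration(sc.hadoopConfiguration)
// Suggest bounds on the split size in bytes; a smaller max split size
// generally yields more (smaller) partitions.
hadoopConf.set("mapreduce.input.fileinputformat.split.minsize", (16 * 1024 * 1024).toString)
hadoopConf.set("mapreduce.input.fileinputformat.split.maxsize", (64 * 1024 * 1024).toString)

// Read with the new Hadoop API, passing the tweaked Configuration.
val rdd = sc.newAPIHadoopFile(
  "hdfs:///path/to/input",   // placeholder path
  classOf[TextInputFormat],
  classOf[LongWritable],
  classOf[Text],
  hadoopConf)

As far as I know these settings are only hints to the input format (the HDFS block size also plays a role), so I would not rely on getting an exact partition count back.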
On Thu, Feb 19, 2015 at 11:07 AM, twinkle sachdeva <twinkle.sachd...@gmail.com> wrote:
> Hi,
>
> In our job, we need to process the data in small chunks to avoid GC pressure
> and other issues. For this, we are using the old Hadoop API, as it lets us
> specify a parameter like minPartitions.
>
> Does anyone know if there is a way to do the same via the new Hadoop API?
> How would that way differ from the older API?
>
> I am a little bit aware of the split-size settings, but not much aware of
> whether there is any promise that the minimum-number-of-partitions criterion
> gets satisfied.
>
> Any pointers will be of help.
>
> Thanks,
> Twinkle