Regarding minimum number of partitions while reading data from Hadoop

2015-02-19 Thread twinkle sachdeva
Hi,

In our job, we need to process the data in small chunks, so as to avoid GC
pressure and related issues. For this, we are using the old Hadoop API, as
it lets us specify parameters like minPartitions.
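
For reference, here is roughly what we do today. A minimal sketch, assuming
an active SparkContext sc; the path and partition count are placeholders:

  import org.apache.hadoop.io.{LongWritable, Text}
  import org.apache.hadoop.mapred.TextInputFormat

  // Old (mapred) API: Spark accepts a suggested minimum partition count.
  val lines = sc.textFile("hdfs:///data/input", 100) // minPartitions = 100

  // Equivalent explicit form with the old-API input format:
  val records = sc.hadoopFile[LongWritable, Text, TextInputFormat](
    "hdfs:///data/input", 100) // minPartitions = 100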

Does anyone know if there is a way to do the same via the new Hadoop API as
well? How would that approach differ from the old API?

I am somewhat aware of the split-size settings, but I am not sure whether
they promise that the minimum-number-of-partitions criterion will actually
be satisfied.

Any pointers would be appreciated.

Thanks,
Twinkle


Re: Regarding minimum number of partitions while reading data from Hadoop

2015-02-19 Thread Sean Owen
I think the newer Hadoop API does not expose the suggested minimum
partitions parameter the way the old one did. I believe you can instead
set mapreduce.input.fileinputformat.split.{min,max}size on the Hadoop
Configuration to suggest a min/max split size, and thereby bound the
number of partitions you get back.
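
Something like the following should work (a rough sketch, untested; the
path and the 32 MB cap are placeholders you would tune for your data):

  import org.apache.hadoop.conf.Configuration
  import org.apache.hadoop.io.{LongWritable, Text}
  import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

  val conf = new Configuration(sc.hadoopConfiguration)
  // Cap the split size at 32 MB, so a file of size S produces at least
  // ceil(S / 32 MB) splits, and hence at least that many partitions.
  conf.set("mapreduce.input.fileinputformat.split.maxsize",
    (32L * 1024 * 1024).toString)

  // New (mapreduce) API: no minPartitions argument; the split-size
  // settings on the Configuration control partitioning instead.
  val records = sc.newAPIHadoopFile(
    "hdfs:///data/input",
    classOf[TextInputFormat], classOf[LongWritable], classOf[Text],
    conf)

Note the asymmetry: split.maxsize puts a lower bound on the number of
partitions, while split.minsize puts an upper bound on it.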

