Fewer regions, but that's often a good thing if you have a lot of data :)

It's probably a good idea to bump the HDFS block size to 128MB or 256MB,
since you know you're going to have huge-ish files.
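
To put rough numbers on that, here is a back-of-the-envelope sketch in
Python. The 1 TB of data per node is a made-up figure for illustration, and
the helper names are hypothetical; the max region size is what
hbase.hregion.max.filesize controls. It assumes every region is full, so
treat the output as a rough estimate only.

```python
import math

MB = 1024 ** 2
GB = 1024 ** 3


def regions_per_node(data_per_node_bytes, max_region_bytes):
    """Rough region count per node, assuming every region is full."""
    return math.ceil(data_per_node_bytes / max_region_bytes)


def blocks_per_file(file_bytes, block_bytes):
    """Number of HDFS blocks needed to hold one file of the given size."""
    return math.ceil(file_bytes / block_bytes)


# Hypothetical workload: 1 TB of HBase data on each node.
data_per_node = 1024 * GB

for max_region in (256 * MB, 5 * GB):
    count = regions_per_node(data_per_node, max_region)
    print(f"max region {max_region // MB} MB: ~{count} regions/node")

# A store file that has grown to the full 5 GB region size.
for block in (64 * MB, 128 * MB, 256 * MB):
    count = blocks_per_file(5 * GB, block)
    print(f"block size {block // MB} MB: {count} blocks per 5 GB file")
```

With these assumed numbers, 256MB regions give about 4096 regions per node
while 5GB regions give about 205, and a 5GB file drops from 80 blocks at
64MB to 20 blocks at 256MB.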

As for penalties, I can't think of an obvious one (unless you use a
very small heap). The IO usage patterns will change, but unless you
flush very small files all the time and need to recompact them into
much bigger ones, it shouldn't really be an issue.

J-D

On Fri, Feb 18, 2011 at 11:36 AM, Jason Rutherglen wrote:
>>  We are also using a 5GB region size to keep our region
>> counts in the 100-200 range/node per Jonathan Gray's recommendation.
>
> So there isn't a penalty incurred from increasing the max region size
> from 256MB to 5GB?
>
