That's what I usually recommend: the bigger the flushed files, the better. On the other hand, you only have so much memory to dedicate to the MemStore...
J-D

On Fri, Feb 18, 2011 at 11:50 AM, Chris Tarnas <[email protected]> wrote:
> Would it be a good idea to raise the hbase.hregion.memstore.flush.size if you
> have really large regions?
>
> -chris
>
> On Feb 18, 2011, at 11:43 AM, Jean-Daniel Cryans wrote:
>
>> Less regions, but it's often a good thing if you have a lot of data :)
>>
>> It's probably a good thing to bump the HDFS block size to 128 or 256MB
>> since you know you're going to have huge-ish files.
>>
>> But anyway regarding penalties, I can't think of one that clearly
>> comes out (unless you use a very small heap). The IO usage patterns
>> will change, but unless you flush very small files all the time and
>> need to recompact them into much bigger ones, then it shouldn't really
>> be an issue.
>>
>> J-D
>>
>> On Fri, Feb 18, 2011 at 11:36 AM, Jason Rutherglen
>> <[email protected]> wrote:
>>>> We are also using a 5Gb region size to keep our region
>>>> counts in the 100-200 range/node per Jonathan Grey's recommendation.
>>>
>>> So there isn't a penalty incurred from increasing the max region size
>>> from 256MB to 5GB?
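For reference, the tunables discussed in this thread correspond to config entries roughly like the following sketch. The values are illustrative only (they are not recommendations from the thread), and the property names are the HBase 0.90 / Hadoop 0.20-era ones:

```xml
<!-- hbase-site.xml -->
<property>
  <name>hbase.hregion.max.filesize</name>
  <!-- 5 GB max region size, as in Jason's setup above (default at the time: 256 MB) -->
  <value>5368709120</value>
</property>
<property>
  <name>hbase.hregion.memstore.flush.size</name>
  <!-- e.g. 128 MB; raising this produces larger flushed files but
       consumes more of the heap budgeted to MemStores -->
  <value>134217728</value>
</property>

<!-- hdfs-site.xml -->
<property>
  <name>dfs.block.size</name>
  <!-- 128 MB HDFS blocks, per the suggestion to bump the block size
       for huge-ish store files -->
  <value>134217728</value>
</property>
```

Note the trade-off J-D points out: the memstore flush size multiplied by the number of actively written regions per server has to fit within the heap fraction reserved for MemStores, so it can't be raised arbitrarily.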
