That'd be one more argument to put MR only on the big memory machines. J-D
On Thu, Oct 14, 2010 at 2:15 PM, Tim Robertson <[email protected]> wrote:
> Thanks again. One of the things we struggle with currently on the
> RDBMS is the organisation of 250 million records into complex
> taxonomies, and also point-in-polygon intersections. Having such
> memory available to the MR jobs allows us to consider loading
> taxonomies / polygons / RTree indexes into memory to do those
> calculations in parallel with MR. I was playing with that a couple of
> years ago when I first ventured into Hadoop
> (http://biodivertido.blogspot.com/2008/11/reproducing-spatial-joins-using-hadoop.html)
> but might get back into it...
>
> Tim
>
> On Thu, Oct 14, 2010 at 8:07 PM, Jean-Daniel Cryans <[email protected]> wrote:
>>> I had it in my mind that HBase liked big memory, hence assuming the
>>> region servers should stay on the 24G machines with plenty of memory
>>> at their disposal. We'll come up with a test platform and then try
>>> some benchmarking and do a blog on it all and share.
>>>
>>
>> They do, but because of JVM limitations the recommended setting is
>> around 4-8GB. Giving more would cause bigger heap fragmentation
>> issues, leading to full GC pauses, which could cause session timeouts.
>>
>> J-D
>>
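The map-side spatial join Tim describes can be sketched as follows: each mapper loads the full polygon set into memory once, then tests every incoming point record against it, so the intersection work parallelizes across mappers with no shuffle. This is a minimal illustration only; the Hadoop Mapper plumbing is omitted, and the class, method, and polygon names are assumptions, not from the thread.

```java
import java.util.HashMap;
import java.util.Map;

public class PointInPolygonSketch {

    // Polygons keyed by id, each a closed ring of (x, y) vertices.
    // In a real MR job this would be populated once per mapper (e.g. in
    // Mapper.setup()) from a file shipped to every node.
    static final Map<String, double[][]> POLYGONS = new HashMap<>();

    // Classic ray-casting test: a point is inside a polygon if a
    // horizontal ray from it crosses the boundary an odd number of times.
    static boolean contains(double[][] poly, double x, double y) {
        boolean inside = false;
        for (int i = 0, j = poly.length - 1; i < poly.length; j = i++) {
            double xi = poly[i][0], yi = poly[i][1];
            double xj = poly[j][0], yj = poly[j][1];
            if ((yi > y) != (yj > y)
                    && x < (xj - xi) * (y - yi) / (yj - yi) + xi) {
                inside = !inside;
            }
        }
        return inside;
    }

    // Stand-in for the map() call: return the id of the first polygon
    // containing the point, or null if none does.
    static String lookup(double x, double y) {
        for (Map.Entry<String, double[][]> e : POLYGONS.entrySet()) {
            if (contains(e.getValue(), x, y)) {
                return e.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        POLYGONS.put("test-square",
                new double[][] {{0, 0}, {4, 0}, {4, 4}, {0, 4}});
        System.out.println(lookup(2, 2));  // inside the square
        System.out.println(lookup(5, 5));  // outside every polygon
    }
}
```

In practice the linear scan over polygons would be replaced by the in-memory RTree Tim mentions, so each point probes only candidate polygons rather than the whole set.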
