If a region server is more fragmented, there could potentially be a lot more incomplete flushes when the global memstore is always near-full, which means a larger number of small compactions. Is this right?
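To put rough numbers on the question above, here is a minimal back-of-envelope sketch. It assumes uniform random writes, so that every region on a server shares the global memstore budget evenly. The heap size, the 40% global memstore fraction, the 64 MB flush threshold, and the 1 TB of data per server are illustrative assumptions rather than values taken from this thread, and the function names are hypothetical.

```python
# Back-of-envelope model; all numbers below are illustrative assumptions,
# not measurements from any real cluster.

GB = 1024 ** 3
MB = 1024 ** 2


def regions_per_server(data_per_server_bytes, region_size_bytes):
    """Number of regions one region server hosts, rounded up."""
    return -(-data_per_server_bytes // region_size_bytes)


def avg_flush_size(global_memstore_bytes, active_regions, flush_size_bytes):
    """Approximate memstore size at flush time under uniform random writes.

    If every region receives writes at the same rate, the global budget is
    shared evenly, so no memstore grows past global / active_regions before
    the global limit forces a flush.
    """
    fair_share = global_memstore_bytes // active_regions
    return min(flush_size_bytes, fair_share)


if __name__ == "__main__":
    data_per_server = 1024 * GB          # assumed: 1 TB per region server
    global_memstore = int(0.4 * 8 * GB)  # assumed: 40% of an 8 GB heap
    flush_size = 64 * MB                 # assumed per-region flush threshold

    for region_size in (512 * MB, 8 * GB):
        n = regions_per_server(data_per_server, region_size)
        flush = avg_flush_size(global_memstore, n, flush_size)
        print(f"{region_size // MB} MB regions: {n} regions, "
              f"~{flush / MB:.1f} MB per flush")
```

Under these assumptions, 512 MB regions mean 2048 regions per server with flushes of roughly 1.6 MB each, while 8 GB regions mean 128 regions with flushes of roughly 25.6 MB. Both are below the assumed 64 MB threshold, but the smaller regions produce far smaller store files, which is the "more small compactions" effect asked about. Real write patterns are rarely perfectly uniform, so treat this as an upper bound on the effect.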
Is it better to have fat regions (I am thinking 8-10 gigs) for a large number (100's) of nodes?

V

On 6/9/10 4:15 PM, "Jean-Daniel Cryans" <[email protected]> wrote:

Mmm yeah wrote too fast.

J-D

> Howso, JD? I can't think of anything that would cause 1 8GB region to take
> more RAM than 16x512MB regions.
>
> -Todd
>
>> > We are trying to load 100's of terabytes eventually.. And running even
>> > 100s of regions per RS seems like a big hit on the memory.
>>
>> From what I saw in your metrics dump, storing the actual regions was
>> costing you almost nothing. But, you will encounter problems when the
>> global size of all memstores is getting very big (a true random write
>> pattern will always get you there).
>>
>> IMO, the biggest issue with the number of regions served per RS is
>> more about the actual data that is stored and retrieved WRT the
>> performance each of your node can deliver (capacity planning).
>>
>> J-D
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
