On Wed, Oct 23, 2013 at 3:10 PM, Christopher <[email protected]> wrote:
> The data in the write-ahead logs is needed until the tserver flushes
> the in memory maps to disk. Assuming you have a logger running on
> every tserver, and tservers write to at least two loggers, you should
> ensure that the size of the disk area is *at least* two times as big
> as your in-memory map size per tserver. I'd say 5x-10x the in-memory
> map size is probably safe. So, if your tservers are running with 2GB
> of memory, then a 10-20GB area is probably more than sufficient.
>

I would go with a large multiplier so you are more resilient to
temporary glitches. The Accumulo GC may not be running for some reason,
in which case walogs will not get collected. If a lot of loggers die but
the tservers do not, then more than two tservers will be using each
remaining logger. Also, if a tablet is written to slowly (relative to
other tablets), it may maintain references to older walog files, keeping
them around on disk. When situations like these occur, it's nice if the
logger partition does not fill up. Give it at least 100G or 200G.

> --
> Christopher L Tubbs II
> http://gravatar.com/ctubbsii
>
>
> On Wed, Oct 23, 2013 at 1:02 PM, Terry P. <[email protected]> wrote:
> > Greetings all,
> > For Accumulo 1.4 where write ahead logs are not yet stored in HDFS, does
> > anyone have guidance with respect to sizing the walog area? What exactly
> > triggers when write ahead logs get removed? What might cause them to hang
> > around for an extended period of time (as in under abnormal circumstances)?
> >
> > The system this applies to will see an ingest rate of approximately 2000
> > docs per second averaging 1-2K each (broken out into 12 columns each, so
> > 24,000 entries per second) across 6 tabletserver nodes.
> >
> > Thanks in advance,
> > Terry
>
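
For anyone who wants to play with the numbers, here is a rough back-of-envelope sketch of the sizing arithmetic discussed in this thread. It is not anything Accumulo provides; the function names are made up for illustration. It assumes ingest is spread evenly across tservers, that each tserver writes to two loggers, and it counts only raw document bytes (per-entry key overhead in the walog is ignored, so real usage will likely be higher).

```python
GIB = 1024 ** 3  # bytes per GiB


def walog_area_range(in_memory_map_bytes, low=5, high=10):
    """Return the (low, high) walog area size for the 5x-10x rule of thumb."""
    return in_memory_map_bytes * low, in_memory_map_bytes * high


def hours_until_full(area_bytes, docs_per_sec, doc_bytes, tservers,
                     loggers_per_write=2):
    """Estimate hours until a logger partition fills if no walogs are removed.

    Assumes evenly spread ingest and that every write goes to
    `loggers_per_write` loggers, so each logger sees that multiple of the
    per-tserver ingest rate. Hypothetical model, raw document bytes only.
    """
    per_logger_bytes_per_sec = docs_per_sec * doc_bytes * loggers_per_write / tservers
    return area_bytes / per_logger_bytes_per_sec / 3600


if __name__ == "__main__":
    low, high = walog_area_range(2 * GIB)
    print("5x-10x of a 2 GiB map: %d-%d GiB" % (low // GIB, high // GIB))

    # Terry's workload at the upper end: 2000 docs/s, 2 KiB each, 6 tservers.
    hours = hours_until_full(100 * GIB, 2000, 2048, 6)
    print("100 GiB partition, no collection: about %.1f hours" % hours)
```

Under those assumptions a 100G partition buys a bit under a day of ingest at the upper end of the stated rate if nothing is being collected, which is the kind of headroom that helps when the GC is down or loggers have died.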
