Jeremy,

The data will age off daily, so I plan to bulk load ~1TB every 4 hours.

Regards,
Mike Fagan

On 5/22/15, 12:09 PM, "Jeremy Kepner" <[email protected]> wrote:

>7TB -> 21TB (Hadoop replication), perhaps larger if you have index
>tables, ...
>
>1M fetches / day ~ 10M entries / day ~ 1000 entries/sec
>
>Typical Accumulo peak is 100K entries/sec/core so you should be fine on
>query
>
>How fast do you need to insert the data into Accumulo?
>
>On Fri, May 22, 2015 at 03:46:20PM +0000, Fagan, Michael wrote:
>> Josh,
>>
>> Thanks, I would like to use my performance requirements to derive my HW
>> requirements.
>>
>> For example: assume I have a raw 7TB dataset representing 500 million
>> records with the expectation of 500K-1000K key fetches a day.
>>
>> I remember there was a tuning webpage circulating several years back
>> to help figure out the HW sizing to meet performance benchmarks.
>>
>> Regards,
>> Mike Fagan
>>
>> On 5/22/15, 8:55 AM, "Josh Elser" <[email protected]> wrote:
>>
>> >Hi Mike,
>> >
>> >We have some info in
>> >http://accumulo.apache.org/1.7/accumulo_user_manual.html#_hardware
>> >
>> >What's missing there? Let us know the types of questions you have and
>> >we can expand on the document.
>> >
>> >- Josh
>> >
>> >Fagan, Michael wrote:
>> >> Hi,
>> >>
>> >> Can someone point me to recommendations regarding cluster sizing?
>> >>
>> >> Regards,
>> >> Mike Fagan
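
For reference, the back-of-envelope numbers in this thread can be reproduced with a short Python sketch. It is only a rough estimate, not an official Accumulo sizing tool. The function name `sizing_estimate` and its parameters are hypothetical. The 3x replication factor and the 10 entries per fetch are inferred from the "7TB -> 21TB" and "1M fetches / day ~ 10M entries / day" figures above, and decimal terabytes (10^12 bytes) are assumed.

```python
TB = 10**12  # assumption: decimal terabytes


def sizing_estimate(raw_tb=7, records=500_000_000, replication=3,
                    fetches_per_day=1_000_000, entries_per_fetch=10,
                    bulk_load_tb=1, bulk_interval_hours=4):
    """Rough storage, query, and ingest figures from the thread's inputs."""
    seconds_per_day = 24 * 60 * 60
    return {
        # Raw data multiplied by the HDFS replication factor (index tables excluded).
        "replicated_tb": raw_tb * replication,
        # Average size of one record in bytes.
        "avg_record_bytes": raw_tb * TB / records,
        # Query load averaged evenly over a whole day (peaks will be higher).
        "avg_query_entries_per_sec":
            fetches_per_day * entries_per_fetch / seconds_per_day,
        # Sustained rate needed to finish each bulk load before the next one starts.
        "bulk_load_mb_per_sec":
            bulk_load_tb * TB / (bulk_interval_hours * 3600) / 10**6,
        # Total volume bulk loaded per day.
        "bulk_load_tb_per_day": bulk_load_tb * 24 / bulk_interval_hours,
    }


if __name__ == "__main__":
    for name, value in sizing_estimate().items():
        print(f"{name}: {value:,.1f}")
```

With the defaults, the daily-average query rate comes out near 116 entries/sec, so the ~1000 entries/sec figure quoted above can be read as leaving roughly an order of magnitude of headroom for peaks; that reading is an assumption. The bulk load of ~1TB every 4 hours works out to about 69 MB/s sustained, or 6TB per day, before replication.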
