Our nodes usually have 20+ cores and 100+ GB of RAM.
On Tue, Aug 28, 2018 at 10:18:24PM +0300, guy sharon wrote:
> hi Jeremy,
>
> Do you have any information on how you configure them and what kind of
> hardware they run on?
>
> Thanks,
> Guy.
>
> On Tue, Aug 28, 2018 at 3:44 PM Jeremy Kepner <[email protected]> wrote:
>
> > FYI, Single node Accumulo instances is our most popular deployment.
> > We have hundreds of them. Accumulo is so fast that it can replace
> > what would normally require 20 MySQL servers.
> >
> > Regards. -Jeremy
> >
> > On Tue, Aug 28, 2018 at 07:38:37AM +0000, Sean Busbey wrote:
> > > Hi Guy,
> > >
> > > Apache Accumulo is designed for horizontally scaling out for large scale
> > > workloads that need to do random reads and writes. There's a non-trivial
> > > amount of overhead that comes with a system aimed at doing that on
> > > thousands of nodes.
> > >
> > > If your use case works for a single laptop with such a small number of
> > > entries and exhaustive scans, then Accumulo is probably not the correct
> > > tool for the job.
> > >
> > > For example, on my laptop (i7 2 cores, 8GiB memory) with that dataset
> > > size you can just rely on a file format like Apache Avro:
> > >
> > > busbey$ time java -jar avro-tools-1.7.7.jar random --codec snappy --count 6300000 --schema '{ "type": "record", "name": "entry", "fields": [ { "name": "field0", "type": "string" } ] }' ~/Downloads/6.3m_entries.avro
> > > Aug 28, 2018 12:31:13 AM org.apache.hadoop.util.NativeCodeLoader <clinit>
> > > WARNING: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> > > test.seed=1535441473243
> > >
> > > real 0m5.451s
> > > user 0m5.922s
> > > sys 0m0.656s
> > > busbey$ ls -lah ~/Downloads/6.3m_entries.avro
> > > -rwxrwxrwx 1 busbey staff 186M Aug 28 00:31 /Users/busbey/Downloads/6.3m_entries.avro
> > > busbey$ time java -jar avro-tools-1.7.7.jar tojson ~/Downloads/6.3m_entries.avro | wc -l
> > > 6300000
> > >
> > > real 0m4.239s
> > > user 0m6.026s
> > > sys 0m0.721s
> > >
> > > I'd recommend that you start at >= 5 nodes if you want to look at rough
> > > per-node throughput capabilities.
> > >
> > > On 2018/08/28 06:59:38, guy sharon <[email protected]> wrote:
> > > > hi Mike,
> > > >
> > > > Thanks for the links.
> > > >
> > > > My current setup is a 4 node cluster (tserver, master, gc, monitor) running
> > > > on Alpine Docker containers on a laptop with an i7 processor (8 cores) with
> > > > 16GB of RAM. As an example I'm running a count of all entries for a table
> > > > with 6.3M entries with "accumulo shell -u root -p secret -e "scan -t
> > > > benchmark_table -np" | wc -l" and it takes 43 seconds. Not sure if this is
> > > > reasonable or not. Seems a little slow to me. What do you think?
> > > >
> > > > BR,
> > > > Guy.
> > > >
> > > > On Mon, Aug 27, 2018 at 4:43 PM Michael Wall <[email protected]> wrote:
> > > >
> > > > > Hi Guy,
> > > > >
> > > > > Here are a couple links I found. Can you tell us more about your setup
> > > > > and what you are seeing?
> > > > >
> > > > > https://accumulo.apache.org/papers/accumulo-benchmarking-2.1.pdf
> > > > > https://www.youtube.com/watch?v=Ae9THpmpFpM
> > > > >
> > > > > Mike
> > > > >
> > > > > On Sat, Aug 25, 2018 at 5:09 PM guy sharon <[email protected]> wrote:
> > > > >
> > > > >> hi,
> > > > >>
> > > > >> I've just started working with Accumulo and I think I'm experiencing slow
> > > > >> reads/writes. I'm aware of the recommended configuration. Does anyone know
> > > > >> of any standard benchmarks and benchmarking tools I can use to tell if the
> > > > >> performance I'm getting is reasonable?
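
For a rough sense of scale, the two timings quoted in this thread can be turned into entries per second. The sketch below only does that arithmetic. The entry count and wall-clock times are copied from the messages above, the function and variable names are made up for illustration, and the two runs used different laptops and different data, so the ratio is an order-of-magnitude indication at best.

```python
# Figures copied from the thread; nothing here was measured independently.
ENTRIES = 6_300_000


def entries_per_second(entries: int, seconds: float) -> float:
    """Return average throughput in entries per second."""
    return entries / seconds


# Guy's "accumulo shell ... scan ... | wc -l" run: 43 seconds.
accumulo_shell_scan = entries_per_second(ENTRIES, 43.0)

# Sean's "avro-tools tojson ... | wc -l" run: "real 0m4.239s".
avro_tojson_read = entries_per_second(ENTRIES, 4.239)

print(f"Accumulo shell scan: {accumulo_shell_scan:,.0f} entries/s")
print(f"Avro tojson read:    {avro_tojson_read:,.0f} entries/s")
print(f"Ratio:               {avro_tojson_read / accumulo_shell_scan:.1f}x")
```

Both commands pipe their output through wc -l, so each number includes the cost of printing every entry as text, not just the cost of reading it.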
