Hopefully you are using Accumulo 1.4.*3*. A performance issue (ACCUMULO-1062) was found in 1.4.2 when a large number of clients attempted to update a tablet concurrently.
-Eric

On Thu, Apr 4, 2013 at 3:26 PM, Jimmy Lin <[email protected]> wrote:

> On Thu, Apr 4, 2013 at 2:25 PM, Eric Newton <[email protected]> wrote:
>
>> Have you pre-split your tablet to spread the load out to all the
>> machines?
>> Yes. We are using splits from loading the whole dataset previously.
>> Does the data distribution match your splits?
>> Yes. See above.
>> Is the ingest data already sorted (that is, it always writes to the last
>> tablet)?
>> No. The data writes to multiple tablets concurrently. We set up a queue
>> parameter and divide the data into multiple queues.
>> How much memory and how many threads are you using in your batchwriters?
>> I believe we have 16GB of memory for the Java writer with 18 threads
>> running per server.
>>
>> Check the ingest rates on tablet server monitor page and look for hot
>> spots.
>> There are certain servers that have higher ingest rates, and the server
>> that is busiest changes over time, but the overall ingestion rate will not
>> go up.
>>
>> On Thu, Apr 4, 2013 at 2:01 PM, Jimmy Lin <[email protected]> wrote:
>>
>>> Hello,
>>> I am fairly new to Accumulo and am trying to figure out what is
>>> preventing my system from ingesting data at a faster rate. We have 15 nodes
>>> running a simple Java program that reads and writes to Accumulo and then
>>> indexes some data into Solr. The rate of ingest is not scaling linearly
>>> with the number of nodes that we start up. I have tried increasing several
>>> parameters including:
>>> - limit of file descriptors in linux
>>> - max zookeeper connections
>>> - tserver.memory.maps.max
>>> - tserver_opts memory size
>>> - tserver.mutation_queue.max
>>> - tserver.scan.files.open.max
>>> - tserver.walog.max.size
>>> - tserver.cache.data.size
>>> - tserver.cache.index.size
>>> - hdfs setting for xceivers
>>> No matter what changes we make, we cannot get the ingest rate to go over
>>> 100k entries/s and about 6 Mb/s. I know Accumulo should be able to ingest
>>> faster than this.
>>> Thanks in advance,
>>>
>>> Jimmy Lin
