How many clients are you using to write?

The BatchWriter parameters might have an effect too. Typically people use 
values like the following:

        BatchWriter writer = connector.createBatchWriter(tableName, 1000000, 1000, 10);

Those numbers are:

        1000000 : max bytes per batch
        1000 : max latency in milliseconds
        10 : threads to use

What's the max ingest rate of a single server?


On Apr 4, 2013, at 3:26 PM, Jimmy Lin wrote:

> 
> On Thu, Apr 4, 2013 at 2:25 PM, Eric Newton wrote:
> Have you pre-split your tablet to spread the load out to all the machines? 
> Yes.  We are using splits from loading the whole dataset previously.
> Does the data distribution match your splits?
> Yes.  See above.
> Is the ingest data already sorted (that is, it always writes to the last 
> tablet)?
> No.  The data writes to multiple tablets concurrently.  We set up a queue 
> parameter and divide the data into multiple queues.
> How much memory and how many threads are you using in your batchwriters?
> I believe we have 16GB of memory for the Java writer with 18 threads running 
> per server.
> 
> Check the ingest rates on tablet server monitor page and look for hot spots.
> There are certain servers that have higher ingest rates, and the server that 
> is busiest changes over time, but the overall ingestion rate will not go up.
> 
> On Thu, Apr 4, 2013 at 2:01 PM, Jimmy Lin wrote:
> Hello,
> I am fairly new to Accumulo and am trying to figure out what is preventing my 
> system from ingesting data at a faster rate. We have 15 nodes running a 
> simple Java program that reads and writes to Accumulo and then indexes some 
> data into Solr. The rate of ingest is not scaling linearly with the number of 
> nodes that we start up. I have tried increasing several parameters including:
> - limit of file descriptors in Linux
> - max ZooKeeper connections
> - tserver.memory.maps.max
> - tserver_opts memory size
> - tserver.mutation_queue.max
> - tserver.scan.files.open.max
> - tserver.walog.max.size
> - tserver.cache.data.size
> - tserver.cache.index.size
> - HDFS setting for xceivers
> No matter what changes we make, we cannot get the ingest rate to go over 100k 
> entries/s and about 6 Mb/s. I know Accumulo should be able to ingest faster 
> than this.
> Thanks in advance,
>  
> Jimmy Lin
>  