How many clients are you using to write?
The BatchWriter parameters can also affect the ingest rate. People typically
use values like the following:
BatchWriter writer = connector.createBatchWriter(tableName, 1000000,
1000, 10);
Those numbers are:
1000000 : max memory, in bytes, used to buffer mutations before they are sent
1000 : max latency in milliseconds before buffered mutations are flushed
10 : max write threads to use
What's the max ingest rate of a single server?
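To put the numbers in this thread side by side, here is a rough back-of-the-envelope sketch in plain Java (no Accumulo dependency). It assumes the "6 Mb/s" below means about 6,000,000 bytes per second, that load is spread evenly over the 15 nodes, and that each node runs a single BatchWriter; none of that is confirmed in the thread, and the class and method names are made up for illustration.

```java
// Rough estimate only; class and method names are hypothetical.
class IngestEstimate {
    /** Average entry size implied by aggregate byte and entry rates. */
    public static double bytesPerEntry(double bytesPerSec, double entriesPerSec) {
        return bytesPerSec / entriesPerSec;
    }

    /** Share of an aggregate rate handled by each client, assuming an even spread. */
    public static double perClient(double aggregateRate, int clients) {
        return aggregateRate / clients;
    }

    /** Milliseconds one client needs to accumulate bufferBytes at the given rate. */
    public static double millisToFill(long bufferBytes, double clientBytesPerSec) {
        return 1000.0 * bufferBytes / clientBytesPerSec;
    }

    public static void main(String[] args) {
        double entriesPerSec = 100_000;  // reported ceiling from the thread
        double bytesPerSec = 6_000_000;  // ASSUMPTION: "6 Mb/s" read as megabytes
        int nodes = 15;                  // ingest nodes from the thread
        long maxMemory = 1_000_000;      // example BatchWriter buffer size
        long maxLatencyMs = 1_000;       // example BatchWriter max latency

        double clientBytes = perClient(bytesPerSec, nodes);
        System.out.printf("avg bytes per entry: %.1f%n",
                bytesPerEntry(bytesPerSec, entriesPerSec));
        System.out.printf("entries/s per node:  %.1f%n",
                perClient(entriesPerSec, nodes));
        System.out.printf("ms to buffer %d bytes per node: %.1f (max latency %d ms)%n",
                maxMemory, millisToFill(maxMemory, clientBytes), maxLatencyMs);
    }
}
```

If the time to accumulate the buffer comes out longer than the max latency, the latency setting rather than the buffer size is probably what bounds each batch, so measuring what a single server can sustain on its own would be the more telling number.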
On Apr 4, 2013, at 3:26 PM, Jimmy Lin <[email protected]> wrote:
>
>
> On Thu, Apr 4, 2013 at 2:25 PM, Eric Newton <[email protected]> wrote:
> > Have you pre-split your tablet to spread the load out to all the machines?
> Yes. We are using splits from loading the whole dataset previously.
> > Does the data distribution match your splits?
> Yes. See above.
> > Is the ingest data already sorted (that is, it always writes to the last
> > tablet)?
> No. The data writes to multiple tablets concurrently. We set up a queue
> parameter and divide the data into multiple queues.
> > How much memory and how many threads are you using in your batchwriters?
> I believe we have 16GB of memory for the Java writer with 18 threads running
> per server.
>
> > Check the ingest rates on tablet server monitor page and look for hot spots.
> There are certain servers that have higher ingest rates, and the server that
> is busiest changes over time, but the overall ingestion rate will not go up.
>
> On Thu, Apr 4, 2013 at 2:01 PM, Jimmy Lin <[email protected]> wrote:
> Hello,
> I am fairly new to Accumulo and am trying to figure out what is preventing my
> system from ingesting data at a faster rate. We have 15 nodes running a
> simple Java program that reads and writes to Accumulo and then indexes some
> data into Solr. The rate of ingest is not scaling linearly with the number of
> nodes that we start up. I have tried increasing several parameters including:
> - limit of file descriptors in Linux
> - max ZooKeeper connections
> - tserver.memory.maps.max
> - tserver_opts memory size
> - tserver.mutation_queue.max
> - tserver.scan.files.open.max
> - tserver.walog.max.size
> - tserver.cache.data.size
> - tserver.cache.index.size
> - HDFS setting for xceivers
> No matter what changes we make, we cannot get the ingest rate to go over 100k
> entries/s and about 6 Mb/s. I know Accumulo should be able to ingest faster
> than this.
> Thanks in advance,
>
> Jimmy Lin