Re: BatchWriter performance on 1.4

Keith Turner Fri, 20 Sep 2013 11:44:37 -0700

On Fri, Sep 20, 2013 at 12:47 PM, Slater, David M.
<[email protected]> wrote:


> I was using flush() after sending a bunch of mutations to the batchwriters
> to limit their latency. I thought it would normally flush the buffer to
> ensure that the maxLatency is not violated. If the maxLatency is quite
> large, how do I ensure that it doesn’t wait a long time before writing?
>

If you are constantly writing to a batch writer, then it will be continually
flushing.  The example debug output I posted was from running
org.apache.accumulo.test.TestIngest (it may be in another package before
1.6).  I ran the following command to write a million random mutations.

accumulo org.apache.accumulo.test.TestIngest --debug -u root -p secret
--timestamp 1 --size 50 --random 56 --rows 1000000 --start 0 --cols 1

I think it defaults to 50M of memory for the batch writer.  It was
continually sending batches of 80K mutations every 0.45 seconds.  So in
that case the latency of a mutation is probably less than two seconds.  But
this is just one tablet server; the behavior would be different with multiple
tablet servers.

In this example if I set the max latency on the batch writer to 30 secs,
then it would never kick in and force a flush.
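
To make that reasoning concrete, here is a rough sketch of a toy model. It is not how Accumulo implements the batch writer; it only assumes a buffer that flushes when it holds a full batch or when the oldest buffered mutation has waited the max latency, whichever comes first. The class name FlushModel, its parameters, and the fixed batch size are made up for illustration. The 80K batch size and the rate of roughly 178K mutations/sec (80K / 0.45 s) come from the run described above.

```java
/**
 * Toy model only (an assumption for illustration, NOT Accumulo's code):
 * a buffer flushes when it holds batchSize mutations or when the oldest
 * buffered mutation has waited maxLatencySecs, whichever comes first.
 */
final class FlushModel {
    final int sizeFlushes;     // flushes caused by the buffer filling up
    final int latencyFlushes;  // flushes forced by the latency timer

    private FlushModel(int sizeFlushes, int latencyFlushes) {
        this.sizeFlushes = sizeFlushes;
        this.latencyFlushes = latencyFlushes;
    }

    /** Simulates a writer fed at a constant rate for durationSecs. */
    static FlushModel simulate(double mutationsPerSec, int batchSize,
                               double maxLatencySecs, double durationSecs) {
        if (mutationsPerSec <= 0 || batchSize <= 0 || maxLatencySecs <= 0) {
            throw new IllegalArgumentException("rate, batch size and latency must be positive");
        }
        double secsToFill = batchSize / mutationsPerSec;
        int bySize = 0;
        int byLatency = 0;
        double t = 0;
        while (t < durationSecs) {
            if (secsToFill <= maxLatencySecs) {
                bySize++;              // buffer filled before the timer expired
                t += secsToFill;
            } else {
                byLatency++;           // timer expired on a partly filled buffer
                t += maxLatencySecs;
            }
        }
        return new FlushModel(bySize, byLatency);
    }

    public static void main(String[] args) {
        // Busy writer: 80K mutations every 0.45 s, max latency 30 s.
        FlushModel busy = simulate(80_000 / 0.45, 80_000, 30.0, 60.0);
        System.out.println("busy: size=" + busy.sizeFlushes + " latency=" + busy.latencyFlushes);
        // Slow writer (hypothetical 100 mutations/sec), same settings.
        FlushModel slow = simulate(100.0, 80_000, 30.0, 60.0);
        System.out.println("slow: size=" + slow.sizeFlushes + " latency=" + slow.latencyFlushes);
    }
}
```

In this model the busy writer never hits the latency timer, while the slow writer is flushed only by it. That suggests the max latency setting mostly matters when mutations arrive slowly.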

> If the returned batch writers are all thread safe, then I’m still going to
> have the bottleneck of their synchronized addMutations method, correct?
>

In my experience, that's not a bottleneck, but you will need to confirm this
for your situation (hopefully the debug output can help you w/ this).  If
the M threads adding mutations to a queue are going at a faster rate than
the N threads taking mutations and sending them, then the synchronization
around the queue is not the bottleneck.  M threads probably could add to a
synchronized queue at a rate of millions of mutations per second.  N
threads can probably only serialize and send tens or hundreds of thousands
of mutations per second.
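
The M-producers / N-senders arrangement described above can be sketched roughly as follows. This is a toy model under stated assumptions, not Accumulo's implementation: the class SharedBufferSketch, its methods, and the String placeholders standing in for mutations are all hypothetical, and the "send" step is just a counter.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/**
 * Toy model only (NOT Accumulo's code): M producer threads add placeholder
 * "mutations" to one synchronized buffer while N sender threads drain it.
 */
final class SharedBufferSketch {
    private final ArrayDeque<String> buffer = new ArrayDeque<>();
    private boolean closed = false;

    /** Cheap synchronized add; the lock is held only for the enqueue. */
    synchronized void add(String mutation) {
        buffer.add(mutation);
        notifyAll();
    }

    /** Signals that no more mutations will be added. */
    synchronized void close() {
        closed = true;
        notifyAll();
    }

    /** Waits for work; an empty list means the buffer is closed and drained. */
    synchronized List<String> takeBatch() throws InterruptedException {
        while (buffer.isEmpty() && !closed) {
            wait();
        }
        List<String> batch = new ArrayList<>(buffer);
        buffer.clear();
        return batch;
    }

    /** Runs the model and returns how many mutations the senders consumed. */
    static long run(int producers, int senders, int perProducer) {
        SharedBufferSketch shared = new SharedBufferSketch();
        AtomicLong sent = new AtomicLong();
        List<Thread> producerThreads = new ArrayList<>();
        List<Thread> senderThreads = new ArrayList<>();
        for (int s = 0; s < senders; s++) {
            Thread t = new Thread(() -> {
                try {
                    List<String> batch;
                    while (!(batch = shared.takeBatch()).isEmpty()) {
                        sent.addAndGet(batch.size()); // stand-in for serialize + send
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            senderThreads.add(t);
            t.start();
        }
        for (int p = 0; p < producers; p++) {
            final int id = p;
            Thread t = new Thread(() -> {
                for (int i = 0; i < perProducer; i++) {
                    shared.add("row-" + id + "-" + i);
                }
            });
            producerThreads.add(t);
            t.start();
        }
        try {
            for (Thread t : producerThreads) t.join();
            shared.close();
            for (Thread t : senderThreads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return sent.get();
    }

    public static void main(String[] args) {
        System.out.println("sent " + run(4, 2, 100_000) + " mutations");
    }
}
```

In a real client the sender side does far more work per mutation (binning, serialization, the rpc itself) than the enqueue does, which is the reason the synchronized add is unlikely to be the slow step. The TRACE output is still the way to confirm that for a given workload.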


> I’m looking for “org.apache.accumulo.client.impl” in
> log4j.properties, generic_logger.xml, and the other config files, but can’t
> locate it. Do I need to create a new entry for it there?
>

You can add something to a log4j.props file that's on the class path, or you
can try adding something like the following to your code.  I had the
package wrong; it's correct below.

Logger.getLogger("org.apache.accumulo.core.client.impl").setLevel(Level.TRACE);
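
For the properties-file route, the equivalent entry would look roughly like the line below. This is a sketch that assumes log4j 1.x properties syntax and a properties file that the client JVM actually loads from its class path; which file that is depends on your setup.

```properties
# Assumed log4j 1.x syntax: raise this one package to TRACE.
log4j.logger.org.apache.accumulo.core.client.impl=TRACE
```

If the appender that receives these messages has its own threshold set above TRACE, that threshold would probably need lowering too before the output shows up.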

> Thanks,
> David
>
> *From:* Keith Turner [mailto:[email protected]]
> *Sent:* Thursday, September 19, 2013 7:01 PM
> *To:* [email protected]
> *Subject:* Re: BatchWriter performance on 1.4
>
> On Thu, Sep 19, 2013 at 5:08 PM, Slater, David M. <[email protected]>
> wrote:
>
> Thanks Keith, I’m looking at it now. It appears like what I would want. As
> for the proper usage…
>
> Would I create one using the Connector,
> then .getBatchWriter() for each of the tables I’m interested in,
> add data to each of the BatchWriters returned,
>
> yes.
>
> and then hit flush() when I want all of that to get written?
>
> Why are you calling flush()?  Doing this frequently will increase rpc
> overhead and lower throughput.
>
> Would the individual batch writers spawned by the multiTableBatchWriter
> still have synchronized addMutations() methods so I would have to worry
> about blocking still, or would that all happen at the flush() method?
>
> The returned batch writers are thread safe. They all add to the same
> queue/buffer in a synchronized manner.  Calling flush() on any of the
> batch writers returned from getBatchWriter() will block the others.
>
> If you set the log4j log level to TRACE for
> org.apache.accumulo.client.impl you can see output like the following.
> Binning is the process of taking each mutation and deciding which tablet
> and tablet server it goes to.
>
>   2013-09-19 18:43:37,261 [impl.ThriftTransportPool] TRACE: Using existing connection to 127.0.0.1:9997
>   2013-09-19 18:43:37,393 [impl.TabletLocatorImpl] TRACE: tid=12 oid=13 Binning 80909 mutations for table 3
>   2013-09-19 18:43:37,402 [impl.TabletLocatorImpl] TRACE: tid=12 oid=13 Binned 80909 mutations for table 3 to 1 tservers in 0.009 secs
>   2013-09-19 18:43:37,402 [impl.TabletServerBatchWriter] TRACE: Started sending 80,909 mutations to 1 tablet servers
>   2013-09-19 18:43:37,656 [impl.ThriftTransportPool] TRACE: Returned connection 127.0.0.1:9997 (120000) ioCount : 1459116
>   2013-09-19 18:43:37,657 [impl.TabletServerBatchWriter] TRACE: sent 80,909 mutations to 127.0.0.1:9997 in 0.40 secs (204,832.91 mutations/sec) with 0 failures
>
> When you close the batch writer, it will log some summary stats like the
> following.
>
>   2013-09-19 18:43:39,149 [impl.TabletServerBatchWriter] TRACE:
>   2013-09-19 18:43:39,149 [impl.TabletServerBatchWriter] TRACE: TABLET SERVER BATCH WRITER STATISTICS
>   2013-09-19 18:43:39,149 [impl.TabletServerBatchWriter] TRACE: Added                :  1,000,000 mutations
>   2013-09-19 18:43:39,149 [impl.TabletServerBatchWriter] TRACE: Sent                 :  1,000,000 mutations
>   2013-09-19 18:43:39,149 [impl.TabletServerBatchWriter] TRACE: Resent percentage    :       0.00%
>   2013-09-19 18:43:39,150 [impl.TabletServerBatchWriter] TRACE: Overall time         :       5.94 secs
>   2013-09-19 18:43:39,150 [impl.TabletServerBatchWriter] TRACE: Overall send rate    : 168,406.87 mutations/sec
>   2013-09-19 18:43:39,150 [impl.TabletServerBatchWriter] TRACE: Send efficiency      :      86.91%
>   2013-09-19 18:43:39,150 [impl.TabletServerBatchWriter] TRACE:
>   2013-09-19 18:43:39,150 [impl.TabletServerBatchWriter] TRACE: BACKGROUND WRITER PROCESS STATISTICS
>   2013-09-19 18:43:39,150 [impl.TabletServerBatchWriter] TRACE: Total send time      :       5.16 secs  86.91%
>   2013-09-19 18:43:39,150 [impl.TabletServerBatchWriter] TRACE: Average send rate    : 193,760.90 mutations/sec
>   2013-09-19 18:43:39,151 [impl.TabletServerBatchWriter] TRACE: Total bin time       :       0.46 secs   7.81%
>   2013-09-19 18:43:39,151 [impl.TabletServerBatchWriter] TRACE: Average bin rate     : 2,155,172.41 mutations/sec
>   2013-09-19 18:43:39,151 [impl.TabletServerBatchWriter] TRACE: tservers per batch   :     1.00 avg       1 min      1 max
>   2013-09-19 18:43:39,151 [impl.TabletServerBatchWriter] TRACE: tablets per batch    :     1.00 avg       1 min      1 max
>   2013-09-19 18:43:39,151 [impl.TabletServerBatchWriter] TRACE:
>   2013-09-19 18:43:39,151 [impl.TabletServerBatchWriter] TRACE: SYSTEM STATISTICS
>   2013-09-19 18:43:39,151 [impl.TabletServerBatchWriter] TRACE: JVM GC Time          :       0.53 secs
>   2013-09-19 18:43:39,152 [impl.TabletServerBatchWriter] TRACE: JVM Compile Time     :       1.60 secs
>   2013-09-19 18:43:39,152 [impl.TabletServerBatchWriter] TRACE: System load average  : initial=  0.22 final=  0.20
>
> What do these numbers look like for you?
>
> Keith
>
> *From:* Keith Turner [mailto:[email protected]]
> *Sent:* Thursday, September 19, 2013 12:39 PM
> *To:* [email protected]
> *Subject:* Re: BatchWriter performance on 1.4
>
> Are you aware of the multi table batch writer?  I am not sure if it would
> be useful, but wanted to make sure you knew about it.  It will use the
> same thread pool to process mutations for multiple tables.  Also it will
> batch mutations for multiple tablets into the same rpc calls.
>
> On Wed, Sep 18, 2013 at 5:07 PM, Slater, David M. <[email protected]>
> wrote:
>
> Hi, I’m running a single-threaded ingestion program that takes data from
> an input source, parses it into mutations, and then writes those mutations
> (sequentially) to four different BatchWriters (all on different tables).
> Most of the time (95%) is spent adding mutations, e.g.
> batchWriter.addMutations(mutations); I am wondering how to reduce the time
> taken by these methods.
>
> 1) For the method batchWriter.addMutations(Iterable<Mutation>), does it
> matter for performance whether the mutations returned by the iterator are
> sorted in lexicographic order?
>
> 2) If the Iterable<Mutation> that I pass to the BatchWriter is very large,
> will I need to wait for a number of Batches to be written and flushed
> before it will finish iterating, or does it transfer the elements of the
> Iterable to a different intermediate list?
>
> 3) If that is the case, would it then make sense to spawn off short
> threads for each time I make use of addMutations?
>
> At a high level, my code looks like this:
>
> BatchWriter bw1 = connector.createBatchWriter(…)
> BatchWriter bw2 = …
> …
> while(true) {
>     String[] data = input.getData();
>     List<Mutation> mutations1 = parseData1(data);
>     List<Mutation> mutations2 = parseData2(data);
>     …
>     bw1.addMutations(mutations1);
>     bw2.addMutations(mutations2);
>     …
> }
>
> Thanks,
> David
