> On June 17, 2014, 12:38 a.m., Josh Elser wrote:
> >

Thanks for the review!


> On June 17, 2014, 12:38 a.m., Josh Elser wrote:
> > core/src/main/java/org/apache/accumulo/core/client/impl/Writer.java, line 
> > 155
> > <https://reviews.apache.org/r/22658/diff/1/?file=610893#file610893line155>
> >
> >     I'm not quite sure I understand the use of a ThreadLocal here. I'm 
> > thinking about a case where I have multiple tablets on a tserver. I have 
> > two writes coming in with updates to that same tserver, one to each tablet. 
> > Aren't those updates going to be running in their own threads anyways, so 
> > we wouldn't actually see any batching happening?
> >     
> >     If I'm missing something, please point me at some code and I'd be happy 
> > to try to understand better myself the context.

So Adam pointed out to me that this code in Writer probably more appropriately 
belongs in a server-side class rather than one released to clients. This 
batching behavior was primarily designed for internal use within a TabletServer 
(for now, specifically with metadata table updates).

For some context, the place where this batching is most beneficial is when 
we're adding an entry into the Metadata table when a tablet is referenced in a 
new WAL. Currently, for all tablets in an "update batch" (startUpdate -> 
applyUpdates -> closeUpdate), we make a separate update call one at a time (see 
TabletServerLogger). In one of our clusters, we've seen hsync perform unusually 
slow. Each separate update call will result in 1 hsync call. If there are many 
tablets present in an "update batch", this will yield many separate invocations 
to hsync. We found our ingest rate dip around the time a WAL rollover occured 
as a result.

The reasoning behind choosing a ThreadLocal is as follows. I didn't want 
separate tserver client threads committing another tserver client's batch of 
mutations (it may be safe to do so). Therefore, I wanted some level of 
isolation provided for each client thread to simplify the picture. The use of a 
ThreadLocal could be avoided if each tserver client serving thread would get 
its own Writer instance. 2 factors lead me to avoid choosing that route:
- I saw that we were typically storing Writer instances and re-using them. 
- The use of the Writer class was an implementation detail to the 
TabletServerLogger. 

Admittedly, the semantics get trickier as we now need to carefully ensure that 
the batch is always emptied when the thread finishes updating new logs (whether 
it succeeds or errors). 


> On June 17, 2014, 12:38 a.m., Josh Elser wrote:
> > core/src/main/java/org/apache/accumulo/core/client/impl/Writer.java, line 
> > 217
> > <https://reviews.apache.org/r/22658/diff/1/?file=610893#file610893line217>
> >
> >     Do we want to spin faster than half a second?

I haven't given much thought to this. I opted to use the same duration as the 
single-update case.


> On June 17, 2014, 12:38 a.m., Josh Elser wrote:
> > core/src/main/java/org/apache/accumulo/core/client/impl/Writer.java, line 
> > 242
> > <https://reviews.apache.org/r/22658/diff/1/?file=610893#file610893line242>
> >
> >     Again, is 500ms too long to spin before retrying?

Same as before.


> On June 17, 2014, 12:38 a.m., Josh Elser wrote:
> > server/base/src/main/java/org/apache/accumulo/server/util/MetadataTableUtil.java,
> >  line 154
> > <https://reviews.apache.org/r/22658/diff/1/?file=610894#file610894line154>
> >
> >     Do we want to be batching updates to the root table? We should really 
> > be updating the root table about log(n) as often as we update the metadata 
> > table, right?

We don't necessarily have to. I did it more for consistency than necessity.


On June 17, 2014, 12:38 a.m., Jonathan Park wrote:
> > First pass at looking over this. I need to give it a few more looks (with 
> > proper external context). For the next patch, can you make sure to run the 
> > new code through the formatter too, please? I noticed a few places where 
> > you have your own style :).

Oh, heh oops :) yeah I'll run it through a formatter.


- Jonathan


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/22658/#review45871
-----------------------------------------------------------


On June 16, 2014, 9:59 p.m., Jonathan Park wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/22658/
> -----------------------------------------------------------
> 
> (Updated June 16, 2014, 9:59 p.m.)
> 
> 
> Review request for accumulo.
> 
> 
> Bugs: ACCUMULO-2889
>     https://issues.apache.org/jira/browse/ACCUMULO-2889
> 
> 
> Repository: accumulo
> 
> 
> Description
> -------
> 
> Added additional methods to the Writer class to handle batching. 
> 
> Potential risks are that we're now holding onto the locks for a bit longer 
> than we used to. All tablets present in a batch will have their logLock's 
> locked until the batch is complete. 
> 
> 
> Diffs
> -----
> 
>   core/src/main/java/org/apache/accumulo/core/client/impl/Writer.java d6762e7 
>   
> server/base/src/main/java/org/apache/accumulo/server/util/MetadataTableUtil.java
>  374017d 
>   server/tserver/src/main/java/org/apache/accumulo/tserver/Tablet.java 
> 36b2289 
>   server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServer.java 
> e2510d7 
>   
> server/tserver/src/main/java/org/apache/accumulo/tserver/log/TabletServerLogger.java
>  d25ee75 
>   test/src/test/java/org/apache/accumulo/test/BatchMetadataUpdatesIT.java 
> PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/22658/diff/
> 
> 
> Testing
> -------
> 
> Added a new IT.
> 
> We insert a few entries to tablets and ensure that the relevant entries 
> appear in the WAL and Metadata Table.
> One test case that isn't included yet is verifying that root + metadata table 
> entries are entered correctly.
> 
> 
> Thanks,
> 
> Jonathan Park
> 
>

Reply via email to