> On June 17, 2014, 12:38 a.m., Josh Elser wrote: > >
Thanks for the review! > On June 17, 2014, 12:38 a.m., Josh Elser wrote: > > core/src/main/java/org/apache/accumulo/core/client/impl/Writer.java, line > > 155 > > <https://reviews.apache.org/r/22658/diff/1/?file=610893#file610893line155> > > > > I'm not quite sure I understand the use of a ThreadLocal here. I'm > > thinking about a case where I have multiple tablets on a tserver. I have > > two writes coming in with updates to that same tserver, one to each tablet. > > Aren't those updates going to be running in their own threads anyways, so > > we wouldn't actually see any batching happening? > > > > If I'm missing something, please point me at some code and I'd be happy > > to try to understand better myself the context. So Adam pointed out to me that this code in Writer probably more appropriately belongs in a server-side class rather than one released to clients. This batching behavior was primarily designed for internal use within a TabletServer (for now, specifically with metadata table updates). For some context, the place where this batching is most beneficial is when we're adding an entry into the Metadata table when a tablet is referenced in a new WAL. Currently, for all tablets in an "update batch" (startUpdate -> applyUpdates -> closeUpdate), we make a separate update call one at a time (see TabletServerLogger). In one of our clusters, we've seen hsync perform unusually slow. Each separate update call will result in 1 hsync call. If there are many tablets present in an "update batch", this will yield many separate invocations to hsync. We found our ingest rate dip around the time a WAL rollover occured as a result. The reasoning behind choosing a ThreadLocal is as follows. I didn't want separate tserver client threads committing another tserver client's batch of mutations (it may be safe to do so). Therefore, I wanted some level of isolation provided for each client thread to simplify the picture. The use of a ThreadLocal could be avoided if each tserver client serving thread would get its own Writer instance. 2 factors lead me to avoid choosing that route: - I saw that we were typically storing Writer instances and re-using them. - The use of the Writer class was an implementation detail to the TabletServerLogger. Admittedly, the semantics get trickier as we now need to carefully ensure that the batch is always emptied when the thread finishes updating new logs (whether it succeeds or errors). > On June 17, 2014, 12:38 a.m., Josh Elser wrote: > > core/src/main/java/org/apache/accumulo/core/client/impl/Writer.java, line > > 217 > > <https://reviews.apache.org/r/22658/diff/1/?file=610893#file610893line217> > > > > Do we want to spin faster than half a second? I haven't given much thought to this. I opted to use the same duration as the single-update case. > On June 17, 2014, 12:38 a.m., Josh Elser wrote: > > core/src/main/java/org/apache/accumulo/core/client/impl/Writer.java, line > > 242 > > <https://reviews.apache.org/r/22658/diff/1/?file=610893#file610893line242> > > > > Again, is 500ms too long to spin before retrying? Same as before. > On June 17, 2014, 12:38 a.m., Josh Elser wrote: > > server/base/src/main/java/org/apache/accumulo/server/util/MetadataTableUtil.java, > > line 154 > > <https://reviews.apache.org/r/22658/diff/1/?file=610894#file610894line154> > > > > Do we want to be batching updates to the root table? We should really > > be updating the root table about log(n) as often as we update the metadata > > table, right? We don't necessarily have to. I did it more for consistency than necessity. On June 17, 2014, 12:38 a.m., Jonathan Park wrote: > > First pass at looking over this. I need to give it a few more looks (with > > proper external context). For the next patch, can you make sure to run the > > new code through the formatter too, please? I noticed a few places where > > you have your own style :). Oh, heh oops :) yeah I'll run it through a formatter. - Jonathan ----------------------------------------------------------- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/22658/#review45871 ----------------------------------------------------------- On June 16, 2014, 9:59 p.m., Jonathan Park wrote: > > ----------------------------------------------------------- > This is an automatically generated e-mail. To reply, visit: > https://reviews.apache.org/r/22658/ > ----------------------------------------------------------- > > (Updated June 16, 2014, 9:59 p.m.) > > > Review request for accumulo. > > > Bugs: ACCUMULO-2889 > https://issues.apache.org/jira/browse/ACCUMULO-2889 > > > Repository: accumulo > > > Description > ------- > > Added additional methods to the Writer class to handle batching. > > Potential risks are that we're now holding onto the locks for a bit longer > than we used to. All tablets present in a batch will have their logLock's > locked until the batch is complete. > > > Diffs > ----- > > core/src/main/java/org/apache/accumulo/core/client/impl/Writer.java d6762e7 > > server/base/src/main/java/org/apache/accumulo/server/util/MetadataTableUtil.java > 374017d > server/tserver/src/main/java/org/apache/accumulo/tserver/Tablet.java > 36b2289 > server/tserver/src/main/java/org/apache/accumulo/tserver/TabletServer.java > e2510d7 > > server/tserver/src/main/java/org/apache/accumulo/tserver/log/TabletServerLogger.java > d25ee75 > test/src/test/java/org/apache/accumulo/test/BatchMetadataUpdatesIT.java > PRE-CREATION > > Diff: https://reviews.apache.org/r/22658/diff/ > > > Testing > ------- > > Added a new IT. > > We insert a few entries to tablets and ensure that the relevant entries > appear in the WAL and Metadata Table. > One test case that isn't included yet is verifying that root + metadata table > entries are entered correctly. > > > Thanks, > > Jonathan Park > >
