Josh,

We experimented with 1.5.1 today; our load test numbers seem to indicate a 10x performance improvement over 1.5.0 on a single JVM. We are running additional experiments over the next few days to see what happens when we move to multiple JVMs. Stay tuned.
Thanks,
Ariel
---
Sent from my mobile device. Please excuse any errors.

> On Feb 12, 2014, at 6:01 PM, Josh Elser <[email protected]> wrote:
>
> Also, for completeness: I filed ACCUMULO-2362 to work on concurrent accesses
> to the same instance in the same JVM.
>
> Also, I misspoke earlier: much of the lock contention comes out of the Tables
> class, not from the Instance. ZooCache keeps a static map of instance to
> ZooCache which are used by a wide breadth of API calls.
>
>> On 2/12/14, 3:58 PM, Josh Elser wrote:
>> ACCUMULO-1833 was merged into 1.5.1-SNAPSHOT a long time ago. I probably
>> never cleaned up the branch after I finished the ticket.
>>
>> I believe John Vines started looking at using Curator, but I think he
>> decided in the end that there weren't significant gains to be had by
>> using it. I'm sure he commented on the ticket he had for it.
>>
>>> On 2/12/14, 3:56 PM, Ariel Valentin wrote:
>>> Is the 1833 branch going to be part of 1.5.1?
>>> I recall reading somewhere that there was interest in using Curator to
>>> ameliorate working with zookeeper. Is that still part of the release
>>> roadmap?
>>>
>>> Thanks,
>>> Ariel
>>> ---
>>> Sent from my mobile device. Please excuse any errors.
>>>
>>>> On Feb 12, 2014, at 3:13 PM, Josh Elser <[email protected]> wrote:
>>>>
>>>> Great, that helps. Thanks for the info, Ariel!
>>>>
>>>> I think this might be an area we want to revisit in later versions of
>>>> Accumulo to make the client API implementations a little more robust
>>>> and supportive of concurrent usage.
>>>>
>>>>> On 2/12/14, 3:10 PM, Ariel Valentin wrote:
>>>>> Josh,
>>>>>
>>>>> The symptom is that we hit a point where a single server seems
>>>>> "unresponsive" but we do not see anything unusual going on in that
>>>>> machine and it seems idle. No heavy CPU, no I/O wait, low load average;
>>>>> however when we add additional instances of the JVM our capacity seems
>>>>> to increase linearly.
>>>>>
>>>>> Based on thread dumps and profiler stats it appears that under "heavy"
>>>>> load most of our threads are blocked trying to access ZooCache.
>>>>>
>>>>> Ariel Valentin
>>>>> e-mail: [email protected]
>>>>> website: http://blog.arielvalentin.com
>>>>> skype: ariel.s.valentin
>>>>> twitter: arielvalentin
>>>>> linkedin: http://www.linkedin.com/profile/view?id=8996534
>>>>> ---------------------------------------
>>>>> *simplicity *communication
>>>>> *feedback *courage *respect
>>>>>
>>>>> On Wed, Feb 12, 2014 at 1:41 PM, Josh Elser <[email protected]> wrote:
>>>>>
>>>>> Didn't mean to ask about the subject matter, but how you were using
>>>>> the API. Are you actually seeing contention on ZooCache?
>>>>>
>>>>> On 2/12/14, 1:19 PM, Ariel Valentin wrote:
>>>>>
>>>>> Sorry but I am not at liberty to be specific about our business
>>>>> problem.
>>>>>
>>>>> Typical usage is multiple clients writing data to tables, which
>>>>> scan to avoid duplicate entries.
>>>>>
>>>>> Ariel Valentin
>>>>>
>>>>> On Wed, Feb 12, 2014 at 10:59 AM, Josh Elser <[email protected]> wrote:
>>>>>
>>>>> Also, I forgot this part before:
>>>>>
>>>>> The ZooCache instance that's used *typically* comes from the
>>>>> Instance object that your Connector was created from.
>>>>> In other words, if you create multiple Instances (ZooKeeperInstance,
>>>>> usually), you can get multiple ZooCaches, which means that concurrent
>>>>> calls to methods off of those objects should not block one another
>>>>> (createScanner off of connector1 from instance1 should not block
>>>>> createScanner off of connector2 from instance2).
>>>>>
>>>>> That should be something quick you can play with if you so desire.
>>>>>
>>>>> On 2/12/14, 9:57 AM, Josh Elser wrote:
>>>>>
>>>>> Yep, you'll likely also block on BatchScanner, anything in
>>>>> TableOperations, and a host of other things.
>>>>>
>>>>> For scanners, there's likely a standing recommendation to amortize the
>>>>> use of those objects (if you want to look up 5 ranges, don't make 5
>>>>> scanners).
>>>>>
>>>>> Creating a cache per member in the work would likely require some kind
>>>>> of paxos implementation to provide consistency, which is highly
>>>>> undesirable.
>>>>>
>>>>> One thing I'm curious about is the impact of removing ZooCache
>>>>> altogether from things like the client api and seeing what happens.
>>>>> I don't have a good way to measure that impact off the top of my head
>>>>> though.
>>>>>
>>>>> Anyways, is this causing you problems in your usage of the api?
>>>>> Could you elaborate a bit more on the specifics?
>>>>>
>>>>> On Feb 12, 2014 4:48 AM, "Ariel Valentin" <[email protected]>
>>>>> wrote:
>>>>>
>>>>> I have run into a problem related to ACCUMULO-1833, which appears to
>>>>> have addressed the issue for MultiTableBatchWriter; however I am
>>>>> seeing this issue on the scanner side also:
>>>>>
>>>>> "http-/192.168.220.196:8080-35" daemon prio=10 tid=0x00007f3108038000 nid=0x538a waiting for monitor entry [0x00007f31287d1000]
>>>>>    java.lang.Thread.State: BLOCKED (on object monitor)
>>>>>     at org.apache.accumulo.fate.zookeeper.ZooCache.getInstance(ZooCache.java:301)
>>>>>     - waiting to lock <0x00000000fa64f5b8> (a java.lang.Class for org.apache.accumulo.fate.zookeeper.ZooCache)
>>>>>     at org.apache.accumulo.core.client.impl.Tables.getZooCache(Tables.java:40)
>>>>>     at org.apache.accumulo.core.client.impl.Tables.getMap(Tables.java:44)
>>>>>     at org.apache.accumulo.core.client.impl.Tables.getNameToIdMap(Tables.java:78)
>>>>>     at org.apache.accumulo.core.client.impl.Tables.getTableId(Tables.java:64)
>>>>>     at org.apache.accumulo.core.client.impl.ConnectorImpl.getTableId(ConnectorImpl.java:75)
>>>>>     at org.apache.accumulo.core.client.impl.ConnectorImpl.createScanner(ConnectorImpl.java:137)
>>>>>
>>>>> I have not spent enough time reasoning about the code to understand
>>>>> all of the nuances but I am interested in knowing if there are any
>>>>> mitigating
>>>>> strategies for dealing with this thread contention, e.g. would
>>>>> creating a cache entry for each member of the ZooKeeper ensemble
>>>>> help relieve the strain? Use multiple classloaders? Or is my only
>>>>> option to spawn multiple JVMs?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Ariel Valentin
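
For readers following the thread: the dump shows request threads waiting on the monitor of the ZooCache class inside ZooCache.getInstance, which is consistent with a class-level (static synchronized) lock guarding a static map of caches. The sketch below is only an illustration of that general pattern next to one possible lower-contention alternative. It is not Accumulo's code and not the change that went into 1.5.1; the names CacheLookupSketch, Cache, getInstanceLocked and getInstanceConcurrent, as well as the connection strings, are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class CacheLookupSketch {

    /** Hypothetical stand-in for a per-ensemble cache; not Accumulo's ZooCache. */
    static final class Cache {
        final String zooKeepers;

        Cache(String zooKeepers) {
            this.zooKeepers = zooKeepers;
        }
    }

    // Variant 1: every lookup, hit or miss, must acquire the class-wide monitor,
    // so concurrent callers queue up behind one another.
    private static final Map<String, Cache> LOCKED = new HashMap<>();

    static synchronized Cache getInstanceLocked(String zooKeepers) {
        Cache cache = LOCKED.get(zooKeepers);
        if (cache == null) {
            cache = new Cache(zooKeepers);
            LOCKED.put(zooKeepers, cache);
        }
        return cache;
    }

    // Variant 2: hits are served by ConcurrentHashMap.get, which does not block;
    // only a miss falls through to computeIfAbsent, which creates the entry once.
    private static final ConcurrentHashMap<String, Cache> CONCURRENT = new ConcurrentHashMap<>();

    static Cache getInstanceConcurrent(String zooKeepers) {
        Cache cache = CONCURRENT.get(zooKeepers);
        return cache != null ? cache : CONCURRENT.computeIfAbsent(zooKeepers, Cache::new);
    }

    public static void main(String[] args) throws InterruptedException {
        final String quorum = "zk1:2181,zk2:2181"; // made-up connection string
        final Cache expected = getInstanceConcurrent(quorum);
        final AtomicInteger mismatches = new AtomicInteger();

        // Hammer the lookup from several threads and count any lookup that
        // returns a different object than the first one.
        ExecutorService pool = Executors.newFixedThreadPool(8);
        for (int i = 0; i < 8; i++) {
            pool.execute(() -> {
                for (int j = 0; j < 100_000; j++) {
                    if (getInstanceConcurrent(quorum) != expected) {
                        mismatches.incrementAndGet();
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        System.out.println("mismatched lookups: " + mismatches.get());
    }
}
```

Both variants hand back one shared cache per connection string; they differ only in whether a cache hit has to wait for a lock. Whether this matches what the real client does internally would need to be checked against the Accumulo source and the tickets mentioned in the thread (ACCUMULO-1833, ACCUMULO-2362).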
