FWIW, you can probably avoid the scan by making your insert idempotent aside from the timestamp and letting versioning handle deduplication.
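
To make that concrete, here is a toy sketch in plain Java. It is not the Accumulo client API: VersioningSketch, Table, put and scan are all made-up names, and the in-memory map only imitates a table that retains one version per (row, family, qualifier), which as far as I know is what the default VersioningIterator configuration does. The point it tries to show is that if every client derives the same key and value for the same logical record, a blind write is enough and no scan-before-insert is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy in-memory model of "keep only the newest version per key".
// This is NOT the Accumulo client API; every name here is hypothetical.
class VersioningSketch {

    /** One stored version of a cell: a timestamp and a value. */
    static final class Version {
        final long timestamp;
        final String value;

        Version(long timestamp, String value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /** Stand-in for a table configured to retain a single version per key. */
    static final class Table {
        // (row, family, qualifier) -> newest version seen so far
        private final Map<String, Version> cells = new TreeMap<>();

        /** Blind write: no read-before-write and no existence check. */
        void put(String row, String family, String qualifier, long timestamp, String value) {
            String key = row + '\0' + family + '\0' + qualifier;
            Version current = cells.get(key);
            // Keep the write only if it is at least as new as what is stored.
            if (current == null || timestamp >= current.timestamp) {
                cells.put(key, new Version(timestamp, value));
            }
        }

        /** Returns the surviving values in key order. */
        List<String> scan() {
            List<String> values = new ArrayList<>();
            for (Version v : cells.values()) {
                values.add(v.value);
            }
            return values;
        }
    }

    public static void main(String[] args) {
        Table table = new Table();
        // Two clients write the same logical record; only the timestamp differs.
        table.put("order-42", "detail", "status", 1000L, "shipped");
        table.put("order-42", "detail", "status", 1007L, "shipped");
        // A different record gets its own key.
        table.put("order-43", "detail", "status", 1003L, "pending");
        System.out.println(table.scan());
    }
}
```

The duplicate write for order-42 collapses into a single entry, so the scan returns two values rather than three. Against a real table the same reasoning would apply to Mutations written through a BatchWriter, assuming the row, column and visibility are derived deterministically from the record.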

On Wed, Feb 12, 2014 at 1:19 PM, Ariel Valentin <[email protected]> wrote:

> Sorry but I am not at liberty to be specific about our business problem.
>
> Typical usage is multiple clients writing data to tables, which scan to
> avoid duplicate entries.
>
> Ariel Valentin
> e-mail: [email protected]
>
> website: http://blog.arielvalentin.com
> skype: ariel.s.valentin
> twitter: arielvalentin
> linkedin: http://www.linkedin.com/profile/view?id=8996534
> ---------------------------------------
> *simplicity *communication
> *feedback *courage *respect
>
>
> On Wed, Feb 12, 2014 at 10:59 AM, Josh Elser <[email protected]> wrote:
>
>> Also, I forgot this part before:
>>
>> The ZooCache instance that's used *typically* comes from the Instance
>> object that your Connector was created from. In other words, if you create
>> multiple Instances (ZooKeeperInstance, usually), you can get multiple
>> ZooCaches which means that concurrent calls to methods off of those objects
>> should not block one another (createScanner off of connector1 from
>> instance1 should not block createScanner off of connector2 from instance2).
>>
>> That should be something quick you can play with if you so desire.
>>
>>
>> On 2/12/14, 9:57 AM, Josh Elser wrote:
>>
>>> Yep, you'll likely also block on BatchScanner, anything in
>>> TableOperations, and a host of other things.
>>>
>>> For scanners, there's likely a standing recommendation to amortize the
>>> use of those objects (if you want to look up 5 ranges, don't make 5
>>> scanners).
>>>
>>> Creating a cache per member in the work would likely require some kind
>>> of paxos implementation to provide consistency which is highly
>>> undesirable.
>>>
>>> One thing I'm curious about is the impact of removing ZooCache
>>> altogether from things like the client api and see what happens. I don't
>>> have a good way to measure that impact off the top of my head though.
>>>
>>> Anyways, is this causing you problems in your usage of the api? Could
>>> you elaborate a bit more on the specifics?
>>>
>>> On Feb 12, 2014 4:48 AM, "Ariel Valentin" <[email protected]
>>> <mailto:[email protected]>> wrote:
>>>
>>>     I have run into a problem related to ACCUMULO-1833, which appears to
>>>     have addressed the issue for MultiTableBatchWriter; however I am
>>>     seeing this issue on the scanner side also:
>>>
>>>     "http-/192.168.220.196:8080-35" daemon prio=10 tid=0x00007f3108038000 nid=0x538a waiting for monitor entry [0x00007f31287d1000]
>>>        java.lang.Thread.State: BLOCKED (on object monitor)
>>>         at org.apache.accumulo.fate.zookeeper.ZooCache.getInstance(ZooCache.java:301)
>>>         - waiting to lock <0x00000000fa64f5b8> (a java.lang.Class for org.apache.accumulo.fate.zookeeper.ZooCache)
>>>         at org.apache.accumulo.core.client.impl.Tables.getZooCache(Tables.java:40)
>>>         at org.apache.accumulo.core.client.impl.Tables.getMap(Tables.java:44)
>>>         at org.apache.accumulo.core.client.impl.Tables.getNameToIdMap(Tables.java:78)
>>>         at org.apache.accumulo.core.client.impl.Tables.getTableId(Tables.java:64)
>>>         at org.apache.accumulo.core.client.impl.ConnectorImpl.getTableId(ConnectorImpl.java:75)
>>>         at org.apache.accumulo.core.client.impl.ConnectorImpl.createScanner(ConnectorImpl.java:137)
>>>
>>>     I have not spent enough time reasoning about the code to understand
>>>     all of the nuances but I am interested in knowing if there are any
>>>     mitigating strategies for dealing with this thread contention e.g.
>>>     would creating a cache entry for each member of the ZooKeeper
>>>     ensemble help relieve the strain? use multiple classloaders? or is
>>>     my only option to spawn multiple JVMs?
>>>
>>>     Thanks,
>>>
>>>     Ariel Valentin
>>>     e-mail: [email protected] <mailto:[email protected]>
>>>
>>>     website: http://blog.arielvalentin.com
>>>     skype: ariel.s.valentin
>>>     twitter: arielvalentin
>>>     linkedin: http://www.linkedin.com/profile/view?id=8996534
>>>     ---------------------------------------
>>>     *simplicity *communication
>>>     *feedback *courage *respect
