keith-turner opened a new pull request, #5040: URL: https://github.com/apache/accumulo/pull/5040
ZooCache currently has zero contention among threads for reads of cached data. It achieves this with snapshots of the data. However, it has two problems for updates. First, there is a global lock for writes. Second, writes are expensive because the snapshots used for reads must be recomputed on each update, which can lead to O(N^2) behavior for a large series of rapid small updates.

This change removes the global update lock and the snapshots and replaces them with a ConcurrentHashMap. It also replaces the three maps that existed with a single ConcurrentHashMap whose complex value represents the data that was in the three maps. This single complex value allows removal of the global lock, which existed to keep the three maps in sync for a given path. Now, for any path, all of its data is stored in a single value in the map and can be safely updated with the compute function on ConcurrentHashMap, which only allows one thread to update a path at a time.

This change should maintain read performance similar to the snapshots because ConcurrentHashMap does not block reads for writes. If a compute operation is in progress when a read happens, the read returns the previously computed value without blocking. This is the same behavior that ZooCache used to have. So hopefully this change has similar read performance to before and much better update performance: updates to different paths should be able to proceed in parallel, and the snapshots no longer need to be computed on update.
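
To illustrate the idea, here is a minimal sketch of a per-path cache built on a single ConcurrentHashMap with one composite value per path. It is not the code in this PR. The names `PathCacheSketch`, `Node`, `putData`, and `putChildren` are hypothetical, and the contents of the composite value (data bytes, a version standing in for stat information, and child names) are an assumption about what the three maps held.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not the actual ZooCache implementation.
final class PathCacheSketch {

  // Immutable composite value: everything cached for one path lives in one object,
  // so there is nothing to keep in sync across several maps.
  static final class Node {
    final byte[] data;           // null if no data is cached for the path
    final long version;          // assumed stand-in for stat info; -1 if unknown
    final List<String> children; // null if children are not cached for the path

    Node(byte[] data, long version, List<String> children) {
      this.data = data;
      this.version = version;
      this.children = children;
    }
  }

  private final ConcurrentHashMap<String, Node> nodes = new ConcurrentHashMap<>();

  // compute() runs the function atomically for this key, so only one thread
  // updates a given path at a time; updates to other paths are not serialized
  // behind a global lock.
  void putData(String path, byte[] data, long version) {
    nodes.compute(path,
        (p, old) -> new Node(data.clone(), version, old == null ? null : old.children));
  }

  void putChildren(String path, List<String> children) {
    List<String> copy = List.copyOf(children);
    nodes.compute(path,
        (p, old) -> old == null ? new Node(null, -1, copy) : new Node(old.data, old.version, copy));
  }

  // get() does not block on a concurrent compute(); it returns the last
  // fully published value for the path, or null if nothing is cached.
  Node get(String path) {
    return nodes.get(path);
  }

  void remove(String path) {
    nodes.remove(path);
  }

  public static void main(String[] args) {
    PathCacheSketch cache = new PathCacheSketch();
    cache.putData("/tables/1", new byte[] {42}, 3L);
    cache.putChildren("/tables/1", List.of("conf", "name"));
    Node n = cache.get("/tables/1");
    System.out.println(n.version + " " + n.children);
  }
}
```

Because each `Node` is immutable and is replaced wholesale inside `compute`, a reader sees either the old value or the new value for a path, never a partially updated mix of data and children.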
