mfmettler opened a new issue #1975:
URL: https://github.com/apache/accumulo/issues/1975


   We've tried testing Datawave with Accumulo 2.0.1 with the
   TinyLfuBlockCacheManager, and the tservers hang under load within a few
   minutes.  It reproduces every time.  The same configuration and tests
   run fine with LruBlockCacheManager.  Our configuration includes:
   
   default    | table.cache.block.enable .......................... | false
   system     |    @override ...................................... | true
   default    | table.cache.index.enable .......................... | true
   
   default    | tserver.cache.data.size ........................... | 10%
   site       |    @override ...................................... | 2G
   system     |    @override ...................................... | 4G
   default    | tserver.cache.index.size .......................... | 25%
   site       |    @override ...................................... | 2G
   system     |    @override ...................................... | 1G
   default    | tserver.cache.manager.class ....................... | org.apache.accumulo.core.file.blockfile.cache.lru.LruBlockCacheManager
   system     |    @override ...................................... | org.apache.accumulo.core.file.blockfile.cache.tinylfu.TinyLfuBlockCacheManager
   
   We're testing on a cluster of 10 worker nodes and 2 manager nodes on AWS.
   
   CentOS Linux release 7.7.1908 (Core)
   Hadoop 3.0.0-cdh6.3.1
   Accumulo 2.0.1
   
   Datawave is not required to reproduce the hang.  We can use it to supply
   the load, but we can also cause a hang by running a stand-alone test
   program and iterators on the tables written out by Datawave.
   
   After a restart to install TinyLfuBlockCacheManager, Accumulo starts out
   okay, runs for a few minutes, and then hangs.  CPU load on the workers
   falls to zero.
   There is a long list of scans reported in ashell, but they just sit there
   not completing.  The search running at the time of the hang doesn't error
   out, but it doesn't complete either.  The tserver logfiles just go quiet.
   No errors, no warnings, no messages.  
   
   You can still connect to Accumulo and start another query, but it also
   hangs and never finishes.  We are able to use the ashell to configure
   tserver.cache.manager.class back to LruBlockCacheManager for the next
   restart.
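   
   A revert like this can be done from the shell with a command along these
   lines.  This is a sketch that assumes the standard `config -s` syntax for
   setting a system-wide property; it is not a transcript of our session.

   ```
   config -s tserver.cache.manager.class=org.apache.accumulo.core.file.blockfile.cache.lru.LruBlockCacheManager
   ```

   The new value only takes effect once the tservers are restarted.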
   
   If there is additional information that would be helpful, please let 
   us know.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
