mfmettler opened a new issue #1975: URL: https://github.com/apache/accumulo/issues/1975
We've tried testing Datawave with Accumulo 2.0.1 using the TinyLfuBlockCacheManager, and the tservers hang under load within a few minutes. It reproduces every time. The same configuration and tests run fine with LruBlockCacheManager.

Our configuration includes:

default | table.cache.block.enable .......................... | false
system  |    @override ...................................... | true
default | table.cache.index.enable .......................... | true
default | tserver.cache.data.size ........................... | 10%
site    |    @override ...................................... | 2G
system  |    @override ...................................... | 4G
default | tserver.cache.index.size .......................... | 25%
site    |    @override ...................................... | 2G
system  |    @override ...................................... | 1G
default | tserver.cache.manager.class ....................... | org.apache.accumulo.core.file.blockfile.cache.lru.LruBlockCacheManager
system  |    @override ...................................... | org.apache.accumulo.core.file.blockfile.cache.tinylfu.TinyLfuBlockCacheManager

We're testing on a 10 worker, 2 manager node configuration on AWS.

CentOS Linux release 7.7.1908 (Core)
Hadoop 3.0.0-cdh6.3.1
Accumulo 2.0.1

We can use Datawave to supply the load and trigger the hang, but we don't need to. We can also cause a hang by running a stand-alone test program and iterators against the tables written out by Datawave.

After a restart to install TinyLfuBlockCacheManager, Accumulo starts out okay, runs for a few minutes, and then hangs. CPU load on the workers falls to zero. The Accumulo shell reports a long list of scans, but they just sit there without completing. The search running at the time of the hang doesn't error out, but it doesn't complete either. The tserver log files just go quiet: no errors, no warnings, no messages. You can still connect to Accumulo and start another query, but it also hangs and never finishes.

We are able to use the Accumulo shell to set tserver.cache.manager.class back to LruBlockCacheManager for the next restart.

If there is additional information that would be helpful, please let us know.
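
To make the effective settings in the configuration listing above easier to read, here is a minimal sketch that resolves each property to its last listed value. It assumes that rows appear in increasing precedence (default, then site, then system) and that each `@override` row applies to the property named on the nearest preceding row. The function name `effective_config` is hypothetical and is not part of Accumulo.

```python
# Listing copied from the report: scope | property | value.
LISTING = """\
default | table.cache.block.enable .......................... | false
system | @override ...................................... | true
default | table.cache.index.enable .......................... | true
default | tserver.cache.data.size ........................... | 10%
site | @override ...................................... | 2G
system | @override ...................................... | 4G
default | tserver.cache.index.size .......................... | 25%
site | @override ...................................... | 2G
system | @override ...................................... | 1G
default | tserver.cache.manager.class ....................... | org.apache.accumulo.core.file.blockfile.cache.lru.LruBlockCacheManager
system | @override ...................................... | org.apache.accumulo.core.file.blockfile.cache.tinylfu.TinyLfuBlockCacheManager
"""


def effective_config(listing):
    """Return {property: (scope, value)}, keeping the last row seen per property.

    Assumption: later rows take precedence over earlier ones.
    """
    effective = {}
    current = None  # property named on the most recent non-override row
    for line in listing.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) != 3:
            continue  # skip blank or malformed rows
        scope, name, value = parts
        name = name.rstrip(". ")  # drop the dotted padding after the name
        if name == "@override":
            if current is None:
                continue  # override row with nothing to attach to
            name = current
        else:
            current = name
        effective[name] = (scope, value)
    return effective


if __name__ == "__main__":
    for prop, (scope, value) in effective_config(LISTING).items():
        print(f"{prop} = {value} (from {scope})")
```

Under those assumptions, the hanging runs used a 4G data cache, a 1G index cache, block caching enabled, and the TinyLfuBlockCacheManager.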
