We have a 3-node Accumulo 1.7 cluster running as VMware VMs, with a minute amount of data by Accumulo standards.
We have run into a situation multiple times now where all the nodes lose power and, when they try to recover simultaneously, the walog grows exponentially and fills up all the available disk space. We have confirmed that the walog folder under /accumulo in HDFS is consuming 99% of the disk space.

We have tried freeing enough space to run the Accumulo processes, in the hope that they would burn through the walog, without success. The walog just grew to take up the freed space.

Given that we need to manage the power situation better, we're trying to understand what could be causing this and whether there's anything we can do to avoid it. In case you're wondering, we have some heartbeat data being written to a table at a very small, constant rate, which is not sufficient to cause such a large write-ahead log, even if HDFS was pulled out from under Accumulo's feet, so to speak, during the power failure.

Thank you,
Jayesh
