I am attaching a log from another cluster where I had this problem today. It went through "recovering lease" on all the files and eventually got stuck with that OOM. I included the parts that mention wal.WALProcedureStore. On that cluster, I now see only one state*log file:

[hadoop@wdc01is-ja-prod-hbase5 logs]$ /usr/local/hadoop/bin/hdfs dfs -ls /hbase/MasterProcWALs/state*log
-rw-r--r-- 3 hadoop supergroup 0 2017-09-08 21:44 /hbase/MasterProcWALs/state-00000000000000000009.log
I moved all the earlier ones to a _deleteme dir so I could start the master. The attached log is masterLog.txt <http://apache-hbase.679495.n3.nabble.com/file/t496798/masterLog.txt>. Here is the listing of the files I moved:

[hadoop@wdc01is-ja-prod-hbase5 logs]$ /usr/local/hadoop/bin/hdfs dfs -ls /hbase/MasterProcWALs_deleteme/state*log|more
-rw-r--r-- 3 hadoop supergroup 33557009 2017-08-31 08:21 /hbase/MasterProcWALs_deleteme/state-00000000000000003828.log
-rw-r--r-- 3 hadoop supergroup 33560149 2017-08-31 08:23 /hbase/MasterProcWALs_deleteme/state-00000000000000003829.log
-rw-r--r-- 3 hadoop supergroup 33567073 2017-08-31 08:24 /hbase/MasterProcWALs_deleteme/state-00000000000000003830.log
-rw-r--r-- 3 hadoop supergroup 33572251 2017-08-31 08:25 /hbase/MasterProcWALs_deleteme/state-00000000000000003831.log
.............. ............
-rw-r--r-- 3 hadoop supergroup 38210390 2017-09-01 17:59 /hbase/MasterProcWALs_deleteme/state-00000000000000082166.log
-rw-r--r-- 3 hadoop supergroup 38210646 2017-09-01 17:59 /hbase/MasterProcWALs_deleteme/state-00000000000000082167.log
-rw-r--r-- 3 hadoop supergroup 38210902 2017-09-01 17:59 /hbase/MasterProcWALs_deleteme/state-00000000000000082168.log
-rw-r--r-- 3 hadoop supergroup 38211158 2017-09-01 17:59 /hbase/MasterProcWALs_deleteme/state-00000000000000082169.log
-rw-r--r-- 3 hadoop supergroup 49 2017-09-01 17:59 /hbase/MasterProcWALs_deleteme/state-00000000000000082170.log

--
Sent from: http://apache-hbase.679495.n3.nabble.com/HBase-User-f4020416.html
