Enis Soztutar created HBASE-14223:
-------------------------------------

             Summary: Meta WALs are not cleared if meta region was closed and RS aborts
                 Key: HBASE-14223
                 URL: https://issues.apache.org/jira/browse/HBASE-14223
             Project: HBase
          Issue Type: Bug
            Reporter: Enis Soztutar
             Fix For: 2.0.0, 1.0.2, 1.2.0, 1.3.0, 1.1.3


When an RS opens meta and later closes it, the meta WAL (FSHLog) is not closed. 
The last WAL file just sits in the RS WAL directory. If the RS stops gracefully, 
the WAL file for meta is deleted. If the RS aborts instead, the WAL for meta is 
not cleaned up. It is also not split (which is correct), since the master 
determines that the RS no longer hosts meta at the time of the abort. 

On a cluster that ran ITBLL (IntegrationTestBigLinkedList) with CM (ChaosMonkey), 
I see a lot of {{-splitting}} directories left uncleaned: 
{code}
[root@os-enis-dal-test-jun-4-7 cluster-os]# sudo -u hdfs hadoop fs -ls /apps/hbase/data/WALs
Found 31 items
drwxr-xr-x   - hbase hadoop          0 2015-06-05 01:14 /apps/hbase/data/WALs/hregion-58203265
drwxr-xr-x   - hbase hadoop          0 2015-06-05 07:54 /apps/hbase/data/WALs/os-enis-dal-test-jun-4-1.openstacklocal,16020,1433489308745-splitting
drwxr-xr-x   - hbase hadoop          0 2015-06-05 09:28 /apps/hbase/data/WALs/os-enis-dal-test-jun-4-1.openstacklocal,16020,1433494382959-splitting
drwxr-xr-x   - hbase hadoop          0 2015-06-05 10:01 /apps/hbase/data/WALs/os-enis-dal-test-jun-4-1.openstacklocal,16020,1433498252205-splitting
...
{code}

The directories contain WALs from meta: 
{code}
[root@os-enis-dal-test-jun-4-7 cluster-os]# sudo -u hdfs hadoop fs -ls /apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285-splitting
Found 2 items
-rw-r--r--   3 hbase hadoop     201608 2015-06-05 03:15 /apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285-splitting/os-enis-dal-test-jun-4-5.openstacklocal%2C16020%2C1433466904285..meta.1433470511501.meta
-rw-r--r--   3 hbase hadoop      44420 2015-06-05 04:36 /apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285-splitting/os-enis-dal-test-jun-4-5.openstacklocal%2C16020%2C1433466904285..meta.1433474111645.meta
{code}
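
To spot such leftovers across many directories, a listing of paths can be filtered for meta WAL names. The following is a minimal Python sketch, assuming only what the listings above show: leftover directories end in `-splitting` and meta WAL file names end in `.meta`. The helper name `leftover_meta_wals` is hypothetical and not part of HBase.

```python
from urllib.parse import unquote


def leftover_meta_wals(paths):
    """Return (server_name, wal_file_name) pairs for meta WALs that sit
    directly inside a '-splitting' directory.

    Assumption (taken from the listings, not from HBase code): a meta WAL
    is recognised purely by its '.meta' file-name suffix.
    """
    found = []
    for path in paths:
        # Split '/.../<dir>/<file>' into the directory part and the file name.
        parent, _, name = path.rpartition("/")
        directory = parent.rpartition("/")[2]
        if directory.endswith("-splitting") and name.endswith(".meta"):
            server = directory[: -len("-splitting")]
            # File names URL-encode the commas of the server name (%2C).
            found.append((server, unquote(name)))
    return found
```

Feeding it the paths printed by `hadoop fs -ls -R` on the WALs directory would list each dead server together with the meta WAL files it left behind.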

The RS hosted the meta region for some time: 
{code}
2015-06-05 03:14:28,692 INFO  [PostOpenDeployTasks:1588230740] zookeeper.MetaTableLocator: Setting hbase:meta region location in ZooKeeper as os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285
...
2015-06-05 03:15:17,302 INFO  [RS_CLOSE_META-os-enis-dal-test-jun-4-5:16020-0] regionserver.HRegion: Closed hbase:meta,,1.1588230740
{code}

In between, a WAL is created: 
{code}
2015-06-05 03:15:11,707 INFO  [RS_OPEN_META-os-enis-dal-test-jun-4-5:16020-0-MetaLogRoller] wal.FSHLog: Rolled WAL /apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285/os-enis-dal-test-jun-4-5.openstacklocal%2C16020%2C1433466904285..meta.1433470511501.meta with entries=385, filesize=196.88 KB; new WAL /apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285/os-enis-dal-test-jun-4-5.openstacklocal%2C16020%2C1433466904285..meta.1433474111645.meta
{code}

When CM later killed the region server, the master did not see these WAL files: 
{code}
./hbase-hbase-master-os-enis-dal-test-jun-4-3.log:2015-06-05 03:36:46,075 INFO  [MASTER_SERVER_OPERATIONS-os-enis-dal-test-jun-4-3:16000-0] master.SplitLogManager: started splitting 2 logs in [hdfs://os-enis-dal-test-jun-4-1.openstacklocal:8020/apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285-splitting] for [os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285]
./hbase-hbase-master-os-enis-dal-test-jun-4-3.log:2015-06-05 03:36:47,300 INFO  [main-EventThread] wal.WALSplitter: Archived processed log hdfs://os-enis-dal-test-jun-4-1.openstacklocal:8020/apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285-splitting/os-enis-dal-test-jun-4-5.openstacklocal%2C16020%2C1433466904285.default.1433475074436 to hdfs://os-enis-dal-test-jun-4-1.openstacklocal:8020/apps/hbase/data/oldWALs/os-enis-dal-test-jun-4-5.openstacklocal%2C16020%2C1433466904285.default.1433475074436
./hbase-hbase-master-os-enis-dal-test-jun-4-3.log:2015-06-05 03:36:50,497 INFO  [main-EventThread] wal.WALSplitter: Archived processed log hdfs://os-enis-dal-test-jun-4-1.openstacklocal:8020/apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285-splitting/os-enis-dal-test-jun-4-5.openstacklocal%2C16020%2C1433466904285.default.1433475175329 to hdfs://os-enis-dal-test-jun-4-1.openstacklocal:8020/apps/hbase/data/oldWALs/os-enis-dal-test-jun-4-5.openstacklocal%2C16020%2C1433466904285.default.1433475175329
./hbase-hbase-master-os-enis-dal-test-jun-4-3.log:2015-06-05 03:36:50,507 WARN  [MASTER_SERVER_OPERATIONS-os-enis-dal-test-jun-4-3:16000-0] master.SplitLogManager: returning success without actually splitting and deleting all the log files in path hdfs://os-enis-dal-test-jun-4-1.openstacklocal:8020/apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285-splitting
./hbase-hbase-master-os-enis-dal-test-jun-4-3.log:2015-06-05 03:36:50,508 INFO  [MASTER_SERVER_OPERATIONS-os-enis-dal-test-jun-4-3:16000-0] master.SplitLogManager: finished splitting (more than or equal to) 129135000 bytes in 2 log files in [hdfs://os-enis-dal-test-jun-4-1.openstacklocal:8020/apps/hbase/data/WALs/os-enis-dal-test-jun-4-5.openstacklocal,16020,1433466904285-splitting] in 4433ms
{code}
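
The master log above is consistent with a simple toy model, sketched here in Python purely as an illustration: the master splits only the non-meta WALs because the dead server was no longer carrying meta, so the meta WALs stay behind and the `-splitting` directory cannot be emptied. The function name `split_logs` and the `carrying_meta` flag are hypothetical, and recognising meta WALs by the `.meta` suffix is an assumption taken from the listings. This is not the SplitLogManager code.

```python
def split_logs(files, carrying_meta):
    """Toy model of log splitting for one '-splitting' directory.

    Returns (split, left_behind). Assumption: non-meta WALs are always
    split, while meta WALs are split only when the master believes the
    dead server was carrying meta.
    """
    split, left_behind = [], []
    for name in files:
        is_meta = name.endswith(".meta")
        if is_meta and not carrying_meta:
            # Nobody selects this file, so it stays in the directory.
            left_behind.append(name)
        else:
            split.append(name)
    return split, left_behind
```

With the two `.default.` WALs and the two `.meta` WALs from this report and `carrying_meta=False`, the model splits two files and leaves two behind, which matches the "started splitting 2 logs" message followed by the warning about not deleting all the log files.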



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
