[
https://issues.apache.org/jira/browse/HBASE-11906?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jimmy Xiang updated HBASE-11906:
--------------------------------
Attachment: meta-data-loss-2.log
Attached meta-data-loss-2.log. In this log, the region was open on e1206.
However, after log replay, we got from meta the region was open on e1207.
We didn't pick up the right edit with seq id=40001551, ts=1410198840046.
Instead, it seems that we picked up the edit with seq id=40000961,
ts=1410198807834.
The scan was logged at 2014-09-08 10:56:34,997 which is about 2 minutes after
ts=1410198840046(Mon Sep 8 10:54:00 PDT 2014). So I assume the meta scan used
a proper timestamp.
> Meta data loss with distributed log replay
> ------------------------------------------
>
> Key: HBASE-11906
> URL: https://issues.apache.org/jira/browse/HBASE-11906
> Project: HBase
> Issue Type: Bug
> Reporter: Jimmy Xiang
> Attachments: meta-data-loss-2.log, meta-data-loss-with-dlr.log
>
>
> In the attached log, you can see, before log replaying, the region is open on
> e1205:
> {noformat}
> A3. 2014-09-05 16:38:46,705 INFO
> [B.defaultRpcServer.handler=5,queue=2,port=20020] master.RegionStateStore:
> Updating row
> IntegrationTestBigLinkedList,\x90Jy\x04\xA7\x90Jp,1409959495482.cbb0d736ebfabcf4a07e5a7b395fcdf7.
> with
> state=OPEN&openSeqNum=40118237&server=e1205.halxg.cloudera.com,20020,1409960280431
> {noformat}
> After the log replay, we got from meta the region is open on e1209
> {noformat}
> A4. 2014-09-05 16:41:12,257 INFO [ActiveMasterManager]
> master.AssignmentManager: Loading from meta:
> {cbb0d736ebfabcf4a07e5a7b395fcdf7 state=OPEN, ts=1409960472257,
> server=e1209.halxg.cloudera.com,20020,1409959391651}
> {noformat}
> The replayed edits show the log does have the edit expected:
> {noformat}
> 2014-09-05 16:41:11,862 INFO
> [B.defaultRpcServer.handler=18,queue=0,port=20020]
> regionserver.RSRpcServices: Meta replay edit
> type=PUT,mutation={"totalColumns":4,"families":{"info":[{"timestamp":1409960326705,"tag":["3:\\x00\\x00\\x00\\x00\\x02bad"],"value":"e1205.halxg.cloudera.com:20020","qualifier":"server","vlen":30},{"timestamp":1409960326705,"tag":["3:\\x00\\x00\\x00\\x00\\x02bad"],"value":"\\x00\\x00\\x01HH.\\x81o","qualifier":"serverstartcode","vlen":8},{"timestamp":1409960326705,"tag":["3:\\x00\\x00\\x00\\x00\\x02bad"],"value":"\\x00\\x00\\x00\\x00\\x02d'\\xDD","qualifier":"seqnumDuringOpen","vlen":8},{"timestamp":1409960326706,"tag":["3:\\x00\\x00\\x00\\x00\\x02bad"],"value":"OPEN","qualifier":"state","vlen":4}]},"row":"IntegrationTestBigLinkedList,\\x90Jy\\x04\\xA7\\x90Jp,1409959495482.cbb0d736ebfabcf4a07e5a7b395fcdf7."}
> {noformat}
> Why we picked up a wrong value with an older time stamp?
> {noformat}
> 2014-09-05 16:41:11,063 INFO
> [B.defaultRpcServer.handler=9,queue=0,port=20020] regionserver.RSRpcServices:
> Meta replay edit
> type=PUT,mutation={"totalColumns":4,"families":{"info":[{"timestamp":1409959994634,"tag":["3:\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\x99"],"value":"e1209.halxg.cloudera.com:20020","qualifier":"server","vlen":30},{"timestamp":1409959994634,"tag":["3:\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\x99"],"value":"\\x00\\x00\\x01HH
>
> \\xF1\\xA3","qualifier":"serverstartcode","vlen":8},{"timestamp":1409959994634,"tag":["3:\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\x99"],"value":"\\x00\\x00\\x00\\x00\\x00\\x01\\xB7\\xAB","qualifier":"seqnumDuringOpen","vlen":8},{"timestamp":1409959994634,"tag":["3:\\x00\\x00\\x00\\x00\\x00\\x00\\x09\\x99"],"value":"OPEN","qualifier":"state","vlen":4}]},"row":"IntegrationTestBigLinkedList,\\x90Jy\\x04\\xA7\\x90Jp,1409959495482.cbb0d736ebfabcf4a07e5a7b395fcdf7."}
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)