[ https://issues.apache.org/jira/browse/HBASE-12978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311825#comment-14311825 ]
stack commented on HBASE-12978:
-------------------------------

This happened again. Interestingly, it seems info:regioninfo was never written. I can get multiple versions of the info:server and info:serverstartcode content but none of info:regioninfo. Either all versions were lost or they were never written. Let me dig some more.

Below I query hbase:meta asking for all versions (having meta keep 10 versions helps).

{code}
hbase(main):014:0> get "hbase:meta", "IntegrationTestBigLinkedList,+\x84\xFF\xFC\xE4%\xF2\x11\xDE\x97t\xF0(\xF1$\xE8,1423438433508.014990fd6eb13141c04018f19c8910c8.", {COLUMN => 'info:server', VERSIONS => 10}
COLUMN                    CELL
 info:server              timestamp=1423442394783, value=c2025.halxg.cloudera.com:16020
 info:server              timestamp=1423442332641, value=c2024.halxg.cloudera.com:16020
2 row(s) in 0.0170 seconds

hbase(main):015:0> get "hbase:meta", "IntegrationTestBigLinkedList,+\x84\xFF\xFC\xE4%\xF2\x11\xDE\x97t\xF0(\xF1$\xE8,1423438433508.014990fd6eb13141c04018f19c8910c8.", {COLUMN => 'info:serverstartcode', VERSIONS => 10}
COLUMN                    CELL
 info:serverstartcode     timestamp=1423442394783, value=1423442383454
 info:serverstartcode     timestamp=1423442332641, value=1423442287722
2 row(s) in 0.0050 seconds

hbase(main):016:0> get "hbase:meta", "IntegrationTestBigLinkedList,+\x84\xFF\xFC\xE4%\xF2\x11\xDE\x97t\xF0(\xF1$\xE8,1423438433508.014990fd6eb13141c04018f19c8910c8.", {COLUMN => 'info:regioninfo', VERSIONS => 10}
COLUMN                    CELL
0 row(s) in 0.0050 seconds

hbase(main):017:0> get "hbase:meta", "IntegrationTestBigLinkedList,+\x84\xFF\xFC\xE4%\xF2\x11\xDE\x97t\xF0(\xF1$\xE8,1423438433508.014990fd6eb13141c04018f19c8910c8.", {TIMESTAMP => 1423442332641}
COLUMN                    CELL
 info:seqnumDuringOpen    timestamp=1423442332641, value=\x00\x00\x00\x00\x00\x11\xEE\xC2
 info:server              timestamp=1423442332641, value=c2024.halxg.cloudera.com:16020
 info:serverstartcode     timestamp=1423442332641, value=1423442287722
3 row(s) in 0.0120 seconds

hbase(main):018:0> get "hbase:meta", "IntegrationTestBigLinkedList,+\x84\xFF\xFC\xE4%\xF2\x11\xDE\x97t\xF0(\xF1$\xE8,1423438433508.014990fd6eb13141c04018f19c8910c8.", {TIMESTAMP => 1423442394783}
COLUMN                    CELL
 info:seqnumDuringOpen    timestamp=1423442394783, value=\x00\x00\x00\x00\x00\x11\xFC\x89
 info:server              timestamp=1423442394783, value=c2025.halxg.cloudera.com:16020
 info:serverstartcode     timestamp=1423442394783, value=1423442383454
3 row(s) in 0.0050 seconds

hbase(main):019:0> java.util.Date.new(1423442394783).toString
=> "Sun Feb 08 16:39:54 PST 2015"
hbase(main):020:0> java.util.Date.new(1423442332641).toString
=> "Sun Feb 08 16:38:52 PST 2015"
{code}

> hbase:meta has a row missing hregioninfo and it causes my long-running job to
> fail
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-12978
>                 URL: https://issues.apache.org/jira/browse/HBASE-12978
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>             Fix For: 1.0.1
>
>
> Testing 1.0.0 with long-running tests.
> A row in hbase:meta was missing its HRI entry. It caused the job to fail.
> Around the time of the first task failure, there were balances of the
> hbase:meta region and it was on a server that crashed. I tried to look at
> what happened around the time of our writing to hbase:meta and ran into
> another issue: 20 logs of 256MB each, filled with WrongRegionException,
> written over a minute or two. The actual update of hbase:meta was not in the
> logs; it had been rotated off.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)