[
https://issues.apache.org/jira/browse/HBASE-710?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608258#action_12608258
]
stack commented on HBASE-710:
-----------------------------
It wouldn't be hard to have regionservers volunteer their local time when they
check in; if it is very different from the master's, the master could at least
flag it. I made a new issue to do this, HBASE-711.
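
A rough sketch of the kind of check I mean, not actual HBase code: the class
name, the method names and the 30 second threshold are all made up for
illustration. The master compares the time a regionserver reports at check-in
against its own clock and flags the server if the difference is too large.

```java
// Hypothetical sketch; none of these names exist in the HBase source.
final class ClockSkewCheck {
    /** Largest tolerated clock difference in milliseconds (assumed value). */
    static final long MAX_SKEW_MS = 30_000L;

    /** Signed skew: positive when the regionserver clock is ahead of the master's. */
    static long skewMillis(long masterTimeMs, long regionServerTimeMs) {
        return regionServerTimeMs - masterTimeMs;
    }

    /** True when the reported time differs from the master's by more than MAX_SKEW_MS. */
    static boolean isSkewed(long masterTimeMs, long regionServerTimeMs) {
        return Math.abs(skewMillis(masterTimeMs, regionServerTimeMs)) > MAX_SKEW_MS;
    }

    public static void main(String[] args) {
        long master = System.currentTimeMillis();
        // Simulate a regionserver whose clock is five hours behind the master's.
        long reported = master - 5L * 60 * 60 * 1000;
        if (isSkewed(master, reported)) {
            System.out.println("WARN regionserver clock skew: "
                + skewMillis(master, reported) + " ms");
        }
    }
}
```

The five hour offset in main() is roughly the gap between hb0 and hb1 in the
date output quoted below. Flagging only warns; it does not fix the ordering
problem in .META. by itself.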
> If clocks are way off, then we can have daughter split come before rather
> than after its parent in .META.
> ---------------------------------------------------------------------------------------------------------
>
> Key: HBASE-710
> URL: https://issues.apache.org/jira/browse/HBASE-710
> Project: Hadoop HBase
> Issue Type: Bug
> Reporter: stack
> Priority: Blocker
> Fix For: 0.1.3
>
>
> On the Jon Gray cluster, his clocks are skewed badly. I see weird stuff in
> .META.
> {code}
> 2008-06-25 14:57:57,728 DEBUG org.apache.hadoop.hbase.HMaster:
> HMaster.metaScanner regioninfo: {regionname:
> items,823ce1e3-d414-474f-ac70-c4081cecef0f,1214416614697, startKey:
> <823ce1e3-d414-474f-ac70-c4081cecef0f>, endKey:
> <86f8df20-e237-4bb3-9748-88cef892bd70>, encodedName: 1157924217, offline:
> true, split: true, tableDesc: {name: items, families: {cfrecs:={name: cfrecs,
> max versions: 2, compression: NONE, in memory: false, max length: 2147483647,
> bloom filter: none}, clusters:={name: clusters, max versions: 2, compression:
> NONE, in memory: false, max length: 2147483647, bloom filter: none},
> content:={name: content, max versions: 2, compression: NONE, in memory:
> false, max length: 2147483647, bloom filter: none}, readby:={name: readby,
> max versions: 2, compression: NONE, in memory: false, max length: 2147483647,
> bloom filter: none}, receivedby:={name: receivedby, max versions: 2,
> compression: NONE, in memory: false, max length: 2147483647, bloom filter:
> none}, savedby:={name: savedby, max versions: 2, compression: NONE, in
> memory: false, max length: 2147483647, bloom filter: none}, sentby:={name:
> sentby, max versions: 2, compression: NONE, in memory: false, max length:
> 2147483647, bloom filter: none}}}}, server: 192.168.249.223:60020, startCode:
> 1214406344634
> 2008-06-25 14:57:57,732 DEBUG org.apache.hadoop.hbase.HMaster:
> HMaster.metaScanner regioninfo: {regionname:
> items,823ce1e3-d414-474f-ac70-c4081cecef0f,1214416641213, startKey:
> <823ce1e3-d414-474f-ac70-c4081cecef0f>, endKey:
> <83fca0e2-f324-4f9e-99c1-1fdbeff63b3d>, encodedName: 541300165, tableDesc:
> {name: items, families: {cfrecs:={name: cfrecs, max versions: 2, compression:
> NONE, in memory: false, max length: 2147483647, bloom filter: none},
> clusters:={name: clusters, max versions: 2, compression: NONE, in memory:
> false, max length: 2147483647, bloom filter: none}, content:={name: content,
> max versions: 2, compression: NONE, in memory: false, max length: 2147483647,
> bloom filter: none}, readby:={name: readby, max versions: 2, compression:
> NONE, in memory: false, max length: 2147483647, bloom filter: none},
> receivedby:={name: receivedby, max versions: 2, compression: NONE, in memory:
> false, max length: 2147483647, bloom filter: none}, savedby:={name: savedby,
> max versions: 2, compression: NONE, in memory: false, max length: 2147483647,
> bloom filter: none}, sentby:={name: sentby, max versions: 2, compression:
> NONE, in memory: false, max length: 2147483647, bloom filter: none}}}},
> server: 192.168.249.220:60020, startCode: 1214424347649
> 2008-06-25 14:57:57,738 DEBUG org.apache.hadoop.hbase.HMaster:
> HMaster.metaScanner regioninfo: {regionname:
> items,823ce1e3-d414-474f-ac70-c4081cecef0f,1214434560891, startKey:
> <823ce1e3-d414-474f-ac70-c4081cecef0f>, endKey:
> <9066d4f3-314b-4d9c-90e8-7aa08a52fdd4>, encodedName: 1673833201, offline:
> true, split: true, tableDesc: {name: items, families: {cfrecs:={name: cfrecs,
> max versions: 2, compression: NONE, in memory: false, max length: 2147483647,
> bloom filter: none}, clusters:={name: clusters, max versions: 2, compression:
> NONE, in memory: false, max length: 2147483647, bloom filter: none},
> content:={name: content, max versions: 2, compression: NONE, in memory:
> false, max length: 2147483647, bloom filter: none}, readby:={name: readby,
> max versions: 2, compression: NONE, in memory: false, max length: 2147483647,
> bloom filter: none}, receivedby:={name: receivedby, max versions: 2,
> compression: NONE, in memory: false, max length: 2147483647, bloom filter:
> none}, savedby:={name: savedby, max versions: 2, compression: NONE, in
> memory: false, max length: 2147483647, bloom filter: none}, sentby:={name:
> sentby, max versions: 2, compression: NONE, in memory: false, max length:
> 2147483647, bloom filter: none}}}}, server: 192.168.249.221:60020, startCode:
> 1214406358315
> {code}
> That's 3 regions with the same start key; 2 are offline.
> Looking at the regionids -- these are timestamps -- I see that they are not
> ordered the way they should be. Parents should come before daughters
> in timestamps.
> Looking at the clocks on the cluster, they are badly skewed:
> {code}
> <st^ack> [EMAIL PROTECTED] logs]$ for i in 0 1 2 3 4 5 6 7 8 9; do ssh
> hb$i "hostname; date"; done
> <st^ack> hb0.streamy.com
> <st^ack> Wed Jun 25 16:47:29 PDT 2008
> <st^ack> hb1.streamy.com
> <st^ack> Wed Jun 25 11:47:39 PDT 2008
> <st^ack> hb2.streamy.com
> <st^ack> Wed Jun 25 11:47:40 PDT 2008
> <st^ack> hb3.streamy.com
> <st^ack> Wed Jun 25 11:47:26 PDT 2008
> <st^ack> hb4.streamy.com
> <st^ack> Wed Jun 25 11:47:35 PDT 2008
> <st^ack> hb5.streamy.com
> <st^ack> Wed Jun 25 16:47:29 PDT 2008
> <st^ack> hb6.streamy.com
> <st^ack> Wed Jun 25 16:47:29 PDT 2008
> <st^ack> hb7.streamy.com
> <st^ack> Wed Jun 25 16:47:29 PDT 2008
> <st^ack> hb8.streamy.com
> <st^ack> Wed Jun 25 16:47:30 PDT 2008
> <st^ack> hb9.streamy.com
> <st^ack> Wed Jun 25 16:47:30 PDT 2008
> {code}
> Looking at the split code, it looks like the regionserver sets the
> regionid/timestamp on the new daughter regions inside the HRegionInfo
> constructor that gets called when splitting:
> {code}
> this.regionId = System.currentTimeMillis();
> {code}
> Daughters update the .META. table; they need a basic check that they
> are not inserting with a timestamp that is older than that of the parent they
> were split from.
> This is a bit like HBASE-609.
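
One possible shape for the basic check described in the issue, as a sketch
only: the class and method names are hypothetical, and falling back to
"parent id plus one" is just one option (refusing the split would be another).
The idea is that a daughter never takes a regionid that is less than or equal
to its parent's, whatever the local clock says.

```java
// Hypothetical sketch; not the HRegionInfo constructor or any real HBase API.
final class DaughterRegionId {
    /** True when the daughter id sorts strictly after the parent id. */
    static boolean isOrderedAfterParent(long parentRegionId, long daughterRegionId) {
        return daughterRegionId > parentRegionId;
    }

    /**
     * Pick a regionid for a new daughter. Use the local clock when it is
     * already past the parent's id; otherwise fall back to parent id + 1.
     */
    static long chooseDaughterRegionId(long parentRegionId, long localNowMs) {
        return localNowMs > parentRegionId ? localNowMs : parentRegionId + 1;
    }

    public static void main(String[] args) {
        // Ids taken from the log in the issue; pairing them as parent and
        // daughter is an assumption made for illustration.
        long parent = 1214434560891L;
        long skewedNow = 1214416641213L; // a local clock reading behind the parent's id
        long daughter = chooseDaughterRegionId(parent, skewedNow);
        System.out.println(isOrderedAfterParent(parent, daughter)); // true
    }
}
```

In the split path this would replace the bare System.currentTimeMillis()
assignment with a value that is passed the parent's regionid, which is an
assumption about how the constructor could be reworked.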