[jira] [Commented] (HBASE-24753) HA masters based on raft

2020-10-02 Thread Nicholas Jiang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17206109#comment-17206109
 ] 

Nicholas Jiang commented on HBASE-24753:


[~zhangduo], I would recommend the raft library 
https://github.com/sofastack/sofa-jraft. For JRaft benchmark numbers, see 
https://www.sofastack.tech/projects/sofa-jraft/benchmark-performance/.

> HA masters based on raft
> 
>
> Key: HBASE-24753
> URL: https://issues.apache.org/jira/browse/HBASE-24753
> Project: HBase
>  Issue Type: New Feature
>  Components: master
>Reporter: Duo Zhang
>Priority: Major
>
> For better availability, move the bootstrap information from zookeeper to 
> our own service, so that finally we can remove the dependency on zookeeper 
> completely.
> This has been on my mind for a long time. Since there is a discussion in 
> HBASE-11288 about how to store the root table, and also in HBASE-24749 we 
> want better performance on filesystems that do not support list and rename 
> well, which requires a storage engine at the bottom to store the storefile 
> information for the meta table, I think it is time to throw this idea out.
> The basic solution is to build a raft group to store the bootstrap 
> information; for now that is the cluster id (it is on the file system 
> already?) and the root table. Region servers will always go to the leader to 
> ask for the information, so they always see the newest data; for clients, we 
> enable 'follower read' to reduce the load on the leader (and there are some 
> solutions in raft to let 'follower read' always return the newest data).
> With this solution in place, as long as the root table is not stored in the 
> format of a region (we could just use rocksdb to store it locally), the 
> cyclic dependency in HBASE-24749 is also solved, as we no longer need to 
> find a place to store the storefile information for the root table.
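
The 'follower read' mentioned above can be illustrated with a toy model of 
raft's read-index protocol: the follower asks the leader for its current 
commit index, waits until its own applied index has caught up to that point, 
and only then serves the read from local state, so the read is guaranteed to 
reflect the newest committed data. This is an illustrative sketch only, not 
HBase or jraft code; every class and method name here is invented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Toy model of raft "follower read" via the read-index protocol.
 * Illustrative only: not HBase or jraft code; all names are invented.
 */
public class FollowerReadDemo {
    /** A committed log entry: a simple key=value put. */
    static class Entry {
        final String key, value;
        Entry(String key, String value) { this.key = key; this.value = value; }
    }

    static class Leader {
        final List<Entry> log = new ArrayList<>();
        int commitIndex = 0; // number of committed entries

        void commit(String key, String value) {
            log.add(new Entry(key, value));
            commitIndex = log.size();
        }
    }

    static class Follower {
        final Map<String, String> kv = new HashMap<>();
        int appliedIndex = 0; // entries applied locally so far

        /** Replicate and apply leader entries up to the given index. */
        void catchUp(Leader leader, int upTo) {
            while (appliedIndex < upTo) {
                Entry e = leader.log.get(appliedIndex);
                kv.put(e.key, e.value);
                appliedIndex++;
            }
        }

        /**
         * Read-index read: fetch the leader's commit index, wait (here:
         * synchronously catch up) until we have applied at least that many
         * entries, then serve the read from local state.
         */
        String read(Leader leader, String key) {
            int readIndex = leader.commitIndex; // one round-trip to the leader
            catchUp(leader, readIndex);         // a real raft node would block
            return kv.get(key);
        }
    }

    public static void main(String[] args) {
        Leader leader = new Leader();
        Follower follower = new Follower();

        leader.commit("root", "rs1.example.org");
        follower.catchUp(leader, leader.commitIndex); // follower up to date

        leader.commit("root", "rs2.example.org");     // follower now lags
        // A plain local read would return the stale "rs1.example.org";
        // the read-index read returns the newest committed value.
        System.out.println(follower.read(leader, "root"));
    }
}
```

A plain follower read without the commit-index handshake is what makes stale 
reads possible; the handshake is what the description's parenthetical about 
"solutions to let 'follower read' always get the newest data" refers to.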



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24753) HA masters based on raft

2020-08-06 Thread Nick Dimiduk (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17172521#comment-17172521
 ] 

Nick Dimiduk commented on HBASE-24753:
--

bq. Actually, no... It is about HBASE-24286, where the AWS guys want to 
redeploy an HBase cluster with the old data on S3 but all new virtual machines 
as region servers. So I say if you want to do this on cloud, you could also 
use EBS.

I see. I would classify HBASE-24286 as an operator error, in the sense that 
they are attempting to re-hydrate a cluster from partial state (missing WAL 
files/directories)... maybe a bug, except that there has never been (I've 
never seen) an explicit design that completely isolates the root dir 
filesystem from the master filesystem. What we're talking about in this JIRA 
is a design choice that explicitly changes what was intentionally decided 
before, intentionally introducing a persistence dependency on something 
additional to the root dir on the shared namespace filesystem.

bq. You could build the storage of the raft store on HDFS? Or even just on 
dynamodb, as it is just a KV? I do not see any problems here...

Consensus storage on HDFS would resolve the concern I attempted to clarify.



[jira] [Commented] (HBASE-24753) HA masters based on raft

2020-08-05 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17171878#comment-17171878
 ] 

Duo Zhang commented on HBASE-24753:
---

{quote}
Duo, my concern was not about avoiding zk usage and having all HMs run their 
own consensus for leader election (as the jira title says). My worry was about 
the line which says we move the root table data (meta as of today) away from 
shared storage to local storage, with the HM handling it in a special way.
{quote}

You could build the storage of the raft store on HDFS? Or even just on 
dynamodb, as it is just a KV? I do not see any problems here...



[jira] [Commented] (HBASE-24753) HA masters based on raft

2020-08-05 Thread Anoop Sam John (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17171877#comment-17171877
 ] 

Anoop Sam John commented on HBASE-24753:


Thanks Nick... Yes, what he mentioned is also possible: cloning a cluster. 
Dropping and recreating a cluster based on saved data is a very common thing. 
HBase has always used a FS like HDFS or cloud storage for any persistent data, 
which is really great for cloud cases. Any system that deals with local 
storage (replicated, raft-style consensus) won't be easy to make work in the 
cloud.
bq. And even for now, it is not safe to just restart a new cluster with data 
on HDFS but no data on zookeeper.
Duo, my concern was not about avoiding zk usage and having all HMs run their 
own consensus for leader election (as the jira title says). My worry was about 
the line which says we move the root table data (meta as of today) away from 
shared storage to local storage, with the HM handling it in a special way.




[jira] [Commented] (HBASE-24753) HA masters based on raft

2020-08-05 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17171864#comment-17171864
 ] 

Duo Zhang commented on HBASE-24753:
---

{quote}
I believe Anoop Sam John's point is that today we can clone the root directory 
from hdfs to a new path and from that new path stand up an independent cluster.
{quote}

Actually, no... It is about HBASE-24286, where the AWS guys want to redeploy 
an HBase cluster with the old data on S3 but all new virtual machines as 
region servers. So I say if you want to do this on cloud, you could also use 
EBS.

And for a normal deploy in a DC with physical machines, if you just want to 
change the machines, raft has built-in support for adding new nodes and 
removing old ones. And even for now, it is not safe to just restart a new 
cluster with data on HDFS but no data on zookeeper; you need to use HBCK2 to 
repair the cluster. I do not think there is much difference if we just move 
the data from zookeeper to our own raft-based master store.
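
The built-in membership change mentioned above can be sketched in a toy model: 
in raft, the peer set is itself modified by committing configuration entries 
through the replicated log, one server at a time, so replacing a machine is an 
add-peer commit followed by a remove-peer commit. This is an illustrative 
sketch under invented names, not jraft or Ratis code, and real implementations 
add further safety rules (e.g. only one configuration change in flight).

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/**
 * Toy sketch of raft single-server membership change: the peer set is
 * modified by committing configuration entries through the log, one server
 * at a time. Illustrative only; all names here are invented.
 */
public class MembershipDemo {
    enum Op { ADD, REMOVE }

    static class ConfigChange {
        final Op op;
        final String peer;
        ConfigChange(Op op, String peer) { this.op = op; this.peer = peer; }
    }

    static class RaftGroup {
        final List<ConfigChange> log = new ArrayList<>();
        final Set<String> peers = new LinkedHashSet<>();

        RaftGroup(List<String> initial) { peers.addAll(initial); }

        int quorum() { return peers.size() / 2 + 1; }

        /** Commit one config change; it takes effect once replicated. */
        void commit(ConfigChange c) {
            log.add(c);
            if (c.op == Op.ADD) peers.add(c.peer);
            else peers.remove(c.peer);
        }

        /** Replace an old machine with a new one: add first, then remove. */
        void replace(String oldPeer, String newPeer) {
            commit(new ConfigChange(Op.ADD, newPeer));    // 4 peers, quorum 3
            commit(new ConfigChange(Op.REMOVE, oldPeer)); // 3 peers, quorum 2
        }
    }
}
```

Because each single-server step keeps old and new quorums overlapping, the 
group stays available while machines are swapped, which is the property Duo 
is relying on for the "just change the machines" scenario.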



[jira] [Commented] (HBASE-24753) HA masters based on raft

2020-08-05 Thread Nick Dimiduk (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17171807#comment-17171807
 ] 

Nick Dimiduk commented on HBASE-24753:
--

I believe [~anoop.hbase]'s point is that today we can clone the root directory 
from hdfs to a new path and from that new path stand up an independent 
cluster. The cluster's persistent state resides exclusively in the configured 
root directory, on HDFS. Introducing non-ephemeral consensus changes this 
story: it makes the consensus implementation's data also required. To date, 
this has been undesirable.

I don't think we should assume all deployments have an EBS-like local volume 
management system available.



[jira] [Commented] (HBASE-24753) HA masters based on raft

2020-07-28 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17166371#comment-17166371
 ] 

Duo Zhang commented on HBASE-24753:
---

{quote}
There is one interesting use case on cloud: drop a cluster and recreate it 
later on the existing data. This was/is possible because we never store any 
persistent data locally but always on the FS. I would say let's not break 
that. I read in another jira that the Root table data can be stored locally 
(RAFT will be in place), not on the FS. I would say let's not do that. Let us 
continue to have the storage isolation.
{quote}

I do not think the problem here is local storage: it is HDFS, and also 
ZooKeeper. We could use EBS as local storage, and it also supports snapshots, 
so recreating a cluster is easy.



[jira] [Commented] (HBASE-24753) HA masters based on raft

2020-07-28 Thread Tamas Penzes (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17166294#comment-17166294
 ] 

Tamas Penzes commented on HBASE-24753:
--

[~zhangduo] Ozone uses Ratis 1.0.0 according to their pom.xml 
[https://github.com/apache/hadoop-ozone/blob/master/pom.xml#L82]

Looks like it's stable enough for them.



[jira] [Commented] (HBASE-24753) HA masters based on raft

2020-07-27 Thread Anoop Sam John (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17166154#comment-17166154
 ] 

Anoop Sam John commented on HBASE-24753:


bq. With this solution in place, as long as the root table is not stored in 
the format of a region (we could just use rocksdb to store it locally),
There is one interesting use case on cloud: drop a cluster and recreate it 
later on the existing data. This was/is possible because we never store any 
persistent data locally but always on the FS. I would say let's not break 
that. I read in another jira that the Root table data can be stored locally 
(RAFT will be in place), not on the FS. I would say let's not do that. Let us 
continue to have the storage isolation.



[jira] [Commented] (HBASE-24753) HA masters based on raft

2020-07-27 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17166058#comment-17166058
 ] 

Duo Zhang commented on HBASE-24753:
---

Update: I was trying to make use of sofa-jraft, but it depends on protobuf 
3.x, so I tried to set hadoop.version to 3.3.0, as hadoop 3.3.0 has shaded 
protobuf and that lets us purge all the protobuf 2.5 dependencies. But then I 
noticed that there is a problem with supporting hadoop 3.3.0 because of a 
conflict on the jetty version.

So I'm currently working on shading jetty to solve that problem first.



[jira] [Commented] (HBASE-24753) HA masters based on raft

2020-07-21 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17162461#comment-17162461
 ] 

Duo Zhang commented on HBASE-24753:
---

Another Java raft library:

https://github.com/sofastack/sofa-jraft



[jira] [Commented] (HBASE-24753) HA masters based on raft

2020-07-21 Thread Duo Zhang (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17162445#comment-17162445
 ] 

Duo Zhang commented on HBASE-24753:
---

A possible problem is which library to use. Is Ratis stable enough? Is it used 
in Ozone?
