[
https://issues.apache.org/jira/browse/HBASE-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12707924#action_12707924
]
Billy Pearson commented on HBASE-1295:
--------------------------------------
I was thinking about this, and there are some other things to consider, like
table splits. Will the regions be the same on both clusters? There is no
guarantee that compactions will happen at the same time or that a split will
find the same mid key.
I would think the master would be the ideal process to pull the logs and pass
them to the peer master, which can then split the logs into regions and pass
the edits on to the servers hosting those regions.
I would like to see the edits processed sequentially on the peer so everything
is in the same order, since that is the way we store the WALs now.
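To make the split step concrete, here is a rough sketch of how a peer master
might group incoming edits by its own regions while keeping WAL order inside
each group. It routes by row key against the peer's region start keys, so it
does not need the regions to match the source cluster. The edit tuple layout
and the name split_edits_by_region are made up for illustration; this is not
HBase code.

```python
from bisect import bisect_right


def split_edits_by_region(edits, region_start_keys):
    """Group WAL edits by the peer region that owns each row key.

    edits: iterable of (sequence_id, row_key, value) in WAL order
           (hypothetical layout, not the real HLog entry format).
    region_start_keys: sorted start keys of the peer's regions; the first
           entry is "" for the region at the beginning of the table.
    """
    per_region = {}
    for seq_id, row, value in edits:
        # The owning region is the last one whose start key is <= row.
        idx = bisect_right(region_start_keys, row) - 1
        start_key = region_start_keys[idx]
        # Appending in input order keeps each region's edits in WAL order.
        per_region.setdefault(start_key, []).append((seq_id, row, value))
    return per_region


wal = [(1, "apple", "a"), (2, "pear", "b"), (3, "banana", "c"), (4, "zebra", "d")]
peer_regions = ["", "m"]  # the peer happens to have split at "m"
batches = split_edits_by_region(wal, peer_regions)
```

Each list in batches can then be handed to the server hosting that region and
applied in sequence id order.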
I am not sure what the current status of appends on HDFS is right now, but if
we had that 100% working the master could just remember where in the WAL it
had read up to and poll every x seconds to see if there are any updates. Then
we would not have to worry about waiting for a log to roll, which could be a
while in some cases. Waiting for a log to roll before the updates get pushed
to the peers seems like the wrong way to go with this, but it might be the
only way we have now if append is not working right in HDFS.
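A minimal sketch of the "remember where it read up to" idea, assuming appended
data becomes visible to readers. It uses a local file and newline-terminated
records as a stand-in for an HDFS log, and the WalTailer class is hypothetical.

```python
import os
import tempfile


class WalTailer:
    """Remembers a byte offset into an append-only log and returns new records."""

    def __init__(self, path):
        self.path = path
        self.offset = 0  # position of the first byte not yet consumed

    def poll(self):
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            data = f.read()
        # Only consume complete records; a partly written tail is left
        # in place and picked up by a later poll.
        end = data.rfind(b"\n")
        if end == -1:
            return []
        complete = data[: end + 1]
        self.offset += len(complete)
        return complete.decode("utf-8").splitlines()


with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "wal.log")
    with open(path, "ab") as f:
        f.write(b"edit-1\nedit-2\nedi")  # last record is only partly written
    tailer = WalTailer(path)
    first = tailer.poll()   # the two complete records
    with open(path, "ab") as f:
        f.write(b"t-3\n")   # the writer finishes the third record
    second = tailer.poll()  # only the newly completed record
    third = tailer.poll()   # nothing new since the last poll
```

In a real deployment poll() would be called every x seconds, and the offset
would have to be persisted so a restarted master does not resend old edits.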
As for a first sync of the peers, it would be a huge saving if we could do a
rolling read-only mode on the regions: flush the memcache, copy the needed
files, unlock the region, and start the transfer to the peer. This would allow
a one-by-one copy of the regions to the remote, and the only bottleneck would
be the site-to-site bandwidth. In the meantime the peer could be holding edits
and waiting for all regions to get copied, and then start the replay of the
logs, skipping any edit that is older than the timestamp of the copy. I think
that timestamp could be written in the hfile now as metadata.
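The replay step could look roughly like the sketch below. It assumes each held
edit carries a region name and a timestamp, and that the copy timestamp for
each region is known (for example, read back from file metadata). The tuple
layout and replay_after_copy are illustrative only.

```python
def replay_after_copy(held_edits, copy_timestamps):
    """Return the held edits that still need to be applied on the peer.

    held_edits: list of (region, timestamp, row, value) in arrival order
                (hypothetical layout).
    copy_timestamps: dict mapping region -> timestamp at which that
                region's files were copied.
    """
    to_apply = []
    for region, ts, row, value in held_edits:
        # Edits older than the copy are already contained in the copied files.
        if ts < copy_timestamps[region]:
            continue
        to_apply.append((region, ts, row, value))
    return to_apply


held = [
    ("r1", 100, "a", "old"),   # before r1 was copied at 150: skipped
    ("r1", 200, "a", "new"),   # after the copy: replayed
    ("r2", 120, "x", "keep"),  # r2 was copied earlier, at 110: replayed
]
replayed = replay_after_copy(held, {"r1": 150, "r2": 110})
```

An edit stamped exactly at the copy time is replayed here, on the assumption
that rewriting the same cell version with the same value does no harm.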
Just some suggestions and other thoughts.
> Federated HBase
> ---------------
>
> Key: HBASE-1295
> URL: https://issues.apache.org/jira/browse/HBASE-1295
> Project: Hadoop HBase
> Issue Type: New Feature
> Reporter: Andrew Purtell
> Attachments: hbase_repl.2.odp, hbase_repl.2.pdf
>
>
> HBase should consider supporting a federated deployment where someone might
> have terascale (or beyond) clusters in more than one geography and would want
> the system to handle replication between the clusters/regions. It would be
> sweet if HBase had something on the roadmap to sync between replicas out of
> the box.
> Consider if rows, columns, or even cells could be scoped: local, or global.
> Then, consider a background task on each cluster that replicates new globally
> scoped edits to peer clusters. The HBase/Bigtable data model has convenient
> features (timestamps, multiversioning) such that simple exchange of globally
> scoped cells would be conflict free and would "just work". Implementation
> effort here would be in producing an efficient mechanism for collecting up
> edits from all the HRS and transmitting the edits over the network to peers
> where they would then be split out to the HRS there. Holding on to the edit
> trace and tracking it until the remote commits succeed would also be
> necessary. So, HLog is probably the right place to set up the tee. This would
> be filtered log shipping, basically.
> This proposal does not consider transactional tables. For transactional
> tables, enforcement of global mutation commit ordering would come into the
> picture if the user wants the transaction to span the federation. This
> should be an optional feature even with transactional tables themselves being
> optional because of how slow it would be.