Hey Gabriel,

I think I over-engineered it a bit when I originally designed it. Just picking a random one should be enough, and it would make the code simpler.

J-D

On Tue, Feb 12, 2013 at 8:37 AM, Gabriel Reid <[email protected]> wrote:
> Hi,
>
> I was wondering if someone (perhaps Jean-Daniel, but anyone is welcome) could
> explain the reasoning for the current peer sink selection logic within
> replication.
>
> As it currently stands, a percentage (by default 10%) of the slave cluster's
> region servers are randomly chosen by each region server in the master
> cluster as their replication pool. Each time a batch of edits is shipped to a
> peer, one region server is chosen from the pre-selected pool of slave region
> servers.
>
> I was wondering what the advantage(s) of this approach are compared to each
> master region server simply randomly choosing a slave peer from the full set
> of slave region servers. In my (probably naive) view, this approach would
> provide a more even distribution of usage over the whole slave cluster, and I
> can't see any real advantages that the current approach has (although I
> assume there must be some).
>
> Could someone let me know what the reasoning is behind the current approach?
>
> Thanks,
>
> Gabriel
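
To make the comparison in the thread concrete, here is a minimal Python sketch of the two selection strategies discussed above. It is not HBase's replication code: the function names (`choose_sink_pool`, `ship_with_pool`, `ship_uniform`), the server names, and the batch counts are all hypothetical, and the only value taken from the thread is the default 10% ratio.

```python
import random
from collections import Counter


def choose_sink_pool(slaves, ratio=0.1, rng=random):
    """Pick a random subset of slave region servers (at least one) as a sink pool."""
    pool_size = max(1, int(len(slaves) * ratio))
    return rng.sample(slaves, pool_size)


def ship_with_pool(masters, slaves, batches_per_master, ratio=0.1, rng=random):
    """Pooled approach: each master pre-selects a pool, then picks from it per batch."""
    load = Counter({s: 0 for s in slaves})
    for _ in masters:
        pool = choose_sink_pool(slaves, ratio, rng)
        for _ in range(batches_per_master):
            load[rng.choice(pool)] += 1
    return load


def ship_uniform(masters, slaves, batches_per_master, rng=random):
    """Simpler approach: each batch goes to a slave drawn from the full set."""
    load = Counter({s: 0 for s in slaves})
    for _ in masters:
        for _ in range(batches_per_master):
            load[rng.choice(slaves)] += 1
    return load


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so runs are repeatable
    masters = [f"master-rs-{i}" for i in range(5)]   # hypothetical names
    slaves = [f"slave-rs-{i}" for i in range(50)]    # hypothetical names
    pooled = ship_with_pool(masters, slaves, 1000, rng=rng)
    uniform = ship_uniform(masters, slaves, 1000, rng=rng)
    print("pooled idle slaves: ", sum(1 for c in pooled.values() if c == 0))
    print("uniform idle slaves:", sum(1 for c in uniform.values() if c == 0))
```

With 5 master region servers, 50 slave region servers, and a 10% ratio, each pool holds 5 slaves, so the pooled approach can reach at most 25 distinct slaves and leaves at least half of the slave cluster idle. Drawing from the full set for every batch has no such cap, which is the more even distribution Gabriel describes.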
