Have you looked in /contrib for the block placement stuff? Maybe it provides some ideas?
https://git1-us-west.apache.org/repos/asf?p=incubator-blur.git;a=tree;f=contrib/blur-block-placement-policy;h=743a50d6431f4f8cecbb0f55d75baf187da7f755;hb=HEAD

Thanks,

--tim

On Wed, Apr 26, 2017 at 9:40 AM, Ravikumar Govindarajan <[email protected]> wrote:

>> In case of HDFS or MapR-FS, can we dynamically assign shards to
>> shard servers based on data locality (using block locations)?
>
> I was exploring the reverse option. Blur will suggest the set of
> hadoop-datanodes to replicate to while writing index files.
>
> Blur will also explicitly control bootstrapping a new datanode &
> load-balancing it, as well as removing a datanode from the cluster..
>
> Such fine control is possible by customizing the BlockPlacementPolicy API...
>
> Have started exploring it. The changes look big. Will keep the group posted
> on progress.
>
> On Fri, Apr 21, 2017 at 10:42 PM, rahul challapalli <
> [email protected]> wrote:
>
>> It's been a while since I looked at the code, but I believe a shard server
>> has a list of shards which it can serve. Maintaining this static mapping
>> (or tight coupling) between shard servers and shards is a design decision
>> which makes complete sense for clusters where nodes do not share a
>> distributed file system. In the case of HDFS or MapR-FS, can we dynamically
>> assign shards to shard servers based on data locality (using block
>> locations)? Obviously this hasn't been well thought out, as a lot of
>> components would be affected. Just dumping a few thoughts from my brain.
>>
>> - Rahul
>>
>> On Fri, Apr 21, 2017 at 9:44 AM, Ravikumar Govindarajan <
>> [email protected]> wrote:
>>
>> > We have been facing a lot of slowdown in production whenever a
>> > shard-server is added or removed...
>> >
>> > Shards which were served locally via short-circuit reads suddenly become
>> > fully remote &, at scale, this melts down.
>> >
>> > The block cache is a reactive cache & takes a lot of time to settle down
>> > (at least for us!!)
>> >
>> > Have been thinking of handling this locality issue for some time now..
>> >
>> > 1. For every shard, Blur can map a primary server & a secondary server
>> > in ZooKeeper.
>> > 2. File-writes can use the favored-nodes hint of Hadoop & write to both
>> > these servers [https://issues.apache.org/jira/browse/HDFS-2576].
>> > 3. When a machine goes down, instead of randomly assigning shards to
>> > different shard-servers, Blur can allocate shards to their designated
>> > secondary servers.
>> >
>> > Adding a new machine is another problem: it will immediately start
>> > serving shards from remote machines, when it really needs local-disk
>> > copies of all the primary shards it is supposed to serve..
>> >
>> > Hadoop has something called BlockPlacementPolicy that can be hooked into
>> > [http://hadoopblog.blogspot.in/2009/09/hdfs-block-replica-placement-in-your.html].
>> >
>> > When booting a new machine, let's say we increase the replication factor
>> > from 3 to 4 for the shards that will be hosted by the new machine (setrep
>> > command from the hdfs console).
>> >
>> > Now Hadoop will call our CustomBlockPlacementPolicy class to arrange the
>> > extra replication, where we can sneak in the new IP..
>> >
>> > Once all shards to be hosted by this new machine are replicated, we can
>> > close these shards, update the mappings in ZK & open them. Data will then
>> > be served locally.
>> >
>> > Similarly, when restoring the replication factor from 4 to 3, our
>> > CustomBlockPlacementPolicy class can hook into ZK, find out which node
>> > the data should be deleted from & proceed...
>> >
>> > Do let me know your thoughts on this...
>> >
>>
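Point 1 of the proposal above (a primary/secondary mapping per shard kept in ZooKeeper) could be as small as one znode per shard. A rough Java sketch, where the /blur/placement path layout and the ShardPlacementMap class are purely illustrative and not part of the existing Blur code:

    import java.nio.charset.StandardCharsets;
    import org.apache.zookeeper.CreateMode;
    import org.apache.zookeeper.ZooDefs;
    import org.apache.zookeeper.ZooKeeper;

    public class ShardPlacementMap {
      // Hypothetical layout: /blur/placement/<table>/<shard> -> "primaryHost,secondaryHost"
      private static final String BASE = "/blur/placement";
      private final ZooKeeper zk;

      public ShardPlacementMap(ZooKeeper zk) {
        this.zk = zk;
      }

      /** Records (or updates) the primary/secondary servers for one shard. */
      public void assign(String table, String shard,
                         String primary, String secondary) throws Exception {
        String path = BASE + "/" + table + "/" + shard;
        byte[] data = (primary + "," + secondary).getBytes(StandardCharsets.UTF_8);
        if (zk.exists(path, false) == null) {
          // Assumes the /blur/placement/<table> parent znodes already exist.
          zk.create(path, data, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
        } else {
          zk.setData(path, data, -1);   // -1 = ignore znode version
        }
      }

      /** Returns [primaryHost, secondaryHost] for one shard. */
      public String[] lookup(String table, String shard) throws Exception {
        byte[] data = zk.getData(BASE + "/" + table + "/" + shard, false, null);
        return new String(data, StandardCharsets.UTF_8).split(",");
      }
    }

Both the shard-server assignment path and the writer picking favored nodes could read the same znode, which is what keeps the two sides consistent.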
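Point 2 relies on the favored-nodes hint from HDFS-2576. A minimal sketch of such a write, assuming the Hadoop 2.x DistributedFileSystem.create() overload that takes an InetSocketAddress[] of favored nodes; the host names and the default 50010 datanode transfer port are placeholders:

    import java.net.InetSocketAddress;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.permission.FsPermission;
    import org.apache.hadoop.hdfs.DistributedFileSystem;

    public class FavoredNodeWriter {
      /**
       * Opens an index file for writing, hinting HDFS to place replicas on the
       * shard's primary and secondary servers.
       */
      public static FSDataOutputStream createWithFavoredNodes(
          DistributedFileSystem dfs, Path indexFile,
          String primaryHost, String secondaryHost) throws Exception {
        InetSocketAddress[] favoredNodes = new InetSocketAddress[] {
            new InetSocketAddress(primaryHost, 50010),   // default datanode transfer port
            new InetSocketAddress(secondaryHost, 50010)
        };
        // The hint is best effort: HDFS falls back to normal placement if the
        // named datanodes are unavailable.
        return dfs.create(indexFile,
            FsPermission.getFileDefault(),
            true,                                              // overwrite
            dfs.getConf().getInt("io.file.buffer.size", 4096), // buffer size
            (short) 3,                                         // replication factor
            dfs.getDefaultBlockSize(indexFile),
            null,                                              // no progress callback
            favoredNodes);
      }
    }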

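The bootstrap flow described at the end of the thread (raise the replication factor from 3 to 4 for the shards a new machine will host, wait for the extra copies, flip the ZK mapping, then drop back to 3) might start out roughly like this. The ShardBootstrap class and a directory-per-shard layout are assumptions, and FileSystem.setReplication() is used in place of the setrep console command:

    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ShardBootstrap {
      /**
       * Bumps the replication factor of every file in a shard directory so the
       * (hypothetical) CustomBlockPlacementPolicy can place the extra replica
       * on the new node; equivalent to `hdfs dfs -setrep 4 <shardDir>`.
       */
      public static void raiseReplication(FileSystem fs, Path shardDir,
                                          short newFactor) throws Exception {
        for (FileStatus file : fs.listStatus(shardDir)) {
          if (file.isFile()) {
            fs.setReplication(file.getPath(), newFactor);
          }
        }
        // Replication happens asynchronously; the caller still has to wait for
        // the namenode to report the extra replicas before closing the shard,
        // updating the ZK mapping, and restoring the factor to 3.
      }
    }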