Hi Varun,

As Kanak mentioned, we will need to write a rebalancer that monitors HDFS changes.
I have a few questions on the design:

- Are you planning to use YARN or a standalone Helix application?
- Is affinity a must, or is it best effort?
- If affinity is a must:
--- then in the worst case we will need as many servers as the number of partitions.
--- another problem would be uneven distribution of partitions to servers.
- Apart from failures, what is the expected behavior when a rebalance is done on HDFS? Do you want the servers to move?

Once we get more info, we can quickly write a simple recipe to demonstrate this (excluding the serving part).

thanks,
Kishore G

On Mon, Jun 16, 2014 at 4:29 PM, Kanak Biscuitwala <[email protected]> wrote:
> Hi Varun,
>
> If you would like to do this using Helix today, there are ways to support
> this, but you will have to write your own rebalancer. Essentially what you
> need is to put your resource ideal state in CUSTOMIZED rebalance mode, and
> then have some code that listens on HDFS changes, computes the
> logical partition assignment, and writes the ideal state based on what HDFS
> looks like at that moment.
>
> Eventually we want to define affinity-based assignments in our FULL_AUTO
> mode, but the challenge here is being able to represent that 30-minute
> delay, and changing that affinity through some configuration.
>
> Does that make sense? Perhaps others on this list have more ideas.
>
> Kanak
> ________________________________
> > Date: Mon, 16 Jun 2014 15:55:43 -0700
> > Subject: Using Helix for HDFS serving
> > From: [email protected]
> > To: [email protected]
> >
> > Hi folks,
> >
> > We are looking at Helix for building a serving solution on top of HDFS
> > for data generated from MapReduce jobs. The files will be smaller than
> > the HDFS block size, so each file will be on 3 replicas, with each
> > replica holding the file in its entirety. The set of files output by an
> > MR job would be the resource, and each file (or group of X files) would
> > be a partition.
> >
> > We can assume that there is a container which can serve these immutable
> > files for lookups. Since we have 3 replicas, we were wondering if we
> > could use Helix for serving these files with 3 logically equivalent
> > replicas. We need a few things:
> >
> > a) In the steady state, when HDFS blocks are all triplicated, the
> > logical assignment of the 3 replicas should respect block affinity.
> >
> > b) When a node crashes, some blocks become under-replicated both
> > physically and logically (from Helix's point of view). In such a case,
> > we don't want to carry out any transitions. Eventually, over time (~20
> > minutes), HDFS will re-replicate blocks so that a physical replication
> > factor of 3 is attained. Once this happens, we want the logical
> > replication to catch up to 3 and again respect HDFS block placement.
> >
> > So there are two aspects: one is to retain block locality by doing the
> > logical assignment in such a way that the logical partition comes up on
> > the same nodes hosting the physical partition. Secondly, we want the
> > logical placement to trail the physical placement (as determined by
> > HDFS). So we could have the cluster in a non-ideal state for a long
> > period of time - say 20-30 minutes.
> >
> > Please let us know if these are feasible with Helix and, if yes, what
> > would be the recommended practices.
> >
> > Thanks
> > Varun
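
For reference, the "compute the logical partition assignment" step that Kanak describes could be sketched roughly as below. This is an illustrative Python sketch only, not the Helix or HDFS API: all function and node names are hypothetical, and a real rebalancer would fetch block locations through the HDFS client and write the resulting map into the resource's CUSTOMIZED-mode ideal state via the Helix Java API.

```python
# Hypothetical sketch of a block-affinity assignment for CUSTOMIZED mode.
# block_locations: partition -> list of datanodes holding that partition's
# blocks (as reported by HDFS). live_instances: currently live participants.
# Returns partition -> {instance: state}, the shape of a CUSTOMIZED map field.

REPLICATION_FACTOR = 3

def compute_customized_assignment(block_locations, live_instances):
    assignment = {}
    for partition, locations in block_locations.items():
        # Keep only locations that are live participants, preserving HDFS's
        # placement order; this is the block-affinity constraint.
        hosts = [h for h in locations if h in live_instances]
        # If a partition is physically under-replicated, we simply assign
        # fewer logical replicas (no transitions); the logical placement
        # trails HDFS and catches up once HDFS re-replicates the blocks.
        assignment[partition] = {h: "ONLINE" for h in hosts[:REPLICATION_FACTOR]}
    return assignment

# Example: node3 has crashed, so both partitions stay logically
# under-replicated until HDFS re-replicates their blocks elsewhere.
blocks = {
    "p0": ["node1", "node2", "node3"],
    "p1": ["node2", "node3", "node4"],
}
live = {"node1", "node2", "node4"}
print(compute_customized_assignment(blocks, live))
# → {'p0': {'node1': 'ONLINE', 'node2': 'ONLINE'},
#    'p1': {'node2': 'ONLINE', 'node4': 'ONLINE'}}
```

Re-running this computation on every HDFS change (and on participant membership changes) and writing the result back is essentially the custom rebalancer loop; the ~20-30 minute trailing behavior falls out naturally, since the assignment only moves after HDFS itself has moved the blocks.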
