Re: Heterogeneous cluster

Robert Dyer Sat, 08 Dec 2012 15:50:50 -0800

I of course cannot speak for Jean-Marc, but my use case is not very
corporate.  It is a small cluster (9 nodes), and only 1 of those nodes is
different (drastically different).


And yes, I configured that node with a lot more map slots.  However, the
problem is that HBase balances regions without regard to that, so even
though more map tasks run on that node, they are not data-local!  If I have
a balancer that is able to keep more regions on that particular node, then
the data locality of my map tasks is improved.
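
To make that concrete, here is a minimal sketch of the arithmetic such a
balancer would need: splitting a region count across servers in proportion
to a per-server weight (for example, the number of map slots). This is not
the HBase LoadBalancer interface and not Jean-Marc's code; the class name,
method name and weights are all hypothetical, and weights are assumed to be
non-negative.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, for illustration only (not part of HBase).
class WeightedRegionPlan {

    /**
     * Returns how many regions each server should hold so that the counts
     * are proportional to the given weights and sum to totalRegions.
     * Uses the largest-remainder method to hand out leftover regions.
     */
    static Map<String, Integer> targetCounts(Map<String, Integer> weights,
                                             int totalRegions) {
        long totalWeight = 0;
        for (int w : weights.values()) {
            totalWeight += w;
        }
        if (totalWeight <= 0) {
            throw new IllegalArgumentException("weights must sum to a positive value");
        }

        Map<String, Integer> targets = new LinkedHashMap<>();
        List<Map.Entry<String, Long>> remainders = new ArrayList<>();
        int assigned = 0;
        for (Map.Entry<String, Integer> e : weights.entrySet()) {
            long scaled = (long) totalRegions * e.getValue();
            int base = (int) (scaled / totalWeight);   // whole regions earned
            targets.put(e.getKey(), base);
            remainders.add(new AbstractMap.SimpleEntry<>(e.getKey(), scaled % totalWeight));
            assigned += base;
        }

        // Give the leftover regions to the servers with the largest remainder.
        // List.sort is stable, so ties keep the input order.
        remainders.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        int leftover = totalRegions - assigned;
        for (int i = 0; i < leftover; i++) {
            String server = remainders.get(i).getKey();
            targets.put(server, targets.get(server) + 1);
        }
        return targets;
    }
}
```

With eight small nodes weighted 2 and one big node weighted 16, 96 regions
would come out as 6 per small node and 48 on the big one. A real balancer
would still have to turn those targets into region moves.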


On Sat, Dec 8, 2012 at 5:45 PM, Michael Segel <[email protected]> wrote:

> Take what I say with a grain of kosher salt. (It's what they put on your
> drink glasses because the grains are bigger. ;-)
>
> I think what you are doing is a cool hack, however in the bigger picture,
> you shouldn't have to do this with your load balancer. Also, it doesn't
> matter if you think about it.
>
> With a heterogeneous cluster, you will not share the same configuration
> across all machines in the cluster. You will change the number of slots per
> node based on its capacity.
> That will limit the amount of work that can be done on each node.
>
> You could also consider playing with the rack-aware aspects of your
> cluster.
> You could put all of your 2-CPU machines in the same rack.
>
> In theory... machine, rack, second rack is how the data is distributed.
> In theory, if the 2-CPU machines are neighbors, then the 2nd and/or 3rd
> copy goes to another machine.
>
> Trying to write a custom balancer may be a good hack, but it is not good
> in terms of corporate life.
>
> Just saying!
>
> -Mike
>
> On Dec 8, 2012, at 1:34 PM, Jean-Marc Spaggiari <[email protected]>
> wrote:
>
> > Hi,
> >
> > It's not yet available anywhere. I will post it today or tomorrow,
> > I just need time to remove some hardcoding I put into it ;) It's a
> > quick and dirty PerformanceBalancer. It's not a CPULoadBalancer.
> >
> > Anyway, I will give more details over the weekend, but there is
> > absolutely nothing extraordinary about it.
> >
> > JM
> >
> > 2012/12/8, Robert Dyer <[email protected]>:
> >> I too am interested in this custom load balancer, as I was actually just
> >> starting to look into writing one that does the same thing for
> >> my heterogeneous cluster!
> >>
> >> Is this available somewhere?
> >>
> >> On Sat, Dec 8, 2012 at 9:17 AM, James Chang <[email protected]>
> >> wrote:
> >>
> >>>     By the way, I saw you mentioned that you
> >>> have built a "LoadBalancer", could you kindly
> >>> share some detailed info about it?
> >>>
> >>> On Saturday, December 8, 2012, Jean-Marc Spaggiari wrote:
> >>>
> >>>> Hi,
> >>>>
> >>>> Here is the situation.
> >>>>
> >>>> I have a heterogeneous cluster with 2-core, 4-core and 8-core CPU
> >>>> servers. The performance of those different servers allows them to
> >>>> handle different sizes of load. So far, I built a LoadBalancer
> >>>> which balances the regions over those servers based on their
> >>>> performance. And it’s working quite well. The RowCounter went down
> >>>> from 11 minutes to 6 minutes. However, I can still see that tasks
> >>>> are run on some servers accessing data on other servers, which
> >>>> overwhelms the bandwidth and slows down the process, since some
> >>>> 2-core servers are assigned to count rows hosted on 8-core servers.
> >>>>
> >>>> I’m looking for a way to “force” the tasks to run on the servers where
> >>>> the regions are assigned.
> >>>>
> >>>> I first tried to reject the tasks in the Mapper setup method when the
> >>>> data was not local, to see if the tracker would assign them to another
> >>>> server. No. The task just fails and is mostly not re-assigned. I tried
> >>>> IOExceptions, RuntimeExceptions and InterruptedExceptions with no
> >>>> success.
> >>>>
> >>>> So now I have 3 possible options.
> >>>>
> >>>> The first one is to move from MapReduce to a Coprocessor
> >>>> EndPoint. Running locally on the RegionServer, it accesses only the
> >>>> local data and I can manually reject everything that is not local.
> >>>> Therefore it meets my needs, but it’s not my preferred option since I
> >>>> would like to keep the MR features.
> >>>>
> >>>> The second option is to tell Hadoop where the tasks should be
> >>>> assigned. Should that be done by HBase? By Hadoop? I don’t know.
> >>>> Where? I don’t know either. I have started to look at the JobTracker
> >>>> and JobInProgress code, but it seems it will be a big task. Also, doing
> >>>> that will mean I will have to re-patch the distributed code each time
> >>>> I upgrade the version, and I will have to redo everything when I
> >>>> move from 1.0.x to 2.x…
> >>>>
> >>>> The third option is to not process the task if the data is not local.
> >>>> I mean, in the map method, simply have an if (!local) return; right at
> >>>> the beginning and just do nothing. This will not work for things like
> >>>> RowCounter, since all the entries are required, but for some of my
> >>>> use cases this might work where I don’t necessarily need all the data
> >>>> to be processed. It will not be efficient, since the task will still
> >>>> scan the entire region.
> >>>>
> >>>> My preferred option is definitely the 2nd one, but it also seems to
> >>>> be the most difficult one. The third one is very easy to implement.
> >>>> It needs 2 lines to see if the data is local. But it does not work for
> >>>> all the scenarios, and is more like a dirty fix. The coprocessor option
> >>>> might be doable too since I already have all the code for my MapReduce
> >>>> jobs. So it might be an acceptable option.
> >>>>
> >>>> I’m wondering if anyone has already faced this situation and worked on
> >>>> something, and if not, do you have any other ideas/options to propose,
> >>>> or can someone point me to the right classes to look at to implement
> >>>> solution 2?
> >>>>
> >>>> Thanks,
> >>>>
> >>>> JM
> >>>>
> >>>
> >>
> >>
> >>
> >> --
> >>
> >> Robert Dyer
> >> [email protected]
> >>
> >
>
>


-- 

Robert Dyer
[email protected]
