Hi Jacques,

Llama is a very interesting approach; I read their paper [1] early on and just went back and read it again. Basically, Llama (as best as I can tell) has a two-part solution.
First, Impala is run off-YARN (that is, not in a YARN container). Llama uses “dummy” containers to inform YARN of Impala’s resource usage: each dummy container does nothing other than inform the off-YARN Impala of the container’s resources, and Llama can grow or shrink the static allocation by launching or releasing dummy containers. Rather clever, actually, even if it “abuses the software” a bit. (A rough sketch of this trick appears at the end of this mail.)

Second, Llama is able to dynamically grab spare YARN resources on each node. Specifically, Llama runs a Node Manager (NM) plugin that watches actual node usage. The plugin detects the free NM resources and informs Impala of them; Impala then consumes the resources as needed. When the NM allocates a new container, the plugin informs Impala, which relinquishes the resources. All this works because YARN allocations are mostly a gentleman’s agreement. Again, this is pretty clever, but only one app per node can play this game.

The Llama approach could work for Drill. The benefit is that Drill runs as it does today, and Hanifi’s work will allow us to increase or decrease the number of cores we consume. The drawback is that Drill is not yet ready to play the memory game: it can’t release memory back to the OS when requested. Plus, the approach just smells like a hack.

The “pure-YARN” approach would be to let YARN start and stop the Drill-bits. The user can grow or shrink Drill resources by starting or stopping Drill-bits. (This is simple to do if one ignores data locality and starts each Drill-bit on a separate node. It is a bit more work if one wants to preserve data locality by being rack-aware, or by running multiple Drill-bits per node; see the locality snippet at the end of this mail.)

The YARN team has also been working on the ability to resize running containers. (See YARN-1197, “Support changing resources of an allocated container” [2]; a speculative snippet below shows the rough shape.) Once that is available, we can grow or shrink existing Drill-bits (assuming that Drill itself is enhanced as discussed above). The promise of resizable containers also suggests that the “pure-YARN” approach is workable.

Once resizable containers are available, one more piece is needed to let Drill use free resources. Some cluster-wide component must detect free resources and offer them to applications that want them, deciding how to divvy up the resources between, say, Drill and Impala. The same piece would revoke resources when paying YARN customers need them.

Of course, if the resizable container feature comes too late, or does not work well, we still have the option of going off-YARN using the Llama trick. But the Llama trick does nothing to provide the cluster-wide coordination discussed above.

So, the thought is: start simple with a “stock” YARN app. Then we can add bells and whistles as we gain experience and as YARN offers more capabilities. The nice thing about this approach is that the same idea plays well with Mesos (though the implementation is different).
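As promised, here’s a rough sketch of how the dummy-container trick might look with the stock YARN client API. This is my own illustration, not Llama’s actual code; the node name and resource sizes are invented, and a real AM would loop on allocate() and handle failures.

  // Hypothetical sketch of a Llama-style "dummy container" AM.
  import java.util.Collections;
  import org.apache.hadoop.yarn.api.protocolrecords.AllocateResponse;
  import org.apache.hadoop.yarn.api.records.*;
  import org.apache.hadoop.yarn.client.api.AMRMClient;
  import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;
  import org.apache.hadoop.yarn.client.api.NMClient;
  import org.apache.hadoop.yarn.conf.YarnConfiguration;

  public class DummyContainerSketch {
    public static void main(String[] args) throws Exception {
      YarnConfiguration conf = new YarnConfiguration();
      AMRMClient<ContainerRequest> rm = AMRMClient.createAMRMClient();
      rm.init(conf);
      rm.start();
      rm.registerApplicationMaster("", 0, "");

      NMClient nm = NMClient.createNMClient();
      nm.init(conf);
      nm.start();

      // Ask YARN to put 8 GB / 4 vcores on its books for the node where
      // the off-YARN daemon (Impala, or a Drill-bit) is already running.
      Resource size = Resource.newInstance(8192, 4);
      rm.addContainerRequest(new ContainerRequest(
          size, new String[] {"node17.example.com"}, null,
          Priority.newInstance(0)));

      // One heartbeat shown; a real AM loops until the container arrives.
      AllocateResponse resp = rm.allocate(0.0f);
      for (Container c : resp.getAllocatedContainers()) {
        // Launch a no-op process so the allocation stays alive. The real
        // resources are consumed off-YARN; the container is bookkeeping.
        ContainerLaunchContext ctx = ContainerLaunchContext.newInstance(
            null, null, Collections.singletonList("sleep 1000000"),
            null, null, null);
        nm.startContainer(c, ctx);
      }
      // To shrink the static allocation later:
      //   rm.releaseAssignedContainer(containerId);
    }
  }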
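For the pure-YARN approach, the rack-awareness mentioned above mostly comes down to how the AM phrases its container requests. A minimal, hypothetical snippet (host name invented; “rm” is the AMRMClient from the sketch above):

  // Pin a Drill-bit to a specific data node. With relaxLocality = false,
  // YARN gives us the named node or nothing; with the default (true),
  // YARN may fall back to the node's rack, then to anywhere.
  Resource drillbitSize = Resource.newInstance(16384, 8);
  ContainerRequest strict = new ContainerRequest(
      drillbitSize,
      new String[] {"node42.example.com"},  // node holding the data
      null,                                 // no rack-level fallback
      Priority.newInstance(0),
      false);                               // relaxLocality
  rm.addContainerRequest(strict);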
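And since YARN-1197 hasn’t shipped yet, anything here is purely speculative, but the shape being discussed on the JIRA amounts to “hand the RM an existing container plus a new Resource.” The method name below is my guess at what the client library might expose, and “runningDrillbitContainer” stands in for a Container handle we’d already hold:

  // Speculative: grow an existing Drill-bit's container rather than
  // starting a new one. The approved change would come back on a later
  // allocate() heartbeat, after which the NM enforces the new limits
  // and the Drill-bit (once Drill can resize itself) grows into them.
  Resource bigger = Resource.newInstance(24576, 12);
  rm.requestContainerResourceChange(runningDrillbitContainer, bigger);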
Thanks,

- Paul

[1] http://cloudera.github.io/llama/
[2] https://issues.apache.org/jira/browse/YARN-1197


> On Mar 24, 2016, at 2:34 PM, Jacques Nadeau <jacq...@dremio.com> wrote:
>
> Your proposed allocation approach makes a lot of sense. I think it will
> solve a large number of use cases. Thanks for giving an overview of the
> different frameworks. I wonder if they got too focused on the simple use
> case....
>
> Have you looked at Llama to see whether we could extend it for our needs?
> It’s Apache licensed and probably has at least a start at a bunch of things
> we’re trying to do.
>
> https://github.com/cloudera/llama
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Tue, Mar 22, 2016 at 7:42 PM, Paul Rogers <prog...@maprtech.com> wrote:
>
>> Hi Jacques,
>>
>> I’m thinking of “semi-static” allocation at first. Spin up a cluster of
>> Drill-bits, after which the user can add or remove nodes while the cluster
>> runs. (The add part is easy; the remove part is a bit tricky since we don’t
>> yet have a way to gracefully shut down a Drill-bit.) Once we get the basics
>> to work, we can incrementally try out dynamics. For example, someone could
>> whip up a script to look at load and use the proposed YARN client app to
>> adjust resources. Later, we can fold dynamic load management into the
>> solution once we’re sure what folks want.
>>
>> I did look at Slider, Twill, Kitten and REEF. Kitten is too basic. I had
>> great hope for Slider. But it turns out that Slider and Twill have each
>> built an elaborate framework to isolate us from YARN. The Slider framework
>> (written in Python) seems harder to understand than YARN itself; at the
>> least, one has to be an expert in YARN to understand what all that Python
>> code does. And just looking at the class count in the Twill Javadoc was
>> overwhelming. Slider and Twill have to solve the general case. If we build
>> our own Java solution, we only have to solve the Drill case, which is
>> likely much simpler.
>>
>> A bespoke solution would seem to offer some other advantages. It lets us
>> do things like integrate ZK monitoring so we can learn of zombie Drill-bits
>> (haven’t exited, but not sending heartbeat messages). We can also gather
>> metrics and historical data about the cluster as a whole. We can try out
>> different cluster topologies. (Run Drill-bits on x of y nodes on a rack,
>> say.) And we can eventually do the dynamic load management we discussed
>> earlier.
>>
>> But first, I look forward to hearing what others have tried and what we’ve
>> learned about how people want to use Drill in a production YARN cluster.
>>
>> Thanks,
>>
>> - Paul
>>
>>
>>> On Mar 22, 2016, at 5:45 PM, Jacques Nadeau <jacq...@dremio.com> wrote:
>>>
>>> This is great news, welcome!
>>>
>>> What are you thinking in regards to static versus dynamic resource
>>> allocation? We have some conversations going regarding workload management
>>> but they are still early, so it seems like starting with user-controlled
>>> allocation makes sense initially.
>>>
>>> Also, have you spent much time evaluating whether one of the existing YARN
>>> frameworks such as Slider would be useful? Does anyone on the list have
>>> any feedback on the relative merits of these technologies?
>>>
>>> Again, glad to see someone picking this up.
>>>
>>> Jacques
>>>
>>>
>>> --
>>> Jacques Nadeau
>>> CTO and Co-Founder, Dremio
>>>
>>> On Tue, Mar 22, 2016 at 4:58 PM, Paul Rogers <prog...@maprtech.com> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I’m a new member of the Drill Team here at MapR. We’d like to take a look
>>>> at running Drill on YARN for production customers. JIRA suggests some
>>>> early work may have been done (DRILL-142
>>>> <https://issues.apache.org/jira/browse/DRILL-142>, DRILL-1170
>>>> <https://issues.apache.org/jira/browse/DRILL-1170>, DRILL-3675
>>>> <https://issues.apache.org/jira/browse/DRILL-3675>).
>>>>
>>>> YARN is a complex beast and the Drill community is large and growing. So,
>>>> a good place to start is to ask: has anyone already done work on
>>>> integrating Drill with YARN (see DRILL-142)?
>>>> Or thought about what might be needed?
>>>>
>>>> DRILL-1170 (YARN support for Drill) seems a good place to gather
>>>> requirements, designs, and so on. I’ve posted a “starter set” of
>>>> requirements to spur discussion.
>>>>
>>>> Thanks,
>>>>
>>>> - Paul