Hi Jacques,

Thanks for the comments and the links to the documents.

In the context of YARN, Resource Management divides into two (mostly) 
independent parts: external and internal.

YARN (via a user request) sets the external limits: x cores and y MB of RAM. 
YARN kills processes that exceed the memory limit, and (optionally) uses 
cgroups to enforce the vcores limit.
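
For reference, the cgroups side of that enforcement is configured on each 
NodeManager. The sketch below shows the yarn-site.xml properties as I 
understand them from the Hadoop 2.x docs; please treat the exact names and 
values as something to verify against your distro rather than gospel:

  <!-- yarn-site.xml (NodeManager), sketch only -->
  <property>
    <name>yarn.nodemanager.container-executor.class</name>
    <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
  </property>
  <property>
    <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
    <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value>
  </property>
  <property>
    <!-- false: vcores act as relative weights (cpu.shares);
         true: vcores become a hard cap via the CFS quota -->
    <name>yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage</name>
    <value>false</value>
  </property>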

Drill’s job is to manage its internal resources to best “live within” that 
external limit. Being new to the project, I was not aware of the two documents. 
It sounds like a good solution is already in place for memory, and good plans 
exist for CPU.

If we assume the current threading model, as you suggest, we’ll still be fine 
in terms of CPU usage. The current model can be a bit exuberant in its use of 
CPU, but cgroups will ensure that Drill cannot exceed the YARN-imposed CPU 
limit.

Once the Drill-on-YARN work gets a bit further along, we will run tests to 
validate that cgroups does work as promised. I’ll let the group know as we get 
some results.

Thanks,

- Paul

 
> On Apr 13, 2016, at 3:03 PM, Jacques Nadeau <jacq...@dremio.com> wrote:
> 
> It sounds like Paul and John would both benefit from reviewing [1] & [2].
> 
> Drill has memory management, respects limits, and has a hierarchy of
> allocators to do this. The framework for constraining certain operations,
> fragments or queries all exists. (Note that this is entirely focused on
> off-heap memory, in general Drill tries to avoid ever moving data on heap.)
> 
> Workload management is another topic and there is an initial proposal out
> on that for comment here: [2]
> 
> The parallelization algorithms don't currently support heterogeneous nodes.
> I'd suggest that initial work be done on adding or removing same sized
> nodes. A separate substantial effort would be involved in better lopsided
> parallelization and workload decisions. (Let's get the basics right first.)
> 
> With regards to Paul's comments on 'inside Drill' threading, I think you're
> jumping to some incorrect conclusions. There haven't been any formal
> proposals to change the threading model. There was a very short discussion
> a month or two back where Hanifi said he'd throw out some prototype code
> but nothing has been shared since. I suggest you assume the current
> threading model until there is a consensus around something new.
> 
> [1]
> https://github.com/apache/drill/blob/master/exec/memory/base/src/main/java/org/apache/drill/exec/memory/README.md
> [2]
> https://docs.google.com/document/d/1xK6CyxwzpEbOrjOdmkd9GXf37dVaf7z0BsvBNLgsZWs/edit
> 
> 
> 
> 
> 
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
> 
> On Mon, Mar 28, 2016 at 8:43 AM, John Omernik <j...@omernik.com> wrote:
> 
>> Great summary.  I'll fill in some "non-technical" explanations of some
>> challenges with memory as I see them. Drill Devs, please keep Paul and me
>> accurate in our understanding.
>> 
>> First, memory is already set at the drillbit level... sorta.  It's set via
>> ENV in drill-env, and is not a cluster-level setting. However, I believe
>> there are some challenges that come into play when you have bits of
>> different sizes. Drill "may" assume that bits are all the same size, and
>> thus, if you run a query, depending on which bit is the foreman and which
>> fragments land where, the query may succeed or fail. That's not an ideal
>> situation. For a holistic discussion on memory, I think we need some
>> definitive answers around how Drill handles memory, especially with
>> different-sized nodes, and what changes would be needed for bits of
>> different sizes to work well together on a production cluster.
>> 
>> This discussion forms the basis of almost all work around memory
>> management. If we can realistically only have bits of one size in Drill's
>> current form, then static allocations are where we are going to be for the
>> initial Yarn work. I love the idea of scaling up and down, but it will be
>> difficult to scale an entire cluster's worth of bits up and down, so
>> heterogeneous resource allocations must be a prerequisite to dynamic
>> allocation discussions (other than just adding and removing whole bits).
>> 
>> Second, this also plays into the multiple-drillbits-per-node discussion.
>> If static-sized bits are our only approach, then the initial reaction is to
>> make them smaller so you have some granularity in scaling up and down.
>> This may actually hurt a cluster.  A large query may be challenged by
>> trying to fit its fragments on 3 nodes of, say, 8GB of direct RAM, but that
>> query would run fine on bits of 24GB of direct RAM.  Drill Devs: keep me
>> honest here. I am going off of lots of participation in these memory/CPU
>> discussions when I first started the Drill/Marathon integration, and that
>> is the feeling I got in talking to folks on and off list about memory
>> management.
>> 
>> This is a hard topic, but one that I am glad you are spearheading, Paul,
>> because as we see more and more clusters get folded together, having a
>> citizen that plays nice with others and provides flexibility with regard
>> to performance-vs-resource tradeoffs will be a huge selling/implementation
>> point for any analytics tool.  If it's hard to implement and test at scale
>> without dedicated hardware, it won't get a fair shake.
>> 
>> John
>> 
>> 
>> On Sun, Mar 27, 2016 at 3:25 PM, Paul Rogers <prog...@maprtech.com> wrote:
>> 
>>> Hi John,
>>> 
>>> The other main topic of your discussion is memory management. Here we
>> seem
>>> to have 6 topics:
>>> 
>>> 1. Setting the limits for Drill.
>>> 2. Drill respects the limits.
>>> 3. Drill lives within its memory “budget.”
>>> 4. Drill throttles work based on available memory.
>>> 5. Drill adapts memory usage to available memory.
>>> 6. Some means to inform Drill of increases (or decreased) in memory
>>> allocation.
>>> 
>>> YARN, via container requests, solves the first problem. Someone (the
>>> network admin) has to decide on the size of each drill-bit container, but
>>> YARN handles allocating the space, preventing memory oversubscription,
>> and
>>> enforcing the limit (by killing processes that exceed their allocation.)
>>> 
>>> As you pointed out, memory management is different than CPU: we can’t
>> just
>>> expect Linux to silently give us more or less depending on load. Instead,
>>> Drill itself has to actively request and release memory (and know what to
>>> do in each case.)
>>> 
>>> Item 2 says that Drill must limit its memory use. The JVM enforces heap
>>> size. (As the heap is exhausted, a Java program gets slower due to
>>> increased garbage collection events until finally it receives an
>>> out-of-memory error.)
>>> 
>>> At present I’m still learning the details of how Drill manages memory so,
>>> by necessity, most of what follows is at the level of “what we could do”
>>> rather than “how it works today.” Drill devs, please help fill in the
>> gaps.
>>> 
>>> The docs suggest we have a variety of settings that configure Drill
>>> memory (heap size, off-heap size, etc.). I need to ask around more to
>>> learn if Drill does, in fact, limit its off-heap memory usage. If not,
>>> then perhaps this is a change we want to make.
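>>>
>>> (For concreteness, the settings I mean are the ones exposed in
>>> conf/drill-env.sh, roughly the following sketch; treat the exact names
>>> and values as my recollection rather than gospel:
>>>
>>>   # conf/drill-env.sh: per-drillbit memory limits, read at startup
>>>   DRILL_MAX_DIRECT_MEMORY="8G"   # cap on off-heap (direct) memory
>>>   DRILL_HEAP="4G"                # JVM heap, becomes -Xms/-Xmx
>>>
>>> As I read the launch scripts, these end up as -Xmx and
>>> -XX:MaxDirectMemorySize on the drill-bit's JVM command line, so the heap
>>> side is enforced by the JVM; whether the direct limit is enforced
>>> end-to-end is exactly the question above.)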
>>> 
>>> Once Drill respects memory limits, we move to item 3: Drill should live
>>> within the limits. By this I mean that query operations should work with
>>> constrained memory, perhaps by spilling to disk — it is not sufficient to
>>> simply fail when memory is exhausted. Again, I don’t yet know where we’re
>>> at here, but I understand we may still have a bit of work to do to
>> achieve
>>> this goal.
>>> 
>>> Item 4 looks at the larger picture. Suppose a Drill-bit has 32GB of
>> memory
>>> available to it. We do the work needed so that any given query can
>> succeed
>>> within this limit (perhaps slowly if operations spill to disk.) But, what
>>> happens when the same Drill-bit now has to process 10 such queries or
>> 100?
>>> We now have a much harder problem: having the collection of ALL queries
>>> live within the same 32GB limit.
>>> 
>>> One solution is to simply hold queries in a queue when memory (or even
>>> CPU) becomes impacted. That is, rather than trying to run all 100 queries
>>> at once (slowly), perhaps run 20 at a time (quickly), allowing each a
>> much
>>> larger share of memory.
>>> 
>>> Drill already has queues, but they are off by default. We may have to
>>> look at turning them on by default. Again, I'm not familiar with our
>>> queuing strategy, but there seems to be quite a bit we could do to release
>>> queries from the queue only when we can give them adequate resources on
>>> each drill-bit.
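>>>
>>> (The queues I'm referring to are the ZooKeeper-based ones controlled by
>>> system options along these lines; a sketch from memory, so please verify
>>> the option names and pick sensible limits for your cluster:
>>>
>>>   ALTER SYSTEM SET `exec.queue.enable` = true;
>>>   ALTER SYSTEM SET `exec.queue.small` = 10;  -- concurrent "small" queries
>>>   ALTER SYSTEM SET `exec.queue.large` = 2;   -- concurrent "large" queries
>>>   ALTER SYSTEM SET `exec.queue.threshold` = 30000000;  -- cost cutoff
>>>
>>> Queries whose planned cost exceeds the threshold wait in the "large"
>>> queue; the rest wait in the "small" one.)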
>>> 
>>> Item 5 says that Drill should be opportunistic. If some external system
>>> can grant a temporary loan of more memory, Drill should be able to use
>> it.
>>> When the loan is revoked, Drill should relinquish the memory, perhaps by
>>> spilling the data to disk (or moving the data to other parts of memory.)
>>> Java programs can’t release heap memory, but Drill uses off-heap, so it
>> is
>>> at least theoretically possible to release memory back to the OS. Sounds
>>> like Drill has a number of improvements needed before Drill can actually
>>> release off-heap memory.
>>> 
>>> Finally, item 6 says we need that external system to loan Drill the extra
>>> memory. With CPU, the process scheduler can solve the problem all on its
>>> own by looking at system load and deciding, at any instant, which
>> processes
>>> to run. Memory is harder.
>>> 
>>> One solution would be for YARN to resize Drill’s container. But, YARN
>> does
>>> not yet support resizing containers. YARN-1197: "Support changing
>> resources
>>> of an allocated container" [1] describes the tasks needed to get there.
>>> Once that feature is complete, YARN will let an application ask for more
>>> memory (or release excess memory). Presumably the app or a user must
>> decide
>>> to request more memory. For example, the admin might dial up Drill memory
>>> during the day when the marketing folks are running queries, but dial it
>>> back at night when mostly batch jobs run.
>>> 
>>> The ability to manually change memory is great, but the ideal would be to
>>> have some automated way to use free memory on each node. Llama does this
>>> in an ad-hoc manner. A quick search on YARN did not reveal anything in
>>> this vein, so we’ll have to research this idea a bit more. I wonder,
>>> though, if Drill could actually handle fast-moving allocation changes;
>>> change on the order of the lifetime of a query seems more achievable
>>> (that is, on the order of minutes to hours).
>>> 
>>> In short, it seems we have quite a few tasks ahead in the area of memory
>>> management. Each seems achievable, but each requires work. The
>>> Drill-on-YARN project is just a start: it helps the admin allocate memory
>>> between Drill and other apps.
>>> 
>>> Thanks,
>>> 
>>> - Paul
>>> 
>>> 
>>> [1] https://issues.apache.org/jira/browse/YARN-1197
>>> 
>>> 
>>>> On Mar 26, 2016, at 6:48 AM, John Omernik <j...@omernik.com> wrote:
>>>> 
>>>> Paul  -
>>>> 
>>>> Great write-up.
>>>> 
>>>> Your description of Llama and Yarn is both informative and troubling for
>>>> a potential cluster administrator. Looking at this solution, it would
>>>> appear that to use Yarn with Llama, the "citizen" (in this case Drill)
>>>> would have to be an extremely good citizen and honor all requests from
>>>> Llama related to deallocation and limits on resources, while in reality
>>>> there are no enforcement mechanisms.  Not that I don't think Drill is a
>>>> great tool written well by great people, but I don't know if I would want
>>>> to leave my cluster SLAs up to Drill bits doing the self-regulation.
>>>> Edge cases, etc., causing a Drillbit to start taking more resources would
>>>> be very impactful to a cluster, and with more and more people going to
>>>> highly concurrent, multi-tenant solutions, this becomes a HUGE challenge.
>>>> 
>>>> Obviously dynamic allocation, flexing up and down to use "spare"
>> cluster
>>>> resources is very important to many cluster/architecture
>> administrators,
>>>> but if I had to guess, SLAs/Workload guarantees would rank higher.
>>>> 
>>>> The Llama approach seems to be too much of a "house of cards" to me to
>>>> be viable, and I worry that long term it may not be best for a product
>>>> like Drill. Our goal, I think, should be to play nice with others; if our
>>>> core philosophy in integration is playing nice with others, it will only
>>>> help adoption and encourage people to give it a try.  So back to Drill on
>>>> Yarn (natively)...
>>>> 
>>>> A few questions around this.  You mention that resource allocations are
>>>> mostly a gentlemen's agreement. Can you explore that a bit more?  I do
>>>> believe there is Cgroup support in Yarn.  (I know the Myriad project is
>>>> looking to use Cgroups.)  So is this gentlemen's agreement more about
>>>> when Cgroups is NOT enabled?  Thus it is only the word of the process
>>>> running in the container in Yarn?  If this is the case, then has there
>>>> been any research on the stability of Cgroups and the implementation in
>>>> Yarn?  Basically, a poll: Are you using Yarn? If so, are you using
>>>> Cgroups? If not, why? If you are using them, any issues?  This may be
>>>> helpful in what we are looking to do with Drill.
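>>>>
>>>> (One low-tech way to check on a given NodeManager host, sketched under
>>>> the common defaults; the mount point and hierarchy name vary by distro
>>>> and by the yarn.nodemanager.linux-container-executor.cgroups.* settings:
>>>>
>>>>   # if cgroup enforcement is on, YARN creates a group per container
>>>>   ls /sys/fs/cgroup/cpu/hadoop-yarn/
>>>>   # the CPU weight YARN assigned to a given container
>>>>   cat /sys/fs/cgroup/cpu/hadoop-yarn/container_*/cpu.shares
>>>>
>>>> If those paths are empty or absent, the containers are running on the
>>>> honor system.)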
>>>> 
>>>> "Hanifi’s work will allow us to increase or decrease the number of
>> cores
>>> we
>>>> consume."  Do you have any JIRAs I can follow on this, I am very
>>> interested
>>>> in this. One of the benefits of CGroups in Mesos as it relates to CPU
>>>> shares is a sorta built in Dynamic allocation. And it would be
>>> interesting
>>>> to test a Yarn Cluster with Cgroups enabled (once a basic Yarn Aware
>>> Drill
>>>> bit is enabled) to see if Yarn reacts the same way.
>>>> 
>>>> Basically, when I run a drillbit on a node with Cgroup isolation enabled
>>>> in Marathon on Mesos, let's say I have 16 total cores on the node. For
>>>> me, I run my Mesos agent with "14" available vcores... Why? Static
>>>> allocation of 2 vcores for MapR-FS.  Those 14 vcores are now available to
>>>> tasks on the agent.  When I start the drillbit, let's say I allocate 8
>>>> vcores to the drillbit in Marathon.  Drill runs queries, and let's say
>>>> the actual CPU usage on this node is minimal at the time. Drill, because
>>>> it is not currently CPU aware, takes all the CPU it can (it will use all
>>>> 16 cores).  When the query finishes, it goes back to 0. But what happens
>>>> if MapR is heavily using its 2 cores? Well, Cgroups detects the
>>>> contention and limits Drill, because Drill is only allocated 8 shares of
>>>> the 14 it is aware of; this gives priority to the MapR operations. Even
>>>> more so, if there are other Mesos tasks asking for CPU shares, Drill's
>>>> CPU share is scaled back, not by telling Drill it can't use cores, but by
>>>> processing what Drill is trying to do more slowly compared to the rest of
>>>> the workloads.  I know I am dumbing this down, but that's how I
>>>> understand Cgroups working. Basically, I was very concerned when I first
>>>> started doing Drill queries in Mesos, and I posted to the Mesos list,
>>>> where some people smarter than I took the time to explain things.
>>>> (Vinode, you are lurking on this list, thanks again!)
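>>>>
>>>> (A rough back-of-the-envelope on that, assuming Mesos maps each allocated
>>>> CPU to 1024 cpu.shares, which is my understanding of how the isolator
>>>> works: Drill's container carries 8 x 1024 shares and the other Mesos
>>>> tasks at most 6 x 1024 between them. With the node otherwise idle, Drill
>>>> can still burst to all 16 physical cores; under full contention the
>>>> kernel divides CPU time in proportion to shares, so Drill ends up with
>>>> roughly 8/14 of what the Mesos tasks get as a group: throttled, never
>>>> killed.)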
>>>> 
>>>> In a way, this is actually a nice side effect of Cgroup isolation: from
>>>> Drill's perspective it gets all the CPU, and is only scaled back on
>>>> contention.  So, my long explanation here is to bring things back to the
>>>> Yarn/Cgroup/gentlemen's agreement comment. I'd really like to understand
>>>> this. As a cluster administrator, I can guarantee a level of resources
>>>> with Mesos. Can I get that same guarantee in Yarn? Is it only with
>>>> certain settings?  I just want to be 100% clear that if we go that route
>>>> and make Drill work on Yarn, our documentation/instructions are explicit
>>>> about what we are giving the user on Yarn.  To me, a bad situation would
>>>> occur when someone thinks all will be well when they run Drill on Yarn,
>>>> and because they are not aware of their own settings (say, not enabling
>>>> Cgroups) they blame Drill for breaking something.
>>>> 
>>>> So that falls back to memory and scaling memory in Drill.  Memory, for
>>>> obvious reasons, can't operate like CPU with Cgroups. You can't allocate
>>>> all memory to all the things and then scale back on contention. So, being
>>>> a complete neophyte on the inner workings of Drill memory: what options
>>>> would exist for allocation of memory?  Could we trigger events that would
>>>> adjust up and down what memory a given drillbit can use, so it
>>>> self-limits? It currently self-limits because we can see the memory
>>>> settings in drill-env.sh.  But what about changing that at a later time?
>>>> Is it easier to change direct memory limits rather than Heap?
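>>>>
>>>> (If I understand the docs, one runtime knob in this direction already
>>>> exists: the per-query direct-memory cap, which can be changed without a
>>>> restart. Treat the option name below as my best recollection:
>>>>
>>>>   ALTER SYSTEM SET `planner.memory.max_query_memory_per_node` = 4294967296;
>>>>
>>>> By contrast, the Heap size (-Xmx) and the overall direct-memory ceiling
>>>> are fixed when the drillbit's JVM starts.)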
>>>> 
>>>> Hypothesis 1: If Direct memory isn't allocated (i.e. a drillbit is idle),
>>>> then changing what it could POTENTIALLY use if a query were to come in
>>>> would be easier than actually deallocating Heap that's been used.
>>>> Hypothesis 2: Direct Memory, if it's truly deallocated when not in use,
>>>> is more about the limit of what it could use, not about allocating or
>>>> deallocating memory. Hence a nice step one may be to allow this limit to
>>>> change as needed by an Application Master in Yarn (or a Mesos framework).
>>>> 
>>>> If the work to change the limit on Direct Memory usage is easier, it may
>>>> be a good first step (assuming I am not completely wrong on memory
>>>> allocation): if we have to statically allocate Heap, and it's a big
>>>> change in code to make that dynamic, but Direct Memory is easy to change,
>>>> then that's a great first feature without boiling the ocean. Obviously
>>>> lots of assumptions here, but I am just thinking out loud.
>>>> 
>>>> Paul - When it comes to applications in Yarn, and the containers that
>>>> the Application Master allocates, can containers be joined?  Let's say I
>>>> am an application master, and I allocated 4 CPU cores and 16 GB of RAM to
>>>> a Drillbit (8 for Heap and 8 for Direct).  Then at a later time I want to
>>>> add more memory to the drill bit.... If my assumptions on Direct memory
>>>> in Drill hold, could my Application Master tell the drill bit "OK, you
>>>> can use 16 GB of direct memory now" (i.e., the AM asks the RM to allocate
>>>> 8 more GB of RAM on that node, and the RM agrees and allocates another
>>>> container)? Can it just resize, or would that not work? I guess what I am
>>>> describing here is sorta what Llama is doing... but I am actually talking
>>>> about the ability to enforce the quotas.... This may actually be a
>>>> question that fits into your discussion on resizing Yarn containers more
>>>> than anything.
>>>> 
>>>> So I just tossed out a bunch of ideas here to keep the discussion running.
>>>> Drill Devs, I would love a better understanding of the memory
>> allocation
>>>> mechanisms within Drill. (High level, neophyte here).  I do feel as a
>>>> cluster admin, as I have said, that the Llama approach (now that I
>>>> understand it better) would worry me, especially in a multi-tenant
>>>> cluster.  And as you said, Paul, it "feels" hacky.
>>>> 
>>>> Thanks for this discussion, it's a great opportunity for Drill adoption
>>> as
>>>> clusters go more and more multi-tenant/multi-use.
>>>> 
>>>> John
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Fri, Mar 25, 2016 at 5:45 PM, Paul Rogers <prog...@maprtech.com
>>> <mailto:prog...@maprtech.com>> wrote:
>>>> 
>>>>> Hi Jacques,
>>>>> 
>>>>> Llama is a very interesting approach; I read their paper [1] early on;
>>> just
>>>>> went back and read it again. Basically, Llama (as best as I can tell)
>>> has a
>>>>> two-part solution.
>>>>> 
>>>>> First, Impala is run off-YARN (that is, not in a YARN container).
>> Llama
>>>>> uses “dummy” containers to inform YARN of Impala’s resource usage.
>> They
>>> can
>>>>> grow/shrink static allocations by launching more dummy containers.
>> Each
>>>>> dummy container does nothing other than inform off-YARN Impala of the
>>>>> container resources. Rather clever, actually, even if it “abuses the
>>>>> software” a bit.
>>>>> 
>>>>> Secondly, Llama is able to dynamically grab spare YARN resources on
>> each
>>>>> node. Specifically, Llama runs a Node Manager (NM) plugin that watches
>>>>> actual node usage. The plugin detects the free NM resources and
>> informs
>>>>> Impala of them. Impala then consumes the resources as needed. When the
>>> NM
>>>>> allocates a new container, the plugin informs Impala which
>> relinquishes
>>> the
>>>>> resources. All this works because YARN allocations are mostly a
>>> gentleman’s
>>>>> agreement. Again, this is pretty clever, but only one app per node can
>>> play
>>>>> this game.
>>>>> 
>>>>> The Llama approach could work for Drill. The benefit is that Drill
>> runs
>>> as
>>>>> it does today. Hanifi’s work will allow us to increase or decrease the
>>>>> number of cores we consume. The drawback is that Drill is not yet
>>> ready to
>>>>> play the memory game: it can’t release memory back to the OS when
>>>>> requested. Plus, the approach just smells like a hack.
>>>>> 
>>>>> The “pure-YARN” approach would be to let YARN start/stop the
>> Drill-bits.
>>>>> The user can grow/shrink Drill resources by starting/stopping
>>> Drill-bits.
>>>>> (This is simple to do if one ignores data locality and starts each
>>>>> Drill-bit on a separate node. It is a bit more work if one wants to
>>>>> preserve data locality by being rack-aware, or by running multiple
>>>>> drill-bits per node.)
>>>>> 
>>>>> YARN has been working on the ability to resize running containers.
>> (See
>>>>> YARN-1197 - Support changing resources of an allocated container [2])
>>> Once
>>>>> that is available, we can grow/shrink existing Drill-bits (assuming
>> that
>>>>> Drill itself is enhanced as discussed above.) The promise of resizable
>>>>> containers also suggests that the “pure-YARN” approach is workable.
>>>>> 
>>>>> Once resizable containers are available, one more piece is needed to
>> let
>>>>> Drill use free resources. Some cluster-wide component must detect free
>>>>> resources and offer them to applications that want them, deciding how
>> to
>>>>> divvy up the resources between, say, Drill and Impala. The same piece
>>> would
>>>>> revoke resources when paying YARN customers need them.
>>>>> 
>>>>> Of course, if the resizable container feature comes too late, or does
>> not
>>>>> work well, we still have the option of going off-YARN using the Llama
>>>>> trick. But the Llama trick does nothing to provide the cluster-wide
>>>>> coordination discussed above.
>>>>> 
>>>>> So, the thought is: start simple with a “stock” YARN app. Then, we can
>>> add
>>>>> bells and whistles as we gain experience and as YARN offers more
>>>>> capabilities.
>>>>> 
>>>>> The nice thing about this approach is that the same idea plays well
>> with
>>>>> Mesos (though the implementation is different).
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> - Paul
>>>>> 
>>>>> [1] http://cloudera.github.io/llama/
>>>>> [2] https://issues.apache.org/jira/browse/YARN-1197
>>>>> 
>>>>>> On Mar 24, 2016, at 2:34 PM, Jacques Nadeau <jacq...@dremio.com
>>> <mailto:jacq...@dremio.com>> wrote:
>>>>>> 
>>>>>> Your proposed allocation approach makes a lot of sense. I think it
>> will
>>>>>> solve a large number of use cases. Thanks for giving an overview of
>> the
>>>>>> different frameworks. I wonder if they got too focused on the simple
>>> use
>>>>>> case....
>>>>>> 
>>>>>> Have you looked at Llama to see whether we could extend it for our
>>>>>> needs? It's Apache licensed and probably has at least a start at a
>>>>>> bunch of things we're trying to do.
>>>>>> 
>>>>>> https://github.com/cloudera/llama
>>>>>> 
>>>>>> --
>>>>>> Jacques Nadeau
>>>>>> CTO and Co-Founder, Dremio
>>>>>> 
>>>>>> On Tue, Mar 22, 2016 at 7:42 PM, Paul Rogers <prog...@maprtech.com>
>>>>> wrote:
>>>>>> 
>>>>>>> Hi Jacques,
>>>>>>> 
>>>>>>> I’m thinking of “semi-static” allocation at first. Spin up a cluster
>>> of
>>>>>>> Drill-bits, after which the user can add or remove nodes while the
>>>>> cluster
>>>>>>> runs. (The add part is easy, the remove part is a bit tricky since
>> we
>>>>> don’t
>>>>>>> yet have a way to gracefully shut down a Drill-bit.) Once we get the
>>>>> basics
>>>>>>> to work, we can incrementally try out dynamics. For example, someone
>>>>> could
>>>>>>> whip up a script to look at load and use the proposed YARN client
>> app
>>> to
>>>>>>> adjust resources. Later, we can fold dynamic load management into
>> the
>>>>>>> solution once we’re sure what folks want.
>>>>>>> 
>>>>>>> I did look at Slider, Twill, Kitten and REEF. Kitten is too basic. I
>>> had
>>>>>>> great hope for Slider. But, it turns out that Slider and Twill have
>>>>>>> each
>>>>>>> built an elaborate framework to isolate us from YARN. The Slider
>>>>> framework
>>>>>>> (written in Python) seems harder to understand than YARN itself. At
>>>>> least,
>>>>>>> one has to be an expert in YARN to understand what all that Python
>>> code
>>>>>>> does. And, just looking at the class count in the Twill Javadoc was
>>>>>>> overwhelming. Slider and Twill have to solve the general case. If we
>>>>> build
>>>>>>> our own Java solution, we only have to solve the Drill case, which
>> is
>>>>>>> likely much simpler.
>>>>>>> 
>>>>>>> A bespoke solution would seem to offer some other advantages. It
>> lets
>>> us
>>>>>>> do things like integrate ZK monitoring so we can learn of zombie
>> drill
>>>>> bits
>>>>>>> (haven’t exited, but not sending heartbeat messages.) We can also
>>> gather
>>>>>>> metrics and historical data about the cluster as a whole. We can try
>>> out
>>>>>>> different cluster topologies. (Run Drill-bits on x of y nodes on a
>>> rack,
>>>>>>> say.) And, we can eventually do the dynamic load management we
>>> discussed
>>>>>>> earlier.
>>>>>>> 
>>>>>>> But first, I look forward to hearing what others have tried and what
>>>>> we’ve
>>>>>>> learned about how people want to use Drill in a production YARN
>>> cluster.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> 
>>>>>>> - Paul
>>>>>>> 
>>>>>>> 
>>>>>>>> On Mar 22, 2016, at 5:45 PM, Jacques Nadeau <jacq...@dremio.com>
>>>>> wrote:
>>>>>>>> 
>>>>>>>> This is great news, welcome!
>>>>>>>> 
>>>>>>>> What are you thinking in regards to static versus dynamic resource
>>>>>>>> allocation? We have some conversations going regarding workload
>>>>>>> management
>>>>>>>> but they are still early so it seems like starting with
>>> user-controlled
>>>>>>>> allocation makes sense initially.
>>>>>>>> 
>>>>>>>> Also, have you spent much time evaluating whether one of the
>> existing
>>>>>>> YARN
>>>>>>>> frameworks such as Slider would be useful? Does anyone on the list
>>> have
>>>>>>> any
>>>>>>>> feedback on the relative merits of these technologies?
>>>>>>>> 
>>>>>>>> Again, glad to see someone picking this up.
>>>>>>>> 
>>>>>>>> Jacques
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --
>>>>>>>> Jacques Nadeau
>>>>>>>> CTO and Co-Founder, Dremio
>>>>>>>> 
>>>>>>>> On Tue, Mar 22, 2016 at 4:58 PM, Paul Rogers <prog...@maprtech.com
>>> 
>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Hi All,
>>>>>>>>> 
>>>>>>>>> I’m a new member of the Drill Team here at MapR. We’d like to
>> take a
>>>>>>> look
>>>>>>>>> at running Drill on YARN for production customers. JIRA suggests
>>> some
>>>>>>> early
>>>>>>>>> work may have been done (DRILL-142 <
>>>>>>>>> https://issues.apache.org/jira/browse/DRILL-142>, DRILL-1170 <
>>>>>>>>> https://issues.apache.org/jira/browse/DRILL-1170>, DRILL-3675 <
>>>>>>>>> https://issues.apache.org/jira/browse/DRILL-3675>).
>>>>>>>>> 
>>>>>>>>> YARN is a complex beast and the Drill community is large and
>>> growing.
>>>>>>> So,
>>>>>>>>> a good place to start is to ask if anyone has already done work on
>>>>>>>>> integrating Drill with YARN (see DRILL-142)?  Or has thought about
>>>>> what
>>>>>>>>> might be needed?
>>>>>>>>> 
>>>>>>>>> DRILL-1170 (YARN support for Drill) seems a good place to gather
>>>>>>>>> requirements, designs and so on. I’ve posted a “starter set” of
>>>>>>>>> requirements to spur discussion.
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> 
>>>>>>>>> - Paul
>>> 
>>> 
>> 
