Hello -

I have been playing with Drill and really see the potential in it.  I
wanted to start some discussion about how Drill could be used in the
enterprise, specifically in a service-oriented architecture.

We are exploring Apache Mesos right now (with MapRFS as the clustered
filesystem), so I started thinking about how we could create "clusters" of
Drill bits.

What I am thinking is this: say you have some data for one part of the
company. You could allocate disk space to them, and you could allocate
resources to them so they could run MapReduce (via Myriad), Spark, or
other frameworks. Basically, it allows us to determine who is utilizing
what and scale as needed.

In my testing with Drill, I ran Drill natively next to my MapRFS processes
and my Mesos processes.  This is getting away from "managing" my
resources with Mesos. If my Drill bits are sitting outside of Mesos, then I
am not accounting for those resources.  In addition, can I even run
multiple Drill bits in this setup? What if Marketing and HR both have data
on a node, and I want to run one Drill cluster for HR and another for
Marketing?  Should we just use one cluster and separate users? Or would this
be a good case for smaller Drill bits that can be allocated per department?

Multiple, smaller Drill bits would also help us limit access to data (using
Unix users and file permissions, using users in the definitions for DB or
Mongo connections, etc.), which really helps with data access/governance.
So if multiple, smaller Drill bits are a good idea, how best to do this?

I suppose we could do this in Docker and run multiple groups of Drill bits
in Docker containers to sync things up, but what about a native Mesos
framework? Is this even a good idea? I guess the reason for this post is
that I see how it could be beneficial, but I am not schooled enough in Mesos
or Drill to see the pitfalls. I am curious about the thoughts of the group,
and would also welcome some discussion on the topic of Drill in the enterprise.
