Hi Paul, 
Here’s what our engineers said:

From Paul’s response, I understand that there is a slight confusion around how 
multi-tenancy has been enabled in our data lake.

Some more details on this:

Drill already supports a form of multi-tenancy: we can run multiple Drill 
clusters on the same data lake, distinguished by different ports and ZooKeeper 
settings. However, all of them are launched through the same hard-coded YARN 
queue, which we provide as a config parameter.

In our data lake, each tenant is allotted a certain amount of compute capacity 
to use for their project work. This is provisioned through an individual YARN 
queue for each tenant (resource caging), which prevents a tenant from using 
cluster resources beyond its limit and thereby impacting other tenants.

Access to these YARN queues is provisioned through ACL memberships. 
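
To make the model concrete, here is a rough sketch of the tenant-to-queue 
mapping described above. It is illustrative only: the tenant names, queue 
names, capacities, and the queue_for helper are all hypothetical and are not 
taken from our actual YARN configuration or from any Drill API.

```python
from dataclasses import dataclass, field


@dataclass
class TenantQueue:
    name: str          # YARN queue name (hypothetical)
    capacity_pct: int  # share of cluster compute caged for this tenant
    acl_members: set = field(default_factory=set)  # users allowed to submit


# Hypothetical tenants; real queue names and capacities would differ.
QUEUES = {
    "tenant_a": TenantQueue("root.tenant_a", 30, {"alice"}),
    "tenant_b": TenantQueue("root.tenant_b", 20, {"bob"}),
}


def queue_for(user: str, tenant: str) -> str:
    """Return the queue a user may submit to, or raise if not in its ACL."""
    queue = QUEUES[tenant]
    if user not in queue.acl_members:
        raise PermissionError(f"{user} is not in the ACL for {queue.name}")
    return queue.name
```

In this picture, the queue is chosen per tenant at submission time, whereas 
our Drill clusters are all launched with one fixed queue from the config.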

——

Does this make sense? Is it possible to get Drill to work in this manner, 
or should we look into opening JIRAs and working on new capabilities?



> On Dec 17, 2018, at 21:59, Paul Rogers <[email protected]> wrote:
> 
> Hi Kwizera,
> I hope my answer to Charles gave you the information you need. If not, please 
> check out the DoY documentation or ask follow-up questions.
> Key thing to remember: Drill is a long-running YARN service; queries DO NOT 
> go through YARN queues, they go through Drill directly.
> 
> Thanks,
> - Paul
> 
> 
> 
>    On Monday, December 17, 2018, 11:01:04 AM PST, Kwizera hugues Teddy 
> <[email protected]> wrote:  
> 
> Hello,
> Same question here.
> I would like to know how Drill deals with this YARN functionality.
> Cheers.
> 
> On Mon, Dec 17, 2018, 17:53 Charles Givre <[email protected]> wrote:
> 
>> Hello all,
>> We are trying to set up a Drill cluster on our corporate data lake.  Our
>> cluster requires dynamic YARN queue allocation for multi-tenant
>> environment.  Is this something that Drill supports or is there a
>> workaround?
>> Thanks!
>> —C  
