In the DB world, this issue is resolved by running the DB server as a 
privileged user; users then log into the DB to do work. This means that the 
DB itself is trusted, but clients are not. Of course, DBs have their own user 
system, so users are defined in the DB; the OS account that runs the DB owns 
all the DB’s files, and the DB manages security internally.

As Ted says, HDFS and MFS emulate this model by having the file system manage 
access; Drill itself becomes the (untrusted) client.

The problem discussed in this thread arises when Drill accesses the local file 
system directly and must read files owned by other users.
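
To make the local-file case concrete, here is a minimal sketch in Python of the 
owner/group/other read check the OS applies. It is a simplified model, not 
Drill code: the function name and the numeric ids are made up for illustration, 
and it ignores root, ACLs, and capabilities.

```python
import stat

def can_read(file_uid, file_gid, file_mode, proc_uid, proc_gids):
    """Simplified POSIX read check: owner bits, else group bits, else other bits."""
    if proc_uid == file_uid:
        return bool(file_mode & stat.S_IRUSR)
    if file_gid in proc_gids:
        return bool(file_mode & stat.S_IRGRP)
    return bool(file_mode & stat.S_IROTH)

# Hypothetical ids for illustration only: drill=1000, user1=1001.
DRILL_UID, USER1_UID = 1000, 1001

# A file owned by user1 with mode 600, as in Scott's test.
# The Drillbit process runs as drill, whoever is logged in through sqlline.
print(can_read(USER1_UID, USER1_UID, 0o600, DRILL_UID, {DRILL_UID}))  # False
# A process that really runs as user1 passes the same check.
print(can_read(USER1_UID, USER1_UID, 0o600, USER1_UID, {USER1_UID}))  # True
```

The check depends only on the ids of the process doing the read, so the local 
OS never sees which Drill user issued the query.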

All of this is a long-winded way of asking this: What do other “big data” tools 
do to solve this problem? If one is doing big data, should a distributed file 
system be a requirement if one wants security?

Thanks,

- Paul

> On Jul 1, 2016, at 7:44 AM, scott <tcots8...@gmail.com> wrote:
> 
> Great explanation, Ted. I think I will still open the ticket, if nothing
> else to address the gaps in documentation.
> 
> Thanks,
> Scott
> 
> On Fri, Jul 1, 2016 at 2:30 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
> 
>> You can certainly open a ticket for this.
>> 
>> The problem is resolving the ticket.
>> 
>> Impersonation explicitly depends on some of the functionality available
>> with certain distributed file systems like MapR FS and HDFS. This allows a
>> drill bit to just do file system level impersonation based on possession of
>> a crypto certificate.
>> 
>> Doing the same thing for a local file system requires the ability to
>> actually change the effective user id of the process doing the access. That
>> can be done, but allowing that level of power to an application process is
>> pretty terrifying to security admins. It also is typically implemented
>> using a separation of powers style which would require that an entire
>> additional process be introduced into the design. That's a lot of work for
>> something that is likely to result in a fair bit of grief.
>> 
>> There is definitely a documentation bug here that should be fixed.
>> 
>> 
>> 
>> 
>> On Fri, Jul 1, 2016 at 7:20 AM, scott <tcots8...@gmail.com> wrote:
>> 
>>> Am I able to open a Jira ticket on this? Or, is this something a developer
>>> has to do?
>>> 
>>> Scott
>>> 
>>> On Thu, Jun 30, 2016 at 5:17 PM, scott <tcots8...@gmail.com> wrote:
>>> 
>>>> Impersonation using the default dfs configuration is not supported? The
>>>> documentation for Impersonation Support says that File System is a
>>>> supported Storage Plugin, and that only HBase is not supported.
>>>> If this is true, do you know if there is a Jira ticket to add this
>>>> feature?
>>>> 
>>>> Scott
>>>> 
>>>> 
>>>> On Thu, Jun 30, 2016 at 4:58 PM, Chun Chang <cch...@maprtech.com> wrote:
>>>> 
>>>>> Impersonation against local file system is not supported. If you are
>>>>> running against hdfs, please take a look at drillbit.log or post relevant
>>>>> part here.
>>>>> 
>>>>> On Thu, Jun 30, 2016 at 8:12 AM, scott <tcots8...@gmail.com> wrote:
>>>>> 
>>>>>> Hi,
>>>>>> I am having trouble getting Impersonation to work. Using Drill 1.7, I have
>>>>>> a drill user, user1, and user2. Drill is started as the drill user. I am
>>>>>> testing impersonation on the local file system dfs default storage plugin
>>>>>> on a linux server. I have setup some files that are owned by user1 and
>>>>>> user2 with 600 permissions, and am using the sqlline tool to test access.
>>>>>> However, I am not able to access either file logged in as user1 or user2.
>>>>>> Only when I change permissions so that the drill user can read am I able to
>>>>>> access either file. I have confirmed that impersonation is enabled using
>>>>>> the following:
>>>>>> 
>>>>>> select * from sys.boot where name like '%impersonation%';
>>>>>> 
>>>>>> +-------------------------------------------------+----------+-------+---------+----------+-------------+-----------+------------+
>>>>>> |                      name                       |   kind   | type  | status  | num_val  | string_val  | bool_val  | float_val  |
>>>>>> +-------------------------------------------------+----------+-------+---------+----------+-------------+-----------+------------+
>>>>>> | drill.exec.impersonation.enabled                | BOOLEAN  | BOOT  | BOOT    | null     | null        | true      | null       |
>>>>>> | drill.exec.impersonation.max_chained_user_hops  | LONG     | BOOT  | BOOT    | 2        | null        | null      | null       |
>>>>>> +-------------------------------------------------+----------+-------+---------+----------+-------------+-----------+------------+
>>>>>> 
>>>>>> My override conf is:
>>>>>> drill.exec: {
>>>>>>  cluster-id: "mydrillbits",
>>>>>>  zk: {
>>>>>>    connect: "10.80.22.238:2181",
>>>>>>    root: "drill",
>>>>>>    refresh: 500,
>>>>>>    timeout: 5000,
>>>>>>    retry: {
>>>>>>      count: 7200,
>>>>>>      delay: 500
>>>>>>    }
>>>>>>  },
>>>>>>  http: {
>>>>>>    enabled: true,
>>>>>>    ssl_enabled: true,
>>>>>>    port: 8047
>>>>>>  },
>>>>>>  impersonation: {
>>>>>>    enabled: true,
>>>>>>    max_chained_user_hops: 2
>>>>>>  },
>>>>>>  security.user.auth {
>>>>>>    enabled: true,
>>>>>>    packages += "org.apache.drill.exec.rpc.user.security",
>>>>>>    impl: "pam",
>>>>>>    pam_profiles: [ "sudo", "login" ]
>>>>>>  }
>>>>>> }
>>>>>> 
>>>>>> 
>>>>>> Has anyone had similar problems, or am I misunderstanding how user
>>>>>> impersonation works?
>>>>>> 
>>>>>> Thanks for your time,
>>>>>> Scott
>>>>>> 
>>>>> 
>>>> 
>>>> 
>>> 
>> 
