MapR audit records report the errno value to indicate success or failure, so 
status 0 is success and status 17 is errno 17, which is EEXIST ("File exists"). 
It looks like Drill is trying to create a file that already exists.
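
If it helps to decode other status values from the audit log, here is a minimal Python sketch. It assumes the status field is a plain Linux errno, as described above; the helper name describe_status is mine and not part of any MapR or Drill tooling. The second half reproduces EEXIST locally with an exclusive create on a temporary file. That is only an illustration of how the error arises, not a claim about the exact call Drill makes.

```python
import errno
import os
import tempfile


def describe_status(status: int) -> str:
    """Render an audit 'status' value, read as an errno, in readable form."""
    if status == 0:
        return "0 (success)"
    # errno.errorcode maps numeric values to symbolic names such as 'EEXIST'.
    name = errno.errorcode.get(status, "UNKNOWN")
    return f"{status} ({name}: {os.strerror(status)})"


print(describe_status(0))
print(describe_status(17))

# Illustration only: an exclusive create on an existing path fails with EEXIST.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, ".drill.parquet_metadata")
    open(path, "w").close()  # the file now exists
    try:
        os.close(os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY))
    except FileExistsError as exc:
        print("create failed with", describe_status(exc.errno))
```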

I’ll defer to others as to why Drill might do that.

Keys
_______________________________
Keys Botzum 
Senior Principal Technologist
[email protected]
443-718-0098 
MapR Technologies 
http://www.mapr.com



On Nov 11, 2015, at 4:09 PM, John Omernik <[email protected]> wrote:

> I turned on MapR Auditing (this is a handy feature) and found that when I
> run a query that gives me access denied (my query is select * from
> table limit 1), per MapR the user I am logged in as (mapradm) is trying to
> do a create operation on the .drill.parquet_metadata file, and I am
> guessing it's failing with status: 17 (not sure what this means; successes
> appear to be "0"). What was interesting was the "CREATE" being attempted
> three times. Any thoughts on why a select * from table limit 1 would try
> to initiate a create operation on the .drill.parquet_metadata file?
> 
> On Wed, Nov 11, 2015 at 2:25 PM, John Omernik <[email protected]> wrote:
> 
>> I take it back.
>> 
>> I went to run a query, in the same session that had worked, and now I am
>> getting permission denied.
>> 
>> I do have a query running that creates new directories every 5 minutes;
>> however, these aren't the directories that are giving me permission denied.
>>  Did you try running an aggregate query across all data? This is an
>> interesting one to track down; not sure why I am getting the access denied
>> now.
>> 
>> The .drill.parquet_metadata file in the directory that I am getting the
>> error on is owned by mapr:mapr and has rwxr-xr-x permissions. This tells
>> me that both the user of the drillbits (mapr) and the user I am logged into
>> in sqlline (mapradm) should be able to read the file... so why do I get an
>> access denied when running a query? Any assistance would be valuable here
>> in that there are some great performance increases with the metadata
>> caching, and I don't want to miss out on that.
>> 
>> On Wed, Nov 11, 2015 at 2:18 PM, John Omernik <[email protected]> wrote:
>> 
>>> All files are owned by mapr:mapr?
>>> 
>>> I have a setup where mapr is the user running the drillbit, but then I
>>> have a directory that is owned by another user: mapradm:mapradm on all
>>> files. (Permissions on directories and files appear to be rwxr-x-r-x.) When
>>> I run REFRESH TABLE METADATA, the .drill.parquet_metadata file gets
>>> created as mapr:mapr with rwxr-xr-x.
>>> 
>>> So
>>> Drillbit User:mapr
>>> Directory (and subdirectories/files) owner: mapradm:mapradm
>>> Directory permissions (all files and folder under main directory)
>>> rwxr-x-r-x
>>> 
>>> I authenticated to drill via sqlline as user mapradm (this user should be
>>> able to read and write just fine to all directories).
>>> 
>>> Now, one thing I did notice is that my mapr user was not in the mapradm
>>> group and therefore didn't have write permissions anywhere. When I fixed
>>> that on all nodes and then manually deleted the metadata files, things
>>> seemed to be working. I wonder if that was my issue?
>>> 
>>> Basically, the user running the drillbits needs to be able to write files
>>> (the .drill.parquet_metadata) or something bad will happen :) I will do
>>> more testing. This may be a good candidate for some documentation work to
>>> explain what permissions are required to be able to query these.
>>> 
>>> 
>>> 
>>> 
>>> On Wed, Nov 11, 2015 at 1:36 PM, Vince Gonzalez <[email protected]> wrote:
>>> 
>>>> Hi John, I tried this and didn't find any issues. Let me know if I didn't
>>>> follow your reproduction faithfully.
>>>> 
>>>> $ sqlline -u jdbc:drill: -n ec2-user -p mapr
>>>> apache drill 1.2.0
>>>> "drill baby drill"
>>>> 0: jdbc:drill:> refresh table metadata dfs.`/tmp/flows`;
>>>> +-------+------------------------------------------------------+
>>>> |  ok   |                       summary                        |
>>>> +-------+------------------------------------------------------+
>>>> | true  | Successfully updated metadata for table /tmp/flows.  |
>>>> +-------+------------------------------------------------------+
>>>> 1 row selected (32.27 seconds)
>>>> 0: jdbc:drill:> select srcIP,dstIP from dfs.`/tmp/flows` limit 12;
>>>> +---------------+---------------+
>>>> |     srcIP     |     dstIP     |
>>>> +---------------+---------------+
>>>> | 172.16.2.152  | 172.16.1.58   |
>>>> | 172.16.1.58   | 172.16.2.152  |
>>>> | 172.16.2.152  | 172.16.2.73   |
>>>> | 172.16.2.152  | 172.16.2.73   |
>>>> | 172.16.2.73   | 172.16.2.152  |
>>>> | 172.16.2.152  | 172.16.2.73   |
>>>> | 172.16.2.152  | 172.16.2.73   |
>>>> | 172.16.2.152  | 172.16.2.73   |
>>>> | 172.16.2.73   | 172.16.2.152  |
>>>> | 172.16.2.73   | 172.16.2.152  |
>>>> | 172.16.2.73   | 172.16.2.152  |
>>>> | 172.16.2.152  | 172.16.2.73   |
>>>> +---------------+---------------+
>>>> 12 rows selected (5.654 seconds)
>>>> 
>>>> And here's what my table structure looks like (as seen via MapR NFS):
>>>> 
>>>> $ tree /mapr/vgonzalez.drill/tmp/flows/ | head -15
>>>> /mapr/vgonzalez.drill/tmp/flows/
>>>> └── 2015
>>>>    └── 11
>>>>        ├── 10
>>>>        │   ├── 21
>>>>        │   │   ├── 39
>>>>        │   │   │   ├── 03
>>>>        │   │   │   │   ├── _common_metadata
>>>>        │   │   │   │   ├── _metadata
>>>>        │   │   │   │   ├── part-r-00000-853882bd-66d8-4505-96ba-f0a282e374de.gz.parquet
>>>>        │   │   │   │   └── _SUCCESS
>>>>        │   │   │   └── 20
>>>>        │   │   │       ├── _common_metadata
>>>>        │   │   │       ├── _metadata
>>>>        │   │   │       ├── part-r-00000-37a94549-8e56-46d5-be88-cb28e6d8bc35.gz.parquet
>>>> 
>>>> My parquet was created in Spark, not Drill. Not sure if that's relevant.
>>>> 
>>>> I have authentication and impersonation turned on, and the files are
>>>> owned by mapr:mapr. Here's my drill-override.conf:
>>>> 
>>>> drill.exec: {
>>>>   cluster-id: "vgonzalez_drill-drillbits",
>>>>   zk.connect: "ip-172-16-2-36.ec2.internal:5181,ip-172-16-2-37.ec2.internal:5181,ip-172-16-2-38.ec2.internal:5181"
>>>> }
>>>> drill.exec.impersonation: { enabled: true, max_chained_user_hops: 3 }
>>>> drill.exec {
>>>>   security.user.auth {
>>>>     enabled: true,
>>>>     packages += "org.apache.drill.exec.rpc.user.security",
>>>>     impl: "pam",
>>>>     pam_profiles: [ "login", "sudo", "sshd", "password-auth" ]
>>>>   }
>>>> }
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Tue, Nov 10, 2015 at 1:17 PM, John Omernik <[email protected]> wrote:
>>>> 
>>>>> Cool, looking forward to it.
>>>>> 
>>>>> On Mon, Nov 9, 2015 at 7:21 PM, Vince Gonzalez <[email protected]> wrote:
>>>>> 
>>>>>> Hey John, I have a secure cluster and some parquet files, I'll try this
>>>>>> out and report back.
>>>>>> 
>>>>>> On Monday, November 9, 2015, John Omernik <[email protected]> wrote:
>>>>>> 
>>>>>>> Has anyone been able to try/test this? I am curious whether it's an
>>>>>>> issue only for me or more of a bug, so I can open a JIRA if needed.
>>>>>>> 
>>>>>>> John
>>>>>>> 
>>>>>>> On Fri, Nov 6, 2015 at 11:06 AM, John Omernik <[email protected]> wrote:
>>>>>>> 
>>>>>>>> If someone has authorization/authentication set up, to reproduce:
>>>>>>>> 
>>>>>>>> Have a Parquet table with directories underneath the main (I have
>>>>>>>> directories per day)
>>>>>>>> 
>>>>>>>> Then issue REFRESH TABLE METADATA on the root of the table, running as
>>>>>>>> an authenticated user other than the drillbit user. (I am using mapr; I
>>>>>>>> used my user to run the query, and yes I have access to the data)
>>>>>>>> 
>>>>>>>> Then run a normal query and see what the result is.
>>>>>>>> 
>>>>>>>> John
>>>>>>>> 
>>>>>>>> On Fri, Nov 6, 2015 at 10:22 AM, Neeraja Rentachintala <[email protected]> wrote:
>>>>>>>> 
>>>>>>>>> This doesn't make sense and seems like a bug.
>>>>>>>>> I think the right behavior is for the Drillbit to access the cache as
>>>>>>>>> the Drillbit user at query time (there is no user-level metadata cache
>>>>>>>>> in Drill at this point).
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Fri, Nov 6, 2015 at 6:57 AM, John Omernik <[email protected]> wrote:
>>>>>>>>> 
>>>>>>>>>> I ran REFRESH TABLE METADATA on a table; it completed successfully.
>>>>>>>>>> 
>>>>>>>>>> When I tried a subsequent query, I got an IOException: Permission
>>>>>>>>>> Denied on .drill.parquet_metadata.
>>>>>>>>>> 
>>>>>>>>>> I am running Drill with authentication. I ran the REFRESH TABLE
>>>>>>>>>> METADATA as user X; it appears the .drill.parquet_metadata was created
>>>>>>>>>> and is owned by the user the drillbits are running as, and is created
>>>>>>>>>> with -rwxr-x-r-x.
>>>>>>>>>> 
>>>>>>>>>> My question is this: I can see why the file is owned by the drillbit
>>>>>>>>>> user, and the file is created with all-can-read permissions, but why
>>>>>>>>>> am I getting a permission denied when user X is trying to run a query?
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>> 
