MapR audit records print the errno value to indicate success or failure, so status 17 is errno 17, which is EEXIST. It looks like Drill is trying to create a file that already exists.
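As a minimal sketch of the errno mapping described above (assuming a Linux host, where errno 17 is EEXIST; the helper names `errno_name` and `exclusive_create_status` are hypothetical and this is plain local Python, not Drill or MapR code), the status can be decoded and the same failure reproduced with an exclusive create:

```python
import errno
import os
import tempfile


def errno_name(status):
    """Translate an audit-style status number into its symbolic errno name."""
    if status == 0:
        return "OK"  # assumption: 0 means success, as in the audit records
    return errno.errorcode.get(status, "UNKNOWN")


def exclusive_create_status(path):
    """Try to create `path` exclusively; return 0 or the errno of the failure."""
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except OSError as exc:
        return exc.errno
    os.close(fd)
    return 0


if __name__ == "__main__":
    print(17, errno_name(17), os.strerror(17))
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for the metadata cache file.
        path = os.path.join(tmp, ".drill.parquet_metadata")
        first = exclusive_create_status(path)   # file does not exist yet
        second = exclusive_create_status(path)  # file now exists
        print(first, errno_name(first))
        print(second, errno_name(second))
```

The second call fails because the file already exists, which is the same condition the audit record reports for the repeated CREATE attempts. It says nothing about why Drill issues the create in the first place.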
I’ll defer to others as to why Drill might do that.

Keys
_______________________________
Keys Botzum
Senior Principal Technologist
[email protected]
443-718-0098
MapR Technologies
http://www.mapr.com

On Nov 11, 2015, at 4:09 PM, John Omernik <[email protected]> wrote:

> I turned on MapR Auditing (this is a handy feature) and found that when
> I run a query (the one that is giving me access denied; my query is
> select * from table limit 1), per MapR the user I am logged in as
> (mapradm) is trying to do a create operation on the
> .drill.parquet_metadata file, and I am guessing it's failing with
> status: 17 (not sure what this means; successes appear to be "0").
> What was interesting was the "CREATE" being attempted three times. Any
> thoughts on why a select * from table limit 1 would try to initiate a
> create operation on the .drill.parquet_metadata file?
>
> On Wed, Nov 11, 2015 at 2:25 PM, John Omernik <[email protected]> wrote:
>
>> I take it back.
>>
>> I went to run a query, in the same session that had worked, and now I
>> am getting permission denied.
>>
>> I do have a query running that creates new directories every 5
>> minutes; however, these aren't the directories that are giving me
>> permission denied. Did you try running an aggregate query across all
>> data? This is an interesting one to track down; not sure why I am
>> getting the access denied now.
>>
>> The .drill.parquet_metadata file in the directory that I am getting
>> the error on is owned by mapr:mapr and has rwxr-xr-x permissions. This
>> tells me that both the user of the drillbits (mapr) and the user I am
>> logged in as in sqlline (mapradm) should be able to read the file, so
>> why do I get an access denied when running a query? Any assistance
>> would be valuable here, in that there are some great performance
>> increases with the metadata caching, and I don't want to miss out on
>> that.
>>
>> On Wed, Nov 11, 2015 at 2:18 PM, John Omernik <[email protected]> wrote:
>>
>>> All files are owned by mapr:mapr?
>>>
>>> I have a setup where mapr is the user running the drillbit, but then
>>> I have a directory that is owned by another user: mapradm:mapradm on
>>> all files. (Permissions on directories and files appear to be
>>> rwxr-x-r-x.) When I run REFRESH TABLE METADATA, the
>>> .drill.parquet_metadata file gets created as mapr:mapr with
>>> rwxr-xr-x.
>>>
>>> So:
>>> Drillbit user: mapr
>>> Directory (and subdirectories/files) owner: mapradm:mapradm
>>> Directory permissions (all files and folders under main directory):
>>> rwxr-x-r-x
>>>
>>> I authenticated to Drill via sqlline as user mapradm (this user
>>> should be able to read and write just fine to all directories).
>>>
>>> Now, one thing I did notice is that my mapr user was not in the
>>> mapradm group and therefore didn't have write permissions anywhere.
>>> When I fixed that on all nodes and then manually deleted the metadata
>>> files, things seem to be working. I wonder if that was my issue?
>>>
>>> Basically, the user running the drillbits needs to be able to write
>>> files (the .drill.parquet_metadata) or something bad will happen :)
>>> I will do more testing. This may be a good candidate for some
>>> documentation work to understand what permissions are required to be
>>> able to query these.
>>>
>>> On Wed, Nov 11, 2015 at 1:36 PM, Vince Gonzalez <[email protected]> wrote:
>>>
>>>> Hi John, I tried this and didn't find any issues. Let me know if I
>>>> didn't follow your reproduction faithfully.
>>>>
>>>> $ sqlline -u jdbc:drill: -n ec2-user -p mapr
>>>> apache drill 1.2.0
>>>> "drill baby drill"
>>>> 0: jdbc:drill:> refresh table metadata dfs.`/tmp/flows`;
>>>> +-------+------------------------------------------------------+
>>>> |  ok   |                       summary                        |
>>>> +-------+------------------------------------------------------+
>>>> | true  | Successfully updated metadata for table /tmp/flows.  |
>>>> +-------+------------------------------------------------------+
>>>> 1 row selected (32.27 seconds)
>>>> 0: jdbc:drill:> select srcIP,dstIP from dfs.`/tmp/flows` limit 12;
>>>> +---------------+---------------+
>>>> |     srcIP     |     dstIP     |
>>>> +---------------+---------------+
>>>> | 172.16.2.152  | 172.16.1.58   |
>>>> | 172.16.1.58   | 172.16.2.152  |
>>>> | 172.16.2.152  | 172.16.2.73   |
>>>> | 172.16.2.152  | 172.16.2.73   |
>>>> | 172.16.2.73   | 172.16.2.152  |
>>>> | 172.16.2.152  | 172.16.2.73   |
>>>> | 172.16.2.152  | 172.16.2.73   |
>>>> | 172.16.2.152  | 172.16.2.73   |
>>>> | 172.16.2.73   | 172.16.2.152  |
>>>> | 172.16.2.73   | 172.16.2.152  |
>>>> | 172.16.2.73   | 172.16.2.152  |
>>>> | 172.16.2.152  | 172.16.2.73   |
>>>> +---------------+---------------+
>>>> 12 rows selected (5.654 seconds)
>>>>
>>>> And here's what my table structure looks like (as seen via MapR NFS):
>>>>
>>>> $ tree /mapr/vgonzalez.drill/tmp/flows/ | head -15
>>>> /mapr/vgonzalez.drill/tmp/flows/
>>>> └── 2015
>>>>     └── 11
>>>>         ├── 10
>>>>         │   ├── 21
>>>>         │   │   ├── 39
>>>>         │   │   │   ├── 03
>>>>         │   │   │   │   ├── _common_metadata
>>>>         │   │   │   │   ├── _metadata
>>>>         │   │   │   │   ├── part-r-00000-853882bd-66d8-4505-96ba-f0a282e374de.gz.parquet
>>>>         │   │   │   │   └── _SUCCESS
>>>>         │   │   │   └── 20
>>>>         │   │   │       ├── _common_metadata
>>>>         │   │   │       ├── _metadata
>>>>         │   │   │       ├── part-r-00000-37a94549-8e56-46d5-be88-cb28e6d8bc35.gz.parquet
>>>>
>>>> My parquet was created in Spark, not Drill. Not sure if that's relevant.
>>>>
>>>> I have authentication and impersonation turned on, and the files are
>>>> owned by mapr:mapr.
>>>> Here's my drill-override.conf:
>>>>
>>>> drill.exec: {
>>>>   cluster-id: "vgonzalez_drill-drillbits",
>>>>   zk.connect: "ip-172-16-2-36.ec2.internal:5181,ip-172-16-2-37.ec2.internal:5181,ip-172-16-2-38.ec2.internal:5181"
>>>> }
>>>> drill.exec.impersonation: { enabled: true, max_chained_user_hops: 3 }
>>>> drill.exec { security.user.auth { enabled: true, packages +=
>>>> "org.apache.drill.exec.rpc.user.security", impl: "pam", pam_profiles: [
>>>> "login","sudo","sshd","password-auth" ] } }
>>>>
>>>> On Tue, Nov 10, 2015 at 1:17 PM, John Omernik <[email protected]> wrote:
>>>>
>>>>> Cool, looking forward to it.
>>>>>
>>>>> On Mon, Nov 9, 2015 at 7:21 PM, Vince Gonzalez <[email protected]> wrote:
>>>>>
>>>>>> Hey John, I have a secure cluster and some parquet files, I'll try
>>>>>> this out and report back.
>>>>>>
>>>>>> On Monday, November 9, 2015, John Omernik <[email protected]> wrote:
>>>>>>
>>>>>>> Has anyone been able to try/test this? I am curious whether it's
>>>>>>> only my issue or more of a bug, so I can open a JIRA if needed.
>>>>>>>
>>>>>>> John
>>>>>>>
>>>>>>> On Fri, Nov 6, 2015 at 11:06 AM, John Omernik <[email protected]> wrote:
>>>>>>>
>>>>>>>> If someone has authorization/authentication set up, to reproduce:
>>>>>>>>
>>>>>>>> Have a Parquet table with directories underneath the main one (I
>>>>>>>> have directories per day).
>>>>>>>>
>>>>>>>> Then issue REFRESH TABLE METADATA on the root of the table as an
>>>>>>>> authenticated user other than the drill bit user. (I am using
>>>>>>>> mapr; I used my user to run the query, and yes, I have access to
>>>>>>>> the data.)
>>>>>>>>
>>>>>>>> Then run a normal query and see what the result is.
>>>>>>>>
>>>>>>>> John
>>>>>>>>
>>>>>>>> On Fri, Nov 6, 2015 at 10:22 AM, Neeraja Rentachintala <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> This doesn't make sense and seems like a bug.
>>>>>>>>> I think the right behavior is for the Drillbit to access the
>>>>>>>>> cache as the Drillbit user at query time (there is no
>>>>>>>>> user-level metadata cache in Drill at this point).
>>>>>>>>>
>>>>>>>>> On Fri, Nov 6, 2015 at 6:57 AM, John Omernik <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> I ran REFRESH TABLE METADATA on a table; it completed
>>>>>>>>>> successfully.
>>>>>>>>>>
>>>>>>>>>> When I tried a subsequent query, I got an IOException:
>>>>>>>>>> Permission Denied on .drill.parquet_metadata.
>>>>>>>>>>
>>>>>>>>>> I am running Drill with authentication. I ran the REFRESH
>>>>>>>>>> TABLE METADATA as user X; it appears the
>>>>>>>>>> .drill.parquet_metadata was created and is owned by the user
>>>>>>>>>> the drill bits are running as, and is created with
>>>>>>>>>> -rwxr-x-r-x.
>>>>>>>>>>
>>>>>>>>>> My question is this: I can see why the file is owned by the
>>>>>>>>>> drill bit user, and the file is created with all-can-read
>>>>>>>>>> permissions, but why am I getting a permission denied when
>>>>>>>>>> user X is trying to run a query?
