Hi John, I tried this and didn't find any issues. Let me know if I didn't
follow your reproduction faithfully.

$ sqlline -u jdbc:drill: -n ec2-user -p mapr
apache drill 1.2.0
"drill baby drill"
0: jdbc:drill:> refresh table metadata dfs.`/tmp/flows`;
+-------+------------------------------------------------------+
|  ok   |                       summary                        |
+-------+------------------------------------------------------+
| true  | Successfully updated metadata for table /tmp/flows.  |
+-------+------------------------------------------------------+
1 row selected (32.27 seconds)
0: jdbc:drill:> select srcIP,dstIP from dfs.`/tmp/flows` limit 12;
+---------------+---------------+
|     srcIP     |     dstIP     |
+---------------+---------------+
| 172.16.2.152  | 172.16.1.58   |
| 172.16.1.58   | 172.16.2.152  |
| 172.16.2.152  | 172.16.2.73   |
| 172.16.2.152  | 172.16.2.73   |
| 172.16.2.73   | 172.16.2.152  |
| 172.16.2.152  | 172.16.2.73   |
| 172.16.2.152  | 172.16.2.73   |
| 172.16.2.152  | 172.16.2.73   |
| 172.16.2.73   | 172.16.2.152  |
| 172.16.2.73   | 172.16.2.152  |
| 172.16.2.73   | 172.16.2.152  |
| 172.16.2.152  | 172.16.2.73   |
+---------------+---------------+
12 rows selected (5.654 seconds)

And here's what my table structure looks like (as seen via MapR NFS):

$ tree /mapr/vgonzalez.drill/tmp/flows/ | head -15
/mapr/vgonzalez.drill/tmp/flows/
└── 2015
    └── 11
        ├── 10
        │   ├── 21
        │   │   ├── 39
        │   │   │   ├── 03
        │   │   │   │   ├── _common_metadata
        │   │   │   │   ├── _metadata
        │   │   │   │   ├── part-r-00000-853882bd-66d8-4505-96ba-f0a282e374de.gz.parquet
        │   │   │   │   └── _SUCCESS
        │   │   │   └── 20
        │   │   │       ├── _common_metadata
        │   │   │       ├── _metadata
        │   │   │       ├── part-r-00000-37a94549-8e56-46d5-be88-cb28e6d8bc35.gz.parquet
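
The listing above won't show dotfiles (tree hides them unless run with -a), so
the metadata cache files don't appear in it. In case it helps compare owner and
mode on both sides, here is a rough sketch of how they could be listed. The
helper name list_metadata_cache is made up for illustration, the file name
.drill.parquet_metadata is taken from the original report, and the sketch
assumes GNU find (for -printf) on the NFS client:

```shell
# Hypothetical helper (not part of Drill): list metadata cache files under a
# table root, printing symbolic mode, owner:group, and path for each one.
# Assumes GNU find, which provides -printf.
list_metadata_cache() {
  find "$1" -name '.drill.parquet_metadata' -printf '%M %u:%g %p\n'
}

# Example call, using the NFS path from the listing above:
# list_metadata_cache /mapr/vgonzalez.drill/tmp/flows
```

Each output line has the form "mode owner:group path", which should make it
easy to see whether the cache files are owned by the Drillbit user and whether
they are readable by others.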

My Parquet files were created in Spark, not Drill. Not sure if that's relevant.

I have authentication and impersonation turned on, and the files are owned
by mapr:mapr. Here's my drill-override.conf:

drill.exec: {
  cluster-id: "vgonzalez_drill-drillbits",
  zk.connect: "ip-172-16-2-36.ec2.internal:5181,ip-172-16-2-37.ec2.internal:5181,ip-172-16-2-38.ec2.internal:5181"
}
drill.exec.impersonation: { enabled: true, max_chained_user_hops: 3 }
drill.exec {
  security.user.auth {
    enabled: true,
    packages += "org.apache.drill.exec.rpc.user.security",
    impl: "pam",
    pam_profiles: [ "login", "sudo", "sshd", "password-auth" ]
  }
}





On Tue, Nov 10, 2015 at 1:17 PM, John Omernik <[email protected]> wrote:

> Cool, looking forward to it.
>
> On Mon, Nov 9, 2015 at 7:21 PM, Vince Gonzalez <[email protected]>
> wrote:
>
> > Hey John, I have a secure cluster and some Parquet files. I'll try this out
> > and report back.
> >
> > On Monday, November 9, 2015, John Omernik <[email protected]> wrote:
> >
> > > Has anyone been able to try/test this? I am curious whether it's only my
> > > issue or more of a bug, so I can open a JIRA if needed.
> > >
> > > John
> > >
> > > On Fri, Nov 6, 2015 at 11:06 AM, John Omernik <[email protected]> wrote:
> > >
> > > > If someone has authorization/authentication setup, to reproduce:
> > > >
> > > > Have a Parquet table with directories underneath the main (I have
> > > > directories per day)
> > > >
> > > > Then issue REFRESH TABLE METADATA on the root of the table, running as an
> > > > authenticated user other than the Drillbit user. (I am using MapR; I used
> > > > my user to run the query, and yes, I have access to the data.)
> > > >
> > > > Then run a normal query and see what the result is.
> > > >
> > > > John
> > > >
> > > > On Fri, Nov 6, 2015 at 10:22 AM, Neeraja Rentachintala <[email protected]> wrote:
> > > >
> > > >> This doesn't make sense and seems like a bug.
> > > >> I think the right behavior is for the Drillbit to access the cache as the
> > > >> Drillbit user at query time (there is no user-level metadata cache in
> > > >> Drill at this point).
> > > >>
> > > >>
> > > >>
> > > >> On Fri, Nov 6, 2015 at 6:57 AM, John Omernik <[email protected]> wrote:
> > > >>
> > > >> > I ran REFRESH TABLE METADATA on a table, and it completed successfully.
> > > >> >
> > > >> > When I tried a subsequent query, I got an IOException: Permission Denied
> > > >> > on .drill.parquet_metadata.
> > > >> >
> > > >> > I am running Drill with authentication. I ran REFRESH TABLE METADATA as
> > > >> > user X. It appears that .drill.parquet_metadata was created and is owned
> > > >> > by the user the Drillbits are running as, and it was created with
> > > >> > -rwxr-xr-x.
> > > >> >
> > > >> > My question is this: I can see why the file is owned by the Drillbit
> > > >> > user, and the file is created with world-readable permissions, so why
> > > >> > am I getting a permission denied error when user X tries to run a query?
> > > >> >
> > > >>
> > > >
> > > >
> > >
> >
>
