Ideally, HDFS should allow pluggable external authorization. Then the privilege inconsistency problem would go away.
On Wed, Oct 13, 2010 at 4:49 PM, Pradeep Kamath <prade...@yahoo-inc.com> wrote:
> One related concern with not using hdfs permissions is that there can be
> conflicts between what the hive authorization realm would permit versus what
> hdfs would permit.
>
> For instance a user X (in the hive authorization realm) has create table
> privilege for database db1 but the hdfs directory /user/hive/warehouse/db1 is
> actually not writable by user X - wouldn't this lead to a dfs permissions
> denied error though user X has the create privilege per hive? We can extend
> the same issue to other operations like drop table etc.
>
> Keep the two worlds in sync so that what is allowed/disallowed in one is the
> same in the other might be difficult - thoughts?
>
> -----Original Message-----
> From: John Sichi [mailto:jsi...@facebook.com]
> Sent: Wednesday, October 13, 2010 4:36 PM
> To: <dev@hive.apache.org>
> Cc: howl...@yahoogroups.com; Pradeep Kamath; <hive-...@hadoop.apache.org>
> Subject: Re: [howldev] RE: Howl Authorization proposal
>
> On Oct 13, 2010, at 9:22 AM, Alan Gates wrote:
>
>> Our biggest concern is that HDFS already has a permissions model, why create
>> a whole new one? It is a lot of duplication. And that duplication will
>> flow through to things like logging and auditing, all of which Hive/Howl
>> will now need in addition to HDFS. To justify this we needed to understand
>> what additional benefits a traditional ACL model would get us. We were not
>> able to come up with compelling use cases where we had to have this
>> traditional model.
>
> Here are some you probably already considered, but I'm listing them for
> consideration anyway...
>
> * table A can only be queried by roles X and Y; table B can only be queried
> by roles Y and Z; managing different groups for all the possible role
> combinations isn't very practical given large numbers of tables and roles
>
> * finer-grained access control (e.g. column-level) may not be expressible in
> terms of HDFS permissions without doing things like creating dummy files
> (although in SQL, views can be used to avoid column-level permissions)
>
> * privileges beyond read/write (e.g. delete vs update vs append)
>
> * (Hive-specific): GRANT/REVOKE is the standard SQL approach and requires
> ACL's (it can't be implemented in terms of HDFS permissions)
>
>> All that said, I see no problem with having two models for now, and seeing
>> which turns out to better provide what users need and/or be easier to
>> maintain.
>
> OK, let us know if the hooks turn out to be insufficient as the
> implementation mechanism.
>
> JVS