Currently, group information is not present in Table, and both owner and 
group information are absent from Database. If these were added to those 
classes, we could change Warehouse.mkdirs(). This method is also called from 
addPartition() - should we just use the table's owner/group in that case? 
That could potentially fail in the non-thrift case if some other user is 
creating the partitions. Otherwise, we would need to add owner/group to 
Partition as well, with the implication that table and partition owners 
could differ, causing query failures.
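
To make the addPartition() case concrete, a call site could look roughly 
like this (a sketch only: the mkdirs(Path, owner, group) overload is the 
one discussed in Paul's mail below, and Table.getGroup() does not exist 
today, though Table.getOwner() does):

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.metastore.Warehouse;
import org.apache.hadoop.hive.metastore.api.MetaException;
import org.apache.hadoop.hive.metastore.api.Table;

// Sketch only: create a partition directory owned by the table's
// owner/group, so a partition created by some other user still ends up
// owned consistently with its table.
public class PartitionDirSketch {
  static void makePartitionDir(Warehouse wh, Table tbl, Path partPath)
      throws MetaException {
    // mkdirs(Path, owner, group) is the hypothetical overload; getGroup()
    // would require adding a group field to Table first.
    wh.mkdirs(partPath, tbl.getOwner(), tbl.getGroup());
  }
}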

Paul's concern about security is valid but is there any other way around this?

Pradeep

-----Original Message-----
From: Paul Yang [mailto:[email protected]] 
Sent: Wednesday, July 14, 2010 3:18 PM
To: [email protected]
Subject: RE: Thrift metastore server and dfs file owner

Yeah, you could overload Warehouse.mkdirs() to allow specification of an 
owner/group and then use FileSystem.setOwner() within the method.
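
Something along these lines, say - just a sketch, since the overload and 
its error handling are hypothetical (getFs() and FileSystem.setOwner() are 
real, though):

import java.io.IOException;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.metastore.api.MetaException;

// Hypothetical overload for Warehouse -- not the current API.
public boolean mkdirs(Path f, String owner, String group) throws MetaException {
  try {
    FileSystem fs = getFs(f);  // existing Warehouse helper
    if (!fs.mkdirs(f)) {
      return false;
    }
    // Re-own the new directory. This only succeeds if the metastore
    // process is the DFS superuser -- hence the question below.
    fs.setOwner(f, owner, group);
    return true;
  } catch (IOException e) {
    throw new MetaException("Unable to create " + f + " for " + owner + ": "
        + e.getMessage());
  }
}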

If the thrift server has full permissions for DFS though, wouldn't this present 
a security hole? 

-----Original Message-----
From: Ashish Thusoo [mailto:[email protected]] 
Sent: Wednesday, July 14, 2010 12:34 PM
To: [email protected]
Subject: RE: Thrift metastore server and dfs file owner

We could just fix this in Warehouse.java so that the mkdirs call makes the 
directories according to the owner field passed in with the table. That 
would probably be a simple fix, no?

Ashish

-----Original Message-----
From: Pradeep Kamath [mailto:[email protected]] 
Sent: Wednesday, July 14, 2010 11:14 AM
To: [email protected]
Subject: RE: Thrift metastore server and dfs file owner

To answer your questions, here is what we have in hdfs-site.xml:

<property>
  <name>dfs.permissions</name>
  <value>true</value>
</property>
..
<property>
  <name>dfs.permissions.supergroup</name>
  <value>hdfs</value>
</property>

You mentioned: "I think the thrift server can use the dfs processor." - were 
you suggesting that the metastore implementation in HiveMetaStore should 
always do a chown user:user in create_table_core(), or selectively look at 
the conf, detect that it is being run as a thrift server, and chown only in 
that case?
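
In other words, something like this inside create_table_core(), right 
after the table directory is created (wh, conf, and tbl are the variables 
already in scope there; the conf key name is made up for illustration):

// Sketch: chown the new table directory to the table's owner, but only
// when running as a standalone thrift server. "hive.metastore.is.remote"
// is a made-up key -- whatever conf flag identifies the thrift case.
Path tblPath = wh.getDefaultTablePath(tbl.getDbName(), tbl.getTableName());
if (conf.getBoolean("hive.metastore.is.remote", false)) {
  FileSystem fs = wh.getFs(tblPath);
  // Leave group null so only the owner changes; needs DFS superuser.
  fs.setOwner(tblPath, tbl.getOwner(), null);
}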

Pradeep
 
-----Original Message-----
From: Edward Capriolo [mailto:[email protected]]
Sent: Tuesday, July 13, 2010 4:52 PM
To: [email protected]
Subject: Re: Thrift metastore server and dfs file owner

On Tue, Jul 13, 2010 at 6:20 PM, Pradeep Kamath <[email protected]> wrote:
> I tried:
> hive -e "set user.name=$USER;create table foo2 ( name string);"
>
> My warehouse table dir still got created by "root" (the user my thrift
> server is running as):
>
> drwxr-xr-x   - root supergroup          0 2010-07-13 15:19 /user/pradeepk/hive/warehouse/foo2
>
> -----Original Message-----
> From: Edward Capriolo [mailto:[email protected]]
> Sent: Tuesday, July 13, 2010 2:47 PM
> To: [email protected]
> Subject: Re: Thrift metastore server and dfs file owner
>
> On Tue, Jul 13, 2010 at 5:04 PM, Pradeep Kamath <[email protected]> 
> wrote:
>> Hi,
>>
>>    I suspect this is true but wanted to confirm: If I start a thrift 
>> metastore service as user "joe" then all internal tables created will 
>> have directories under the warehouse directory owned by "joe" 
>> regardless of the actual user running the create table statement - is 
>> this correct? There is no way for the thrift server to create the directory 
>> as the actual user?
>> However if thrift service is not used and the hive client directly 
>> works against the metastore database, then the directories are 
>> created by the actual user - is this correct?
>>
>>
>>
>> Thanks,
>>
>> Pradeep
>
> The hive web interface does this:
>
>    queries.add("set hadoop.job.ugi=" + auth.getUser() + ","
>        + auth.getGroups()[0]);
>    queries.add("set user.name=" + auth.getUser());
>
> You should be able to accomplish the same thing using set commands 
> with the Thrift Server to impersonate.
>
> Regards,
> Edward
>

You are right. That technique may only affect files created during the 
map/reduce job. I think the thrift server can use the dfs processor.

hive> dfs -chown user:user /user/hive/warehouse/foo2;

Questions:
Who is your hadoop superuser?
Are you enforcing dfs permissions?

If you are enforcing permissions, only the Hadoop superuser (hadoop) will be 
able to chown files to other users and groups.
