-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/12050/#review23858
-----------------------------------------------------------

Ship it!

- Ashutosh Chauhan


On July 19, 2013, 6:55 p.m., Chaoyu Tang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/12050/
> -----------------------------------------------------------
> 
> (Updated July 19, 2013, 6:55 p.m.)
> 
> 
> Review request for hive, Ashutosh Chauhan and Sushanth Sowmyan.
> 
> 
> Bugs: HIVE- and HIVE-3756
>     https://issues.apache.org/jira/browse/HIVE-
>     https://issues.apache.org/jira/browse/HIVE-3756
> 
> 
> Repository: hive-git
> 
> 
> Description
> -------
> 
> Problems:
> 1. When running load data or insert overwrite on a table, the data files 
> under the database/table directory do not inherit their parent's permissions 
> (i.e. group), as described in HIVE-3756.
> 2. Besides the group, the read/write permission mode is not inherited either.
> 3. The same problem affects partition files (see HIVE-3094).
> 
> Cause:
> The task results (from load data or insert overwrite) are initially stored in 
> the scratchdir and then moved under the warehouse table directory. This step 
> (e.g. LoadTable/LoadPartition) uses FileSystem.rename to move the dirs/files, 
> which preserves their permissions (including group and mode) as determined by 
> the scratchdir permissions or the umask. The problem occurs whenever the 
> scratchdir has different permissions from the warehouse table directories.
> 
> Solution:
> After FileSystem.rename is called, change all renamed (moved) files/dirs to 
> their destination parents' permissions if needed (i.e. if 
> hive.warehouse.subdir.inherit.perms is true). I introduced a new method, 
> renameFile, that does both the rename and the permission change. It replaces 
> the FileSystem.rename calls used in LoadTable/LoadPartition. I did not replace 
> the renames that move files/dirs within the same scratchdir in the middle of 
> task processing. That does not seem necessary, since they are temp files and 
> are probably also access-protected by the top-level scratchdir mode 700 
> (HIVE-4487).
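
The rename-then-inherit step described above can be sketched roughly as follows. This is only a local-filesystem analogy built on java.nio.file, not the Hadoop FileSystem code from the patch. The class name, the inheritPerms parameter (standing in for hive.warehouse.subdir.inherit.perms), and the choice to drop execute bits on regular files are assumptions made for illustration, and the sketch assumes a POSIX file system.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.PosixFileAttributeView;
import java.nio.file.attribute.PosixFileAttributes;
import java.nio.file.attribute.PosixFilePermission;
import java.nio.file.attribute.PosixFilePermissions;
import java.util.Comparator;
import java.util.EnumSet;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class RenameInheritSketch {

    private static final Set<PosixFilePermission> EXEC_BITS = EnumSet.of(
            PosixFilePermission.OWNER_EXECUTE,
            PosixFilePermission.GROUP_EXECUTE,
            PosixFilePermission.OTHERS_EXECUTE);

    // Hypothetical stand-in for the renameFile helper: move src to dst, then
    // (optionally) give everything that was moved the group and mode of dst's parent.
    static void renameFile(Path src, Path dst, boolean inheritPerms) throws IOException {
        Files.move(src, dst); // a plain rename keeps the source's group and mode
        if (!inheritPerms) {
            return;
        }
        PosixFileAttributes parent =
                Files.readAttributes(dst.getParent(), PosixFileAttributes.class);

        // Collect the moved tree, children before their directories, so that a
        // restrictive directory mode cannot block updates to its children.
        List<Path> moved;
        try (Stream<Path> walk = Files.walk(dst)) {
            moved = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        for (Path p : moved) {
            // Assumes a POSIX file system; the view would be null otherwise.
            PosixFileAttributeView view =
                    Files.getFileAttributeView(p, PosixFileAttributeView.class);
            if (!view.readAttributes().group().equals(parent.group())) {
                view.setGroup(parent.group());
            }
            Set<PosixFilePermission> mode = new HashSet<>(parent.permissions());
            if (Files.isRegularFile(p)) {
                mode.removeAll(EXEC_BITS); // assumption: data files need no execute bits
            }
            view.setPermissions(mode);
        }
    }

    // Small demo: a file created under a 700 "scratch" dir is moved under a
    // 770 "table" dir; returns the file's mode string after the move.
    static String demo() {
        try {
            Path root = Files.createTempDirectory("inherit-sketch");
            Path scratch = Files.createDirectory(root.resolve("scratch"));
            Path table = Files.createDirectory(root.resolve("table"));
            Files.setPosixFilePermissions(scratch, PosixFilePermissions.fromString("rwx------"));
            Files.setPosixFilePermissions(table, PosixFilePermissions.fromString("rwxrwx---"));

            Path result = Files.createFile(scratch.resolve("000000_0"));
            Files.setPosixFilePermissions(result, PosixFilePermissions.fromString("rw-------"));

            Path dst = table.resolve("000000_0");
            renameFile(result, dst, true);
            return PosixFilePermissions.toString(Files.getPosixFilePermissions(dst));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Passing false for inheritPerms leaves the moved file with whatever group and mode it had in the scratch directory, which is the behavior described under Cause.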
> 
> 
> Diffs
> -----
> 
>   ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java 87a584d 
> 
> Diff: https://reviews.apache.org/r/12050/diff/
> 
> 
> Testing
> -------
> 
> The following cases were tested to verify that all created subdirs/files 
> inherit their parents' permission mode and group: 1) create database; 2) 
> create table; 3) load data; 4) insert overwrite; 5) partitions.
> {code}
> hive> dfs -ls -d /user/tester1/hive;
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:20 /user/tester1/hive
> 
> hive> create database tester1 COMMENT 'Database for user tester1' LOCATION '/user/tester1/hive/tester1.db';
> hive> dfs -ls -R /user/tester1/hive;
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:21 /user/tester1/hive/tester1.db
> 
> hive> use tester1;
> hive> create table tester1.tst1(col1 int, col2 string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;
> hive> dfs -ls -R /user/tester1/hive;
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:22 /user/tester1/hive/tester1.db
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:22 /user/tester1/hive/tester1.db/tst1
> 
> hive> load data local inpath '/home/tester1/tst1.input' into table tst1;
> hive> dfs -ls -R /user/tester1/hive;
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:22 /user/tester1/hive/tester1.db
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:23 /user/tester1/hive/tester1.db/tst1
> -rw-rw----   3 tester1 testgroup123        168 2013-06-22 13:23 /user/tester1/hive/tester1.db/tst1/tst1.input
> 
> hive> create table tester1.tst2(col1 int, col2 string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS SEQUENCEFILE;
> hive> dfs -ls -R /user/tester1/hive;
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:24 /user/tester1/hive/tester1.db
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:23 /user/tester1/hive/tester1.db/tst1
> -rw-rw----   3 tester1 testgroup123        168 2013-06-22 13:23 /user/tester1/hive/tester1.db/tst1/tst1.input
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:24 /user/tester1/hive/tester1.db/tst2
> 
> hive> insert overwrite table tst2 select * from tst1;
> hive> dfs -ls -R /user/tester1/hive;
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:25 /user/tester1/hive/tester1.db
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:23 /user/tester1/hive/tester1.db/tst1
> -rw-rw----   3 tester1 testgroup123        168 2013-06-22 13:23 /user/tester1/hive/tester1.db/tst1/tst1.input
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:25 /user/tester1/hive/tester1.db/tst2
> -rw-rw----   3 tester1 testgroup123        291 2013-06-22 13:25 /user/tester1/hive/tester1.db/tst2/000000_0
> 
> hive> create table tester1.tst3(col2 string) PARTITIONED BY (col1 int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE;
> hive> dfs -ls -R /user/tester1/hive;
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:27 /user/tester1/hive/tester1.db
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:23 /user/tester1/hive/tester1.db/tst1
> -rw-rw----   3 tester1 testgroup123        168 2013-06-22 13:23 /user/tester1/hive/tester1.db/tst1/tst1.input
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:25 /user/tester1/hive/tester1.db/tst2
> -rw-rw----   3 tester1 testgroup123        291 2013-06-22 13:25 /user/tester1/hive/tester1.db/tst2/000000_0
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:27 /user/tester1/hive/tester1.db/tst3
> 
> hive> set hive.exec.dynamic.partition.mode=nonstrict;
> hive> insert overwrite table tester1.tst3 partition (col1) select t1.col2, t1.col1 from tst1 t1;
> hive> dfs -ls -R /user/tester1/hive;
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:27 /user/tester1/hive/tester1.db
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:23 /user/tester1/hive/tester1.db/tst1
> -rw-rw----   3 tester1 testgroup123        168 2013-06-22 13:23 /user/tester1/hive/tester1.db/tst1/tst1.input
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:25 /user/tester1/hive/tester1.db/tst2
> -rw-rw----   3 tester1 testgroup123        291 2013-06-22 13:25 /user/tester1/hive/tester1.db/tst2/000000_0
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:28 /user/tester1/hive/tester1.db/tst3
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:28 /user/tester1/hive/tester1.db/tst3/col1=1111
> -rw-rw----   3 tester1 testgroup123         51 2013-06-22 13:28 /user/tester1/hive/tester1.db/tst3/col1=1111/000000_0
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:28 /user/tester1/hive/tester1.db/tst3/col1=2222
> -rw-rw----   3 tester1 testgroup123         51 2013-06-22 13:28 /user/tester1/hive/tester1.db/tst3/col1=2222/000000_0
> drwxrwx---   - tester1 testgroup123          0 2013-06-22 13:28 /user/tester1/hive/tester1.db/tst3/col1=3333
> -rw-rw----   3 tester1 testgroup123         51 2013-06-22 13:28 /user/tester1/hive/tester1.db/tst3/col1=3333/000000_0
> {code}
> 
> 
> Thanks,
> 
> Chaoyu Tang
> 
>
