Hi All,

Hive stores all of its data under /user/hive/warehouse (configurable). Each table has a directory under /user/hive/warehouse, and the table's data files are stored under that directory. A table can have more than one data file.

By default, the permission of /user/hive/warehouse is rwxrwxr-x. The table directories and data files are created with permissions determined by the HDFS umask property for files and directories.

    -sh-3.2$ hadoop fs -ls /user/hive/
    Found 1 items
    drwxrwxr-x   - hadoop supergroup          0 2011-10-26 08:58 /user/hive/warehouse
    -sh-3.2$ hadoop fs -ls /user/hive/warehouse
    Found 2 items
    drwxr-xr-x   - hadoop supergroup          0 2011-10-26 08:58 /user/hive/warehouse/u1_data
    drwxr-xr-x   - hadoop supergroup          0 2011-10-20 14:55 /user/hive/warehouse/u_data
    -sh-3.2$ hadoop fs -ls /user/hive/warehouse/u_data
    Found 1 items
    -rw-r--r--   3 bantony supergroup    1979173 2011-10-20 14:54 /user/hive/warehouse/u_data/u.data

With this setup, only users who belong to the group associated with /user/hive/warehouse can create tables in Hive and load data into them. If we want everybody to be able to create tables, we have to change the permission of /user/hive/warehouse to 777, but then anyone can delete other users' tables, so that is not secure.

So if I want different groups to work on different tables, the "hadoop" user has to change the group of each table directory to the relevant user group and give that group write permission.

In short:

1. Only superusers (or users belonging to a specific group) can create tables.
2. The table creator then has to change the group of the table directory to the required user group and give write permission to that group. (Alternatively, change the ownership to the user who requested the table and let him or her do whatever is needed.)

Is this a reasonable approach? Is there already a feature in Hive that automates this process?

cheers,
Benoy Antony
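
P.S. To make the manual step concrete, here is a rough sketch of what the table creator would have to run for each table. It is only an illustration under some assumptions: the group name "analysts" and the helper function are made up for this example, the commands must be run by the directory owner or the HDFS superuser, and the script prints the commands for review rather than running them.

```shell
#!/bin/sh
# Sketch only: hand one Hive table directory over to one user group.
# Assumptions: "analysts" is a hypothetical group name, and the warehouse
# lives at the default location shown in the listing above.
WAREHOUSE=/user/hive/warehouse

# Print the HDFS commands that would give GROUP write access to TABLE.
# Nothing is executed here; pipe the output to sh to apply the commands.
grant_table_to_group() {
    table=$1
    group=$2
    # Change the group of the table directory and everything under it.
    echo "hadoop fs -chgrp -R $group $WAREHOUSE/$table"
    # Add group write permission without touching the other bits.
    echo "hadoop fs -chmod -R g+w $WAREHOUSE/$table"
}

grant_table_to_group u1_data analysts
```

Using g+w instead of an absolute mode such as 775 avoids marking the data files as executable. This would have to be repeated for every new table, which is why I am asking whether Hive can do it automatically.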