[ 
https://issues.apache.org/jira/browse/DRILL-4143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15233952#comment-15233952
 ] 

ASF GitHub Bot commented on DRILL-4143:
---------------------------------------

Github user chunhui-shi commented on the pull request:

    https://github.com/apache/drill/pull/470#issuecomment-207939397
  
    This function is one entry point for accessing metadata. A write (the 
automatic metadata update) happens during this 'read' if any directory or file 
under the table's root has changed and therefore has a newer modification 
time. In the other code paths, the metadata is already accessed with the 
permissions of drillbituser. This fix makes this code path consistent with 
the others, because the inconsistency in the permissions used to access 
metadata is the root cause of this JIRA.
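
To make the 'read that may turn into a write' behaviour concrete, here is a 
minimal sketch of the staleness check described above. It is not Drill's 
actual code: the function name metadata_is_stale is hypothetical, and the 
only detail taken from this thread is the cache file name 
.drill.parquet_metadata. It assumes a local filesystem and compares 
modification times only.

```python
import os

# Cache file name as it appears in the issue; everything else is illustrative.
METADATA_FILE = ".drill.parquet_metadata"


def metadata_is_stale(table_root):
    """Return True if the cache file at the table root is missing, or if any
    directory or data file under table_root was modified after it was written."""
    cache_path = os.path.join(table_root, METADATA_FILE)
    if not os.path.exists(cache_path):
        return True
    cache_mtime = os.path.getmtime(cache_path)

    for dirpath, _dirnames, filenames in os.walk(table_root):
        # A directory's mtime changes when entries are added or removed.
        if os.path.getmtime(dirpath) > cache_mtime:
            return True
        for name in filenames:
            if name == METADATA_FILE:
                continue  # ignore the cache files themselves
            if os.path.getmtime(os.path.join(dirpath, name)) > cache_mtime:
                return True
    return False
```

When a check like this reports stale metadata, the reader has to rewrite the 
cache file. If that rewrite runs as the querying user (dataowner) while the 
existing file is owned by drillbituser, it fails in the way reported below.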


> REFRESH TABLE METADATA - Permission Issues with metadata files
> --------------------------------------------------------------
>
>                 Key: DRILL-4143
>                 URL: https://issues.apache.org/jira/browse/DRILL-4143
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>    Affects Versions: 1.3.0, 1.4.0
>            Reporter: John Omernik
>            Assignee: Chunhui Shi
>              Labels: Metadata, Parquet, Permissions
>             Fix For: Future
>
>
> Summary of Refresh Metadata Issues confirmed by two different users on Drill 
> User Mailing list. (Title: REFRESH TABLE METADATA - Access Denied)
> This issue pertains to table METADATA and revolves around user 
> authentication. 
> Basically, when the drill bits are running as one user, and the data is owned 
> by another user, there can be access denied issues on subsequent queries 
> after issuing a REFRESH TABLE METADATA command. 
> To troubleshoot what is actually happening, I turned on MapR Auditing (a 
> handy feature) and ran a query that gives me access denied 
> (select count(1) from testtable). Per MapR, the user I am logged in as 
> (dataowner) is trying to do a create operation on the 
> .drill.parquet_metadata file, and it fails with status: 17. Per Keys at 
> MapR, "status 17 means errno 17 which means EEXIST. Looks like Drill is 
> trying to create a file that already exists." This seems to indicate that 
> Drill is perhaps trying to create the .drill.parquet_metadata file on each 
> select as the dataowner user, but the permissions (as seen below) don't 
> allow it. 
> Here are the steps to reproduce:
> Enable Authentication. 
> Run all drill bits in the cluster as "drillbituser", then have the files 
> owned by "dataowner". Note that the permissions on the root of the table are 
> drwxrwxr-x, but as Drill loads each partition it loads them as drwxr-xr-x 
> (all with dataowner:dataowner ownership). That may be a factor too: the 
> default permissions when creating a table? Another note: in my setup, 
> drillbituser is in the group for dataowner, so it should always have read 
> access. 
> # Authenticated as dataowner (this should have full permissions to all the 
> data)
> Enter username for jdbc:drill:zk=zknode1:5181: dataowner
> Enter password for jdbc:drill:zk=zknode1:5181: **********
> 0: jdbc:drill:zk=zknode1> use dfs.dev;
> +-------+--------------------------------------+
> |  ok   |               summary                |
> +-------+--------------------------------------+
> | true  | Default schema changed to [dfs.dev]  |
> +-------+--------------------------------------+
> 1 row selected (0.307 seconds)
> # The query works fine with no table metadata
> 0: jdbc:drill:zk=zknode1> select count(1) from `testtable`;
> +-----------+
> |  EXPR$0   |
> +-----------+
> | 24565203  |
> +-----------+
> 1 row selected (3.392 seconds)
> # Refresh of metadata works with no errors
> 0: jdbc:drill:zk=zknode1> refresh table metadata `testtable`;
> +-------+-------------------------------------------------------+
> |  ok   |                        summary                        |
> +-------+-------------------------------------------------------+
> | true  | Successfully updated metadata for table testtable.  |
> +-------+-------------------------------------------------------+
> 1 row selected (5.767 seconds)
>  
> # Running the same query again returns an access denied error. 
> 0: jdbc:drill:zk=zknode1> select count(1) from `testtable`;
> Error: SYSTEM ERROR: IOException: 2127.7646.2950962 
> /data/dev/testtable/2015-11-12/.drill.parquet_metadata (Permission denied)
>  
>  
> [Error Id: 7bfce2e7-f78d-4fba-b047-f4c85b471de4 on node1:31010] 
> (state=,code=0)
>  
>  
> # Note how all the files are owned by the drillbituser. Per discussion on 
> list, this is normal 
>  
> $ find ./ -type f -name ".drill.parquet_metadata" -exec ls -ls {} \;
> 726 -rwxr-xr-x 1 drillbituser drillbituser 742837 Nov 30 14:27 
> ./2015-11-12/.drill.parquet_metadata
> 583 -rwxr-xr-x 1 drillbituser drillbituser 596146 Nov 30 14:27 
> ./2015-11-29/.drill.parquet_metadata
> 756 -rwxr-xr-x 1 drillbituser drillbituser 773811 Nov 30 14:27 
> ./2015-11-11/.drill.parquet_metadata
> 763 -rwxr-xr-x 1 drillbituser drillbituser 780829 Nov 30 14:27 
> ./2015-11-04/.drill.parquet_metadata
> 632 -rwxr-xr-x 1 drillbituser drillbituser 646851 Nov 30 14:27 
> ./2015-11-08/.drill.parquet_metadata
> 845 -rwxr-xr-x 1 drillbituser drillbituser 864421 Nov 30 14:27 
> ./2015-11-05/.drill.parquet_metadata
> 771 -rwxr-xr-x 1 drillbituser drillbituser 788823 Nov 30 14:27 
> ./2015-11-28/.drill.parquet_metadata
> 1273 -rwxr-xr-x 1 drillbituser drillbituser 1303168 Nov 30 14:27 
> ./2015-11-10/.drill.parquet_metadata
> 645 -rwxr-xr-x 1 drillbituser drillbituser 660028 Nov 30 14:27 
> ./2015-11-22/.drill.parquet_metadata
> 1017 -rwxr-xr-x 1 drillbituser drillbituser 1040469 Nov 30 14:27 
> ./2015-11-13/.drill.parquet_metadata
> 1280 -rwxr-xr-x 1 drillbituser drillbituser 1310552 Nov 30 14:27 
> ./2015-11-03/.drill.parquet_metadata
> 585 -rwxr-xr-x 1 drillbituser drillbituser 598973 Nov 30 14:27 
> ./2015-11-07/.drill.parquet_metadata
> 737 -rwxr-xr-x 1 drillbituser drillbituser 754103 Nov 30 14:27 
> ./2015-11-20/.drill.parquet_metadata
> 836 -rwxr-xr-x 1 drillbituser drillbituser 855363 Nov 30 14:27 
> ./2015-11-16/.drill.parquet_metadata
> 646 -rwxr-xr-x 1 drillbituser drillbituser 661134 Nov 30 14:27 
> ./2015-11-27/.drill.parquet_metadata
> 1156 -rwxr-xr-x 1 drillbituser drillbituser 1183378 Nov 30 14:27 
> ./2015-11-21/.drill.parquet_metadata
> 679 -rwxr-xr-x 1 drillbituser drillbituser 694551 Nov 30 14:27 
> ./2015-11-09/.drill.parquet_metadata
> 631 -rwxr-xr-x 1 drillbituser drillbituser 645500 Nov 30 14:27 
> ./2015-11-26/.drill.parquet_metadata
> 190 -rwxr-xr-x 1 drillbituser drillbituser 193798 Nov 30 14:27 
> ./2015-11-18/.drill.parquet_metadata
> 221 -rwxr-xr-x 1 drillbituser drillbituser 225342 Nov 30 14:27 
> ./2015-11-23/.drill.parquet_metadata
> 269 -rwxr-xr-x 1 drillbituser drillbituser 274988 Nov 30 14:27 
> ./2015-11-25/.drill.parquet_metadata
> 845 -rwxr-xr-x 1 drillbituser drillbituser 864615 Nov 30 14:27 
> ./2015-11-06/.drill.parquet_metadata
> 32 -rwxr-xr-x 1 drillbituser drillbituser 32221 Nov 30 14:27 
> ./2015-11-19/.drill.parquet_metadata
> 16468 -rwxr-xr-x 1 drillbituser drillbituser 16862958 Nov 30 14:27 
> ./.drill.parquet_metadata
>  
> # Once the files are deleted... 
> $ sudo find ./ -type f -name ".drill.parquet_metadata" -exec rm {} \;
> # Then the query works again 
> 0: jdbc:drill:zk=zknode1> select count(1) from `testtable`;
> +-----------+
> |  EXPR$0   |
> +-----------+
> | 24567700  |
> +-----------+
> 1 row selected (1.353 seconds)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
