[jira] [Commented] (HDFS-13322) fuse dfs - uid persists when switching between ticket caches

Istvan Fajth (JIRA) Wed, 14 Aug 2019 01:08:43 -0700


    [ 
https://issues.apache.org/jira/browse/HDFS-13322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16907005#comment-16907005
 ]


Istvan Fajth commented on HDFS-13322:
-------------------------------------

Further information that we discovered and discussed with [~abukor] and 
[~wolfosis] after digging into this together a bit more:

The FUSE context struct does not expose anything from the caller's environment, 
mostly because that is not possible from the FUSE code's perspective. The 
struct we get in the calls coming into the fuse-dfs code contains the uid, gid, 
pid and umask of the caller, the fuse struct (which mostly contains 
implementation details and mount args), and private data that the FS itself 
exposes. See the [code 
here|https://github.com/libfuse/libfuse/blob/master/include/fuse.h#L786-L804].

The limitation comes from deeper levels of the OS and of process handling in 
POSIX, and is summarized pretty well in [this StackExchange 
question|https://unix.stackexchange.com/questions/29128/how-to-read-environment-variables-of-a-process]
(see the answer from Jonathan Ben-Avraham, edited by Toby Speight, currently 
the second answer).
In a nutshell: when the kernel executes and starts a process, it puts the 
initial environment onto the stack of the process in a fixed-length structure, 
and this area of the stack is exposed via the /proc/[pid]/environ path. After 
the process has started, it has a global __environ variable that libc routines 
allocate on the heap of the process and update every time the environment 
changes. This area is not readily accessible to other processes or to the 
kernel: access is restricted, and you need ptrace, the symbol table of the 
caller process, and access permissions to the memory of the other process to 
read it.
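
To make the /proc/[pid]/environ behaviour concrete, here is a minimal Python 
sketch of reading a variable from the initial environment of a process. It is 
an illustration only and not the fuse-dfs implementation; the helper names 
parse_environ_block and initial_env_var are hypothetical, and it assumes a 
Linux /proc filesystem plus permission to read the target's environ file.

```python
import os
from typing import Dict, Optional


def parse_environ_block(raw: bytes) -> Dict[str, str]:
    """Parse a NUL-separated KEY=VALUE block, the layout of /proc/[pid]/environ."""
    env: Dict[str, str] = {}
    for entry in raw.split(b"\0"):
        # Skip the empty trailing entry and anything that is not KEY=VALUE.
        if not entry or b"=" not in entry:
            continue
        key, _, value = entry.partition(b"=")
        env[os.fsdecode(key)] = os.fsdecode(value)
    return env


def initial_env_var(pid: int, name: str) -> Optional[str]:
    """Return `name` from the initial environment of `pid`, or None.

    Only the environment the process was started with is visible here;
    later changes made inside the process are not (hypothetical helper).
    """
    try:
        with open(f"/proc/{pid}/environ", "rb") as f:
            return parse_environ_block(f.read()).get(name)
    except OSError:
        # Process is gone, /proc is unavailable, or access is denied.
        return None
```

For example, initial_env_var(os.getpid(), "KRB5CCNAME") would be expected to 
keep returning the value the interpreter was started with, even after the 
process changes the variable at runtime, since runtime changes do not reach 
the area exposed under /proc.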

Based on this research, the limitation is not something we can resolve from the 
fuse-dfs code, so if you need to use this feature, you need to apply the 
workaround and ensure that every process accessing dfs via the fuse mount has 
the proper initial environment (i.e. it is forked from a process that already 
has the environment variable set, KRB5CCNAME in the example quoted below, so 
that the forked process inherits it in its initial environment).
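
As an illustration of the workaround, a job launcher could start each job as a 
child process whose initial environment already carries the right ticket cache. 
The Python sketch below is a hedged example only: run_with_ticket_cache is a 
hypothetical helper and the command and cache path are placeholders; only the 
KRB5CCNAME variable name comes from the issue description.

```python
import os
import subprocess
from typing import Dict, List, Optional


def run_with_ticket_cache(
    cmd: List[str],
    ccache: str,
    base_env: Optional[Dict[str, str]] = None,
) -> subprocess.CompletedProcess:
    """Run `cmd` in a child whose initial environment has KRB5CCNAME=ccache.

    The variable is put into the environment passed at process creation,
    so it is part of the child's initial environment instead of being
    set later from inside the child (hypothetical helper).
    """
    env = dict(os.environ if base_env is None else base_env)
    env["KRB5CCNAME"] = ccache
    return subprocess.run(
        cmd, env=env, check=True, capture_output=True, text=True
    )
```

A scheduler could then call something like 
run_with_ticket_cache(["touch", "/fuse_mount/tmp/testfile2"], "/tmp/krb5cc_session2") 
once per job, instead of exporting KRB5CCNAME inside an already running process 
and accessing the mount from that same process.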

> fuse dfs - uid persists when switching between ticket caches
> ------------------------------------------------------------
>
>                 Key: HDFS-13322
>                 URL: https://issues.apache.org/jira/browse/HDFS-13322
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: fuse-dfs
>    Affects Versions: 2.6.0
>         Environment: Linux xxxxxx.xx.xx.xxx 3.10.0-514.el7.x86_64 #1 SMP Wed 
> Oct 19 11:24:13 EDT 2016 x86_64 x86_64 x86_64 GNU/Linux
>  
>            Reporter: Shoeb Sheyx
>            Assignee: Istvan Fajth
>            Priority: Minor
>             Fix For: 3.2.0
>
>         Attachments: HDFS-13322.001.patch, HDFS-13322.002.patch, 
> HDFS-13322.003.patch, TestFuse.java, TestFuse2.java, catter.sh, catter2.sh, 
> perftest_new_behaviour_10k_different_1KB.txt, perftest_new_behaviour_1B.txt, 
> perftest_new_behaviour_1KB.txt, perftest_new_behaviour_1MB.txt, 
> perftest_old_behaviour_10k_different_1KB.txt, perftest_old_behaviour_1B.txt, 
> perftest_old_behaviour_1KB.txt, perftest_old_behaviour_1MB.txt, 
> testHDFS-13322.sh, test_after_patch.out, test_before_patch.out
>
>
> The symptoms of this issue are the same as described in HDFS-3608 except the 
> workaround that was applied (detect changes in UID ticket cache) doesn't 
> resolve the issue when multiple ticket caches are in use by the same user.
> Our use case requires that a job scheduler running as a specific uid obtain 
> separate kerberos sessions per job and that each of these sessions use a 
> separate cache. When switching sessions this way, no change is made to the 
> original ticket cache so the cached filesystem instance doesn't get 
> regenerated.
>  
> {{$ export KRB5CCNAME=/tmp/krb5cc_session1}}
> {{$ kinit user_a@domain}}
> {{$ touch /fuse_mount/tmp/testfile1}}
> {{$ ls -l /fuse_mount/tmp/testfile1}}
> {{ *-rwxrwxr-x 1 user_a user_a 0 Mar 21 13:37 /fuse_mount/tmp/testfile1*}}
> {{$ export KRB5CCNAME=/tmp/krb5cc_session2}}
> {{$ kinit user_b@domain}}
> {{$ touch /fuse_mount/tmp/testfile2}}
> {{$ ls -l /fuse_mount/tmp/testfile2}}
> {{ *-rwxrwxr-x 1 user_a user_a 0 Mar 21 13:37 /fuse_mount/tmp/testfile2*}}
> {{   }}{color:#d04437}*{{** expected owner to be user_b **}}*{color}



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
