Having a "skip chown" option sounds good to me. We'll add the option to CommandInfo.URI so that frameworks can override the default if desired. Mind filing a ticket?
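A minimal sketch of how such an option might behave, written in Python rather than the fetcher's C++ and not taken from the Mesos source. `chown_tree` and its `skip_chown` parameter are hypothetical names, and the `chown` argument exists only so the behavior can be exercised without root. The default keeps the current behavior (chown after extraction); setting the flag leaves ownership and mode bits as extracted.

```python
import os


def chown_tree(root, uid, gid, skip_chown=False, chown=os.chown):
    """Recursively chown `root` to uid:gid unless skip_chown is set.

    Hypothetical helper: returns the list of paths that were chowned.
    Note that os.chown follows symlinks; a real implementation would
    need to decide how to treat them.
    """
    if skip_chown:
        # Leave ownership and mode bits exactly as the extraction left them.
        return []

    changed = []
    chown(root, uid, gid)
    changed.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            chown(path, uid, gid)
            changed.append(path)
    return changed
```

With `skip_chown=True` the function returns an empty list and never calls `chown`, so this step would not touch whatever setuid bits the extraction left in place.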
On Thu, Sep 11, 2014 at 5:00 AM, John Omernik <j...@omernik.com> wrote:

> Vinod -
>
> I believe this is EXACTLY the issue. I also understand why in most cases this is OK. If a user is provided, then a fair assumption would be to chown the extracted archive as that user (assuming the untar is happening as root in all cases). So that leads to three components we may want to make customizable by the framework:
>
> 1. Who untars the archive. Right now, it appears root untars the archive (otherwise the chown would be unnecessary: if the user untarred the archive, the user would already have permissions). If it is root, perhaps this is "ok" to leave as is? Another option may be to set the untar user separately from the running user, but I am not sure we'd need to if root always untars.
>
> 2. Pass a flag from the framework that allows skipping the chown. For compatibility's sake, the flag would default to "off" so that it wouldn't break existing things, but if the framework wanted, it could tell the slave that the permissions are fine as they are set and there is no need to chown. I am not sure I understand the architecture of Mesos well enough yet to comment on the best way to do this. Should it be a framework variable? (Frameworks would have to be updated to make use of this.) A string in the filename (could this be abused?), etc.
>
> 3. The user that runs the executor. This is already passed, and I am not sure we need to change anything here. As long as (A) root untars the archive, and (B) we have the ability to skip the chown, the user handling should be perfectly OK as is. This way, in my case, root would untar the archive, I could set the skip on chown, and then I'd have the user hadoop run the framework. In this model, the LinuxTaskController should work.
>
> Thanks for looking into this, I welcome more thoughts on the subject.
>
> John
>
> On Wed, Sep 10, 2014 at 4:39 PM, Vinod Kone <vinodk...@gmail.com> wrote:
>
>> IanD: Mind helping John out here?
>>
>> My hunch here is that this is because the slave does "chown()" after extracting (https://github.com/apache/mesos/blob/master/src/launcher/fetcher.cpp#L258)?
>>
>> From the POSIX standard, it looks like chown() when invoked by root doesn't clear the setuid bit for ordinary files but clears it for other types (e.g., binary).
>>
>> http://unix.stackexchange.com/questions/53665/chown-removes-sticky-bit-bug-or-feature
>> http://pubs.opengroup.org/onlinepubs/009695399/utilities/chown.html
>>
>> On Wed, Sep 10, 2014 at 2:17 PM, John Omernik <j...@omernik.com> wrote:
>>
>>> I am wondering about the process of fetching the tgz files and running them on slaves. Basically, I am trying to run hadoop-mesos, but still use the LinuxTaskController (see http://hadoop.apache.org/docs/r1.0.4/cluster_setup.html for details).
>>>
>>> When I am using Hadoop, I have to switch to the DefaultTaskController because when Mesos untars the tgz, it loses the setuid bit on the binary. I've done a bit of testing around this, and I am unsure why it loses it (even if the running process is root), but it does.
>>>
>>> Basically, tar by itself works like this: if the user is a super user, tar maintains all permissions that are in the tgz. (I've tested this: when I manually untar with tar zxf myhadoop.tgz, it untars properly, including permissions and setuid on the LinuxTaskController.)
>>>
>>> When I untar as a non-super user, ownership of all the files moves to the user that untarred them, and the setuid bit is lost. It makes sense from a security point of view.
>>>
>>> So how does this work in Mesos and Hadoop?
>>>
>>> Well, if I run the jobtracker as user hadoop, hadoop is not a super user, so all the files in the untarred hadoop folder are owned by hadoop:hadoop, and the setuid bit is lost.
>>>
>>> OK, next test: let's run the jobtracker as root and see what happens. (Remember, when I untarred as root, the setuid bit and all permissions were preserved.) So, when we run the JT as root, all the files become root:root, and the setuid bit is lost. That's weird. What happened here? (This is where I get lost; perhaps the untar/gunzip step isn't using the tar command, so permissions are not preserved like I would expect.)
>>>
>>> Either way, when using the LinuxTaskController, tasktrackers WILL NOT RUN if the setuid bit is not set. That's a pain; the LinuxTaskController is really nice from an impersonation/security standpoint with Hadoop jobs. I CAN run my Hadoop framework as hadoop:hadoop, but then I am limited in how things are set up and I get strange permissions issues when trying to run certain jobs as other users.
>>>
>>> The fix?
>>>
>>> I am hoping we can have a discussion around this. As I see it, the slaves are running as root, so they have the power to run however we need them to run. Ideally, I'd like to see the untarring happen with the preserve-permissions bit. That is, for the archives Mesos fetches, at the very least have the OPTION to preserve the permissions stored in the tgz. If we could do this, as an option somehow, this would be a win.
>>>
>>> Also ideally, I don't want to run the framework as root, just untar the tgz as root, preserving permissions. There is a difference between the action of untarring and the execution of the framework, and the security nerd in me would like to ensure that, while the slave COULD run the framework as root, we avoid it if possible.
>>>
>>> I am not sure how exactly Mesos untars things, nor am I aware how hard it would be to do this, but I think from a security perspective, the flexibility that untarring with preserved permissions (especially the setuid bit) would bring to Mesos would warrant the dev time.
>>>
>>> Thoughts?
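
One way to check the behavior described in this thread is to list which files under the extracted directory still carry the setuid bit. The sketch below is only an illustration in Python, with hypothetical helper names (`mode_has_setuid`, `find_setuid_files`); it reads mode bits via `os.lstat` and makes no assumption about how Mesos extracts archives.

```python
import os
import stat


def mode_has_setuid(mode):
    """Return True if the given st_mode value has the setuid bit set."""
    return bool(mode & stat.S_ISUID)


def find_setuid_files(root):
    """Return a sorted list of regular files under `root` with the setuid bit."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: inspect the entry itself, not a symlink target
            if stat.S_ISREG(st.st_mode) and mode_has_setuid(st.st_mode):
                found.append(path)
    return sorted(found)
```

Running `find_setuid_files` on the sandbox directory after a manual `tar zxf` as root, and again on a directory populated by the slave, would show whether the bit on the task-controller binary survived each path.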