On 4/19/07, Doug Cutting <[EMAIL PROTECTED]> wrote:
Kurtis Heimerl wrote:
>> Yes, DFSClient will need to pass the user to the namenode.
>>
>> Perhaps the username should be put in the FileSystem's URI. So an HDFS
>> URI would become hdfs://[EMAIL PROTECTED]:5555/foo/bar. URIs without a
>> username would have "other" access (typically read-only).
>
> That's reasonable. I don't know how Kerberos plays with that, though.
I chatted with Owen a bit yesterday about this and think it's better to
keep the username in the config. A FileSystem is created given a URI
and a Configuration. FileSystems are currently cached, keyed on the
URI's protocol and authority (host & port, typically). We should add
the configuration to the cache key too, so that different FileSystem
instances are used for different users. That permits FileSystem
implementations to use arbitrary config properties in their ctor.
I think we should be able to put a Kerberos ticket into the configuration.
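A minimal sketch of the cache-key change described above, not the actual Hadoop code. The class names FsCacheKey and FsCacheSketch, the plain Map standing in for Configuration, and the "user.name" property are all assumptions made for illustration.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical cache key: URI scheme + authority + a snapshot of the config.
final class FsCacheKey {
    private final String scheme;
    private final String authority;
    private final Map<String, String> conf; // stand-in for a Configuration

    FsCacheKey(URI uri, Map<String, String> conf) {
        this.scheme = uri.getScheme();
        this.authority = uri.getAuthority();
        this.conf = new HashMap<>(conf); // defensive copy so the key stays stable
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof FsCacheKey)) return false;
        FsCacheKey k = (FsCacheKey) o;
        return Objects.equals(scheme, k.scheme)
            && Objects.equals(authority, k.authority)
            && conf.equals(k.conf);
    }

    @Override
    public int hashCode() {
        return Objects.hash(scheme, authority, conf);
    }
}

// Hypothetical cache: an Object stands in for a FileSystem instance.
final class FsCacheSketch {
    private final Map<FsCacheKey, Object> cache = new HashMap<>();

    Object get(URI uri, Map<String, String> conf) {
        // The path is ignored; only scheme, authority, and config select the instance.
        return cache.computeIfAbsent(new FsCacheKey(uri, conf), k -> new Object());
    }
}
```

With this key, two lookups for the same namenode with the same config share one instance, while a config that names a different user gets its own.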
I think I understand the plan here. NameNode.java reads the location of
the namenode instance from config. So, we'll insert the username and groups
into the config. On the first iteration, this will not be authenticated. This
information will be passed to the namenode server, which will translate the
name and groups to UID and GID, which are stored with the files.
That sounds reasonable. There's one problem: each user will require their
own config file. This is not the way I've seen Hadoop run so far, but if we
all agree that this is the way to go, I'll begin a prototype very soon.
We should have an equivalent of /etc/group in the namenode.
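One way such a table could look, as a sketch only: it assumes the namenode loads lines in the standard /etc/group layout (name:password:GID:member,member). The GroupTable class and its method names are hypothetical, not anything that exists in Hadoop.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical namenode-side group table, fed with /etc/group-style lines.
final class GroupTable {
    private final Map<String, Integer> gidByName = new HashMap<>();
    private final Map<String, Set<String>> groupsByUser = new HashMap<>();

    // Parses one line of the form name:password:GID:user1,user2
    void addLine(String line) {
        String[] f = line.split(":", -1); // -1 keeps a trailing empty member list
        if (f.length < 4) {
            throw new IllegalArgumentException("malformed group line: " + line);
        }
        String name = f[0];
        gidByName.put(name, Integer.parseInt(f[2]));
        for (String member : f[3].split(",")) {
            if (!member.isEmpty()) {
                groupsByUser.computeIfAbsent(member, k -> new TreeSet<>()).add(name);
            }
        }
    }

    // Returns the GID for a group name, or null if the group is unknown.
    Integer gidOf(String group) {
        return gidByName.get(group);
    }

    // Returns the groups a user belongs to (empty if none).
    Set<String> groupsOf(String user) {
        return groupsByUser.getOrDefault(user, Collections.emptySet());
    }
}
```

The namenode could then translate the group names sent by the client into GIDs with gidOf, and ignore any name it does not recognize.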
>
> So, what I think it does is that it validates that the user really is
> [EMAIL PROTECTED] [ ... ]
>
> There's a chance Kerberos actually validates that it's [EMAIL PROTECTED]
Kerberos validates that a user is [EMAIL PROTECTED], where both the user and
the domain are part of Kerberos, not some host. Initially we'll not do
any user validation, but just trust the username sent.
There's accountability, but not great protection. If someone put their
client into Kerberos and it was accepted, they could take any role they
wanted.
That is, if I understood what you are talking about.
We might be able to get away without groups, but it would be awkward.
For example, if the default file permission is -rw-rw-r--, then, without
groups, anyone can read any file, but folks can only remove files
they've created. That doesn't permit read/write sharing of data w/o
changing its owner.
We probably also need a "root" username that can do anything.
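To make the owner/group/other discussion concrete, here is a rough sketch of a Unix-style check with a hard-coded superuser. PermissionSketch and its parameters are hypothetical names, and the username is trusted as sent, matching the unauthenticated first iteration discussed in this thread.

```java
import java.util.Set;

// Hypothetical permission check over Unix-style mode bits (e.g. 0664 = -rw-rw-r--).
final class PermissionSketch {
    static final int READ = 4;
    static final int WRITE = 2;
    static final String SUPERUSER = "root"; // assumption: one hard-coded superuser name

    static boolean allowed(String user, Set<String> userGroups,
                           String owner, String group, int mode, int access) {
        if (SUPERUSER.equals(user)) {
            return true; // root can do anything
        }
        int bits;
        if (user.equals(owner)) {
            bits = (mode >> 6) & 7;   // owner bits
        } else if (group != null && userGroups.contains(group)) {
            bits = (mode >> 3) & 7;   // group bits
        } else {
            bits = mode & 7;          // "other" bits
        }
        return (bits & access) == access;
    }
}
```

With mode 0664, a member of the file's group can write to a file they do not own, which is the read/write sharing case that is awkward without groups.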
I think groups and root are easy, so I plan to implement those initially as
well. Is there a more reasonable way to do root than just hardcoding that
root can do anything? I thought about adding root to all groups, but there's
a chance that a file has no group. I guess I could add one root group that
simply contains root. That would also let the service allow others to run
as root.
Doug