Ted Dunning wrote:
Last I heard, the API could be suborned in this scenario. Real
credential-based identity would be needed to provide more than this.
The hack would involve a changed hadoop library that lies about identity.
This would not be difficult to do.
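A toy sketch of why it is easy, assuming a server that authorizes on whatever user name the client puts in its request. Every name below (Request, ToyNameNode, stock_client, patched_client) is a hypothetical stand-in for illustration, not a Hadoop API, and the real wire protocol is not modelled.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for illustration only; none of these are Hadoop classes.
@dataclass(frozen=True)
class Request:
    user: str  # identity asserted by the client; the server never verifies it
    path: str


class ToyNameNode:
    """Authorizes purely on the user name carried in the request."""

    def __init__(self, owners):
        self.owners = owners  # maps path -> owning user name

    def can_read(self, req):
        return self.owners.get(req.path) == req.user


def stock_client(path, os_user):
    # An unmodified client library reports the local account name.
    return Request(user=os_user, path=path)


def patched_client(path, claimed_user):
    # A modified client library reports whatever name it likes.
    return Request(user=claimed_user, path=path)


namenode = ToyNameNode({"/data/payroll": "alice"})

honest = stock_client("/data/payroll", "mallory")
spoofed = patched_client("/data/payroll", "alice")

print(namenode.can_read(honest))
print(namenode.can_read(spoofed))
```

The server cannot tell the two requests apart by anything other than the name string, which is why only verifiable credentials would close the gap.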
On Wed, Jul 22, 2009 at 11:45 PM, Mathias Herberts wrote:
You can simply set up some bastion hosts which are trusted and from
which jobs can be run.
Then let users connect to these hosts using a secure mechanism such as SSH
keys.
You can then create users/groups on those bastion hosts and have
permissions on your HDFS files that use those credentials.
There's no wire security: nothing stops me from pushing packets straight
to a datanode and claiming to be whoever I like.
Even if you lock down the cluster so that I have no direct access to the
nodes, being able to run an MR job is enough for me to gain full
administrative rights: the cluster runs my Java code on one of its nodes,
and that node must have direct access to the rest of the cluster.
The details are left as an exercise for the reader.