@Larry, yes, your description is pretty accurate. We have different "levels" of security. This is the lowest and least secure one; however, since this is an on-prem product, some customers wouldn't mind accepting the risk.
We also have Kerberos and AD on top of these for those who want a more secure environment. Currently, we need Knox to allow us to pass the superuser identity, if possible.

thanks

________________________________
From: jeff saremi <jeffsar...@hotmail.com>
Sent: Monday, November 18, 2019 1:12 PM
To: larry mccay <lmc...@apache.org>; user@knox.apache.org <user@knox.apache.org>
Subject: Re: Switching user going from KNOX to WebHDFS

@kevin, yes, we're not using Kerberos or any AD.

So you're saying that whatever user I authenticate as against Knox is the one that will be passed to WebHDFS?

If I were to pass ?user.name=hdfs in the query string targeted at HDFS, or ?doas=hdfs in that request, would Knox honor those and pass them along, or will they get overwritten by Knox?

________________________________
From: larry mccay <lmc...@apache.org>
Sent: Monday, November 18, 2019 1:09 PM
To: user@knox.apache.org <user@knox.apache.org>
Subject: Re: Switching user going from KNOX to WebHDFS

Hi Jeff -

Thanks for reaching out!

Rather than try to unpack all of that, I'd like to step back to a description of what you are trying to accomplish with your deployment and the addition of Knox within it.

As you have described it, it seems like a very unsecured environment. Whether you are running your process as a root user or not, executing your queries and operations as the hdfs user is also very insecure. hdfs is a superuser in a Hadoop deployment.

Authenticating to Knox as root and asserting the effective user as hdfs is certainly something we can do, but I don't see what the value of doing that is.

So, let's step back and get a clear picture of what you would like to accomplish, and we can direct you to appropriate authentication/federation providers and possibly identity-assertion providers to meet your needs.

thanks,

--larry

On Mon, Nov 18, 2019 at 2:47 PM Kevin Risden <kris...@apache.org> wrote:

> If i am to do an hdfs query, all i need to do is to set HADOOP_USER_NAME to 'hdfs' then everything works nicely.

This means that you aren't using Kerberos, just regular simple auth for your cluster.

> This is true until we get to knox. We still communicate with Knox using a root and an admin password. I believe by default, this user's identity is used to call webhdfs?

The user identity is asserted by Knox against the backend service. So Knox is configured for authentication, and that username is asserted to the backend. So however you are doing authentication in Knox is what needs to be configured. This is usually LDAP out of the box, but Knox can be configured with different authentication providers such as PAM.

Kevin Risden

On Mon, Nov 18, 2019 at 2:37 PM jeff saremi <jeffsar...@hotmail.com> wrote:

I'm not sure how to phrase this question, and I don't have any experience with these two technologies.

Here's the deal: we are switching from running Hadoop and related technologies as root to running them as a non-root user.

So far we have managed to change our namenodes and datanodes so that the process runs under a user named 'hdfs'.

If I am to do an HDFS query, all I need to do is set HADOOP_USER_NAME to 'hdfs' and everything works nicely.

This is true until we get to Knox. We still communicate with Knox using a root user and an admin password. I believe that, by default, this user's identity is used to call WebHDFS?

We need to change this behavior. Looking for some pointers on what the changes would be.

thanks
Jeff
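
For readers following the thread, here is a minimal sketch of the request shapes being discussed: a direct WebHDFS call under simple auth with `user.name`, and the same operation addressed to a Knox gateway with and without `doas`. The host names, ports, the `sandbox` topology name, and the helper `webhdfs_url` are all assumptions made for illustration and do not come from the thread. The sketch only builds URLs; whether Knox forwards, rewrites, or drops `user.name` and `doas` depends on the identity-assertion provider configured in the topology, which this thread does not settle.

```python
from urllib.parse import urlencode

def webhdfs_url(base, path, op, **params):
    """Build a WebHDFS REST URL of the form <base>/webhdfs/v1/<path>?op=...&...

    Hypothetical helper for illustration; it performs no network I/O.
    """
    query = urlencode({"op": op, **params})
    return f"{base.rstrip('/')}/webhdfs/v1/{path.lstrip('/')}?{query}"

# Assumed endpoints; replace with the values of your own deployment.
NAMENODE = "http://namenode.example.com:9870"           # direct NameNode HTTP address
KNOX = "https://knox.example.com:8443/gateway/sandbox"  # Knox gateway + topology name

# Direct call under simple auth: the caller names the user itself.
direct = webhdfs_url(NAMENODE, "/tmp", "LISTSTATUS", **{"user.name": "hdfs"})

# Through Knox: the caller authenticates to Knox (for example with HTTP basic
# auth), and Knox asserts an identity to the backend on the caller's behalf.
via_knox = webhdfs_url(KNOX, "/tmp", "LISTSTATUS")

# The variant asked about in the thread: explicitly requesting doas=hdfs.
via_knox_doas = webhdfs_url(KNOX, "/tmp", "LISTSTATUS", doas="hdfs")

if __name__ == "__main__":
    for url in (direct, via_knox, via_knox_doas):
        print(url)
```

Comparing the response to `via_knox` with the response to `via_knox_doas` (for example, the owner of a newly created file) is one way to observe which identity actually reaches WebHDFS in a given topology.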