ugh - forgot the references for you that time...

1. http://knox.apache.org/books/knox-1-4-0/user-guide.html#KnoxSSO+Setup+and+Configuration
2. http://knox.apache.org/books/knox-1-4-0/user-guide.html#Pac4j+Provider+-+CAS+/+OAuth+/+SAML+/+OpenID+Connect
   (CAS specifics: http://knox.apache.org/books/knox-1-4-0/user-guide.html#For+CAS+support:)
3. http://knox.apache.org/books/knox-1-4-0/user-guide.html#SSO+Cookie+Provider
On Tue, Dec 1, 2020 at 5:51 PM larry mccay <[email protected]> wrote:

> Just realized that I left out your specific question about CAS.
> We have support for CAS authentication.
> For this you will want to look into using KnoxSSO [1] and replacing the
> default config in the knoxsso.xml topology to use Pac4j [2] configured
> for CAS.
> You can then proxy access to your various UIs through a topology that is
> configured with the SSOCookieProvider [3] authentication provider.
>
> When you try to access a UI through Knox, it will check for the KnoxSSO
> cookie "hadoop-jwt" and, if it is not found, direct you to the KnoxSSO
> endpoint. That endpoint will be configured for your CAS server and will
> redirect you there to log in. Once you have successfully logged in to
> CAS, it will redirect back to KnoxSSO, which will in turn redirect you
> back to your original URL. The cookie will then be found and access to
> the UI granted based on your CAS authentication event.
>
> On Tue, Dec 1, 2020 at 2:39 PM larry mccay <[email protected]> wrote:
>
>> Hi Tien -
>>
>> Apache Knox sounds like exactly what you need here.
>> Let me explain a bit about how Knox fits into the Hadoop ecosystem.
>>
>> Apache Hadoop established an integration pattern, used across the
>> ecosystem of related projects, called proxyuser or Trusted Proxy [1].
>> This is a pattern that allows specific processes/services to make
>> requests on behalf of other end users.
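[The KnoxSSO-with-CAS setup described above can be sketched as two topology fragments. This is only a minimal illustration - the hostnames, ports, and CAS URL are placeholders, and the exact provider/param names should be checked against the Knox user guide sections in [1], [2], and [3].]

```xml
<!-- knoxsso.xml (sketch): replace the default authentication provider
     with the pac4j federation provider configured for CAS.
     "knox-host" and "cas-host" are placeholders. -->
<topology>
  <gateway>
    <provider>
      <role>federation</role>
      <name>pac4j</name>
      <enabled>true</enabled>
      <param>
        <name>pac4j.callbackUrl</name>
        <value>https://knox-host:8443/gateway/knoxsso/api/v1/websso</value>
      </param>
      <param>
        <name>cas.loginUrl</name>
        <value>https://cas-host/cas/login</value>
      </param>
    </provider>
    <provider>
      <role>identity-assertion</role>
      <name>Default</name>
      <enabled>true</enabled>
    </provider>
  </gateway>
  <service>
    <role>KNOXSSO</role>
    <param>
      <name>knoxsso.cookie.secure.only</name>
      <value>true</value>
    </param>
  </service>
</topology>
```

[And in the topology that proxies your UIs, the SSOCookieProvider checks for the "hadoop-jwt" cookie and redirects unauthenticated requests to the KnoxSSO endpoint above:]

```xml
<!-- UI-proxying topology (sketch): validate the hadoop-jwt cookie,
     redirecting to KnoxSSO when it is absent -->
<provider>
  <role>federation</role>
  <name>SSOCookieProvider</name>
  <enabled>true</enabled>
  <param>
    <name>sso.authentication.provider.url</name>
    <value>https://knox-host:8443/gateway/knoxsso/api/v1/websso</value>
  </param>
</provider>
```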
>> These trusted proxy services establish a trust relationship with the
>> backend services through a combination of:
>>
>> * Kerberos for strong authentication, to determine the identity of the
>>   trusted service
>> * doAs/impersonation - typically a doAs query param sent to the backend
>>   service by the trusted proxy
>> * configuration to dictate which hosts trusted proxies are expected to
>>   make calls from, and which users the trusted service is allowed to
>>   impersonate
>>
>> Now, a high-level view of Knox in this context - we'll use WebHDFS as
>> the backend service example:
>>
>> * an end user makes a curl call to WebHDFS through Knox: 'curl -ivku
>>   guest:guest-password
>>   https://localhost:8443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS'
>> * Knox has an endpoint that is configured via a Knox topology called
>>   sandbox.xml, and this topology is configured to support HTTP Basic Auth
>> * Knox authenticates the user via the ShiroProvider authentication
>>   provider and establishes a Java Subject security context for the
>>   internal request processing
>> * the request flows through the provider chain, which enforces whatever
>>   security policies, authorization checks, identity assertion, etc. are
>>   configured
>> * the last provider is the dispatch to the backend service - this is
>>   essentially an HTTP client that interacts with the backend service
>> * the WebHDFS dispatch takes the authenticated username, sets it as the
>>   doAs query param on the outgoing request, and dispatches the client's
>>   original request with that param added
>> * the WebHDFS endpoint will issue a Kerberos challenge to the client -
>>   which is Knox in this case - and Knox will authenticate as the Knox
>>   identity via Kerberos/SPNEGO
>> * WebHDFS will note that there is a doAs query param, that the Knox
>>   identity is indeed a trusted proxy, that the request is coming from an
>>   expected host, and that impersonation is allowed for the user being
>>   asserted by the doAs
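[The trusted-proxy relationship in the walkthrough above is configured on the Hadoop side in core-site.xml, per the Superusers doc cited below. A minimal sketch, assuming the Knox service runs as the "knox" user on "knox-host.example.com" - both placeholders:]

```xml
<!-- core-site.xml on the backend cluster (sketch): trust the knox
     service user to impersonate end users, but only from the Knox host -->
<property>
  <name>hadoop.proxyuser.knox.hosts</name>
  <value>knox-host.example.com</value>
</property>
<property>
  <name>hadoop.proxyuser.knox.groups</name>
  <value>*</value>
</property>
```

[Narrowing hadoop.proxyuser.knox.groups from "*" to specific groups restricts which users Knox may assert via the doAs query param.]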
>>
>> Hopefully that wasn't too much detail and that it proves helpful.
>>
>> thanks,
>>
>> --larry
>>
>> 1. https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Superusers.html
>>
>> On Tue, Dec 1, 2020 at 2:23 PM Tien Dat PHAN <[email protected]> wrote:
>>
>>> Dear experts,
>>>
>>> We have recently started to adopt Knox as the principal component for
>>> equipping our data processing cluster with a complete security layer.
>>>
>>> The situation is that our cluster runs Apache components such as HBase
>>> and HDFS as our data processing backend. These components work
>>> perfectly with Kerberos authentication for access control. On the
>>> other hand, our frontend uses CAS for authenticating users (when they
>>> access the data stored in our cluster).
>>>
>>> We just wonder (sorry if this turns out to be a dumb question for you
>>> all) whether the following scenario is possible:
>>> 1) A user accesses our web UI and inputs a username and password
>>> 2) CAS certifies that username and password, and a token is stored in
>>>    the session
>>> 3) We (somehow) convert this token into a Kerberos token, which is
>>>    passed to the backend API when querying data
>>>
>>> The main concern is step 3). The reason we think of this scenario is
>>> that we don't want users to have to log in a second time to obtain a
>>> Kerberos token (for backend access).
>>>
>>> Do you think this is a reasonable authentication setup? And if YES, do
>>> you think it is possible with the help of the Knox API?
>>>
>>> Thank you in advance for your time and consideration.
>>>
>>> Best regards
>>> Tien Dat PHAN
