Hi Tien Dat - My pleasure. Thanks for your interest in Apache Knox.
In fact, you should consider subscribing to the Knox dev@ and/or user@ mailing lists.
Since you are not subscribed, your email needs to be accepted by a moderator and may get lost.
You can subscribe from the Apache Knox Mailing Lists page [1], linked from the project home page.

thanks,

--larry

1. http://knox.apache.org/mailing-lists.html

On Wed, Dec 2, 2020 at 1:30 PM Tien Dat PHAN <[email protected]> wrote:

> Many thanks, Larry, for such useful information.
>
> Best regards
> Tien Dat
>
> On 2020/12/01 22:55:47, larry mccay <[email protected]> wrote:
> > ugh - forgot the references for you that time...
> >
> > 1. http://knox.apache.org/books/knox-1-4-0/user-guide.html#KnoxSSO+Setup+and+Configuration
> > 2. http://knox.apache.org/books/knox-1-4-0/user-guide.html#Pac4j+Provider+-+CAS+/+OAuth+/+SAML+/+OpenID+Connect
> >    http://knox.apache.org/books/knox-1-4-0/user-guide.html#For+CAS+support:
> > 3. http://knox.apache.org/books/knox-1-4-0/user-guide.html#SSO+Cookie+Provider
> >
> >
> > On Tue, Dec 1, 2020 at 5:51 PM larry mccay <[email protected]> wrote:
> >
> > > Just realized that I left out your specific question about CAS.
> > > We have support for CAS authentication.
> > > For this you will want to look into using KnoxSSO [1] and replacing the
> > > default config in the knoxsso.xml topology to use Pac4j [2] with it
> > > configured for CAS.
> > > You can then proxy access to your various UIs through a topology that is
> > > configured with the SSOCookieProvider [3] authentication provider.
> > >
> > > When you try to access a UI through Knox, it will check for the knoxsso
> > > cookie "hadoop-jwt" and, if it is not found, direct you to the knoxsso
> > > endpoint.
> > > That endpoint will be configured for your CAS server and redirect you
> > > there to log in.
> > > Once you have successfully logged in to CAS, it will redirect back to
> > > KnoxSSO, which will in turn redirect you back to your original URL.
> > > The cookie will be found and access to the UI granted based on your CAS
> > > authentication event.
> > >
> > >
> > > On Tue, Dec 1, 2020 at 2:39 PM larry mccay <[email protected]> wrote:
> > >
> > >> Hi Tien -
> > >>
> > >> Apache Knox sounds like exactly what you need here.
> > >> Let me explain a bit about how Knox fits into the Hadoop ecosystem.
> > >>
> > >> Apache Hadoop established an integration pattern that is used across the
> > >> ecosystem of related projects called proxyuser or Trusted Proxy [1].
> > >> This is a pattern that allows specific processes/services to make
> > >> requests on behalf of other endusers.
> > >> These trusted proxy services establish a trust relationship with the
> > >> backend services with a combination of:
> > >> * Kerberos for strong authentication for determining the identity of the
> > >>   trusted service
> > >> * doAs/impersonation - this is typically a doAs query param sent to the
> > >>   backend service from the trusted proxy
> > >> * configuration to dictate which hosts to expect trusted proxies to make
> > >>   calls from, and which users are allowed to be impersonated by the
> > >>   trusted service
> > >>
> > >> Now, a high level view of Knox in this context - we'll use WebHDFS as the
> > >> backend service example:
> > >> * an enduser makes a curl call to WebHDFS through Knox:
> > >>   'curl -ivku guest:guest-password https://localhost:8443/gateway/sandbox/webhdfs/v1/tmp?op=LISTSTATUS'
> > >> * Knox has an endpoint that is configured via a Knox Topology called
> > >>   sandbox.xml and this topology is configured to support HTTP Basic Auth
> > >> * Knox authenticates the user via the ShiroProvider authentication
> > >>   provider and establishes a Java Subject security context for the
> > >>   request processing internally
> > >> * The request flows through the provider chain, which enforces whatever
> > >>   security policies, authorization checks, identity assertion, etc. are
> > >>   configured
> > >> * The last provider is the Dispatch to the backend service - this is
> > >>   essentially an HTTP client that interacts with the backend service
> > >> * The webhdfs dispatch takes the authenticated username and sets that as
> > >>   the doAs query param on the outgoing request and dispatches the
> > >>   client's original request with that param added.
> > >> * The WebHDFS endpoint will issue a Kerberos challenge to the client,
> > >>   which is Knox in this case, and we will authenticate as the Knox
> > >>   identity via Kerberos/SPNEGO
> > >> * WebHDFS will note that there is a doAs query param, that the Knox
> > >>   identity is indeed a trusted proxy, that the request is coming from an
> > >>   expected host and that the impersonation is allowed for the user being
> > >>   asserted by the doAs.
> > >>
> > >> Hopefully that wasn't too much detail and that it proves helpful.
> > >>
> > >> thanks,
> > >>
> > >> --larry
> > >>
> > >> 1. https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/Superusers.html
> > >>
> > >>
> > >> On Tue, Dec 1, 2020 at 2:23 PM Tien Dat PHAN <[email protected]> wrote:
> > >>
> > >>> Dear experts,
> > >>>
> > >>> We have recently started to adopt Knox as the principal component for
> > >>> equipping our data processing cluster with a complete security layer.
> > >>>
> > >>> In fact, the situation is, in our cluster, there are Apache components
> > >>> like Apache HBase and HDFS which play the role of our data processing
> > >>> backend.
> > >>> These components work perfectly with Kerberos authentication for Access
> > >>> Control.
> > >>> On the other hand, our frontend is using CAS for authenticating users
> > >>> (when accessing the data stored in our cluster).
> > >>>
> > >>> We just wonder (sorry if this turns out to be a dumb question for you
> > >>> all) if the following scenario is possible?
> > >>> 1) The user accesses our web UI, inputting the username and password
> > >>> 2) The CAS authentication verifies that username and password, and a
> > >>>    token is stored in this session
> > >>> 3) We (somehow) convert this token into a Kerberos token which will be
> > >>>    passed to the backend API when querying data.
> > >>>
> > >>> The main concern is about step 3). The reason we think of this
> > >>> scenario is that we don't expect the users to log in one more time to
> > >>> create a Kerberos token (for backend access).
> > >>>
> > >>> Do you think this is a reasonable authentication setup? And if YES, do
> > >>> you think it is possible with the help of the Knox API?
> > >>>
> > >>> Thank you in advance for your time and consideration.
> > >>>
> > >>> Best regards
> > >>> Tien Dat PHAN
> > >>>
> > >>
> > >
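
For illustration only, here is a minimal Python sketch of the two mechanisms described in the thread above: the SSO cookie check that sends a browser to the knoxsso endpoint when "hadoop-jwt" is missing, and the dispatch step that sets the doAs query param on the outgoing backend request. This is not Knox code. The function names, the backend hostname, the knoxsso endpoint path, the "originalUrl" parameter name, and the choice to drop a client-supplied doAs are all assumptions made for the sketch.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Cookie name taken from the thread; everything else below is hypothetical.
SSO_COOKIE_NAME = "hadoop-jwt"


def sso_decision(cookies, original_url, knoxsso_url):
    """Decide what to do with a UI request, based on the SSO cookie.

    Returns ("proceed", original_url) when the cookie is present, otherwise
    ("redirect", <knoxsso URL carrying the original URL>). The "originalUrl"
    parameter name is an assumption of this sketch.
    """
    if SSO_COOKIE_NAME in cookies:
        return ("proceed", original_url)
    target = knoxsso_url + "?" + urlencode({"originalUrl": original_url})
    return ("redirect", target)


def add_doas(backend_url, username):
    """Return backend_url with doAs=<username> as a query param.

    Any doAs value already present is dropped first, so that only the
    authenticated username is asserted (a design choice of this sketch).
    """
    parts = urlsplit(backend_url)
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() != "doas"
    ]
    query.append(("doAs", username))
    return urlunsplit(parts._replace(query=urlencode(query)))


if __name__ == "__main__":
    # No cookie yet: the browser would be sent to the (assumed) knoxsso endpoint.
    print(sso_decision(
        {},
        "https://localhost:8443/gateway/sandbox/hbase/webui",
        "https://localhost:8443/gateway/knoxsso/api/v1/websso",
    ))
    # Authenticated as "guest": the outgoing WebHDFS call carries doAs=guest.
    print(add_doas(
        "http://namenode.example.com:50070/webhdfs/v1/tmp?op=LISTSTATUS",
        "guest",
    ))
```

In the real flow the backend would still issue a Kerberos/SPNEGO challenge to the proxy and check its proxyuser configuration before honoring the doAs value; the sketch only covers the URL handling.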
