Thanks, Larry and Attila, for your comments above. Let me try to answer the questions you raised, Larry:

1. I believe the data is stored in the clear in files in the directory - is this true?

As Attila pointed out, Derby supports data encryption and it is quite easy to enable. We currently use Derby 10.14, so I'm linking the security guide for that version:
https://db.apache.org/derby/docs/10.14/security/index.html
There you can find instructions on how to configure database encryption, as well as how to use algorithms other than DES/DESede (AES 256, for instance) for stronger security.

2. Are all tokens only hashed including SSO cookie tokens?

We do not store the JWT tokens themselves in the database, only related metadata (token ID, issue time, etc.). It's true that we persist the so-called "Passcode tokens", though.

3. Credential stores can be synchronized across instances by simply copying them - can you do that with persisted derby data as well? If not, what is needed to do this?

Derby has data replication capabilities too:
https://db.apache.org/derby/docs/10.14/adminguide/cadminreplication.html
However, the idea of replacing AliasBasedTokenStateService with JDBCTokenStateService backed by a pre-configured Derby DB is meant for non-HA cases. I understand this idea could be misused by copying the credential store every time a token is created (which is quite a poor decision IMO), but end users have to understand that we fully support MySQL/PostgreSQL for HA environments.

I hope this helps move this discussion thread forward.

Cheers,
Sandor

On Tue, Nov 14, 2023 at 11:10 AM Attila Magyar <amag...@cloudera.com.invalid> wrote:

> I support this change. I'd like to add that derby also supports DB level
> encryption, using DES or triple DES.
>
> It can be specified the following way:
>
> jdbc:derby:encryptionDB1;create=true;dataEncryption=true;bootPassword=clo760uds2caPe
>
> On Mon, Nov 13, 2023 at 4:59 PM larry mccay <lmc...@apache.org> wrote:
>
> > I am in favor of this in general.
> >
> > My only concern is the alias based one.
> > This is the default implementation for token based authentication flows
> > and is possibly more secure than a derby based implementation.
> >
> > Let's begin with some questions about the use of derby:
> > 1. I believe the data is stored in the clear in files in the directory -
> > is this true?
> > 2. Are all tokens only hashed including SSO cookie tokens?
> > 3. Credential stores can be synchronized across instances by simply
> > copying them - can you do that with persisted derby data as well? If not,
> > what is needed to do this? Copying them may not be something that we need
> > to do if this is only for non-HA but it is something that folks may be
> > doing in the wild. We certainly do that for generic aliases already, so
> > if tokens are in there then they are being sync'd as well.
> >
> > On Mon, Nov 13, 2023 at 8:02 AM Sandor Molnar
> > <smol...@cloudera.com.invalid> wrote:
> >
> > > Hello folks!
> > >
> > > I'm starting this thread because I am convinced we should remove the
> > > following TokenStateService implementations:
> > > - AliasBasedTokenStateService
> > > - ZookeeperTokenStateService
> > > - JournalBasedTokenStateService
> > >
> > > The reason behind this idea for the last two implementations in the
> > > above list is quite simple:
> > >
> > > 1. ZookeeperTokenStateService was our first approach to provide HA
> > > support for Knox Token Integration. However, our internal tests have
> > > shown that ZK is just simply not the right tool for that feature.
> > > Eventual consistency is only one part of this issue (we could make this
> > > work with re-tried ZK queries). Performance-wise ZK proved to be a
> > > wrong decision. In our test environment, where hundreds of tokens were
> > > generated in every minute, ZK was not enough to scale.
> > >
> > > 2. JournalBasedTokenStateService is
> > > 2.1 insecure (it stores plain data on the FS),
> > > 2.2 missing features (no impersonation or SSO Cookie support)
> > >
> > > In the case of the AliasBasedTokenStateService, the reason is not that
> > > simple. It's true, that keystore-related operations are expensive, but
> > > the background thread that actually persists the token state improved a
> > > lot in this respect. However, it's still slow compared to the supported
> > > databases we added for the JDBC implementation when it comes to token
> > > verification.
> > > In addition to that, the current implementation creates at least 3
> > > aliases per token, which makes the __gateway really big in case of lots
> > > of tokens. Even worse, we try to read all tokens into memory from
> > > __gateway credential store in a background thread that also consumes
> > > memory, CPU which we could avoid.
> > > To be honest, I don't see any reason why could not we achieve the same
> > > functionality with a pre-configured Derby database that stores its data
> > > in a dedicated sub-folder within the KNOX_DATA_DIR. This would be the
> > > default choice, so users will still not need to configure everything
> > > for the KnoxToken service even if token state management is enabled.
> > >
> > > We could also write a small KNOX CLI command to migrate existing tokens
> > > from keystores to Derby upon upgrade.
> > >
> > > Advantages of the above:
> > > - only one implementation will be kept (JDBCTokenStateService) which is
> > >   proven to be robust enough and can scale well
> > > - easier to maintain the product
> > > - easier to troubleshoot in PROD environments (Derby has very powerful
> > >   tools to connect and run SQL queries)
> > > - eliminate background threads which make debugging hard,
> > >   resource-consuming, and adds complexity
> > > - the non-desired side effects of reading lots of tokens into memory
> > >   from __gateway credential store that may make the
> > >
> > > I'm curious about what you think of the above and I'd like to hear back
> > > from you with your suggestions and ideas.
> > >
> > > Cheers,
> > > Sandor
>
> --
> *Attila Magyar* | Staff Software Engineer
> cloudera.com <https://www.cloudera.com>
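
For reference, here is a minimal sketch of how the encryption settings discussed in this thread (dataEncryption, bootPassword, and AES with a 256-bit key instead of DES) might be combined into a single Derby connection URL. The class and method names are hypothetical and not part of Knox or Derby; the attribute names are the ones I understand the Derby 10.14 documentation to describe, and the database name and password are placeholders. It only builds the URL string and does not open a connection.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Knox or Derby: it only assembles a URL string.
final class DerbyEncryptionUrlSketch {

    // Builds a Derby JDBC URL that asks for an encrypted database using AES.
    // Assumption: Derby expects a bootPassword of at least 8 characters.
    static String encryptedUrl(String dbName, String bootPassword, int keyLengthBits) {
        if (bootPassword == null || bootPassword.length() < 8) {
            throw new IllegalArgumentException("bootPassword must be at least 8 characters");
        }
        // LinkedHashMap keeps the attributes in insertion order for a predictable URL.
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("create", "true");
        attrs.put("dataEncryption", "true");
        attrs.put("encryptionAlgorithm", "AES/CBC/NoPadding");
        attrs.put("encryptionKeyLength", Integer.toString(keyLengthBits));
        attrs.put("bootPassword", bootPassword);

        String joined = attrs.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining(";"));
        return "jdbc:derby:" + dbName + ";" + joined;
    }

    public static void main(String[] args) {
        // Placeholder database name and password, for illustration only.
        System.out.println(encryptedUrl("tokenDB", "example-boot-password", 256));
    }
}
```

In a real deployment the boot password would of course come from a protected source rather than being hard-coded, and the resulting URL would be passed to the JDBC driver with Derby on the classpath.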