Thanks, Sandeep for your insights too! To be honest, I would be surprised if someone used the JournalBased or the ZK TokenStateService implementations for storing tokens (even in the wild). In the case of AliasBasedTSS, it might be true. That's why I suggested that we could write a small KNOX CLI command that migrates tokens from the __gateway.jks credential store into the new Derby Database. We could even make this an auto-executed step: when token state service is initiated and it's using the pre-configured Derby database, we may check if there is any tokens stored in __gateway.jks and migrate them. This way it'd be seamless for existing tokens. I'm not sure that the migration tool should handle ZK and/or files. If I had to vote now, I'd say no, it's not necessary.
Sandor On Tue, Nov 14, 2023 at 1:05 PM Sandeep Moré <moresand...@gmail.com> wrote: > Thank you for starting the thread Sandor. You bring up valid points and > pain that we had to go through with these implementations :) > I am in favor of removing these as well. My concern is around folks > currently using any of these impls. Should we deprecate it this release > (but keep the support and make JDBCTokenStateService default) and remove > the support completely in the next release? This is because this is a > breaking change. > > My thoughts. > > On Mon, Nov 13, 2023 at 8:02 AM Sandor Molnar <smol...@cloudera.com.invalid > > > wrote: > > > Hello folks! > > > > I'm starting this thread because I am convinced we should remove the > > following TokenStateService implementations: > > - AliasBasedtokenStateService > > - ZookeeperTokenStateService > > - JournalBasedTokenStateService > > > > The reason behind this idea for the last two implementations in the above > > list is quite simple: > > > > 1. ZookeeperTokenStateService was our first approach to provide HA > support > > for Knox Token Integration. However, our internal tests have shown that > ZK > > is just simply not the right tool for that feature. Eventual consistency > is > > only one part of this issue (we could make this work with re-tried ZK > > queries). Performance-wise ZK proved to be a wrong decision. In our test > > environment, where hundreds of tokens were generated in every minute, ZK > > was not enough to scale. > > > > 2. JournalBasedTokensSateService is > > 2.1 insecure (it stores plain data on the FS), > > 2.2 missing features (no impersonation or SSO Cookie support) > > > > In the case of the AliasBasedtokenStateService, the reason is not that > > simple. It's true, that keystore-related operations are expensive, but > the > > background thread that actually persists the token state improved a lot > in > > this respect. However, it's still slow compared to the supported > databases > > we added for the JDBC implementation when it comes to token verification. > > In addition to that, the current implementation creates at least 3 > aliases > > per token, which makes the __gateway really big in case of lots of > tokens. > > Even worse, we try to read all tokens into memory from __gateway > credential > > store in a background thread that also consumes memory, CPU which we > could > > avoid. > > To be honest, I don't see any reason why could not we achieve the same > > functionality with a pre-configured Derby database that stores its data > in > > a dedicated sub-folder within the KNOX_DATA_DIR. This would be the > default > > choice, so users will still not need to configure everything for the > > KnoxToken service even if token state management is enabled. > > > > We could also write a small KNOX CLI command to migrate existing tokens > > from keystores to Derby upon upgrade. > > > > Advantages of the above: > > - only one implementation will be kept (JDBCTokenStateService) which is > > proven to be robust enough and can scale well > > - easier to maintain the product > > - easier to troubleshoot in PROD environments (Derby has very powerful > > tools to connect and run SQL queries) > > - eliminate background threads which make debugging hard, > > resource-consuming, and adds complexity > > - the non-desired side effects of reading lots of tokens into memory from > > __gateway credential store that may make the > > > > I'm curious about what you think of the above and I'd like to hear back > > from you with your suggestions and ideas. > > > > Cheers, > > Sandor > > >