Sure. I will test that on Monday (tomorrow is off in France :-) Le jeu. 10 nov. 2022 à 17:27, Misagh <[email protected]> a écrit :
> Could I ask you to review and test this? > https://github.com/mmoayyed/cas/compare/master...mmoayyed:cas:redis-fix > > On Thu, Nov 10, 2022 at 1:03 PM Misagh <[email protected]> wrote: > > > > Thank you Jérôme. I'll take a look. > > > > -- Misagh > > > > On Wed, Nov 9, 2022, 2:39 PM Jérôme LELEU <[email protected]> wrote: > >> > >> Hi, > >> > >> I have made a Redis performance test between v6.5.9 and v7.0.0-RC1 and > figures are very relevant. > >> > >> In both cases, I have a CAS server using a Redis ticket registry > (Docker image) and a custom authentication handler to validate any username > starting with "jleleu". > >> I have a script which performs a certain number of logins (without any > service) for jleleuRANDOMVALUE. > >> I have overriden the RedisTickerRegistry class to add time counting in > v6.5: > >> > >> private static AtomicLong getTime = new AtomicLong(); > >> private static AtomicInteger nbGet = new AtomicInteger(); > >> > >> @Override > >> public Ticket getTicket(final String ticketId, final Predicate<Ticket> > predicate) { > >> val t0 = System.currentTimeMillis(); > >> try { > >> val redisKey = getTicketRedisKey(encodeTicketId(ticketId)); > >> val t = this.client.boundValueOps(redisKey).get(); > >> if (t != null) { > >> val result = decodeTicket(t); > >> if (predicate.test(result)) { > >> return result; > >> } > >> LOGGER.trace("The condition enforced by [{}] cannot > successfully accept/test the ticket id [{}]", ticketId, > >> predicate.getClass().getSimpleName()); > >> return null; > >> } > >> } catch (final Exception e) { > >> LOGGER.error("Failed fetching [{}]", ticketId); > >> LoggingUtils.error(LOGGER, e); > >> } finally { > >> val t1 = System.currentTimeMillis(); > >> val time = t1 - t0; > >> val t = getTime.addAndGet(time); > >> val n = nbGet.incrementAndGet(); > >> LOGGER.info("### GET time: {} ms | Average time: {} ms", time, > t / n); > >> } > >> return null; > >> } > >> > >> > >> And in v7: > >> > >> @Override > >> public Ticket getTicket(final String ticketId, final Predicate<Ticket> > predicate) { > >> val t0 = System.currentTimeMillis(); > >> try { > >> val redisKey = > RedisCompositeKey.builder().id(encodeTicketId(ticketId)).build().toKeyPattern(); > >> return getKeysStream(redisKey) > >> .map(key -> redisTemplate.boundValueOps(key).get()) > >> .filter(Objects::nonNull) > >> .map(this::decodeTicket) > >> .filter(predicate) > >> .findFirst() > >> .orElse(null); > >> } catch (final Exception e) { > >> LOGGER.error("Failed fetching [{}]", ticketId); > >> LoggingUtils.error(LOGGER, e); > >> } finally { > >> val t1 = System.currentTimeMillis(); > >> val time = t1 - t0; > >> val t = getTime.addAndGet(time); > >> val n = nbGet.incrementAndGet(); > >> LOGGER.info("### GET time: {} ms | Average time: {} ms", time, > t / n); > >> } > >> return null; > >> } > >> > >> > >> Then I perform 1 000 and 10 000 logins (with my script) and check my > logs to see the average time: > >> > >> v6.5.9: > >> 1000 logins -> Average time: 3 ms > >> 10000 logins -> Average time: 3 ms > >> > >> v7.0.0-RC1: > >> 1000 logins -> Average time: 22 ms > >> 10000 logins -> Average time: 195 ms > >> > >> So indeed, I notice a big performance issue. > >> > >> Do you need more information? > >> > >> Thanks. > >> Best regards, > >> Jérôme > >> > >> > >> Le mar. 8 nov. 2022 à 07:56, Jérôme LELEU <[email protected]> a écrit : > >>> > >>> Hi, > >>> > >>> Yes, double indexing is harder than simple indexing as the second > operation may fail and you should revert the first one (transactional > aspect). > >>> If we did that for all tickets, we would double the size of the keys, > but not the size of the database though. > >>> And maybe we should have two different databases for better > performance. > >>> > >>> I will make a test to check the problem. > >>> > >>> Thanks. > >>> Best regards, > >>> Jérôme > >>> > >>> > >>> Le lun. 7 nov. 2022 à 18:30, Misagh <[email protected]> a > écrit : > >>>> > >>>> > Our main problem was the full SCAN of tickets to check if a ticket > is a TGT and if the principal is the right one ("count/get SSO sessions"). > >>>> > For this one, I would have created a "double key indexing", big > words for a simple thing ;-) > >>>> > On 6.5.x, we stored tickets this way: key=CAS_TICKET:ticketId => > VALUE=serialized ticket > >>>> > I propose for the "add ticket" operation to check if it is a TGT: > in that case, I would add to Redis: key=CAS_TICKET_USER:ticketId:userId => > VALUE=nothing. > >>>> > This way, we would "only" SCAN keys of TGT sessions to find the > right user (CAS_TICKET_USER:*:userId) and we would retrieve all the TGT > identifiers for a given user. > >>>> > Then, a multi GET on these identifiers would find the SSO sessions > of the user. > >>>> > >>>> That's quite clever. But it's not without complications. These are not > >>>> strictly blockers, but we should likely take these into account: > >>>> > >>>> Doing the double-indexing for a TGT would also imply that the same > >>>> thing would/could be done for OIDC codes, access tokens and refresh > >>>> tokens. For example, think of operations where you'd want to execute > >>>> "get me all the access tokens issued to user X", or "all the refresh > >>>> tokens issued to user Y", etc. This would mean that the registry > >>>> would have to somehow be tied to modules that present those extra > >>>> ticket types though I imagine this can be somewhat solved with the > >>>> ticket catalog concept. And of course, the registry size sort of > >>>> grows. 10 TGTs for unique users would actually mean 20 entries, not to > >>>> mention for every update/remove operation you'd be issuing double > >>>> queries. So it's double the index, double the number of operations. At > >>>> scale, I am not so sure this would actually be all that better, but I > >>>> have not run any conclusive tests. > >>>> > >>>> I would also be interested to see an actual test that showcases the > >>>> slowness. For example, I ran a test against a basic local redis > >>>> cluster, 7.0.5. My test, added 1000 TGTs to the registry, then fetched > >>>> them all, and then looped through the resultset asking the registry > >>>> for each ticket that was fetched again. This operation completed in > >>>> approx 13 seconds for non-encrypted tickets, and 14seconds encrypted > >>>> tickets. Then, I reverted the pattern back to what 6.5 used to do, and > >>>> I ran the same test, and I more or less saw the same execution time. > >>>> Double checked to make sure there are no obvious mistakes. Is this > >>>> also what you see? and if so, can you share some sort of a test that > >>>> actually shows or demonstrates the problem? > > -- > You received this message because you are subscribed to the Google Groups > "CAS Developer" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to [email protected]. > To view this discussion on the web visit > https://groups.google.com/a/apereo.org/d/msgid/cas-dev/CAGSBKkd9v9ss_BCoQwH6X%2B9OrtPWyuDjNaZJy3SusMMD578gGA%40mail.gmail.com > . > -- You received this message because you are subscribed to the Google Groups "CAS Developer" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/a/apereo.org/d/msgid/cas-dev/CAP279Ly3A2TV%3DsCWp_jKRZZG_M97s1O%2BMHpOg_DeejkpBNS0dA%40mail.gmail.com.
