john-bodley opened a new issue #15396: URL: https://github.com/apache/superset/issues/15396
**Is your feature request related to a problem? Please describe.**

As mentioned in the Apache Superset Contributor Meetup on [2021-06-25](https://docs.google.com/document/d/1xRLqmUn-G7WiPe8qZnfuvbrYHZ1jmk9aSv9Bhjis2tg/edit#heading=h.uh575p50wagev), at Airbnb we implemented a custom memoization function (`memoize_with_user_lock`) which provides per-user global locking (based on a cache key) to reduce the Presto query load as it relates to the `latest_partition` call. This ensures that numerous instances of the same query are not simultaneously sent to the underlying engine (which may or may not dedupe). Though the cache is now at the per-user level to prevent deadlocks (necessary for our implementation because our Presto cluster queue is faceted by user, so there is no guarantee that queries will be executed globally in FIFO order), we noticed a significant decrease in the number of queries when users invoked dashboard filters.

**Describe the solution you'd like**

I'm not saying this solution will work globally, but I felt there was merit in sharing the code. Note this is implemented for Redis, which provides locking.

```python
from functools import wraps
from typing import Any, Callable, Optional

from flask import g
from redis.lock import Lock

from superset import cache_manager


def memoize_with_user_lock(
    timeout: Optional[int] = None,
) -> Callable[..., Any]:
    """
    Decorator for memoization which leverages per-user global locking to
    prevent simultaneous execution.

    :param timeout: If set, will cache for that amount of time (in seconds)
    :returns: A memoize decorator
    """

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            """
            Cache the result of a function, via memoization, taking its
            arguments and the user into account in the cache key.

            Rather than re-implementing the somewhat non-trivial Flask-Caching
            memoization logic we simply re-wrap the memoized function and
            serialize the calls if uncached to prevent simultaneous execution.

            :see: https://github.com/sh4nks/flask-caching
            """

            @cache_manager.data_cache.memoize(
                make_name=lambda fname: str((fname, g.user.username)),
                timeout=timeout,
            )
            def memoized_func(*args: Any, **kwargs: Any) -> Any:
                return func(*args, **kwargs)

            # The cache key associated with the non-memoized function, which is
            # also used for locking, is additionally keyed by user to prevent
            # deadlocks.
            cache_key = memoized_func.make_cache_key(
                memoized_func.uncached, *args, **kwargs
            )

            # First check whether the Flask-Caching memoized function is
            # cached. Note the cached value is never `None` by construction.
            rv = cache_manager.data_cache.get(cache_key)

            if rv is None:
                # Per-user sequential execution of the Flask-Caching memoized
                # function.
                with Lock(
                    cache_manager.data_cache.cache._write_client,  # pylint: disable=protected-access
                    cache_key,
                ):
                    # Calling the Flask-Caching memoized function caches the
                    # result.
                    rv = memoized_func(*args, **kwargs)

            return rv

        return wrapper

    return decorator
```
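To illustrate the serialize-if-uncached idea without Superset, Redis, or Flask-Caching, here is a minimal self-contained sketch of the same pattern using only the standard library. A plain dict stands in for the cache and a per-key `threading.Lock` stands in for the Redis lock; `memoize_with_lock` and the decorated `latest_partition` below are hypothetical stand-ins, not the actual Superset API:

```python
import threading
from functools import wraps
from typing import Any, Callable, Dict, Tuple

# In-memory stand-ins for the Redis-backed cache and lock described above.
_cache: Dict[Tuple[Any, ...], Any] = {}
_locks: Dict[Tuple[Any, ...], threading.Lock] = {}
_locks_guard = threading.Lock()
call_count = 0  # counts how often the expensive function actually runs


def memoize_with_lock(func: Callable[..., Any]) -> Callable[..., Any]:
    """Serialize uncached calls so concurrent callers compute the value once."""

    @wraps(func)
    def wrapper(*args: Any) -> Any:
        cache_key = (func.__name__, *args)

        # Fast path: value already cached, no locking needed.
        if cache_key in _cache:
            return _cache[cache_key]

        # Obtain (or lazily create) the per-key lock.
        with _locks_guard:
            lock = _locks.setdefault(cache_key, threading.Lock())

        with lock:
            # Re-check after acquiring the lock: another thread may have
            # populated the cache while we were waiting.
            if cache_key not in _cache:
                _cache[cache_key] = func(*args)
            return _cache[cache_key]

    return wrapper


@memoize_with_lock
def latest_partition(table: str) -> str:
    global call_count
    call_count += 1  # pretend this is an expensive Presto query
    return f"{table}=2021-06-25"


threads = [
    threading.Thread(target=latest_partition, args=("logs",)) for _ in range(10)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(call_count)  # the expensive call ran only once despite 10 threads
print(latest_partition("logs"))
```

The double-check inside the lock mirrors what the snippet above gets from Flask-Caching: a thread that waited on the lock finds the value already cached and returns it rather than re-executing the query.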
