craig-rueda commented on pull request #10678: URL: https://github.com/apache/incubator-superset/pull/10678#issuecomment-682027355
> > I would recommend against tying a cache write to a metadata database write by default. This isn't great from a performance perspective. If you're attempting to create a lookup table for keys related to a specific dataset, perhaps you could push the cache key onto a set in the same cache that contains the required info in its key? That would allow this use-case without touching the DB.
>
> @willbarrett we are talking about a fairly small load, probably a ~1-5% increase in writes to the log table. Currently every access to the explore and log endpoints is logged in the log table, so this change would not noticeably affect the performance of / load on the database. Logging in Superset is configurable: by default it goes to the metadata DB, but it can be overridden and emitted anywhere else. The expected load on our side would be ~1-3K writes a day. An issue with the cache is that it cannot be treated as a reliable source; that's why I picked the log, as it is still configurable and serves a slightly different purpose.
>
> @craig-rueda from my perspective logging is a perfect place for Superset performance analytics, including cache effectiveness. I picked logging mostly because if someone doesn't want a huge log table in the Superset DB, they can always customize the logger and write to Kafka, Cassandra, a KV store, etc.
>
> I really like the idea of writing this mapping into a separate table, as the next step would be adding a cache invalidation endpoint. @villebro floated a similar idea in the Slack channel. If you agree with this approach, I'm happy to refactor this PR and do the writes into a separate table. It would be even more efficient, as it would be possible to index on the datasource id / name for the endpoint effectiveness.
>
> Please let me know how you would prefer to proceed. Happy to hop on a call or discuss in Slack to align on the approach.

I think @willbarrett's idea is a good one!
This allows us to skip the DB altogether and provides a mechanism by which we can invalidate items based on something other than the cache key. Thanks!!
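
For illustration, the "push the cache key onto a set in the same cache" idea could look roughly like the sketch below. This is not Superset code: `IndexedCache`, `datasource_uid`, and `invalidate_datasource` are hypothetical names, and plain in-memory dicts stand in for whatever cache backend is actually configured. It only aims to show that a per-datasource set of cache keys is enough to invalidate entries without a metadata DB write.

```python
from collections import defaultdict
from typing import Any, Dict, Optional, Set


class IndexedCache:
    """Toy in-memory cache that also records which keys belong to a datasource.

    Hypothetical sketch: a real deployment would keep both the values and
    the per-datasource key sets in the shared cache backend.
    """

    def __init__(self) -> None:
        self._values: Dict[str, Any] = {}
        # datasource_uid -> set of cache keys written for that datasource
        self._keys_by_datasource: Dict[str, Set[str]] = defaultdict(set)

    def set(self, cache_key: str, value: Any, datasource_uid: str) -> None:
        """Store the value and add its key to the datasource's key set."""
        self._values[cache_key] = value
        self._keys_by_datasource[datasource_uid].add(cache_key)

    def get(self, cache_key: str) -> Optional[Any]:
        """Return the cached value, or None on a miss."""
        return self._values.get(cache_key)

    def invalidate_datasource(self, datasource_uid: str) -> int:
        """Delete every cached entry recorded for the datasource.

        Returns the number of entries actually removed. Keys that are
        already gone (for example, evicted) are skipped silently.
        """
        keys = self._keys_by_datasource.pop(datasource_uid, set())
        removed = 0
        for key in keys:
            if key in self._values:
                del self._values[key]
                removed += 1
        return removed


cache = IndexedCache()
cache.set("key-a", {"rows": 10}, datasource_uid="42__table")
cache.set("key-b", {"rows": 20}, datasource_uid="42__table")
cache.set("key-c", {"rows": 30}, datasource_uid="7__table")

removed_count = cache.invalidate_datasource("42__table")
```

Because the cache is not a reliable store, the key set and the cached values can drift apart (an entry may be evicted while its key is still in the set), so invalidation in this sketch tolerates missing keys rather than treating them as errors.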
