[GitHub] [incubator-superset] craig-rueda commented on pull request #10678: chore: log cache keys to the logs

GitBox Thu, 27 Aug 2020 08:39:29 -0700


craig-rueda commented on pull request #10678:
URL: 
https://github.com/apache/incubator-superset/pull/10678#issuecomment-682027355



   > > I would recommend against tying a cache write to a metadata database 
write by default. This isn't great from a performance perspective. If you're 
attempting to create a lookup table for keys related to a specific dataset, 
perhaps you could push the cache key onto a set in the same cache that contains 
the required info in its key? That would allow this use-case without touching 
the DB.
   > 
   > @willbarrett we are talking about a fairly small load, probably a ~1-5% 
increase in writes to the log table. Currently every access to the explore and 
log endpoints is logged in the log table. This change would not noticeably 
affect the performance / load on the database. Logging in Superset is 
configurable: by default it goes to the metadata DB, but this can be overridden 
so that events are emitted anywhere else. The expected load on our side would be 
~1-3K writes a day. An issue with the cache is that it cannot be treated as a 
reliable source; that's why I picked the log, as it is still configurable and 
serves a slightly different purpose.
   > 
   > @craig-rueda from my perspective logging is a perfect place for 
Superset performance analytics, including cache effectiveness. I picked logging 
mostly because if someone doesn't want to have a huge log table in the Superset 
DB, they can always customize the logger and write to Kafka, Cassandra, a KV 
store, etc.
   > 
   > I really like the idea of writing this mapping into a separate table, as 
the next step would be adding a cache invalidation endpoint. @villebro floated 
a similar idea in the Slack channel. If you agree with this approach, I'm happy 
to refactor this PR and do the writes into a separate table. It would be even 
more efficient, as it would be possible to index on the datasource id / name, 
which would help the endpoint's lookups.
   > 
   > Please let me know how you would prefer to proceed; happy to hop on a call 
/ discuss in Slack to align on the approach.
   
   I think @willbarrett's idea is a good one! This allows us to skip the DB 
altogether and provides a mechanism by which we can invalidate items based on 
something other than the cache key.
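   
   A minimal sketch of that idea, assuming a cache backend that exposes 
`get` / `set` / `delete`. `DictCache`, `index_key`, `set_with_index` and 
`invalidate_datasource` are hypothetical names used only for illustration, not 
Superset code. Each cache write also records its key in a set stored in the 
same cache under a key derived from the datasource id, so all entries for a 
datasource can be found and dropped without touching the DB:

```python
from typing import Any, Dict, Optional


class DictCache:
    """Hypothetical in-memory stand-in for a cache backend (get/set/delete)."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}

    def get(self, key: str) -> Optional[Any]:
        return self._store.get(key)

    def set(self, key: str, value: Any) -> None:
        self._store[key] = value

    def delete(self, key: str) -> None:
        self._store.pop(key, None)


def index_key(datasource_id: int) -> str:
    # The index entry carries the lookup info (the datasource id) in its key.
    return f"datasource_keys:{datasource_id}"


def set_with_index(cache: DictCache, cache_key: str, value: Any,
                   datasource_id: int) -> None:
    """Write the value, then add its key to the per-datasource key set."""
    cache.set(cache_key, value)
    keys = set(cache.get(index_key(datasource_id)) or ())
    keys.add(cache_key)
    cache.set(index_key(datasource_id), keys)


def invalidate_datasource(cache: DictCache, datasource_id: int) -> int:
    """Delete every indexed entry for a datasource; return how many keys were indexed."""
    keys = cache.get(index_key(datasource_id)) or set()
    for key in keys:
        cache.delete(key)
    cache.delete(index_key(datasource_id))
    return len(keys)
```

   Note that the read-modify-write on the index is not atomic in this sketch, 
and the index entry can itself be evicted. A backend with native set operations 
(for example Redis `SADD` / `SMEMBERS`) would avoid the first problem.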
   
   Thanks!! 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



