AlexanderKaraberov edited a comment on issue #2493: couch_auth_cache creates 
millions of monitors to _users.couch database which introduces a performance 
drop
URL: https://github.com/apache/couchdb/issues/2493#issuecomment-579370645
 
 
   Hello @kocolosk, and thank you for a fast reply!
   
   >  If I remove the couch_db:monitor/1 invocation and call reinit_cache in a 
loop I only see the one monitor for the file descriptor at any point in time.
   
   Unfortunately this is not the case for me. I'm not running current `master`; 
however, I've compared the version of `couch_auth_cache` from our `2.3.0` fork 
with current `master` and there are no differences. Here is the normal 
scenario, right after launching the CouchDB server:
   <img width="688" alt="standard" 
src="https://user-images.githubusercontent.com/3254818/73288725-cd965680-41fb-11ea-802e-c932675dc23a.png">
   
   As you can see, two monitors are currently active for the 
`couch_auth_cache` process: one for `couch_file` (`<0.320.0>`) and one for 
`couch_db_updater` (`<0.319.0>`). Now I repeatedly call `reinit_cache`, and both 
of those monitors are duplicated each time:
   
   <img width="393" alt="dupes" 
src="https://user-images.githubusercontent.com/3254818/73289187-8bb9e000-41fc-11ea-9d73-1602aea1626f.png">
   
   When I remove the `couch_db:monitor/1` invocation and repeat the same 
procedure, only the `couch_file` monitors are duplicated:
   
   <img width="478" alt="couch_file" 
src="https://user-images.githubusercontent.com/3254818/73289427-fbc86600-41fc-11ea-989b-16b89391d7d7.png">
   
   In fact, this matches what I've seen on our production nodes, where there 
were only duplicate `couch_file` monitors (several million of them) but no 
duplicate `couch_db_updater` monitors.
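   
   For anyone who wants to check their own nodes, here is a rough sketch of how 
such counts can be gathered from a remote shell. It assumes the process is 
registered locally as `couch_auth_cache`, and it only uses 
`erlang:process_info/2` and `dict`, so it should work on older OTP releases. 
This is an illustration, not necessarily the exact commands behind the 
screenshots:
   
   ```erlang
   %% Count the monitors held by couch_auth_cache, grouped by monitored target.
   %% Assumption: the gen_server is registered locally under this name.
   Pid = whereis(couch_auth_cache),
   {monitors, Mons} = erlang:process_info(Pid, monitors),
   Counts = lists:foldl(
       fun({process, Target}, Acc) ->
               %% One entry per active monitor, so duplicates add up here.
               dict:update_counter(Target, 1, Acc);
          (_Other, Acc) ->
               %% Ignore port monitors and any other monitor types.
               Acc
       end, dict:new(), Mons),
   %% Highest counts first: [{TargetPid, NumberOfMonitors}, ...]
   lists:reverse(lists:keysort(2, dict:to_list(Counts))).
   ```
   
   Running this before and after a few `reinit_cache` calls shows whether the 
count for a single target pid keeps growing.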
   
   >  Is it possible that those monitors were the ones you saw during your 
debugging, and that the couch_file monitors you observed in production have a 
different root cause?
   
   I'm not fully excluding this possibility. However, one interesting 
correlation, as noted in my first post, is that the accumulation of these 
monitors and the subsequent slow `pread_iolist` calls correspond exactly to a 
spike in `couchdb.auth_cache_misses`. What else could trigger this if not a 
cold cache?
   
   > Taking a step back, it's not at all obvious to me that we still need 
couch_auth_cache as a running service, especially now that the backdoor HTTP 
port has been removed. 
   
   This is actually great news. If someone from the dev team can confirm 
that I can shut down this process safely without any impact, that would be 
wonderful. I'm not so interested in fixing the root cause, especially 
considering that this is an outdated (legacy) module. I merely 
don't want this issue to return.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
