URL: https://github.com/SSSD/sssd/pull/999
Title: #999: Mem-cache fixes/improvements

sumit-bose commented:
"""
...
> > For the other `/ 8` I would suggest to round up the data from the config 
> > file to the next integer which can be devided by 8 to be on the safe side.
> 
> I think this was already implemented in `sss_mmap_cache_init()`:
> 
> https://github.com/SSSD/sssd/blob/3c10595659698c7d8976bcbed88b3696c222d86e/src/responder/nss/nsssrv_mmap_cache.c#L1305
> 
> Please let me know if you see any reason to move (or duplicate?) this 
> protection or you think it is done wrong.

You are right, I only looked at the patches and wasn't aware that it is already 
done.

> 
> > I think the number of expected objects might be a better option.
> > ...
> > If we have a tool to get statistics from the memory cache and this tool can 
> > show the average payload (i.e. how many slot a cached entry is using) we 
> > might consider to make the payload configurable as well, e.g. by enhancing 
> > the number of expected option so that the value can be 
> > `number_of_expected_objects:payload`. But should should be added alter.
> 
> I understand your intention but IMO from user point of view this is 
> unjustified over-complication. And most probably will not work well "in the 
> field"...
> 
> My reasoning is as follows:
> 
> There is just no such thing as "average payload". What I typically meet in my 
> experience, is that, for example, groups mem-cache may contain entries with 
> members amount ranging from 1 to 30..60k(!) (this is typical for integration 
> with very large AD - exactly the case when mem-cache is very important).
> Of course we can calculate average (or weighted average) size but this 
> doesn't make any sense. It would rather do harm because space will be wasted 
> for small entries and estimation will be wrong anyway.
> So while I understand your idea - "user knows size of own database in amount 
> of entries => let them configure size of the cache in amount entries" - I 
> don't think this is realistic approach.
> What I offer instead is perhaps very blunt but IMO should work well: "user 
> got message in system log that mem-cache is overfill => increase size of the 
> cache". This is less than ideal solution and still needs improvements (we 
> should report cache overfill only if we have to erase _non expired_ entries) 
> but good in the sense of effort/result ration from my point of view (I mostly 
> mean users here, not developers)

I agree about the payload, the only purpose is to have some relation between 
the number of slots in the data table and the number of entries in the hash 
table and I think it is really not needed to make it configurable I was just 
saying that we can in case there will be a potential use-case. My main concern 
if the integer division, especially with 3 and 5. It should not do much harm 
because it is only the size of the hash table but I still think it is better to 
avoid it and using the number of entries as an option would make it a 
multiplication in the data table size. 


> From developers perspective, I would rather go in totally different way as 
> opposite to "average payload" (btw, justification of those factors in the 
> comments in the source code is very loose). I would prefer to get rid of 
> fixed slot size completly (I must admit I didn't investigate performance 
> impact yet but my gut feeling is it should be ok). But this would be quite 
> large refactoring and intention of this PR - to resolve specific bug and to 
> provide at least basic means to ajust cache for NSS load...

Originally the slot size was 32bytes and the fixed slot size together with all 
the memory alignment was use to make sure the a slot or 2 will fix into a CPU 
cache line. This was of course broken by extending the slot size to 40. But I 
think the original idea was good and in the long term it would be good to come 
back to 32bytes (or maybe 64bytes). I'm not sure if a flexible slot size would 
have much benefits.

If you prefer the number of slots as a config option I would suggest to make 
the factor between the data and the hash table size (aka "payload") either 
constantly 4 or even 1. The latter will of course increase the hash table size 
but with the positive side effect that there might be less collisions.

bye,
Sumit  

> 
> I hope this makes sense. Please let me know what you think.


"""

See the full comment at 
https://github.com/SSSD/sssd/pull/999#issuecomment-601231341
_______________________________________________
sssd-devel mailing list -- sssd-devel@lists.fedorahosted.org
To unsubscribe send an email to sssd-devel-le...@lists.fedorahosted.org
Fedora Code of Conduct: 
https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedorahosted.org/archives/list/sssd-devel@lists.fedorahosted.org

Reply via email to