ywkaras commented on pull request #7382: URL: https://github.com/apache/trafficserver/pull/7382#issuecomment-747678411
An issue related to this PR is whether we should continue to use freelists, or instead rely on an implementation of malloc with per-thread memory pools. For demanding applications, TS should run on hosts with many CPU cores. But increasing the number of cores will likely result in a worse-than-linear increase in inter-core contention for freelist head pointers.

To explore this issue, I wrote https://github.com/ywkaras/MiscRepo/blob/master/CAE128/x.cc . It measures performance under contention for a 128-bit variable, with the contention made worse than anything we'd be likely to see in practice. I tested on my 4-core MacBook (running directly in macOS, not a container or VM). My result was about 159.7 nanoseconds to do the rough equivalent of a freelist push or pop. Another interesting result was that, if multiple cores are reading and writing in the same cache line, speed is reduced by a factor of about 500.
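For readers who want to see what "contention for a freelist head" looks like in code, here is a minimal sketch. It is not the linked x.cc and not the Traffic Server freelist; every name in it (`IndexFreelist`, `stress`, `kNil`) is hypothetical. To stay portable it packs a 32-bit node index and a 32-bit version counter into a 64-bit head, rather than using the 128-bit compare-and-exchange discussed above, and it assumes a 64-byte cache line when padding the head.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical sketch: nodes are identified by index, not by pointer.
constexpr std::uint32_t kNil = 0xFFFFFFFFu; // marks "no node" / empty list

struct Node {
  std::atomic<std::uint32_t> next{kNil}; // index of the next free node
};

class IndexFreelist {
public:
  // Creates n nodes and pushes them all, so node n-1 ends up on top.
  explicit IndexFreelist(std::uint32_t n) : nodes_(n) {
    assert(n < kNil);
    for (std::uint32_t i = 0; i < n; ++i) push(i);
  }

  void push(std::uint32_t idx) {
    std::uint64_t old_head = head_.load(std::memory_order_relaxed);
    std::uint64_t new_head;
    do {
      nodes_[idx].next.store(index_of(old_head), std::memory_order_relaxed);
      new_head = pack(idx, version_of(old_head) + 1);
    } while (!head_.compare_exchange_weak(old_head, new_head,
                                          std::memory_order_release,
                                          std::memory_order_relaxed));
  }

  // Returns the index of a free node, or kNil if the list is empty.
  std::uint32_t pop() {
    std::uint64_t old_head = head_.load(std::memory_order_acquire);
    for (;;) {
      std::uint32_t idx = index_of(old_head);
      if (idx == kNil) return kNil;
      std::uint32_t next = nodes_[idx].next.load(std::memory_order_relaxed);
      // The version bump makes a stale `next` fail the exchange (ABA guard).
      std::uint64_t new_head = pack(next, version_of(old_head) + 1);
      if (head_.compare_exchange_weak(old_head, new_head,
                                      std::memory_order_acq_rel,
                                      std::memory_order_acquire)) {
        return idx;
      }
    }
  }

private:
  static std::uint64_t pack(std::uint32_t idx, std::uint32_t ver) {
    return (static_cast<std::uint64_t>(ver) << 32) | idx;
  }
  static std::uint32_t index_of(std::uint64_t h) { return static_cast<std::uint32_t>(h); }
  static std::uint32_t version_of(std::uint64_t h) { return static_cast<std::uint32_t>(h >> 32); }

  std::vector<Node> nodes_;
  // Assumed 64-byte cache line: keep the hot head word on a line of its own.
  alignas(64) std::atomic<std::uint64_t> head_{pack(kNil, 0)};
};

// Every thread hammers the same head with pop/push pairs. Afterwards the list
// is drained and the number of distinct nodes recovered is returned.
std::uint32_t stress(std::uint32_t node_count, unsigned thread_count, unsigned iterations) {
  IndexFreelist list(node_count);
  std::vector<std::thread> workers;
  for (unsigned t = 0; t < thread_count; ++t) {
    workers.emplace_back([&list, iterations] {
      for (unsigned i = 0; i < iterations; ++i) {
        std::uint32_t idx = list.pop();
        if (idx != kNil) list.push(idx);
      }
    });
  }
  for (auto &w : workers) w.join();

  std::vector<bool> seen(node_count, false);
  std::uint32_t distinct = 0;
  for (std::uint32_t idx = list.pop(); idx != kNil; idx = list.pop()) {
    if (!seen[idx]) {
      seen[idx] = true;
      ++distinct;
    }
  }
  return distinct;
}
```

Calling something like `stress(64, 4, 20000)` and timing it gives a rough per-operation cost under contention on a given machine. All threads retry against one shared word, which is the pattern that per-thread pools avoid. The sketch makes no attempt to reproduce the figures quoted above.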
