Hi Ben,

after Brian reported the thread performance regression affecting pattern matching in haproxy when relying on the LRU cache, I had a look at the other users of the LRU cache and found that 51d.c is using it with a lock as well, so it may also fail to scale linearly with threads.

You may want to have a look at this patch I just committed to make the pattern LRU cache per-thread:

  403bfbb130 ("BUG/MEDIUM: pattern: make the pattern LRU cache thread-local and lockless")

It's quite straightforward, and if you want to do the same in 51d.c, you just have to make your lru_tree pointer thread_local and allocate it for each thread in a small callback registered to be called after threads are initialized. Same for the call to lru64_destroy(). Then you can remove the LRU lock and gain a lot of scalability.

I'll let you have a look because I'd rather not break something non-obvious in your code, and also because you know better than me how to benchmark the changes using the real lib and database, but if you need some help, just let me know.

Given that it's quite a small and simple change, I'm fine with merging such a patch for 2.1 even a bit late (but only this, no other changes that are not bugfixes please).

Willy
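
PS: in case it helps, here is a rough standalone sketch of the general shape described above. It is not haproxy code and does not use the lru64 API: struct toy_cache, cache_thread_init(), cache_thread_deinit(), cache_lookup() and run_workers() are all made-up names standing in for the lru_tree pointer, the per-thread callbacks and the lookup path, and it uses plain pthreads with C11 _Thread_local so that it builds on its own. The "cache" only remembers the last key, which is just enough to show that no lock is needed once each thread owns its instance.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the real LRU tree; NOT the lru64 API. */
struct toy_cache {
    unsigned long last_key;
    unsigned long hits;
    int valid;
};

/* One pointer per thread: no other thread can reach it, hence no lock. */
static _Thread_local struct toy_cache *cache;

/* Per-thread init: the role played by the callback run once threads exist. */
static int cache_thread_init(void)
{
    cache = calloc(1, sizeof(*cache));
    return cache != NULL;
}

/* Per-thread deinit: where the per-thread destroy call would go. */
static void cache_thread_deinit(void)
{
    free(cache);
    cache = NULL;
}

/* Lockless lookup: only touches the calling thread's own cache. */
static int cache_lookup(unsigned long key)
{
    assert(cache != NULL);
    if (cache->valid && cache->last_key == key) {
        cache->hits++;
        return 1;
    }
    cache->last_key = key;
    cache->valid = 1;
    return 0;
}

static void *worker(void *arg)
{
    unsigned long *hits_out = arg;

    if (!cache_thread_init())
        return NULL;
    for (int i = 0; i < 1000; i++)
        cache_lookup(42);          /* first call misses, the rest hit */
    *hits_out = cache->hits;
    cache_thread_deinit();
    return NULL;
}

/* Runs nthreads workers and returns the sum of their private hit counts. */
unsigned long run_workers(int nthreads)
{
    pthread_t *tids = calloc((size_t)nthreads, sizeof(*tids));
    unsigned long *hits = calloc((size_t)nthreads, sizeof(*hits));
    unsigned long total = 0;
    int started = 0;

    if (tids && hits) {
        while (started < nthreads &&
               pthread_create(&tids[started], NULL, worker, &hits[started]) == 0)
            started++;
        for (int i = 0; i < started; i++) {
            pthread_join(tids[i], NULL);
            total += hits[i];
        }
    }
    free(tids);
    free(hits);
    return total;
}
```

Each thread warms up its own cache, so the first lookup per thread is a miss. That is the price of dropping the lock: the cache contents are no longer shared between threads.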