> BenchmarkSQL has about half reads. So I think it should be effective.
> I don't think BufFreelistLock takes much time; it just gets a buffer from
> the list. It should be very fast.

You're wrong.  That list is usually empty right now, so it does a
linear scan of the buffer pool looking for a good eviction candidate,
all while holding BufFreelistLock.
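
To make the difference concrete, here is a toy model of the two paths.
It is only a sketch, not the actual bufmgr code: every name in it
(ToyBuffer, toy_get_buffer, and so on) is made up for illustration,
the pool has 8 buffers, and there is no locking at all. It just counts
how many buffers get examined per allocation.

```c
#define NBUFFERS 8          /* toy pool size; a real pool is far larger */

typedef struct {
    int usage_count;        /* decremented each time the scan passes by */
    int pinned;             /* nonzero means the buffer cannot be evicted */
    int next_free;          /* next index on the free list, or -1 */
} ToyBuffer;

static ToyBuffer pool[NBUFFERS];
static int  free_head  = -1;    /* -1 means the free list is empty */
static int  sweep_hand = 0;     /* where the next scan starts */
static long steps      = 0;     /* buffers examined by the scan so far */

/* Reset the pool: empty free list, every buffer unpinned with the
 * given usage count.  (Hypothetical helper for this sketch.) */
static void toy_reset(int usage_count)
{
    for (int i = 0; i < NBUFFERS; i++) {
        pool[i].usage_count = usage_count;
        pool[i].pinned = 0;
        pool[i].next_free = -1;
    }
    free_head = -1;
    sweep_hand = 0;
    steps = 0;
}

/* Push buffer b onto the free list. */
static void toy_free(int b)
{
    pool[b].next_free = free_head;
    free_head = b;
}

/* Return a victim buffer index, or -1 if every buffer is pinned. */
static int toy_get_buffer(void)
{
    /* Fast path: pop from the free list in constant time. */
    if (free_head >= 0) {
        int b = free_head;
        free_head = pool[b].next_free;
        pool[b].next_free = -1;
        return b;
    }

    /* Slow path: scan the pool until an unpinned buffer with
     * usage_count 0 turns up, decrementing counts along the way. */
    int pinned_in_a_row = 0;
    while (pinned_in_a_row < NBUFFERS) {
        int b = sweep_hand;
        sweep_hand = (sweep_hand + 1) % NBUFFERS;
        steps++;
        if (pool[b].pinned) {
            pinned_in_a_row++;
            continue;
        }
        pinned_in_a_row = 0;
        if (pool[b].usage_count == 0)
            return b;
        pool[b].usage_count--;
    }
    return -1;
}
```

After toy_reset(3), the first toy_get_buffer() call has to examine 25
buffers (three full passes plus one step) before it finds a victim,
whereas a call made right after toy_free(5) returns buffer 5 without
examining anything. The cheap case you are describing is the second
one; the first one is what happens when the list is empty.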

> The test server has 2 CPUs with 12 cores each, 24 cores in total.
> CPU idle time is over 50%. I/O is only 10% (data is on SSD).
> I ran perf on one PostgreSQL process. The hot spot is hash search. The
> perf data file is attached.

I think you need to pass -g to perf record so that you get a call-graph
profile.  Then you should be able to expand the entry for
hash_search_with_hash_value() in the report and see what's calling it.

