In the last week: 1) I tried to find out what's wrong with my last patch: reducing the contention on SerializableFinishedListLock. 2) I used a skip list to replace the linked list to speedup conflict tracking. But there are still some bugs resulting in segment fault. Here is the code: https://github.com/liumx10/postgresql/commit/c469e14e27c31edba5caff001647440040f53c84 I will figure it out in the next days.
Here is the details. 1) Reducing the contention on SerializableFinishedListLock In the last email, I reported that reducing the contention on FinishedListLock didn't increase the performance. The system became even slower in some cases. I thought the key point was not the problem of my codes. The original Pseudo code looks like this: lockAcquire(SerializableFinishedListLock) for predlock in this Sxact: lockAcquire(patitionlock); // do something; unlock(paritionlock); releaseAcquire(SerializableFinishedListLock) SerializableFinishedListLock likes a coarse-grained lock, while partition locks like fine-grained lock. My patch essentially remove the outer lock. But unfortunately, there are many other places requiring partition locks. We have only 16 partition locks in all. So even without SerializableFinishedListLock, contentions will happen on partition locks. The performance wouldn't be improved. Why the performance became worse in some cases? Context switch may be one of the reason. It was easier to fail on getting locks when using fine-grained locks, which would hang the current process and result in a context switch. For example, if it failed to get partition locks 5 times in a for loop, there would be 5 context switch at least. I used vmstat to count the context switch. In the benchmark where my patch had a worse performance, there were more context switch happened than using the original code. I also modify the source code to count where requiring partition locks failed. In the original version, only a few (no more than 1%) locking failure came from the pseudo code above. But in my patch, more than 50% failure came from this part. 2) Skip list To make it simple, I didn't develop an independent data structure for skip list. Instead, I used two linked list to simulate skip list. Most transactions have no more than 50 conflicts, so two level skip list is enough now. I'll make a clean patch after I figure out bugs. -- Sincerely Mengxing Liu