[lock-free] Scalable distributed sequential lock

Amine Moulay Ramdane Wed, 04 Feb 2015 23:21:02 -0800

*Hello,*


*I have implemented a scalable distributed sequential lock, and i want to 
have your opinion about it, **and please look at my correctness proof and 
my sequential consistency proof , my sequential consistency proof  was done 
for Delphi and Freepascal compilers that respect the hardware's strong 
 memory ordering. If you want to port my algorithm to C++ or Java, feel 
free to do it.*

*Please read the following and tell me what you think about my algorithm?*


*Description:*

*This scalable distributed sequential lock was invented by Amine Moulay 
Ramdane, and it combines the characteristics of a distributed reader-writer 
lock with the characteristics of a sequential lock , so it is a clever 
hybrid reader-writer lock that is more powerful than the the Dmitry 
Vyukov's distributed reader-writer mutex , cause the Dmitry  Vyukov's 
distributed reader-writer lock will become slower and **slower on the 
writer side with more and more cores because it transfers too many 
cache-lines between cores on the writer side, so my invention that is my 
scalable distributed sequential lock has eliminated this weakness of the 
Dmitry Vyukov's distributed reader-writer mutex,  so that the writers 
throughput has become faster and very fast, and my scalable distributed 
sequential lock elminates the weaknesses of the Seqlock (sequential lock) 
that is "livelock" and "starvation" even when there is more writers, so my 
scalable distributed sequential lock is a powerful sychronization mechanism 
that can replace RCU and that can replace Seqlock and that can replace 
Dmitry Vyukov's distributed reader-writer mutex. And if you look at my 
algorithm you will notice that on the reader side it looks like a 
sequential lock, but i have added a variable called fcount6^.fcount6 that 
allows the readers to switch between a sequential lock mechanism and a 
distributed reader-writer mechanism, so on the writer side you will notice 
that on every  **"number of cores" calls to WLock(), my scalable 
distributed sequential lock switches to a distributed reader-writer lock 
mechanism and executes the writer in this mode for a one time and for the 
rest of the time it will switch to a sequential lock mode , this allows my 
scalable distributed **sequential lock to minimize efficiently cache-line 
transfers on the writer side, this is why my  scalable distributed 
sequential lock beats Dmitry Vyukov distributed reader-writer mutex and it 
beats Seqlock.*

*Please look at the "test.pas" example that i have included and you will 
notice that the reader side have to be used in a transactional mode with a 
repeat-until loop, like this:*

*repeat*

*t.rlock(id1,id2,id3);*

*until t.runlock(id1,id2,id3);*

*it is like a transactional mode of a Seqlock and id1,id2 and id3 must be 
of type "long" that must be imported from the drwlock library. The writer 
**side 
is easy to program , please look at the "test.pas" example and you will 
understand how to use my scalable distributed sequential lock.*

*My scalable distributed sequential lock can be used from accross processes 
and threads.     *

*My new invention called distributed sequential lock beats **the following 
algorithm, it is more scalable and faster than **the following algorithm of 
Dmitry Vyukov called Distributed **reader-writer mutex ( i have just 
explained to you why ..):*

http://www.1024cores.net/home/lock-free-algorithms/reader-writer-problem/distributed-reader-writer-mutex
 

*And my new invention called distributed \ sequential lock beats **also 
Seqlock (i have just explained to you why ...), read here about it:*

http://en.wikipedia.org/wiki/Seqlock

*And my new invention called distributed sequential lock can **replace RCU 
cause it is a powerful synchronization mechanism.*

*I will explain my new algorithm of a scalable distributed sequential lock 
that is very powerful...*

*I have arrived at my new algorithm by also reading many PhD papers and by 
looking first at the weakness of the following algorithm of Dmitry Vyukov's 
Reader-writer mutex:*

http://www.1024cores.net/home/lock-free-algorithms/reader-writer-problem/distributed-reader-writer-mutex
 

*look at the writer side of the Dmitry Vykov's algorithm , it is doing with 
 every rwlock in each corresponding core:*

*for (i = 0; i != mtx->proc_count; i += 1)*

*        pthread_rwlock_wrlock(&mtx-cell[i].mtx);*

*but this render this algorithm inefficient for the writer side , cause **every 
pthread_rwlock_wrlock() call transfer many cache-lines betwenn cores, so 
this will generate too much cache line transfers between cores when each 
writer executes distr_rw_mutex_wrlock() and this will become worse and 
worse when you add more and more cores, i mean this will tranfer more and 
more cache-lines between cores with more and more cores... so this will 
render the writer side throughput slow and this is the weakness of this 
algorithm...*

*So this is why i have used a sequential lock mechanism in the write **side 
of my new algorithm so that to minimize efficiently the cache-lines 
transfers, so if you look at my algorithm, i am doing this inside the 
WLock() function on the writer side:*

*if (FCount6^.fcount6 mod GetSystemThreadCount)=0*

*then FCount5^.fcount5:=1;*

*FCount6^.fcount6 is a counter and i have applied to it a modulo of 
GetSystemThreadCount(), and GetSystemThreadCount() will return **the number 
of cores, so every "number of cores" calls to WLock() , **my new algorithm 
switches from a sequential lock to  a distributed lock so that to eliminate 
the "livelock" and "starvation" weaknesses of Seqlock, cause Seqlock is a 
lockfree mechanism that can starve threads, **and that can livelock when 
there is more writers , cause Seqlock gives preference to writers threads 
over reader threads, but my new invention of a scalable distributed 
sequential lock eliminates **"livelock" and "starvation" of Seqlock when 
there is more writers cause it switches to a distributed lock on every 
"number of cores" calls **to the WLock() method , so this is why my new 
invention beats Seqlock, **and my new invention beats also the Dmitry's 
Reader-writer mutex **cause it switches to a distributed lock on every 
number of cores **calls to WLock(), so in the remaining calls it switches 
to a sequential lock that render the writer side faster cause it generate 
much **less cache-lines tranfers than the Dmitry Vyukov's distributed 
reader-writer mutex. On the reader side of my algorithm i am using a 
Sequential lock mechanism. And i think that since my new algorithm o**f a 
scalable distributed sequential lock is very powerful, so **i think it can 
even replace RCU.*

*Here is my proof:*

*Look at the source code, when Fcount5^.fcount5 equal 0 in the writer side 
, the reader side will be run in a Seqlock mode, , i don't need to proove 
Seqlock because my Seqlock algorithm is correct, just compare it to the 
other Seqlock algorithms and you will notice it, what i need to proof is 
when Fcount6^.fcount6 modulo "the number of cores" is equal to 0 , that 
means that we are in a distributed mode, so when we are in a distributed 
mode, the reader side of my algorithm will enter the distributed 
reader-writer lock or it will wait for the distributed reader-writer lock 
to exit, if the reader side enters the distributed reader-writer lock , if 
Fcount6^.fcount6 has not changed on the RUnlock(), the reader thread will 
exit with a "true" value, if Fcount6^.fcount6 has changed, the RUnlock() 
method will catch it and it will rollback with Seqlock mechanism, now if 
 Fcount6^.fcount6 modulo "the number of cores" is equal to 0 on RLock() , 
and the reader side waits for the writer side to enter the distributed 
reader-writer lock, **the reader side after that will enter the distributed 
reader-writer lock and  if Fcount6^.fcount6 modulo "the number of cores" 
has changed **on RUnlock(), the reader side will rollback in a Seqlock 
mode, if not, **RUnlock() will exit with the value of "true". So i think 
my **algorithm is correct.*

*About the sequential consistency of  my scalable distributed sequential 
lock, my algorithm works on x86 architecture and i think my algorithm is 
correct cause look at the source code of the WLock() method, since i am 
using a Ticket spinlock with a proportional backoff on the writer side, the 
Ticket spinlock is using a "lock add" assembler instruction to increment a 
counter in the Enter() method of the ticket spinlock , and this "lock add" 
assembler instruction is a barrier for stores and loads on x86, and the 
stores of Fcount6^.fcount6 and FCount5^.fcount5 **are not reordred with the 
stores of the writer section, so the WLock() method is sequential 
consistent and correct, now look at the WUnlock() , we don't need an 
"sfence" cause stores on WUnlock() are not reordered with older stores on 
x86 , so WUnlock() method is sequential consistent and correct, now look at 
the RLock() method, the loads inside RLock() method are not reordered with 
the loads of the reader section  ,  and on RUnlock(), the loads of 
RUnlock() are not reordered with older loads of the reader section , so all 
in all i think my algorithm is sequentially consistent and correct on x86. 
So be confident cause i have reasoned correctly and i think my algorithm is 
correct and it is a powerful synchronization mechanism that can replace RCU 
and that can replace Seqlock cause it beats Seqlock. *

 

*You can download my scalable distributed sequential lock from:*


*https://sites.google.com/site/aminer68/scalable-distributed-sequential-lock*


*Language**: FPC Pascal v2.2.0+ / Delphi 7+: **http://www.freepascal.org/* 
<http://www.freepascal.org/>

*Operating System**s: **Windows, Mac OSX , Linux on x86.*

*Required FPC switches: -O3 -Sd -dFPC -dFreePascal*

*-Sd for delphi mode.... *

*Required Delphi switches:  -$H+ -DDelphi*



*{$DEFINE CPU32} and {$DEFINE Windows32} for 32 bit systems{$DEFINE CPU64} 
and {$DEFINE Windows64} for 64 bit systems*


*Thank you for your time.*


*Amine Moulay Ramdane.*

-- 

--- 
You received this message because you are subscribed to the Google Groups 
"Scalable Synchronization Algorithms" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/lock-free/27d75643-acd8-49cd-b5f0-c8fb12920cde%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
[lock-free] Scalable distributed sequential lock

Reply via email to