On Thursday, January 9, 2014 7:02:36 AM UTC-8, Dmitry Vyukov wrote:
>
> On Tue, Jan 7, 2014 at 12:42 PM, Rajiv Kurian <[email protected]> wrote: 
> > I am trying to design a lock-free or close to lock-free hash map of int 
> > keys and int values. I have a single writer thread that writes to the 
> > hash-map. My efforts to do better than a RW mutex around the map haven't 
> > yielded anything good. My gut feeling says I should be able to do better 
> > given it's a single writer. I have the following requirements: 
> > 
> > 1) A single-writer multiple-reader map of int keys and int values. 
> > 2) Keys are guaranteed to be monotonically increasing. These are just 
> > ids and are generated by incrementing an integer. 
> > 3) The operations I need are: 
> > 
> > append(key, value) where key is guaranteed to be higher than anything in 
> > the map so far - only called on the writer thread. 
> > delete(key) - only called on the writer thread. 
> > set(key, value) where key is required to be already in the map - only 
> > called on the writer thread. 
> > get(key) - could be called by writer/reader threads. 
> > 
> > 4) All the reader threads and the single writer thread are known during 
> > the creation of the map. Reader threads do not pop up or die. They stay 
> > constant for the lifetime of the application. 
> > 
> > I am struggling to use anything cheaper than RW locks (changed for a 
> > single writer) for the implementation, since the delete and append 
> > operations could require resizing of buckets or moving elements around, 
> > and the reader threads could observe the changes in between. 
> > 
> > Any pointers? 
>
>
> Hi! 
>
> The most efficient solution is always tailored to your particular 
> situation. Lots of details matter. In particular in this case: 1. what 
> is the number of reader threads? 2. what are the relative frequencies 
> of different operations? 

Number of reader threads = num cores - 1, so all but one core should be 
reading.
Writes are vastly outnumbered by reads; that's all I know for now. I am 
trying to build a framework, so the real ratio depends on the application 
and is hard to predict.

>
>
> Also, the fastest way to do something is to not do it at all. 
> Can you replace integers with pointers? Even if you send them across 
> network, you can verify that the received pointer is valid before 
> using it, e.g: 
>
> #define MAX_OBJECT_IN_FLIGHT (1<<20) 
>
> object_t objects[MAX_OBJECT_IN_FLIGHT]; 
>
> bool verify_pointer(object_t *o) { 
>   if(o < objects || o >= objects + MAX_OBJECT_IN_FLIGHT) 
>     return false; 
>   if(((uintptr_t)o - (uintptr_t)objects) % sizeof(object_t)) 
>     return false; 
>   // OK, this is a valid pointer into objects. 
>   // A client can spoof it, but he can spoof the integer as well. 
>   return true; 
> } 
>
> --

Hmm, I am not sure what you mean. Are you saying that since my int is an id 
of an object, I might as well deal with pointers to that object instead of 
doing extraneous lookups by id? For more context, I am trying to design an 
actor system like Erlang's. IDs instead of pointers help because there are 
additional requirements like persisting actor state and ids across runs of 
the application. The hash map helps with scheduling actors to threads. 
There are num cores - 1 dispatcher threads actually running the actor 
logic. These are the reader threads that I talked about. They are connected 
by a mesh of SPSC queues/ring-buffers. Each dispatcher thread (which is 
running actor logic) needs to check where an actor (id) is scheduled so 
that it can enqueue a msg on the right queue. There is a single writer 
thread which actually does the scheduling of actors to dispatcher threads. 
This is the writer thread that writes to the hash map. Having pointers 
won't save me from doing the lookups, since I cannot use the pointer to 
actually execute a function - I need to know where an object/actor is 
scheduled to run. The operations I need, in context, are:

1) Append(actorId, dispatcher-id). This is called by the single writer 
thread, when a new actor is created and hence the monotonically increasing 
appends.
2) Set(actorId, dispatcher-id). The single scheduler thread observes 
patterns like certain actor pairs communicating a lot, or dispatchers 
becoming imbalanced (queue length, queue drain rate etc.). Based on these 
metrics it can choose to move one or more actors from one dispatcher to 
another. There are other steps involved here that I am omitting.
3) Delete(actorId). An actor has just died hence it's deleted from the 
central directory.
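To make the routing concrete, here is a minimal sketch of the reader-side send path: look up where the actor runs, then enqueue on the SPSC ring to that dispatcher. All names (`dir_get`, `send_msg`, the ring layout) are mine, and `dir_get` is just a toy stand-in for the shared map:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NUM_DISPATCHERS 4
#define RING_CAP 256                  /* power of two so we can mask */

/* One SPSC ring per (producer, consumer) pair of dispatchers. */
typedef struct {
    _Atomic uint32_t head;            /* written only by the producer */
    _Atomic uint32_t tail;            /* written only by the consumer */
    int msg[RING_CAP];
} spsc_ring_t;

static spsc_ring_t mesh[NUM_DISPATCHERS][NUM_DISPATCHERS];

/* Toy stand-in for the shared directory (actorId -> dispatcher-id);
   in the real system this is the single-writer hash map. */
static int dir_get(int actor_id) { return actor_id % NUM_DISPATCHERS; }

/* Reader-side send path: look up where the actor is scheduled,
   then enqueue on the SPSC queue to that dispatcher. */
static int send_msg(int my_dispatcher, int actor_id, int msg) {
    int target = dir_get(actor_id);
    spsc_ring_t *q = &mesh[my_dispatcher][target];
    uint32_t head = atomic_load_explicit(&q->head, memory_order_relaxed);
    uint32_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (head - tail == RING_CAP)
        return 0;                     /* queue full, caller must retry */
    q->msg[head & (RING_CAP - 1)] = msg;
    atomic_store_explicit(&q->head, head + 1, memory_order_release);
    return 1;
}
```

The lookup cost sits on every send, which is why get needs to be as close to a plain load as possible.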

On your suggestion of specialization, something I thought of is that since 
I already have communication channels between the writer thread and the 
reader threads (ring-buffers that they poll on), I could use them as an 
opportunity to do updates such as resize/compaction etc. I am using a 
closed-addressing hash map implemented as 2^n buckets of key/value arrays. 
Since ids are monotonically increasing, an append will never displace 
another value. The only times the value of an existing cell changes are:
i) There is a compaction (too many deletions).
ii) A rebalance event occurs, resulting in Set(actorId, dispatcher-id) 
calls.
iii) An append call causes a bucket to run out of space or pushes the load 
factor of the table too high. This causes wholesale restructuring of the 
hash-map - allocation and freeing of one or more buckets, increasing the 
number of buckets, etc.
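The bucket layout I have in mind looks roughly like this (a sketch with made-up sizes; the publication ordering is only noted in comments, not implemented):

```c
#include <assert.h>
#include <stdint.h>

#define NUM_BUCKETS (1 << 8)     /* 2^n buckets: index by mask, not modulo */
#define BUCKET_CAP  16           /* slots per bucket */

/* Closed addressing: each bucket is a small key/value array. With
   monotonically increasing ids, append only fills a fresh slot and
   never moves an existing entry. */
typedef struct {
    int n;                       /* slots used */
    int key[BUCKET_CAP];
    int val[BUCKET_CAP];
} bucket_t;

static bucket_t buckets[NUM_BUCKETS];

/* Writer only. Sketch: assumes b->n < BUCKET_CAP; a full bucket would
   trigger the wholesale-restructuring path, under coordination. */
static void append(int key, int val) {
    bucket_t *b = &buckets[key & (NUM_BUCKETS - 1)];
    b->key[b->n] = key;
    b->val[b->n] = val;
    b->n++;                      /* publish last; real code needs a release store here */
}

/* Readers: scan the bucket's used slots. */
static int get(int key) {
    bucket_t *b = &buckets[key & (NUM_BUCKETS - 1)];
    for (int i = 0; i < b->n; i++)
        if (b->key[i] == key)
            return b->val[i];
    return -1;                   /* not found */
}
```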

If I size my map appropriately (massively oversized), then compaction and 
append-out-of-space events should be rare enough. Rebalancing in general 
should be rare and should stabilize with time. Rebalancing also requires 
other coordination in any case - dispatcher threads need to stop processing 
messages of the transitioning actors, etc. So what I could do is: any time 
the writer thread needs to change the structure of the hash-map, it waits 
for a safe point by signalling a map-about-to-change event to all the other 
threads and waiting for them to acknowledge. Once they acknowledge (which 
means they are just busy-waiting for the end-of-map-change event), I could 
make as many changes as I want to the internal structure of the map and 
then signal end-of-map-change. The reader threads will now be caught up 
with all the changes and can proceed. The nice thing is that all get calls 
can proceed without any locks. Append calls always make changes to 
previously unaccessed cells, so a normal load of the cell should be fine 
for get calls. All of this is based on the hypothesis that resize, 
compaction and set calls will be infrequent.
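The safe-point handshake could be sketched with C11 atomics like this (names are mine; in reality the signalling would ride on the existing ring-buffers rather than dedicated flags, and `demo_restructure` is just a placeholder payload):

```c
#include <assert.h>
#include <stdatomic.h>

#define NUM_READERS 3   /* num cores - 1, known at startup */

static atomic_int map_changing;        /* 1 while a change is pending/ongoing */
static atomic_int acked[NUM_READERS];  /* reader i sets acked[i] at its safe point */

static int restructures_done;          /* demo payload for the sketch */
static void demo_restructure(void) { restructures_done++; }

/* Reader side: called from each dispatcher's poll loop. */
static void reader_poll(int id) {
    if (atomic_load(&map_changing)) {
        atomic_store(&acked[id], 1);            /* "I am at a safe point" */
        while (atomic_load(&map_changing)) { }  /* busy-wait for end-of-map-change */
    }
}

/* Writer side: request a safe point, restructure in private, release. */
static void writer_restructure(void (*restructure)(void)) {
    atomic_store(&map_changing, 1);             /* map-about-to-change */
    for (int i = 0; i < NUM_READERS; i++)
        while (!atomic_load(&acked[i])) { }     /* wait for every ack */
    restructure();                              /* sole access: free to move/resize */
    for (int i = 0; i < NUM_READERS; i++)
        atomic_store(&acked[i], 0);
    atomic_store(&map_changing, 0);             /* end-of-map-change */
}
```

The sequentially consistent defaults here are deliberate; the handshake is rare, so there is no point shaving barriers off it.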
 

>
>
> Then, if you can limit the difference between the newest and the oldest 
> object, you can use a ring buffer with direct indexing. If a 
> reader asks for a very old object (older than the threshold), then, 
> sorry, no luck. 
>
> -- 
>
> If you can not afford to reject requests for very old ids, but you 
> expect them not to happen in the common case, then you can have a fast 
> path for recent ids and a slow path for older ids. E.g.: 
>
> // We do not expect to receive requests for ids older than this 
> #define N (1<<20) 
>
> struct table_element { 
>   int key; 
>   int val; 
>   table_element_slow* slow; 
> }; 
>
> table_element table[N]; 
>
> int find(int key) { 
>   int idx = key % N; 
>   table_element* e = &table[idx]; 
>   if(e->key == key) 
>     return e->val; 
>   // consult e->slow 
>   // this can be slow, but we do not expect this to happen in common case 
> } 
>
> This needs special care wrt atomic accesses to key/val, and also some 
> protection for slow, and other details. But I hope the idea is clear. 
>
I think so. 
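One way I imagine handling the atomic key/val concern on the fast path is to pack both ints into a single 64-bit word, so a reader's one atomic load can never observe a key paired with a stale val. A sketch (names are mine; the zero-initialized empty slot looks like key 0, which a real version would handle with a sentinel):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define N (1 << 20)   /* same bound as the table above */

/* key in the high 32 bits, val in the low 32 bits, one atomic word. */
static _Atomic uint64_t table64[N];

static inline uint64_t pack(int32_t key, int32_t val) {
    return ((uint64_t)(uint32_t)key << 32) | (uint32_t)val;
}

/* Writer only. */
static void put64(int32_t key, int32_t val) {
    atomic_store_explicit(&table64[key % N], pack(key, val),
                          memory_order_release);
}

/* Readers: returns 1 and fills *val on a fast-path hit, 0 on a miss
   (where the slow path would be consulted). */
static int find64(int32_t key, int32_t *val) {
    uint64_t e = atomic_load_explicit(&table64[key % N],
                                      memory_order_acquire);
    if ((int32_t)(e >> 32) != key)
        return 0;
    *val = (int32_t)(uint32_t)e;
    return 1;
}
```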

>
> -- 
>
> There can be other "special" solutions. 
> In all these cases, accesses are extremely fast in the common case. 
> If that does not work for you, then I would consider taking a normal 
> hashmap, partitioning it into lots of partitions and protecting each 
> partition with its own rw mutex. On top of that you can have a mutex for 
> each thread/partition pair; readers lock their own mutex, but the writer 
> needs to lock the mutexes of all threads. This will eliminate contention 
> between readers, but make the writer slower. Since you know the number 
> of readers ahead of time, this should be really simple to implement. 
>
Right, I saw your RW mutex solution based on one cell per core. 
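For completeness, my understanding of that scheme as a minimal sketch (one mutex per reader thread, a "big-reader lock"; the per-partition dimension is dropped for brevity and all names are mine):

```c
#include <assert.h>
#include <pthread.h>

#define NUM_READERS 3   /* must match the number of initializers below */

/* One mutex per reader thread: a reader takes only its own mutex (no
   inter-reader contention, ideally one cache line each); the writer
   takes all of them, always in the same order to avoid deadlock. */
static pthread_mutex_t rlock[NUM_READERS] = {
    PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER,
    PTHREAD_MUTEX_INITIALIZER,
};

static void reader_lock(int id)   { pthread_mutex_lock(&rlock[id]); }
static void reader_unlock(int id) { pthread_mutex_unlock(&rlock[id]); }

static void writer_lock(void) {
    for (int i = 0; i < NUM_READERS; i++)
        pthread_mutex_lock(&rlock[i]);
}

static void writer_unlock(void) {
    for (int i = NUM_READERS - 1; i >= 0; i--)
        pthread_mutex_unlock(&rlock[i]);
}
```

Reads stay cheap and uncontended; writes pay for locking every reader's mutex, which matches my read-heavy ratio.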

>
>
>
>
>
> -- 
> Dmitry Vyukov 
>
> All about lockfree/waitfree algorithms, multicore, scalability, 
> parallel computing and related topics: 
> http://www.1024cores.net 
>

-- 

--- 
You received this message because you are subscribed to the Google Groups 
"Scalable Synchronization Algorithms" group.