> I claim that there are two alternatives in the face of one thread
> mutating an object and the other observing:
Well, I did consider the possibility of one thread being able to change an 
object while the others observe it, but I have no idea whether that is as 
complicated as you are suggesting.
However, that is not even necessary.  An even more limited form would work 
fine (at least for me):
 
Two possible modes:
Read/write from one thread:
* ONLY one thread can change and observe (read) the object -- no other threads 
have access of any kind, or even know of its existence, until you transfer 
control to another thread (then only the thread you transferred control to has 
access).
(Optional) read-only from all threads:
* Optionally, you could have objects that are in read-only mode, which all 
threads can observe.
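
A minimal sketch of the two modes above, in plain Python.  The names `Owned` 
and `OwnershipError` are hypothetical and made up for illustration; this only 
checks ownership at run time under the ordinary GIL and says nothing about how 
an interpreter would enforce it.

```python
import threading


class OwnershipError(RuntimeError):
    """Raised when a thread touches an object it is not allowed to use."""


class Owned:
    """Hypothetical wrapper: one owning thread, or frozen and readable by all."""

    def __init__(self, value):
        self._value = value
        self._owner = threading.get_ident()  # the creating thread owns it
        self._frozen = False

    def _check_owner(self):
        if self._owner != threading.get_ident():
            raise OwnershipError("calling thread does not own this object")

    def get(self):
        # Frozen objects are readable from any thread; otherwise owner only.
        if not self._frozen:
            self._check_owner()
        return self._value

    def set(self, value):
        if self._frozen:
            raise OwnershipError("object is read-only")
        self._check_owner()
        self._value = value

    def transfer(self, thread_ident):
        # Hand control to another thread; the old owner loses all access.
        self._check_owner()
        self._owner = thread_ident

    def freeze(self):
        # Switch to read-only mode (one-way in this sketch).
        self._check_owner()
        self._frozen = True


def try_read(box, out):
    try:
        out.append(box.get())
    except OwnershipError:
        out.append("denied")


box = Owned((1, 2, 3))  # immutable payload, so readers cannot mutate it
results = []

# Mode 1: a second thread has no access while the main thread owns the object.
t = threading.Thread(target=try_read, args=(box, results))
t.start()
t.join()

# Mode 2: after freezing, every thread may observe it.
box.freeze()
t = threading.Thread(target=try_read, args=(box, results))
t.start()
t.join()
```

The first read is refused and the second succeeds once the object is frozen.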
 
To make things easier, maybe special GIL-free threads could be added.  (They 
would still be OS-level threads, but with special properties in Python.) These 
threads would have the property that they could ONLY access data stored in the 
special object store to which they have read/write privilege.  They can't 
access other objects not in the special store.  As a result, these special 
threads would be free of the GIL and could run in parallel.

> Queues already are in a sense your per-object-lock,
> one-thread-mutating, but usually one thread has acquire semantics and
> one has release semantics, and that combination actually works. It's
> when you expect to have a full memory barrier that is the problem.

Now you brought up something interesting: queues.
To be honest, something like queues and pipes would be good enough for my 
purposes -- if they used shared memory.  Currently, the implementation of 
queues and pipes in the multiprocessing module seems rather costly, as they 
use processes and require copying data back and forth.
In particular, what would be useful:
 
* A queue that holds self-contained Python objects (with no pointers/references 
to other data not in the queue so as to prevent threading issues)
* The queue can be accessed by all special threads simultaneously (in 
parallel).  You would only need locks around queue operations, but that is 
pretty easy to do -- unless there is some hidden Interpreter problem that would 
make this easy task hard.
* Streaming buffers -- like a file buffer or something similar, so you can send 
data from one thread to another as it comes in (when you don't know when it 
will end or it may never end).  Only two threads have access: one to put data 
in, the other to extract it.
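
To show the access pattern I mean (not the GIL-free implementation itself), 
here is a small sketch using the standard `queue.Queue`, which already locks 
around each operation.  The `_END` sentinel and the item layout are my own 
assumptions for illustration, and under the current GIL the two threads do not 
actually run in parallel.

```python
import queue
import threading

_END = object()  # hypothetical sentinel marking the end of the stream


def producer(q, n):
    # Put only self-contained values: no references to shared mutable state.
    for i in range(n):
        q.put({"seq": i, "payload": bytes([i])})
    q.put(_END)


def consumer(q, out):
    # Pull items as they arrive, without knowing in advance how many there are.
    while True:
        item = q.get()
        if item is _END:
            break
        out.append(item["seq"])


channel = queue.Queue(maxsize=2)  # bounded, so the producer cannot run far ahead
received = []

threads = [
    threading.Thread(target=producer, args=(channel, 5)),
    threading.Thread(target=consumer, args=(channel, received)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With exactly one producer and one consumer, the items come out in the order 
they went in.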
 
> 0. You can give up consistency and do fine-grained locking, which is
> reasonably fast but error prone, or
> 1. Expect python to handle all of this for you, effectively not making
> a change to the memory model. You could do this with implicit
> per-object locks which might be reasonably fast in the absence of
> contention, but not when several threads are trying to use the object.
> 
...
> 
> Come to think of it, you might be right Kevin: as long as only one
> thread mutates the object, the mutating thread never /needs/ to
> acquire, as it knows that it has the latest revision.
> 
> Have I missed something?
I'm afraid I don't know enough about Python's Interpreter to say much.  The 
only way would be for me to do some studying on interpreters/compilers and get 
digging into the codebase -- and I'm not sure how much time I have to do that 
right now. :)
Perhaps the part about only one thread having read & write access changes the 
situation?
 
One possible implementation might be similar to how POSH does it:
I'm not suggesting this because I know enough to say it is possible, but 
just to put something out there that might work.
Create a special virtual memory address or lookup table for each thread.  When 
you assign a read+write object to a thread, it gets added to that thread's 
virtual address/memory table.
Optionally, it could be up to the programmer to make sure they don't try to 
access data from a thread that does not have ownership/control of that object.  
If a programmer does try to access it, the access would fail, as the memory 
address would point to nowhere/bad data/etc....
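
A rough sketch of the per-thread lookup table idea, again in plain Python.  
`ThreadTables` and its methods are hypothetical names; a dict keyed by thread 
id stands in for the virtual address table, and a failed lookup stands in for 
an address that points nowhere.  I am not claiming this is how POSH works.

```python
import threading


class ThreadTables:
    """Hypothetical per-thread lookup tables mapping handles to objects."""

    def __init__(self):
        self._tables = {}  # thread ident -> {handle: object}
        self._lock = threading.Lock()  # protects the tables themselves

    def assign(self, handle, obj, thread_ident=None):
        # Add the object to one thread's table (the caller's by default).
        ident = threading.get_ident() if thread_ident is None else thread_ident
        with self._lock:
            self._tables.setdefault(ident, {})[handle] = obj

    def lookup(self, handle):
        # Only the calling thread's table is consulted; KeyError otherwise.
        with self._lock:
            return self._tables.get(threading.get_ident(), {})[handle]

    def transfer(self, handle, to_ident):
        # Move the entry out of the caller's table and into another thread's.
        with self._lock:
            obj = self._tables[threading.get_ident()].pop(handle)
            self._tables.setdefault(to_ident, {})[handle] = obj


tables = ThreadTables()
tables.assign("buf", bytearray(b"abc"))  # owned by the main thread
outcome = []


def worker():
    try:
        tables.lookup("buf")
        outcome.append("visible")
    except KeyError:
        outcome.append("not mapped")


t = threading.Thread(target=worker)
t.start()
t.join()
```

The worker thread never sees the handle because it was never added to its 
table; after `transfer`, the original owner's lookup would fail the same way.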
 
Of course, there are probably other, better ways to do it that are not as 
fragile as this... but I don't know if the limitations of the Python 
Interpreter and GIL would allow better methods.
_______________________________________________
[email protected]
http://codespeak.net/mailman/listinfo/pypy-dev
