There should be an e-mail about this in the archive somewhere, but to summarize...

I think that requiring client code to wrap every thread in run-elephant-thread is unrealistic, so that interface is now deprecated. The reason is simple: with a multithreaded web server, the main thread launches client threads inside the server code, and there is no way to wrap them in an elephant thread without modifying the web server.

A Thread-Safe Serializer:

The answer to this problem is a little more important than the others. The serializer is called all the time and is a performance-critical part of the system. It is thread safe except for its use of a hash table to detect circular data structures. I've used elephant in a multithreaded setting for quite some time without worrying about the serializer, because in practice I never hit the case where an object I was serializing accidentally looked up a duplicate object/id in the circularity hash. The general way to avoid this case is to keep a queue of hash tables from which each thread can grab one when it wants to serialize an object (you don't want to allocate a hash table in an inner loop; reuse by clearing is also costly, but only about 50% as much). Only this queue then needs to be protected.
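A minimal sketch of such a pool, assuming a vector with a fill pointer used as a stack. All names here are hypothetical and this is not elephant's actual code. The pop and push on the pool are the only shared operations and would need protection in a multithreaded program; that part is left out here.

```lisp
;; Hypothetical names throughout; an illustration, not elephant's serializer.
(defvar *circularity-pool*
  (make-array 8 :adjustable t :fill-pointer 0)
  "Stack of cleared hash tables ready for reuse.")

(defun acquire-circularity-hash ()
  "Pop a pooled table, or allocate a new one if the pool is empty."
  (if (plusp (fill-pointer *circularity-pool*))
      (vector-pop *circularity-pool*)
      (make-hash-table :test 'eq)))

(defun release-circularity-hash (table)
  "Clear TABLE and put it back in the pool."
  (clrhash table)
  (vector-push-extend table *circularity-pool*)
  table)

(defmacro with-circularity-hash ((var) &body body)
  "Bind VAR to a private table for BODY and return it to the pool afterwards."
  `(let ((,var (acquire-circularity-hash)))
     (unwind-protect (progn ,@body)
       (release-circularity-hash ,var))))
```

A serializer call would then look like (with-circularity-hash (seen) ...), recording each visited object in seen. Because every caller holds its own table for the duration of the call, two threads never share circularity state.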

I tried using standard locks in the various EXCL-like extended packages, but the performance is atrocious for frequently called routines like serialize. Instead, I use without-interrupts (an implementation-provided primitive, not part of ANSI Common Lisp) to block interrupts for the duration of the vector-pop call that grabs a new hash. This gives much better performance. The only other variables that need to be so protected are listed below.
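A rough sketch of protecting only the pop, under stated assumptions: with-interrupts-blocked, *hash-pool* and pop-pooled-hash are hypothetical names, and the macro simply dispatches to the implementation's own without-interrupts where one is known. It also assumes a scheduler in which deferring interrupts is enough to keep other threads out; on implementations whose threads run truly in parallel that is not the case, and a real lock would be needed instead.

```lisp
;; Hypothetical wrapper: picks the implementation's without-interrupts
;; if one is known, and falls back to a plain PROGN otherwise.
(defmacro with-interrupts-blocked (&body body)
  "Run BODY with interrupts deferred where the implementation supports it."
  #+sbcl `(sb-sys:without-interrupts ,@body)
  #+allegro `(excl:without-interrupts ,@body)
  #-(or sbcl allegro) `(progn ,@body))

(defvar *hash-pool* (make-array 4 :adjustable t :fill-pointer 0)
  "Stand-in pool of reusable hash tables.")

(defun pop-pooled-hash ()
  "Grab a table from the pool; only the pop itself is protected."
  (or (with-interrupts-blocked
        (when (plusp (fill-pointer *hash-pool*))
          (vector-pop *hash-pool*)))
      ;; Allocation happens outside the protected region.
      (make-hash-table :test 'eq)))
```

The protected region is kept to a fill-pointer check and a vector-pop so that interrupts are deferred for as short a time as possible.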

Thread Safety in Backends:

I'm pretty sure that the current behavior of BDB is thread-safe. I researched this earlier but only remember concluding that it was safe, so if anyone remembers the details, feel free to contradict me.

A quick Google investigation says that CL-SQL requires, at a minimum, that each thread have its own connection object to be thread safe, so each thread needs to reconnect to the cl-sql database and have a thread-local clsql:*default-database* binding. (I don't think this works for SQLite, though.)

These are pretty easy and will be handled in my next checkin:
-------------------------------------------------------------

Global variables (infrequently written):
*elephant-controller-init*
*dbconnection-spec*

Store-controller slots that need infrequent write-protection:
- instance-cache
- symbol-cache (0.6.1+)


The following elephant variables are a little tricky:
-----------------------------------------------------

Thread-local global vars (frequently accessed):
*store-controller* (if different threads use different controllers)
*transaction-stack*
*current-transaction*
*resourced-byte-spec*
(errno handling in uffi?)

Deprecated thread-local vars:
*auto-commit* (BDB 4.4 no longer pays attention to auto-commit arguments so we can remove this from elephant)

1) You can use the macro with-elephant-variables in 0.6.1 to create new, thread-local dynamic bindings of the above variables, but that is still a manual solution for when you have access to the thread creation code and can create thread-local specials.
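As an illustration of what such a macro might do, here is a minimal sketch. It is not the actual with-elephant-variables from 0.6.1: the macro name with-fresh-elephant-bindings, the helper make-worker-function, and the defvar forms are stand-ins so that the example is self-contained.

```lisp
;; Stand-in definitions; in elephant these specials already exist.
(defvar *store-controller* nil)
(defvar *current-transaction* nil)
(defvar *transaction-stack* nil)

(defmacro with-fresh-elephant-bindings (&body body)
  "Rebind the specials so that BODY cannot disturb the global values."
  `(let ((*store-controller* *store-controller*) ; inherit the global store
         (*current-transaction* nil)             ; start with no transaction
         (*transaction-stack* nil))
     ,@body))

(defun make-worker-function (fn)
  "Wrap FN in a closure that establishes its own bindings when called."
  (lambda ()
    (with-fresh-elephant-bindings
      (funcall fn))))
```

The closure returned by make-worker-function is what you would hand to your implementation's thread-creation call. Since the bindings are established inside the new thread, they are local to it.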

2) A more consequential option is to remove the required dependency on these variables entirely. This implies a potentially significant API change: an application can always pass the store controller explicitly on calls to collection accessors, cursor operations, etc., and the argument defaults to *store-controller* for environments where there is only one store or where the user manages the binding of *store-controller* in each thread. I think this is already accommodated in much of the API, but I haven't investigated how much work it would be.
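The calling convention described here could look roughly like the following sketch. The toy-store structure and the put-value and get-value functions are invented for illustration and are not part of elephant's API; only the defaulting of the keyword argument to *store-controller* is the point.

```lisp
;; Stand-in for the real special and for a store controller object.
(defvar *store-controller* nil)

(defstruct toy-store
  (table (make-hash-table :test 'equal)))

(defun put-value (key value &key (store *store-controller*))
  "Write VALUE under KEY in STORE, defaulting to the current controller."
  (setf (gethash key (toy-store-table store)) value))

(defun get-value (key &key (store *store-controller*))
  "Read KEY from STORE, defaulting to the current controller."
  (gethash key (toy-store-table store)))

;; Single-store code relies on the default; multi-store or
;; multithreaded code passes the controller explicitly.
(let ((a (make-toy-store))
      (b (make-toy-store)))
  (put-value "k" 1 :store a)
  (let ((*store-controller* b))
    (put-value "k" 2)
    (list (get-value "k") (get-value "k" :store a))))
```

The default is evaluated at call time, so code that binds *store-controller* per thread keeps working unchanged, while code that cannot control thread creation never has to touch the special at all.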

We can further require that all transactions be wrapped in 'with-elephant-transaction' so that the appropriate specials are dynamically bound within the stack. I think this actually would be pretty easy. We could document the internals of with-elephant-transaction for anyone who wants to do something sophisticated and is willing to manage the thread issues themselves.
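To make the idea concrete, here is a toy sketch of a transaction macro that establishes the bindings itself. with-toy-transaction is a hypothetical stand-in, not elephant's macro; it does no real begin, commit or abort and only shows the binding discipline.

```lisp
;; Stand-in specials; in elephant these already exist.
(defvar *current-transaction* nil)
(defvar *transaction-stack* nil)

(defmacro with-toy-transaction ((name) &body body)
  "Bind the transaction specials for the dynamic extent of BODY."
  `(let* ((*current-transaction* (list :txn ,name))
          ;; The stack holds the current transaction and its parents.
          (*transaction-stack* (cons *current-transaction*
                                     *transaction-stack*)))
     ,@body))

;; Nested use: the inner form sees two entries on the stack, and the
;; outer bindings are restored as soon as the inner form exits.
(with-toy-transaction ('outer)
  (with-toy-transaction ('inner)
    (length *transaction-stack*))
  *current-transaction*)
```

Because the macro binds rather than assigns, the state lives on the calling thread's dynamic stack and is unwound on any exit, including a non-local one.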

Does anyone have a better suggestion here? For example, is there a portability layer that can detect the current thread ID and use that to index the default global values?

Regards,
Ian

On Jan 20, 2007, at 2:42 PM, Gábor Melis wrote:

The fine manual at
http://common-lisp.net/project/elephant/doc/Threading.html says that
run-elephant-thread is broken but leaves the consequences to my
imagination.

Considering the comment about specials for buffers and such as well, my
reading is that there is no official way of using elephant from
multithreaded code.

Is that right?
What needs to be done to make run-elephant-thread safe again?

Gabor Melis
_______________________________________________
elephant-devel site list
elephant-devel@common-lisp.net
http://common-lisp.net/mailman/listinfo/elephant-devel

