Hi Alan, I've missed the start of this thread on threads because our internet connection was physically down for a couple of days, but hopefully what I've said here is useful to you. It started as a couple of comments but then grew a bit long...
You might find some of the code in the activescript sapi useful; it's in CVS and only builds for Windows, but might give you a couple of hints. It doesn't implement threading (directly), but does have some code that helps you to see what is required.

Each thread must have its own copy of the engine. This is a "hard" requirement of the PHP/TSRM architecture.

Communication between threads needs some kind of synchronization mechanism. Zvals need to be proxied so that access is serialized. ActiveScript does this by exporting the zvals as IDispatch-able COM objects and lets Windows manage serialization via its message queues. For other platforms you would need to implement some way to "pre-empt" a thread so that it can access a zval and return the data. You could implement this using ticks, or some other similar mechanism - perhaps as a Zend extension.

Proxying the zvals means that a given zval can only be accessed in the context of the thread that created it - not 100% useful for threading, but this is a "limitation" (feature?) of the memory management system: memory is managed per-thread, so write accesses (including changing a refcount) may trigger allocations/deallocations. If that happens on the wrong thread, you get inconsistent hash tables and most probably segfaults in BOTH threads.

Likewise, the zend function and class structures are managed in a similar way. You must make your own copy of those for each thread in case that thread includes/requires other scripts, or creates new functions/classes dynamically using eval or similar.

To make threading useful, you would need to somehow arrange for multiple threads to access the same underlying zval data without blocking all the threads. This just isn't possible AFAIK.

If you want fast read-only access to "shared" zvals, you can serialize (just like sessions) the zval from the main thread and unserialize it into your new thread. This isn't strictly read-only: the new thread can write to its copy, but those writes will only be visible to the new thread.
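To make the serialize-and-copy idea concrete, here is a minimal C sketch (not Zend code). It assumes POSIX threads and uses a hypothetical value_t struct in place of a zval; value_serialize(), value_unserialize() and run_snapshot_demo() are names invented purely for illustration. The parent flattens the value into a malloc()ed buffer, the child rebuilds a private copy from it and changes that copy, and the parent's value is left untouched.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a zval holding a string; not a Zend type. */
typedef struct {
    char *text;   /* NUL-terminated string owned by one thread */
} value_t;

/* Flatten a value into a malloc()ed buffer that holds no pointers into
 * the source value, so it can safely be handed to another thread. */
static char *value_serialize(const value_t *v)
{
    size_t len = strlen(v->text) + 1;
    char *buf = malloc(len);
    if (buf != NULL) {
        memcpy(buf, v->text, len);
    }
    return buf;
}

/* Rebuild a private value from a serialized buffer. Returns 0 on success. */
static int value_unserialize(const char *buf, value_t *out)
{
    size_t len = strlen(buf) + 1;
    out->text = malloc(len);
    if (out->text == NULL) {
        return -1;
    }
    memcpy(out->text, buf, len);
    return 0;
}

/* Thread body: takes ownership of the serialized buffer, builds a local
 * copy and modifies only that copy. Returns the modified text (malloc()ed)
 * so that the parent can inspect it after joining. */
static void *child_thread(void *arg)
{
    char *buf = arg;
    value_t local;
    int rc;

    assert(buf != NULL);
    rc = value_unserialize(buf, &local);
    free(buf);
    if (rc != 0) {
        return NULL;
    }
    if (local.text[0] != '\0') {
        local.text[0] = 'H';   /* this write is visible to the child only */
    }
    return local.text;
}

/* Runs the round trip: serialize in the caller, unserialize and modify in
 * a new thread. On success stores the child's copy in *child_text (the
 * caller frees it) and returns 0. */
static int run_snapshot_demo(const value_t *original, char **child_text)
{
    pthread_t tid;
    void *result = NULL;
    char *buf = value_serialize(original);

    if (buf == NULL) {
        return -1;
    }
    if (pthread_create(&tid, NULL, child_thread, buf) != 0) {
        free(buf);
        return -1;
    }
    if (pthread_join(tid, &result) != 0 || result == NULL) {
        return -1;
    }
    *child_text = result;
    return 0;
}
```

Calling run_snapshot_demo() with a value holding "hi there" should leave the original as it was, while the child's copy comes back starting with a capital H. No locking is needed because the two threads never touch the same value.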
The advantage of such a copy is that, since the zval lives in the thread's own engine/address space, no thread-serialization occurs, so performance is "better".

As a summary of the above, threading would work something like the pieces below.

Given a zval, proxyize() allocates a struct that identifies the zval as belonging to this thread and assigns it an id (much like the resource management system, but using global malloc()d memory). The id can be passed to other threads (much like session serialization) and used to construct a proxy (or just return the actual zval if it's on the same thread as the zval).

    proxy_data proxyize(zval * zval);

Given a proxy id, proxy_to_zval() works as follows: if the zval is on the same thread as the current caller, return the zval from the proxy data. Otherwise, look in the list of "running" proxies for this thread; if we already have a proxy for this zval, return it. Otherwise we have not yet accessed the zval from this thread: create an overloaded object that will use the thread synchronization methods to access the zval in its owner's thread.

    zval * proxy_to_zval(proxy_data data);

serialize_functions_and_classes() returns a PHP script representing all the functions and classes defined in the current engine context. This would be generated by examining the zend structures. It has to be serialized like this since the zend structures are emalloc-ed. Ideally, this would return a malloc-ed copy of the structures that would then be emalloc-ed into the address space of the new thread. As a first attempt, it might be simpler just to convert to a string and ask the new engine to recompile it.
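The id registry behind proxyize() and the same-thread shortcut in proxy_to_zval(), both described above, could look roughly like the sketch below. It makes several assumptions: POSIX threads, a fixed-size static table standing in for the global malloc()d memory, and a plain void * instead of a zval. All names (proxy_slot, proxy_register, proxy_resolve_local, proxy_resolve_on_other_thread) are hypothetical. The cross-thread path (the overloaded proxy object) is deliberately left out; a NULL result just signals that the caller is not the owner and would need a proxy.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_PROXIES 64

/* One registry slot: which thread owns the value and where it lives.
 * The table is a static global, i.e. outside any per-thread allocator. */
typedef struct {
    int in_use;
    pthread_t owner;
    void *value;      /* stand-in for a zval pointer */
} proxy_slot;

static proxy_slot proxy_table[MAX_PROXIES];
static pthread_mutex_t proxy_lock = PTHREAD_MUTEX_INITIALIZER;

/* Register a value as owned by the calling thread. Returns an id >= 0
 * that can be passed to other threads, or -1 if the table is full. */
static int proxy_register(void *value)
{
    int id = -1;

    pthread_mutex_lock(&proxy_lock);
    for (int i = 0; i < MAX_PROXIES; i++) {
        if (!proxy_table[i].in_use) {
            proxy_table[i].in_use = 1;
            proxy_table[i].owner = pthread_self();
            proxy_table[i].value = value;
            id = i;
            break;
        }
    }
    pthread_mutex_unlock(&proxy_lock);
    return id;
}

/* Resolve an id. Returns the real pointer only when the caller is the
 * owning thread. Returns NULL for unknown ids and for callers on any
 * other thread, which would have to go through a synchronized proxy. */
static void *proxy_resolve_local(int id)
{
    void *value = NULL;

    if (id < 0 || id >= MAX_PROXIES) {
        return NULL;
    }
    pthread_mutex_lock(&proxy_lock);
    if (proxy_table[id].in_use &&
        pthread_equal(proxy_table[id].owner, pthread_self())) {
        value = proxy_table[id].value;
    }
    pthread_mutex_unlock(&proxy_lock);
    return value;
}

/* Demonstration helper: resolve an id from a freshly created thread. */
struct resolve_request {
    int id;
    void *seen;
};

static void *resolve_thread(void *arg)
{
    struct resolve_request *req = arg;
    req->seen = proxy_resolve_local(req->id);
    return NULL;
}

/* Returns what another thread gets when it resolves `id`.
 * *ok is set to 1 only if the helper thread ran and was joined. */
static void *proxy_resolve_on_other_thread(int id, int *ok)
{
    struct resolve_request req = { id, NULL };
    pthread_t tid;

    assert(ok != NULL);
    *ok = 0;
    if (pthread_create(&tid, NULL, resolve_thread, &req) != 0) {
        return NULL;
    }
    if (pthread_join(tid, NULL) != 0) {
        return NULL;
    }
    *ok = 1;
    return req.seen;
}
```

The owning thread gets its pointer straight back from proxy_resolve_local(), which corresponds to the "just return the actual zval" case. Any other thread gets NULL, which is the point where a real implementation would build (or reuse) the overloaded proxy object.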
The serializer prototype and the thread creation code built on these pieces follow (the $GLOBALS handling is pseudo-code):

    char * serialize_functions_and_classes();

    typedef struct thread_create_info {
        char * functions;
        int globals_count;
        char ** globals_keys;
        proxy_data * globals_values;
        char * threadfunc;
    } thread_create_info;

    void * phpthreads_threadfunc(void * arg);

    // proto resource thread_create(string threadfunc);
    PHP_FUNCTION(thread_create)
    {
        pthread_t tid;
        int i;
        thread_create_info * info = malloc(sizeof(thread_create_info));

        info->functions = serialize_functions_and_classes();
        // threadfunc is the string parameter (parameter parsing omitted)
        info->threadfunc = strdup(threadfunc);

        // could do similar thing for passing params to the thread func
        info->globals_count = count($GLOBALS);
        info->globals_keys = malloc(info->globals_count * sizeof(char *));
        info->globals_values = malloc(info->globals_count * sizeof(proxy_data));
        for (i = 0; i < info->globals_count; i++) {
            info->globals_keys[i] = strdup(key($GLOBALS));
            info->globals_values[i] = proxyize(current($GLOBALS));
            next($GLOBALS);
        }

        // use your platform-specific function here
        pthread_create(&tid, NULL, phpthreads_threadfunc, info);

        // return some kind of token for that thread
    }

    void * phpthreads_threadfunc(void * arg)
    {
        thread_create_info * info = arg;
        zval ** args;
        int i;

        /* create a new zend engine instance */
        ...

        /* load in functions (and classes) */
        compile_string(info->functions);
        free(info->functions);

        /* import global data */
        // could do similar thing to import thread function args
        for (i = 0; i < info->globals_count; i++) {
            $GLOBALS[info->globals_keys[i]] = proxy_to_zval(info->globals_values[i]);
            free(info->globals_keys[i]);
        }
        free(info->globals_keys);
        free(info->globals_values);

        call_user_func(info->threadfunc);
        free(info->threadfunc);
        free(info);

        /* close down the zend engine instance */
        return NULL;
    }

For the user, their script would look like this:

    <?php
    function another_thread()
    {
        // This $GLOBALS access will switch contexts to the main thread
        // to retrieve the value (= slow)
        echo "Another thread: hello is " . $GLOBALS["hello"] . "\n";
    }

    $GLOBALS["hello"] = "hi there";
    $thread = thread_create("another_thread");
    // If main engine dies before the child, there is a chance for
    // segfaults...
    thread_wait($thread);
    ?>

Conclusion: I'm not sure if this threading implementation will have enough performance to warrant its use (generally speaking), although high performance scripts can be written in this framework if the user is aware of the issues.

If we change the way we create the thread so that there is a super-global called $THREADGLOBALS that holds the proxyized contents of $GLOBALS, and make $GLOBALS in the new thread hold copies (made via serialize/unserialize) of the original $GLOBALS, we will end up with a faster-by-default version:

    <?php
    function another_thread()
    {
        // This $GLOBALS access occurs within our own thread (= fast)
        echo "Another thread: hello is " . $GLOBALS["hello"] . "\n";

        // this changes the thread-local version only. This is fast
        // but other threads can't see it
        $GLOBALS["hello"] = "local";

        // this changes the value in the main thread.
        // this is slower than the other methods above.
        // the new value is only visible to child threads using the
        // $THREADGLOBALS superglobal or to the main thread using
        // its regular $GLOBALS access.
        // the main thread should create $THREADGLOBALS as an alias
        // for $GLOBALS so that thread aware code can use $THREADGLOBALS
        // without worrying about which thread they are running in.
        $THREADGLOBALS["hello"] = "global";

        // copy the shared value into local space (= slow)
        $GLOBALS["hello"] = $THREADGLOBALS["hello"];
    }

    $GLOBALS["hello"] = "hi there";
    $thread = thread_create("another_thread");
    // If main engine dies before the child, there is a chance for
    // segfaults...
    thread_wait($thread);
    ?>

Phew! That's a long mail. I hope it makes sense.

What I suggest, if you are (still!) serious about threading PHP, is that you work with ZE2 and use Harald's RPC extension for the proxies (that will make things a little bit easier). I don't think there is an easier way than this!

--Wez.

--
PHP Development Mailing List <http://www.php.net/>
To unsubscribe, visit: http://www.php.net/unsub.php