> libhdfs is thread safe.
>
> Concurrency and Hadoop FS "handles"
> The Hadoop FS implementation includes a FS handle cache which caches based
> on the URI of the namenode along with the user connecting. So, all calls
> to hdfsConnect will return the same handle but calls to hdfsConnectAsUser
> with different users will return different handles. But, since HDFS client
> handles are completely thread safe, this has no bearing on concurrency.
>
> Concurrency and libhdfs/JNI
> The libhdfs calls to JNI should always be creating thread local storage,
> so (in theory), libhdfs should be as thread safe as the underlying calls
> to the Hadoop FS.
>
> http://hadoop.apache.org/docs/r2.2.0/hadoop-project-dist/hadoop-hdfs/LibHdfs.html
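
To make the handle-cache rule in the quoted documentation concrete, here is a toy model in plain C. It is not libhdfs code: fs_handle, cache_get and the fixed-size array are hypothetical names invented for this sketch, and the real cache lives inside the Hadoop FS client. The sketch only shows the keying rule (same URI and same user give the same handle, a different user gives a different one) and deliberately has no locking.

```c
#include <stdio.h>
#include <string.h>

/* Toy model only: none of these names come from libhdfs or Hadoop. */
#define MAX_HANDLES 8

typedef struct {
    char uri[64];   /* namenode URI, first half of the cache key */
    char user[32];  /* connecting user, second half of the cache key */
} fs_handle;

static fs_handle cache[MAX_HANDLES];
static int cache_len = 0;

/* Return the cached handle for (uri, user), creating it on first use.
 * Returns NULL if a key is too long to store or the cache is full.
 * Not thread safe: a real cache would need a lock around this. */
static fs_handle *cache_get(const char *uri, const char *user)
{
    if (strlen(uri) >= sizeof cache[0].uri ||
        strlen(user) >= sizeof cache[0].user)
        return NULL;

    for (int i = 0; i < cache_len; i++) {
        if (strcmp(cache[i].uri, uri) == 0 &&
            strcmp(cache[i].user, user) == 0)
            return &cache[i];   /* cache hit: same key, same handle */
    }

    if (cache_len == MAX_HANDLES)
        return NULL;            /* cache full */

    fs_handle *h = &cache[cache_len++];
    snprintf(h->uri, sizeof h->uri, "%s", uri);
    snprintf(h->user, sizeof h->user, "%s", user);
    return h;
}
```

Calling cache_get("hdfs://nn:8020", "alice") twice yields the same pointer, while asking for user "bob" on the same URI yields a second, distinct entry. The URI and user names here are made up for the illustration.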
>With TS builds PHP doesn't create thread local storage but passes the
>thread ID to each function that requires it, as defined here
>http://lxr.php.net/xref/PHP_5_4/TSRM/TSRM.h#165 (all the TSRMLS_* stuff).
>Furthermore, thread safety in PHP means that every request is served in a
>separate thread, and that is ruled by the corresponding SAPI. So the
>meaning of ZTS in PHP has no relation to the thread safety of some library
>(as long as you don't need all the PHP structures to be thread safe). A
>thread safe library can properly work within a multi-threaded program,
>that's it. Which means it can also work within a single-threaded program.
>If libhdfs is thread safe, it should be able to handle multiple
>hdfsConnect() and other calls in both TS and NTS binaries. Looking at the
>example from the apache doc, that's a single-threaded program (so
>effectively an NTS PHP as an analogue).
>
>But from what I can see in the code, like here
>https://github.com/yuduanchen/phdfs/blob/master/phdfs.c#L28 - your
>implementation in PHP is currently not thread safe. The globally defined
>struct php_hdfs_hanele is accessed directly in every method of the phdfs
>class. In the NTS build that means it works as a singleton pattern: one
>cannot simultaneously create two objects connecting to different hadoop
>instances. In the TS build it's even worse, as that struct will be
>concurrently changed from every request.
>
>To properly implement the phdfs class, you should use the create_object
>callback provided by the zend_class_entry struct (you'll also need to
>define a handler for object destroying). Please take a look at this for
>example http://lxr.php.net/xref/PECL/xmldiff/xmldiff.cpp#214 or any other
>core/pecl extension implementing an internal class.

I have made some changes
