On Fri, August 8, 2014 18:24, China 陈毓端 wrote:
>> libhdfs is thread safe.
>>
>> Concurrency and Hadoop FS "handles"
>> The Hadoop FS implementation includes a FS handle cache which caches
>> based on the URI of the namenode along with the user connecting. So, all
>> calls to hdfsConnect will return the same handle but calls to
>> hdfsConnectAsUser with different users will return different handles.
>> But, since HDFS client handles are completely thread safe, this has no
>> bearing on concurrency.
>>
>> Concurrency and libhdfs/JNI
>> The libhdfs calls to JNI should always be creating thread local storage,
>> so (in theory), libhdfs should be as thread safe as the underlying
>> calls to the Hadoop FS.
>>
>> http://hadoop.apache.org/docs/r2.2.0/hadoop-project-dist/hadoop-hdfs/LibHdfs.html
>>
>> With TS builds PHP doesn't create thread local storage but passes the
>> thread ID to each function that requires it, as defined here
>> http://lxr.php.net/xref/PHP_5_4/TSRM/TSRM.h#165 (all the TSRMLS_*
>> stuff). Furthermore, thread safety in PHP means that every request is
>> served in a separate thread, and that is ruled by the corresponding SAPI.
>> So the meaning of ZTS in PHP has no relation to the thread safety of some
>> library (as long as you don't need all the PHP structures to be thread
>> safe). A thread safe library can properly work within a multi-threaded
>> program, that's it. Which means it can also work within a single
>> threaded program.
>>
>> If libhdfs is thread safe, it should be able to handle multiple
>> hdfsConnect() calls and others in both TS and NTS binaries. Looking at
>> the example from the apache doc, that's a single threaded program (so
>> effectively an NTS PHP as analogue).
>>
>> But from what I can see in the code, like here
>> https://github.com/yuduanchen/phdfs/blob/master/phdfs.c#L28 - your
>> implementation in PHP is currently not thread safe. The globally
>> defined struct php_hdfs_hanele is accessed directly in every method of
>> the phdfs class. In the NTS build that means it works as a singleton
>> pattern: one cannot simultaneously create two objects connecting to
>> different hadoop instances. In the TS build it's even worse, as that
>> struct will be concurrently changed from every request.
>>
>> To properly implement the phdfs class, you should use the
>> create_object callback provided by the zend_class_entry struct (you'll
>> also need to define a handler for object destroying). Please take a
>> look at this for example
>> http://lxr.php.net/xref/PECL/xmldiff/xmldiff.cpp#214 or any other
>> core/pecl extension implementing an internal class.
>>
>
> I have made some changes
>
Yeah, now it's thread safe. But it's still a singleton: although port and
host are encapsulated in the class, the internal fs handle is global for
all the class instances. If that's what is needed, that's fine. But I guess
it's not so. Otherwise the create_object callback is vital to utilize. You
could then move the port and host initialization into the constructor, or
additionally implement the read_property/write_property callbacks so the
current syntax can stay, or implement setters/getters. Either way, the fs,
port and host members would move into an internal object struct.

Another good example of how to do this:
http://lxr.php.net/xref/PHP_5_4/ext/dom/php_dom.c#762

Regards

Anatol

-- 
PECL development discussion Mailing List (http://pecl.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
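To illustrate the point about moving fs, port and host into an internal
object struct, here is a minimal sketch in plain C. It deliberately avoids
the Zend and libhdfs headers so that it compiles on its own: fake_fs,
fake_connect, phdfs_object, phdfs_object_create and phdfs_object_destroy
are all hypothetical names, not part of phdfs, PHP or libhdfs. In a real
extension the allocation would presumably happen in the create_object
callback and the cleanup in the object's destroy/free handler, with fs
holding the handle returned by hdfsConnect().

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the libhdfs handle (assumption: a real
 * extension would store whatever hdfsConnect() returns instead). */
typedef struct fake_fs {
    char host[64];
    int port;
} fake_fs;

/* Per-object state. Each instance owns its own host, port and fs,
 * so nothing is shared through a global struct. */
typedef struct phdfs_object {
    char *host;
    int port;
    fake_fs *fs;
} phdfs_object;

/* Hypothetical connect function standing in for the real connection call. */
static fake_fs *fake_connect(const char *host, int port)
{
    fake_fs *fs = malloc(sizeof(*fs));
    if (fs == NULL) {
        return NULL;
    }
    snprintf(fs->host, sizeof(fs->host), "%s", host);
    fs->port = port;
    return fs;
}

/* Roughly what a create_object callback plus constructor would set up. */
phdfs_object *phdfs_object_create(const char *host, int port)
{
    phdfs_object *obj = calloc(1, sizeof(*obj));
    if (obj == NULL) {
        return NULL;
    }
    obj->host = malloc(strlen(host) + 1);
    if (obj->host == NULL) {
        free(obj);
        return NULL;
    }
    strcpy(obj->host, host);
    obj->port = port;
    obj->fs = fake_connect(host, port);
    if (obj->fs == NULL) {
        free(obj->host);
        free(obj);
        return NULL;
    }
    return obj;
}

/* Roughly what the object destroy/free handler would release. */
void phdfs_object_destroy(phdfs_object *obj)
{
    if (obj == NULL) {
        return;
    }
    free(obj->fs);
    free(obj->host);
    free(obj);
}
```

With this layout, two objects created for different hosts each carry their
own handle, so one connection no longer overwrites the other, and there is
no shared mutable state for concurrent requests in a TS build to race on.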
