On Fri, August 8, 2014 18:24, China 陈毓端 wrote:
>> libhdfs is thread safe.
>>
>>
>> Concurrency and Hadoop FS "handles"
>> The Hadoop FS implementation includes a FS handle cache which caches
>> based on the URI of the namenode along with the user connecting. So, all
>> calls to hdfsConnect will return the same handle but calls to
>> hdfsConnectAsUser with different users will return different handles.
>> But since HDFS client handles are completely thread safe, this has no
>> bearing on concurrency.
>>
>>
>> Concurrency and libhdfs/JNI
>> The libhdfs calls to JNI should always be creating thread local storage,
>> so (in theory) libhdfs should be as thread safe as the underlying
>> calls to the Hadoop FS.
>>
>>
>>
>> http://hadoop.apache.org/docs/r2.2.0/hadoop-project-dist/hadoop-hdfs/LibHdfs.html
>>
>
>> With TS builds, PHP doesn't create thread local storage but passes the
>> thread ID to each function that requires it, as defined here:
>> http://lxr.php.net/xref/PHP_5_4/TSRM/TSRM.h#165 (all the TSRMLS_*
>> stuff). Furthermore, thread safety in PHP means that every request is
>> served in a separate thread, which is governed by the corresponding
>> SAPI. So the meaning of ZTS in PHP has no relation to the thread safety
>> of some library (as long as you don't need all the PHP structures to be
>> thread safe). A thread-safe library can work properly within a
>> multi-threaded program, that's it. Which means it can also work within
>> a single-threaded program.
>
>> If libhdfs is thread safe, it should be able to handle multiple
>> hdfsConnect() calls and others in both TS and NTS binaries. The example
>> in the Apache doc is a single-threaded program (so effectively the
>> analogue of NTS PHP).
>>
>
>> But from what I can see in the code, like here
>> https://github.com/yuduanchen/phdfs/blob/master/phdfs.c#L28 - your
>> implementation in PHP is currently not thread safe. The globally
>> defined struct php_hdfs_hanele is accessed directly in every method of
>> the phdfs class. In the NTS build that means it works as a singleton:
>> one cannot simultaneously create two objects connecting to different
>> Hadoop instances. In the TS build it's even worse, as that struct will
>> be changed concurrently by every request.
>
>> To properly implement the phdfs class, you should use the
>> create_object callback provided by the zend_class_entry struct (you'll
>> also need to define a handler for object destruction). Please take a
>> look at this, for example:
>> http://lxr.php.net/xref/PECL/xmldiff/xmldiff.cpp#214 or any other
>> core/pecl extension implementing an internal class.
>>
>
>
> I have made some changes.
>
>

Yeah, now it's thread safe. But it's still a singleton: although port
and host are encapsulated in the class, the internal fs handle is global
for all the class instances. If that's what is needed, that's fine. But I
guess it's not so.
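
To illustrate the problem, here is a rough sketch in plain C, not using
the Zend API or libhdfs. The names fake_fs, shared_fs and shared_client
are hypothetical stand-ins, not taken from phdfs.c. It only shows what
happens when host and port live in each instance but the handle is one
global: the last connect wins for everybody.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an hdfsFS handle; not the real libhdfs type. */
typedef struct {
    char host[64];
    int  port;
} fake_fs;

/* One handle shared by every instance (the layout described above). */
static fake_fs shared_fs;

/* Per-instance settings, but no per-instance handle. */
typedef struct {
    char host[64];
    int  port;
} shared_client;

/* "Connecting" stores host/port in the instance and overwrites the
 * single shared handle. */
void shared_client_connect(shared_client *c, const char *host, int port)
{
    assert(c != NULL && host != NULL);
    assert(strlen(host) < sizeof c->host);

    snprintf(c->host, sizeof c->host, "%s", host);
    c->port = port;

    snprintf(shared_fs.host, sizeof shared_fs.host, "%s", host);
    shared_fs.port = port;
}

/* Every instance gets the same handle, whichever one connected last. */
const fake_fs *shared_client_fs(const shared_client *c)
{
    (void)c; /* the instance is irrelevant, which is exactly the problem */
    return &shared_fs;
}
```

If instance a connects to one namenode and instance b then connects to
another, a still remembers its own host and port, but the handle it
operates on now points at b's namenode.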

If separate instances should hold separate connections, using the
create_object callback is essential. You could then move the port and
host initialization into the constructor, or additionally implement the
read_property/write_property callbacks for the current syntax to stay,
or implement setters/getters. Either way, the fs, port and host members
would move into an internal object struct. Another good example of how
to do this:
http://lxr.php.net/xref/PHP_5_4/ext/dom/php_dom.c#762
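
The idea of the internal object struct can be sketched in plain C,
without the Zend API or libhdfs. All names here (fake_fs, phdfs_object
and the three functions) are hypothetical and only mimic the roles of
create_object, the constructor and the destroy handler. A real
extension would register these through zend_class_entry and the object
handlers, as in the linked examples.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an hdfsFS handle; not the real libhdfs type. */
typedef struct {
    char host[64];
    int  port;
} fake_fs;

/* Hypothetical internal object struct: fs, host and port are per instance. */
typedef struct {
    fake_fs *fs;
    char     host[64];
    int      port;
} phdfs_object;

/* Plays the role of a create_object callback: allocate zeroed
 * per-object state. Returns NULL on allocation failure. */
phdfs_object *phdfs_object_create(void)
{
    return calloc(1, sizeof(phdfs_object));
}

/* Plays the role of the constructor: store host/port and open a handle
 * that belongs to this object only. Returns 0 on success, -1 on error. */
int phdfs_object_connect(phdfs_object *obj, const char *host, int port)
{
    fake_fs *fs;

    assert(obj != NULL && host != NULL);
    if (strlen(host) >= sizeof obj->host) {
        return -1;
    }

    fs = calloc(1, sizeof *fs);
    if (fs == NULL) {
        return -1;
    }
    strcpy(fs->host, host);
    fs->port = port;

    free(obj->fs);          /* release a previous handle, if any */
    obj->fs = fs;
    strcpy(obj->host, host);
    obj->port = port;
    return 0;
}

/* Plays the role of the destroy handler: release the object's handle,
 * then the object itself. */
void phdfs_object_free(phdfs_object *obj)
{
    if (obj == NULL) {
        return;
    }
    free(obj->fs);
    free(obj);
}
```

With this layout two objects can be connected to different namenodes at
the same time, and no state is shared between requests or threads
unless the code shares an object on purpose.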

Regards

Anatol



-- 
PECL development discussion Mailing List (http://pecl.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
