[ 
https://issues.apache.org/jira/browse/HDFS-11556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Clampffer updated HDFS-11556:
-----------------------------------
    Parent Issue: HDFS-14032  (was: HDFS-8707)

> libhdfs++: Improve the usage of std::promise and std::future for sync shims
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-11556
>                 URL: https://issues.apache.org/jira/browse/HDFS-11556
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: James Clampffer
>            Assignee: James Clampffer
>            Priority: Major
>
> There's two issues with the current promise/future situation being used in 
> the synchronous shims.
> 1) C++ std::promise and std::futures aren't particularly intuitive (at least 
> for me).  Logically it makes sense to think of the promise and future objects 
> as handles that each hold onto a shared object containing the promised type + 
> a mutex and condition variable for synchronization.  When promise<T>::set(T 
> val) is called it seems reasonable to expect that it writes val into the 
> shared state, and and wakes the thread blocked in future<>::wait.  This would 
> be standard conformant but generally isn't how it's implemented in the 
> standard libraries.
> The common implementation is to bundle the value, mutex, and cond_var into 
> the promise.  This makes it really easy to introduce race conditions that 
> will often pass tests.. until they don't.  Example of most common case:
> {code}
> auto foo = std::promise<int>()
> std::thread async_task([&foo](){
>   ... do some work ...
>   foo.set_value(1);
> })
> auto future_val = foo.get_future();
> int result = future_val.get_value();
> {code}
> That last line has a race between promise<T>::~promise() which happens right 
> after it gets set and any reads of the value, mutex, or condition variable 
> owned by the promise but referenced by the future.
> When it does show up there will typically be a thread hung in 
> pthread_cond_wait and an invalid 32 bit read of the futex state that the 
> condition variable used under the hood.
> 2) If a callback happens to throw in an operation that has synchronous shims 
> on top of it the stack will unwind and io_service.run_once will be called to 
> keep getting work done (HDFS-9699).  The problem here is that once the stack 
> is unwound there's nothing around to set the promise so the thread blocked on 
> a future will never wake up.  The simple/unrealistic fix is to assert that 
> exceptions should never be used but that precludes the use of this library in 
> most C++ projects and the use of most third party dependencies that weren't 
> made by Google.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to