[
https://issues.apache.org/jira/browse/MESOS-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Anand Mazumdar updated MESOS-4658:
--
Description:
The {{Connection}} abstraction is prone to deadlocks arising from the object
being destroyed inside the same execution context.
Consider this example:
{code}
Option<Connection> connection = process::http::connect(...).get();
connection.disconnected()
.onAny(defer(self(), , connection));
connection.disconnect();
connection = None();
{code}
In the above snippet, if {{connection = None()}} is executed before the actual
dispatch to {{ConnectionProcess}} happens, you might lose the only existing
reference to the {{Connection}} object inside
{{ConnectionProcess::disconnect}}. This would lead to the destruction of the
{{Connection}} object in the {{ConnectionProcess}} execution context.
We do have a snippet in our existing code that alludes to such occurrences
happening:
https://github.com/apache/mesos/blob/master/3rdparty/libprocess/src/http.cpp#L1325
{code}
// This is a one time request which will close the connection when
// the response is received. Since 'Connection' is reference-counted,
// we must keep a copy around until the disconnection occurs. Note
// that in order to avoid a deadlock (Connection destruction occurring
// from the ConnectionProcess execution context), we use 'async'.
{code}
AFAICT, for scenarios where we need to hold on to the {{Connection}} object for
later, this approach does not suffice.
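The hazard can be illustrated with a small single-threaded model. This is only a sketch and not libprocess code: {{Context}}, {{Handle}}, {{run}}, {{dropInOwnContext}} and {{dropInOtherContext}} are hypothetical names, and the destructor merely records where it ran instead of blocking. The first function drops the caller's reference before the queued callback runs, so the last reference dies inside the owning context. The second hands a copy to a different context first, which is roughly what the 'async' workaround quoted above relies on.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an actor's execution context: a FIFO of
// callbacks that are run one at a time.
struct Context {
  std::queue<std::function<void()>> mailbox;
};

// The context whose callbacks are currently running (nullptr otherwise).
static Context* current = nullptr;

// Drains the mailbox. Each callback runs, and is destroyed, "inside" ctx.
void run(Context& ctx) {
  Context* previous = current;
  current = &ctx;
  while (!ctx.mailbox.empty()) {
    std::function<void()> callback = std::move(ctx.mailbox.front());
    ctx.mailbox.pop();
    callback();
  }
  current = previous;
}

// Hypothetical reference-counted resource tied to an owning context.
// A real destructor might wait on that context (and so deadlock if it
// runs there); this one only records where the destruction happened.
struct Handle {
  Handle(Context* owner_, std::vector<std::string>* log_)
    : owner(owner_), log(log_) {
    assert(owner != nullptr && log != nullptr);
  }

  ~Handle() {
    log->push_back(current == owner ? "destroyed in own context"
                                    : "destroyed elsewhere");
  }

  Context* owner;
  std::vector<std::string>* log;
};

// The caller drops its reference before the queued callback runs, so the
// callback's copy is the last one and dies inside the owning context.
std::vector<std::string> dropInOwnContext() {
  std::vector<std::string> log;
  Context owner;
  auto handle = std::make_shared<Handle>(&owner, &log);
  owner.mailbox.push([handle]() {});
  handle.reset();  // Analogous to `connection = None()`.
  run(owner);
  return log;
}

// The owning context's callback passes a copy to another context, so the
// last reference is released there instead.
std::vector<std::string> dropInOtherContext() {
  std::vector<std::string> log;
  Context owner;
  Context other;
  auto handle = std::make_shared<Handle>(&owner, &log);
  owner.mailbox.push([handle, &other]() {
    other.mailbox.push([handle]() {});
  });
  handle.reset();
  run(owner);  // `other` still holds a reference afterwards.
  run(other);  // The last reference is released outside `owner`.
  return log;
}
```

In this model the second variant only helps when the code releasing the reference can choose where the release happens, which is consistent with the concern that the approach does not cover callers that need to keep the object around for later.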
was:
The {{Connection}} abstraction is prone to deadlocks arising from the object
being destroyed inside the same execution context.
Consider this example:
{code}
Option<Connection> connection = process::http::connect(...);
connection.disconnected()
.onAny(defer(self(), , connection));
connection.disconnect();
connection = None();
{code}
In the above snippet, if {{connection = None()}} is executed before the actual
dispatch to {{ConnectionProcess}} happens, you might lose the only existing
reference to the {{Connection}} object inside
{{ConnectionProcess::disconnect}}. This would lead to the destruction of the
{{Connection}} object in the {{ConnectionProcess}} execution context.
We do have a snippet in our existing code that alludes to such occurrences
happening:
https://github.com/apache/mesos/blob/master/3rdparty/libprocess/src/http.cpp#L1325
{code}
// This is a one time request which will close the connection when
// the response is received. Since 'Connection' is reference-counted,
// we must keep a copy around until the disconnection occurs. Note
// that in order to avoid a deadlock (Connection destruction occurring
// from the ConnectionProcess execution context), we use 'async'.
{code}
AFAICT, for scenarios where we need to hold on to the {{Connection}} object for
later, this approach does not suffice.
> process::Connection can lead to deadlock around execution in the same context.
> --
>
> Key: MESOS-4658
> URL: https://issues.apache.org/jira/browse/MESOS-4658
> Project: Mesos
> Issue Type: Bug
> Components: HTTP API, libprocess
>Reporter: Anand Mazumdar
>Assignee: Shuai Lin
> Labels: mesosphere