[jira] [Commented] (DISPATCH-2262) Edge/Interior connections can half-fail in real multi-cloud environments

2021-11-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/DISPATCH-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17442474#comment-17442474
 ] 

ASF subversion and git services commented on DISPATCH-2262:
---

Commit 8774905f6a1bdfc6b103fb934fcd36d2bfd2ebb1 in qpid-dispatch's branch 
refs/heads/1.18.x from Ted Ross
[ https://gitbox.apache.org/repos/asf?p=qpid-dispatch.git;h=8774905 ]

DISPATCH-2262 - Added missing handling for client-side link loss.

(cherry picked from commit 6769203991b20ecf0fdeb28bb8d84962b73c22fd)


> Edge/Interior connections can half-fail in real multi-cloud environments
> 
>
>                 Key: DISPATCH-2262
>                 URL: https://issues.apache.org/jira/browse/DISPATCH-2262
>             Project: Qpid Dispatch
>          Issue Type: Bug
>          Components: Container
>    Affects Versions: 1.17.0
>            Reporter: Ted Ross
>            Assignee: Ted Ross
>            Priority: Major
>             Fix For: 1.18.0
>
>
> See PROTON-2440 for context.
> The configured keepalive on edge-to-interior connections can fail to provide 
> protection from connection loss.  This results in half-failed connections 
> where the edge sees connection failure and re-connects and the interior sees 
> nothing and accumulates multiple connections from the same edge.
> This is a serious problem because the interior will attempt to forward 
> deliveries across these dead-but-seemingly-alive connections resulting in 
> lack of message delivery.
> The solution proposed in this issue is to work around the problem by 
> introducing an application-level keepalive for edge-to-interior connections.
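
The application-level keepalive idea described above can be pictured with a small sketch. This is not the qpid-dispatch implementation: hb_monitor_t and its functions are hypothetical names invented for illustration, and time is passed in as plain seconds rather than read from a real clock. The sketch only shows the detection rule, assuming the peer sends a heartbeat periodically and the receiving side treats prolonged silence as connection loss.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical liveness tracker for one connection (illustrative only). */
typedef struct {
    uint64_t last_heard;  /* time, in seconds, the last heartbeat arrived */
    uint64_t timeout;     /* silence longer than this is treated as loss */
} hb_monitor_t;

/* Start tracking a connection that was known to be alive at 'now'. */
static void hb_monitor_init(hb_monitor_t *m, uint64_t now, uint64_t timeout)
{
    assert(m != NULL);
    m->last_heard = now;
    m->timeout    = timeout;
}

/* Record the arrival of a heartbeat from the peer. */
static void hb_monitor_on_heartbeat(hb_monitor_t *m, uint64_t now)
{
    m->last_heard = now;
}

/* True if no heartbeat has been seen for more than 'timeout' seconds. */
static bool hb_monitor_is_dead(const hb_monitor_t *m, uint64_t now)
{
    return now > m->last_heard && (now - m->last_heard) > m->timeout;
}
```

A caller would invoke hb_monitor_on_heartbeat whenever a heartbeat delivery arrives and poll hb_monitor_is_dead from a periodic timer, closing the connection when it returns true so that stale connections do not accumulate.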



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org



[jira] [Commented] (DISPATCH-2262) Edge/Interior connections can half-fail in real multi-cloud environments

2021-11-11 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/DISPATCH-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17442371#comment-17442371
 ] 

ASF subversion and git services commented on DISPATCH-2262:
---

Commit 6769203991b20ecf0fdeb28bb8d84962b73c22fd in qpid-dispatch's branch 
refs/heads/main from Ted Ross
[ https://gitbox.apache.org/repos/asf?p=qpid-dispatch.git;h=6769203 ]

DISPATCH-2262 - Added missing handling for client-side link loss.





[jira] [Commented] (DISPATCH-2262) Edge/Interior connections can half-fail in real multi-cloud environments

2021-11-11 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DISPATCH-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17442369#comment-17442369
 ] 

ASF GitHub Bot commented on DISPATCH-2262:
--

ted-ross merged pull request #1433:
URL: https://github.com/apache/qpid-dispatch/pull/1433


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org





[jira] [Commented] (DISPATCH-2262) Edge/Interior connections can half-fail in real multi-cloud environments

2021-11-03 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DISPATCH-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17438441#comment-17438441
 ] 

ASF GitHub Bot commented on DISPATCH-2262:
--

ted-ross merged pull request #1418:
URL: https://github.com/apache/qpid-dispatch/pull/1418


   





[jira] [Commented] (DISPATCH-2262) Edge/Interior connections can half-fail in real multi-cloud environments

2021-11-03 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DISPATCH-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17438423#comment-17438423
 ] 

ASF GitHub Bot commented on DISPATCH-2262:
--

ted-ross opened a new pull request #1418:
URL: https://github.com/apache/qpid-dispatch/pull/1418


   …proposed changes.





[jira] [Commented] (DISPATCH-2262) Edge/Interior connections can half-fail in real multi-cloud environments

2021-11-03 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DISPATCH-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17438388#comment-17438388
 ] 

ASF GitHub Bot commented on DISPATCH-2262:
--

ted-ross merged pull request #1418:
URL: https://github.com/apache/qpid-dispatch/pull/1418









[jira] [Commented] (DISPATCH-2262) Edge/Interior connections can half-fail in real multi-cloud environments

2021-11-03 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DISPATCH-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17438383#comment-17438383
 ] 

ASF GitHub Bot commented on DISPATCH-2262:
--

ted-ross opened a new pull request #1418:
URL: https://github.com/apache/qpid-dispatch/pull/1418









[jira] [Commented] (DISPATCH-2262) Edge/Interior connections can half-fail in real multi-cloud environments

2021-11-03 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/DISPATCH-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17438258#comment-17438258
 ] 

ASF subversion and git services commented on DISPATCH-2262:
---

Commit be055324faa646106531068a23a65b641bb47e5a in qpid-dispatch's branch 
refs/heads/1.18.x from Ted Ross
[ https://gitbox.apache.org/repos/asf?p=qpid-dispatch.git;h=be05532 ]

DISPATCH-2262 - Trimmed down heartbeat message based on Robbie Gemmell's 
proposed changes.

(cherry picked from commit 95822e3083449259de90ccdf90a6eb1627420d3d)





[jira] [Commented] (DISPATCH-2262) Edge/Interior connections can half-fail in real multi-cloud environments

2021-11-03 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/DISPATCH-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17438257#comment-17438257
 ] 

ASF subversion and git services commented on DISPATCH-2262:
---

Commit 95822e3083449259de90ccdf90a6eb1627420d3d in qpid-dispatch's branch 
refs/heads/main from Ted Ross
[ https://gitbox.apache.org/repos/asf?p=qpid-dispatch.git;h=95822e3 ]

DISPATCH-2262 - Trimmed down heartbeat message based on Robbie Gemmell's 
proposed changes.





[jira] [Commented] (DISPATCH-2262) Edge/Interior connections can half-fail in real multi-cloud environments

2021-11-03 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DISPATCH-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17438255#comment-17438255
 ] 

ASF GitHub Bot commented on DISPATCH-2262:
--

ted-ross merged pull request #1418:
URL: https://github.com/apache/qpid-dispatch/pull/1418


   





[jira] [Commented] (DISPATCH-2262) Edge/Interior connections can half-fail in real multi-cloud environments

2021-11-03 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DISPATCH-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17438242#comment-17438242
 ] 

ASF GitHub Bot commented on DISPATCH-2262:
--

ted-ross opened a new pull request #1418:
URL: https://github.com/apache/qpid-dispatch/pull/1418


   …proposed changes.





[jira] [Commented] (DISPATCH-2262) Edge/Interior connections can half-fail in real multi-cloud environments

2021-10-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DISPATCH-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435306#comment-17435306
 ] 

ASF GitHub Bot commented on DISPATCH-2262:
--

gemmellr commented on a change in pull request #1396:
URL: https://github.com/apache/qpid-dispatch/pull/1396#discussion_r738142657



##
File path: src/router_core/modules/heartbeat_edge/heartbeat_edge.c
##
@@ -0,0 +1,245 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include "core_attach_address_lookup.h"
+#include "core_link_endpoint.h"
+#include "core_events.h"
+#include "module.h"
+#include "router_core_private.h"
+
+#include "qpid/dispatch/ctools.h"
+#include "qpid/dispatch/discriminator.h"
+#include "qpid/dispatch/amqp.h"
+
+#include 
+
+
+typedef struct qcm_heartbeat_edge_t {
+    qdr_core_t                *core;
+    qdrc_event_subscription_t *event_sub;
+    qdr_connection_t          *edge_conn;
+    qdr_core_timer_t          *timer;
+    qdrc_endpoint_t           *endpoint;
+    uint32_t                   link_credit;
+    uint32_t                   next_msg_id;
+} qcm_heartbeat_edge_t;
+
+
+//
+// Core Link API Handlers
+//
+
+/**
+ * Event - The attachment of a link initiated by the core-endpoint was completed
+ *
+ * Note that core-endpoint incoming links are _not_ provided credit by the core.  It
+ * is the responsibility of the core-endpoint to supply credit at the appropriate time
+ * by calling qdrc_endpoint_flow_CT.
+ *
+ * @param link_context The opaque context supplied in the call to qdrc_endpoint_create_link_CT
+ * @param remote_source Pointer to the remote source terminus of the link
+ * @param remote_target Pointer to the remote target terminus of the link
+ */
+static void on_second_attach(void           *link_context,
+                             qdr_terminus_t *remote_source,
+                             qdr_terminus_t *remote_target)
+{
+    qcm_heartbeat_edge_t *client = (qcm_heartbeat_edge_t*) link_context;
+    qdr_core_timer_schedule_CT(client->core, client->timer, 1);
+    qdr_terminus_free(remote_source);
+    qdr_terminus_free(remote_target);
+}
+
+/**
+ * Event - Credit/Drain status for an outgoing core-endpoint link has changed
+ *
+ * @param link_context The opaque context associated with the endpoint link
+ * @param available_credit The number of deliveries that may be sent on this link
+ * @param drain True iff the peer receiver is requesting that the credit be drained
+ */
+static void on_flow(void *link_context,
+                    int   available_credit,
+                    bool  drain)
+{
+    qcm_heartbeat_edge_t *client = (qcm_heartbeat_edge_t*) link_context;
+    client->link_credit = drain ? 0 : available_credit;
+}
+
+/**
+ * Event - The settlement and/or disposition of a delivery has been updated
+ *
+ * @param link_context The opaque context associated with the endpoint link
+ * @param delivery The delivery object experiencing the change
+ * @param settled True iff the delivery has been settled by the peer
+ * @param disposition The disposition of the delivery (PN_[ACCEPTED|REJECTED|MODIFIED|RELEASED])
+ */
+static void on_update(void           *link_context,
+                      qdr_delivery_t *delivery,
+                      bool            settled,
+                      uint64_t        disposition)
+{
+    //qcm_heartbeat_edge_t *client = (qcm_heartbeat_edge_t*) link_context;
+}
+
+/**
+ * Event - A core-endpoint link has been detached
+ *
+ * Note: It is safe to discard objects referenced by the link_context in this handler.
+ *       There will be no further references to this link_context returned after this call.
+ *
+ * @param link_context The opaque context associated with the endpoint link
+ * @param error The error information that came with the detach or 0 if no error
+ */
+static void on_first_detach(void        *link_context,
+                            qdr_error_t *error)
+{
+//qcm_heartbeat_edge_t *client = (qcm_heartbeat_edge_t*) 
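
The quoted diff is cut off here in the archive. As a rough illustration of how the link_credit and next_msg_id fields of the struct above might interact with a periodic timer, here is a minimal standalone sketch. It is an assumption-laden model, not the code from pull request #1396: hb_sender_t, hb_on_flow and hb_on_timer are hypothetical names, and the actual message send is replaced by returning the message id that would have been used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the credit and id fields of the module state. */
typedef struct {
    uint32_t link_credit;  /* deliveries the peer currently allows us to send */
    uint32_t next_msg_id;  /* id to stamp on the next heartbeat */
} hb_sender_t;

/* Mirrors the rule in on_flow above: a drain request zeroes our credit. */
static void hb_on_flow(hb_sender_t *s, int available_credit, bool drain)
{
    assert(s != NULL);
    if (drain || available_credit < 0)
        s->link_credit = 0;
    else
        s->link_credit = (uint32_t) available_credit;
}

/*
 * Timer tick: "send" one heartbeat only if credit is available.
 * Returns true and stores the id used in *msg_id_out when a heartbeat
 * would be sent; returns false and leaves *msg_id_out untouched otherwise.
 */
static bool hb_on_timer(hb_sender_t *s, uint32_t *msg_id_out)
{
    if (s->link_credit == 0)
        return false;
    s->link_credit--;
    *msg_id_out = s->next_msg_id++;
    return true;
}
```

Gating the send on credit keeps the sketch consistent with AMQP link flow control: with no credit the tick is simply skipped, and the timer would be rescheduled to try again on the next interval.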

[jira] [Commented] (DISPATCH-2262) Edge/Interior connections can half-fail in real multi-cloud environments

2021-10-27 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/DISPATCH-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435014#comment-17435014
 ] 

ASF subversion and git services commented on DISPATCH-2262:
---

Commit 81725eeb907b0d8173e956d3ca9f15a4d015db24 in qpid-dispatch's branch 
refs/heads/main from Ted Ross
[ https://gitbox.apache.org/repos/asf?p=qpid-dispatch.git;h=81725ee ]

DISPATCH-2262 - Fixed python-checker errors in the new test file.





[jira] [Commented] (DISPATCH-2262) Edge/Interior connections can half-fail in real multi-cloud environments

2021-10-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DISPATCH-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435015#comment-17435015
 ] 

ASF GitHub Bot commented on DISPATCH-2262:
--

ted-ross merged pull request #1396:
URL: https://github.com/apache/qpid-dispatch/pull/1396


   





[jira] [Commented] (DISPATCH-2262) Edge/Interior connections can half-fail in real multi-cloud environments

2021-10-27 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/DISPATCH-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435013#comment-17435013
 ] 

ASF subversion and git services commented on DISPATCH-2262:
---

Commit bafa55f22d775789cb0df786b142448ad2178867 in qpid-dispatch's branch 
refs/heads/main from Ted Ross
[ https://gitbox.apache.org/repos/asf?p=qpid-dispatch.git;h=bafa55f ]

DISPATCH-2262 - Added an app-level (above Proton) heartbeat feature that can be 
used when the Proton heartbeats aren't in effect.





[jira] [Commented] (DISPATCH-2262) Edge/Interior connections can half-fail in real multi-cloud environments

2021-10-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DISPATCH-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17435008#comment-17435008
 ] 

ASF GitHub Bot commented on DISPATCH-2262:
--

kgiusti commented on a change in pull request #1396:
URL: https://github.com/apache/qpid-dispatch/pull/1396#discussion_r737739790



##
File path: src/router_core/modules/heartbeat_edge/heartbeat_edge.c
##
@@ -0,0 +1,245 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+#include "core_attach_address_lookup.h"
+#include "core_link_endpoint.h"
+#include "core_events.h"
+#include "module.h"
+#include "router_core_private.h"
+
+#include "qpid/dispatch/ctools.h"
+#include "qpid/dispatch/discriminator.h"
+#include "qpid/dispatch/amqp.h"
+
+#include 
+
+
+typedef struct qcm_heartbeat_edge_t {
+    qdr_core_t                *core;
+    qdrc_event_subscription_t *event_sub;
+    qdr_connection_t          *edge_conn;
+    qdr_core_timer_t          *timer;
+    qdrc_endpoint_t           *endpoint;
+    uint32_t                   link_credit;
+    uint32_t                   next_msg_id;
+} qcm_heartbeat_edge_t;
+
+
+//
+// Core Link API Handlers
+//
+
+/**
+ * Event - The attachment of a link initiated by the core-endpoint was completed
+ *
+ * Note that core-endpoint incoming links are _not_ provided credit by the core.  It
+ * is the responsibility of the core-endpoint to supply credit at the appropriate time
+ * by calling qdrc_endpoint_flow_CT.
+ *
+ * @param link_context The opaque context supplied in the call to qdrc_endpoint_create_link_CT
+ * @param remote_source Pointer to the remote source terminus of the link
+ * @param remote_target Pointer to the remote target terminus of the link
+ */
+static void on_second_attach(void           *link_context,
+                             qdr_terminus_t *remote_source,
+                             qdr_terminus_t *remote_target)
+{
+    qcm_heartbeat_edge_t *client = (qcm_heartbeat_edge_t*) link_context;
+    qdr_core_timer_schedule_CT(client->core, client->timer, 1);
+    qdr_terminus_free(remote_source);
+    qdr_terminus_free(remote_target);
+}
+
+/**
+ * Event - Credit/Drain status for an outgoing core-endpoint link has changed
+ *
+ * @param link_context The opaque context associated with the endpoint link
+ * @param available_credit The number of deliveries that may be sent on this link
+ * @param drain True iff the peer receiver is requesting that the credit be drained
+ */
+static void on_flow(void *link_context,
+                    int   available_credit,
+                    bool  drain)
+{
+    qcm_heartbeat_edge_t *client = (qcm_heartbeat_edge_t*) link_context;
+    client->link_credit = drain ? 0 : available_credit;
+}
+
+/**
+ * Event - The settlement and/or disposition of a delivery has been updated
+ *
+ * @param link_context The opaque context associated with the endpoint link
+ * @param delivery The delivery object experiencing the change
+ * @param settled True iff the delivery has been settled by the peer
+ * @param disposition The disposition of the delivery (PN_[ACCEPTED|REJECTED|MODIFIED|RELEASED])
+ */
+static void on_update(void           *link_context,
+                      qdr_delivery_t *delivery,
+                      bool            settled,
+                      uint64_t        disposition)
+{
+    //qcm_heartbeat_edge_t *client = (qcm_heartbeat_edge_t*) link_context;
+}
+
+/**
+ * Event - A core-endpoint link has been detached
+ *
+ * Note: It is safe to discard objects referenced by the link_context in this handler.
+ *       There will be no further references to this link_context returned after this call.
+ *
+ * @param link_context The opaque context associated with the endpoint link
+ * @param error The error information that came with the detach or 0 if no error
+ */
+static void on_first_detach(void        *link_context,
+                            qdr_error_t *error)
+{
+//qcm_heartbeat_edge_t *client = (qcm_heartbeat_edge_t*) 

[jira] [Commented] (DISPATCH-2262) Edge/Interior connections can half-fail in real multi-cloud environments

2021-10-27 Thread Ted Ross (Jira)


[ 
https://issues.apache.org/jira/browse/DISPATCH-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17434954#comment-17434954
 ] 

Ted Ross commented on DISPATCH-2262:


[~gsim] - I think this one's a little different:

- It's intermittent, sometimes occurring and sometimes not with the same 
(default) configuration.
- When it happens (no empty frames), the timer does not expire and the 
connection does not close.




[jira] [Commented] (DISPATCH-2262) Edge/Interior connections can half-fail in real multi-cloud environments

2021-10-27 Thread Gordon Sim (Jira)


[ 
https://issues.apache.org/jira/browse/DISPATCH-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17434949#comment-17434949
 ] 

Gordon Sim commented on DISPATCH-2262:
--

Could either of these be relevant? 
https://issues.apache.org/jira/browse/PROTON-2411 
https://issues.apache.org/jira/browse/PROTON-2422




[jira] [Commented] (DISPATCH-2262) Edge/Interior connections can half-fail in real multi-cloud environments

2021-10-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DISPATCH-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17434879#comment-17434879
 ] 

ASF GitHub Bot commented on DISPATCH-2262:
--

ted-ross opened a new pull request #1396:
URL: https://github.com/apache/qpid-dispatch/pull/1396


   …that can be used when the Proton heartbeats aren't in effect.
   
   This is a workaround for a possible Proton heartbeat issue.
   
   The heartbeat facility can be used by any endpoint that connects to the
   router by creating a sender to address '_$qd.edge_heartbeat'.  Once this
   sender is attached, messages must be sent along the link to keep the
   connection alive.  The connection will be closed by the connected router
   after 8 seconds of no traffic on the heartbeat link.
   
   In this PR, edge routers use this facility to protect their uplink
   connections to interior routers.
   
   Question: should the timeout be configurable?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

