[jira] [Commented] (PROTON-2177) IllegalStateException when freeing link as part of timeout handling

2020-02-03 Thread Carsten Lohmann (Jira)


[ 
https://issues.apache.org/jira/browse/PROTON-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17028829#comment-17028829
 ] 

Carsten Lohmann commented on PROTON-2177:
-

I wouldn't want to kill the whole session, thereby killing all other links on 
that session, when just one link creation attempt times out. (In our scenario 
there are several thousand links involved here, which would all need to be 
recreated.)

I would find it more consistent if "free()" detached the link object from any 
transport involvement, so that any further incoming frames for that link would 
be ignored.
Or, if that is not feasible, "free()" could throw an IllegalStateException if 
the link's local and remote states are not both CLOSED. That would at least make 
things more obvious and prevent the obscure "decref" IllegalStateException above.
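
To illustrate the second option, here is a minimal sketch. It uses simplified stand-in types, not the proton-j classes: {{FreeGuardSketch}}, {{SketchLink}}, {{remoteClosed()}} and {{isFreed()}} are hypothetical names, and only the state names mirror the endpoint states used in the test below.

```java
// Illustrative model only; these are NOT the proton-j classes.
class FreeGuardSketch {

    // Mirrors the endpoint state names used in the reproducer test.
    enum EndpointState { UNINITIALIZED, ACTIVE, CLOSED }

    // Hypothetical stand-in for a link that has already been opened locally.
    static final class SketchLink {
        private EndpointState localState = EndpointState.ACTIVE;
        private EndpointState remoteState = EndpointState.UNINITIALIZED;
        private boolean freed;

        // Local close, as in link.close().
        void close() { localState = EndpointState.CLOSED; }

        // Hypothetical hook: called when the peer's detach has been processed.
        void remoteClosed() { remoteState = EndpointState.CLOSED; }

        // Proposed behaviour: fail fast unless both sides are CLOSED.
        void free() {
            if (localState != EndpointState.CLOSED || remoteState != EndpointState.CLOSED) {
                throw new IllegalStateException(
                        "free() called with local=" + localState + ", remote=" + remoteState);
            }
            freed = true;
        }

        boolean isFreed() { return freed; }
    }

    public static void main(String[] args) {
        SketchLink link = new SketchLink();
        link.close();
        try {
            link.free(); // remote side has not closed yet
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        link.remoteClosed();
        link.free(); // both sides closed now
        System.out.println("freed: " + link.isFreed());
    }
}
```

In the timeout scenario the remote state would still be UNINITIALIZED when "free()" is called, so with such a check the caller would get an immediate and descriptive error instead of a later failure inside the transport.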




> IllegalStateException when freeing link as part of timeout handling
> ---
>
> Key: PROTON-2177
> URL: https://issues.apache.org/jira/browse/PROTON-2177
> Project: Qpid Proton
> Issue Type: Bug
> Components: proton-j
> Affects Versions: proton-j-0.33.3
> Reporter: Carsten Lohmann
> Priority: Major
>
> Invoking free() on a link, or the processing of a received {{detach}} frame, 
> may result in such an exception if the preconditions below are met:
> {noformat}
> java.lang.IllegalStateException
>   at org.apache.qpid.proton.engine.impl.EndpointImpl.decref(EndpointImpl.java:54)
>   at org.apache.qpid.proton.engine.impl.LinkImpl.postFinal(LinkImpl.java:128)
>   at org.apache.qpid.proton.engine.impl.EndpointImpl.decref(EndpointImpl.java:52)
>   at org.apache.qpid.proton.engine.impl.TransportLink.clearRemoteHandle(TransportLink.java:125)
>   at org.apache.qpid.proton.engine.impl.TransportImpl.handleDetach(TransportImpl.java:1379)
>   at org.apache.qpid.proton.engine.impl.TransportImpl.handleDetach(TransportImpl.java:70)
>   at org.apache.qpid.proton.amqp.transport.Detach.invoke(Detach.java:86)
>   at org.apache.qpid.proton.engine.impl.TransportImpl.handleFrame(TransportImpl.java:1453)
>   at org.apache.qpid.proton.engine.impl.FrameParser.input(FrameParser.java:425)
>   at org.apache.qpid.proton.engine.impl.FrameParser.process(FrameParser.java:536)
>   at org.apache.qpid.proton.engine.impl.TransportImpl.process(TransportImpl.java:1570)
>   at org.apache.qpid.proton.engine.impl.TransportImpl.processInput(TransportImpl.java:1528)
> {noformat}
> The scenario:
> We have implemented the logic for creating AMQP links with a 
> timeout-mechanism. That means that after invoking {{link.open()}} we wait for 
> a predefined time and if we haven't received the {{attach}} frame from the 
> server at that point, we call {{link.close()}} and then {{link.free()}} to 
> avoid having a memory leak.
> This has mostly worked well so far. In cases where the server was not sending 
> the attach frame in time, after calling {{sender.close()}} the server usually 
> finally responded with a {{detach}} frame (instead of sending an {{attach}} 
> in between).
> But lately, when testing with a high number of links (>1) we sometimes 
> encountered such a server behaviour (with Qpid Dispatch Router as server):
> - client sends {{attach}} frame
> - linkEstablishmentTimeout: server doesn't respond in time so client invokes 
> {{link.close()}} and then {{link.free()}}
> - *server sends an attach frame*
> - server sends a detach frame
> If that happens multiple times on the same session, the above exception 
> occurs when calling {{link.free()}}. Afterwards there are 'socket closed' 
> exceptions.
> If we don't invoke {{link.free()}} as part of the linkEstablishmentTimeout 
> handling, the exception doesn't occur.
> But not invoking {{link.free()}} would create a memory leak if the server 
> doesn't respond at all to the {{attach}} frame.
> ---
> The issue can be reproduced with this test method (to be run as an additional 
> method in the "org.apache.qpid.proton.systemtests.FreeTest" class)
> {code:java|collapse=true}
> @Test
> public void testFreeOnLinkEstablishmentTimeout() throws Exception {
>     LOGGER.fine(bold(" About to create transports"));
>     getClient().transport = Proton.transport();
>     ProtocolTracerEnabler.setProtocolTracer(getClient().transport, TestLoggingHelper.CLIENT_PREFIX);
>     getServer().transport = Proton.transport();
>     ProtocolTracerEnabler.setProtocolTracer(getServer().transport, "     " + TestLoggingHelper.SERVER_PREFIX);
>     getClient().connection = Proton.connection();
>     getClient().transport.bind(getClient().connection);
>     getServer().connection = Proton.connection();
> 

[jira] [Updated] (PROTON-2177) IllegalStateException when freeing link as part of timeout handling

2020-01-31 Thread Carsten Lohmann (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carsten Lohmann updated PROTON-2177:

Description: 
Invoking free() on a link, or the processing of a received {{detach}} frame, 
may result in such an exception if the preconditions below are met:
{noformat}
java.lang.IllegalStateException
    at org.apache.qpid.proton.engine.impl.EndpointImpl.decref(EndpointImpl.java:54)
    at org.apache.qpid.proton.engine.impl.LinkImpl.postFinal(LinkImpl.java:128)
    at org.apache.qpid.proton.engine.impl.EndpointImpl.decref(EndpointImpl.java:52)
    at org.apache.qpid.proton.engine.impl.TransportLink.clearRemoteHandle(TransportLink.java:125)
    at org.apache.qpid.proton.engine.impl.TransportImpl.handleDetach(TransportImpl.java:1379)
    at org.apache.qpid.proton.engine.impl.TransportImpl.handleDetach(TransportImpl.java:70)
    at org.apache.qpid.proton.amqp.transport.Detach.invoke(Detach.java:86)
    at org.apache.qpid.proton.engine.impl.TransportImpl.handleFrame(TransportImpl.java:1453)
    at org.apache.qpid.proton.engine.impl.FrameParser.input(FrameParser.java:425)
    at org.apache.qpid.proton.engine.impl.FrameParser.process(FrameParser.java:536)
    at org.apache.qpid.proton.engine.impl.TransportImpl.process(TransportImpl.java:1570)
    at org.apache.qpid.proton.engine.impl.TransportImpl.processInput(TransportImpl.java:1528)
{noformat}

The scenario:
We have implemented the logic for creating AMQP links with a timeout-mechanism. 
That means that after invoking {{link.open()}} we wait for a predefined time 
and if we haven't received the {{attach}} frame from the server at that point, 
we call {{link.close()}} and then {{link.free()}} to avoid having a memory leak.
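
As a rough sketch of that timeout handling (not our actual code): {{LinkTimeoutSketch}}, {{LinkHandle}}, {{openWithTimeout}} and the {{remoteAttach}} future are hypothetical stand-ins, and the sketch assumes Java 9+ for {{CompletableFuture.orTimeout}}. It also ignores threading; with real proton-j engine objects the cleanup would have to run on the thread that owns the connection.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only; LinkHandle is NOT the proton-j Link interface.
class LinkTimeoutSketch {

    // Hypothetical stand-in for the three link operations named in the description.
    interface LinkHandle {
        void open();
        void close();
        void free();
    }

    // Opens the link and waits for the peer's attach, represented here by a future
    // that the caller completes when the attach frame arrives. If the future fails
    // (including by timeout), the link is closed and freed.
    static CompletableFuture<Void> openWithTimeout(LinkHandle link,
                                                   CompletableFuture<Void> remoteAttach,
                                                   long timeoutMillis) {
        link.open();
        return remoteAttach
                .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS)
                .whenComplete((ok, err) -> {
                    if (err != null) {
                        link.close();
                        link.free(); // the call that can later lead to the exception above
                    }
                });
    }

    public static void main(String[] args) {
        LinkHandle link = new LinkHandle() {
            public void open()  { System.out.println("open"); }
            public void close() { System.out.println("close"); }
            public void free()  { System.out.println("free"); }
        };
        // The attach never arrives, so the timeout path runs.
        CompletableFuture<Void> neverAttached = new CompletableFuture<>();
        openWithTimeout(link, neverAttached, 100)
                .handle((ok, err) -> err != null)
                .join();
    }
}
```

The problematic case described below is the one where the peer's attach arrives only after this cleanup has already run.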

This has mostly worked well so far. In cases where the server was not sending 
the attach frame in time, after calling {{sender.close()}} the server usually 
finally responded with a {{detach}} frame (instead of sending an {{attach}} in 
between).

But lately, when testing with a high number of links (>1) we sometimes 
encountered such a server behaviour (with Qpid Dispatch Router as server):

- client sends {{attach}} frame
- linkEstablishmentTimeout: server doesn't respond in time so client invokes 
{{link.close()}} and then {{link.free()}}
- *server sends an attach frame*
- server sends a detach frame

If that happens multiple times on the same session, the above exception occurs 
when calling {{link.free()}}. Afterwards there are 'socket closed' exceptions.


If we don't invoke {{link.free()}} as part of the linkEstablishmentTimeout 
handling, the exception doesn't occur.
But not invoking {{link.free()}} would create a memory leak if the server 
doesn't respond at all to the {{attach}} frame.

---
The issue can be reproduced with this test method (to be run as an additional 
method in the "org.apache.qpid.proton.systemtests.FreeTest" class)

{code:java|collapse=true}
@Test
public void testFreeOnLinkEstablishmentTimeout() throws Exception {
    LOGGER.fine(bold(" About to create transports"));
    getClient().transport = Proton.transport();
    ProtocolTracerEnabler.setProtocolTracer(getClient().transport, TestLoggingHelper.CLIENT_PREFIX);
    getServer().transport = Proton.transport();
    ProtocolTracerEnabler.setProtocolTracer(getServer().transport, "     " + TestLoggingHelper.SERVER_PREFIX);
    getClient().connection = Proton.connection();
    getClient().transport.bind(getClient().connection);
    getServer().connection = Proton.connection();
    getServer().transport.bind(getServer().connection);
    LOGGER.fine(bold(" About to open connections"));
    getClient().connection.open();
    getServer().connection.open();
    doOutputInputCycle();
    LOGGER.fine(bold(" About to open session"));
    getClient().session = getClient().connection.session();
    getClient().session.open();
    pumpClientToServer();
    getServer().session = getServer().connection.sessionHead(of(UNINITIALIZED), of(ACTIVE));
    assertEndpointState(getServer().session, UNINITIALIZED, ACTIVE);
    getServer().session.open();
    assertEndpointState(getServer().session, ACTIVE, ACTIVE);
    pumpServerToClient();
    assertEndpointState(getClient().session, ACTIVE, ACTIVE);

    for (int i = 0; i < 5; i++) {
        LOGGER.fine("\n\n");
        LOGGER.fine(bold(" About to create client sender " + i + "; refcount on session: " + getSessionRefCount(getClient().session)));
        getClient().source = new Source();
        getClient().source.setAddress(null);
        getClient().target = new Target();
        getClient().target.setAddress("myQueue");
        getClient().sender = getClient().session.sender("sender" + i);
  

[jira] [Updated] (PROTON-2177) IllegalStateException when freeing link as part of timeout handling

2020-01-31 Thread Carsten Lohmann (Jira)


 [ 
https://issues.apache.org/jira/browse/PROTON-2177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carsten Lohmann updated PROTON-2177:

Description: 
Invoking free() on a link may result in such an exception if the preconditions 
below are met:
{noformat}
java.lang.IllegalStateException
    at org.apache.qpid.proton.engine.impl.EndpointImpl.decref(EndpointImpl.java:54)
    at org.apache.qpid.proton.engine.impl.LinkImpl.postFinal(LinkImpl.java:128)
    at org.apache.qpid.proton.engine.impl.EndpointImpl.decref(EndpointImpl.java:52)
    at org.apache.qpid.proton.engine.impl.TransportLink.clearRemoteHandle(TransportLink.java:125)
    at org.apache.qpid.proton.engine.impl.TransportImpl.handleDetach(TransportImpl.java:1379)
    at org.apache.qpid.proton.engine.impl.TransportImpl.handleDetach(TransportImpl.java:70)
    at org.apache.qpid.proton.amqp.transport.Detach.invoke(Detach.java:86)
    at org.apache.qpid.proton.engine.impl.TransportImpl.handleFrame(TransportImpl.java:1453)
    at org.apache.qpid.proton.engine.impl.FrameParser.input(FrameParser.java:425)
    at org.apache.qpid.proton.engine.impl.FrameParser.process(FrameParser.java:536)
    at org.apache.qpid.proton.engine.impl.TransportImpl.process(TransportImpl.java:1570)
    at org.apache.qpid.proton.engine.impl.TransportImpl.processInput(TransportImpl.java:1528)
{noformat}

The scenario:
We have implemented the logic for creating AMQP links with a timeout-mechanism. 
That means that after invoking {{link.open()}} we wait for a predefined time 
and if we haven't received the {{attach}} frame from the server at that point, 
we call {{link.close()}} and then {{link.free()}} to avoid having a memory leak.

This has mostly worked well so far. In cases where the server was not sending 
the attach frame in time, after calling {{sender.close()}} the server usually 
finally responded with a {{detach}} frame (instead of sending an {{attach}} in 
between).

But lately, when testing with a high number of links (>1) we sometimes 
encountered such a server behaviour (with Qpid Dispatch Router as server):

- client sends {{attach}} frame
- linkEstablishmentTimeout: server doesn't respond in time so client invokes 
{{link.close()}} and then {{link.free()}}
- *server sends an attach frame*
- server sends a detach frame

If that happens multiple times on the same session, the above exception occurs 
when calling {{link.free()}}. Afterwards there are 'socket closed' exceptions.


If we don't invoke {{link.free()}} as part of the linkEstablishmentTimeout 
handling, the exception doesn't occur.
But not invoking {{link.free()}} would create a memory leak if the server 
doesn't respond at all to the {{attach}} frame.

---
The issue can be reproduced with this test method (to be run as an additional 
method in the "org.apache.qpid.proton.systemtests.FreeTest" class)

{code:java|collapse=true}
@Test
public void testFreeOnLinkEstablishmentTimeout() throws Exception {
    LOGGER.fine(bold(" About to create transports"));
    getClient().transport = Proton.transport();
    ProtocolTracerEnabler.setProtocolTracer(getClient().transport, TestLoggingHelper.CLIENT_PREFIX);
    getServer().transport = Proton.transport();
    ProtocolTracerEnabler.setProtocolTracer(getServer().transport, "     " + TestLoggingHelper.SERVER_PREFIX);
    getClient().connection = Proton.connection();
    getClient().transport.bind(getClient().connection);
    getServer().connection = Proton.connection();
    getServer().transport.bind(getServer().connection);
    LOGGER.fine(bold(" About to open connections"));
    getClient().connection.open();
    getServer().connection.open();
    doOutputInputCycle();
    LOGGER.fine(bold(" About to open session"));
    getClient().session = getClient().connection.session();
    getClient().session.open();
    pumpClientToServer();
    getServer().session = getServer().connection.sessionHead(of(UNINITIALIZED), of(ACTIVE));
    assertEndpointState(getServer().session, UNINITIALIZED, ACTIVE);
    getServer().session.open();
    assertEndpointState(getServer().session, ACTIVE, ACTIVE);
    pumpServerToClient();
    assertEndpointState(getClient().session, ACTIVE, ACTIVE);

    for (int i = 0; i < 5; i++) {
        LOGGER.fine("\n\n");
        LOGGER.fine(bold(" About to create client sender " + i + "; refcount on session: " + getSessionRefCount(getClient().session)));
        getClient().source = new Source();
        getClient().source.setAddress(null);
        getClient().target = new Target();
        getClient().target.setAddress("myQueue");
        getClient().sender = getClient().session.sender("sender" + i);

[jira] [Created] (PROTON-2177) IllegalStateException when freeing link as part of timeout handling

2020-01-31 Thread Carsten Lohmann (Jira)
Carsten Lohmann created PROTON-2177:
---

 Summary: IllegalStateException when freeing link as part of 
timeout handling
 Key: PROTON-2177
 URL: https://issues.apache.org/jira/browse/PROTON-2177
 Project: Qpid Proton
  Issue Type: Bug
  Components: proton-j
Affects Versions: proton-j-0.33.3
Reporter: Carsten Lohmann


Invoking free() on a link may result in such an exception if the preconditions 
below are met:
{noformat}
java.lang.IllegalStateException
    at org.apache.qpid.proton.engine.impl.EndpointImpl.decref(EndpointImpl.java:54)
    at org.apache.qpid.proton.engine.impl.LinkImpl.postFinal(LinkImpl.java:128)
    at org.apache.qpid.proton.engine.impl.EndpointImpl.decref(EndpointImpl.java:52)
    at org.apache.qpid.proton.engine.impl.TransportLink.clearRemoteHandle(TransportLink.java:125)
    at org.apache.qpid.proton.engine.impl.TransportImpl.handleDetach(TransportImpl.java:1379)
    at org.apache.qpid.proton.engine.impl.TransportImpl.handleDetach(TransportImpl.java:70)
    at org.apache.qpid.proton.amqp.transport.Detach.invoke(Detach.java:86)
    at org.apache.qpid.proton.engine.impl.TransportImpl.handleFrame(TransportImpl.java:1453)
    at org.apache.qpid.proton.engine.impl.FrameParser.input(FrameParser.java:425)
    at org.apache.qpid.proton.engine.impl.FrameParser.process(FrameParser.java:536)
    at org.apache.qpid.proton.engine.impl.TransportImpl.process(TransportImpl.java:1570)
    at org.apache.qpid.proton.engine.impl.TransportImpl.processInput(TransportImpl.java:1528)
{noformat}

The scenario:
We have implemented the logic for creating AMQP links with a timeout-mechanism. 
That means that after invoking {{link.open()}} we wait for a predefined time 
and if we haven't received the {{attach}} frame from the server at that point, 
we call {{link.close()}} and then {{link.free()}} to avoid having a memory leak.

This has mostly worked well so far. In cases where the server was not sending 
the attach frame in time, after calling {{sender.close()}} the server usually 
finally responded with a {{detach}} frame (instead of sending an {{attach}} in 
between).

But lately, when testing with a high number of links (>1) we sometimes 
encountered such a server behaviour (Qpid):

- client sends {{attach}} frame
- linkEstablishmentTimeout: server doesn't respond in time so client invokes 
{{link.close()}} and then {{link.free()}}
- *server sends an attach frame*
- server sends a detach frame

If that happens multiple times on the same session, the above exception occurs 
when calling {{link.free()}}. Afterwards there are 'socket closed' exceptions.


If we don't invoke {{link.free()}} as part of the linkEstablishmentTimeout 
handling, the exception doesn't occur.
But not invoking {{link.free()}} would create a memory leak if the server 
doesn't respond at all to the {{attach}} frame.

---
The issue can be reproduced with this test method (to be run as an additional 
method in the "org.apache.qpid.proton.systemtests.FreeTest" class)

{code:java|collapse=true}
@Test
public void testFreeOnLinkEstablishmentTimeout() throws Exception {
    LOGGER.fine(bold(" About to create transports"));
    getClient().transport = Proton.transport();
    ProtocolTracerEnabler.setProtocolTracer(getClient().transport, TestLoggingHelper.CLIENT_PREFIX);
    getServer().transport = Proton.transport();
    ProtocolTracerEnabler.setProtocolTracer(getServer().transport, "     " + TestLoggingHelper.SERVER_PREFIX);
    getClient().connection = Proton.connection();
    getClient().transport.bind(getClient().connection);
    getServer().connection = Proton.connection();
    getServer().transport.bind(getServer().connection);
    LOGGER.fine(bold(" About to open connections"));
    getClient().connection.open();
    getServer().connection.open();
    doOutputInputCycle();
    LOGGER.fine(bold(" About to open session"));
    getClient().session = getClient().connection.session();
    getClient().session.open();
    pumpClientToServer();
    getServer().session = getServer().connection.sessionHead(of(UNINITIALIZED), of(ACTIVE));
    assertEndpointState(getServer().session, UNINITIALIZED, ACTIVE);
    getServer().session.open();
    assertEndpointState(getServer().session, ACTIVE, ACTIVE);
    pumpServerToClient();
    assertEndpointState(getClient().session, ACTIVE, ACTIVE);

    for (int i = 0; i < 5; i++) {
        LOGGER.fine("\n\n");
        LOGGER.fine(bold(" About to create client sender " + i + "; refcount on session: " + getSessionRefCount(getClient().session)));
        getClient().source = new Source();
        getClient().source.setAddress(null);
        getClient().target = new 

[jira] [Comment Edited] (DISPATCH-1086) Dispatch Router sporadically goes into a state where TLS connections to the auth service fail

2018-09-19 Thread Carsten Lohmann (JIRA)


[ 
https://issues.apache.org/jira/browse/DISPATCH-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620303#comment-16620303
 ] 

Carsten Lohmann edited comment on DISPATCH-1086 at 9/19/18 2:06 PM:


Here are backtraces from separate core dumps:

1:
{noformat}
#0  0x7f4a0578f4fd in pn_list_get (list=0x1424900, index=24140)
    at /build/qpid-proton-src/c/src/core/object/list.c:42
#1  0x7f4a0579866f in pni_session_bound (ssn=)
    at /build/qpid-proton-src/c/src/core/engine.c:1021
#2  0x7f4a0579892b in pn_connection_bound (
    connection=connection@entry=0x14244a0)
    at /build/qpid-proton-src/c/src/core/engine.c:157
#3  0x7f4a0579e268 in pn_transport_bind (transport=0xf1f760,
    connection=0x14244a0) at /build/qpid-proton-src/c/src/core/transport.c:706
#4  0x7f4a05797d35 in batch_next (batch=0x1461330)
    at /build/qpid-proton-src/c/src/core/connection_driver.c:41
#5  0x7f4a0557a6f1 in pconnection_batch_next ()
    at /build/qpid-proton-src/c/src/proactor/epoll.c:948
#6  0x7f4a05a0c2fb in thread_run (arg=arg@entry=0xf0bf60)
    at /build/qpid-dispatch-src/src/server.c:976
#7  0x7f4a05a0c590 in qd_server_run (qd=)
    at /build/qpid-dispatch-src/src/server.c:1247
#8  0x0040182c in main_process (
    config_path=0x7fff809c4965 "/tmp/qdrouterd.conf",
    python_pkgdir=, fd=2)
    at /build/qpid-dispatch-src/router/src/main.c:112
#9  0x00401589 in main (argc=3, argv=0x7fff809c3ba8)
    at /build/qpid-dispatch-src/router/src/main.c:360
{noformat}
2:
{noformat}
#0  ssl_cert_dup (cert=0x0) at ssl/ssl_cert.c:89
#1  0x7fda9b487b98 in SSL_new (ctx=0x1fb8a40) at ssl/ssl_lib.c:716
#2  0x7fda9c95b0c8 in init_ssl_socket (transport=0x7fda8014ec60,
    ssl=ssl@entry=0x7fda8004dc70)
    at /build/qpid-proton-src/c/src/ssl/openssl.c:1235
#3  0x7fda9c95bc18 in init_ssl_socket (ssl=0x7fda8004dc70,
    transport=0x7fda8014ec60)
    at /build/qpid-proton-src/c/src/ssl/openssl.c:1232
#4  process_input_ssl (transport=0x7fda8014ec60, layer=0,
    input_data=0x7fda80163f90 "\220\a", available=0)
    at /build/qpid-proton-src/c/src/ssl/openssl.c:963
#5  0x7fda9c95317a in transport_consume (
    transport=transport@entry=0x7fda8014ec60)
    at /build/qpid-proton-src/c/src/core/transport.c:1821
#6  0x7fda9c954eda in pn_transport_close_tail (transport=0x7fda8014ec60)
    at /build/qpid-proton-src/c/src/core/transport.c:2972
#7  0x7fda9c94cdea in pn_connection_driver_read_close (
    d=d@entry=0x7fda800e8968)
    at /build/qpid-proton-src/c/src/core/connection_driver.c:114
#8  0x7fda9c72f5f8 in pconnection_process (pc=pc@entry=0x7fda800e83c0,
    events=events@entry=0, timeout=timeout@entry=false,
    topup=topup@entry=true, is_io_2=is_io_2@entry=false)
    at /build/qpid-proton-src/c/src/proactor/epoll.c:1230
#9  0x7fda9c72f754 in pconnection_batch_next ()
    at /build/qpid-proton-src/c/src/proactor/epoll.c:953
#10 0x7fda9cbc12fb in thread_run (arg=0x1bd0f60)
    at /build/qpid-dispatch-src/src/server.c:976
#11 0x7fda9c512594 in start_thread (arg=)
    at pthread_create.c:463
#12 0x7fda9b7bbe6f in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
{noformat}
3:
{noformat}
#0  0x7f761a027858 in EVP_PKEY_up_ref (pkey=0x1a0bf8b480166)
    at crypto/evp/p_lib.c:160
#1  0x7f761a3898a4 in ssl_cert_dup (cert=0x7f761b85a6a0 )
    at ssl/ssl_cert.c:99
#2  0x7f761a394b98 in SSL_new (ctx=0x7f761ba77940 )
    at ssl/ssl_lib.c:716
#3  0x7f761b8680c8 in init_ssl_socket (transport=0xcbede0, ssl=0xc92d70)
    at /build/qpid-proton-src/c/src/ssl/openssl.c:1235
#4  0x7f761b869464 in init_ssl_socket (ssl=,
    transport=)
    at /build/qpid-proton-src/c/src/ssl/openssl.c:1232
#5  pn_ssl_init (ssl0=, domain=,
    session_id=session_id@entry=0x0)
    at /build/qpid-proton-src/c/src/ssl/openssl.c:822
#6  0x7f761bab09ef in qdr_handle_authentication_service_connection_event (
    e=e@entry=0xc78900) at /build/qpid-dispatch-src/src/remote_sasl.c:623
#7  0x7f761bacda2c in handle (qd_server=qd_server@entry=0x85bf60,
    e=e@entry=0xc78900, pn_conn=pn_conn@entry=0xc35ba0, ctx=ctx@entry=0x0)
    at /build/qpid-dispatch-src/src/server.c:864
#8  0x7f761bace2e4 in thread_run (arg=arg@entry=0x85bf60)
    at /build/qpid-dispatch-src/src/server.c:973
#9  0x7f761bace590 in qd_server_run (qd=)
    at /build/qpid-dispatch-src/src/server.c:1247
#10 0x0040182c in main_process (
    config_path=0x7ffecc430965 "/tmp/qdrouterd.conf",
    python_pkgdir=, fd=2)
    at /build/qpid-dispatch-src/router/src/main.c:112
#11 0x00401589 in main (argc=3, argv=0x7ffecc42e848)
    at /build/qpid-dispatch-src/router/src/main.c:360
{noformat}
 4:
{noformat}
#0  __GI_abort () at abort.c:107
#1  0x7f3308a8e7b7 in __libc_message (action=action@entry=do_abort,
    fmt=fmt@entry=0x7f3308b98359 "%s\n") at ../sysdeps/posix/libc_fatal.c:181

[jira] [Comment Edited] (DISPATCH-1086) Dispatch Router sporadically goes into a state where TLS connections to the auth service fail

2018-09-19 Thread Carsten Lohmann (JIRA)


[ 
https://issues.apache.org/jira/browse/DISPATCH-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620303#comment-16620303
 ] 

Carsten Lohmann edited comment on DISPATCH-1086 at 9/19/18 9:31 AM:


Here are backtraces from separate core dumps:

1:
{noformat}
#0  0x7f4a0578f4fd in pn_list_get (list=0x1424900, index=24140)
    at /build/qpid-proton-src/c/src/core/object/list.c:42
#1  0x7f4a0579866f in pni_session_bound (ssn=)
    at /build/qpid-proton-src/c/src/core/engine.c:1021
#2  0x7f4a0579892b in pn_connection_bound (
    connection=connection@entry=0x14244a0)
    at /build/qpid-proton-src/c/src/core/engine.c:157
#3  0x7f4a0579e268 in pn_transport_bind (transport=0xf1f760,
    connection=0x14244a0) at /build/qpid-proton-src/c/src/core/transport.c:706
#4  0x7f4a05797d35 in batch_next (batch=0x1461330)
    at /build/qpid-proton-src/c/src/core/connection_driver.c:41
#5  0x7f4a0557a6f1 in pconnection_batch_next ()
    at /build/qpid-proton-src/c/src/proactor/epoll.c:948
#6  0x7f4a05a0c2fb in thread_run (arg=arg@entry=0xf0bf60)
    at /build/qpid-dispatch-src/src/server.c:976
#7  0x7f4a05a0c590 in qd_server_run (qd=)
    at /build/qpid-dispatch-src/src/server.c:1247
#8  0x0040182c in main_process (
    config_path=0x7fff809c4965 "/tmp/qdrouterd.conf",
    python_pkgdir=, fd=2)
    at /build/qpid-dispatch-src/router/src/main.c:112
#9  0x00401589 in main (argc=3, argv=0x7fff809c3ba8)
    at /build/qpid-dispatch-src/router/src/main.c:360
{noformat}
2:
{noformat}
#0  ssl_cert_dup (cert=0x0) at ssl/ssl_cert.c:89
#1  0x7fda9b487b98 in SSL_new (ctx=0x1fb8a40) at ssl/ssl_lib.c:716
#2  0x7fda9c95b0c8 in init_ssl_socket (transport=0x7fda8014ec60,
    ssl=ssl@entry=0x7fda8004dc70)
    at /build/qpid-proton-src/c/src/ssl/openssl.c:1235
#3  0x7fda9c95bc18 in init_ssl_socket (ssl=0x7fda8004dc70,
    transport=0x7fda8014ec60)
    at /build/qpid-proton-src/c/src/ssl/openssl.c:1232
#4  process_input_ssl (transport=0x7fda8014ec60, layer=0,
    input_data=0x7fda80163f90 "\220\a", available=0)
    at /build/qpid-proton-src/c/src/ssl/openssl.c:963
#5  0x7fda9c95317a in transport_consume (
    transport=transport@entry=0x7fda8014ec60)
    at /build/qpid-proton-src/c/src/core/transport.c:1821
#6  0x7fda9c954eda in pn_transport_close_tail (transport=0x7fda8014ec60)
    at /build/qpid-proton-src/c/src/core/transport.c:2972
#7  0x7fda9c94cdea in pn_connection_driver_read_close (
    d=d@entry=0x7fda800e8968)
    at /build/qpid-proton-src/c/src/core/connection_driver.c:114
#8  0x7fda9c72f5f8 in pconnection_process (pc=pc@entry=0x7fda800e83c0,
    events=events@entry=0, timeout=timeout@entry=false,
    topup=topup@entry=true, is_io_2=is_io_2@entry=false)
    at /build/qpid-proton-src/c/src/proactor/epoll.c:1230
#9  0x7fda9c72f754 in pconnection_batch_next ()
    at /build/qpid-proton-src/c/src/proactor/epoll.c:953
#10 0x7fda9cbc12fb in thread_run (arg=0x1bd0f60)
    at /build/qpid-dispatch-src/src/server.c:976
#11 0x7fda9c512594 in start_thread (arg=)
    at pthread_create.c:463
#12 0x7fda9b7bbe6f in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
{noformat}
3:
{noformat}
#0  0x7f761a027858 in EVP_PKEY_up_ref (pkey=0x1a0bf8b480166)
    at crypto/evp/p_lib.c:160
#1  0x7f761a3898a4 in ssl_cert_dup (cert=0x7f761b85a6a0 )
    at ssl/ssl_cert.c:99
#2  0x7f761a394b98 in SSL_new (ctx=0x7f761ba77940 )
    at ssl/ssl_lib.c:716
#3  0x7f761b8680c8 in init_ssl_socket (transport=0xcbede0, ssl=0xc92d70)
    at /build/qpid-proton-src/c/src/ssl/openssl.c:1235
#4  0x7f761b869464 in init_ssl_socket (ssl=<optimized out>,
    transport=<optimized out>)
    at /build/qpid-proton-src/c/src/ssl/openssl.c:1232
#5  pn_ssl_init (ssl0=<optimized out>, domain=<optimized out>,
    session_id=session_id@entry=0x0)
    at /build/qpid-proton-src/c/src/ssl/openssl.c:822
#6  0x7f761bab09ef in qdr_handle_authentication_service_connection_event (
    e=e@entry=0xc78900) at /build/qpid-dispatch-src/src/remote_sasl.c:623
#7  0x7f761bacda2c in handle (qd_server=qd_server@entry=0x85bf60,
    e=e@entry=0xc78900, pn_conn=pn_conn@entry=0xc35ba0, ctx=ctx@entry=0x0)
    at /build/qpid-dispatch-src/src/server.c:864
#8  0x7f761bace2e4 in thread_run (arg=arg@entry=0x85bf60)
    at /build/qpid-dispatch-src/src/server.c:973
#9  0x7f761bace590 in qd_server_run (qd=<optimized out>)
    at /build/qpid-dispatch-src/src/server.c:1247
#10 0x0040182c in main_process (
    config_path=0x7ffecc430965 "/tmp/qdrouterd.conf",
    python_pkgdir=<optimized out>, fd=2)
    at /build/qpid-dispatch-src/router/src/main.c:112
#11 0x00401589 in main (argc=3, argv=0x7ffecc42e848)
    at /build/qpid-dispatch-src/router/src/main.c:360
{noformat}
 4:
{noformat}
#0  __GI_abort () at abort.c:107
#1  0x7f3308a8e7b7 in __libc_message (action=action@entry=do_abort,
    fmt=fmt@entry=0x7f3308b98359 "%s\n") at ../sysdeps/posix/libc_fatal.c:181

[jira] [Commented] (DISPATCH-1086) Dispatch Router sporadically goes into a state where TLS connections to the auth service fail

2018-09-19 Thread Carsten Lohmann (JIRA)


[ 
https://issues.apache.org/jira/browse/DISPATCH-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620303#comment-16620303
 ] 

Carsten Lohmann commented on DISPATCH-1086:
---

Here are backtraces from 3 separate core dumps:

1:
{noformat}
#0  0x7f4a0578f4fd in pn_list_get (list=0x1424900, index=24140)
    at /build/qpid-proton-src/c/src/core/object/list.c:42
#1  0x7f4a0579866f in pni_session_bound (ssn=<optimized out>)
    at /build/qpid-proton-src/c/src/core/engine.c:1021
#2  0x7f4a0579892b in pn_connection_bound (
    connection=connection@entry=0x14244a0)
    at /build/qpid-proton-src/c/src/core/engine.c:157
#3  0x7f4a0579e268 in pn_transport_bind (transport=0xf1f760,
    connection=0x14244a0) at /build/qpid-proton-src/c/src/core/transport.c:706
#4  0x7f4a05797d35 in batch_next (batch=0x1461330)
    at /build/qpid-proton-src/c/src/core/connection_driver.c:41
#5  0x7f4a0557a6f1 in pconnection_batch_next ()
    at /build/qpid-proton-src/c/src/proactor/epoll.c:948
#6  0x7f4a05a0c2fb in thread_run (arg=arg@entry=0xf0bf60)
    at /build/qpid-dispatch-src/src/server.c:976
#7  0x7f4a05a0c590 in qd_server_run (qd=<optimized out>)
    at /build/qpid-dispatch-src/src/server.c:1247
#8  0x0040182c in main_process (
    config_path=0x7fff809c4965 "/tmp/qdrouterd.conf",
    python_pkgdir=<optimized out>, fd=2)
    at /build/qpid-dispatch-src/router/src/main.c:112
#9  0x00401589 in main (argc=3, argv=0x7fff809c3ba8)
    at /build/qpid-dispatch-src/router/src/main.c:360
{noformat}
2:
{noformat}
#0  ssl_cert_dup (cert=0x0) at ssl/ssl_cert.c:89
#1  0x7fda9b487b98 in SSL_new (ctx=0x1fb8a40) at ssl/ssl_lib.c:716
#2  0x7fda9c95b0c8 in init_ssl_socket (transport=0x7fda8014ec60,
    ssl=ssl@entry=0x7fda8004dc70)
    at /build/qpid-proton-src/c/src/ssl/openssl.c:1235
#3  0x7fda9c95bc18 in init_ssl_socket (ssl=0x7fda8004dc70,
    transport=0x7fda8014ec60)
    at /build/qpid-proton-src/c/src/ssl/openssl.c:1232
#4  process_input_ssl (transport=0x7fda8014ec60, layer=0,
    input_data=0x7fda80163f90 "\220\a", available=0)
    at /build/qpid-proton-src/c/src/ssl/openssl.c:963
#5  0x7fda9c95317a in transport_consume (
    transport=transport@entry=0x7fda8014ec60)
    at /build/qpid-proton-src/c/src/core/transport.c:1821
#6  0x7fda9c954eda in pn_transport_close_tail (transport=0x7fda8014ec60)
    at /build/qpid-proton-src/c/src/core/transport.c:2972
#7  0x7fda9c94cdea in pn_connection_driver_read_close (
    d=d@entry=0x7fda800e8968)
    at /build/qpid-proton-src/c/src/core/connection_driver.c:114
#8  0x7fda9c72f5f8 in pconnection_process (pc=pc@entry=0x7fda800e83c0,
    events=events@entry=0, timeout=timeout@entry=false,
    topup=topup@entry=true, is_io_2=is_io_2@entry=false)
    at /build/qpid-proton-src/c/src/proactor/epoll.c:1230
#9  0x7fda9c72f754 in pconnection_batch_next ()
    at /build/qpid-proton-src/c/src/proactor/epoll.c:953
#10 0x7fda9cbc12fb in thread_run (arg=0x1bd0f60)
    at /build/qpid-dispatch-src/src/server.c:976
#11 0x7fda9c512594 in start_thread (arg=<optimized out>)
    at pthread_create.c:463
#12 0x7fda9b7bbe6f in clone ()
    at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
{noformat}
3:
{noformat}
#0  0x7f761a027858 in EVP_PKEY_up_ref (pkey=0x1a0bf8b480166)
    at crypto/evp/p_lib.c:160
#1  0x7f761a3898a4 in ssl_cert_dup (cert=0x7f761b85a6a0 )
    at ssl/ssl_cert.c:99
#2  0x7f761a394b98 in SSL_new (ctx=0x7f761ba77940 )
    at ssl/ssl_lib.c:716
#3  0x7f761b8680c8 in init_ssl_socket (transport=0xcbede0, ssl=0xc92d70)
    at /build/qpid-proton-src/c/src/ssl/openssl.c:1235
#4  0x7f761b869464 in init_ssl_socket (ssl=<optimized out>,
    transport=<optimized out>)
    at /build/qpid-proton-src/c/src/ssl/openssl.c:1232
#5  pn_ssl_init (ssl0=<optimized out>, domain=<optimized out>,
    session_id=session_id@entry=0x0)
    at /build/qpid-proton-src/c/src/ssl/openssl.c:822
#6  0x7f761bab09ef in qdr_handle_authentication_service_connection_event (
    e=e@entry=0xc78900) at /build/qpid-dispatch-src/src/remote_sasl.c:623
#7  0x7f761bacda2c in handle (qd_server=qd_server@entry=0x85bf60,
    e=e@entry=0xc78900, pn_conn=pn_conn@entry=0xc35ba0, ctx=ctx@entry=0x0)
    at /build/qpid-dispatch-src/src/server.c:864
#8  0x7f761bace2e4 in thread_run (arg=arg@entry=0x85bf60)
    at /build/qpid-dispatch-src/src/server.c:973
#9  0x7f761bace590 in qd_server_run (qd=<optimized out>)
    at /build/qpid-dispatch-src/src/server.c:1247
#10 0x0040182c in main_process (
    config_path=0x7ffecc430965 "/tmp/qdrouterd.conf",
    python_pkgdir=<optimized out>, fd=2)
    at /build/qpid-dispatch-src/router/src/main.c:112
#11 0x00401589 in main (argc=3, argv=0x7ffecc42e848)
    at /build/qpid-dispatch-src/router/src/main.c:360
{noformat}
 

> Dispatch Router sporadically goes into a state where TLS connections to the 
> auth service fail
> -
>
> Key: DISPATCH-1086
>

[jira] [Commented] (DISPATCH-1086) Dispatch Router sporadically goes into a state where TLS connections to the auth service fail

2018-09-19 Thread Carsten Lohmann (JIRA)


[ 
https://issues.apache.org/jira/browse/DISPATCH-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16620236#comment-16620236
 ] 

Carsten Lohmann commented on DISPATCH-1086:
---

(yes, sent via mail)

> Dispatch Router sporadically goes into a state where TLS connections to the 
> auth service fail
> -
>
> Key: DISPATCH-1086
> URL: https://issues.apache.org/jira/browse/DISPATCH-1086
> Project: Qpid Dispatch
>  Issue Type: Bug
>Affects Versions: 1.1.0
>Reporter: Keith Wall
>Priority: Major
>
> Whilst running performance tests against Enmasse, we periodically see a 
> problem where Dispatch Router (1.1.0) goes into a state where it fails to form 
> TLS connections to the authservice. When this occurs, the router needs to be 
> restarted to restore service. There does not seem to be a pattern to when the 
> issue occurs, but in all cases where it has been seen, the test case included 
> tens or hundreds of concurrently formed connections.
> The following message is written to the log:
>  
> {noformat}
> 2018-07-06 10:38:45.543519 + AUTHSERVICE (warning) Cannot initialise SSL
> {noformat}
> Unfortunately, turning up the router logging (using the following command) 
> reveals no more useful information. This Proton improvement JIRA was raised to 
> include the diagnostics from OpenSSL.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: dev-unsubscr...@qpid.apache.org
For additional commands, e-mail: dev-h...@qpid.apache.org



[jira] [Commented] (DISPATCH-1086) Dispatch Router sporadically goes into a state where TLS connections to the auth service fail

2018-09-17 Thread Carsten Lohmann (JIRA)


[ 
https://issues.apache.org/jira/browse/DISPATCH-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16617739#comment-16617739
 ] 

Carsten Lohmann commented on DISPATCH-1086:
---

We also got this issue.

Using the additional log information provided by PROTON-1886 (with PN_TRACE_DRV 
set to 1), there was this log output:
{noformat}
[0x7feae80d4f30]:Client SSL socket created.
[0x7feaf0142120]:Client SSL socket created.
[0x7feae80d4f30]:Read 26 bytes from SSL socket for app
[0x7feae80d4f30]:SSL socket freed.
[0x7feaf0142120]:Read 26 bytes from SSL socket for app
[0x7feaf0142120]:SSL socket freed.
[0x7feaec0c4e60]:SSL socket setup failure.
[0x7feaec0c4e60]:error:140BA0E4:SSL routines:SSL_new:ssl ctx has no default ssl version
2018-09-17 15:48:49.713040 + AUTHSERVICE (warning) Cannot initialise SSL
[0x7feaec0c4e60]:SSL socket setup failure.
[0x7feaec0c4e60]:error:140BA0E4:SSL routines:SSL_new:ssl ctx has no default ssl version
[0x7feae805ea00]:SSL socket setup failure.
[0x7feae805ea00]:error:140BA0E4:SSL routines:SSL_new:ssl ctx has no default ssl version
2018-09-17 15:48:49.713156 + AUTHSERVICE (warning) Cannot initialise SSL
[0x7feae805ea00]:SSL socket setup failure.
[0x7feae805ea00]:error:140BA0E4:SSL routines:SSL_new:ssl ctx has no default ssl version
[0x7feae8116100]:SSL socket setup failure.
[0x7feae8116100]:error:140BA0E4:SSL routines:SSL_new:ssl ctx has no default ssl version
2018-09-17 15:48:49.758598 + AUTHSERVICE (warning) Cannot initialise SSL
[0x7feae8116100]:SSL socket setup failure.
[0x7feae8116100]:error:140BA0E4:SSL routines:SSL_new:ssl ctx has no default ssl version
[0x7feae806adf0]:SSL socket setup failure.
[0x7feae806adf0]:error:140BA0E4:SSL routines:SSL_new:ssl ctx has no default ssl version
2018-09-17 15:48:49.762368 + AUTHSERVICE (warning) Cannot initialise SSL
[0x7feae806adf0]:SSL socket setup failure.
[0x7feae806adf0]:error:140BA0E4:SSL routines:SSL_new:ssl ctx has no default ssl version
{noformat}
After the last line above, the router exited with a seg fault.
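
To get a quick overview of trace output like this, the setup failures can be counted per bracketed address. This is only a rough sketch: it assumes the bracketed prefix identifies the transport, and the helper name count_setup_failures is made up for illustration, not part of Proton or Dispatch. The sample lines are an excerpt of the log above.

```python
import re
from collections import Counter

# Matches lines such as "[0x7feaec0c4e60]:SSL socket setup failure."
# Assumption: the bracketed value is the address of the affected transport.
FAILURE = re.compile(r"^\[(0x[0-9a-f]+)\]:SSL socket setup failure\.")


def count_setup_failures(lines):
    """Return a Counter mapping bracketed address -> number of setup failures."""
    counts = Counter()
    for line in lines:
        match = FAILURE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


# Excerpt of the trace output quoted in this comment.
sample = """\
[0x7feae80d4f30]:Client SSL socket created.
[0x7feae80d4f30]:SSL socket freed.
[0x7feaec0c4e60]:SSL socket setup failure.
[0x7feaec0c4e60]:error:140BA0E4:SSL routines:SSL_new:ssl ctx has no default ssl version
[0x7feaec0c4e60]:SSL socket setup failure.
[0x7feae805ea00]:SSL socket setup failure.
""".splitlines()

for address, failures in count_setup_failures(sample).items():
    print(address, failures)
```

Run against a full router log, this would show whether the failures are spread over many addresses (as in the output above) or concentrated on a few.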







[jira] [Created] (DISPATCH-1088) Add OpenTracing support

2018-07-25 Thread Carsten Lohmann (JIRA)
Carsten Lohmann created DISPATCH-1088:
-

 Summary: Add OpenTracing support
 Key: DISPATCH-1088
 URL: https://issues.apache.org/jira/browse/DISPATCH-1088
 Project: Qpid Dispatch
  Issue Type: New Feature
Reporter: Carsten Lohmann


In order to get an overview of the lifetime of a message going through 
different messaging components, it would be good to have support for 
distributed tracing in the dispatch router.
 The [OpenTracing|http://opentracing.io/] standard defines an API for this, 
facilitating the use of different (OpenTracing-compatible) tracing systems 
(e.g. [Jaeger|https://jaegertracing.io/]).

This will mean identifying the different steps of the message processing inside 
the router and creating/finishing corresponding (child-)_Span_s for them (see 
[OpenTracing semantic 
spec|https://github.com/opentracing/specification/blob/master/specification.md]).
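
To illustrate the idea of one span per processing step, here is a toy sketch in Python. It is not the OpenTracing API and not router code: the Span class, the span() helper and the step names (route_message, receive_delivery, forward_delivery) are all hypothetical and only show how child spans would be started and finished around each step.

```python
import time
from contextlib import contextmanager


class Span:
    """A toy span: a named, timed step with an optional parent span."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.start = time.monotonic()
        self.end = None

    def finish(self):
        self.end = time.monotonic()


finished = []  # spans, in the order in which they were finished


@contextmanager
def span(name, parent=None):
    """Start a span on entry and finish it on exit, even if the step raises."""
    current = Span(name, parent)
    try:
        yield current
    finally:
        current.finish()
        finished.append(current)


# Hypothetical processing steps for one message passing through the router.
with span("route_message") as root:
    with span("receive_delivery", parent=root):
        pass  # the work of this step would go here
    with span("forward_delivery", parent=root):
        pass

for s in finished:
    parent_name = s.parent.name if s.parent else "-"
    print(f"{s.name} (child of {parent_name})")
```

A real implementation would hand the finished spans to an OpenTracing-compatible tracer instead of collecting them in a list.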

As for the implementation, there is an [OpenTracing API for 
C++|https://github.com/opentracing/opentracing-cpp].

Some more pointers:
 - [Example for loading the tracer library 
dynamically|https://github.com/opentracing/opentracing-cpp/blob/master/example/dynamic_load/dynamic_load-example.cpp]
 (corresponding [PR|https://github.com/opentracing/opentracing-cpp/pull/45])
 This approach is supported by Jaeger now as well.
 - [Jaeger C++ OpenTracing 
binding|https://github.com/jaegertracing/jaeger-client-cpp]


