[jira] [Comment Edited] (QPID-8238) [Broker-J] Improve performance of asynchronous publishing of transient messages into topic exchange having queues bound using non-overlapping selectors

Alex Rudyy (JIRA) Thu, 01 Nov 2018 04:38:41 -0700


    [ 
https://issues.apache.org/jira/browse/QPID-8238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16644766#comment-16644766
 ]


Alex Rudyy edited comment on QPID-8238 at 11/1/18 11:37 AM:
------------------------------------------------------------

Apart from lower throughput of NIO layer, there is another aspect to the 
performance issues I had failed to notice before. When running my performance 
tests for longer amount of time I started to see long GC pauses:
{noformat}
2018-10-10T05:32:11.250-0400: 1918.070: [GC (Allocation Failure) 
2018-10-10T05:32:11.250-0400: 1918.071: [ParNew: 3145728K->3145728K(3145728K), 
0.0000665 secs]2018-10-10T05:32:11.250-0400: 1918.071: 
[CMS2018-10-10T05:32:11.363-0400: 1918.183: [CMS-concurrent-mark: 1.696/1.700 
secs] [Times: user=51.70 sys=1.93, real=1.70 secs]
 (concurrent mode failure): 6418793K->5067015K(6990528K), 43.9299868 secs] 
9564521K->5067015K(10136256K), [Metaspace: 37679K->37679K(1083392K)], 
43.9303498 secs] [Times: user=45.39 sys=0.00, real=43.92 secs]
2018-10-10T05:32:55.181-0400: 1962.001: Total time for which application 
threads were stopped: 43.9323704 seconds, Stopping threads took: 0.0004930 
seconds

2018-10-10T05:33:01.804-0400: 1968.624: [GC (Allocation Failure) 
2018-10-10T05:33:01.804-0400: 1968.624: [ParNew: 3145728K->349504K(3145728K), 
1.1281568 secs] 8881780K->6564365K(10136256K), 1.1284601 secs] [Times: 
user=47.24 sys=0.01, real=1.13 secs]
2018-10-10T05:33:02.933-0400: 1969.753: Total time for which application 
threads were stopped: 1.1302133 seconds, Stopping threads took: 0.0003888 
seconds

2018-10-10T05:33:04.515-0400: 1971.335: [GC (Allocation Failure) 
2018-10-10T05:33:04.515-0400: 1971.335: [ParNew: 3145728K->349504K(3145728K), 
1.1778553 secs] 9360589K->7056409K(10136256K), 1.1781560 secs] [Times: 
user=50.64 sys=0.00, real=1.18 secs]
2018-10-10T05:33:05.693-0400: 1972.513: Total time for which application 
threads were stopped: 1.1804099 seconds, Stopping threads took: 0.0007907 
seconds

2018-10-10T05:33:07.300-0400: 1974.120: [GC (Allocation Failure) 
2018-10-10T05:33:07.300-0400: 1974.120: [ParNew: 3145728K->3145728K(3145728K), 
0.0000554 secs]2018-10-10T05:33:07.300-0400: 1974.120: 
[CMS2018-10-10T05:33:13.021-0400: 1979.841: [CMS-concurrent-preclean: 
8.160/9.850 secs] [Times: user=110.86 sys=3.14, real=9.85 secs]
 (concurrent mode failure): 6706905K->5608432K(6990528K), 48.4980083 secs] 
9852633K->5608432K(10136256K), [Metaspace: 37689K->37689K(1083392K)], 
48.4983623 secs] [Times: user=48.53 sys=0.00, real=48.49 secs]
2018-10-10T05:33:55.798-0400: 2022.619: Total time for which application 
threads were stopped: 48.5008915 seconds, Stopping threads took: 0.0008321 
seconds

2018-10-10T05:33:59.596-0400: 2026.417: [GC (Allocation Failure) 
2018-10-10T05:33:59.596-0400: 2026.417: [ParNew: 3145728K->349504K(3145728K), 
1.2127988 secs] 8892795K->6665263K(10136256K), 1.2130993 secs] [Times: 
user=46.20 sys=0.10, real=1.21 secs]
2018-10-10T05:34:00.810-0400: 2027.630: Total time for which application 
threads were stopped: 1.2151635 seconds, Stopping threads took: 0.0006465 
seconds

2018-10-10T05:34:02.374-0400: 2029.194: [GC (Allocation Failure) 
2018-10-10T05:34:02.374-0400: 2029.194: [ParNew: 3145728K->3145728K(3145728K), 
0.0000684 secs]2018-10-10T05:34:02.374-0400: 2029.195: 
[CMS2018-10-10T05:34:04.462-0400: 2031.282: [CMS-concurrent-mark: 3.616/3.626 
secs] [Times: user=63.40 sys=3.03, real=3.62 secs]
 (concurrent mode failure): 6315759K->5859983K(6990528K), 44.3028251 secs] 
9461487K->5859983K(10136256K), [Metaspace: 37695K->37695K(1083392K)], 
44.3032096 secs] [Times: user=60.53 sys=0.71, real=44.30 secs]
2018-10-10T05:34:46.678-0400: 2073.498: Total time for which application 
threads were stopped: 44.3052052 seconds, Stopping threads took: 0.0005032 
seconds

{noformat}
The nature of the pauses is not clear to me at the moment. I need to 
investigate this further.


was (Author: alex.rufous):
Apart from lower throughput of NIO layer, there is another aspect to the 
performance issues I had failed to notice before. When running my performance 
tests for longer amount of time I started to see long GC pauses on cleaning of 
JVM Metaspace:

{noformat}
2018-10-10T05:32:11.250-0400: 1918.070: [GC (Allocation Failure) 
2018-10-10T05:32:11.250-0400: 1918.071: [ParNew: 3145728K->3145728K(3145728K), 
0.0000665 secs]2018-10-10T05:32:11.250-0400: 1918.071: 
[CMS2018-10-10T05:32:11.363-0400: 1918.183: [CMS-concurrent-mark: 1.696/1.700 
secs] [Times: user=51.70 sys=1.93, real=1.70 secs]
 (concurrent mode failure): 6418793K->5067015K(6990528K), 43.9299868 secs] 
9564521K->5067015K(10136256K), [Metaspace: 37679K->37679K(1083392K)], 
43.9303498 secs] [Times: user=45.39 sys=0.00, real=43.92 secs]
2018-10-10T05:32:55.181-0400: 1962.001: Total time for which application 
threads were stopped: 43.9323704 seconds, Stopping threads took: 0.0004930 
seconds

2018-10-10T05:33:01.804-0400: 1968.624: [GC (Allocation Failure) 
2018-10-10T05:33:01.804-0400: 1968.624: [ParNew: 3145728K->349504K(3145728K), 
1.1281568 secs] 8881780K->6564365K(10136256K), 1.1284601 secs] [Times: 
user=47.24 sys=0.01, real=1.13 secs]
2018-10-10T05:33:02.933-0400: 1969.753: Total time for which application 
threads were stopped: 1.1302133 seconds, Stopping threads took: 0.0003888 
seconds

2018-10-10T05:33:04.515-0400: 1971.335: [GC (Allocation Failure) 
2018-10-10T05:33:04.515-0400: 1971.335: [ParNew: 3145728K->349504K(3145728K), 
1.1778553 secs] 9360589K->7056409K(10136256K), 1.1781560 secs] [Times: 
user=50.64 sys=0.00, real=1.18 secs]
2018-10-10T05:33:05.693-0400: 1972.513: Total time for which application 
threads were stopped: 1.1804099 seconds, Stopping threads took: 0.0007907 
seconds

2018-10-10T05:33:07.300-0400: 1974.120: [GC (Allocation Failure) 
2018-10-10T05:33:07.300-0400: 1974.120: [ParNew: 3145728K->3145728K(3145728K), 
0.0000554 secs]2018-10-10T05:33:07.300-0400: 1974.120: 
[CMS2018-10-10T05:33:13.021-0400: 1979.841: [CMS-concurrent-preclean: 
8.160/9.850 secs] [Times: user=110.86 sys=3.14, real=9.85 secs]
 (concurrent mode failure): 6706905K->5608432K(6990528K), 48.4980083 secs] 
9852633K->5608432K(10136256K), [Metaspace: 37689K->37689K(1083392K)], 
48.4983623 secs] [Times: user=48.53 sys=0.00, real=48.49 secs]
2018-10-10T05:33:55.798-0400: 2022.619: Total time for which application 
threads were stopped: 48.5008915 seconds, Stopping threads took: 0.0008321 
seconds

2018-10-10T05:33:59.596-0400: 2026.417: [GC (Allocation Failure) 
2018-10-10T05:33:59.596-0400: 2026.417: [ParNew: 3145728K->349504K(3145728K), 
1.2127988 secs] 8892795K->6665263K(10136256K), 1.2130993 secs] [Times: 
user=46.20 sys=0.10, real=1.21 secs]
2018-10-10T05:34:00.810-0400: 2027.630: Total time for which application 
threads were stopped: 1.2151635 seconds, Stopping threads took: 0.0006465 
seconds

2018-10-10T05:34:02.374-0400: 2029.194: [GC (Allocation Failure) 
2018-10-10T05:34:02.374-0400: 2029.194: [ParNew: 3145728K->3145728K(3145728K), 
0.0000684 secs]2018-10-10T05:34:02.374-0400: 2029.195: 
[CMS2018-10-10T05:34:04.462-0400: 2031.282: [CMS-concurrent-mark: 3.616/3.626 
secs] [Times: user=63.40 sys=3.03, real=3.62 secs]
 (concurrent mode failure): 6315759K->5859983K(6990528K), 44.3028251 secs] 
9461487K->5859983K(10136256K), [Metaspace: 37695K->37695K(1083392K)], 
44.3032096 secs] [Times: user=60.53 sys=0.71, real=44.30 secs]
2018-10-10T05:34:46.678-0400: 2073.498: Total time for which application 
threads were stopped: 44.3052052 seconds, Stopping threads took: 0.0005032 
seconds

{noformat}

The nature of the pauses is not clear to me at the moment. I need to 
investigate this further.

> [Broker-J] Improve performance of asynchronous publishing of transient 
> messages into topic exchange having queues bound using non-overlapping 
> selectors 
> --------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: QPID-8238
>                 URL: https://issues.apache.org/jira/browse/QPID-8238
>             Project: Qpid
>          Issue Type: Improvement
>          Components: Broker-J
>    Affects Versions: qpid-java-broker-7.0.6
>            Reporter: Alex Rudyy
>            Priority: Major
>             Fix For: qpid-java-broker-7.1.0, qpid-java-broker-7.0.7
>
>         Attachments: 
> 0001-QPID-8238-Use-java.lang.String-for-keys-and-values-i.patch
>
>
> The performance of asynchronous publishing of transient messages into topic 
> exchange which routes messages into queues bound using non-overlapping 
> selectors is 2-3 times slower than performance of 0.32 broker. The 
> performance degradation is observed with AMQP 0.9, though, I suspect that the 
> AMQP 0-10 protocol could be affected as well.
> I was running tests with 10 concurrent producers publishing messages  on 
> separate connections using the same routing key into 10 different queues 
> (subscribers queues) bound to the exchange using non-overlapping selectors.
> My testing showed that performance of 7.0 broker for this particular use case 
> was 2-3 times worse than performance of 0.32 broker.
> The following factors contributed to degradation of performance:
> •     Copying data from direct memory into heap memory whilst decoding 
> message headers. Due to this factor, the decoding of message headers is 
> around twice slower. It seems it contributes around 70% to total performance 
> degradation
> •     The message routing algorithm is slower due to need to support a new 
> feature to route messages into bound exchanges (in addition to queues) using 
> replacement routing key.
> •     AMQ short strings caching contributes 5-10% to total performance 
> degradation. The caching was added to manage heap space more efficiently.
> The numbers provided here could be inaccurate due instrumentation overhead 
> whilst profiling the issue.
> Potentially, caching can be turned off but that will not improve performance 
> much.
> On other hand, adding of additional caching of strings to amqp-short-strings 
> would improve the performance a bit. Whilst evaluating selectors, the fields 
> used in selector expressions are represented as java strings but they get 
> converted every time into amqp-short-strings when looking up for message 
> header values. If 10 queues are bound to the exchange using the same binding 
> key, the selector expression is evaluated 10 times for the incoming message. 
> Thus, all selector field names are get converted into amqp-short-strings 10 
> times as well. It seems adding caching here can improve the performance. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Comment Edited] (QPID-8238) [Broker-J] Improve performance of asynchronous publishing of transient messages into topic exchange having queues bound using non-overlapping selectors

Reply via email to