[
https://issues.apache.org/jira/browse/QPID-8238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16644766#comment-16644766
]
Alex Rudyy edited comment on QPID-8238 at 11/1/18 11:37 AM:
------------------------------------------------------------
Apart from lower throughput of NIO layer, there is another aspect to the
performance issues I had failed to notice before. When running my performance
tests for longer amount of time I started to see long GC pauses:
{noformat}
2018-10-10T05:32:11.250-0400: 1918.070: [GC (Allocation Failure)
2018-10-10T05:32:11.250-0400: 1918.071: [ParNew: 3145728K->3145728K(3145728K),
0.0000665 secs]2018-10-10T05:32:11.250-0400: 1918.071:
[CMS2018-10-10T05:32:11.363-0400: 1918.183: [CMS-concurrent-mark: 1.696/1.700
secs] [Times: user=51.70 sys=1.93, real=1.70 secs]
(concurrent mode failure): 6418793K->5067015K(6990528K), 43.9299868 secs]
9564521K->5067015K(10136256K), [Metaspace: 37679K->37679K(1083392K)],
43.9303498 secs] [Times: user=45.39 sys=0.00, real=43.92 secs]
2018-10-10T05:32:55.181-0400: 1962.001: Total time for which application
threads were stopped: 43.9323704 seconds, Stopping threads took: 0.0004930
seconds
2018-10-10T05:33:01.804-0400: 1968.624: [GC (Allocation Failure)
2018-10-10T05:33:01.804-0400: 1968.624: [ParNew: 3145728K->349504K(3145728K),
1.1281568 secs] 8881780K->6564365K(10136256K), 1.1284601 secs] [Times:
user=47.24 sys=0.01, real=1.13 secs]
2018-10-10T05:33:02.933-0400: 1969.753: Total time for which application
threads were stopped: 1.1302133 seconds, Stopping threads took: 0.0003888
seconds
2018-10-10T05:33:04.515-0400: 1971.335: [GC (Allocation Failure)
2018-10-10T05:33:04.515-0400: 1971.335: [ParNew: 3145728K->349504K(3145728K),
1.1778553 secs] 9360589K->7056409K(10136256K), 1.1781560 secs] [Times:
user=50.64 sys=0.00, real=1.18 secs]
2018-10-10T05:33:05.693-0400: 1972.513: Total time for which application
threads were stopped: 1.1804099 seconds, Stopping threads took: 0.0007907
seconds
2018-10-10T05:33:07.300-0400: 1974.120: [GC (Allocation Failure)
2018-10-10T05:33:07.300-0400: 1974.120: [ParNew: 3145728K->3145728K(3145728K),
0.0000554 secs]2018-10-10T05:33:07.300-0400: 1974.120:
[CMS2018-10-10T05:33:13.021-0400: 1979.841: [CMS-concurrent-preclean:
8.160/9.850 secs] [Times: user=110.86 sys=3.14, real=9.85 secs]
(concurrent mode failure): 6706905K->5608432K(6990528K), 48.4980083 secs]
9852633K->5608432K(10136256K), [Metaspace: 37689K->37689K(1083392K)],
48.4983623 secs] [Times: user=48.53 sys=0.00, real=48.49 secs]
2018-10-10T05:33:55.798-0400: 2022.619: Total time for which application
threads were stopped: 48.5008915 seconds, Stopping threads took: 0.0008321
seconds
2018-10-10T05:33:59.596-0400: 2026.417: [GC (Allocation Failure)
2018-10-10T05:33:59.596-0400: 2026.417: [ParNew: 3145728K->349504K(3145728K),
1.2127988 secs] 8892795K->6665263K(10136256K), 1.2130993 secs] [Times:
user=46.20 sys=0.10, real=1.21 secs]
2018-10-10T05:34:00.810-0400: 2027.630: Total time for which application
threads were stopped: 1.2151635 seconds, Stopping threads took: 0.0006465
seconds
2018-10-10T05:34:02.374-0400: 2029.194: [GC (Allocation Failure)
2018-10-10T05:34:02.374-0400: 2029.194: [ParNew: 3145728K->3145728K(3145728K),
0.0000684 secs]2018-10-10T05:34:02.374-0400: 2029.195:
[CMS2018-10-10T05:34:04.462-0400: 2031.282: [CMS-concurrent-mark: 3.616/3.626
secs] [Times: user=63.40 sys=3.03, real=3.62 secs]
(concurrent mode failure): 6315759K->5859983K(6990528K), 44.3028251 secs]
9461487K->5859983K(10136256K), [Metaspace: 37695K->37695K(1083392K)],
44.3032096 secs] [Times: user=60.53 sys=0.71, real=44.30 secs]
2018-10-10T05:34:46.678-0400: 2073.498: Total time for which application
threads were stopped: 44.3052052 seconds, Stopping threads took: 0.0005032
seconds
{noformat}
The nature of the pauses is not clear to me at the moment. I need to
investigate this further.
was (Author: alex.rufous):
Apart from lower throughput of NIO layer, there is another aspect to the
performance issues I had failed to notice before. When running my performance
tests for longer amount of time I started to see long GC pauses on cleaning of
JVM Metaspace:
{noformat}
2018-10-10T05:32:11.250-0400: 1918.070: [GC (Allocation Failure)
2018-10-10T05:32:11.250-0400: 1918.071: [ParNew: 3145728K->3145728K(3145728K),
0.0000665 secs]2018-10-10T05:32:11.250-0400: 1918.071:
[CMS2018-10-10T05:32:11.363-0400: 1918.183: [CMS-concurrent-mark: 1.696/1.700
secs] [Times: user=51.70 sys=1.93, real=1.70 secs]
(concurrent mode failure): 6418793K->5067015K(6990528K), 43.9299868 secs]
9564521K->5067015K(10136256K), [Metaspace: 37679K->37679K(1083392K)],
43.9303498 secs] [Times: user=45.39 sys=0.00, real=43.92 secs]
2018-10-10T05:32:55.181-0400: 1962.001: Total time for which application
threads were stopped: 43.9323704 seconds, Stopping threads took: 0.0004930
seconds
2018-10-10T05:33:01.804-0400: 1968.624: [GC (Allocation Failure)
2018-10-10T05:33:01.804-0400: 1968.624: [ParNew: 3145728K->349504K(3145728K),
1.1281568 secs] 8881780K->6564365K(10136256K), 1.1284601 secs] [Times:
user=47.24 sys=0.01, real=1.13 secs]
2018-10-10T05:33:02.933-0400: 1969.753: Total time for which application
threads were stopped: 1.1302133 seconds, Stopping threads took: 0.0003888
seconds
2018-10-10T05:33:04.515-0400: 1971.335: [GC (Allocation Failure)
2018-10-10T05:33:04.515-0400: 1971.335: [ParNew: 3145728K->349504K(3145728K),
1.1778553 secs] 9360589K->7056409K(10136256K), 1.1781560 secs] [Times:
user=50.64 sys=0.00, real=1.18 secs]
2018-10-10T05:33:05.693-0400: 1972.513: Total time for which application
threads were stopped: 1.1804099 seconds, Stopping threads took: 0.0007907
seconds
2018-10-10T05:33:07.300-0400: 1974.120: [GC (Allocation Failure)
2018-10-10T05:33:07.300-0400: 1974.120: [ParNew: 3145728K->3145728K(3145728K),
0.0000554 secs]2018-10-10T05:33:07.300-0400: 1974.120:
[CMS2018-10-10T05:33:13.021-0400: 1979.841: [CMS-concurrent-preclean:
8.160/9.850 secs] [Times: user=110.86 sys=3.14, real=9.85 secs]
(concurrent mode failure): 6706905K->5608432K(6990528K), 48.4980083 secs]
9852633K->5608432K(10136256K), [Metaspace: 37689K->37689K(1083392K)],
48.4983623 secs] [Times: user=48.53 sys=0.00, real=48.49 secs]
2018-10-10T05:33:55.798-0400: 2022.619: Total time for which application
threads were stopped: 48.5008915 seconds, Stopping threads took: 0.0008321
seconds
2018-10-10T05:33:59.596-0400: 2026.417: [GC (Allocation Failure)
2018-10-10T05:33:59.596-0400: 2026.417: [ParNew: 3145728K->349504K(3145728K),
1.2127988 secs] 8892795K->6665263K(10136256K), 1.2130993 secs] [Times:
user=46.20 sys=0.10, real=1.21 secs]
2018-10-10T05:34:00.810-0400: 2027.630: Total time for which application
threads were stopped: 1.2151635 seconds, Stopping threads took: 0.0006465
seconds
2018-10-10T05:34:02.374-0400: 2029.194: [GC (Allocation Failure)
2018-10-10T05:34:02.374-0400: 2029.194: [ParNew: 3145728K->3145728K(3145728K),
0.0000684 secs]2018-10-10T05:34:02.374-0400: 2029.195:
[CMS2018-10-10T05:34:04.462-0400: 2031.282: [CMS-concurrent-mark: 3.616/3.626
secs] [Times: user=63.40 sys=3.03, real=3.62 secs]
(concurrent mode failure): 6315759K->5859983K(6990528K), 44.3028251 secs]
9461487K->5859983K(10136256K), [Metaspace: 37695K->37695K(1083392K)],
44.3032096 secs] [Times: user=60.53 sys=0.71, real=44.30 secs]
2018-10-10T05:34:46.678-0400: 2073.498: Total time for which application
threads were stopped: 44.3052052 seconds, Stopping threads took: 0.0005032
seconds
{noformat}
The nature of the pauses is not clear to me at the moment. I need to
investigate this further.
> [Broker-J] Improve performance of asynchronous publishing of transient
> messages into topic exchange having queues bound using non-overlapping
> selectors
> --------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: QPID-8238
> URL: https://issues.apache.org/jira/browse/QPID-8238
> Project: Qpid
> Issue Type: Improvement
> Components: Broker-J
> Affects Versions: qpid-java-broker-7.0.6
> Reporter: Alex Rudyy
> Priority: Major
> Fix For: qpid-java-broker-7.1.0, qpid-java-broker-7.0.7
>
> Attachments:
> 0001-QPID-8238-Use-java.lang.String-for-keys-and-values-i.patch
>
>
> The performance of asynchronous publishing of transient messages into topic
> exchange which routes messages into queues bound using non-overlapping
> selectors is 2-3 times slower than performance of 0.32 broker. The
> performance degradation is observed with AMQP 0.9, though, I suspect that the
> AMQP 0-10 protocol could be affected as well.
> I was running tests with 10 concurrent producers publishing messages on
> separate connections using the same routing key into 10 different queues
> (subscribers queues) bound to the exchange using non-overlapping selectors.
> My testing showed that performance of 7.0 broker for this particular use case
> was 2-3 times worse than performance of 0.32 broker.
> The following factors contributed to degradation of performance:
> • Copying data from direct memory into heap memory whilst decoding
> message headers. Due to this factor, the decoding of message headers is
> around twice slower. It seems it contributes around 70% to total performance
> degradation
> • The message routing algorithm is slower due to need to support a new
> feature to route messages into bound exchanges (in addition to queues) using
> replacement routing key.
> • AMQ short strings caching contributes 5-10% to total performance
> degradation. The caching was added to manage heap space more efficiently.
> The numbers provided here could be inaccurate due instrumentation overhead
> whilst profiling the issue.
> Potentially, caching can be turned off but that will not improve performance
> much.
> On other hand, adding of additional caching of strings to amqp-short-strings
> would improve the performance a bit. Whilst evaluating selectors, the fields
> used in selector expressions are represented as java strings but they get
> converted every time into amqp-short-strings when looking up for message
> header values. If 10 queues are bound to the exchange using the same binding
> key, the selector expression is evaluated 10 times for the incoming message.
> Thus, all selector field names are get converted into amqp-short-strings 10
> times as well. It seems adding caching here can improve the performance.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]