[
https://issues.apache.org/jira/browse/ARTEMIS-2852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17171394#comment-17171394
]
Kasper Kondzielski edited comment on ARTEMIS-2852 at 8/5/20, 10:02 AM:
-----------------------------------------------------------------------
I think that you got it right. We have 1 master and 2 slaves.
We wanted to achieve safe and persistent data replication. That's why we chose
master-slave configuration, as it is the only one which guarantees replication.
I know that the additional slave isn't used, as only a single slave can be
connected to a given master. I think this is actually a leftover from a
previous configuration and I just left it as it was.
Maybe it would be easier to describe what we were trying to achieve based on
a real example of another queue. Take a look at RabbitMQ with their quorum
queues, for example. Given a cluster of 3 nodes, each node participates equally
in message processing and data replication; in other words, data won't be lost
even if any of them goes down.
Having said that, I started to think that our test might be a little unfair,
since we configured data replication (using the master-slave approach) but we
didn't take care of message redistribution. Am I right that a cluster of 3
master nodes connected with each other, plus 3 slave nodes, each connected to
a particular master node, would be a more appropriate solution?
Something like that:
!Selection_451.png!
Which should also solve the split-brain problem.
Keep in mind that in our tests we are not scaling the cluster but rather the
number of senders and receivers.
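For reference, a minimal sketch of what each master's {{broker.xml}} might contain in such a 3-pair setup (the connector names and the catch-all address match are assumptions for illustration, not taken from our actual config):

{code:xml}
<core xmlns="urn:activemq:core">
  <!-- each master replicates its journal to its dedicated slave -->
  <ha-policy>
    <replication>
      <master>
        <check-for-live-server>true</check-for-live-server>
      </master>
    </replication>
  </ha-policy>

  <!-- connect the three masters so they form one cluster -->
  <cluster-connections>
    <cluster-connection name="my-cluster">
      <connector-ref>netty-connector</connector-ref>
      <message-load-balancing>ON_DEMAND</message-load-balancing>
      <max-hops>1</max-hops>
      <static-connectors>
        <connector-ref>master-2-connector</connector-ref>
        <connector-ref>master-3-connector</connector-ref>
      </static-connectors>
    </cluster-connection>
  </cluster-connections>

  <!-- allow messages to move to nodes that actually have consumers -->
  <address-settings>
    <address-setting match="#">
      <redistribution-delay>0</redistribution-delay>
    </address-setting>
  </address-settings>
</core>
{code}

The {{redistribution-delay}} setting is what addresses the message-redistribution concern mentioned above.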
> Huge performance decrease between versions 2.2.0 and 2.13.0
> -----------------------------------------------------------
>
> Key: ARTEMIS-2852
> URL: https://issues.apache.org/jira/browse/ARTEMIS-2852
> Project: ActiveMQ Artemis
> Issue Type: Bug
> Reporter: Kasper Kondzielski
> Priority: Major
> Attachments: Selection_433.png, Selection_434.png, Selection_440.png,
> Selection_441.png, Selection_451.png
>
>
> Hi,
> Recently, we started to prepare a new revision of our blog post in which we
> test various implementations of replicated queues. The previous version can
> be found here: [https://softwaremill.com/mqperf/]
> We updated the Artemis binary to 2.13.0, regenerated the configuration file,
> and applied all the performance tricks you told us about last time. In
> particular, these were:
> * the {{Xmx}} java parameter bumped to {{16G}} (now bumped to 48G)
> * in {{broker.xml}}, the {{global-max-size}} setting changed to {{8G}} (this
> one we forgot to set, but we suspect that it is not the issue)
> * {{journal-type}} set to {{MAPPED}}
> * {{journal-datasync}}, {{journal-sync-non-transactional}} and
> {{journal-sync-transactional}} all set to {{false}}
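> For clarity, combined these settings look roughly like this in
> {{broker.xml}} (a sketch of the relevant fragment only, not our full config):
> {code:xml}
> <core xmlns="urn:activemq:core">
>   <!-- cap on total memory used for messages before paging -->
>   <global-max-size>8G</global-max-size>
>   <!-- memory-mapped journal instead of NIO/ASYNCIO -->
>   <journal-type>MAPPED</journal-type>
>   <!-- disable syncing the journal to disk -->
>   <journal-datasync>false</journal-datasync>
>   <journal-sync-non-transactional>false</journal-sync-non-transactional>
>   <journal-sync-transactional>false</journal-sync-transactional>
> </core>
> {code}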
> Apart from that, we changed the machine type we use to r5.2xlarge (8 cores,
> 64 GiB memory, network bandwidth up to 10 Gbps, storage bandwidth up to
> 4,750 Mbps), and we decided to always run twice as many receivers as senders.
> From our tests it looks like version 2.13.0 does not scale as well, with the
> increase of senders and receivers, as version 2.2.0 (previously tested).
> Basically, it is not scaling at all, as the throughput stays at almost the
> same level, while previously it used to grow linearly.
> Here you can find our tests results for both versions:
> [https://docs.google.com/spreadsheets/d/1kr9fzSNLD8bOhMkP7K_4axBQiKel1aJtpxsBCOy9ugU/edit?usp=sharing]
> We are aware that there is now a dedicated page in the documentation about
> performance tuning, but we are surprised that the same settings as before
> perform much worse.
> Maybe there is an obvious property we overlooked which should be turned
> on?
> All changes between those versions together with the final configuration can
> be found on this merged PR:
> [https://github.com/softwaremill/mqperf/commit/6bfae489e11a250dc9e6ef59719782f839e8874a]
>
> Charts showing machine usage are in the attachments. Memory consumed by the
> Artemis process didn't exceed ~16 GB. Bandwidth and CPU weren't bottlenecks
> either.
> p.s. I wanted to ask this question on the mailing list/Nabble forum first,
> but it seems that I don't have permission to do so even though I registered
> & subscribed. Is that intentional?
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)