You're right that the network might be congested (noisy neighbors), but I
was asking if you were saturating your host's NIC with your own traffic.
Both could be a problem, but the latter is possible to investigate without
support from Amazon's technical support team.

Tim

On Thu, Aug 29, 2019, 11:13 AM James Green <james.mk.gr...@gmail.com> wrote:

> Tim,
>
> The NIC issue is a potential issue. We're trying to deploy via Fargate
> where possible to shift the operational burden away from our developer
> staff, so noisy neighbours is entirely possible, as is cross-AZ latency.
>
> Touch wood, I seem to be in a place where throughput is at least as good as
> existing production. I've yet to "liven" all the potential traffic patterns
> to simulate the more complex loads but it may be good enough for now.
>
>
> On Thu, 29 Aug 2019 at 13:40, Tim Bain <tb...@alumni.duke.edu> wrote:
>
> > Might the choke point be the NIC on the EC2 instance? If you run the
> > consumers for A and B on different EC2s, how does that throughput compare
> > to what you're seeing?
> >
> > Also, I'd recommend you use JVisualVM or similar to capture a CPU
> sampling
> > (not profiling!) snapshot of your producer program to see where it's
> > spending its time. If there's a significant amount of time spent anywhere
> > except making the network call to send the bytes of the payload, then dig
> > into that.
> >
> > Tim
> >
> > On Wed, Aug 28, 2019, 12:03 PM alan protasio <alanp...@gmail.com> wrote:
> >
> > > Hi,
> > >
> > > I think you can try disable concurrentStoreAndDispatchQueues and rerun
> > the
> > > tests.
> > >
> > > Alan Diego
> > >
> > >
> > > On Wed, Aug 28, 2019 at 9:42 AM James Green <james.mk.gr...@gmail.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > Following-up as I've run more tests.
> > > >
> > > > My minimal producer suffered the same bug as our main application: we
> > had
> > > > spring boot activemq thread pooling turned on as a property, but the
> > > > library (referenced in the main docs) was not included. Looking back
> at
> > > my
> > > > rather sparse notes at the time I activated this my sends of 10K
> > messages
> > > > went from taking 11m47s to between 2m56s - 3m31s which is a marked
> > > > improvement.
> > > >
> > > > To my chagrin, this has made little different to our real-world
> > > > application, and so I have modified my minimal producer to be capable
> > of
> > > > sending to the main application via it's queues.
> > > >
> > > > Allow me to elaborate at this point as it's important to understand
> > what
> > > > I'm looking at...
> > > >
> > > > The messages follow a small path through a series of queues as they
> are
> > > > processed. Queue A -> B -> C.
> > > >
> > > > If my minimal producer sends to Queue C (skipping A and B) I'm able
> to
> > > > produce at 49/s which is "quick enough".
> > > > If my minimal producer sends to Queue B (skipping A) I'm able to
> > produce
> > > at
> > > > 28/s - 38/s which is variable but most of the tests reached 38/s.
> > > > If my minimal producer sends to Queue A I'm able to produce at 28/s -
> > > 42/s
> > > > - again variable.
> > > >
> > > > Now Queues A and B are consumed by separate Camel routes inside the
> > same
> > > > application. Queue C is entirely separate.
> > > >
> > > > Looking at throughput graphs of the consumption of Queue C, when
> first
> > > > going through (A,B) for 10K messages, then going through (B), I can
> see
> > > > (A,B) is twice as slow.
> > > >
> > > > I'm left wondering if there's contention somehow within the
> application
> > > > consuming from (A,B) that is only showing up during load testing on
> > AWS,
> > > I
> > > > was not expecting it would be 2x slower unless the producer thread is
> > > > shared - you might imagine a thread pool was solve that!
> > > >
> > > > At this point I have ensured that there are 4 instances of each
> > > application
> > > > and they can happily deal with about 50 messages per second across
> the
> > > > queues with persistence on. I am uncertain whether I should be
> > expecting
> > > > more.
> > > >
> > > > If anyone has insights on why the two routes within the same
> > application
> > > > appear contended and indeed on whether overall throughput should be a
> > lot
> > > > higher I'd love to hear it.
> > > >
> > > > James
> > > >
> > > >
> > > > On Thu, 22 Aug 2019 at 14:02, Tim Bain <tb...@alumni.duke.edu>
> wrote:
> > > >
> > > > > Can you create a minimal producer via the OpenWire protocol in Java
> > or
> > > > > another language of your choice, to determine if your Camel
> producer
> > is
> > > > > slow because it's OpenWire or because it's Camel? I suspect you'll
> > find
> > > > > that OpenWire is the culprit, not Camel, but let's confirm that.
> > > > >
> > > > > All of these numbers sound tiny compared to what the ActiveMQ
> product
> > > is
> > > > > capable of (though I don't have any insight into how Amazon has
> > > > configured
> > > > > the brokers, nor into any code customizations they might have
> made).
> > If
> > > > you
> > > > > run multiple minimal producers in parallel, does throughput
> increase
> > > > > linearly?
> > > > >
> > > > > Also, you say you're testing with small payloads; are they small
> > enough
> > > > > that you might be running into the Nagle algorithm on your TCP
> > sockets?
> > > > If
> > > > > you use larger (e.g. 1KB) payloads, what does that do to your
> > > throughput
> > > > on
> > > > > a single producer?
> > > > >
> > > > > Tim
> > > > >
> > > > > On Thu, Aug 22, 2019, 2:54 AM James Green <
> james.mk.gr...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I've been busy shifting an existing workload into AWS recently,
> > and a
> > > > > load
> > > > > > test shows a serious performance drop when sending to ActiveMQ
> > which
> > > I
> > > > > > could use some advice on.
> > > > > >
> > > > > > Quick architecture summary: We send requests via a webserver that
> > are
> > > > > > forwarded as messages to a queue. A backend receives these
> messages
> > > and
> > > > > > forwards them onward to another queue. Spring Boot with Camel
> > powers
> > > > the
> > > > > > show within Docker containers. Messages are persistent.
> > > > > >
> > > > > > Story so far:
> > > > > >
> > > > > > Tests show this first queue builds rapidly with pending messages
> > yet
> > > > > > monitoring of our existing production environment shows no such
> > > > backlog.
> > > > > >
> > > > > > Our existing production environment has everything in a single DC
> > so
> > > > it's
> > > > > > super low latency. Our AWS environment uses Fargate with
> AmazonMQ.
> > I
> > > > > > understand send latency will be higher and AmazonMQ will store
> the
> > > > > messages
> > > > > > across three AZs.
> > > > > >
> > > > > > So I launched a small EC2 instance to run some comparison tests:
> > > > > >
> > > > > > Receiving via a Camel route is super quick. This is not a
> problem.
> > > > > > Sending via a minimal Camel route is super slow. 14 messages per
> > > > second.
> > > > > We
> > > > > > appear to be doing at least 20-30 per second in production but
> it's
> > > > > enough
> > > > > > of a difference.
> > > > > > Sending via PHP with stomp-php setting both persistence on and
> > > receipt
> > > > > > headers on is substantially faster than sending via Camel. 55
> > > messages
> > > > > per
> > > > > > second.
> > > > > > Tests have been with 10K small payloads.
> > > > > >
> > > > > > At this point I'm thinking that both Camel and PHP should be
> > sending
> > > > with
> > > > > > the same properties - synchronously and with persistence. The
> > > messages
> > > > on
> > > > > > the queue are flagged persistent when viewed by the web console.
> > > > > >
> > > > > > Can anyone provide further suggestions to try?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > James
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to