Hi folks,

This is the discussion about "ARTEMIS-2399 Fix performance degradation when
there are a lot of subscribers".

First, apologies that I didn't clarify our thoughts earlier.

As noted in the Environment section, page-max-cache-size is set to 1,
meaning at most one page is allowed in the softValueCache. We also tested
with the default page-max-cache-size of 5; it takes some time to see the
performance degradation because at the start the cursor positions of the
100 subscribers are close together, so all message reads hit the
softValueCache. After a while the cursor positions diverge. Once these
positions are spread over more than 5 pages, some pages are read back and
forth. This can be verified with the trace log "adding pageCache
pageNr=xxx into cursor = test-topic" in PageCursorProviderImpl, where some
pages are read many times for the same subscriber. From that point on, the
performance starts to degrade. So we set page-max-cache-size to 1 here
just to make the test run faster; it doesn't change the final result.
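
For context, the test setting can be applied roughly like this with the
embedded-broker API (a minimal sketch; we assume
AddressSettings#setPageCacheMaxSize, the programmatic counterpart of
page-max-cache-size, and the 50MB sizes mirror the values mentioned in
this thread):

    import org.apache.activemq.artemis.core.server.ActiveMQServer;
    import org.apache.activemq.artemis.core.settings.impl.AddressSettings;

    class TestPageSettings {
       // Force at most one page into the softValueCache so the
       // degradation shows up quickly.
       static void pinPageCache(ActiveMQServer server) {
          AddressSettings settings = new AddressSettings()
                .setMaxSizeBytes(52428800)    // maxSize from the depage log (50MB)
                .setPageSizeBytes(52428800)   // one page = 50MB in our test
                .setPageCacheMaxSize(1);      // page-max-cache-size = 1
          server.getAddressSettingsRepository().addMatch("test-topic", settings);
       }
    }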

Entries in the softValueCache are evicted when memory is really low or
when the map size reaches its capacity (default 5). In most cases the
subscribers are tailing reads, which are served by the softValueCache (no
need to touch the disk), so we want to keep it. But when some subscribers
fall behind, they need to read pages that are no longer in the
softValueCache. After looking at the code, we found that in most
situations one depage round follows at most MAX_SCHEDULED_RUNNERS deliver
rounds, which is to say at most MAX_DELIVERIES_IN_LOOP *
MAX_SCHEDULED_RUNNERS messages are depaged next. If you set the QueueImpl
logger to debug level, you will see logs like "Queue Memory Size after
depage on queue=sub4 is 53478769 with maxSize = 52428800. Depaged 68
messages, pendingDelivery=1002, intermediateMessageReferences=23162,
queueDelivering=0". So in order to depage fewer than 2000 messages, each
subscriber has to read a whole page, which is unnecessary and wasteful. In
our test, where one page (50MB) contains ~40000 messages, a subscriber
whose page has been evicted from the softValueCache may read the same page
40000/2000 = 20 times to finish delivering it. This drastically slows down
the process and puts a heavy load on the disk. So we added the
PageIndexCacheImpl and read one message at a time rather than reading all
messages of the page. This way, each subscriber reads each page only once
to finish delivering it.
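
To make the idea concrete, here is a minimal sketch of such a per-page
index (class and member names are hypothetical, not the actual PR code,
and the [size][body] record layout is assumed for illustration):

    import java.io.IOException;
    import java.io.RandomAccessFile;

    // Per-page index: message number -> physical offset in the page
    // file. It is filled during the same scan that builds the
    // pageCursorInfo, then lets us read one message at a time instead
    // of depaging the whole 50MB page again.
    class PageIndexCacheSketch {
       private final long[] offsets;  // offsets[i] = position of message i

       PageIndexCacheSketch(long[] offsets) {
          this.offsets = offsets;
       }

       byte[] readMessage(RandomAccessFile pageFile, int messageNr)
             throws IOException {
          pageFile.seek(offsets[messageNr]);  // jump straight to the message
          int size = pageFile.readInt();      // assumed record header
          byte[] body = new byte[size];
          pageFile.readFully(body);
          return body;                        // only this message hits the disk
       }
    }

The index itself is cheap: one long per message, i.e. ~320KB for a
40000-message page.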

Having said that, the softValueCache is still used for tailing reads. If a
page is evicted from it, it is not reloaded, to prevent the issue
illustrated above; the pageIndexCache is used instead.
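
In pseudo-Java, the read path would look roughly like this (a sketch of
the policy only; the type and method names are hypothetical):

    import java.util.HashMap;
    import java.util.Map;

    interface PageCache { byte[] getMessage(int messageNr); }
    interface PageIndexCache { byte[] readSingleMessage(long pageNr, int messageNr); }

    class ReadPathSketch {
       private final Map<Long, PageCache> softValueCache = new HashMap<>();
       private PageIndexCache pageIndexCache;

       byte[] readMessage(long pageNr, int messageNr) {
          PageCache cache = softValueCache.get(pageNr);
          if (cache != null) {
             return cache.getMessage(messageNr);  // tailing read: cache hit
          }
          // Evicted page: do NOT reload it into the softValueCache (that
          // caused the back-and-forth reads); use the per-page index to
          // read just this one message from disk.
          return pageIndexCache.readSingleMessage(pageNr, messageNr);
       }
    }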

Regarding implementation details, we noted that before delivering a page,
a pageCursorInfo is constructed, which requires reading the whole page. We
can take this opportunity to construct the pageIndexCache; it is very
simple to code. We also thought about building an offset index file, but
some concerns stemmed from the following:

   1. When to write and sync the index file? Would it have performance
   implications?
   2. With an index file we could construct the pageCursorInfo from it (no
   need to read the whole page as before), but we would need to write the
   total message count into it first, and it seems a little weird to put
   that into an index file.
   3. After a hard crash, a recovery mechanism would be needed to recover
   the page and page index files, e.g. truncating them to the valid size.
   So how do we know which files need to be sanity checked?
   4. A variant of binary search may be needed (a sketch follows this
   list); see
   https://github.com/apache/kafka/blob/70ddd8af71938b4f5f6d1bb3df6243ef13359bcf/core/src/main/scala/kafka/log/AbstractIndex.scala
   5. Unlike Kafka, where the consumer fetches many messages at once and
   the broker only needs to look up the start offset in the index file
   once per fetch, Artemis delivers messages one by one, which means we
   would have to look up the index every time we deliver a message.
   Although the index file would likely be in the OS page cache, there is
   still a chance of a cache miss.
   6. Compatibility with old files.
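
For point 4, the lookup would be a floor-style binary search, similar in
spirit to Kafka's AbstractIndex: find the entry with the largest message
number <= the target. A minimal sketch (hypothetical names):

    class IndexLookupSketch {
       // Returns the position of the largest entry <= target,
       // or -1 if every entry is greater than target.
       static int floorIndex(long[] messageNrs, long target) {
          int lo = 0, hi = messageNrs.length - 1, result = -1;
          while (lo <= hi) {
             int mid = (lo + hi) >>> 1;  // overflow-safe midpoint
             if (messageNrs[mid] <= target) {
                result = mid;            // candidate; keep searching right
                lo = mid + 1;
             } else {
                hi = mid - 1;
             }
          }
          return result;
       }
    }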

To sum up, Kafka uses an mmapped index file and we use an index cache.
Both are designed to find the physical file position from an offset
(Kafka) or a message number (Artemis). We prefer the index cache because
it is easier to understand and maintain.

We also tested the single-subscriber case with the same setup.
The original:
consumer tps (11000 msg/s) and latency:
[image: orig_single_subscriber.png]
producer tps (30000 msg/s) and latency:
[image: orig_single_producer.png]
The PR:
consumer tps (14000 msg/s) and latency:
[image: pr_single_consumer.png]
producer tps (30000 msg/s) and latency:
[image: pr_single_producer.png]
The results are similar, and even a little better, in the
single-subscriber case.

We used our internal test platform; I think JMeter could also be used to
test this.
