Hi

First of all, I think this is an excellent effort, and could be a potentially
massive positive change.

Before making any change at such scale, I do think we need to ensure we
have sufficient benchmarks covering a number of scenarios, not just one use
case, and the benchmark tool used needs to be openly available so that
others can verify the measurements and check them on their own setups.

Some additional scenarios I would want/need covered are:

1. PageCache set to 5, with all consumers keeping up but lagging enough to
be reading from the same first page cache; latency and throughput need to
be measured for all.
2. PageCache set to 5, with all consumers but one keeping up and lagging
enough to be reading from the same first page cache, while the remaining
one falls off the end, causing page cache swapping; measure latency and
throughput of those keeping up in the first page cache, disregarding the
straggler.

Regarding the solution, some alternative approaches to discuss:

In your scenario, if I understand correctly, each subscriber effectively
has its own queue (1-to-1 mapping), not shared.
You mention Kafka and say multiple consumers don't read serially on the
address, and this is true, but per-queue processing through messages
(dispatch) is still serial even with multiple consumers sharing a queue.

What about keeping the existing mechanism but having a queue hold a
reference to the page cache that the queue is currently on, kept from GC
(i.e. not a soft reference)? That would mean the page cache isn't being
swapped around when you have queues (in your case subscribers) moving
between page caches back and forth, avoiding the constant re-read issue.
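
A minimal sketch of what I mean (the class and method names here are
hypothetical, not the actual Artemis API):

    // The queue pins the page cache it is currently consuming from with a
    // strong reference, so the provider's soft-value cache cannot have it
    // collected (and re-read from disk) while the queue is still on that page.
    class SubscriberQueue {
        private final PageCursorProvider provider; // hypothetical lookup API
        private PageCache currentPage;             // strong ref = not GC-able

        synchronized PagedMessage readMessage(long pageId, int messageNr) {
            if (currentPage == null || currentPage.getPageId() != pageId) {
                currentPage = provider.getPageCache(pageId); // re-pin on page change
            }
            return currentPage.getMessage(messageNr);
        }
    }

The queue only swaps its pinned cache when it actually moves to another
page, so a tailing subscriber never forces a disk re-read of its current page.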

Also, I think Franz had an excellent idea: do away with the page cache in
its current form entirely, ensure the offset is kept with the message
reference, and rely on OS caching to keep hot blocks/data in memory.
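
Purely as an illustration of that idea (the types here are made up):

    // No broker-side page cache at all: each paged message reference keeps
    // the file offset of its bytes, and delivery reads them straight from
    // the page file; the OS page cache keeps the hot blocks in memory.
    class PagedReference {
        final long pageId;      // which page file
        final long fileOffset;  // kept with the reference
        final int encodedSize;  // bytes to read

        PagedReference(long pageId, long fileOffset, int encodedSize) {
            this.pageId = pageId;
            this.fileOffset = fileOffset;
            this.encodedSize = encodedSize;
        }

        byte[] read(java.nio.channels.FileChannel pageFile) throws java.io.IOException {
            java.nio.ByteBuffer buf = java.nio.ByteBuffer.allocate(encodedSize);
            pageFile.read(buf, fileOffset); // positional read; hot blocks come from OS cache
            return buf.array();
        }
    }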

Best
Michael



On Thu, 27 Jun 2019 at 05:13, yw yw <[email protected]> wrote:

> Hi, folks
>
> This is the discussion about "ARTEMIS-2399 Fix performance degradation
> when there are a lot of subscribers".
>
> First, apologies that I didn't clarify our thoughts.
>
> As noted in the Environment section, page-max-cache-size is set to 1,
> meaning at most one page is allowed in the softValueCache. We have tested
> with the default page-max-cache-size, which is 5; it takes some time to
> see the performance degradation since, at the start, the cursor positions
> of the 100 subscribers are similar and all the message reads hit the
> softValueCache. But after some time, the cursor positions diverge. When
> these positions span more than 5 pages, some pages get read back and
> forth. This can be seen in the trace log "adding pageCache
> pageNr=xxx into cursor = test-topic" in PageCursorProviderImpl, where some
> pages are read many times for the same subscriber. From that point on,
> the performance starts to degrade. So we set page-max-cache-size to 1
> here just to make the test run faster; it doesn't change the
> final result.
>
> The softValueCache entries would be evicted if memory gets really low, and
> additionally when the map size reaches capacity (default 5). In most cases,
> the subscribers are doing tailing reads, which are served by the
> softValueCache (no need to touch disk), so we need to keep it. But when
> some subscribers fall behind, they need to read pages not in the
> softValueCache. After looking through the code, we found that one depage
> round follows at most MAX_SCHEDULED_RUNNERS deliver rounds in
> most situations, which is to say at most MAX_DELIVERIES_IN_LOOP *
> MAX_SCHEDULED_RUNNERS messages would be depaged next. If you
> set the QueueImpl logger to debug level, you will see logs like "Queue
> Memory Size after depage on queue=sub4 is 53478769 with maxSize = 52428800.
> Depaged 68 messages, pendingDelivery=1002, intermediateMessageReferences=
> 23162, queueDelivering=0". In order to depage fewer than 2000 messages,
> each subscriber has to read a whole page, which is unnecessary and wasteful.
> In our test, where one page (50MB) contains ~40000 messages, one subscriber
> may read the page 40000/2000 = 20 times, if the softValueCache entry has
> been evicted, to finish delivering it. This drastically slows down the
> process and burdens the disk. So we added the PageIndexCacheImpl and read
> one message at a time rather than reading all messages of the page. In this
> way, each page is read only once per subscriber to finish delivering it.
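>
> A rough sketch of the idea (illustrative names, not the exact PR code):
>
>     // The first full read of a page records each message's file offset,
>     // so a lagging subscriber can later fetch a single message per
>     // delivery instead of re-reading the whole 50MB page.
>     class PageIndexCache {
>         private final long[] offsets; // offsets[i] = file position of message i
>
>         PageIndexCache(long[] offsets) {
>             this.offsets = offsets; // built while constructing the pageCursorInfo
>         }
>
>         byte[] readMessage(java.nio.channels.FileChannel pageFile, int messageNr,
>                            int encodedSize) throws java.io.IOException {
>             java.nio.ByteBuffer buf = java.nio.ByteBuffer.allocate(encodedSize);
>             pageFile.read(buf, offsets[messageNr]); // positional single-message read
>             return buf.array();
>         }
>     }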
>
> Having said that, the softValueCache is used for tailing reads. If an
> entry is evicted, it won't be reloaded, to prevent the issue illustrated
> above; instead the pageIndexCache would be used.
>
> Regarding implementation details, we noted that before delivering a page,
> a pageCursorInfo is constructed, which needs to read the whole page. We can
> take this opportunity to construct the pageIndexCache; it's very simple to
> code. We also thought about building an offset index file, but some
> concerns stemmed from the following:
>
>    1. When to write and sync the index file? Would it have some
>    performance implications?
>    2. If we have an index file, we can construct the pageCursorInfo through
>    it (no need to read the page as before), but we need to write the total
>    message count into it first. It seems a little weird to put that into
>    the index file.
>    3. After a hard crash, a recovery mechanism would be needed to recover
>    the page and page index files, e.g. truncating to the valid size. So
>    how do we know which files need to be sanity checked?
>    4. A variant of the binary search algorithm may be needed (see the
>    sketch after this list), as in
> https://github.com/apache/kafka/blob/70ddd8af71938b4f5f6d1bb3df6243ef13359bcf/core/src/main/scala/kafka/log/AbstractIndex.scala
>    5. Unlike Kafka, where the user fetches lots of messages at once and
>    the broker only needs to look up the start offset in the index file
>    once, Artemis delivers messages one by one, which means we have to look
>    up the index every time we deliver a message. Although the index file
>    is likely in the OS page cache, there is still a chance of missing the
>    cache.
>    6. Compatibility with old files.
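>
> Regarding point 4, the variant is essentially a floor lookup. Roughly (a
> sketch only, not tested against real index files):
>
>     // Find the largest index slot whose key is <= the target (a floor
>     // search), as Kafka's AbstractIndex does, rather than an exact match.
>     static int floorSearch(long[] keys, long target) {
>         if (keys.length == 0 || keys[0] > target) {
>             return -1; // target precedes every indexed entry
>         }
>         int lo = 0, hi = keys.length - 1;
>         while (lo < hi) {
>             int mid = lo + (hi - lo + 1) / 2; // bias up so lo converges on the floor
>             if (keys[mid] <= target) {
>                 lo = mid;
>             } else {
>                 hi = mid - 1;
>             }
>         }
>         return lo;
>     }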
>
> To sum up, Kafka uses an mmapped index file and we use an index cache.
> Both are designed to find the physical file position from an offset
> (Kafka) or a message number (Artemis). We prefer the index cache because
> it's easier to understand and maintain.
>
> We also tested the one subscriber case with the same setup.
> The original:
> consumer tps (11000 msg/s) and latency:
> [image: orig_single_subscriber.png]
> producer tps (30000 msg/s) and latency:
> [image: orig_single_producer.png]
> The PR:
> consumer tps (14000 msg/s) and latency:
> [image: pr_single_consumer.png]
> producer tps (30000 msg/s) and latency:
> [image: pr_single_producer.png]
> It showed the result is similar, and even a little better, in the case of
> a single subscriber.
>
> We used our internal test platform, and I think JMeter can also be used
> to test against it.
>
