Re: [akka-user] [akka-persistence] throughput and benchmarks

Patrik Nordwall Mon, 17 Mar 2014 01:07:13 -0700

On Mon, Mar 17, 2014 at 8:38 AM, Carsten Saathoff
<[email protected]> wrote:


> Hi Patrik,
>
> On Sunday, 16 March 2014 20:47:24 UTC+1, Patrik Nordwall wrote:
>
>> On Sat, Mar 15, 2014 at 6:17 PM, Carsten Saathoff <
>> [email protected]> wrote:
>>
>>> Hi,
>>>
>>> I am currently evaluating akka-persistence, specifically with respect to
>>> its throughput. I want to use it in a system consisting of approximately
>>> 1000000 actors, each representing an aggregate root, distributed over a
>>> sharded cluster of actor systems. Each of these actors should become a
>>> processor (or probably an eventsourced processor). So I am interested in
>>> how many messages a single actor system can persist per second. In a
>>> typical scenario each actor will receive a single message, but I want the
>>> time the system takes to persist all messages to be as short as possible.
>>>
>>> I wrote a simple test case without any sharding, but having a similar
>>> internal structure:
>>> https://github.com/kodemaniak/akka-persistence-throughput-test
>>>
>>> A sender sends messages to each ID in the system. All messages pass
>>> through a region actor which routes the messages according to the id
>>> contained therein and which creates a child actor per ID on demand. I am
>>> able to persist 2500-3000 msgs per second on my MacBook Pro (Mid 2010) with
>>> an SSD once the actors have been recovered. During recovery it is around
>>> 1000 msgs/second.
>>>
>>
>> What journal are you using? The bottleneck will be in the IO to the data
>> store. LevelDB is not really an option in a clustered system.
>>
>
> In the small test above I am using LevelDB. In the real system we are
> going to use an HBase-backed journal (very probably, at least).
>
> The main reason I ended up like that was that I wanted to measure the
> impact on performance when making certain actors persistent. However, after
> having obtained the first numbers, I wasn't sure how to explain them and
> that's why I am actually asking here ;)
>
>
>>> When I replace the region actor with a single receiver that receives and
>>> persists all messages, the throughput increases by an order of magnitude,
>>> i.e., >20k msgs/s once the actor is initialized.
>>>
>>> My assumption would have been that throughput is independent of the
>>> number of actors persisting messages. And in any case, I would not have
>>> expected an order of magnitude difference. Additionally, both numbers seem
>>> to be lower than what I've read before about the performance (50k msgs/s
>>> IIRC, though that's obviously hardware dependent), although the numbers
>>> with a single processor are very close.
>>>
>>
>> When using a single processor there is a huge difference between a
>> command sourced Processor and an EventsourcedProcessor. The reason is that
>> the command sourced Processor can take advantage of a dynamic batching
>> optimization which will reduce the number of roundtrips and fsyncs for
>> LevelDB.
>>
>
> In my tests I am using a command sourced processor. I already read before
> that event sourcing will further impact the performance.
>
>
>>
>> On my MacBook Pro: 2.3 GHz Intel Core i7, SSD
>> 110201.58 persistent commands per second
>> 10204.87 persistent events per second
>>
>
> How were these numbers obtained? If I can achieve those numbers with a
> lot of Processors in the system, I am more than happy ;)
>

With the PerformanceSpec. Run it with:

sbt -Dakka.persistence.performance.cycles.load=200000 \
    -Dakka.persistence.performance.cycles.warmup=10000 \
    "project akka-persistence-experimental" \
    "test-only akka.persistence.PerformanceSpec"



>
>
>> When I fire up 100 EventsourcedProcessors, I see around 117 events per
>> second in each.
>>
>>
>>>
>>> Am I doing anything wrong or are the numbers as expected? Is it a bad
>>> idea to have many processors in a system?
>>>
>>
>> I don't think it's a bad idea.
>>
>>
>>>  Are there any official benchmarks available, maybe with code?
>>>
>>
>> No, we have not benchmarked much. There is only
>> akka.persistence.PerformanceSpec
>> <https://github.com/akka/akka/blob/v2.3.0/akka-persistence/src/test/scala/akka/persistence/PerformanceSpec.scala>.
>>
>
> Will have a look at it, thanks!
>
>
>>
>> I think the journal implementation, the used data store, and
>> serialization will be the biggest factors.
>>
>
> Yeah, I thought so. However, as I wrote above, my main concern right now
> is the large difference in numbers between a single Processor and many.
>

I think that is because of the batching, which can be utilised fully by a
single command sourced Processor. Using one Processor holding data for
thousands of entities will introduce other scalability problems compared to
many small entities, which can more easily be sharded, passivated, and so on.
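
As a quick sanity check, the figures quoted in this thread can be put side by
side. This is only a small Python sketch of the arithmetic (the variable names
are made up for illustration); it uses nothing but the numbers given above:
110201.58 commands/s and 10204.87 events/s for a single processor, and about
117 events/s in each of 100 EventsourcedProcessors.

```python
# Figures quoted in this thread (single processor, MacBook Pro with SSD).
single_command_sourced = 110201.58   # persistent commands per second
single_event_sourced = 10204.87      # persistent events per second

# 100 EventsourcedProcessors at roughly 117 events/s each.
many_event_sourced_total = 100 * 117

# How much faster command sourcing was than event sourcing (single processor).
command_vs_event = single_command_sourced / single_event_sourced

# Aggregate of the 100 processors relative to one EventsourcedProcessor.
many_vs_single = many_event_sourced_total / single_event_sourced

print(round(command_vs_event, 1))   # 10.8
print(round(many_vs_single, 2))     # 1.15
```

So the 100 event sourced processors together land in the same range as a
single one; the large gap is between command sourcing and event sourcing.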

I see no reason why a journal implementation could not batch operations
itself, in which case it should not matter whether you use many or few
processors. Note this line in the hbase journal doc
<https://github.com/ktoso/akka-persistence-hbase/>:
"Even though in the code it looks like it issues one Put at a time, this is
not the case, as writes are buffered and then batch written thanks to
AsyncBase."
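
To make the point about journal-side batching concrete, here is a toy model in
Python. It is not Akka or HBase code: ToyJournal, the two flush functions, and
the batch cap of 200 are all hypothetical, and the model only counts how many
sync round trips would be needed, assuming one round trip per written batch.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

Pending = List[Tuple[int, str]]  # (processor id, payload)


@dataclass
class ToyJournal:
    """Counts simulated sync round trips; a stand-in, not a real journal."""
    max_batch: int      # hypothetical cap on messages per round trip
    syncs: int = 0      # number of simulated round trips / fsyncs
    written: int = 0    # number of messages written

    def write_batch(self, messages: List[str]) -> None:
        # Each chunk of up to max_batch messages costs one round trip.
        for start in range(0, len(messages), self.max_batch):
            chunk = messages[start:start + self.max_batch]
            self.syncs += 1
            self.written += len(chunk)


def flush_per_processor(pending: Pending, max_batch: int) -> ToyJournal:
    """Batching only within a processor: each processor flushes on its own."""
    journal = ToyJournal(max_batch)
    by_processor: Dict[int, List[str]] = {}
    for pid, payload in pending:
        by_processor.setdefault(pid, []).append(payload)
    for messages in by_processor.values():
        journal.write_batch(messages)
    return journal


def flush_across_processors(pending: Pending, max_batch: int) -> ToyJournal:
    """Batching inside the journal: pending writes of all processors share batches."""
    journal = ToyJournal(max_batch)
    journal.write_batch([payload for _, payload in pending])
    return journal


# 1000 messages sent to one processor vs. one message to each of 1000 processors.
one_processor = [(0, "msg") for _ in range(1000)]
many_processors = [(pid, "msg") for pid in range(1000)]

print(flush_per_processor(one_processor, 200).syncs)        # 5
print(flush_per_processor(many_processors, 200).syncs)      # 1000
print(flush_across_processors(many_processors, 200).syncs)  # 5
```

In this model the many-processor case only pays one round trip per message
when batching stops at the processor boundary; once the journal batches across
processors, the count matches the single-processor case.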

/Patrik


>
> Thanks again
>
> Carsten
>
> --
> >>>>>>>>>> Read the docs: http://akka.io/docs/
> >>>>>>>>>> Check the FAQ:
> http://doc.akka.io/docs/akka/current/additional/faq.html
> >>>>>>>>>> Search the archives: https://groups.google.com/group/akka-user
> ---
> You received this message because you are subscribed to the Google Groups
> "Akka User List" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/akka-user.
> For more options, visit https://groups.google.com/d/optout.
>



-- 

Patrik Nordwall
Typesafe <http://typesafe.com/> -  Reactive apps on the JVM
Twitter: @patriknw

