Despite the excellent and timely blog post from Colin Breck

http://blog.colinbreck.com/maximizing-throughput-for-akka-streams/

we are having a devil of a time optimizing throughput in a stream that
does the following (rough sketch after the list):

1) consume messages containing a channel UUID from Kafka
The topic is partitioned based on channel UUID and we are using
committablePartitionedSource
2) lookup a legacy channel id from db
3) do a write (batched of course) to the same db
4) commit offsets
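
For reference, here is a rough sketch of the shape of the stream. The types,
settings and topic name are simplified, and parse / lookupLegacyId /
writeBatch are placeholders standing in for our real parsing and db calls,
not actual code:

import akka.actor.ActorSystem
import akka.kafka.scaladsl.{Committer, Consumer}
import akka.kafka.{CommitterSettings, ConsumerSettings, Subscriptions}
import akka.stream.scaladsl.Sink
import org.apache.kafka.common.serialization.StringDeserializer
import scala.concurrent.Future
import scala.concurrent.duration._

implicit val system: ActorSystem = ActorSystem("channel-writer")
import system.dispatcher

case class ChannelEvent(channelUuid: String, payload: String)

// placeholders for our real parsing and db calls
def parse(value: String): ChannelEvent = ???
def lookupLegacyId(uuid: String): Future[Long] = ???
def writeBatch(rows: Seq[(Long, ChannelEvent)]): Future[Unit] = ???

val consumerSettings =
  ConsumerSettings(system, new StringDeserializer, new StringDeserializer)
    .withBootstrapServers("localhost:9092")
    .withGroupId("channel-writer")

// 1) one sub-stream per partition
Consumer
  .committablePartitionedSource(consumerSettings, Subscriptions.topics("channel-events"))
  .mapAsyncUnordered(8) { case (_, partitionSource) =>
    partitionSource
      .mapAsync(1) { msg =>                        // 2) legacy channel id lookup
        val event = parse(msg.record.value)
        lookupLegacyId(event.channelUuid).map(id => (msg, id, event))
      }
      .groupedWithin(100, 1.second)                // 3) batched write
      .mapAsync(1) { batch =>
        writeBatch(batch.map { case (_, id, ev) => (id, ev) })
          .map(_ => batch.map(_._1.committableOffset))
      }
      .mapConcat(_.toList)
      .runWith(Committer.sink(CommitterSettings(system)))   // 4) commit offsets
  }
  .runWith(Sink.ignore)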

We've tried all sorts of things, like using a Caffeine cache in 2), peeking
into the cache first and routing items to a different flow depending on
whether their UUID is in the cache. But because each partition (and hence the
stream for each partition) carries a mix of channels, the slow channels (the
ones not in the cache) slow down the whole stream. We don't think groupBy is
an option because we need to preserve ordering.
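
For what it's worth, the cache part of that attempt looks roughly like the
following, simplified here to a single lookup function (the real attempt
routed hits and misses to separate flows). It reuses the lookupLegacyId
placeholder and dispatcher from the sketch above:

import com.github.benmanes.caffeine.cache.{Cache, Caffeine}
import scala.concurrent.Future

val legacyIdCache: Cache[String, java.lang.Long] =
  Caffeine.newBuilder()
    .maximumSize(100000)
    .build[String, java.lang.Long]()

def cachedLookup(uuid: String): Future[Long] =
  Option(legacyIdCache.getIfPresent(uuid)) match {
    case Some(id) =>                       // fast path: already cached
      Future.successful(id.longValue)
    case None =>                           // slow path: db lookup, then populate
      lookupLegacyId(uuid).map { id =>
        legacyIdCache.put(uuid, id)
        id
      }
  }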

Our next idea is a custom stage that broadcasts to two outputs: 1) dedicated
to populating the cache, and 2) for the cache lookup and write. That doesn't
solve the merged backpressure problem by itself, so we thought we could add a
large buffer to 2) (sketch below).
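
What we have in mind is roughly the following, using the stock Broadcast plus
a buffer rather than a fully custom stage (warmCache and lookupAndWrite are
placeholders for the two flows described above):

import akka.NotUsed
import akka.stream.{FlowShape, OverflowStrategy}
import akka.stream.scaladsl.{Broadcast, Flow, GraphDSL, Sink}

def cacheWarmingBroadcast[A, B](
    warmCache: Flow[A, B, NotUsed],
    lookupAndWrite: Flow[A, A, NotUsed]): Flow[A, A, NotUsed] =
  Flow.fromGraph(GraphDSL.create() { implicit b =>
    import GraphDSL.Implicits._

    val bcast = b.add(Broadcast[A](2))

    // 1) cache-population branch, terminated locally
    bcast.out(0) ~> warmCache ~> Sink.ignore

    // 2) lookup + write branch, with a large buffer in front of it
    val main = b.add(
      Flow[A].buffer(10000, OverflowStrategy.backpressure).via(lookupAndWrite))
    bcast.out(1) ~> main

    FlowShape(bcast.in, main.out)
  })

The catch, as we understand it, is that Broadcast only emits when all of its
outputs have demand, so the buffer delays the stall rather than removing it.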

Short of writing an entirely separate consumer that just populates the
cache, is there any other way to broadcast and somehow decouple the
backpressure of the two outputs from each other?

Any other ideas welcome too. Thanks.
