I think this sounds good, but I want to be clear that I understand the consumers
and producers involved - is this summary correct?

Controller: 
* consumes "completed-<controllerid>" topic (as usual)
Invoker:
* in case of logs NOT in db: when queue full, publish non-blocking to 
"completed-non-blocking"
* in case of logs in db: when queue full, publish all to "Activations" topic
OverflownActivationRecorderService (new service): 
* in case of logs NOT in db: consumes "completed-*" topic(s) AND 
"completed-non-blocking" topic
* in case of logs in db: consumes "Activations" topic
        
Thanks!
Tyson

On 9/11/19, 4:51 AM, "Chetan Mehrotra" <chetan.mehro...@gmail.com> wrote:

    As part of implementing this feature I came across support for topic
    patterns in Kafka [1] [2]. It seems to allow listening to multiple
    topics by the same consumer or by a group of consumers. So after
    discussing with Sven (thanks Sven!) I came up with the following proposal.
    
    With this I think we can go back to "Option B1 - Activations via
    controller topic" and thus subscribe to "completed-.*" pattern.
    
    This would help by avoiding any extra load on Kafka, as we would consume
    the same activation result messages that are sent to the Controller.
    However, there are a few caveats:
    
    1. Currently we send the activation result via Kafka only for blocking calls
    2. The result that is sent does not contain logs
    
    So we can possibly have support for 2 modes
    
    Option CB1 - Existing topic + new topic for non blocking result
    -------------------
    
    This mode would be used if the setup does not record the logs in the db.
    In this mode we would add support in the Invoker to also send results for
    non-blocking calls to a new "completed-non-blocking" topic and then
    listen on the "completed-.*" pattern.
    
    Option CB2 - New topic + KafkaActivationStore
    ------------------
    This mode can be used if the setup stores logs in the db. Here we would
    have a new KafkaActivationStore which would send the activations to a new
    "activations" topic.
    
    The ActivationPersister service can support both modes, and the cluster
    operator can configure it in the required mode.
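
To illustrate which topics a "completed-.*" subscription would pick up, here is a minimal sketch in Python. It is not OpenWhisk or Kafka client code: the topic names in `available` are hypothetical (controller topics written as "completed-<controllerid>"), and it assumes the pattern must match the whole topic name. Under those assumptions the single pattern already covers the new "completed-non-blocking" topic, while the "activations" topic of Option CB2 needs its own subscription.

```python
import re

# Pattern proposed in the thread for Option CB1.
COMPLETED_PATTERN = re.compile(r"completed-.*")


def topics_for_pattern(pattern, available_topics):
    """Return the topics a pattern subscription would select.

    Assumption: the pattern has to match the full topic name.
    """
    return sorted(t for t in available_topics if pattern.fullmatch(t))


# Hypothetical topic names, for illustration only.
available = [
    "completed-0",
    "completed-1",
    "completed-non-blocking",
    "activations",
    "events",
]

selected = topics_for_pattern(COMPLETED_PATTERN, available)
print(selected)
```

A real consumer would hand the same regular expression to the Kafka client's pattern subscription [2] rather than filter topic names by hand.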
    
    Chetan Mehrotra
    [1] https://doc.akka.io/docs/alpakka-kafka/current/subscription.html#topic-pattern
    [2] https://kafka.apache.org/11/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html#subscribe-java.util.regex.Pattern-org.apache.kafka.clients.consumer.ConsumerRebalanceListener-
    
    On Mon, Jun 24, 2019 at 11:57 PM Chetan Mehrotra
    <chetan.mehro...@gmail.com> wrote:
    >
    > > For B1, we can scale out the service as controllers are scaled out, but it
    > > would be much more complex to manually assign topics.
    >
    > Yes, that's what my concern was with B1. So for now I would target the B2
    > approach, where we have a dedicated new topic and then have it consumed
    > by a new service. If it poses a problem down the line then we can go
    > for B1.
    >
    > Chetan Mehrotra
    >
    > On Tue, Jun 25, 2019 at 10:08 AM Dominic Kim <style9...@gmail.com> wrote:
    > >
    > > Let me share a few ideas on them.
    > >
    > > Regarding option B1, I think it can scale out better than option B2.
    > > If I understood correctly, scaling out of the service will be highly
    > > dependent on Kafka.
    > > Since the number of consumers is limited to the number of partitions, the
    > > number of service nodes will also be limited to the number of partitions.
    > >
    > > So in the case of B2, if we create a new topic with a certain number of
    > > partitions, we cannot scale out the service nodes beyond that number.
    > > At some point, we may need to alter the number of partitions, and that is
    > > not easy in Kafka.
    > > (Since the activation processing here is asynchronous, we may be able to
    > > bear some downtime (1~2s) to alter the partitions. Then it would be fine.)
    > >
    > > In the case of B1, there will be many controller topics with their own
    > > partitions.
    > > Since controllers can be scaled out, there will be more topics, and the
    > > activation service can scale out accordingly.
    > > But in this case, we need to manually control the topic assignment.
    > > (Not partition assignment, it will be done by Kafka.)
    > >
    > > Let's say we have 3 controller topics with 2 partitions each.
    > > For HA, it would be great to have at least two nodes.
    > > At first, both nodes will take care of all three topics.
    > > Based on the partition assignment plan in Kafka, both nodes will fetch
    > > activation messages without any duplication.
    > > As controllers are scaled out, two nodes may not be enough to take care of
    > > all topics.
    > > At this point, we need to scale out the service nodes more.
    > > Then we need to do logical partitioning for topics.
    > >
    > > For example, node1 and node2 will take care of topic0 ~ 1, and node3 and
    > > node4 will take care of topic2 ~ 3.
    > > In this way, we can guarantee the minimum HA and scale out the nodes as
    > > well.
    > > Among them, topic partitions will be also assigned by Kafka.
    > >
    > > So in short,
    > > For B1, we can scale out the service as controllers are scaled out, but it
    > > would be much more complex to manually assign topics.
    > > And one node may have more than one Kafka consumer.
    > >
    > > For B2, scaling might be limited unless we have a big enough number of
    > > partitions at topic creation time.
    > > But if we can bear some downtime, this might not be a problem and this
    > > option will be a lot simpler.
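
To make the trade-off above concrete, here is a minimal sketch in Python. It is not code from OpenWhisk or Kafka, and the function, node, and topic names are hypothetical. It models two things from the discussion: within one consumer group a partition is read by at most one consumer, so nodes beyond the partition count do no work (the B2 limit), and a naive contiguous split of controller topics across pairs of nodes (the manual topic assignment B1 would need).

```python
def active_consumers(num_consumers, num_partitions):
    """Consumers in one group that can actually receive messages.

    Each partition is read by at most one consumer of the group,
    so the partition count caps the useful number of nodes.
    """
    return min(num_consumers, num_partitions)


def assign_topics(topics, node_groups):
    """Split topics into contiguous chunks, one chunk per node group.

    Every node group has two or more nodes so that each topic keeps
    some redundancy; partitions inside a group are left to Kafka.
    """
    chunk = -(-len(topics) // len(node_groups))  # ceiling division
    return {
        group: topics[i * chunk:(i + 1) * chunk]
        for i, group in enumerate(node_groups)
    }


# B2: one topic created with 4 partitions, 6 service nodes in one group.
b2_busy = active_consumers(6, 4)  # the other 2 nodes would sit idle

# B1: four controller topics shared between two pairs of nodes.
topics = ["topic0", "topic1", "topic2", "topic3"]
groups = [("node1", "node2"), ("node3", "node4")]
b1_plan = assign_topics(topics, groups)
print(b2_busy, b1_plan)
```

The contiguous split is only one possible policy; any scheme that keeps at least two nodes per topic would satisfy the HA requirement described above.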
    > >
    > > Best regards
    > > Dominic.
    > >
    > >
    > >
    > >
    > > On Mon, Jun 24, 2019 at 6:50 PM, Chetan Mehrotra <chetan.mehro...@gmail.com> wrote:
    > >
    > > > Okie, so we can then aim for adding optional support for storing
    > > > activations via a separate service.
    > > >
    > > > Currently we also send the activation result on the respective controller
    > > > topic. With this change we would also be sending the same activation
    > > > record on another topic. So we have another choice to make:
    > > >
    > > > Option B1 - Activations via controller topic
    > > > --------------------------------------------------------
    > > >
    > > > Here we avoid a new topic and instead have a service which listens to
    > > > all controller topics for activation records. However, that would be
    > > > tricky to implement and also tricky to scale out, as running multiple
    > > > copies of such a service would not be easy in terms of
    > > > sharding/partitioning.
    > > >
    > > > Here the benefit is that we reduce the duplicate writes on Kafka.
    > > >
    > > > Option B2 - Introduce a new topic altogether
    > > > -----------------------------------------------------------
    > > >
    > > > We introduce a new topic to which all invokers write the activation
    > > > records (as is done for user-events). Then implementing a new
    > > > service to read from a single (possibly partitioned) topic would be
    > > > easier.
    > > >
    > > > My suggestion is to go for B2 for now.
    > > >
    > > > Any feedback on that?
    > > >
    > > > Chetan Mehrotra
    > > >
    > > > On Fri, Jun 21, 2019 at 11:46 PM Rodric Rabbah <rod...@gmail.com> wrote:
    > > > >
    > > > > > Can we handle these in the same way as user events? Maybe exactly like user
    > > > > > events, as in use a single service to process both topics.
    > > > >
    > > > > good call - the user events already contain much of the activation
    > > > > record (if not all, modulo the logs)?
    > > > >
    > > > > -r
    > > >
    
