Resending email again...
Hello,
I would like to understand the options available to design an ingestion 
pipeline to support the following requirements.
1) Events come from various sources and, depending on their type, are 
stored in specific Kafka topics (say we have 4 topics).
2) The topics are weighted (Topic1: 0.6, Topic2: 0.1, Topic3: 0.2 and 
Topic4: 0.1).
3) The events are to be processed (consumed and enriched) based on the weights. 
For example, if I read 10 events from each topic, I should process 6 events 
from Topic1, 1 event from Topic2, 2 events from Topic3 and 1 event from 
Topic4. I am basically trying to do something similar to this implementation: 
https://github.com/flipkart-incubator/priority-kafka-client
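
To make requirement 3 concrete, here is a minimal sketch of the arithmetic I have in mind. The function name `weighted_quota` and the largest-remainder rounding are my own assumptions for illustration; they are not taken from priority-kafka-client or from any connector API. It only computes how many events per topic a batch should contain, and with the weights above and a batch of 10 it gives the 6/1/2/1 split.

```python
from math import floor


def weighted_quota(weights, batch_size):
    """Split batch_size across topics in proportion to their weights.

    Uses largest-remainder rounding (an assumption, not a requirement):
    floor every share, then hand the leftover slots to the topics with
    the largest fractional parts so the quotas sum to batch_size.
    """
    total = sum(weights.values())
    exact = {t: batch_size * w / total for t, w in weights.items()}
    quota = {t: floor(x) for t, x in exact.items()}
    leftover = batch_size - sum(quota.values())
    by_remainder = sorted(weights, key=lambda t: exact[t] - quota[t], reverse=True)
    for t in by_remainder[:leftover]:
        quota[t] += 1
    return quota


# Weights from requirement 2; the batch size of 10 matches the example above.
weights = {"Topic1": 0.6, "Topic2": 0.1, "Topic3": 0.2, "Topic4": 0.1}
print(weighted_quota(weights, 10))
```

When the batch size does not divide evenly (say 7), the quotas still sum to the batch size, but which topic gets the extra slot depends on the rounding rule chosen.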
Questions:

1) Should I handle the weighted distribution at the source (custom) connector 
or use a window after we read the data?
2) When reading from multiple Kafka topics, how does the source connector 
enforce the batch read? If the batch size is 100, will it try to read 100 
messages from each topic at once, or go through the topics round-robin (try to 
get 100 from Topic1 first, then move on to the next topics until the batch 
size is reached)?
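
To spell out the two readings of "batch size" in question 2, here is a small in-memory sketch. It does not use Kafka or any connector; the function names and the `deque`-based queues are hypothetical stand-ins, and I am not claiming either function reflects what the source connector actually does.

```python
from collections import deque


def read_per_topic(queues, batch_size):
    """Reading A: take up to batch_size messages from every topic."""
    return {
        topic: [q.popleft() for _ in range(min(batch_size, len(q)))]
        for topic, q in queues.items()
    }


def read_fill_in_order(queues, batch_size):
    """Reading B: visit topics in order and stop once batch_size messages
    have been collected in total."""
    batch = []
    for topic, q in queues.items():
        while q and len(batch) < batch_size:
            batch.append((topic, q.popleft()))
    return batch


def make_queues(sizes):
    """Build hypothetical per-topic backlogs, e.g. {"Topic1": 150}."""
    return {t: deque(f"{t}-{i}" for i in range(n)) for t, n in sizes.items()}


sizes = {"Topic1": 150, "Topic2": 50}
a = read_per_topic(make_queues(sizes), 100)
b = read_fill_in_order(make_queues(sizes), 100)
print({t: len(msgs) for t, msgs in a.items()})  # per-topic counts under reading A
print(len(b), {t for t, _ in b})                # total and topics seen under reading B
```

Under reading B, a busy Topic1 can fill the whole batch by itself, which is why the answer matters for the weighted processing described above.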
Appreciate your inputs.

Thanks,
Vijay

Originally sent by Vijay Srinivasaraghavan <vijikar...@yahoo.com> on Monday, 
December 16, 2019, 08:20:31 PM PST.
