Hi Raghu,

Thanks for the clarification.
But how can I implement a solution for HA? Two separate jobs with the same code and the same group.id, so that if one job is killed, the second job can start handling messages after a Kafka rebalance. (I have put a rough sketch of what I currently have in mind at the bottom of this message.)

Thanks,
Alexey Diomin

2016-11-15 21:31 GMT+04:00 Raghu Angadi <[email protected]>:

> Demin,
>
> 1) KafkaIO does not depend on the consumer group.id at all, so you can have
> multiple parallel pipelines reading from the same topic. Note that
> consumer.assign(partitions) does not need a group id, only
> consumer.subscribe() does.
>
> Sometimes users might want to set a group.id for various reasons, e.g. if
> you want to monitor consumption of a specific group.id externally. You can
> set a group.id if you want.
>
> 2) Beam applications store the consumption state internally in their
> checkpoint (sort of like Kafka storing offsets for a consumer group). That
> said, 'restart' support from different runners (Flink, Spark, etc.) might be a
> bit different. Please specify your setup and we can confirm.
>
> Raghu.
>
>
> On Tue, Nov 15, 2016 at 4:19 AM, Demin Alexey <[email protected]> wrote:
>
>> Hi,
>>
>> I have some questions about the Kafka connector:
>>
>> 1) In the code I can see *consumer.assign(source.assignedPartitions);* and as a
>> result two clients with the same group name can't use the inner load
>> balancer from Kafka.
>>
>> 2) The other problem is how to resume reading from Kafka after a restart.
>> In the past we depended on the inner Kafka mechanism for storing offsets in
>> Kafka/ZooKeeper, but with KafkaIO I didn't find how I can get this behavior.
>>
>> An additional clarification about our processing:
>> sometimes we need two separate jobs with the same group.id for HA, so that if one
>> job is killed, the second job can start handling messages after a Kafka rebalance.
>>
>> Could you help me figure out how we can build this requirement with Beam?
>>
>> Thanks,
>> Alexey Diomin
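P.S. To check that I understand points (1) and (2) correctly, below is a rough, untested sketch of what I have in mind: a pipeline that sets an explicit group.id only so consumption can be monitored externally, and relies on the runner's checkpoint (the Flink runner in this sketch) rather than Kafka-committed offsets to resume after a restart. The broker, topic, and group names are placeholders, and the exact builder methods differ between Beam versions (for example, withConsumerConfigUpdates in recent releases versus updateConsumerProperties in older ones), so please treat this only as an illustration of my understanding.

import java.util.Collections;

import org.apache.beam.runners.flink.FlinkPipelineOptions;
import org.apache.beam.runners.flink.FlinkRunner;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.kafka.KafkaIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.kafka.common.serialization.StringDeserializer;

public class KafkaHaSketch {
  public static void main(String[] args) {
    // Run on the Flink runner with checkpointing enabled. As described above,
    // KafkaIO keeps its offsets in the reader's checkpointed state, so a
    // restarted job resumes from the last checkpoint/savepoint, not from
    // offsets committed to Kafka.
    FlinkPipelineOptions options =
        PipelineOptionsFactory.fromArgs(args).as(FlinkPipelineOptions.class);
    options.setRunner(FlinkRunner.class);
    options.setCheckpointingInterval(60_000L);

    Pipeline pipeline = Pipeline.create(options);

    pipeline.apply(
        KafkaIO.<String, String>read()
            .withBootstrapServers("broker1:9092")            // placeholder broker
            .withTopic("events")                             // placeholder topic
            .withKeyDeserializer(StringDeserializer.class)
            .withValueDeserializer(StringDeserializer.class)
            // group.id is NOT used for partition assignment here, because
            // KafkaIO calls consumer.assign() itself. Setting it only makes
            // the consumption visible to external monitoring of that group.
            .withConsumerConfigUpdates(
                Collections.<String, Object>singletonMap(
                    "group.id", "monitoring-group"))         // placeholder group
            .withoutMetadata());

    pipeline.run();
  }
}

If that is right, then for HA I would simply run two copies of this pipeline, and each copy would read every partition independently instead of sharing partitions through a Kafka group rebalance. Is that the intended way to do it?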
