Kafka Stream Deadman

2017-10-06 Thread Trevor Huey
I have a sneaking suspicion this is out of scope for Kafka Streams, however I
thought it wouldn't hurt to ask... I'm trying to implement a temperature
monitoring system. Kafka Streams seems great for doing that. The one
scenario that I'm not able to cover, however, is detecting when a
temperature sensor fails to report data for X amount of time (a.k.a. a
"deadman" switch). Is there some pattern that makes this possible in Kafka Streams?

I'd appreciate any pointers, even if the answer is "no" :) Thanks!
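
One pattern that can cover this (sketched here under the assumption of Kafka
Streams 1.0+, which added wall-clock punctuation via KIP-138) is to keep a
per-sensor "last seen" timestamp in a state store and scan it from a scheduled
punctuator. The store name "last-seen-store", the 60-second timeout and the
downstream wiring are placeholders, not part of the original question:

    import org.apache.kafka.streams.KeyValue;
    import org.apache.kafka.streams.processor.AbstractProcessor;
    import org.apache.kafka.streams.processor.ProcessorContext;
    import org.apache.kafka.streams.processor.PunctuationType;
    import org.apache.kafka.streams.state.KeyValueIterator;
    import org.apache.kafka.streams.state.KeyValueStore;

    public class DeadmanProcessor extends AbstractProcessor<String, Double> {

        private KeyValueStore<String, Long> lastSeen;  // sensorId -> last report time (ms)

        @Override
        @SuppressWarnings("unchecked")
        public void init(ProcessorContext context) {
            super.init(context);
            // "last-seen-store" is a placeholder; the store must be attached to the topology.
            lastSeen = (KeyValueStore<String, Long>) context.getStateStore("last-seen-store");
            // Fire every 10s of wall-clock time, even when no records arrive.
            context.schedule(10_000L, PunctuationType.WALL_CLOCK_TIME, now -> {
                try (KeyValueIterator<String, Long> it = lastSeen.all()) {
                    while (it.hasNext()) {
                        KeyValue<String, Long> entry = it.next();
                        if (now - entry.value > 60_000L) {
                            // Sensor silent for more than 60s: emit a deadman alert downstream.
                            context().forward(entry.key, "DEADMAN: no reading for 60s");
                        }
                    }
                }
            });
        }

        @Override
        public void process(String sensorId, Double reading) {
            // Remember when we last heard from this sensor.
            lastSeen.put(sensorId, context().timestamp());
        }
    }

Before 1.0 the only punctuation available was stream-time based, which does
not fire while no new data arrives, so it cannot detect a silent sensor on
its own.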
-- 
Trevor Huey
Jolt
*Front End Lead*
E-mail: trevor.h...@joltup.com | Website: joltup.com


Re: kafka broker losing offsets?

2017-10-06 Thread tao xiao
Do you have unclean leader election turned on? If killing 100 is the only
way to reproduce the problem, it is possible, with unclean leader election
turned on, that leadership was transferred to an out-of-ISR follower which
may not have the latest high watermark.
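
For reference, the broker-side setting in question (it can also be overridden
per topic); a minimal server.properties sketch:

    # Disallow electing an out-of-sync replica as leader. This avoids losing
    # committed records (including __consumer_offsets entries) when the old
    # leader dies, at the cost of availability for that partition.
    unclean.leader.election.enable=false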
On Sat, Oct 7, 2017 at 3:51 AM Dmitriy Vsekhvalnov 
wrote:

> About to verify hypothesis on monday, but looks like that in latest tests.
> Need to double check.
>
> On Fri, Oct 6, 2017 at 11:25 PM, Stas Chizhov  wrote:
>
> > So no matter in what sequence you shutdown brokers it is only 1 that
> causes
> > the major problem? That would indeed be a bit weird. have you checked
> > offsets of your consumer - right after offsets jump back - does it start
> > from the topic start or does it go back to some random position? Have you
> > checked if all offsets are actually being committed by consumers?
> >
> > fre 6 okt. 2017 kl. 20:59 skrev Dmitriy Vsekhvalnov <
> > dvsekhval...@gmail.com
> > >:
> >
> > > Yeah, probably we can dig around.
> > >
> > > One more observation, the most lag/re-consumption trouble happening
> when
> > we
> > > kill broker with lowest id (e.g. 100 from [100,101,102]).
> > > When crashing other brokers - there is nothing special happening, lag
> > > growing little bit but nothing crazy (e.g. thousands, not millions).
> > >
> > > Is it sounds suspicious?
> > >
> > > On Fri, Oct 6, 2017 at 9:23 PM, Stas Chizhov 
> wrote:
> > >
> > > > Ted: when choosing earliest/latest you are saying: if it happens that
> > > there
> > > > is no "valid" offset committed for a consumer (for whatever reason:
> > > > bug/misconfiguration/no luck) it will be ok to start from the
> beginning
> > > or
> > > > end of the topic. So if you are not ok with that you should choose
> > none.
> > > >
> > > > Dmitriy: Ok. Then it is spring-kafka that maintains this offset per
> > > > partition state for you. it might also has that problem of leaving
> > stale
> > > > offsets lying around, After quickly looking through
> > > > https://github.com/spring-projects/spring-kafka/blob/
> > > > 1945f29d5518e3c4a9950ba82135420dfb61e808/spring-kafka/src/
> > > > main/java/org/springframework/kafka/listener/
> > > > KafkaMessageListenerContainer.java
> > > > it looks possible since offsets map is not cleared upon partition
> > > > revocation, but that is just a hypothesis. I have no experience with
> > > > spring-kafka. However since you say you consumers were always active
> I
> > > find
> > > > this theory worth investigating.
> > > >
> > > >
> > > > 2017-10-06 18:20 GMT+02:00 Vincent Dautremont <
> > > > vincent.dautrem...@olamobile.com.invalid>:
> > > >
> > > > > is there a way to read messages on a topic partition from a
> specific
> > > node
> > > > > we that we choose (and not by the topic partition leader) ?
> > > > > I would like to read myself that each of the __consumer_offsets
> > > partition
> > > > > replicas have the same consumer group offset written in it in it.
> > > > >
> > > > > On Fri, Oct 6, 2017 at 6:08 PM, Dmitriy Vsekhvalnov <
> > > > > dvsekhval...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Stas:
> > > > > >
> > > > > > we rely on spring-kafka, it  commits offsets "manually" for us
> > after
> > > > > event
> > > > > > handler completed. So it's kind of automatic once there is
> constant
> > > > > stream
> > > > > > of events (no idle time, which is true for us). Though it's not
> > what
> > > > pure
> > > > > > kafka-client calls "automatic" (flush commits at fixed
> intervals).
> > > > > >
> > > > > > On Fri, Oct 6, 2017 at 7:04 PM, Stas Chizhov  >
> > > > wrote:
> > > > > >
> > > > > > > You don't have autocmmit enables that means you commit offsets
> > > > > yourself -
> > > > > > > correct? If you store them per partition somewhere and fail to
> > > clean
> > > > it
> > > > > > up
> > > > > > > upon rebalance next time the consumer gets this partition
> > assigned
> > > > > during
> > > > > > > next rebalance it can commit old stale offset- can this be the
> > > case?
> > > > > > >
> > > > > > >
> > > > > > > fre 6 okt. 2017 kl. 17:59 skrev Dmitriy Vsekhvalnov <
> > > > > > > dvsekhval...@gmail.com
> > > > > > > >:
> > > > > > >
> > > > > > > > Reprocessing same events again - is fine for us (idempotent).
> > > While
> > > > > > > loosing
> > > > > > > > data is more critical.
> > > > > > > >
> > > > > > > > What are reasons of such behaviour? Consumers are never idle,
> > > > always
> > > > > > > > commiting, probably something wrong with broker setup then?
> > > > > > > >
> > > > > > > > On Fri, Oct 6, 2017 at 6:58 PM, Ted Yu 
> > > > wrote:
> > > > > > > >
> > > > > > > > > Stas:
> > > > > > > > > bq.  using anything but none is not really an option
> > > > > > > > >
> > > > > > > > > If you have time, can you explain a bit more ?
> > > > > > > > >
> > > > > > > > > Thanks
> > > > > > > > >
> > > > > > > > > On Fri, Oct 6, 2017 at 

Re: Serve interactive queries from standby replicas

2017-10-06 Thread Guozhang Wang
Hi Stas,

Would you mind creating a JIRA for this functionality request, so that we
won't forget about it and drop it on the floor?


Guozhang

On Fri, Oct 6, 2017 at 1:10 PM, Stas Chizhov  wrote:

> Thank you!
>
> I guess eventually consistent reads might be a reasonable trade off if you
> can get ability to serve reads without downtime in some cases.
>
> By the way standby replicas are just extra consumers/processors of input
> topics? Or is there  some custom protocol for sinking the state?
>
>
>
> fre 6 okt. 2017 kl. 20:03 skrev Matthias J. Sax :
>
> > No, that is not possible.
> >
> > Note: standby replicas might "lag" behind the active store, and thus,
> > you would get different results if querying standby replicas would be
> > supported.
> >
> > We might add this functionality at some point though -- but there are no
> > concrete plans atm. Contributions are always welcome of course :)
> >
> >
> > -Matthias
> >
> > On 10/6/17 4:18 AM, Stas Chizhov wrote:
> > > Hi
> > >
> > > Is there a way to serve read read requests from standby replicas?
> > > StreamsMeatadata does not seem to provide standby end points as far as
> I
> > > can see.
> > >
> > > Thank you,
> > > Stas
> > >
> >
> >
>



-- 
-- Guozhang


Re: Serve interactive queries from standby replicas

2017-10-06 Thread Stas Chizhov
Ok I see. Thanks again!

fre 6 okt. 2017 kl. 22:13 skrev Matthias J. Sax :

> >> I guess eventually consistent reads might be a reasonable trade off if
> you
> >> can get ability to serve reads without downtime in some cases.
>
> Agreed :)
>
> >> By the way standby replicas are just extra consumers/processors of input
> >> topics? Or is there  some custom protocol for sinking the state?
>
> We use a second consumer, that reads the changlog topic (that is written
> by the active store) to update the hot standby.
>
>
> -Matthias
>
> On 10/6/17 1:10 PM, Stas Chizhov wrote:
> > Thank you!
> >
> > I guess eventually consistent reads might be a reasonable trade off if
> you
> > can get ability to serve reads without downtime in some cases.
> >
> > By the way standby replicas are just extra consumers/processors of input
> > topics? Or is there  some custom protocol for sinking the state?
> >
> >
> >
> > fre 6 okt. 2017 kl. 20:03 skrev Matthias J. Sax :
> >
> >> No, that is not possible.
> >>
> >> Note: standby replicas might "lag" behind the active store, and thus,
> >> you would get different results if querying standby replicas would be
> >> supported.
> >>
> >> We might add this functionality at some point though -- but there are no
> >> concrete plans atm. Contributions are always welcome of course :)
> >>
> >>
> >> -Matthias
> >>
> >> On 10/6/17 4:18 AM, Stas Chizhov wrote:
> >>> Hi
> >>>
> >>> Is there a way to serve read read requests from standby replicas?
> >>> StreamsMeatadata does not seem to provide standby end points as far as
> I
> >>> can see.
> >>>
> >>> Thank you,
> >>> Stas
> >>>
> >>
> >>
> >
>
>


Re: kafka broker losing offsets?

2017-10-06 Thread Dmitriy Vsekhvalnov
About to verify the hypothesis on Monday, but it looks like that was the case
in the latest tests. Need to double check.

On Fri, Oct 6, 2017 at 11:25 PM, Stas Chizhov  wrote:

> So no matter in what sequence you shutdown brokers it is only 1 that causes
> the major problem? That would indeed be a bit weird. have you checked
> offsets of your consumer - right after offsets jump back - does it start
> from the topic start or does it go back to some random position? Have you
> checked if all offsets are actually being committed by consumers?
>
> fre 6 okt. 2017 kl. 20:59 skrev Dmitriy Vsekhvalnov <
> dvsekhval...@gmail.com
> >:
>
> > Yeah, probably we can dig around.
> >
> > One more observation, the most lag/re-consumption trouble happening when
> we
> > kill broker with lowest id (e.g. 100 from [100,101,102]).
> > When crashing other brokers - there is nothing special happening, lag
> > growing little bit but nothing crazy (e.g. thousands, not millions).
> >
> > Is it sounds suspicious?
> >
> > On Fri, Oct 6, 2017 at 9:23 PM, Stas Chizhov  wrote:
> >
> > > Ted: when choosing earliest/latest you are saying: if it happens that
> > there
> > > is no "valid" offset committed for a consumer (for whatever reason:
> > > bug/misconfiguration/no luck) it will be ok to start from the beginning
> > or
> > > end of the topic. So if you are not ok with that you should choose
> none.
> > >
> > > Dmitriy: Ok. Then it is spring-kafka that maintains this offset per
> > > partition state for you. it might also has that problem of leaving
> stale
> > > offsets lying around, After quickly looking through
> > > https://github.com/spring-projects/spring-kafka/blob/
> > > 1945f29d5518e3c4a9950ba82135420dfb61e808/spring-kafka/src/
> > > main/java/org/springframework/kafka/listener/
> > > KafkaMessageListenerContainer.java
> > > it looks possible since offsets map is not cleared upon partition
> > > revocation, but that is just a hypothesis. I have no experience with
> > > spring-kafka. However since you say you consumers were always active I
> > find
> > > this theory worth investigating.
> > >
> > >
> > > 2017-10-06 18:20 GMT+02:00 Vincent Dautremont <
> > > vincent.dautrem...@olamobile.com.invalid>:
> > >
> > > > is there a way to read messages on a topic partition from a specific
> > node
> > > > we that we choose (and not by the topic partition leader) ?
> > > > I would like to read myself that each of the __consumer_offsets
> > partition
> > > > replicas have the same consumer group offset written in it in it.
> > > >
> > > > On Fri, Oct 6, 2017 at 6:08 PM, Dmitriy Vsekhvalnov <
> > > > dvsekhval...@gmail.com>
> > > > wrote:
> > > >
> > > > > Stas:
> > > > >
> > > > > we rely on spring-kafka, it  commits offsets "manually" for us
> after
> > > > event
> > > > > handler completed. So it's kind of automatic once there is constant
> > > > stream
> > > > > of events (no idle time, which is true for us). Though it's not
> what
> > > pure
> > > > > kafka-client calls "automatic" (flush commits at fixed intervals).
> > > > >
> > > > > On Fri, Oct 6, 2017 at 7:04 PM, Stas Chizhov 
> > > wrote:
> > > > >
> > > > > > You don't have autocmmit enables that means you commit offsets
> > > > yourself -
> > > > > > correct? If you store them per partition somewhere and fail to
> > clean
> > > it
> > > > > up
> > > > > > upon rebalance next time the consumer gets this partition
> assigned
> > > > during
> > > > > > next rebalance it can commit old stale offset- can this be the
> > case?
> > > > > >
> > > > > >
> > > > > > fre 6 okt. 2017 kl. 17:59 skrev Dmitriy Vsekhvalnov <
> > > > > > dvsekhval...@gmail.com
> > > > > > >:
> > > > > >
> > > > > > > Reprocessing same events again - is fine for us (idempotent).
> > While
> > > > > > loosing
> > > > > > > data is more critical.
> > > > > > >
> > > > > > > What are reasons of such behaviour? Consumers are never idle,
> > > always
> > > > > > > commiting, probably something wrong with broker setup then?
> > > > > > >
> > > > > > > On Fri, Oct 6, 2017 at 6:58 PM, Ted Yu 
> > > wrote:
> > > > > > >
> > > > > > > > Stas:
> > > > > > > > bq.  using anything but none is not really an option
> > > > > > > >
> > > > > > > > If you have time, can you explain a bit more ?
> > > > > > > >
> > > > > > > > Thanks
> > > > > > > >
> > > > > > > > On Fri, Oct 6, 2017 at 8:55 AM, Stas Chizhov <
> > schiz...@gmail.com
> > > >
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > If you set auto.offset.reset to none next time it happens
> you
> > > > will
> > > > > be
> > > > > > > in
> > > > > > > > > much better position to find out what happens. Also in
> > general
> > > > with
> > > > > > > > current
> > > > > > > > > semantics of offset reset policy IMO using anything but
> none
> > is
> > > > not
> > > > > > > > really
> > > > > > > > > an option unless it is ok for consumer to loose some data
> > > > (latest)
> > > > > or
> > > > > > > > > reprocess 

Re: kafka broker losing offsets?

2017-10-06 Thread Stas Chizhov
So no matter in what sequence you shut down brokers, it is only one of them
that causes the major problem? That would indeed be a bit weird. Have you
checked the offsets of your consumer right after they jump back - does it
start from the beginning of the topic or does it go back to some random
position? Have you checked if all offsets are actually being committed by
the consumers?

fre 6 okt. 2017 kl. 20:59 skrev Dmitriy Vsekhvalnov :

> Yeah, probably we can dig around.
>
> One more observation, the most lag/re-consumption trouble happening when we
> kill broker with lowest id (e.g. 100 from [100,101,102]).
> When crashing other brokers - there is nothing special happening, lag
> growing little bit but nothing crazy (e.g. thousands, not millions).
>
> Is it sounds suspicious?
>
> On Fri, Oct 6, 2017 at 9:23 PM, Stas Chizhov  wrote:
>
> > Ted: when choosing earliest/latest you are saying: if it happens that
> there
> > is no "valid" offset committed for a consumer (for whatever reason:
> > bug/misconfiguration/no luck) it will be ok to start from the beginning
> or
> > end of the topic. So if you are not ok with that you should choose none.
> >
> > Dmitriy: Ok. Then it is spring-kafka that maintains this offset per
> > partition state for you. it might also has that problem of leaving stale
> > offsets lying around, After quickly looking through
> > https://github.com/spring-projects/spring-kafka/blob/
> > 1945f29d5518e3c4a9950ba82135420dfb61e808/spring-kafka/src/
> > main/java/org/springframework/kafka/listener/
> > KafkaMessageListenerContainer.java
> > it looks possible since offsets map is not cleared upon partition
> > revocation, but that is just a hypothesis. I have no experience with
> > spring-kafka. However since you say you consumers were always active I
> find
> > this theory worth investigating.
> >
> >
> > 2017-10-06 18:20 GMT+02:00 Vincent Dautremont <
> > vincent.dautrem...@olamobile.com.invalid>:
> >
> > > is there a way to read messages on a topic partition from a specific
> node
> > > we that we choose (and not by the topic partition leader) ?
> > > I would like to read myself that each of the __consumer_offsets
> partition
> > > replicas have the same consumer group offset written in it in it.
> > >
> > > On Fri, Oct 6, 2017 at 6:08 PM, Dmitriy Vsekhvalnov <
> > > dvsekhval...@gmail.com>
> > > wrote:
> > >
> > > > Stas:
> > > >
> > > > we rely on spring-kafka, it  commits offsets "manually" for us after
> > > event
> > > > handler completed. So it's kind of automatic once there is constant
> > > stream
> > > > of events (no idle time, which is true for us). Though it's not what
> > pure
> > > > kafka-client calls "automatic" (flush commits at fixed intervals).
> > > >
> > > > On Fri, Oct 6, 2017 at 7:04 PM, Stas Chizhov 
> > wrote:
> > > >
> > > > > You don't have autocmmit enables that means you commit offsets
> > > yourself -
> > > > > correct? If you store them per partition somewhere and fail to
> clean
> > it
> > > > up
> > > > > upon rebalance next time the consumer gets this partition assigned
> > > during
> > > > > next rebalance it can commit old stale offset- can this be the
> case?
> > > > >
> > > > >
> > > > > fre 6 okt. 2017 kl. 17:59 skrev Dmitriy Vsekhvalnov <
> > > > > dvsekhval...@gmail.com
> > > > > >:
> > > > >
> > > > > > Reprocessing same events again - is fine for us (idempotent).
> While
> > > > > loosing
> > > > > > data is more critical.
> > > > > >
> > > > > > What are reasons of such behaviour? Consumers are never idle,
> > always
> > > > > > commiting, probably something wrong with broker setup then?
> > > > > >
> > > > > > On Fri, Oct 6, 2017 at 6:58 PM, Ted Yu 
> > wrote:
> > > > > >
> > > > > > > Stas:
> > > > > > > bq.  using anything but none is not really an option
> > > > > > >
> > > > > > > If you have time, can you explain a bit more ?
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > > On Fri, Oct 6, 2017 at 8:55 AM, Stas Chizhov <
> schiz...@gmail.com
> > >
> > > > > wrote:
> > > > > > >
> > > > > > > > If you set auto.offset.reset to none next time it happens you
> > > will
> > > > be
> > > > > > in
> > > > > > > > much better position to find out what happens. Also in
> general
> > > with
> > > > > > > current
> > > > > > > > semantics of offset reset policy IMO using anything but none
> is
> > > not
> > > > > > > really
> > > > > > > > an option unless it is ok for consumer to loose some data
> > > (latest)
> > > > or
> > > > > > > > reprocess it second time (earliest).
> > > > > > > >
> > > > > > > > fre 6 okt. 2017 kl. 17:44 skrev Ted Yu  >:
> > > > > > > >
> > > > > > > > > Should Kafka log warning if log.retention.hours is lower
> than
> > > > > number
> > > > > > of
> > > > > > > > > hours specified by offsets.retention.minutes ?
> > > > > > > > >
> > > > > > > > > On Fri, Oct 6, 2017 at 8:35 AM, Manikumar <
> > > > > manikumar.re...@gmail.com
> > 

Re: Serve interactive queries from standby replicas

2017-10-06 Thread Matthias J. Sax
>> I guess eventually consistent reads might be a reasonable trade off if you
>> can get ability to serve reads without downtime in some cases.

Agreed :)

>> By the way standby replicas are just extra consumers/processors of input
>> topics? Or is there  some custom protocol for sinking the state?

We use a second consumer that reads the changelog topic (written by the
active store) to update the hot standby.
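
For completeness, hot standbys are enabled per Streams instance via
num.standby.replicas; a minimal sketch (application id and broker address
are placeholders):

    import java.util.Properties;
    import org.apache.kafka.streams.StreamsConfig;

    public class StandbyConfigSketch {
        public static Properties withOneStandby() {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
            // Keep one shadow copy of each local state store on another instance,
            // maintained from the changelog topic; it speeds up failover but
            // (as noted above) cannot currently serve interactive queries.
            props.put(StreamsConfig.NUM_STANDBY_REPLICAS_CONFIG, 1);
            return props;
        }
    }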


-Matthias

On 10/6/17 1:10 PM, Stas Chizhov wrote:
> Thank you!
> 
> I guess eventually consistent reads might be a reasonable trade off if you
> can get ability to serve reads without downtime in some cases.
> 
> By the way standby replicas are just extra consumers/processors of input
> topics? Or is there  some custom protocol for sinking the state?
> 
> 
> 
> fre 6 okt. 2017 kl. 20:03 skrev Matthias J. Sax :
> 
>> No, that is not possible.
>>
>> Note: standby replicas might "lag" behind the active store, and thus,
>> you would get different results if querying standby replicas would be
>> supported.
>>
>> We might add this functionality at some point though -- but there are no
>> concrete plans atm. Contributions are always welcome of course :)
>>
>>
>> -Matthias
>>
>> On 10/6/17 4:18 AM, Stas Chizhov wrote:
>>> Hi
>>>
>>> Is there a way to serve read read requests from standby replicas?
>>> StreamsMeatadata does not seem to provide standby end points as far as I
>>> can see.
>>>
>>> Thank you,
>>> Stas
>>>
>>
>>
> 





Re: Serve interactive queries from standby replicas

2017-10-06 Thread Stas Chizhov
Thank you!

I guess eventually consistent reads might be a reasonable trade-off if you
can get the ability to serve reads without downtime in some cases.

By the way, are standby replicas just extra consumers/processors of the input
topics? Or is there some custom protocol for syncing the state?



fre 6 okt. 2017 kl. 20:03 skrev Matthias J. Sax :

> No, that is not possible.
>
> Note: standby replicas might "lag" behind the active store, and thus,
> you would get different results if querying standby replicas would be
> supported.
>
> We might add this functionality at some point though -- but there are no
> concrete plans atm. Contributions are always welcome of course :)
>
>
> -Matthias
>
> On 10/6/17 4:18 AM, Stas Chizhov wrote:
> > Hi
> >
> > Is there a way to serve read read requests from standby replicas?
> > StreamsMeatadata does not seem to provide standby end points as far as I
> > can see.
> >
> > Thank you,
> > Stas
> >
>
>


Re: Scala API

2017-10-06 Thread Jozef.koval
Also voting in favor of reactive-kafka. It should fit nicely into your Akka app.

Jozef

Sent from [ProtonMail](https://protonmail.ch), encrypted email based in 
Switzerland.

>  Original Message 
> Subject: Re: Scala API
> Local Time: October 6, 2017 8:09 PM
> UTC Time: October 6, 2017 6:09 PM
> From: a...@indeni.com
> To: users@kafka.apache.org
>
> Hi .
> IMO reactive-kafka gives a very nice api for streams . However if you want
> an alternative for using steams, you can try
> Scala-kafka-client http://www.cakesolutions.net/teamblogs/
> getting-started-with-kafka-using-scala-kafka-client-and-akka which doesn't
> use streams but gives nice integration with actors
> Cheers
> Avi
>
> On Oct 6, 2017 6:46 AM, "Josh Maidana"  wrote:
>
> Hello
>
> We are integrating KAFKA with an AKKA system written in Scala. Is there a
> Scala API available for KAFKA? Is the best option to use AKKA KAFKA Stream?
>
> --
> Kind regards
> *Josh Meraj Maidana*

Re: kafka broker losing offsets?

2017-10-06 Thread Dmitriy Vsekhvalnov
Yeah, probably we can dig around.

One more observation: the most lag/re-consumption trouble happens when we
kill the broker with the lowest id (e.g. 100 from [100,101,102]).
When crashing other brokers there is nothing special happening - the lag
grows a little but nothing crazy (e.g. thousands, not millions).

Does that sound suspicious?

On Fri, Oct 6, 2017 at 9:23 PM, Stas Chizhov  wrote:

> Ted: when choosing earliest/latest you are saying: if it happens that there
> is no "valid" offset committed for a consumer (for whatever reason:
> bug/misconfiguration/no luck) it will be ok to start from the beginning or
> end of the topic. So if you are not ok with that you should choose none.
>
> Dmitriy: Ok. Then it is spring-kafka that maintains this offset per
> partition state for you. it might also has that problem of leaving stale
> offsets lying around, After quickly looking through
> https://github.com/spring-projects/spring-kafka/blob/
> 1945f29d5518e3c4a9950ba82135420dfb61e808/spring-kafka/src/
> main/java/org/springframework/kafka/listener/
> KafkaMessageListenerContainer.java
> it looks possible since offsets map is not cleared upon partition
> revocation, but that is just a hypothesis. I have no experience with
> spring-kafka. However since you say you consumers were always active I find
> this theory worth investigating.
>
>
> 2017-10-06 18:20 GMT+02:00 Vincent Dautremont <
> vincent.dautrem...@olamobile.com.invalid>:
>
> > is there a way to read messages on a topic partition from a specific node
> > we that we choose (and not by the topic partition leader) ?
> > I would like to read myself that each of the __consumer_offsets partition
> > replicas have the same consumer group offset written in it in it.
> >
> > On Fri, Oct 6, 2017 at 6:08 PM, Dmitriy Vsekhvalnov <
> > dvsekhval...@gmail.com>
> > wrote:
> >
> > > Stas:
> > >
> > > we rely on spring-kafka, it  commits offsets "manually" for us after
> > event
> > > handler completed. So it's kind of automatic once there is constant
> > stream
> > > of events (no idle time, which is true for us). Though it's not what
> pure
> > > kafka-client calls "automatic" (flush commits at fixed intervals).
> > >
> > > On Fri, Oct 6, 2017 at 7:04 PM, Stas Chizhov 
> wrote:
> > >
> > > > You don't have autocmmit enables that means you commit offsets
> > yourself -
> > > > correct? If you store them per partition somewhere and fail to clean
> it
> > > up
> > > > upon rebalance next time the consumer gets this partition assigned
> > during
> > > > next rebalance it can commit old stale offset- can this be the case?
> > > >
> > > >
> > > > fre 6 okt. 2017 kl. 17:59 skrev Dmitriy Vsekhvalnov <
> > > > dvsekhval...@gmail.com
> > > > >:
> > > >
> > > > > Reprocessing same events again - is fine for us (idempotent). While
> > > > loosing
> > > > > data is more critical.
> > > > >
> > > > > What are reasons of such behaviour? Consumers are never idle,
> always
> > > > > commiting, probably something wrong with broker setup then?
> > > > >
> > > > > On Fri, Oct 6, 2017 at 6:58 PM, Ted Yu 
> wrote:
> > > > >
> > > > > > Stas:
> > > > > > bq.  using anything but none is not really an option
> > > > > >
> > > > > > If you have time, can you explain a bit more ?
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > On Fri, Oct 6, 2017 at 8:55 AM, Stas Chizhov  >
> > > > wrote:
> > > > > >
> > > > > > > If you set auto.offset.reset to none next time it happens you
> > will
> > > be
> > > > > in
> > > > > > > much better position to find out what happens. Also in general
> > with
> > > > > > current
> > > > > > > semantics of offset reset policy IMO using anything but none is
> > not
> > > > > > really
> > > > > > > an option unless it is ok for consumer to loose some data
> > (latest)
> > > or
> > > > > > > reprocess it second time (earliest).
> > > > > > >
> > > > > > > fre 6 okt. 2017 kl. 17:44 skrev Ted Yu :
> > > > > > >
> > > > > > > > Should Kafka log warning if log.retention.hours is lower than
> > > > number
> > > > > of
> > > > > > > > hours specified by offsets.retention.minutes ?
> > > > > > > >
> > > > > > > > On Fri, Oct 6, 2017 at 8:35 AM, Manikumar <
> > > > manikumar.re...@gmail.com
> > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > normally, log.retention.hours (168hrs)  should be higher
> than
> > > > > > > > > offsets.retention.minutes (336 hrs)?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Fri, Oct 6, 2017 at 8:58 PM, Dmitriy Vsekhvalnov <
> > > > > > > > > dvsekhval...@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Ted,
> > > > > > > > > >
> > > > > > > > > > Broker: v0.11.0.0
> > > > > > > > > >
> > > > > > > > > > Consumer:
> > > > > > > > > > kafka-clients v0.11.0.0
> > > > > > > > > > auto.offset.reset = earliest
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On 

Re: kafka broker losing offsets?

2017-10-06 Thread Stas Chizhov
Ted: when choosing earliest/latest you are saying: if it happens that there
is no "valid" offset committed for a consumer (for whatever reason:
bug/misconfiguration/bad luck), it is ok to start from the beginning or the
end of the topic. So if you are not ok with that, you should choose none.
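
As a sketch of that setting with the plain Java consumer (broker address,
group id and deserializers are placeholders):

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class StrictOffsetConsumer {
        public static KafkaConsumer<String, String> create() {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");
            props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
            // "none": if an assigned partition has no valid committed offset,
            // poll() throws NoOffsetForPartitionException instead of silently
            // resetting to the beginning (earliest) or end (latest) of the topic.
            props.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "none");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringDeserializer");
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                    "org.apache.kafka.common.serialization.StringDeserializer");
            return new KafkaConsumer<>(props);
        }
    }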

Dmitriy: Ok. Then it is spring-kafka that maintains this offset-per-partition
state for you. It might also have that problem of leaving stale offsets lying
around. After quickly looking through
https://github.com/spring-projects/spring-kafka/blob/1945f29d5518e3c4a9950ba82135420dfb61e808/spring-kafka/src/main/java/org/springframework/kafka/listener/KafkaMessageListenerContainer.java
it looks possible, since the offsets map is not cleared upon partition
revocation, but that is just a hypothesis. I have no experience with
spring-kafka. However, since you say your consumers were always active, I find
this theory worth investigating.


2017-10-06 18:20 GMT+02:00 Vincent Dautremont <
vincent.dautrem...@olamobile.com.invalid>:

> is there a way to read messages on a topic partition from a specific node
> we that we choose (and not by the topic partition leader) ?
> I would like to read myself that each of the __consumer_offsets partition
> replicas have the same consumer group offset written in it in it.
>
> On Fri, Oct 6, 2017 at 6:08 PM, Dmitriy Vsekhvalnov <
> dvsekhval...@gmail.com>
> wrote:
>
> > Stas:
> >
> > we rely on spring-kafka, it  commits offsets "manually" for us after
> event
> > handler completed. So it's kind of automatic once there is constant
> stream
> > of events (no idle time, which is true for us). Though it's not what pure
> > kafka-client calls "automatic" (flush commits at fixed intervals).
> >
> > On Fri, Oct 6, 2017 at 7:04 PM, Stas Chizhov  wrote:
> >
> > > You don't have autocmmit enables that means you commit offsets
> yourself -
> > > correct? If you store them per partition somewhere and fail to clean it
> > up
> > > upon rebalance next time the consumer gets this partition assigned
> during
> > > next rebalance it can commit old stale offset- can this be the case?
> > >
> > >
> > > fre 6 okt. 2017 kl. 17:59 skrev Dmitriy Vsekhvalnov <
> > > dvsekhval...@gmail.com
> > > >:
> > >
> > > > Reprocessing same events again - is fine for us (idempotent). While
> > > loosing
> > > > data is more critical.
> > > >
> > > > What are reasons of such behaviour? Consumers are never idle, always
> > > > commiting, probably something wrong with broker setup then?
> > > >
> > > > On Fri, Oct 6, 2017 at 6:58 PM, Ted Yu  wrote:
> > > >
> > > > > Stas:
> > > > > bq.  using anything but none is not really an option
> > > > >
> > > > > If you have time, can you explain a bit more ?
> > > > >
> > > > > Thanks
> > > > >
> > > > > On Fri, Oct 6, 2017 at 8:55 AM, Stas Chizhov 
> > > wrote:
> > > > >
> > > > > > If you set auto.offset.reset to none next time it happens you
> will
> > be
> > > > in
> > > > > > much better position to find out what happens. Also in general
> with
> > > > > current
> > > > > > semantics of offset reset policy IMO using anything but none is
> not
> > > > > really
> > > > > > an option unless it is ok for consumer to loose some data
> (latest)
> > or
> > > > > > reprocess it second time (earliest).
> > > > > >
> > > > > > fre 6 okt. 2017 kl. 17:44 skrev Ted Yu :
> > > > > >
> > > > > > > Should Kafka log warning if log.retention.hours is lower than
> > > number
> > > > of
> > > > > > > hours specified by offsets.retention.minutes ?
> > > > > > >
> > > > > > > On Fri, Oct 6, 2017 at 8:35 AM, Manikumar <
> > > manikumar.re...@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > normally, log.retention.hours (168hrs)  should be higher than
> > > > > > > > offsets.retention.minutes (336 hrs)?
> > > > > > > >
> > > > > > > >
> > > > > > > > On Fri, Oct 6, 2017 at 8:58 PM, Dmitriy Vsekhvalnov <
> > > > > > > > dvsekhval...@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Ted,
> > > > > > > > >
> > > > > > > > > Broker: v0.11.0.0
> > > > > > > > >
> > > > > > > > > Consumer:
> > > > > > > > > kafka-clients v0.11.0.0
> > > > > > > > > auto.offset.reset = earliest
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Fri, Oct 6, 2017 at 6:24 PM, Ted Yu <
> yuzhih...@gmail.com>
> > > > > wrote:
> > > > > > > > >
> > > > > > > > > > What's the value for auto.offset.reset  ?
> > > > > > > > > >
> > > > > > > > > > Which release are you using ?
> > > > > > > > > >
> > > > > > > > > > Cheers
> > > > > > > > > >
> > > > > > > > > > On Fri, Oct 6, 2017 at 7:52 AM, Dmitriy Vsekhvalnov <
> > > > > > > > > > dvsekhval...@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi all,
> > > > > > > > > > >
> > > > > > > > > > > we several time faced situation where consumer-group
> > > started
> > > > to
> > > > > > > > > > re-consume
> > > > > > > > > > 

Re: Scala API

2017-10-06 Thread Avi Levi
Hi.
IMO reactive-kafka gives a very nice API for streams. However, if you want
an alternative to using streams, you can try Scala-kafka-client
http://www.cakesolutions.net/teamblogs/getting-started-with-kafka-using-scala-kafka-client-and-akka
which doesn't use streams but gives nice integration with actors.
Cheers
Avi


On Oct 6, 2017 6:46 AM, "Josh Maidana"  wrote:

Hello

We are integrating KAFKA with an AKKA system written in Scala. Is there a
Scala API available for KAFKA? Is the best option to use AKKA KAFKA Stream?

--
Kind regards
*Josh Meraj Maidana*


Re: Compacted topic, entry deletion, kafka 0.11.0.0

2017-10-06 Thread Matthias J. Sax
Setting the topic cleanup policy to "compact,delete" should be sufficient. Cf.
https://cwiki.apache.org/confluence/display/KAFKA/KIP-71%3A+Enable+log+compaction+and+deletion+to+co-exist

Note: retention time is not based on wall-clock time, but on embedded
record timestamps. Thus, old messages only get deleted if new messages
with larger timestamps are written to the topic.
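
A small illustration of both points (keys and values are placeholders;
"test.topic" is the topic from the question): the explicit-timestamp
ProducerRecord constructor is what advances the topic's notion of time, and
where producing a null value is possible, a tombstone removes a single key
during compaction:

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class CompactDeleteSketch {
        public static void send(KafkaProducer<String, String> producer) {
            long now = System.currentTimeMillis();

            // Retention on a compact,delete topic is judged against embedded record
            // timestamps, so old segments only age out once newer timestamps arrive.
            producer.send(new ProducerRecord<>("test.topic", null, now, "some-key", "new-value"));

            // A tombstone (null value) marks a key for removal by the log cleaner;
            // it is itself kept around for delete.retention.ms before disappearing.
            producer.send(new ProducerRecord<>("test.topic", null, now, "stale-key", null));
        }
    }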


-Matthias

On 10/6/17 6:17 AM, Asko Alhoniemi wrote:
> Hello
> 
> I understand that the compacted topic is meant to keep at least the latest
> key value pair.
> 
> However, I am having an issue since it can happen that entry becomes old
> and I need to remove it. It may also occur, that I am not able to send key
> "null" pair. So I need another method to remove my hanging entries. My hope
> was with the following configuration:
> 
> Topic: test.topic    PartitionCount: 30    ReplicationFactor: 2    Configs: segment.bytes=1048576,min.cleanable.dirty.ratio=0.1,delete.retention.ms=180,retention.ms=90,segment.ms=90,cleanup.policy=compact,delete
>     Topic: test.topic    Partition: 0     Leader: 0    Replicas: 1,0    Isr: 0
>     Topic: test.topic    Partition: 1     Leader: 0    Replicas: 0,1    Isr: 0
>     Topic: test.topic    Partition: 2     Leader: 0    Replicas: 1,0    Isr: 0
>     Topic: test.topic    Partition: 3     Leader: 0    Replicas: 0,1    Isr: 0
>     Topic: test.topic    Partition: 4     Leader: 0    Replicas: 1,0    Isr: 0
>     Topic: test.topic    Partition: 5     Leader: 0    Replicas: 0,1    Isr: 0
>     Topic: test.topic    Partition: 6     Leader: 0    Replicas: 1,0    Isr: 0
>     Topic: test.topic    Partition: 7     Leader: 0    Replicas: 0,1    Isr: 0
>     Topic: test.topic    Partition: 8     Leader: 0    Replicas: 1,0    Isr: 0
>     Topic: test.topic    Partition: 9     Leader: 0    Replicas: 0,1    Isr: 0
>     Topic: test.topic    Partition: 10    Leader: 0    Replicas: 1,0    Isr: 0
>     Topic: test.topic    Partition: 11    Leader: 0    Replicas: 0,1    Isr: 0
>     Topic: test.topic    Partition: 12    Leader: 0    Replicas: 1,0    Isr: 0
>     Topic: test.topic    Partition: 13    Leader: 0    Replicas: 0,1    Isr: 0
>     Topic: test.topic    Partition: 14    Leader: 0    Replicas: 1,0    Isr: 0
>     Topic: test.topic    Partition: 15    Leader: 0    Replicas: 0,1    Isr: 0
>     Topic: test.topic    Partition: 16    Leader: 0    Replicas: 1,0    Isr: 0
>     Topic: test.topic    Partition: 17    Leader: 0    Replicas: 0,1    Isr: 0
>     Topic: test.topic    Partition: 18    Leader: 0    Replicas: 1,0    Isr: 0
>     Topic: test.topic    Partition: 19    Leader: 0    Replicas: 0,1    Isr: 0
>     Topic: test.topic    Partition: 20    Leader: 0    Replicas: 1,0    Isr: 0
>     Topic: test.topic    Partition: 21    Leader: 0    Replicas: 0,1    Isr: 0
>     Topic: test.topic    Partition: 22    Leader: 0    Replicas: 1,0    Isr: 0
>     Topic: test.topic    Partition: 23    Leader: 0    Replicas: 0,1    Isr: 0
>     Topic: test.topic    Partition: 24    Leader: 0    Replicas: 1,0    Isr: 0
>     Topic: test.topic    Partition: 25    Leader: 0    Replicas: 0,1    Isr: 0
>     Topic: test.topic    Partition: 26    Leader: 0    Replicas: 1,0    Isr: 0
>     Topic: test.topic    Partition: 27    Leader: 0    Replicas: 0,1    Isr: 0
>     Topic: test.topic    Partition: 28    Leader: 0    Replicas: 1,0    Isr: 0
>     Topic: test.topic    Partition: 29    Leader: 0    Replicas: 0,1    Isr: 0
> 
> but it does not get cleaned. I stop writing one key value pair, but still
> can see that key value pair after days.
> 
> Then I changed broker configuration on the fly:
> 
> # The minimum age of a log file to be eligible for deletion
> #log.retention.hours=168
> log.retention.minutes=16
> 
> # A size-based retention policy for logs. Segments are pruned from the log
> as long as the remaining
> # segments don't drop below log.retention.bytes.
> #log.retention.bytes=1073741824
> log.retention.bytes=1
> 
> # The maximum size of a log segment file. When this size is reached a new
> log segment will be created.
> #log.segment.bytes=1073741824
> log.segment.bytes=1
> 
> So far I haven't been able to drop away the desired key value pair with any
> of these configuration actions.
> 
> Am I trying something that is not possible or am I missing some
> configuration options or should this work?
> 
> br.
> 
> Asko Alhoniemi
> 





Re: Serve interactive queries from standby replicas

2017-10-06 Thread Matthias J. Sax
No, that is not possible.

Note: standby replicas might "lag" behind the active store, and thus,
you would get different results if querying standby replicas were
supported.

We might add this functionality at some point though -- but there are no
concrete plans atm. Contributions are always welcome of course :)


-Matthias

On 10/6/17 4:18 AM, Stas Chizhov wrote:
> Hi
> 
> Is there a way to serve read read requests from standby replicas?
> StreamsMeatadata does not seem to provide standby end points as far as I
> can see.
> 
> Thank you,
> Stas
> 





Re: Serve interactive queries from standby replicas

2017-10-06 Thread Damian Guy
Hi,

No that isn't supported.

Thanks,
Damian

On Fri, 6 Oct 2017 at 04:18 Stas Chizhov  wrote:

> Hi
>
> Is there a way to serve read read requests from standby replicas?
> StreamsMeatadata does not seem to provide standby end points as far as I
> can see.
>
> Thank you,
> Stas
>


Re: Scala API

2017-10-06 Thread Matthias J. Sax
The new clients (producer/consumer/admin) as well as the Connect and Streams
APIs are only available in Java.

You can use the Streams API from Scala though. There is one thing you need
to consider:
https://docs.confluent.io/current/streams/faq.html#scala-compile-error-no-type-parameter-java-defined-trait-is-invariant-in-type-t


-Matthias

On 10/5/17 8:45 PM, Josh Maidana wrote:
> Hello
> 
> We are integrating KAFKA with an AKKA system written in Scala. Is there a
> Scala API available for KAFKA? Is the best option to use AKKA KAFKA Stream?
> 





Re: kafka broker losing offsets?

2017-10-06 Thread Ted Yu
A brief search brought me to related discussion on this JIRA:

https://issues.apache.org/jira/browse/KAFKA-3806?focusedCommentId=15906349&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15906349

FYI

On Fri, Oct 6, 2017 at 10:37 AM, Manikumar 
wrote:

> @Ted  Yes, I think we should add log warning message.
>
> On Fri, Oct 6, 2017 at 9:50 PM, Vincent Dautremont <
> vincent.dautrem...@olamobile.com.invalid> wrote:
>
> > is there a way to read messages on a topic partition from a specific node
> > we that we choose (and not by the topic partition leader) ?
> > I would like to read myself that each of the __consumer_offsets partition
> > replicas have the same consumer group offset written in it in it.
> >
> > On Fri, Oct 6, 2017 at 6:08 PM, Dmitriy Vsekhvalnov <
> > dvsekhval...@gmail.com>
> > wrote:
> >
> > > Stas:
> > >
> > > we rely on spring-kafka, it  commits offsets "manually" for us after
> > event
> > > handler completed. So it's kind of automatic once there is constant
> > stream
> > > of events (no idle time, which is true for us). Though it's not what
> pure
> > > kafka-client calls "automatic" (flush commits at fixed intervals).
> > >
> > > On Fri, Oct 6, 2017 at 7:04 PM, Stas Chizhov 
> wrote:
> > >
> > > > You don't have autocmmit enables that means you commit offsets
> > yourself -
> > > > correct? If you store them per partition somewhere and fail to clean
> it
> > > up
> > > > upon rebalance next time the consumer gets this partition assigned
> > during
> > > > next rebalance it can commit old stale offset- can this be the case?
> > > >
> > > >
> > > > fre 6 okt. 2017 kl. 17:59 skrev Dmitriy Vsekhvalnov <
> > > > dvsekhval...@gmail.com
> > > > >:
> > > >
> > > > > Reprocessing same events again - is fine for us (idempotent). While
> > > > loosing
> > > > > data is more critical.
> > > > >
> > > > > What are reasons of such behaviour? Consumers are never idle,
> always
> > > > > commiting, probably something wrong with broker setup then?
> > > > >
> > > > > On Fri, Oct 6, 2017 at 6:58 PM, Ted Yu 
> wrote:
> > > > >
> > > > > > Stas:
> > > > > > bq.  using anything but none is not really an option
> > > > > >
> > > > > > If you have time, can you explain a bit more ?
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > On Fri, Oct 6, 2017 at 8:55 AM, Stas Chizhov  >
> > > > wrote:
> > > > > >
> > > > > > > If you set auto.offset.reset to none next time it happens you
> > will
> > > be
> > > > > in
> > > > > > > much better position to find out what happens. Also in general
> > with
> > > > > > current
> > > > > > > semantics of offset reset policy IMO using anything but none is
> > not
> > > > > > really
> > > > > > > an option unless it is ok for consumer to loose some data
> > (latest)
> > > or
> > > > > > > reprocess it second time (earliest).
> > > > > > >
> > > > > > > fre 6 okt. 2017 kl. 17:44 skrev Ted Yu :
> > > > > > >
> > > > > > > > Should Kafka log warning if log.retention.hours is lower than
> > > > number
> > > > > of
> > > > > > > > hours specified by offsets.retention.minutes ?
> > > > > > > >
> > > > > > > > On Fri, Oct 6, 2017 at 8:35 AM, Manikumar <
> > > > manikumar.re...@gmail.com
> > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > normally, log.retention.hours (168hrs)  should be higher
> than
> > > > > > > > > offsets.retention.minutes (336 hrs)?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Fri, Oct 6, 2017 at 8:58 PM, Dmitriy Vsekhvalnov <
> > > > > > > > > dvsekhval...@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Ted,
> > > > > > > > > >
> > > > > > > > > > Broker: v0.11.0.0
> > > > > > > > > >
> > > > > > > > > > Consumer:
> > > > > > > > > > kafka-clients v0.11.0.0
> > > > > > > > > > auto.offset.reset = earliest
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Fri, Oct 6, 2017 at 6:24 PM, Ted Yu <
> > yuzhih...@gmail.com>
> > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > What's the value for auto.offset.reset  ?
> > > > > > > > > > >
> > > > > > > > > > > Which release are you using ?
> > > > > > > > > > >
> > > > > > > > > > > Cheers
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Oct 6, 2017 at 7:52 AM, Dmitriy Vsekhvalnov <
> > > > > > > > > > > dvsekhval...@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi all,
> > > > > > > > > > > >
> > > > > > > > > > > > we several time faced situation where consumer-group
> > > > started
> > > > > to
> > > > > > > > > > > re-consume
> > > > > > > > > > > > old events from beginning. Here is scenario:
> > > > > > > > > > > >
> > > > > > > > > > > > 1. x3 broker kafka cluster on top of x3 node
> zookeeper
> > > > > > > > > > > > 2. RF=3 for all topics
> > > > > > > > > > > > 3. log.retention.hours=168 and
> > > > > 

Re: kafka broker losing offsets?

2017-10-06 Thread Manikumar
@Ted  Yes, I think we should add a log warning message.
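
For context, these are the two settings being compared, with the values
mentioned earlier in this thread (server.properties sketch):

    # Log segments are kept for 7 days...
    log.retention.hours=168
    # ...while committed offsets are kept for 14 days (20160 minutes). Keeping the
    # offset retention at least as long as the log retention avoids the case where
    # offsets expire first and a restarting consumer falls back to auto.offset.reset.
    offsets.retention.minutes=20160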

On Fri, Oct 6, 2017 at 9:50 PM, Vincent Dautremont <
vincent.dautrem...@olamobile.com.invalid> wrote:

> is there a way to read messages on a topic partition from a specific node
> we that we choose (and not by the topic partition leader) ?
> I would like to read myself that each of the __consumer_offsets partition
> replicas have the same consumer group offset written in it in it.
>
> On Fri, Oct 6, 2017 at 6:08 PM, Dmitriy Vsekhvalnov <
> dvsekhval...@gmail.com>
> wrote:
>
> > Stas:
> >
> > we rely on spring-kafka, it  commits offsets "manually" for us after
> event
> > handler completed. So it's kind of automatic once there is constant
> stream
> > of events (no idle time, which is true for us). Though it's not what pure
> > kafka-client calls "automatic" (flush commits at fixed intervals).
> >
> > On Fri, Oct 6, 2017 at 7:04 PM, Stas Chizhov  wrote:
> >
> > > You don't have autocmmit enables that means you commit offsets
> yourself -
> > > correct? If you store them per partition somewhere and fail to clean it
> > up
> > > upon rebalance next time the consumer gets this partition assigned
> during
> > > next rebalance it can commit old stale offset- can this be the case?
> > >
> > >
> > > fre 6 okt. 2017 kl. 17:59 skrev Dmitriy Vsekhvalnov <
> > > dvsekhval...@gmail.com
> > > >:
> > >
> > > > Reprocessing same events again - is fine for us (idempotent). While
> > > loosing
> > > > data is more critical.
> > > >
> > > > What are reasons of such behaviour? Consumers are never idle, always
> > > > commiting, probably something wrong with broker setup then?
> > > >
> > > > On Fri, Oct 6, 2017 at 6:58 PM, Ted Yu  wrote:
> > > >
> > > > > Stas:
> > > > > bq.  using anything but none is not really an option
> > > > >
> > > > > If you have time, can you explain a bit more ?
> > > > >
> > > > > Thanks
> > > > >
> > > > > On Fri, Oct 6, 2017 at 8:55 AM, Stas Chizhov 
> > > wrote:
> > > > >
> > > > > > If you set auto.offset.reset to none next time it happens you
> will
> > be
> > > > in
> > > > > > much better position to find out what happens. Also in general
> with
> > > > > current
> > > > > > semantics of offset reset policy IMO using anything but none is
> not
> > > > > really
> > > > > > an option unless it is ok for consumer to loose some data
> (latest)
> > or
> > > > > > reprocess it second time (earliest).
> > > > > >
> > > > > > fre 6 okt. 2017 kl. 17:44 skrev Ted Yu :
> > > > > >
> > > > > > > Should Kafka log warning if log.retention.hours is lower than
> > > number
> > > > of
> > > > > > > hours specified by offsets.retention.minutes ?
> > > > > > >
> > > > > > > On Fri, Oct 6, 2017 at 8:35 AM, Manikumar <
> > > manikumar.re...@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > normally, log.retention.hours (168hrs)  should be higher than
> > > > > > > > offsets.retention.minutes (336 hrs)?
> > > > > > > >
> > > > > > > >
> > > > > > > > On Fri, Oct 6, 2017 at 8:58 PM, Dmitriy Vsekhvalnov <
> > > > > > > > dvsekhval...@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Ted,
> > > > > > > > >
> > > > > > > > > Broker: v0.11.0.0
> > > > > > > > >
> > > > > > > > > Consumer:
> > > > > > > > > kafka-clients v0.11.0.0
> > > > > > > > > auto.offset.reset = earliest
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Fri, Oct 6, 2017 at 6:24 PM, Ted Yu <
> yuzhih...@gmail.com>
> > > > > wrote:
> > > > > > > > >
> > > > > > > > > > What's the value for auto.offset.reset  ?
> > > > > > > > > >
> > > > > > > > > > Which release are you using ?
> > > > > > > > > >
> > > > > > > > > > Cheers
> > > > > > > > > >
> > > > > > > > > > On Fri, Oct 6, 2017 at 7:52 AM, Dmitriy Vsekhvalnov <
> > > > > > > > > > dvsekhval...@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi all,
> > > > > > > > > > >
> > > > > > > > > > > we several time faced situation where consumer-group
> > > started
> > > > to
> > > > > > > > > > re-consume
> > > > > > > > > > > old events from beginning. Here is scenario:
> > > > > > > > > > >
> > > > > > > > > > > 1. x3 broker kafka cluster on top of x3 node zookeeper
> > > > > > > > > > > 2. RF=3 for all topics
> > > > > > > > > > > 3. log.retention.hours=168 and
> > > > offsets.retention.minutes=20160
> > > > > > > > > > > 4. running sustainable load (pushing events)
> > > > > > > > > > > 5. doing disaster testing by randomly shutting down 1
> of
> > 3
> > > > > broker
> > > > > > > > nodes
> > > > > > > > > > > (then provision new broker back)
> > > > > > > > > > >
> > > > > > > > > > > Several times after bouncing broker we faced situation
> > > where
> > > > > > > consumer
> > > > > > > > > > group
> > > > > > > > > > > started to re-consume old events.
> > > > > > > > > > >
> > > > > > > > > > > consumer group:
> > > > > > > > > > >
> > > > > > > > > > > 1. 

Re: Kafka Streams with embedded RocksDB

2017-10-06 Thread Sachin Mittal
Well, as far as I remember this was an old issue with 0.10.0.x or something.
In 0.10.2.x the librocksdbjni DLL is part of rocksdbjni-5.0.1.jar, so there is
no need to build it separately for Windows.

Did something change in 0.11.x?

Thanks
Sachin


On Fri, Oct 6, 2017 at 10:00 PM, Ted Yu  wrote:

> I assume you have read
> https://github.com/facebook/rocksdb/wiki/Building-on-Windows
>
> Please also see https://github.com/facebook/rocksdb/issues/2531
>
> BTW your question should be directed to rocksdb forum.
>
>
>
> On Fri, Oct 6, 2017 at 6:39 AM, Valentin Forst  wrote:
>
> > Hi there,
> >
> > We have Kafka 0.11.0.0 running to DC/OS. So, I am developing on windows
> > (it's not my fault ;-)) and have to ship to a Linux-Container (DC/OS) in
> > order to run a java-app.
> >
> > What is the best way to use Kafka-streams maven dependency w.r.t. RocksDB
> > in order to work on both OSs?
> > Currently I am getting the following error:
> >
> > UnsatisfiedLinkError … Can’t find dependency library. Trying to find
> > rocksdbjni,…dll
> >
> > Thanks in advance
> > Valentin
>


Re: Kafka Streams with embedded RocksDB

2017-10-06 Thread Ted Yu
I assume you have read
https://github.com/facebook/rocksdb/wiki/Building-on-Windows

Please also see https://github.com/facebook/rocksdb/issues/2531

BTW your question should be directed to rocksdb forum.



On Fri, Oct 6, 2017 at 6:39 AM, Valentin Forst  wrote:

> Hi there,
>
> We have Kafka 0.11.0.0 running to DC/OS. So, I am developing on windows
> (it's not my fault ;-)) and have to ship to a Linux-Container (DC/OS) in
> order to run a java-app.
>
> What is the best way to use Kafka-streams maven dependency w.r.t. RocksDB
> in order to work on both OSs?
> Currently I am getting the following error:
>
> UnsatisfiedLinkError … Can’t find dependency library. Trying to find
> rocksdbjni,…dll
>
> Thanks in advance
> Valentin


Re: kafka broker losing offsets?

2017-10-06 Thread Vincent Dautremont
Is there a way to read messages on a topic partition from a specific node
that we choose (and not from the topic partition leader)?
I would like to verify myself that each of the __consumer_offsets partition
replicas has the same consumer group offset written in it.

On Fri, Oct 6, 2017 at 6:08 PM, Dmitriy Vsekhvalnov 
wrote:

> Stas:
>
> we rely on spring-kafka, it  commits offsets "manually" for us after event
> handler completed. So it's kind of automatic once there is constant stream
> of events (no idle time, which is true for us). Though it's not what pure
> kafka-client calls "automatic" (flush commits at fixed intervals).
>
> On Fri, Oct 6, 2017 at 7:04 PM, Stas Chizhov  wrote:
>
> > You don't have autocmmit enables that means you commit offsets yourself -
> > correct? If you store them per partition somewhere and fail to clean it
> up
> > upon rebalance next time the consumer gets this partition assigned during
> > next rebalance it can commit old stale offset- can this be the case?
> >
> >
> > fre 6 okt. 2017 kl. 17:59 skrev Dmitriy Vsekhvalnov <
> > dvsekhval...@gmail.com
> > >:
> >
> > > Reprocessing same events again - is fine for us (idempotent). While
> > loosing
> > > data is more critical.
> > >
> > > What are reasons of such behaviour? Consumers are never idle, always
> > > commiting, probably something wrong with broker setup then?
> > >
> > > On Fri, Oct 6, 2017 at 6:58 PM, Ted Yu  wrote:
> > >
> > > > Stas:
> > > > bq.  using anything but none is not really an option
> > > >
> > > > If you have time, can you explain a bit more ?
> > > >
> > > > Thanks
> > > >
> > > > On Fri, Oct 6, 2017 at 8:55 AM, Stas Chizhov 
> > wrote:
> > > >
> > > > > If you set auto.offset.reset to none next time it happens you will
> be
> > > in
> > > > > much better position to find out what happens. Also in general with
> > > > current
> > > > > semantics of offset reset policy IMO using anything but none is not
> > > > really
> > > > > an option unless it is ok for consumer to loose some data (latest)
> or
> > > > > reprocess it second time (earliest).
> > > > >
> > > > > fre 6 okt. 2017 kl. 17:44 skrev Ted Yu :
> > > > >
> > > > > > Should Kafka log warning if log.retention.hours is lower than
> > number
> > > of
> > > > > > hours specified by offsets.retention.minutes ?
> > > > > >
> > > > > > On Fri, Oct 6, 2017 at 8:35 AM, Manikumar <
> > manikumar.re...@gmail.com
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > normally, log.retention.hours (168hrs)  should be higher than
> > > > > > > offsets.retention.minutes (336 hrs)?
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Oct 6, 2017 at 8:58 PM, Dmitriy Vsekhvalnov <
> > > > > > > dvsekhval...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Ted,
> > > > > > > >
> > > > > > > > Broker: v0.11.0.0
> > > > > > > >
> > > > > > > > Consumer:
> > > > > > > > kafka-clients v0.11.0.0
> > > > > > > > auto.offset.reset = earliest
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Fri, Oct 6, 2017 at 6:24 PM, Ted Yu 
> > > > wrote:
> > > > > > > >
> > > > > > > > > What's the value for auto.offset.reset  ?
> > > > > > > > >
> > > > > > > > > Which release are you using ?
> > > > > > > > >
> > > > > > > > > Cheers
> > > > > > > > >
> > > > > > > > > On Fri, Oct 6, 2017 at 7:52 AM, Dmitriy Vsekhvalnov <
> > > > > > > > > dvsekhval...@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi all,
> > > > > > > > > >
> > > > > > > > > > we several time faced situation where consumer-group
> > started
> > > to
> > > > > > > > > re-consume
> > > > > > > > > > old events from beginning. Here is scenario:
> > > > > > > > > >
> > > > > > > > > > 1. x3 broker kafka cluster on top of x3 node zookeeper
> > > > > > > > > > 2. RF=3 for all topics
> > > > > > > > > > 3. log.retention.hours=168 and
> > > offsets.retention.minutes=20160
> > > > > > > > > > 4. running sustainable load (pushing events)
> > > > > > > > > > 5. doing disaster testing by randomly shutting down 1 of
> 3
> > > > broker
> > > > > > > nodes
> > > > > > > > > > (then provision new broker back)
> > > > > > > > > >
> > > > > > > > > > Several times after bouncing broker we faced situation
> > where
> > > > > > consumer
> > > > > > > > > group
> > > > > > > > > > started to re-consume old events.
> > > > > > > > > >
> > > > > > > > > > consumer group:
> > > > > > > > > >
> > > > > > > > > > 1. enable.auto.commit = false
> > > > > > > > > > 2. tried graceful group shutdown, kill -9 and terminating
> > AWS
> > > > > nodes
> > > > > > > > > > 3. never experienced re-consumption for given cases.
> > > > > > > > > >
> > > > > > > > > > What can cause that old events re-consumption? Is it
> > related
> > > to
> > > > > > > > bouncing
> > > > > > > > > > one of brokers? What to search in a logs? Any broker
> > settings
> > > > to
> 

Re: kafka broker losing offsets?

2017-10-06 Thread Dmitriy Vsekhvalnov
Stas:

we rely on spring-kafka; it commits offsets "manually" for us after the event
handler completes. So it's kind of automatic once there is a constant stream
of events (no idle time, which is true for us). Though it's not what the pure
kafka-client calls "automatic" (flushing commits at fixed intervals).
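
The same "commit only after the handler finished" pattern with the plain Java
client looks roughly like this (topic name and handler are placeholders; the
consumer is assumed to have enable.auto.commit=false):

    import java.util.Collections;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class CommitAfterHandling {
        public static void run(KafkaConsumer<String, String> consumer) {
            consumer.subscribe(Collections.singletonList("events"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(1000);
                for (ConsumerRecord<String, String> record : records) {
                    handle(record);        // business logic first ...
                }
                consumer.commitSync();     // ... then commit, so a crash replays instead of losing data
            }
        }

        private static void handle(ConsumerRecord<String, String> record) {
            // application-specific processing (idempotent in the poster's case)
        }
    }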

On Fri, Oct 6, 2017 at 7:04 PM, Stas Chizhov  wrote:

> You don't have autocmmit enables that means you commit offsets yourself -
> correct? If you store them per partition somewhere and fail to clean it up
> upon rebalance next time the consumer gets this partition assigned during
> next rebalance it can commit old stale offset- can this be the case?
>
>
> fre 6 okt. 2017 kl. 17:59 skrev Dmitriy Vsekhvalnov <
> dvsekhval...@gmail.com
> >:
>
> > Reprocessing same events again - is fine for us (idempotent). While
> loosing
> > data is more critical.
> >
> > What are reasons of such behaviour? Consumers are never idle, always
> > commiting, probably something wrong with broker setup then?
> >
> > On Fri, Oct 6, 2017 at 6:58 PM, Ted Yu  wrote:
> >
> > > Stas:
> > > bq.  using anything but none is not really an option
> > >
> > > If you have time, can you explain a bit more ?
> > >
> > > Thanks
> > >
> > > On Fri, Oct 6, 2017 at 8:55 AM, Stas Chizhov 
> wrote:
> > >
> > > > If you set auto.offset.reset to none next time it happens you will be
> > in
> > > > much better position to find out what happens. Also in general with
> > > current
> > > > semantics of offset reset policy IMO using anything but none is not
> > > really
> > > > an option unless it is ok for consumer to loose some data (latest) or
> > > > reprocess it second time (earliest).
> > > >
> > > > fre 6 okt. 2017 kl. 17:44 skrev Ted Yu :
> > > >
> > > > > Should Kafka log warning if log.retention.hours is lower than
> number
> > of
> > > > > hours specified by offsets.retention.minutes ?
> > > > >
> > > > > On Fri, Oct 6, 2017 at 8:35 AM, Manikumar <
> manikumar.re...@gmail.com
> > >
> > > > > wrote:
> > > > >
> > > > > > normally, log.retention.hours (168hrs)  should be higher than
> > > > > > offsets.retention.minutes (336 hrs)?
> > > > > >
> > > > > >
> > > > > > On Fri, Oct 6, 2017 at 8:58 PM, Dmitriy Vsekhvalnov <
> > > > > > dvsekhval...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Ted,
> > > > > > >
> > > > > > > Broker: v0.11.0.0
> > > > > > >
> > > > > > > Consumer:
> > > > > > > kafka-clients v0.11.0.0
> > > > > > > auto.offset.reset = earliest
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Oct 6, 2017 at 6:24 PM, Ted Yu 
> > > wrote:
> > > > > > >
> > > > > > > > What's the value for auto.offset.reset  ?
> > > > > > > >
> > > > > > > > Which release are you using ?
> > > > > > > >
> > > > > > > > Cheers
> > > > > > > >
> > > > > > > > On Fri, Oct 6, 2017 at 7:52 AM, Dmitriy Vsekhvalnov <
> > > > > > > > dvsekhval...@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi all,
> > > > > > > > >
> > > > > > > > > we several time faced situation where consumer-group
> started
> > to
> > > > > > > > re-consume
> > > > > > > > > old events from beginning. Here is scenario:
> > > > > > > > >
> > > > > > > > > 1. x3 broker kafka cluster on top of x3 node zookeeper
> > > > > > > > > 2. RF=3 for all topics
> > > > > > > > > 3. log.retention.hours=168 and
> > offsets.retention.minutes=20160
> > > > > > > > > 4. running sustainable load (pushing events)
> > > > > > > > > 5. doing disaster testing by randomly shutting down 1 of 3
> > > broker
> > > > > > nodes
> > > > > > > > > (then provision new broker back)
> > > > > > > > >
> > > > > > > > > Several times after bouncing broker we faced situation
> where
> > > > > consumer
> > > > > > > > group
> > > > > > > > > started to re-consume old events.
> > > > > > > > >
> > > > > > > > > consumer group:
> > > > > > > > >
> > > > > > > > > 1. enable.auto.commit = false
> > > > > > > > > 2. tried graceful group shutdown, kill -9 and terminating
> AWS
> > > > nodes
> > > > > > > > > 3. never experienced re-consumption for given cases.
> > > > > > > > >
> > > > > > > > > What can cause that old events re-consumption? Is it
> related
> > to
> > > > > > > bouncing
> > > > > > > > > one of brokers? What to search in a logs? Any broker
> settings
> > > to
> > > > > try?
> > > > > > > > >
> > > > > > > > > Thanks in advance.
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: kafka broker loosing offsets?

2017-10-06 Thread Stas Chizhov
You don't have auto-commit enabled, which means you commit offsets yourself -
correct? If you store them per partition somewhere and fail to clean that state up
upon rebalance, then the next time the consumer gets that partition assigned it
can commit an old, stale offset - can this be the case?
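
To make the hypothesis concrete: if the application keeps its own per-partition offset
map, one way to avoid committing stale entries is to drop them when partitions are
revoked. A minimal sketch (the pendingOffsets map and the class name are hypothetical,
not something from this thread):

import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.common.TopicPartition;

public class OffsetCacheCleaner implements ConsumerRebalanceListener {
    // hypothetical per-partition offset cache kept by the application
    private final Map<TopicPartition, Long> pendingOffsets = new HashMap<>();

    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        // drop cached offsets for partitions we no longer own, so a later
        // rebalance cannot make us commit a stale value for them
        for (TopicPartition tp : partitions) {
            pendingOffsets.remove(tp);
        }
    }

    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        // nothing to do here; the consumer resumes from the committed offsets
    }
}

The listener would be registered via consumer.subscribe(topics, new OffsetCacheCleaner()).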


fre 6 okt. 2017 kl. 17:59 skrev Dmitriy Vsekhvalnov :

> Reprocessing same events again - is fine for us (idempotent). While loosing
> data is more critical.
>
> What are reasons of such behaviour? Consumers are never idle, always
> commiting, probably something wrong with broker setup then?
>
> On Fri, Oct 6, 2017 at 6:58 PM, Ted Yu  wrote:
>
> > Stas:
> > bq.  using anything but none is not really an option
> >
> > If you have time, can you explain a bit more ?
> >
> > Thanks
> >
> > On Fri, Oct 6, 2017 at 8:55 AM, Stas Chizhov  wrote:
> >
> > > If you set auto.offset.reset to none next time it happens you will be
> in
> > > much better position to find out what happens. Also in general with
> > current
> > > semantics of offset reset policy IMO using anything but none is not
> > really
> > > an option unless it is ok for consumer to loose some data (latest) or
> > > reprocess it second time (earliest).
> > >
> > > fre 6 okt. 2017 kl. 17:44 skrev Ted Yu :
> > >
> > > > Should Kafka log warning if log.retention.hours is lower than number
> of
> > > > hours specified by offsets.retention.minutes ?
> > > >
> > > > On Fri, Oct 6, 2017 at 8:35 AM, Manikumar  >
> > > > wrote:
> > > >
> > > > > normally, log.retention.hours (168hrs)  should be higher than
> > > > > offsets.retention.minutes (336 hrs)?
> > > > >
> > > > >
> > > > > On Fri, Oct 6, 2017 at 8:58 PM, Dmitriy Vsekhvalnov <
> > > > > dvsekhval...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi Ted,
> > > > > >
> > > > > > Broker: v0.11.0.0
> > > > > >
> > > > > > Consumer:
> > > > > > kafka-clients v0.11.0.0
> > > > > > auto.offset.reset = earliest
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Fri, Oct 6, 2017 at 6:24 PM, Ted Yu 
> > wrote:
> > > > > >
> > > > > > > What's the value for auto.offset.reset  ?
> > > > > > >
> > > > > > > Which release are you using ?
> > > > > > >
> > > > > > > Cheers
> > > > > > >
> > > > > > > On Fri, Oct 6, 2017 at 7:52 AM, Dmitriy Vsekhvalnov <
> > > > > > > dvsekhval...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > we several time faced situation where consumer-group started
> to
> > > > > > > re-consume
> > > > > > > > old events from beginning. Here is scenario:
> > > > > > > >
> > > > > > > > 1. x3 broker kafka cluster on top of x3 node zookeeper
> > > > > > > > 2. RF=3 for all topics
> > > > > > > > 3. log.retention.hours=168 and
> offsets.retention.minutes=20160
> > > > > > > > 4. running sustainable load (pushing events)
> > > > > > > > 5. doing disaster testing by randomly shutting down 1 of 3
> > broker
> > > > > nodes
> > > > > > > > (then provision new broker back)
> > > > > > > >
> > > > > > > > Several times after bouncing broker we faced situation where
> > > > consumer
> > > > > > > group
> > > > > > > > started to re-consume old events.
> > > > > > > >
> > > > > > > > consumer group:
> > > > > > > >
> > > > > > > > 1. enable.auto.commit = false
> > > > > > > > 2. tried graceful group shutdown, kill -9 and terminating AWS
> > > nodes
> > > > > > > > 3. never experienced re-consumption for given cases.
> > > > > > > >
> > > > > > > > What can cause that old events re-consumption? Is it related
> to
> > > > > > bouncing
> > > > > > > > one of brokers? What to search in a logs? Any broker settings
> > to
> > > > try?
> > > > > > > >
> > > > > > > > Thanks in advance.
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: kafka broker loosing offsets?

2017-10-06 Thread Dmitriy Vsekhvalnov
Reprocessing the same events again is fine for us (idempotent), while losing
data is more critical.

What are the reasons for such behaviour? The consumers are never idle and are always
committing, so is something probably wrong with the broker setup then?

On Fri, Oct 6, 2017 at 6:58 PM, Ted Yu  wrote:

> Stas:
> bq.  using anything but none is not really an option
>
> If you have time, can you explain a bit more ?
>
> Thanks
>
> On Fri, Oct 6, 2017 at 8:55 AM, Stas Chizhov  wrote:
>
> > If you set auto.offset.reset to none next time it happens you will be in
> > much better position to find out what happens. Also in general with
> current
> > semantics of offset reset policy IMO using anything but none is not
> really
> > an option unless it is ok for consumer to loose some data (latest) or
> > reprocess it second time (earliest).
> >
> > fre 6 okt. 2017 kl. 17:44 skrev Ted Yu :
> >
> > > Should Kafka log warning if log.retention.hours is lower than number of
> > > hours specified by offsets.retention.minutes ?
> > >
> > > On Fri, Oct 6, 2017 at 8:35 AM, Manikumar 
> > > wrote:
> > >
> > > > normally, log.retention.hours (168hrs)  should be higher than
> > > > offsets.retention.minutes (336 hrs)?
> > > >
> > > >
> > > > On Fri, Oct 6, 2017 at 8:58 PM, Dmitriy Vsekhvalnov <
> > > > dvsekhval...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi Ted,
> > > > >
> > > > > Broker: v0.11.0.0
> > > > >
> > > > > Consumer:
> > > > > kafka-clients v0.11.0.0
> > > > > auto.offset.reset = earliest
> > > > >
> > > > >
> > > > >
> > > > > On Fri, Oct 6, 2017 at 6:24 PM, Ted Yu 
> wrote:
> > > > >
> > > > > > What's the value for auto.offset.reset  ?
> > > > > >
> > > > > > Which release are you using ?
> > > > > >
> > > > > > Cheers
> > > > > >
> > > > > > On Fri, Oct 6, 2017 at 7:52 AM, Dmitriy Vsekhvalnov <
> > > > > > dvsekhval...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > we several time faced situation where consumer-group started to
> > > > > > re-consume
> > > > > > > old events from beginning. Here is scenario:
> > > > > > >
> > > > > > > 1. x3 broker kafka cluster on top of x3 node zookeeper
> > > > > > > 2. RF=3 for all topics
> > > > > > > 3. log.retention.hours=168 and offsets.retention.minutes=20160
> > > > > > > 4. running sustainable load (pushing events)
> > > > > > > 5. doing disaster testing by randomly shutting down 1 of 3
> broker
> > > > nodes
> > > > > > > (then provision new broker back)
> > > > > > >
> > > > > > > Several times after bouncing broker we faced situation where
> > > consumer
> > > > > > group
> > > > > > > started to re-consume old events.
> > > > > > >
> > > > > > > consumer group:
> > > > > > >
> > > > > > > 1. enable.auto.commit = false
> > > > > > > 2. tried graceful group shutdown, kill -9 and terminating AWS
> > nodes
> > > > > > > 3. never experienced re-consumption for given cases.
> > > > > > >
> > > > > > > What can cause that old events re-consumption? Is it related to
> > > > > bouncing
> > > > > > > one of brokers? What to search in a logs? Any broker settings
> to
> > > try?
> > > > > > >
> > > > > > > Thanks in advance.
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: kafka broker loosing offsets?

2017-10-06 Thread Ted Yu
Stas:
bq.  using anything but none is not really an option

If you have time, can you explain a bit more ?

Thanks

On Fri, Oct 6, 2017 at 8:55 AM, Stas Chizhov  wrote:

> If you set auto.offset.reset to none next time it happens you will be in
> much better position to find out what happens. Also in general with current
> semantics of offset reset policy IMO using anything but none is not really
> an option unless it is ok for consumer to loose some data (latest) or
> reprocess it second time (earliest).
>
> fre 6 okt. 2017 kl. 17:44 skrev Ted Yu :
>
> > Should Kafka log warning if log.retention.hours is lower than number of
> > hours specified by offsets.retention.minutes ?
> >
> > On Fri, Oct 6, 2017 at 8:35 AM, Manikumar 
> > wrote:
> >
> > > normally, log.retention.hours (168hrs)  should be higher than
> > > offsets.retention.minutes (336 hrs)?
> > >
> > >
> > > On Fri, Oct 6, 2017 at 8:58 PM, Dmitriy Vsekhvalnov <
> > > dvsekhval...@gmail.com>
> > > wrote:
> > >
> > > > Hi Ted,
> > > >
> > > > Broker: v0.11.0.0
> > > >
> > > > Consumer:
> > > > kafka-clients v0.11.0.0
> > > > auto.offset.reset = earliest
> > > >
> > > >
> > > >
> > > > On Fri, Oct 6, 2017 at 6:24 PM, Ted Yu  wrote:
> > > >
> > > > > What's the value for auto.offset.reset  ?
> > > > >
> > > > > Which release are you using ?
> > > > >
> > > > > Cheers
> > > > >
> > > > > On Fri, Oct 6, 2017 at 7:52 AM, Dmitriy Vsekhvalnov <
> > > > > dvsekhval...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > we several time faced situation where consumer-group started to
> > > > > re-consume
> > > > > > old events from beginning. Here is scenario:
> > > > > >
> > > > > > 1. x3 broker kafka cluster on top of x3 node zookeeper
> > > > > > 2. RF=3 for all topics
> > > > > > 3. log.retention.hours=168 and offsets.retention.minutes=20160
> > > > > > 4. running sustainable load (pushing events)
> > > > > > 5. doing disaster testing by randomly shutting down 1 of 3 broker
> > > nodes
> > > > > > (then provision new broker back)
> > > > > >
> > > > > > Several times after bouncing broker we faced situation where
> > consumer
> > > > > group
> > > > > > started to re-consume old events.
> > > > > >
> > > > > > consumer group:
> > > > > >
> > > > > > 1. enable.auto.commit = false
> > > > > > 2. tried graceful group shutdown, kill -9 and terminating AWS
> nodes
> > > > > > 3. never experienced re-consumption for given cases.
> > > > > >
> > > > > > What can cause that old events re-consumption? Is it related to
> > > > bouncing
> > > > > > one of brokers? What to search in a logs? Any broker settings to
> > try?
> > > > > >
> > > > > > Thanks in advance.
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: kafka broker loosing offsets?

2017-10-06 Thread Stas Chizhov
If you set auto.offset.reset to none, the next time it happens you will be in a
much better position to find out what is going on. Also, in general, with the current
semantics of the offset reset policy, IMO using anything but none is not really
an option unless it is OK for the consumer to lose some data (latest) or
reprocess it a second time (earliest).
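
To make the suggestion concrete, here is a minimal sketch (broker address, group and
topic names are made up) of a consumer configured with auto.offset.reset=none. With no
valid committed offset, poll() throws NoOffsetForPartitionException instead of silently
jumping to the earliest or latest offset, so the problem becomes visible:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.NoOffsetForPartitionException;

public class NoResetConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");   // placeholder address
        props.put("group.id", "my-group");                 // placeholder group
        props.put("enable.auto.commit", "false");
        props.put("auto.offset.reset", "none");            // fail instead of resetting
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("events"));  // placeholder topic
            while (true) {
                consumer.poll(100);
                // ... process records and commit ...
            }
        } catch (NoOffsetForPartitionException e) {
            // no committed offset was found for one of our partitions:
            // alert and decide explicitly where to restart from
            System.err.println("No committed offset: " + e.getMessage());
        }
    }
}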

fre 6 okt. 2017 kl. 17:44 skrev Ted Yu :

> Should Kafka log warning if log.retention.hours is lower than number of
> hours specified by offsets.retention.minutes ?
>
> On Fri, Oct 6, 2017 at 8:35 AM, Manikumar 
> wrote:
>
> > normally, log.retention.hours (168hrs)  should be higher than
> > offsets.retention.minutes (336 hrs)?
> >
> >
> > On Fri, Oct 6, 2017 at 8:58 PM, Dmitriy Vsekhvalnov <
> > dvsekhval...@gmail.com>
> > wrote:
> >
> > > Hi Ted,
> > >
> > > Broker: v0.11.0.0
> > >
> > > Consumer:
> > > kafka-clients v0.11.0.0
> > > auto.offset.reset = earliest
> > >
> > >
> > >
> > > On Fri, Oct 6, 2017 at 6:24 PM, Ted Yu  wrote:
> > >
> > > > What's the value for auto.offset.reset  ?
> > > >
> > > > Which release are you using ?
> > > >
> > > > Cheers
> > > >
> > > > On Fri, Oct 6, 2017 at 7:52 AM, Dmitriy Vsekhvalnov <
> > > > dvsekhval...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > we several time faced situation where consumer-group started to
> > > > re-consume
> > > > > old events from beginning. Here is scenario:
> > > > >
> > > > > 1. x3 broker kafka cluster on top of x3 node zookeeper
> > > > > 2. RF=3 for all topics
> > > > > 3. log.retention.hours=168 and offsets.retention.minutes=20160
> > > > > 4. running sustainable load (pushing events)
> > > > > 5. doing disaster testing by randomly shutting down 1 of 3 broker
> > nodes
> > > > > (then provision new broker back)
> > > > >
> > > > > Several times after bouncing broker we faced situation where
> consumer
> > > > group
> > > > > started to re-consume old events.
> > > > >
> > > > > consumer group:
> > > > >
> > > > > 1. enable.auto.commit = false
> > > > > 2. tried graceful group shutdown, kill -9 and terminating AWS nodes
> > > > > 3. never experienced re-consumption for given cases.
> > > > >
> > > > > What can cause that old events re-consumption? Is it related to
> > > bouncing
> > > > > one of brokers? What to search in a logs? Any broker settings to
> try?
> > > > >
> > > > > Thanks in advance.
> > > > >
> > > >
> > >
> >
>


Re: kafka broker loosing offsets?

2017-10-06 Thread Ted Yu
Should Kafka log a warning if log.retention.hours is lower than the number of
hours specified by offsets.retention.minutes?

On Fri, Oct 6, 2017 at 8:35 AM, Manikumar  wrote:

> normally, log.retention.hours (168hrs)  should be higher than
> offsets.retention.minutes (336 hrs)?
>
>
> On Fri, Oct 6, 2017 at 8:58 PM, Dmitriy Vsekhvalnov <
> dvsekhval...@gmail.com>
> wrote:
>
> > Hi Ted,
> >
> > Broker: v0.11.0.0
> >
> > Consumer:
> > kafka-clients v0.11.0.0
> > auto.offset.reset = earliest
> >
> >
> >
> > On Fri, Oct 6, 2017 at 6:24 PM, Ted Yu  wrote:
> >
> > > What's the value for auto.offset.reset  ?
> > >
> > > Which release are you using ?
> > >
> > > Cheers
> > >
> > > On Fri, Oct 6, 2017 at 7:52 AM, Dmitriy Vsekhvalnov <
> > > dvsekhval...@gmail.com>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > we several time faced situation where consumer-group started to
> > > re-consume
> > > > old events from beginning. Here is scenario:
> > > >
> > > > 1. x3 broker kafka cluster on top of x3 node zookeeper
> > > > 2. RF=3 for all topics
> > > > 3. log.retention.hours=168 and offsets.retention.minutes=20160
> > > > 4. running sustainable load (pushing events)
> > > > 5. doing disaster testing by randomly shutting down 1 of 3 broker
> nodes
> > > > (then provision new broker back)
> > > >
> > > > Several times after bouncing broker we faced situation where consumer
> > > group
> > > > started to re-consume old events.
> > > >
> > > > consumer group:
> > > >
> > > > 1. enable.auto.commit = false
> > > > 2. tried graceful group shutdown, kill -9 and terminating AWS nodes
> > > > 3. never experienced re-consumption for given cases.
> > > >
> > > > What can cause that old events re-consumption? Is it related to
> > bouncing
> > > > one of brokers? What to search in a logs? Any broker settings to try?
> > > >
> > > > Thanks in advance.
> > > >
> > >
> >
>


Re: kafka broker loosing offsets?

2017-10-06 Thread Vincent Dautremont
Hi,
I have the same setup as Dmitriy, and I've experienced exactly the same
issue twice in the last month
(the only difference from Dmitriy's setup is that I have librdkafka 0.9.5
clients).

It's as if the __consumer_offsets partitions were not synced but were still
reported as synced (and so the syncing would never be restarted/continued).
I've never experienced that on Kafka 0.9.x or 0.10.x clusters.
The 0.11.0.0 cluster where it happened has been upgraded to 0.11.0.1 in the hope that
it fixes this.

On Fri, Oct 6, 2017 at 5:35 PM, Manikumar  wrote:

> normally, log.retention.hours (168hrs)  should be higher than
> offsets.retention.minutes (336 hrs)?
>
>
> On Fri, Oct 6, 2017 at 8:58 PM, Dmitriy Vsekhvalnov <
> dvsekhval...@gmail.com>
> wrote:
>
> > Hi Ted,
> >
> > Broker: v0.11.0.0
> >
> > Consumer:
> > kafka-clients v0.11.0.0
> > auto.offset.reset = earliest
> >
> >
> >
> > On Fri, Oct 6, 2017 at 6:24 PM, Ted Yu  wrote:
> >
> > > What's the value for auto.offset.reset  ?
> > >
> > > Which release are you using ?
> > >
> > > Cheers
> > >
> > > On Fri, Oct 6, 2017 at 7:52 AM, Dmitriy Vsekhvalnov <
> > > dvsekhval...@gmail.com>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > we several time faced situation where consumer-group started to
> > > re-consume
> > > > old events from beginning. Here is scenario:
> > > >
> > > > 1. x3 broker kafka cluster on top of x3 node zookeeper
> > > > 2. RF=3 for all topics
> > > > 3. log.retention.hours=168 and offsets.retention.minutes=20160
> > > > 4. running sustainable load (pushing events)
> > > > 5. doing disaster testing by randomly shutting down 1 of 3 broker
> nodes
> > > > (then provision new broker back)
> > > >
> > > > Several times after bouncing broker we faced situation where consumer
> > > group
> > > > started to re-consume old events.
> > > >
> > > > consumer group:
> > > >
> > > > 1. enable.auto.commit = false
> > > > 2. tried graceful group shutdown, kill -9 and terminating AWS nodes
> > > > 3. never experienced re-consumption for given cases.
> > > >
> > > > What can cause that old events re-consumption? Is it related to
> > bouncing
> > > > one of brokers? What to search in a logs? Any broker settings to try?
> > > >
> > > > Thanks in advance.
> > > >
> > >
> >
>



Re: kafka broker loosing offsets?

2017-10-06 Thread Manikumar
normally, log.retention.hours (168 hrs) should be higher than
offsets.retention.minutes (20160 minutes = 336 hrs)?
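
For context, a hedged illustration of how these two settings interact (values are only
examples, not a recommendation): if committed offsets expire before the log segments they
point into, a consumer that rejoins after the expiry finds no valid offset and, with
auto.offset.reset=earliest, re-reads everything still in the log.

# server.properties - illustrative values only
log.retention.hours=168            # keep log data for 7 days
offsets.retention.minutes=20160    # keep committed offsets for 14 days (336 hours)

Keeping offsets.retention.minutes at least as large as the log retention avoids that
particular trap, although it does not explain an offset jump while the group is
continuously committing.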


On Fri, Oct 6, 2017 at 8:58 PM, Dmitriy Vsekhvalnov 
wrote:

> Hi Ted,
>
> Broker: v0.11.0.0
>
> Consumer:
> kafka-clients v0.11.0.0
> auto.offset.reset = earliest
>
>
>
> On Fri, Oct 6, 2017 at 6:24 PM, Ted Yu  wrote:
>
> > What's the value for auto.offset.reset  ?
> >
> > Which release are you using ?
> >
> > Cheers
> >
> > On Fri, Oct 6, 2017 at 7:52 AM, Dmitriy Vsekhvalnov <
> > dvsekhval...@gmail.com>
> > wrote:
> >
> > > Hi all,
> > >
> > > we several time faced situation where consumer-group started to
> > re-consume
> > > old events from beginning. Here is scenario:
> > >
> > > 1. x3 broker kafka cluster on top of x3 node zookeeper
> > > 2. RF=3 for all topics
> > > 3. log.retention.hours=168 and offsets.retention.minutes=20160
> > > 4. running sustainable load (pushing events)
> > > 5. doing disaster testing by randomly shutting down 1 of 3 broker nodes
> > > (then provision new broker back)
> > >
> > > Several times after bouncing broker we faced situation where consumer
> > group
> > > started to re-consume old events.
> > >
> > > consumer group:
> > >
> > > 1. enable.auto.commit = false
> > > 2. tried graceful group shutdown, kill -9 and terminating AWS nodes
> > > 3. never experienced re-consumption for given cases.
> > >
> > > What can cause that old events re-consumption? Is it related to
> bouncing
> > > one of brokers? What to search in a logs? Any broker settings to try?
> > >
> > > Thanks in advance.
> > >
> >
>


Re: kafka broker loosing offsets?

2017-10-06 Thread Dmitriy Vsekhvalnov
Hi Ted,

Broker: v0.11.0.0

Consumer:
kafka-clients v0.11.0.0
auto.offset.reset = earliest



On Fri, Oct 6, 2017 at 6:24 PM, Ted Yu  wrote:

> What's the value for auto.offset.reset  ?
>
> Which release are you using ?
>
> Cheers
>
> On Fri, Oct 6, 2017 at 7:52 AM, Dmitriy Vsekhvalnov <
> dvsekhval...@gmail.com>
> wrote:
>
> > Hi all,
> >
> > we several time faced situation where consumer-group started to
> re-consume
> > old events from beginning. Here is scenario:
> >
> > 1. x3 broker kafka cluster on top of x3 node zookeeper
> > 2. RF=3 for all topics
> > 3. log.retention.hours=168 and offsets.retention.minutes=20160
> > 4. running sustainable load (pushing events)
> > 5. doing disaster testing by randomly shutting down 1 of 3 broker nodes
> > (then provision new broker back)
> >
> > Several times after bouncing broker we faced situation where consumer
> group
> > started to re-consume old events.
> >
> > consumer group:
> >
> > 1. enable.auto.commit = false
> > 2. tried graceful group shutdown, kill -9 and terminating AWS nodes
> > 3. never experienced re-consumption for given cases.
> >
> > What can cause that old events re-consumption? Is it related to bouncing
> > one of brokers? What to search in a logs? Any broker settings to try?
> >
> > Thanks in advance.
> >
>


Re: kafka broker loosing offsets?

2017-10-06 Thread Ted Yu
What's the value for auto.offset.reset?

Which release are you using ?

Cheers

On Fri, Oct 6, 2017 at 7:52 AM, Dmitriy Vsekhvalnov 
wrote:

> Hi all,
>
> we several time faced situation where consumer-group started to re-consume
> old events from beginning. Here is scenario:
>
> 1. x3 broker kafka cluster on top of x3 node zookeeper
> 2. RF=3 for all topics
> 3. log.retention.hours=168 and offsets.retention.minutes=20160
> 4. running sustainable load (pushing events)
> 5. doing disaster testing by randomly shutting down 1 of 3 broker nodes
> (then provision new broker back)
>
> Several times after bouncing broker we faced situation where consumer group
> started to re-consume old events.
>
> consumer group:
>
> 1. enable.auto.commit = false
> 2. tried graceful group shutdown, kill -9 and terminating AWS nodes
> 3. never experienced re-consumption for given cases.
>
> What can cause that old events re-consumption? Is it related to bouncing
> one of brokers? What to search in a logs? Any broker settings to try?
>
> Thanks in advance.
>


Re: __consumers topic is overloaded

2017-10-06 Thread Ted Yu
Graph image didn't come through.
Consider using a third-party site for hosting the image.

On Fri, Oct 6, 2017 at 6:46 AM, Alexander Petrovsky 
wrote:

> Hello!
>
> I observe the follow strange behavior in my kafka graphs:
>
>
> As you can see, the topic __consumer_offsets have very bit rate, is it
> okay? Or I should find consumers that using ConsumerGroups and fix some
> parameters? And what parameters I should fix?
>
> --
> Петровский Александр / Alexander Petrovsky,
>
> Skype: askjuise
> Phone: +7 931 9877991
>
>


Compacted topic, entry deletion, kafka 0.11.0.0

2017-10-06 Thread Asko Alhoniemi
Hello

I understand that a compacted topic is meant to keep at least the latest
value for each key.

However, I am having an issue: it can happen that an entry becomes old
and I need to remove it. It may also happen that I am not able to send a
key/"null" (tombstone) pair for it, so I need another method to remove my hanging
entries (see the sketch further down). My hope was with the following configuration:

Topic: test.topic    PartitionCount: 30    ReplicationFactor: 2
Configs: segment.bytes=1048576,min.cleanable.dirty.ratio=0.1,delete.retention.ms=180,retention.ms=90,segment.ms=90,cleanup.policy=compact,delete
    Topic: test.topic    Partition: 0     Leader: 0    Replicas: 1,0    Isr: 0
    Topic: test.topic    Partition: 1     Leader: 0    Replicas: 0,1    Isr: 0
    Topic: test.topic    Partition: 2     Leader: 0    Replicas: 1,0    Isr: 0
    Topic: test.topic    Partition: 3     Leader: 0    Replicas: 0,1    Isr: 0
    Topic: test.topic    Partition: 4     Leader: 0    Replicas: 1,0    Isr: 0
    Topic: test.topic    Partition: 5     Leader: 0    Replicas: 0,1    Isr: 0
    Topic: test.topic    Partition: 6     Leader: 0    Replicas: 1,0    Isr: 0
    Topic: test.topic    Partition: 7     Leader: 0    Replicas: 0,1    Isr: 0
    Topic: test.topic    Partition: 8     Leader: 0    Replicas: 1,0    Isr: 0
    Topic: test.topic    Partition: 9     Leader: 0    Replicas: 0,1    Isr: 0
    Topic: test.topic    Partition: 10    Leader: 0    Replicas: 1,0    Isr: 0
    Topic: test.topic    Partition: 11    Leader: 0    Replicas: 0,1    Isr: 0
    Topic: test.topic    Partition: 12    Leader: 0    Replicas: 1,0    Isr: 0
    Topic: test.topic    Partition: 13    Leader: 0    Replicas: 0,1    Isr: 0
    Topic: test.topic    Partition: 14    Leader: 0    Replicas: 1,0    Isr: 0
    Topic: test.topic    Partition: 15    Leader: 0    Replicas: 0,1    Isr: 0
    Topic: test.topic    Partition: 16    Leader: 0    Replicas: 1,0    Isr: 0
    Topic: test.topic    Partition: 17    Leader: 0    Replicas: 0,1    Isr: 0
    Topic: test.topic    Partition: 18    Leader: 0    Replicas: 1,0    Isr: 0
    Topic: test.topic    Partition: 19    Leader: 0    Replicas: 0,1    Isr: 0
    Topic: test.topic    Partition: 20    Leader: 0    Replicas: 1,0    Isr: 0
    Topic: test.topic    Partition: 21    Leader: 0    Replicas: 0,1    Isr: 0
    Topic: test.topic    Partition: 22    Leader: 0    Replicas: 1,0    Isr: 0
    Topic: test.topic    Partition: 23    Leader: 0    Replicas: 0,1    Isr: 0
    Topic: test.topic    Partition: 24    Leader: 0    Replicas: 1,0    Isr: 0
    Topic: test.topic    Partition: 25    Leader: 0    Replicas: 0,1    Isr: 0
    Topic: test.topic    Partition: 26    Leader: 0    Replicas: 1,0    Isr: 0
    Topic: test.topic    Partition: 27    Leader: 0    Replicas: 0,1    Isr: 0
    Topic: test.topic    Partition: 28    Leader: 0    Replicas: 1,0    Isr: 0
    Topic: test.topic    Partition: 29    Leader: 0    Replicas: 0,1    Isr: 0

but it does not get cleaned up. I stopped writing one key/value pair, but I can
still see that key/value pair after days.

Then I changed broker configuration on the fly:

# The minimum age of a log file to be eligible for deletion
#log.retention.hours=168
log.retention.minutes=16

# A size-based retention policy for logs. Segments are pruned from the log
as long as the remaining
# segments don't drop below log.retention.bytes.
#log.retention.bytes=1073741824
log.retention.bytes=1

# The maximum size of a log segment file. When this size is reached a new
log segment will be created.
#log.segment.bytes=1073741824
log.segment.bytes=1

So far I haven't been able to drop the desired key/value pair with any
of these configuration changes.

Am I trying something that is not possible, am I missing some
configuration options, or should this work?
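
For reference, a sketch of what sending such a tombstone looks like with the plain Java
producer, in case the producing side can be changed (broker address and key below are
made up, not from this message; the record value is literally null):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TombstoneExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");   // placeholder address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // a record with the existing key and a null value is a tombstone;
            // once the cleaner has run and delete.retention.ms has passed,
            // the key disappears from the compacted topic
            producer.send(new ProducerRecord<>("test.topic", "the-key-to-remove", null)).get();
        }
    }
}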

br.

Asko Alhoniemi


__consumers topic is overloaded

2017-10-06 Thread Alexander Petrovsky
Hello!

I observe the following strange behavior in my kafka graphs:


As you can see, the topic __consumer_offsets has a very high bit rate - is it
okay? Or should I find the consumers that are using consumer groups and fix some
parameters? And which parameters should I fix?

-- 
Петровский Александр / Alexander Petrovsky,

Skype: askjuise
Phone: +7 931 9877991


kafka broker loosing offsets?

2017-10-06 Thread Dmitriy Vsekhvalnov
Hi all,

We have several times faced a situation where a consumer group started to re-consume
old events from the beginning. Here is the scenario:

1. x3 broker kafka cluster on top of x3 node zookeeper
2. RF=3 for all topics
3. log.retention.hours=168 and offsets.retention.minutes=20160
4. running sustainable load (pushing events)
5. doing disaster testing by randomly shutting down 1 of 3 broker nodes
(then provision new broker back)

Several times after bouncing a broker we faced a situation where the consumer group
started to re-consume old events.

consumer group:

1. enable.auto.commit = false
2. tried graceful group shutdown, kill -9 and terminating AWS nodes
3. never experienced re-consumption in any of those cases.

What can cause this re-consumption of old events? Is it related to bouncing
one of the brokers? What should we look for in the logs? Any broker settings to try?

Thanks in advance.
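
One way to narrow this down during the disaster tests (an illustrative sketch, not from
the original report) is to record the group's committed offset for a partition right
before and right after bouncing a broker, for example with a short-lived consumer that
only queries the committed position (broker address, group, topic and partition are
made up):

import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class CommittedOffsetCheck {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");   // placeholder address
        props.put("group.id", "my-group");                 // the group under test
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("events", 0);   // placeholder topic/partition
            // ask the group coordinator for the last committed offset of this group
            OffsetAndMetadata committed = consumer.committed(tp);
            System.out.println(tp + " committed offset: "
                    + (committed == null ? "none" : committed.offset()));
        }
    }
}

Since this consumer never subscribes, it should not join the group or trigger a
rebalance; the kafka-consumer-groups.sh tool that ships with Kafka can give the same
picture for all partitions at once.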


Kafka Streams with embedded RocksDB

2017-10-06 Thread Valentin Forst
Hi there,

We have Kafka 0.11.0.0 running on DC/OS. I am developing on Windows (it's
not my fault ;-)) and have to ship to a Linux container (DC/OS) in order to run
a Java app.

What is the best way to use the kafka-streams Maven dependency w.r.t. RocksDB so
that it works on both OSs?
Currently I am getting the following error:

UnsatisfiedLinkError … Can’t find dependency library. Trying to find 
rocksdbjni,…dll
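
For what it's worth, a quick way to check whether the rocksdbjni jar on the classpath
ships a native build for the current platform is to force the native library to load;
a minimal sketch (class name is made up):

import org.rocksdb.RocksDB;

public class RocksDbNativeCheck {
    public static void main(String[] args) {
        // Attempts to extract and load the bundled native library for this OS.
        // If this throws UnsatisfiedLinkError, the rocksdbjni version on the
        // classpath does not ship a native build for the platform.
        RocksDB.loadLibrary();
        System.out.println("RocksDB native library loaded successfully");
    }
}

If this throws UnsatisfiedLinkError on Windows but works in the Linux container, the
Windows native library may simply be missing from that rocksdbjni version, and running
the Streams app in a local Linux VM or container is the usual workaround.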

Thanks in advance
Valentin

Serve interactive queries from standby replicas

2017-10-06 Thread Stas Chizhov
Hi

Is there a way to serve read requests from standby replicas?
StreamsMetadata does not seem to provide standby endpoints as far as I
can see.
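
For context, this is roughly how the metadata lookup looks in 0.11.x as far as I can
tell; it is only a sketch (store name handling assumed to be String-keyed) and it
returns the active host, with no obvious way to get the standby endpoints:

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.state.HostInfo;
import org.apache.kafka.streams.state.StreamsMetadata;

public class ActiveHostLookup {
    // Returns the host currently serving the given key for the given store.
    // Only the active replica appears to be exposed here; standby replicas
    // are not listed in StreamsMetadata.
    public static HostInfo activeHostFor(KafkaStreams streams, String storeName, String key) {
        StreamsMetadata metadata = streams.metadataForKey(storeName, key, Serdes.String().serializer());
        return metadata == null ? null : metadata.hostInfo();
    }
}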

Thank you,
Stas


Re: Topics and Partitions

2017-10-06 Thread Josh Meraj Maidana
Thank you.

I would have expected the topic to be created by either the producer or the
consumer, as it is a bit nondeterministic whether the consumer or the producer
comes up first.

Kind regards


On Fri, Oct 6, 2017 at 12:53 PM, Michal Michalski <
michal.michal...@zalando.ie> wrote:

> Hey Josh,
>
> Consumption from non-existent topic will end up with
> "LEADER_NOT_AVAILABLE".
>
> However (!) I just tested it locally (Kafka 0.11) and it seems like
> consuming from a topic that doesn't exist with auto.create.topics.enable
> set to true *will create it* as well (I'm checking it in Zookeeper's
> /brokers/topics path).
>
> I'm a bit surprised this works. Documentation states that:
>
> "You have the option of either adding topics manually *or having them be
> created automatically when data is first published to a non-existent
> topic.*
> "
>
> This (pretty old) email thread confirm that that's intentional:
> http://grokbase.com/t/kafka/users/14a2rgj2h2/auto-topic-
> creation-not-working-for-attempts-to-consume-non-existing-topic
> (Jun Rao: "In general, *only writers should trigger auto topic creation,
> but not the readers*. So, a topic can be auto created by the producer, but
> not the consumer.")
>
> So I'm not sure now if it's a regression or a change made later that's not
> reflected in the docs, but it looks like you *can* currently create topics
> using consumer. I wouldn't rely on this "feature" though - to me,
> personally, it seems wrong and I'm guessing it might be a bug.
>
> Please correct me if I'm wrong / missing something :-)
>
> Michał
>
>
>
> On 6 October 2017 at 04:37, Josh Maidana  wrote:
>
> > Michal,
> >
> > You mentioned topics are only dynamically created with producers. Does
> that
> > mean if a consumer starts on a non-existent topic, it throws an error?
> >
> > Kind regards
> > Meeraj
> >
> > On Thu, Oct 5, 2017 at 9:20 PM, Josh Maidana 
> > wrote:
> >
> > > Thank you, Michal.
> > >
> > > That answers all my questions, many thanks.
> > >
> > > Josh
> > >
> > > On Thu, Oct 5, 2017 at 1:21 PM, Michal Michalski <
> > > michal.michal...@zalando.ie> wrote:
> > >
> > >> Hi Josh,
> > >>
> > >> 1. I don't know for sure (haven't seen the code that does it), but
> it's
> > >> probably the most "even" split possible for given number of brokers
> and
> > >> partitions. So for 8 partitions and 3 brokers it would be [3, 3, 2].
> > >> 2. See "num.partitions" in broker config. BTW. only producer can
> create
> > >> topic dynamically, not consumer.
> > >> 3. See 3. The value has to be non-zero, so it's always specified.
> > >> 4. Based on the ProducerRecord (message) key. See:
> > >> https://kafka.apache.org/0110/javadoc/index.html?org/apache/
> > >> kafka/clients/producer/KafkaProducer.html
> > >> 5. Yes - you need to create multiple consumers with the same group.id
> .
> > >> 6. Yes, there'll be at most one consumer (within a consumer group)
> > >> handling
> > >> given partition at a given time.
> > >> 7. Yes, it's a process called "rebalancing" - it reassigns partitions
> to
> > >> consumers when the number of consumers changes.
> > >> 8. Your consumer will commit the last processed offset to special
> Kafka
> > >> topic (or Zookeeper, but that's not a default) every so often
> > >> (periodically
> > >> or "on demand", when you tell it to), so for each partition and
> consumer
> > >> group you know what was and wasn't processed yet. The new consumer
> will
> > >> pick up from the place where the dead one left off.
> > >> 9. If I understand your question correctly - no, Kafka is pull-based
> and
> > >> not push-based by design.
> > >>
> > >> Kind regards,
> > >> Michał
> > >>
> > >> On 5 October 2017 at 09:37, Josh Maidana 
> wrote:
> > >>
> > >> > Hello
> > >> >
> > >> > I am quite new to KAFKA and come from a JMS/messaging background.
> > >> Reading
> > >> > through the documentation, I gather using partitions and consumer
> > >> groups,
> > >> > KAFKA achieves both P2P and pub/sub. I have a few questions on
> > >> partitions,
> > >> > though, I was wondering someone could kindly please point me in the
> > >> right
> > >> > directions.
> > >> >
> > >> > 1. In a multi-server scenario, how does KAFKA decide how many
> > >> partitions of
> > >> > a given topic is assigned to a given node?
> > >> > 2. When a topic is created dynamically by a consumer or a producer,
> > how
> > >> is
> > >> > the number of partitions specified?
> > >> > 3. If it is not or can't be specified, how does KAFKA decide the
> > number
> > >> of
> > >> > partitions to create?
> > >> > 4. If a producer doesn't specify a partition, how does KAFKA decide
> to
> > >> > which partition the message is allocated.
> > >> > 5. On consumption, do I need to explicitly create multiple consumers
> > to
> > >> > attain parallelism?
> > >> > 6. If yes, would KAFKA allocate different partition to different
> > >> consumers
> > >> > who are part of the same consumer 

Re: Topics and Partitions

2017-10-06 Thread Michal Michalski
Hey Josh,

Consumption from a non-existent topic will end up with "LEADER_NOT_AVAILABLE".

However (!) I just tested it locally (Kafka 0.11) and it seems like
consuming from a topic that doesn't exist with auto.create.topics.enable
set to true *will create it* as well (I'm checking it in Zookeeper's
/brokers/topics path).

I'm a bit surprised this works. Documentation states that:

"You have the option of either adding topics manually *or having them be
created automatically when data is first published to a non-existent topic.*
"

This (pretty old) email thread confirms that that's intentional:
http://grokbase.com/t/kafka/users/14a2rgj2h2/auto-topic-creation-not-working-for-attempts-to-consume-non-existing-topic
(Jun Rao: "In general, *only writers should trigger auto topic creation,
but not the readers*. So, a topic can be auto created by the producer, but
not the consumer.")

So I'm not sure now if it's a regression or a change made later that's not
reflected in the docs, but it looks like you *can* currently create topics
using a consumer. I wouldn't rely on this "feature" though - to me,
personally, it seems wrong and I'm guessing it might be a bug.

Please correct me if I'm wrong / missing something :-)

Michał
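
If relying on auto-creation from either side feels fragile, topics can also be created
explicitly with the AdminClient introduced in 0.11; a minimal sketch (broker address,
topic name, partition count and replication factor are all made up):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateTopicExplicitly {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092");   // placeholder address
        AdminClient admin = AdminClient.create(props);
        try {
            // 3 partitions, replication factor 2 - illustrative values only
            NewTopic topic = new NewTopic("my-topic", 3, (short) 2);
            admin.createTopics(Collections.singleton(topic)).all().get();
        } finally {
            admin.close();
        }
    }
}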



On 6 October 2017 at 04:37, Josh Maidana  wrote:

> Michal,
>
> You mentioned topics are only dynamically created with producers. Does that
> mean if a consumer starts on a non-existent topic, it throws an error?
>
> Kind regards
> Meeraj
>
> On Thu, Oct 5, 2017 at 9:20 PM, Josh Maidana 
> wrote:
>
> > Thank you, Michal.
> >
> > That answers all my questions, many thanks.
> >
> > Josh
> >
> > On Thu, Oct 5, 2017 at 1:21 PM, Michal Michalski <
> > michal.michal...@zalando.ie> wrote:
> >
> >> Hi Josh,
> >>
> >> 1. I don't know for sure (haven't seen the code that does it), but it's
> >> probably the most "even" split possible for given number of brokers and
> >> partitions. So for 8 partitions and 3 brokers it would be [3, 3, 2].
> >> 2. See "num.partitions" in broker config. BTW. only producer can create
> >> topic dynamically, not consumer.
> >> 3. See 3. The value has to be non-zero, so it's always specified.
> >> 4. Based on the ProducerRecord (message) key. See:
> >> https://kafka.apache.org/0110/javadoc/index.html?org/apache/
> >> kafka/clients/producer/KafkaProducer.html
> >> 5. Yes - you need to create multiple consumers with the same group.id.
> >> 6. Yes, there'll be at most one consumer (within a consumer group)
> >> handling
> >> given partition at a given time.
> >> 7. Yes, it's a process called "rebalancing" - it reassigns partitions to
> >> consumers when the number of consumers changes.
> >> 8. Your consumer will commit the last processed offset to special Kafka
> >> topic (or Zookeeper, but that's not a default) every so often
> >> (periodically
> >> or "on demand", when you tell it to), so for each partition and consumer
> >> group you know what was and wasn't processed yet. The new consumer will
> >> pick up from the place where the dead one left off.
> >> 9. If I understand your question correctly - no, Kafka is pull-based and
> >> not push-based by design.
> >>
> >> Kind regards,
> >> Michał
> >>
> >> On 5 October 2017 at 09:37, Josh Maidana  wrote:
> >>
> >> > Hello
> >> >
> >> > I am quite new to KAFKA and come from a JMS/messaging background.
> >> Reading
> >> > through the documentation, I gather using partitions and consumer
> >> groups,
> >> > KAFKA achieves both P2P and pub/sub. I have a few questions on
> >> partitions,
> >> > though, I was wondering someone could kindly please point me in the
> >> right
> >> > directions.
> >> >
> >> > 1. In a multi-server scenario, how does KAFKA decide how many
> >> partitions of
> >> > a given topic is assigned to a given node?
> >> > 2. When a topic is created dynamically by a consumer or a producer,
> how
> >> is
> >> > the number of partitions specified?
> >> > 3. If it is not or can't be specified, how does KAFKA decide the
> number
> >> of
> >> > partitions to create?
> >> > 4. If a producer doesn't specify a partition, how does KAFKA decide to
> >> > which partition the message is allocated.
> >> > 5. On consumption, do I need to explicitly create multiple consumers
> to
> >> > attain parallelism?
> >> > 6. If yes, would KAFKA allocate different partition to different
> >> consumers
> >> > who are part of the same consumer group?
> >> > 7. If one of those consumers exit, would KAFKA reallocate the
> >> partitions to
> >> > remaining consumers?
> >> > 8. How are the offsets propagated from an exited to consumer to the
> new
> >> > consumer to which the partition is reallocated?
> >> > 9. Is there a listener based API for consumption instead os a blocking
> >> > poll?
> >> >
> >> > Kind regards
> >> > Josh
> >> >
> >>
> >
> >
>