kafka message expiry question

2014-11-19 Thread Weide Zhang
Hi,

As far as I understand, the log retention time in Kafka will delete messages
older than the retention period. I'm wondering what that implies for the
consumer, since I'm using the simple consumer and managing offsets myself
under a predefined consumer group.

Say I have the following messages for a partition of a topic:

1, 2, 3, 4, 5 are the message offsets currently associated with the partition.

If messages 1, 2, 3 expire earlier and only 4, 5 are left, does that mean the
consumer can only consume 4 and 5, and needs a way to detect that 1, 2, 3 have
expired so that it never reads before the earliest available offset for the
partition?

If all of 1, 2, 3, 4, 5 have expired, it seems the offset would become 0. I
assume in that case the consumer needs to reset its consumed offset for the
consumer group to 0.
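
For reference, a minimal sketch of querying a partition's earliest available
offset with the 0.8 SimpleConsumer javaapi (an OffsetRequest issued with
EarliestTime()); the client id and topic below are placeholders:

import java.util.HashMap;
import java.util.Map;

import kafka.api.PartitionOffsetRequestInfo;
import kafka.common.TopicAndPartition;
import kafka.javaapi.OffsetResponse;
import kafka.javaapi.consumer.SimpleConsumer;

public class OffsetBounds {
    // Returns the first offset still present in the partition's log.
    // Passing LatestTime() instead would return the next offset to be written.
    public static long earliestOffset(SimpleConsumer consumer, String topic,
                                      int partition, String clientId) {
        TopicAndPartition tp = new TopicAndPartition(topic, partition);
        Map<TopicAndPartition, PartitionOffsetRequestInfo> requestInfo =
                new HashMap<TopicAndPartition, PartitionOffsetRequestInfo>();
        requestInfo.put(tp, new PartitionOffsetRequestInfo(
                kafka.api.OffsetRequest.EarliestTime(), 1));
        kafka.javaapi.OffsetRequest request = new kafka.javaapi.OffsetRequest(
                requestInfo, kafka.api.OffsetRequest.CurrentVersion(), clientId);
        OffsetResponse response = consumer.getOffsetsBefore(request);
        if (response.hasError()) {
            throw new RuntimeException("Offset lookup failed, error code: "
                    + response.errorCode(topic, partition));
        }
        return response.offsets(topic, partition)[0];
    }
}

If the group's committed offset is smaller than the value returned here, it
can be clamped up to the earliest offset before fetching.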

Thanks,

Weide


Re: offset commit api

2014-08-18 Thread Weide Zhang
Thanks Joel. Do you know if I'm using consumer group string 


On Mon, Aug 4, 2014 at 3:16 PM, Joel Koshy jjkosh...@gmail.com wrote:
 Weide, 0.8.1.1 does not support offset storage in Kafka. The brokers
 do support offset commit requests/fetches but simply forward them to
 ZooKeeper - you can issue the offset commit and fetch requests to any
 broker. Kafka-backed consumer offset storage is currently in trunk and
 will be released in 0.8.2.
 
 Thanks,
 
 Joel
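
For reference, a minimal sketch of reading and writing a group's offset at the
ZooKeeper node the 0.8.1.1 brokers forward commits to,
/consumers/<group>/offsets/<topic>/<partition> (the offset is stored as a plain
string; the connect string and group/topic names below are placeholders):

import java.nio.charset.Charset;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class ZkOffsets {
    private static final Charset UTF8 = Charset.forName("UTF-8");
    private final ZooKeeper zk;

    public ZkOffsets(String zkConnect) throws Exception {
        // wait until the session is actually connected before issuing requests
        final CountDownLatch connected = new CountDownLatch(1);
        zk = new ZooKeeper(zkConnect, 30000, new Watcher() {
            public void process(WatchedEvent event) {
                if (event.getState() == Event.KeeperState.SyncConnected) {
                    connected.countDown();
                }
            }
        });
        connected.await(10, TimeUnit.SECONDS);
    }

    private static String offsetPath(String group, String topic, int partition) {
        return "/consumers/" + group + "/offsets/" + topic + "/" + partition;
    }

    public void commit(String group, String topic, int partition, long offset) throws Exception {
        String path = offsetPath(group, topic, partition);
        byte[] data = Long.toString(offset).getBytes(UTF8);
        if (zk.exists(path, false) == null) {
            // create any missing parents one level at a time (ZK has no recursive create)
            StringBuilder node = new StringBuilder();
            String[] parts = path.substring(1).split("/");
            for (int i = 0; i < parts.length; i++) {
                node.append('/').append(parts[i]);
                if (zk.exists(node.toString(), false) == null) {
                    byte[] nodeData = (i == parts.length - 1) ? data : new byte[0];
                    zk.create(node.toString(), nodeData,
                              ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
                }
            }
        } else {
            zk.setData(path, data, -1);  // -1 = ignore the node version
        }
    }

    public long fetch(String group, String topic, int partition) throws Exception {
        byte[] data = zk.getData(offsetPath(group, topic, partition), false, null);
        return Long.parseLong(new String(data, UTF8));
    }
}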
 
 On Mon, Aug 04, 2014 at 02:57:02PM -0700, Weide Zhang wrote:
  Hi
 
  It seems to me that 0.8.1.1 doesn't have the ConsumerMetadata API. So which
  broker should I choose in order to commit and fetch offset information?
  Should I use ZooKeeper to manage offsets manually instead?
 
  Thanks,
 
  Weide
 
 
 
  On Sun, Aug 3, 2014 at 4:34 PM, Weide Zhang weo...@gmail.com wrote:
 
   Hi,
  
   I'm reading about offset management at the API link below.

   https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommit/FetchAPI

   I have a couple of questions about using the offset fetch and commit
   API in 0.8.1.1.

   1. Is the new offset commit and fetch API usable in 0.8.1.1? Does
   0.8.1.1 already support the offset coordinator?

   2. What's the difference between the old OffsetRequest and the new
   OffsetFetchRequest? It seems to me that the new API supports fetching
   offsets per consumer group while the old API doesn't. Also, what's the
   purpose of the timestamp parameter in the fetch request?

   3. In 0.8.1.1, OffsetCommitRequest uses OffsetMetadataAndError. Could
   you tell me the purpose of the error and metadata parameters in the
   request?

   4. Can I assume offset management is completely independent of message
   consumption? In other words, if I use a simple consumer to fetch messages
   with a random client id, can I still manually set some consumer group
   along with the offset in the offset commit message? Is that allowed?
  
   Thanks,
  
   Weide
  



Re: offset commit api

2014-08-04 Thread Weide Zhang
Hi

It seems to me that 0.8.1.1 doesn't have the ConsumerMetadata API. So which
broker should I choose in order to commit and fetch offset information?
Should I use ZooKeeper to manage offsets manually instead?

Thanks,

Weide



On Sun, Aug 3, 2014 at 4:34 PM, Weide Zhang weo...@gmail.com wrote:

 Hi,

 I'm reading about offset management at the API link below.

 https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-OffsetCommit/FetchAPI

 I have a couple of questions about using the offset fetch and commit
 API in 0.8.1.1.

 1. Is the new offset commit and fetch API usable in 0.8.1.1? Does
 0.8.1.1 already support the offset coordinator?

 2. What's the difference between the old OffsetRequest and the new
 OffsetFetchRequest? It seems to me that the new API supports fetching
 offsets per consumer group while the old API doesn't. Also, what's the
 purpose of the timestamp parameter in the fetch request?

 3. In 0.8.1.1, OffsetCommitRequest uses OffsetMetadataAndError. Could
 you tell me the purpose of the error and metadata parameters in the
 request?

 4. Can I assume offset management is completely independent of message
 consumption? In other words, if I use a simple consumer to fetch messages
 with a random client id, can I still manually set some consumer group
 along with the offset in the offset commit message? Is that allowed?

 Thanks,

 Weide




Re: kafka simpleconsumer question

2014-07-10 Thread Weide Zhang
Hi Guozhang,

Just a follow-up question: can one SimpleConsumer instance be shared by
multiple threads to fetch data from multiple topics and commit offsets?

It seems all of the implementation is synchronized, which suggests it can be
shared between multiple threads. Is my understanding correct?

Thanks,

Weide


On Tue, Jul 1, 2014 at 5:36 PM, Guozhang Wang wangg...@gmail.com wrote:

 Hi Weide,

 1. The old consumer still depends on ZK whether or not you use ZK for
 offsets, since it depends on ZK for group management anyway. In the new
 consumer (0.9) we have completely removed the ZK dependency.

 2. Yes, it queries the broker for leader info.

 3. You can use a single consumer for multiple topics, but a single consumer
 can only talk to a single broker; if your topic spans across multiple
 brokers you would also need multiple simple consumers.

 Guozhang
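
As a reference for points 2 and 3 above, a minimal sketch of looking up each
partition's leader with a topic metadata request, using the 0.8 SimpleConsumer
javaapi (the seed broker list, port, and topic are placeholders):

import java.util.Collections;
import java.util.List;

import kafka.javaapi.PartitionMetadata;
import kafka.javaapi.TopicMetadata;
import kafka.javaapi.TopicMetadataRequest;
import kafka.javaapi.consumer.SimpleConsumer;

public class LeaderLookup {
    // Ask any reachable broker for the topic's metadata and return the
    // metadata of the requested partition; pm.leader() is the broker a
    // SimpleConsumer should fetch that partition from.
    public static PartitionMetadata findLeader(List<String> seedBrokers, int port,
                                               String topic, int partition) {
        for (String seed : seedBrokers) {
            SimpleConsumer consumer = null;
            try {
                consumer = new SimpleConsumer(seed, port, 100000, 64 * 1024, "leaderLookup");
                TopicMetadataRequest request =
                        new TopicMetadataRequest(Collections.singletonList(topic));
                kafka.javaapi.TopicMetadataResponse response = consumer.send(request);
                for (TopicMetadata tm : response.topicsMetadata()) {
                    for (PartitionMetadata pm : tm.partitionsMetadata()) {
                        if (pm.partitionId() == partition) {
                            return pm;
                        }
                    }
                }
            } catch (Exception e) {
                // this seed broker was unreachable; try the next one
            } finally {
                if (consumer != null) consumer.close();
            }
        }
        return null;
    }
}

A topic whose partitions have leaders on different brokers therefore needs one
SimpleConsumer per leader broker, as noted in point 3.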


 On Tue, Jul 1, 2014 at 5:14 PM, Weide Zhang weo...@gmail.com wrote:

  Hi,

  Just want to ask some basic questions about the Kafka simple consumer.

  1. If I'm using the simple consumer and don't really depend on ZooKeeper to
  manage partition offsets (the application manages offsets itself), will that
  remove the ZooKeeper dependency for the consumer?
  2. If ZooKeeper dies, will the simple consumer still be able to get partition
  leadership information from the broker itself? Or does it indirectly use
  ZooKeeper to find the partition leadership for particular topics?
  3. Does the SimpleConsumer API have topic-filter logic similar to what the
  high-level consumer provides? Or do I have to create multiple SimpleConsumers
  for multiple topics explicitly?
 
  Thanks a lot,
 
  Weide
 



 --
 -- Guozhang



kafka simpleconsumer question

2014-07-01 Thread Weide Zhang
Hi,

Just want to ask some basic questions about the Kafka simple consumer.

1. If I'm using the simple consumer and don't really depend on ZooKeeper to
manage partition offsets (the application manages offsets itself), will that
remove the ZooKeeper dependency for the consumer?
2. If ZooKeeper dies, will the simple consumer still be able to get partition
leadership information from the broker itself? Or does it indirectly use
ZooKeeper to find the partition leadership for particular topics?
3. Does the SimpleConsumer API have topic-filter logic similar to what the
high-level consumer provides? Or do I have to create multiple SimpleConsumers
for multiple topics explicitly? (See the sketch below.)
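
A minimal sketch of question 3's multi-topic case: one SimpleConsumer fetching
two topics in a single request. This only works when the connected broker is
the leader for every requested partition; the broker address, topics, and
offsets below are placeholders.

import java.nio.ByteBuffer;

import kafka.api.FetchRequest;
import kafka.api.FetchRequestBuilder;
import kafka.javaapi.FetchResponse;
import kafka.javaapi.consumer.SimpleConsumer;
import kafka.message.MessageAndOffset;

public class MultiTopicFetch {
    public static void main(String[] args) throws Exception {
        SimpleConsumer consumer =
                new SimpleConsumer("broker1", 9092, 100000, 64 * 1024, "multiTopicFetch");
        try {
            FetchRequest request = new FetchRequestBuilder()
                    .clientId("multiTopicFetch")
                    .addFetch("topicA", 0, 0L, 100000)  // topic, partition, start offset, max bytes
                    .addFetch("topicB", 0, 0L, 100000)
                    .build();
            FetchResponse response = consumer.fetch(request);
            for (String topic : new String[] {"topicA", "topicB"}) {
                if (response.errorCode(topic, 0) != 0) {
                    continue;  // e.g. this broker is not the leader for the partition
                }
                for (MessageAndOffset mo : response.messageSet(topic, 0)) {
                    ByteBuffer payload = mo.message().payload();
                    byte[] bytes = new byte[payload.limit()];
                    payload.get(bytes);
                    System.out.println(topic + " @" + mo.offset() + ": " + new String(bytes, "UTF-8"));
                }
            }
        } finally {
            consumer.close();
        }
    }
}

Note the fetch request has no wildcard or topic-filter form; each topic and
partition is added to the request explicitly.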

Thanks a lot,

Weide


topics load balancing within a consumer group

2014-06-03 Thread Weide Zhang
Hi,

I have a question regarding load balancing within a consumer group.

Say I have a consumer group of 4 consumers that subscribe to 4 topics, each
of which has one partition. Will rebalancing happen at the topic level? Or
should I expect consumer 1 to get all the data?

Weide


kafka 0.8.1.1 works under java 1.6 ?

2014-05-28 Thread Weide Zhang
Hi,

According to the Kafka documentation, it seems Java 1.7 is recommended. But
our production environment is still on Java 1.6. Will it be a problem to run
Kafka 0.8.1.1 on Java 1.6?

Thanks a lot,

Weide


Re: What happens to Kafka when ZK lost its quorum?

2014-05-15 Thread Weide Zhang
Hi Guozhang,

In the worst case, if ZooKeeper dies for, say, 1 hour and then comes back up,
will things still recover automatically after that hour?

Weide


On Wed, May 14, 2014 at 8:28 AM, Guozhang Wang wangg...@gmail.com wrote:

 In 0.8, the servers and consumers are heavily dependent on ZK to function.
 With ZK down, the servers cannot manage replicas and consumers cannot
 assign partitions within the group.

 In 0.9 we are going to remove the ZK dependency from consumer clients, but
 Kafka servers would still be dependent on ZK.

 Guozhang


 On Tue, May 13, 2014 at 6:52 AM, Connie Yang cybercon...@gmail.com
 wrote:

  Hi all,
 
  Can Kafka producers, brokers, and consumers still process messages and
  function in their normal states if ZooKeeper loses its quorum?
 
  Thanks,
  Connie
 



 --
 -- Guozhang



zookeeper down

2014-05-13 Thread Weide Zhang
Can Kafka survive when ZooKeeper is down and not connectable? Will the
consumer or producer still work in that case?

Weide


Re: question about mirror maker

2014-05-11 Thread Weide Zhang
Hi Todd,

Thanks for your answer. With regard to failover for the mirror maker: does
that mean that if I have 4 mirror makers running on different machines with
the same consumer group, they will automatically load-balance if one of them
fails? Also, it sounds like, to prevent the mirror maker from committing
offsets incorrectly (the consumer working but not the producer) due to a
cross-datacenter network issue, the mirror maker needs to be placed alongside
the target cluster so that this scenario is minimized?


On Sat, May 10, 2014 at 11:39 PM, Todd Palino tpal...@linkedin.com wrote:

 Well, if you have a cluster in each datacenter, all with the same topics,
 you can't just mirror the messages between them, as you will create a
 loop. The way we do it is to have a "local" cluster and an "aggregate"
 cluster. The local cluster has the data for only that datacenter. Then we
 run mirror makers that copy the messages from each of the local clusters
 into the aggregate cluster. Everything produces into the local clusters,
 and nothing produces into the aggregate clusters. In general, consumers
 consume from the aggregate cluster (unless they specifically want only
 local data).

 The mirror maker is as fault tolerant as any other consumer. That is, if a
 mirror maker goes down, the others configured with the same consumer group
 (we generally run at least 4 for any mirror maker, sometimes up to 10)
 will rebalance and start back up from the last committed offset. What you
 need to watch out for is if the mirror maker is unable to produce
 messages, for example, if the network goes down. If it can still consume
 messages, but cannot produce them, you will lose messages as the consumer
 will continue to commit offsets with no knowledge that the producer is
 failing.

 -Todd

 On 5/8/14, 11:20 AM, Weide Zhang weo...@gmail.com wrote:

 Hi,
 
 I have a question about mirror maker. Say I have 3 data centers, each
 producing topic 'A' with its own separate Kafka cluster running. If the data
 in the 3 data centers needs to be kept in sync, should I create mirror makers
 in each data center to pull the data from the other two?
 
 Also, it's mentioned that mirror making is not fault tolerant? So what will
 be the behavior of the mirror consumer if it goes down due to the network and
 then comes back up? Does it catch up from the last offset it mirrored? If so,
 is that enabled by default or do I have to configure it?
 
 Thanks a lot,
 
 Weide




question about mirror maker

2014-05-10 Thread Weide Zhang
Hi,

I have a question about mirror maker. Say I have 3 data centers, each
producing topic 'A' with its own separate Kafka cluster running. If the data
in the 3 data centers needs to be kept in sync, should I create mirror makers
in each data center to pull the data from the other two?

Also, it's mentioned that mirror making is not fault tolerant? So what will be
the behavior of the mirror consumer if it goes down due to the network and
then comes back up? Does it catch up from the last offset it mirrored? If so,
is that enabled by default or do I have to configure it?

Thanks a lot,

Weide