Re: New client commitAsync SendFailedException

2016-03-14 Thread Jay Kreps
This seems like a bug, no? It should just initiate the request not wait for it to be written, there is no way for the user to reason about the state of the send buffer. -jay On Monday, March 14, 2016, Jason Gustafson wrote: > Hey Alexey, > > Asynchronous commit handling

Re: New client commitAsync SendFailedException

2016-03-14 Thread Alexey Romanchuk
Thanks for reply Jason! Is it any way to control size of this buffer? Will it fails if I try to commit offsets for 100 topics/partitions? Sure I can work around it by batching all commits into one call, but the problem I can see is that the API does not enforce me to do this. From client point

Re: UNKNOWN_MEMBER_ID assigned to consumer group

2016-03-14 Thread tao xiao
Thanks Jason. What does consumer 1 would do upon receiving UNKNOWN_MEMBER_ID and does it rejoin the group eventually if it keeps polling? On Tue, 15 Mar 2016 at 00:58 Jason Gustafson wrote: > Hey Tao, > > This error indicates that a rebalance completed successfully before

Re: How to get message count per topic?

2016-03-14 Thread Stevo Slavić
See https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol Using metadata api one can get topic partitions and for each partition which broker is lead. Using offset api one can get partition size. Both apis are low level and to use them directly you would use

Re: seekToBeginning doesn't work without auto.offset.reset

2016-03-14 Thread Jason Gustafson
The offset API is definitely a gap at the moment. I think there were some problems with the old consumer's API and we wanted to make sure we didn't make the same mistakes. Unfortunately, I'm not sure anyone has had the time to give this the attention it needs. Here's a couple JIRAS if you want to

Re: seekToBeginning doesn't work without auto.offset.reset

2016-03-14 Thread Cody Koeninger
Sorry, by metadata I also meant the equivalent of the old OffsetRequest api, which partitionsFor doesn't give you. I understand why you didn't want to expose the broken "offsets before a certain time" api, but I don't understand why equivalent functionality for first or last offset isn't

Re: Larger Size Error Message

2016-03-14 Thread Fang Wong
After changing log level from INFO to TRACE, here is kafka server.log: [2016-03-14 06:43:03,568] TRACE 156 bytes written. (kafka.network.BoundedByteBufferSend) [2016-03-14 06:43:03,575] TRACE 68 bytes read. (kafka.network.BoundedByteBufferReceive) [2016-03-14 06:43:03,575] TRACE

Re: Deletion of topic on 0.9.0.0 spams this exception

2016-03-14 Thread Scott Reynolds
Yep that is it. Thanks. I will watch the issue. On Mon, Mar 14, 2016 at 1:13 PM Stevo Slavić wrote: > I've recently created related ticket > https://issues.apache.org/jira/browse/KAFKA-3390 > > On Mon, Mar 14, 2016, 20:54 Scott Reynolds wrote: > > >

How to get message count per topic?

2016-03-14 Thread Grant Overby (groverby)
What is the most direct way to get a message count per topic or per partition? For context, this is to enable testing. We'd like to confirm with Kafka that a certain number of messages have been written or that the number of messages we processed is equal to the number received by Kafka.

Re: Deletion of topic on 0.9.0.0 spams this exception

2016-03-14 Thread Stevo Slavić
I've recently created related ticket https://issues.apache.org/jira/browse/KAFKA-3390 On Mon, Mar 14, 2016, 20:54 Scott Reynolds wrote: > >Conditional update of path > >/brokers/topics/messages.events/partitions/0/state with data > >

Re: Kafka 0.9.0.1 broker 0.9 consumer location of consumer group data

2016-03-14 Thread Rajiv Kurian
Thanks Jason. I'll try to upgrade and see if it helps. On Mon, Mar 14, 2016 at 12:04 PM, Jason Gustafson wrote: > I think this is the one: https://issues.apache.org/jira/browse/KAFKA-2978. > > -Jason > > On Mon, Mar 14, 2016 at 11:54 AM, Rajiv Kurian

Re: seekToBeginning doesn't work without auto.offset.reset

2016-03-14 Thread Jason Gustafson
Ah, that makes more sense. I have no idea about the limitations of your use case, but maybe you could expose a different interface to users. interface RebalanceListener { void onPartitionsAssigned(Consumer consumer, Collection partitions); void onPartitionsRevoked(Consumer

Kafka mirror maker issue. (data loss?)

2016-03-14 Thread feifei hsu
Hi, We are thinking using mirror maker to replic our kafka data stream. However, I heard mirror maker may lose data which we do not want. I am wondering if anyone has experience of mirror maker. How good and what the best practice to prevent dataloss is when we do data replica? Thanks

Re: New client commitAsync SendFailedException

2016-03-14 Thread Jason Gustafson
Hey Alexey, Asynchronous commit handling could probably be improved quite a bit. Basically what's happening is that the client's send buffer is getting filled up, which causes the subsequent commits to fail with SendFailedException. We don't currently implement any retrying for asynchronous

Re: Kafka 0.9.0.1 broker 0.9 consumer location of consumer group data

2016-03-14 Thread Rajiv Kurian
@Jason, can you please point me to the bug that you were talking about in 0.9.0.0? On Mon, Mar 14, 2016 at 11:36 AM, Rajiv Kurian wrote: > No I haven't. It's still running the 0.9.0 client. I'll try upgrading if > it sounds like an old bug. > > On Mon, Mar 14, 2016 at 11:24

Re: Kafka 0.9.0.1 broker 0.9 consumer location of consumer group data

2016-03-14 Thread Rajiv Kurian
No I haven't. It's still running the 0.9.0 client. I'll try upgrading if it sounds like an old bug. On Mon, Mar 14, 2016 at 11:24 AM, Jason Gustafson wrote: > Hey Rajiv, > > That sounds suspiciously like one of the bugs from 0.9.0.0. Have you > updated kafka-clients to

Re: Retry Message Consumption On Database Failure

2016-03-14 Thread Jason Gustafson
Yeah, that's the idea. Here's the JIRA I was thinking of: https://issues.apache.org/jira/browse/KAFKA-2273. I'm guessing this will need a KIP after 0.10 is out. -Jason On Mon, Mar 14, 2016 at 11:21 AM, Christian Posta wrote: > Jason, > > Can you link to the proposal

Re: Kafka 0.9.0.1 broker 0.9 consumer location of consumer group data

2016-03-14 Thread Jason Gustafson
Hey Rajiv, That sounds suspiciously like one of the bugs from 0.9.0.0. Have you updated kafka-clients to 0.9.0.1? -Jason On Mon, Mar 14, 2016 at 11:18 AM, Rajiv Kurian wrote: > Has any one run into similar problems. I have experienced the same problem > again. This time

Re: Retry Message Consumption On Database Failure

2016-03-14 Thread Christian Posta
Jason, Can you link to the proposal so I can take a look? Would the "sticky" proposal prefer to keep partitions assigned to consumers who currently have them and have not failed? On Mon, Mar 14, 2016 at 10:16 AM, Jason Gustafson wrote: > Hey Michael, > > I don't think a

Re: Kafka 0.9.0.1 broker 0.9 consumer location of consumer group data

2016-03-14 Thread Rajiv Kurian
Has any one run into similar problems. I have experienced the same problem again. This time when I use kafka-consumer-groups.sh tool it says that my consumer group is either missing or rebalancing. But when I use the --list method it shows up on the list. So my guess is it is rebalancing some how.

Re: Kafka Applicability - Large Messages

2016-03-14 Thread Cees de Groot
On Mon, Mar 14, 2016 at 5:42 AM, Jens Rantil wrote: > Just making it more explicit: AFAIK, all Kafka consumers I've seen loads > the incoming messages into memory. Unless you make it possible to stream it > do disk or something you need to make sure your consumers has the

Re: seekToBeginning doesn't work without auto.offset.reset

2016-03-14 Thread Jason Gustafson
Late arrival to this discussion. I'm not really sure I see the problem with accessing the consumer in the rebalance listener. Before we passed the consumer instance as a separate argument, but that was only because the rebalance listener had to be passed by classname before a reference to the

Re: Retry Message Consumption On Database Failure

2016-03-14 Thread Jason Gustafson
Hey Michael, I don't think a policy of retrying indefinitely is generally possible with the new consumer even if you had a heartbeat API. The problem is that the consumer itself doesn't control when the group needs to rebalance. If another consumer joins or leaves the group, then all consumers

Kafka Streams question

2016-03-14 Thread Mike Thomsen
I was reading a bit about Kafka Streams and was wondering if it is appropriate for my team's use. We ingest data using Kafka and Storm. Data gets pulled by Storm and sent off to bolts that publish the data into HBase and Solr. One of the things we need is something analogous to Storm's ability to

Re: UNKNOWN_MEMBER_ID assigned to consumer group

2016-03-14 Thread Jason Gustafson
Hey Tao, This error indicates that a rebalance completed successfully before the consumer could rejoin. Basically it works like this: 1. Consumer 1 joins the group and is assigned member id A 2. Consumer 1's session timeout expires before successfully heartbeating. 3. The group is rebalanced

Re: seekToBeginning doesn't work without auto.offset.reset

2016-03-14 Thread Cody Koeninger
Honestly the fact that everything is hidden inside poll() has been confusing people since last year, e.g. https://issues.apache.org/jira/browse/KAFKA-2359 I can try to formulate a KIP for this, but it seems clear that I'm not the only one giving this feedback, and I may not understand all the

AUTO: Yan Wang is out of the office (returning 03/17/2016)

2016-03-14 Thread Yan Wang
I am out of the office until 03/17/2016. Note: This is an automated response to your message "New client commitAsync SendFailedException" sent on 3/14/2016 10:18:14 AM. This is the only notification you will receive while this person is away. ** This email and any attachments may contain

New client commitAsync SendFailedException

2016-03-14 Thread Alexey Romanchuk
Hi all! I am using new client 0.9.0.1. I found that when I call commitAsync multiple times before calling poll most of commits failed with SendFailedException. Here it is an example of code - https://gist.github.com/13h3r/42633bcd64b80ddffe6b Could you please explain commitAsync in more

UNKNOWN_MEMBER_ID assigned to consumer group

2016-03-14 Thread tao xiao
Hi team, I have about 10 consumers using the same consumer group connecting to Kafka. Occasional I can see UNKNOWN_MEMBER_ID assigned to some of the consumers. I want to under what situation this would happen? I use Kafka version 0.9.0.1

Re: Need a help in understanding __consumer_offsets topic creation in Kafka Cluster

2016-03-14 Thread Achanta Vamsi Subhash
We changed the policy to "delete" dynamically for the __consumer_offsets topic and it was a better option than doing a cluster restart after enabling log compaction. Also, we found problems when you are replicating to a log compacted topic from a non-compacted topic (which is leader). On Mon, Mar

Re: Consuming previous messages and from different group.id

2016-03-14 Thread Gerard Klijs
Hi, if you use a new group for a consumer, the auto.offset.reset value will determine whether it will start at the beginning (with value earliest) or at the end (with value latest). For each group a separate offset is used, to two consumer, belonging to two different groups, when started before

Re: Kafka Applicability - Large Messages

2016-03-14 Thread Ben Stopford
Becket did a good talk at the last Kafka meetup on how Linked In handle the large message problem. http://www.slideshare.net/JiangjieQin/handle-large-messages-in-apache-kafka-58692297 > On 14 Mar 2016, at

Re: Kafka topics with infinite retention?

2016-03-14 Thread Ben Stopford
A couple of things: - Compacted topics provide a useful way to retain meaningful datasets inside the broker, which don’t grow indefinitely. If you have an update-in-place use case, where the event sourced approach doesn’t buy you much, these will keep the reload time down when you regenerate

Re: Kafka topics with infinite retention?

2016-03-14 Thread Jens Rantil
This is definitely an interesting use case. However, you need to be aware that changing the broker topology won't rebalance the preexisting data from the previous brokers. That is, you risk loosing data. Cheers, Jens On Wed, Mar 9, 2016 at 2:10 PM Daniel Schierbeck

Re: Kafka topics with infinite retention?

2016-03-14 Thread Gerard Klijs
You might find what you want when looking how Kafka is used for samza, http://samza.apache.org/ On Mon, Mar 14, 2016 at 10:34 AM Daniel Schierbeck wrote: > Partitions being limited by disk size is no different from e.g. a SQL > store. This would not be used for

Re: Kafka Applicability - Large Messages

2016-03-14 Thread Jens Rantil
Just making it more explicit: AFAIK, all Kafka consumers I've seen loads the incoming messages into memory. Unless you make it possible to stream it do disk or something you need to make sure your consumers has the available memory. Cheers, Jens On Fri, Mar 4, 2016 at 6:07 PM Cees de Groot

Re: Kafka topics with infinite retention?

2016-03-14 Thread Daniel Schierbeck
Partitions being limited by disk size is no different from e.g. a SQL store. This would not be used for extremely high throughput. If, eventually, there was a good case for not requiring that an entire partition must be stored on a single machine, it would be possible to use the log segments for

Re: Kafka topics with infinite retention?

2016-03-14 Thread Giidox
I would like to read an answer to this question as well. This is a similar architecture as I am planning. Dealing with secondary data store for old messages would indeed make things complicated. Clark Haskins wrote that the partition size is limited by machines capacity (I assume disk space):

Re: Need a help in understanding __consumer_offsets topic creation in Kafka Cluster

2016-03-14 Thread Kunal Gupta
Thanks @Stevo Slavić *Thanks, Kunal* *+91-9958189589* *Data Analyst* *First Paper Publication : **http://dl.acm.org/citation.cfm?id=2790798 * *Blog:- **http://learnhardwithkunalgupta.blogspot.in * On

Re: Need a help in understanding __consumer_offsets topic creation in Kafka Cluster

2016-03-14 Thread Stevo Slavić
You are affected by this 0.9.0.0 bug https://issues.apache.org/jira/browse/KAFKA-2988 It was fixed for 0.9.0.1. You could just apply same fix to your 0.9.0.0 cluster but I'd recommend upgrading to 0.9.0.1. Kind regards, Stevo Slavic. On Mon, Mar 14, 2016, 07:10 Kunal Gupta

Need a help in understanding __consumer_offsets topic creation in Kafka Cluster

2016-03-14 Thread Kunal Gupta
Hi everyone, I am new here, recently join the group. I faced a problem in Kafka Cluster, a problem is described below. I am using Kafka version 0.9.0.0 We have established a Kafka Cluster of 3 machines where 2 machines are utilized for Kafka broker and same 3 machines utilized for zookeeper.