Re: LeaderNotAvailableException

2014-08-11 Thread Ryan Williams
Thanks for the heads up on attachments, here's a gist: https://gist.githubusercontent.com/ryanwi/84deb8774a6922ff3704/raw/75c33ad71d0d41301533cbc645fa9846736d5eb0/gistfile1.txt This seems to mostly happen in my development environment, when running a single broker. I don't see any broker failure

Re: LeaderNotAvailableException

2014-08-11 Thread Ryan Williams
Thanks Neha, Indeed, there are no replicas apparently. $ bin/kafka-topics.sh --describe --zookeeper localhost:2181 Topic: events  PartitionCount: 2  ReplicationFactor: 1  Configs: Topic: events  Partition: 0  Leader: 0  Replicas: 0  Isr: 0 Topic: events  Partition: 1  Leade

Re: Consumer Parallelism

2014-08-11 Thread Guozhang Wang
Hello Mingtao, The partition will not be re-assigned to other consumers unless the current consumer fails, so the behavior you described is not expected. Guozhang On Mon, Aug 11, 2014 at 6:27 PM, Mingtao Zhang wrote: > Hi Guozhang, > > I do have another Email talking about Partitions per

Re: Issue with 240 topics per day

2014-08-11 Thread Chen Wang
Got it. Thanks for the input Todd! Chen On Mon, Aug 11, 2014 at 9:31 PM, Todd Palino wrote: > As I noted, we have a cluster right now with 70k partitions. It’s running > on over 30 brokers, partly to cover the number of partitions and > partly to cover the amount of data that we push throug

Re: LeaderNotAvailableException

2014-08-11 Thread Guozhang Wang
Ryan, The Apache mailing list does not allow attachments exceeding a certain size limit, so the server log is blocked. From the controller log it seems this sole broker has failed and hence no partitions will be available. This could be a soft failure (e.g. long GC) or a ZK server-side issue. Y

Re: Issue with 240 topics per day

2014-08-11 Thread Todd Palino
As I noted, we have a cluster right now with 70k partitions. It’s running on over 30 brokers, partly to cover the number of partitions and partly to cover the amount of data that we push through it. If you can have at least 4 or 5 brokers, I wouldn’t anticipate any problems with the number of p

Re: Kafka Consumer not consuming in webMethods.

2014-08-11 Thread Jun Rao
0.8-beta is really old. Could you try using 0.8.1.1? Thanks, Jun On Mon, Aug 11, 2014 at 11:11 AM, Seshadri, Balaji wrote: > Hi Jun/Neha/Team, > > > > We are trying to consume from Kafka using webMethods as our consumer. When > we start the consumer, fetcher and leader threads went into WAI

Re: LeaderNotAvailableException

2014-08-11 Thread Neha Narkhede
[2014-08-11 19:12:45,321] ERROR Controller 0 epoch 3 initiated state change for partition [mytopic,0] from OfflinePartition to OnlinePartition failed (state.change.logger) kafka.common.NoReplicaOnlineException: No replica for partition [mytopic,0] is alive. Live brokers are: [Set()], Assigned repl

Re: Issue with 240 topics per day

2014-08-11 Thread Chen Wang
Todd, Yes, I actually thought about that. My concern is that even a week's topic partitions (240*7*3 = 5040) is too many. Does LinkedIn have good experience using this many topics in your system? :-) Thanks, Chen On Mon, Aug 11, 2014 at 9:02 PM, Todd Palino wrote: > In order to delete topics, y

Re: Issue with 240 topics per day

2014-08-11 Thread Todd Palino
In order to delete topics, you need to shut down the entire cluster (all brokers), delete the topics from Zookeeper, and delete the log files and partition directory from the disk on the brokers. Then you can restart the cluster. Assuming that you can take a periodic outage on your cluster, you can

Re: Issue with 240 topics per day

2014-08-11 Thread Chen Wang
Unfortunately, this would not work in our system. It means that almost every few minutes I would need to scan the entire queue, which is not possible in our case. In fact, our old system was designed this way: store the data in HBase, with an hourly MapReduce job scanning the entire table to figure out which

Re: Issue with 240 topics per day

2014-08-11 Thread Chen Wang
Vipul, The problem is that the producer does not know when it should set the window start and window end boundaries. The data does not arrive in order. I also think it's difficult to get the offset of the boundary and only pull messages between those boundaries: I am already trying to avoid using the

Re: Issue with 240 topics per day

2014-08-11 Thread Philip O'Toole
Ok, now that is good detail. I understand your issue. It's somewhat difficult to use Kafka in your situation, as Kafka is a FIFO queue, but you are trying to use it with data that is not tightly ordered in that manner. I don't have any definite solutions, but perhaps this might work. Assu

Re: Issue with 240 topics per day

2014-08-11 Thread vipul jhawar
Your use case requires messages to be pushed out when their time comes instead of the order in which they arrived; Kafka may not be best for this, as within the queue you want some message batches to be sent out early and some later. There could be another way to solve this with offset management, as Kafka is

Re: Issue with 240 topics per day

2014-08-11 Thread Chen Wang
That data has a timestamp: it's actually email campaigns with a scheduled send time. But since they can be scheduled ahead (e.g., two days ahead), I cannot read it when it arrives. It has to wait until its actual scheduled send time. As you can tell, the sequence within the 6 min does not matter, but

Re: Issue with 240 topics per day

2014-08-11 Thread Philip O'Toole
Why do you need to read it every 6 minutes? Why not just read it as it arrives? If it naturally arrives in 6 minute bursts, you'll read it in 6 minute bursts, no? Perhaps the data does not have timestamps embedded in it, so that is why you are relying on time-based topic names? In that case I w

Re: Issue with 240 topics per day

2014-08-11 Thread Chen Wang
"And if you can't consume it all within 6 minutes, partition the topic until you can run enough consumers such that you can keep up." — this is what I intend to do for each 6-min topic. What I really need is a partitioned queue: each 6 minutes of data can go into a separate partition, so that I can
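The "partitioned queue" idea above can be sketched as a time-bucket partitioner. This is a hypothetical illustration (not code from the thread): a message's scheduled send time is mapped to a 6-minute bucket, and buckets are assigned to partitions round-robin, so each window's messages land together and can be drained by one consumer. All names here are made up for the sketch.

```python
# Hypothetical sketch: route each message to a partition derived from its
# scheduled send time, so one 6-minute bucket maps to one partition.

BUCKET_MINUTES = 6

def bucket_index(epoch_seconds):
    """6-minute bucket number since the epoch."""
    return epoch_seconds // (BUCKET_MINUTES * 60)

def partition_for(epoch_seconds, num_partitions):
    """Map a scheduled-send time to a partition, round-robin by bucket."""
    return bucket_index(epoch_seconds) % num_partitions

# Two sends scheduled inside the same 6-minute window share a partition:
assert partition_for(0, 240) == partition_for(359, 240)
# The next window rolls over to the next partition:
assert partition_for(360, 240) == partition_for(0, 240) + 1
```

A consumer for bucket N then reads only that partition when the window closes, rather than scanning the whole queue.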

Re: Consumer Parallelism

2014-08-11 Thread Mingtao Zhang
Hi Guozhang, I do have another email talking about partitions per topic. I pasted it into this email. I am expecting those consumers to work concurrently. The behavior I observed here is that consumer thread-1 will work a while, then thread-3 will work, then thread-0 ... Is it normal? version is

Re: Issue with 240 topics per day

2014-08-11 Thread Philip O'Toole
It's still not clear to me why you need to create so many topics. Write the data to a single topic and consume it when it arrives. It doesn't matter if it arrives in bursts, as long as you can process it all within 6 minutes, right? And if you can't consume it all within 6 minutes, partition t

Re: Issue with 240 topics per day

2014-08-11 Thread Chen Wang
Philip, That is right. There is a huge amount of data flushed into the topic within each 6 minutes. Then at the end of each 6 min, I only want to read from that specific topic, and the data within that topic has to be processed as fast as possible. I was originally using a Redis queue for this purpose, but

Re: LeaderNotAvailableException

2014-08-11 Thread Ryan Williams
The broker appears to be running $ telnet kafka-server 9092 Trying... Connected to kafka-server Escape character is '^]'. I've attached today's server.log. There was a manual restart of kafka, which you'll notice, but that didn't fix the issue. Thanks for looking! On Mon, Aug 11, 2014 a

Re: Issue with 240 topics per day

2014-08-11 Thread Philip O'Toole
I'd love to know more about what you're trying to do here. It sounds like you're trying to create topics on a schedule, trying to make it easy to locate data for a given time range? I'm not sure it makes sense to use Kafka in this manner. Can you provide more detail? Philip

Re: Issue with 240 topics per day

2014-08-11 Thread Chen Wang
Todd, I actually only intend to keep each topic valid for 3 days at most. Each of our topics has 3 partitions, so it's around 3*240*3 = 2160 partitions. Since there is no API for deleting topics, I guess I could set up a cron job deleting the outdated topics (folders) from ZooKeeper. Do you know when the

Re: Consumer Parallelism

2014-08-11 Thread Guozhang Wang
Mingtao, How many partitions does the consumed topic have? Basically the data is distributed per-partition, and hence if the number of consumers is larger than the number of partitions, some consumers will not get any data. Guozhang On Mon, Aug 11, 2014 at 3:29 PM, Mingtao Zhang wrote: > Is it a
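The effect Guozhang describes can be modeled with a simplified sketch of range-style partition assignment (an assumption, not the actual consumer code): partitions are split contiguously across consumer threads, and any threads beyond the partition count receive nothing.

```python
# Simplified model of range assignment: partitions are handed out in
# contiguous blocks; surplus consumer threads get empty assignments.

def range_assign(num_partitions, num_consumers):
    base, extra = divmod(num_partitions, num_consumers)
    assignment, start = [], 0
    for i in range(num_consumers):
        count = base + (1 if i < extra else 0)  # first `extra` threads get one more
        assignment.append(list(range(start, start + count)))
        start += count
    return assignment

# 2 partitions, 4 consumer threads: two threads sit idle.
assert range_assign(2, 4) == [[0], [1], [], []]
# 6 partitions, 4 threads: every thread gets at least one partition.
assert range_assign(6, 4) == [[0, 1], [2, 3], [4], [5]]
```

This is why, with a 2-partition topic like the one in the LeaderNotAvailableException thread, only two consumer threads can ever be active at once.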

Re: LeaderNotAvailableException

2014-08-11 Thread Guozhang Wang
Hi Ryan, Could you check if all of your brokers are still live and running? Also could you check the server log in addition to the producer / state-change / controller logs? Guozhang On Mon, Aug 11, 2014 at 12:45 PM, Ryan Williams wrote: > I have a single broker test Kafka instance that was r

Re: Issue with 240 topics per day

2014-08-11 Thread Todd Palino
You need to consider your total partition count as you do this. After 30 days, assuming 1 partition per topic, you have 7200 partitions. Depending on how many brokers you have, this can start to be a problem. We just found an issue on one of our clusters that has over 70k partitions that there's no
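The partition-count arithmetic traded back and forth in this thread can be checked directly; the numbers below are the ones the participants quote (240 topics/day, varying partition counts and retention).

```python
# Back-of-envelope partition totals for the scenarios in the thread.

def total_partitions(topics_per_day, partitions_per_topic, retention_days):
    return topics_per_day * partitions_per_topic * retention_days

assert total_partitions(240, 1, 30) == 7200   # Todd: 30-day retention, 1 partition/topic
assert total_partitions(240, 3, 3) == 2160    # Chen: 3-day retention, 3 partitions/topic
assert total_partitions(240, 3, 7) == 5040    # Chen: a week's worth
```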

Re: Consumer Parallelism

2014-08-11 Thread Mingtao Zhang
Is it anyhow related to the issue? WARN No previously checkpointed highwatermark value found for topic RAW partition 0. Returning 0 as the highwatermark (kafka.server.HighwaterMarkCheckpoint) Mingtao

Re: Error while producing messages

2014-08-11 Thread Guozhang Wang
Could you try again by setting the config to true? On Mon, Aug 11, 2014 at 3:14 PM, Tanneru, Raj wrote: > Hi Guozhang, > > I didn't set enable.controlled.shutdown to true. Yes I am shutting down 1 > broker at a time slowly. However I begin the test(2 clients producing > messages) long time afte

RE: Error while producing messages

2014-08-11 Thread Tanneru, Raj
Hi Guozhang, I didn't set enable.controlled.shutdown to true. Yes, I am shutting down 1 broker at a time, slowly. However I begin the test (2 clients producing messages) a long time after taking down the brokers. I see the below debug message on the live broker once in a while. [2014-08-11 15:09:56,078

Consumer Parallelism

2014-08-11 Thread Mingtao Zhang
Hi, We are using the following method on ConsumerConnector to get multiple streams per topic, and we have multiple partitions per topic. It looks like only one of the runnables is active through a relatively long time period. Is there anything we could possibly have missed? public Map<String, List<KafkaStream<K, V>>> createMessageStr

Issue with 240 topics per day

2014-08-11 Thread Chen Wang
Folks, Is there any potential issue with creating 240 topics every day? Although the retention of each topic is set to 2 days, I am a little concerned that since right now there is no delete-topic API, ZooKeeper might be overloaded. Thanks, Chen

[Kafka MirrorMaker] Message with Custom Partition Logic

2014-08-11 Thread Bhavesh Mistry
Hi Kafka Dev Team, We have to aggregate events (count) per DC and across DCs for one of our topics. We have the standard LinkedIn data pipeline: producers --> Local Brokers --> MM --> Center Brokers. So I would like to know how MM handles messages when custom partitioning logic is used as below and
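The pitfall behind this question can be sketched abstractly (an assumption, not MirrorMaker's actual code): a custom hash partitioner only keeps a key on a stable partition while the partitioning logic and the partition count match; if the MirrorMaker producer on the center cluster uses a different partitioner or the center topic has a different partition count, the source placement is not preserved. The key and counts below are invented for illustration.

```python
# Hypothetical stand-in for a custom Partitioner.partition(key, numPartitions).

def hash_partition(key, num_partitions):
    return sum(key.encode()) % num_partitions  # toy hash, for illustration only

# Same logic and same partition count in both DCs: the key is stable.
assert hash_partition("dc1-counter", 8) == hash_partition("dc1-counter", 8)

# A different partition count downstream can move the key elsewhere,
# breaking per-partition aggregation at the center cluster.
assert hash_partition("dc1-counter", 8) != hash_partition("dc1-counter", 10)
```

The practical takeaway is that the aggregate-cluster topic would need the same partitioner class and partition count as the local clusters for per-partition counts to line up.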

RE: Kafka Consumer not consuming in webMethods.

2014-08-11 Thread Seshadri, Balaji
Any pointers would be helpful. From: Seshadri, Balaji Sent: Monday, August 11, 2014 12:19 PM To: 'jun@gmail.com'; 'neha.narkh...@gmail.com'; 'users@kafka.apache.org' Subject: RE: Kafka Consumer not consuming in webMethods. The offset checker does show a lot of lag. rain-raw-consumers1 rain-raw-li

LeaderNotAvailableException

2014-08-11 Thread Ryan Williams
I have a single-broker test Kafka instance that was running fine on Friday (basically out-of-the-box configuration with 2 partitions); now I come back on Monday and producers are unable to send messages. What else can I look at to debug, and prevent? I know how to recover by removing the data directo

Re: [New Feature Request] Ability to Inject Queue Implementation Async Mode

2014-08-11 Thread Bhavesh Mistry
Thank you all for the comments. Yes, I understand the concern from community members about the extra burden of having the complexity to drop messages, but the ability to inject an implementation of the queue would make this completely transparent to Kafka. I just need fine-grained control of the applicati

Re: Error while producing messages

2014-08-11 Thread Guozhang Wang
Hi Raj, I have a couple more questions for you: 1. On the server configs, did you set enable.controlled.shutdown to true or not? 2. When you shut down just one broker, did you see any errors? Here I am assuming you are not shutting down brokers too quickly, but shutting down one broker at a time, wa

RE: Kafka Consumer not consuming in webMethods.

2014-08-11 Thread Seshadri, Balaji
The offset checker does show a lot of lag. rain-raw-consumers1 rain-raw-listner 47 630138482 32181 rain-raw-consumers1_dm1mad06.echostar.com-1407776221959-74777cd2-3 From: Seshadri, Balaji Sent: Monday, August 11, 2014 12:11 PM To: 'jun@gmail.com

RE: Error while producing messages

2014-08-11 Thread Tanneru, Raj
I shut down 3 out of 5. With 2 brokers I start seeing failures after successfully sending some messages. Not all messages are failing. I wanted to understand the case(s) when we log the below message. If you notice, there is a difference in the send/receive buffer size between actual and requested. I don’t see this

Re: Error while producing messages

2014-08-11 Thread Guozhang Wang
I am not sure I understand completely. How many brokers did you shutdown out of the total number of 5 brokers? With single-partition topics, if the replication factor is 3, this partition will be hosted on 3 brokers. Guozhang On Mon, Aug 11, 2014 at 9:46 AM, Tanneru, Raj wrote: > Sorry I shoul
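Guozhang's point can be made concrete with a small sketch (a simplified model, not broker code): with replication factor 3 on a 5-broker cluster, whether a single-partition topic stays writable after losing 3 brokers depends entirely on which brokers held its replicas. The broker IDs below are invented for the example.

```python
# Simplified model: a partition is available while at least one of its
# assigned replica brokers is still up.

def live_replicas(assigned_brokers, down_brokers):
    return [b for b in assigned_brokers if b not in down_brokers]

down = {3, 4, 5}  # 3 of the 5 brokers shut down

# Replicas landed on brokers 1, 2, 3: two replicas survive, sends succeed.
assert live_replicas([1, 2, 3], down) == [1, 2]

# Replicas landed on brokers 3, 4, 5: no replicas survive, sends fail.
assert live_replicas([3, 4, 5], down) == []
```

This matches the mixed behavior Raj reports: topics whose replica sets overlap the surviving brokers keep working, while topics assigned entirely to the downed brokers fail.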

RE: Error while producing messages

2014-08-11 Thread Tanneru, Raj
Sorry, I should have provided this information. All my topics have a single partition, meaning there are 2 nodes that already have the topic partition. It's just that the node with the 3rd replica is down. So if a message fails there is no reason why another message should succeed, unless there is network

Re: A weird producer connection timeout issue

2014-08-11 Thread Jun Rao
This seems to happen when fetching the metadata. Are you using a VIP as broker.list? Thanks, Jun On Fri, Aug 8, 2014 at 2:42 PM, S. Zhou wrote: > Thanks Guozhang. Any ideas on what could be wrong on that machine? We set > up multiple producers in the same way but only one has this issue. > >

RE: Consume more than produce

2014-08-11 Thread Guy Doulberg
If you can't see the image, I uploaded it to dropbox https://www.dropbox.com/s/gckn4gt7gv26l9w/graph.png From: Guy Doulberg [mailto:guy.doulb...@perion.com] Sent: Monday, August 11, 2014 4:58 PM To: users@kafka.apache.org Subject: RE: Consume more than produce Hey I had an issue in the prod

RE: Consume more than produce

2014-08-11 Thread Guy Doulberg
Hey, I had an issue in production two days ago. For some reason 2 brokers in my 5-broker cluster were stuck, meaning their process was up, but they didn't answer on port 9092. ZK saw them as live brokers. The producer couldn't produce events to them and the consumer couldn't consume S