Kafka 0.9 consumer generates EOF exceptions when parsing responses

2016-09-01 Thread Rajiv Kurian
I've seen these before but we have recently moved to the 0.9 consumer code for one of our big Kafka use cases and now we see about 1 million EOF exceptions in a 5 minute period. This can't be very good for performance. My guess is that these exceptions are expected since it uses DataInputStream to

Kafka 0.9 consumer gets stuck in epoll

2016-08-30 Thread Rajiv Kurian
We had a Kafka 0.9 consumer stuck in the epoll native call under the following circumstances. 1. It was bootstrapped with a cluster with 3 brokers A, B and C with ids 1,2,3. 2. Change the assignment of the brokers to some topic partitions. Seek to the beginning of each topic partition.

Deleting a topic on the 0.8.x brokers

2016-07-14 Thread Rajiv Kurian
We plan to stop using a particular Kafka topic running on a certain subset of a 0.8.2.x cluster. This topic is served by 9 brokers (leaders + replicas) and these 9 brokers have no other topics on them. Once we have stopped sending and consuming traffic from this topic (and hence the 9 brokers) what

Re: [DISCUSS] Java 8 as a minimum requirement

2016-06-16 Thread Rajiv Kurian
+1 On Thu, Jun 16, 2016 at 1:45 PM, Ismael Juma wrote: > Hi all, > > I would like to start a discussion on making Java 8 a minimum requirement > for Kafka's next feature release (let's say Kafka 0.10.1.0 for now). This > is the first discussion on the topic so the idea is to

Debugging high log flush latency

2016-04-27 Thread Rajiv Kurian
We monitor the log flush latency p95 on all our Kafka nodes and occasionally we see it creep up from the regular figure of under 15 ms to above 150 ms. Restarting the node usually doesn't help. It seems to fix itself over time but we are not quite sure about the underlying reason. It's

Re: Is there a tool to list the topic-partitions on a particular broker

2016-04-21 Thread Rajiv Kurian
> > > > There is an awesome tool called "kafka-manager", which was open sourced > by > > Yahoo. > > https://github.com/yahoo/kafka-manager > > > > > >> On 21 April 2016 at 08:07, Rajiv Kurian <ra...@signalfx.com> wrote: > >> >

Is there a tool to list the topic-partitions on a particular broker

2016-04-20 Thread Rajiv Kurian
The kafka-topics.sh tool lists topics and where the partitions are. Is there a similar tool where I could give it a broker id and it would give me all the topic-partitions on it? I want to bring down a few brokers but before doing that I want to make sure that I've migrated all topics away from
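For what it's worth, the per-broker listing this thread asks for can be scripted on top of `kafka-topics.sh --describe` output. A minimal sketch, assuming the usual `Topic: ... Partition: ... Replicas: ...` line format (which may vary across Kafka versions):

```python
import re

def partitions_on_broker(describe_output, broker_id):
    """Given text output of `kafka-topics.sh --describe`, return the
    (topic, partition) pairs whose replica list includes broker_id."""
    results = []
    for line in describe_output.splitlines():
        # Assumed line shape: "Topic: t Partition: 0 Leader: 1 Replicas: 1,2 Isr: 1,2"
        m = re.search(r"Topic:\s*(\S+)\s+Partition:\s*(\d+).*Replicas:\s*([\d,]+)", line)
        if not m:
            continue
        topic, partition, replicas = m.group(1), int(m.group(2)), m.group(3)
        if broker_id in (int(r) for r in replicas.split(",")):
            results.append((topic, partition))
    return results

sample = """Topic: events Partition: 0 Leader: 1 Replicas: 1,2 Isr: 1,2
Topic: events Partition: 1 Leader: 3 Replicas: 3,4 Isr: 3,4"""
print(partitions_on_broker(sample, 3))  # [('events', 1)]
```

With an empty result for the broker you plan to retire, it should be safe to take it down.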

Re: two questions

2016-03-21 Thread Rajiv Kurian
They don't work with the old brokers. We made the assumption that they did and had to roll-back. On Mon, Mar 21, 2016 at 10:42 AM, Alexis Midon < alexis.mi...@airbnb.com.invalid> wrote: > Hi Ismael, > > could you elaborate on "newer clients don't work with older brokers > though."? doc pointers

Re: Kafka 0.9.0.1 broker 0.9 consumer location of consumer group data

2016-03-14 Thread Rajiv Kurian
Thanks Jason. I'll try to upgrade and see if it helps. On Mon, Mar 14, 2016 at 12:04 PM, Jason Gustafson <ja...@confluent.io> wrote: > I think this is the one: https://issues.apache.org/jira/browse/KAFKA-2978. > > -Jason > > On Mon, Mar 14, 2016 at 11:54 AM, Rajiv Kuria

Re: Kafka 0.9.0.1 broker 0.9 consumer location of consumer group data

2016-03-14 Thread Rajiv Kurian
@Jason, can you please point me to the bug that you were talking about in 0.9.0.0? On Mon, Mar 14, 2016 at 11:36 AM, Rajiv Kurian <ra...@signalfx.com> wrote: > No I haven't. It's still running the 0.9.0 client. I'll try upgrading if > it sounds like an old bug. > > On Mon, Mar

Re: Kafka 0.9.0.1 broker 0.9 consumer location of consumer group data

2016-03-14 Thread Rajiv Kurian
ated kafka-clients to 0.9.0.1? > > -Jason > > On Mon, Mar 14, 2016 at 11:18 AM, Rajiv Kurian <ra...@signalfx.com> wrote: > > > Has any one run into similar problems. I have experienced the same > problem > > again. This time when I use kafka-consumer-groups.sh too

Re: Kafka 0.9.0.1 broker 0.9 consumer location of consumer group data

2016-03-14 Thread Rajiv Kurian
. Again I have a single consumer group per topic with a single consumer in that group. Wondering if this causes some edge case. This consumer is up as of now, so I don't know why it would say it is rebalancing. On Wed, Mar 9, 2016 at 11:05 PM, Rajiv Kurian <ra...@signalfx.com> wrote: &g

Re: Kafka 0.9.0.1 broker 0.9 consumer location of consumer group data

2016-03-09 Thread Rajiv Kurian
sh kafka-consumer-groups.sh --bootstrap-server localhost:9092 --list > --new-consumer > > > On Thu, Mar 10, 2016 at 12:02 PM, Rajiv Kurian <ra...@signalfx.com> wrote: > > > Hi Guozhang, > > > > I tried using the kafka-consumer-groups.sh --list command and it says I

Re: Kafka 0.9.0.1 broker 0.9 consumer location of consumer group data

2016-03-09 Thread Rajiv Kurian
oup metadata you can use the ConsumerGroupCommand, wrapped > in bin/kafka-consumer-groups.sh. > > Guozhang > > On Wed, Mar 9, 2016 at 5:48 PM, Rajiv Kurian <ra...@signalfx.com> wrote: > > > Don't think I made my questions clear: > > > > On Kafka 0.9.0.1

Re: Kafka 0.9.0.1 broker 0.9 consumer location of consumer group data

2016-03-09 Thread Rajiv Kurian
to it i.e. it was not provisioned from before. Messages currently are only being sent to partition 0 even though there are 8 partitions per topic. Thanks, Rajiv On Wed, Mar 9, 2016 at 4:30 PM, Rajiv Kurian <ra...@signalfx.com> wrote: > Also forgot to mention that when I do consume with the console consu

Re: Kafka 0.9.0.1 broker 0.9 consumer location of consumer group data

2016-03-09 Thread Rajiv Kurian
Also forgot to mention that when I do consume with the console consumer I do see data coming through. On Wed, Mar 9, 2016 at 3:44 PM, Rajiv Kurian <ra...@signalfx.com> wrote: > I am running the 0.9.0.1 broker with the 0.9 consumer. I am using the > subscribe feature on the consumer

Kafka 0.9.0.1 broker 0.9 consumer location of consumer group data

2016-03-09 Thread Rajiv Kurian
I am running the 0.9.0.1 broker with the 0.9 consumer. I am using the subscribe feature on the consumer to subscribe to a topic with 8 partitions. consumer.subscribe(Arrays.asList(myTopic)); I have a single consumer group for said topic and a single process subscribed with 8 partitions. When I

Re: Kafka protocol fetch request max wait.

2016-02-05 Thread Rajiv Kurian
I've updated Kafka-3159 with my findings. Thanks, Rajiv On Thu, Feb 4, 2016 at 10:25 PM, Rajiv Kurian <ra...@signalfx.com> wrote: > I think I found out when the problem happens. When a broker that is sent a > fetch request has no messages for any of the partitions it is being aske

Re: FW: 0.9 consumer log spam - Marking the coordinator dead

2016-02-05 Thread Rajiv Kurian
Thanks for the update Ismael. On Fri, Feb 5, 2016 at 10:31 AM, Ismael Juma <ism...@juma.me.uk> wrote: > Hi Rajiv, > > Jun just sent a message about 0.9.0.1. It should be out soon if everything > goes well. > > Ismael > > On Fri, Feb 5, 2016 at 5:48 PM, Rajiv Kur

Re: FW: 0.9 consumer log spam - Marking the coordinator dead

2016-02-05 Thread Rajiv Kurian
Hi Ismael, Is there a maven release planned soon? We've seen this problem too and it is rather disconcerting. Thanks, Rajiv On Fri, Feb 5, 2016 at 5:15 AM, Ismael Juma wrote: > Hi Simon, > > It may be worth trying the 0.9.0 branch as it includes a number of > important

Re: Kafka protocol fetch request max wait.

2016-02-05 Thread Rajiv Kurian
ore and report what I find on the JIRA. > > -Jason > > On Fri, Feb 5, 2016 at 9:50 AM, Rajiv Kurian <ra...@signalfx.com> wrote: > > > I've updated Kafka-3159 with my findings. > > > > Thanks, > > Rajiv > > > > On Thu, Feb 4, 2016 at 10:25 PM,

Kafka protocol fetch request max wait.

2016-02-04 Thread Rajiv Kurian
I am writing a Kafka consumer client using the document at https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol One place where I am having problems is the fetch request itself. I am able to send fetch requests and can get fetch responses that I can parse properly, but

Re: Kafka protocol fetch request max wait.

2016-02-04 Thread Rajiv Kurian
, Feb 4, 2016 at 8:58 PM, Rajiv Kurian <ra...@signalfx.com> wrote: > I actually restarted my application with the consumer config I mentioned > at https://issues.apache.org/jira/browse/KAFKA-3159 and I can't get it to > use high CPU any more :( Not quite sure about how to proceed. I'l

Re: Kafka protocol fetch request max wait.

2016-02-04 Thread Rajiv Kurian
On Thu, Feb 4, 2016 at 4:56 PM, Rajiv Kurian <ra...@signalfx.com> wrote: > And just like that it stopped happening even though I didn't change any of > my code. I had filed https://issues.apache.org/jira/browse/KAFKA-3159 > where the stock 0.9 kafka consumer was using very high CPU a

Re: Kafka protocol fetch request max wait.

2016-02-04 Thread Rajiv Kurian
the same problem (lots of empty messages) even though we asked the broker to park the request till enough bytes came through. On Thu, Feb 4, 2016 at 3:21 PM, Rajiv Kurian <ra...@signalfx.com> wrote: > I am writing a Kafka consumer client using the document at > https://cwiki.apache.or

Re: Kafka protocol fetch request max wait.

2016-02-04 Thread Rajiv Kurian
> Jason > > On Thu, Feb 4, 2016 at 5:27 PM, Rajiv Kurian <ra...@signalfx.com> wrote: > > > Indeed this seems to be the case. I am now running the client mentioned > in > > https://issues.apache.org/jira/browse/KAFKA-3159 and it is no longer > > taking up high

Re: Kafka protocol fetch request max wait.

2016-02-04 Thread Rajiv Kurian
happens under those conditions. On Thu, Feb 4, 2016 at 8:40 PM, Rajiv Kurian <ra...@signalfx.com> wrote: > Hey Jason, > > Yes I checked for error codes. There were none. The message was perfectly > legal as parsed by my hand written parser. I also verified the size of the > respo

Trying to figure out the protocol.

2016-02-01 Thread Rajiv Kurian
I am trying to write a Kafka client (specifically a consumer) and am using the protocol document at https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol . Had a specific question on the Offset Request API: OffsetRequest => ReplicaId [TopicName [Partition Time

Re: Getting very poor performance from the new Kafka consumer

2016-01-27 Thread Rajiv Kurian
On Tue, Jan 26, 2016 at 10:44 PM, Guozhang Wang <wangg...@gmail.com> wrote: > Rajiv, > > Could you try to build the new consumer from 0.9.0 branch and see if the > issue can be re-produced? > > Guozhang > > On Mon, Jan 25, 2016 at 9:46 PM, Rajiv Kur

Re: Getting very poor performance from the new Kafka consumer

2016-01-27 Thread Rajiv Kurian
exceptions locally, but not nearly at the rate that you're > reporting. That might be a factor of the number of partitions, so I'll do > some investigation. > > -Jason > > On Wed, Jan 27, 2016 at 8:40 AM, Rajiv Kurian <ra...@signalfx.com> wrote: > > > Hi Guozhang, > &g

Re: Stuck consumer with new consumer API in 0.9

2016-01-26 Thread Rajiv Kurian
Thanks Jun. On Tue, Jan 26, 2016 at 3:48 PM, Jun Rao <j...@confluent.io> wrote: > Rajiv, > > We haven't released 0.9.0.1 yet. To try the fix, you can build a new client > jar off the 0.9.0 branch. > > Thanks, > > Jun > > On Mon, Jan 25, 2016 at 12:03 PM, Raj

Re: Getting very poor performance from the new Kafka consumer

2016-01-25 Thread Rajiv Kurian
The exception seems to be thrown here https://github.com/apache/kafka/blob/0.9.0/clients/src/main/java/org/apache/kafka/common/record/MemoryRecords.java#L236 Is this not expected to hit often? On Mon, Jan 25, 2016 at 9:22 PM, Rajiv Kurian <ra...@signalfx.com> wrote: > Wanted to ad

Getting very poor performance from the new Kafka consumer

2016-01-25 Thread Rajiv Kurian
We are using the new kafka consumer with the following config (as logged by kafka) metric.reporters = [] metadata.max.age.ms = 30 value.deserializer = class org.apache.kafka.common.serialization.ByteArrayDeserializer group.id = myGroup.id

Re: Getting very poor performance from the new Kafka consumer

2016-01-25 Thread Rajiv Kurian
for the poor performance. On Mon, Jan 25, 2016 at 9:20 PM, Rajiv Kurian <ra...@signalfx.com> wrote: > We are using the new kafka consumer with the following config (as logged > by kafka) > > metric.reporters = [] > > metadata.max.age.ms = 30 > >

Re: Stuck consumer with new consumer API in 0.9

2016-01-25 Thread Rajiv Kurian
Hi Jason, Was this a server bug or a client bug? Thanks, Rajiv On Mon, Jan 25, 2016 at 11:23 AM, Jason Gustafson wrote: > Apologies for the late arrival to this thread. There was a bug in the > 0.9.0.0 release of Kafka which could cause the consumer to stop fetching > from

Re: Stuck consumer with new consumer API in 0.9

2016-01-25 Thread Rajiv Kurian
Gustafson <ja...@confluent.io> wrote: > Hey Rajiv, the bug was on the client. Here's a link to the JIRA: > https://issues.apache.org/jira/browse/KAFKA-2978. > > -Jason > > On Mon, Jan 25, 2016 at 11:42 AM, Rajiv Kurian <ra...@signalfx.com> wrote: > > >

Kafka 0.9 client producer compatibility with Kafka 0.8.2 broker

2016-01-13 Thread Rajiv Kurian
We just upgraded one of our Kafka client producers from 0.8.2 to 0.9. Our broker is still running 0.8.2. I knew that the new 0.9 consumer requires the new broker and I was under the impression that the new producer would still work with the old broker. However this doesn't seem to be the case. I

Re: fallout from upgrading to the new Kafka producers

2016-01-13 Thread Rajiv Kurian
n look through its changes by just searching "producer" in the > release notes: > > http://mirror.stjschools.org/public/apache/kafka/0.9.0.0/RELEASE_NOTES.html > > > Guozhang > > > On Tue, Jan 12, 2016 at 6:00 PM, Rajiv Kurian <ra...@signalfx.com> wrote

Re: fallout from upgrading to the new Kafka producers

2016-01-12 Thread Rajiv Kurian
he socket to "notify" > the client, etc. > > Guozhang > > > > > On Mon, Jan 11, 2016 at 1:08 PM, Rajiv Kurian <ra...@signalfx.com> wrote: > > > We have recently upgraded some of our applications to use the Kafka 0.8.2 > > Java producers fr

0.9 consumer reading a range of log messages

2016-01-06 Thread Rajiv Kurian
I want to use the new 0.9 consumer for a particular application. My use case is the following: i) The TopicPartition I need to poll has a short log say 10 mins odd (log.retention.minutes is set to 10). ii) I don't use a consumer group i.e. I manage the partition assignment myself. iii)
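The behavior this use case depends on (a requested offset that has already aged out of a short-retention log) can be illustrated without a broker. A toy sketch of the clamping logic only, not the actual consumer API, where an aged-out position instead surfaces via `OffsetOutOfRangeException` or the `auto.offset.reset` policy:

```python
def read_available_range(log, requested_start):
    """log: dict mapping offset -> message for segments still on disk
    (older offsets have been deleted by retention).  Start from
    requested_start, but clamp to the earliest offset still available,
    mimicking a seek-to-beginning when the requested offset aged out."""
    if not log:
        return []
    earliest, latest = min(log), max(log)
    start = max(requested_start, earliest)  # clamp into the retained window
    return [log[o] for o in range(start, latest + 1)]

# Offsets 0-4 have been deleted by the 10-minute retention; 5-7 remain.
print(read_available_range({5: 'e', 6: 'f', 7: 'g'}, 0))  # ['e', 'f', 'g']
```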

Re: 0.9 consumer reading a range of log messages

2016-01-06 Thread Rajiv Kurian
che.org/documentation.html#brokerconfigs. > And for what it's worth, KIP-32, which adds a timestamp to each message, > should provide some better options for handling this. > > -Jason > > On Wed, Jan 6, 2016 at 9:37 AM, Rajiv Kurian <ra...@signalfx.com> wrote: > > >

Re: 0.9 consumer reading a range of log messages

2016-01-06 Thread Rajiv Kurian
egments. This > means > > > that you cannot generally guarantee that messages will be deleted > within > > > the configured retention time. However, you can control the segment > size > > > using "log.segment.bytes" and the delay before deletion with "

Re: Fallout from upgrading to kafka 0.9 from 0.8.2.3

2015-12-17 Thread Rajiv Kurian
guess). Again our other (lower traffic) cluster that was upgraded was totally fine so it doesn't seem like it happens all the time. > > Jun > > > On Tue, Dec 15, 2015 at 12:52 PM, Rajiv Kurian <ra...@signalfx.com> wrote: > > > We had to revert to 0.8.2.3 because three of

Re: Fallout from upgrading to kafka 0.9 from 0.8.2.3

2015-12-17 Thread Rajiv Kurian
at 12:23 PM, Jun Rao <j...@confluent.io> wrote: > Are you using the new java producer? > > Thanks, > > Jun > > On Thu, Dec 17, 2015 at 9:58 AM, Rajiv Kurian <ra...@signalfx.com> wrote: > > > Hi Jun, > > Answers inline: > > > > On Thu

Re: Fallout from upgrading to kafka 0.9 from 0.8.2.3

2015-12-17 Thread Rajiv Kurian
> Thanks, > > Jun > > On Thu, Dec 17, 2015 at 12:41 PM, Rajiv Kurian <ra...@signalfx.com> wrote: > > > The topic which stopped working had clients that were only using the old > > Java producer that is a wrapper over the Scala producer. Again it seemed > to

Re: Fallout from upgrading to kafka 0.9 from 0.8.2.3

2015-12-17 Thread Rajiv Kurian
2? > > -Dana > On Dec 17, 2015 5:56 PM, "Rajiv Kurian" <ra...@signalfx.com> wrote: > > > Yes we are in the process of upgrading to the new producers. But the > > problem seems deeper than a compatibility issue. We have one environment > > where the o

Re: Fallout from upgrading to kafka 0.9 from 0.8.2.3

2015-12-17 Thread Rajiv Kurian
ple > use that. > > Also, when those producers had the issue, were there any other things weird > in the broker (e.g., broker's ZK session expires)? > > Thanks, > > Jun > > On Thu, Dec 17, 2015 at 2:37 PM, Rajiv Kurian <ra...@signalfx.com> wrote: > > > I can't think of

Kafka 0.9 consumer API question

2015-12-15 Thread Rajiv Kurian
We are trying to use the Kafka 0.9 consumer API to poll specific partitions. We consume partitions based on our own logic instead of delegating that to Kafka. One of our use cases is handling a change in the partitions that we consume. This means that sometimes we need to consume additional

Re: Fallout from upgrading to kafka 0.9 from 0.8.2.3

2015-12-15 Thread Rajiv Kurian
the brokers were running 0.9 code with inter.broker.protocol.version=0.8.2.X I restarted them one by one with the 0.8.2.3 broker code. This however like I mentioned did not fix the three broken topics. On Mon, Dec 14, 2015 at 3:13 PM, Rajiv Kurian <ra...@signalfx.com> wrote: > Now that it

Re: Kafka 0.9 consumer API question

2015-12-15 Thread Rajiv Kurian
> unless profiling shows that it is actually a problem. Are your partition > assignments generally very large? > > -Jason > > > On Tue, Dec 15, 2015 at 1:32 PM, Rajiv Kurian <ra...@signalfx.com> wrote: > > > We are trying to use the Kafka 0.9 consumer API to poll spe

Re: Kafka 0.9 consumer API question

2015-12-15 Thread Rajiv Kurian
y the set returned by > assignment(). Then it's just one copy to convert it to a list, and we can > fix this by adding the assign() variant I suggested above. > > By the way, here's a link to the JIRA I created: > https://issues.apache.org/jira/browse/KAFKA-2991. > > -Jason > > On T

Fallout from upgrading to kafka 0.9 from 0.8.2.3

2015-12-14 Thread Rajiv Kurian
I upgraded one of our Kafka clusters (9 nodes) from 0.8.2.3 to 0.9 following the instructions at http://kafka.apache.org/documentation.html#upgrade Most things seem to work fine based on our metrics. Something I noticed is that the network out on 3 of the nodes goes up every 5-6 minutes. I see a

Re: Any gotchas upgrading to 0.9?

2015-12-14 Thread Rajiv Kurian
Scratch that. On more careful observation I do see this in the logs: inter.broker.protocol.version = 0.8.2.X On Mon, Dec 14, 2015 at 10:25 AM, Rajiv Kurian <ra...@signalfx.com> wrote: > I am in the process of updating to 0.9 and had another question. > > The docs at http://k

Re: Any gotchas upgrading to 0.9?

2015-12-14 Thread Rajiv Kurian
. So the normal > upgrade path is server-first, clients later. > > Filed https://issues.apache.org/jira/browse/KAFKA-2923 to update the > upgrade doc to include it. > > Guozhang > > On Tue, Dec 1, 2015 at 11:14 AM, Rajiv Kurian <ra...@signalfx.com> wrote: > &g

Re: Fallout from upgrading to kafka 0.9 from 0.8.2.3

2015-12-14 Thread Rajiv Kurian
Now that it has been a bit longer, the spikes I was seeing are gone but the CPU and network in/out on the three brokers that were showing the spikes are still much higher than before the upgrade. Their CPUs have increased from around 1-2% to 12-20%. The network in on the same brokers has gone up

Any gotchas upgrading to 0.9?

2015-12-01 Thread Rajiv Kurian
I plan to upgrade both the server and clients to 0.9. Had a few questions before I went ahead with the upgrade: 1. Do all brokers need to be on 0.9? Currently we are running 0.8.2. We'd ideally like to convert only a few brokers to 0.9 and only if we don't see problems convert the rest. 2. Is it

Re: Any gotchas upgrading to 0.9?

2015-12-01 Thread Rajiv Kurian
I saw the upgrade path documentation at http://kafka.apache.org/documentation.html and that kind of answers (1). Not sure if there is anything about client compatibility though. On Tue, Dec 1, 2015 at 8:51 AM, Rajiv Kurian <ra...@signalfx.com> wrote: > I plan to upgrade both t

Re: Any gotchas upgrading to 0.9?

2015-12-01 Thread Rajiv Kurian
defaults. For example we have a certain cluster dedicated to serving a single important topic and we'd hate for it to be throttled because of incorrect defaults. Thanks, Rajiv On Tue, Dec 1, 2015 at 8:54 AM, Rajiv Kurian <ra...@signalfx.com> wrote: > I saw the upgrade path documentation

Re: Any gotchas upgrading to 0.9?

2015-12-01 Thread Rajiv Kurian
ag.time.max.ms, it is > > > considered dead and is removed from the ISR. The mechanism of detecting > > > slow replicas has changed - if a replica starts lagging behind the > leader > > > for longer than replica.lag.time.max.ms, then it is considered too > slow &g

Re: Partial broker shutdown causing producers to stall even with replicas

2015-11-16 Thread Rajiv Kurian
Yes I think so. We specifically upgraded the Kafka broker with a patch to avoid the ZK client NPEs. Guess not all of them are fixed. The Kafka broker becoming a zombie even if one ZK node is bad is especially terrible. On Tuesday, November 17, 2015, Mahdi Ben Hamida wrote: >

Re: Kafka 0.9.0 release branch

2015-10-13 Thread Rajiv Kurian
A bit off topic but does this release contain the new single threaded consumer that supports the poll interface? Thanks! On Mon, Oct 12, 2015 at 1:31 PM, Jun Rao wrote: > Hi, Everyone, > > As we are getting closer to the 0.9.0 release, we plan to cut an 0.9.0 > release

Re: Questions on the new Kafka consumer

2015-10-13 Thread Rajiv Kurian
position what does the consumer group do? Do I still have to specify it all the time? Thanks, Rajiv On Tue, Oct 13, 2015 at 1:14 PM, Rajiv Kurian <ra...@signalfx.com> wrote: > I was reading the documentation for the new Kafka consumer API at > https://github.com/apache/kafka/blob/trunk

Questions on the new Kafka consumer

2015-10-13 Thread Rajiv Kurian
I was reading the documentation for the new Kafka consumer API at https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/consumer/KafkaConsumer.java and came across this: "Each Kafka consumer must specify a consumer group that it belongs to." Currently we use

Debugging high log flush latency on a broker.

2015-09-22 Thread Rajiv Kurian
I have a particular broker(version 0.8.2.1) in a cluster receiving about 15000 messages/second of around 100 bytes each (bytes-in / messages-in). This broker has bursts of really high log flush latency p95s. The latency sometimes goes to above 1.5 seconds from a steady state of < 20 ms. Running

Re: Debugging high log flush latency on a broker.

2015-09-22 Thread Rajiv Kurian
> filesystem free list (or whatever data structure it is these days (-: ). > > -Steve > > On Tue, Sep 22, 2015 at 11:46:49AM -0700, Rajiv Kurian wrote: > > Also any hints on how I can find the exact topic/partitions assigned to > > this broker? I know in

Re: Debugging periodic high kafka log flush times

2015-09-21 Thread Rajiv Kurian
a 100 bytes each) I'd think we are over-provisioned. But we still see periodic jumps in log flush latency. Any hints on what else we might measure/check etc to figure this out? On Thu, Sep 17, 2015 at 4:39 PM, Rajiv Kurian <ra...@signalfx.com> wrote: > We have a 9 node cluster runnin

Closing connection messages

2015-09-17 Thread Rajiv Kurian
My broker logs are full of messages of the following type of log message: INFO [kafka-network-thread-9092-1] [kafka.network.Processor ]: Closing socket connection to /some_ip_that_I_know. I see at least one every 4-5 seconds. Something I identified was that the ip of the closed

Re: Higher CPU going from 0.8.1 to 0.8.2.1

2015-08-21 Thread Rajiv Kurian
connections with are running the Java wrapper over the Scala SimpleConsumer. Is there any logging I can enable to understand why exactly these connections are being closed so often? Thanks, Rajiv On Fri, Aug 21, 2015 at 3:50 PM, Rajiv Kurian ra...@signalfuse.com wrote: We upgraded a 9 broker cluster

Higher CPU going from 0.8.1 to 0.8.2.1

2015-08-21 Thread Rajiv Kurian
We upgraded a 9 broker cluster from version 0.8.1 to version 0.8.2.1. Actually we cherry-picked the commit at 41ba26273b497e4cbcc947c742ff6831b7320152 to get zkClient 0.5 because we ran into a bug described at https://issues.apache.org/jira/browse/KAFKA-824 Right after the update the CPU spiked

Re: Higher CPU going from 0.8.1 to 0.8.2.1

2015-08-21 Thread Rajiv Kurian
a profiling on your broker process? Any hot code path differences between these two versions? Thanks, -Tao On Fri, Aug 21, 2015 at 3:59 PM, Rajiv Kurian ra...@signalfuse.com wrote: The only thing I notice in the logs which is a bit unsettling is about a once a second rate of messages

Re: Trying to debug a kafka failure

2015-08-13 Thread Rajiv Kurian
the restart fixed it either. On Wed, Aug 12, 2015 at 1:52 PM, Rajiv Kurian ra...@signalfuse.com wrote: We run around 10 kafka brokers running 0.8.1. This morning we had a failure where some of the partitions were under replicated. We narrowed the problem down to the following metrics on that one

Re: Monitoring kafka metrics

2015-08-13 Thread Rajiv Kurian
kafka.server:type=BrokerTopicMetrics,name=My_Topic-FailedProduceRequestsPerSec On Thu, Aug 13, 2015 at 2:27 PM, Rajiv Kurian ra...@signalfuse.com wrote: Till recently we were on 0.8.1 and updated to 0.8.2.1 Everything seems to work but I am no longer seeing metrics reported from the broker that was updated

Re: Monitoring kafka metrics

2015-08-13 Thread Rajiv Kurian
The problem was that the metric names had all changed in the latest version. Fixing the names seems to have done it. On Thu, Aug 13, 2015 at 3:13 PM, Rajiv Kurian ra...@signalfuse.com wrote: Aah that seems like a red herring - seems like the underlying cause is that the MBeans I was trying

Monitoring kafka metrics

2015-08-13 Thread Rajiv Kurian
Till recently we were on 0.8.1 and updated to 0.8.2.1 Everything seems to work but I am no longer seeing metrics reported from the broker that was updated to the new version. My config file has the following lines: kafka.metrics.polling.interval.secs=5

Best way to replace a single broker

2015-05-14 Thread Rajiv Kurian
Hi all, Sometimes we need to replace a kafka broker because it turns out to be a bad instance. What is the best way of doing this? We have been using the kafka-reassign-partitions.sh to migrate all topics to the new list of brokers which is the (old list + the new instance - the bad instance).
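The list this thread describes feeding to `kafka-reassign-partitions.sh` (old list + the new instance − the bad instance) is easy to get wrong by hand; a tiny helper sketch, assuming broker ids are plain ints:

```python
def replacement_broker_list(current_brokers, bad_broker, new_broker):
    """Compute the broker list for a reassignment that swaps one bad
    instance for one new instance: (current - bad) + new, sorted."""
    if bad_broker not in current_brokers:
        raise ValueError("broker %d is not in the cluster" % bad_broker)
    return sorted(set(current_brokers) - {bad_broker} | {new_broker})

print(replacement_broker_list([1, 2, 3], 2, 9))  # [1, 3, 9]
```

The resulting list is what the `--generate` step of the reassignment tool would take as its broker list.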

Re: Best way to replace a single broker

2015-05-14 Thread Rajiv Kurian
After this is done (all replicas are in sync) you can trigger leader election (or preferred replica election, whatever it is called) if it does not happen automatically. -- Andrey Yegorov On Thu, May 14, 2015 at 11:12 AM, Rajiv Kurian ra...@signalfuse.com wrote: Hi all

Kafka log flush time outlier

2015-05-13 Thread Rajiv Kurian
I have a single broker in a cluster of 9 brokers that has a log-flush-time-99th of 260 ms or more. Other brokers have a log-flush-time-99th of less than 30 ms. The misbehaving broker is running on the same kind of machine (c3.4x on Ec2) that the other ones are running on. It's bytes-in, bytes-out,

What is the expected way to deal with out of space on brokers

2015-04-06 Thread Rajiv Kurian
I have had some brokers die because of lack of disk space. The logs for all partitions were way higher (5G+) than I would have expected given how I configured them (100 MB size AND 1h rollover). What is the recommended way of recovering from this error? Should I delete certain log files
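For reference, the size and rollover limits mentioned here apply per partition, not per broker, which is a common reason disk usage overshoots expectations. A hypothetical `server.properties` fragment (values are illustrative only, not the poster's actual config):

```properties
# Retention limits are enforced per partition, so total disk use is
# roughly partitions x log.retention.bytes (plus one open segment each),
# not a broker-wide cap.
log.retention.bytes=104857600          # ~100 MB retained per partition
log.segment.bytes=104857600            # deletion happens whole segments at a time
log.roll.hours=1                       # force a new segment at least hourly
log.retention.check.interval.ms=300000 # how often the cleaner checks
```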

Re: Kafka 0.9 consumer API

2015-03-26 Thread Rajiv Kurian
that can be probably re-used for the consumer as well. org.apache.kafka.clients.producer.internals.BufferPool Please feel free to add more comments on KAFKA-2045. Guozhang On Tue, Mar 24, 2015 at 12:21 PM, Rajiv Kurian ra...@signalfuse.com wrote: Hi Guozhang, Yeah the main

Re: Kafka 0.9 consumer API

2015-03-24 Thread Rajiv Kurian
allocate new buffer for each de-compressed message, and it may be required to do de-compression with re-useable buffer with memory control. I will create a ticket under KAFKA-1326 for that. Guozhang Guozhang On Sun, Mar 22, 2015 at 1:22 PM, Rajiv Kurian ra...@signalfuse.com wrote: Hi

Re: Kafka 0.9 consumer API

2015-03-22 Thread Rajiv Kurian
, Rajiv Kurian ra...@signalfuse.com wrote: I had a few more thoughts on the new API. Currently we use kafka to transfer really compact messages - around 25-35 bytes each. Our use case is a lot of messages but each very small. Will it be possible to do the following to reuse

Re: Kafka 0.9 consumer API

2015-03-22 Thread Rajiv Kurian
. -Jay On Sat, Mar 21, 2015 at 9:17 PM, Rajiv Kurian ra...@signalfuse.com wrote: Just a follow up - I have implemented a pretty hacky prototype It's too unclean to share right now but I can clean it up if you are interested. I don't think it offers anything that people already don't

Re: Kafka 0.9 consumer API

2015-03-20 Thread Rajiv Kurian
work in progress is documented here: On Thu, Mar 19, 2015 at 7:18 PM, Rajiv Kurian ra...@signalfuse.com javascript:; wrote: Is there a link to the proposed new consumer non-blocking API? Thanks, Rajiv

Re: Kafka 0.9 consumer API

2015-03-20 Thread Rajiv Kurian
. When this API is available we plan to use a single thread to get data from kafka, process them as well as run periodic jobs. For the periodic jobs to run we need a guarantee on how much time the poll call can take at most. Thanks! On Fri, Mar 20, 2015 at 6:59 AM, Rajiv Kurian ra...@signalfuse.com

Kafka 0.9 consumer API

2015-03-19 Thread Rajiv Kurian
Is there a link to the proposed new consumer non-blocking API? Thanks, Rajiv

Recommended way of handling brokers coming up / down in the SimpleConsumer

2015-02-10 Thread Rajiv Kurian
I am using the SimpleConsumer to consume specific partitions on specific processes. The workflow is kind of like this: i) An external arbiter assigns partitions to a specific processes. It provides the guarantees of: a) All partitions are consumed by the cluster. b) A single partition is

Re: SimpleConsumer leaks sockets on an UnresolvedAddressException

2015-01-27 Thread Rajiv Kurian
, Guozhang Wang wangg...@gmail.com wrote: Rajiv, Which version of Kafka are you using? I just checked SimpleConsumer's code, and in its close() function, disconnect() is called, which will close the socket. Guozhang On Mon, Jan 26, 2015 at 2:36 PM, Rajiv Kurian ra...@signalfuse.com wrote

Re: SimpleConsumer leaks sockets on an UnresolvedAddressException

2015-01-27 Thread Rajiv Kurian
:0.8.0] On Tue, Jan 27, 2015 at 10:19 AM, Rajiv Kurian ra...@signalfuse.com wrote: I am using 0.8.1. The source is here: https://github.com/apache/kafka/blob/0.8.1/core/src/main/scala/kafka/consumer/SimpleConsumer.scala Here is the definition of disconnect(): private def disconnect

SimpleConsumer leaks sockets on an UnresolvedAddressException

2015-01-26 Thread Rajiv Kurian
Here is my typical flow: void run() { if (simpleConsumer == null) { simpleConsumer = new SimpleConsumer(host, port, (int) kafkaSocketTimeout, kafkaRExeiveBufferSize, clientName); } try { // Do stuff with simpleConsumer. } catch (Exception e) { if (consumer != null) {
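Note the truncated snippet checks one variable (`consumer`) but closes another (`simpleConsumer`), which is one way a socket can leak. A minimal Python stand-in, with a hypothetical `FakeConsumer` in place of `SimpleConsumer`, sketching the intended close-and-reset pattern:

```python
created = []  # track every consumer instance so callers can inspect them

class FakeConsumer:
    """Stand-in for Kafka's SimpleConsumer so the pattern runs without a broker."""
    def __init__(self):
        self.closed = False
        created.append(self)

    def fetch(self):
        # Simulate the UnresolvedAddressException from the thread title.
        raise OSError("unresolved address")

    def close(self):
        self.closed = True

consumer = None

def run_once():
    """Lazily (re)create the consumer; on failure, close and drop the SAME
    reference that was null-checked, so the next call reconnects cleanly."""
    global consumer
    if consumer is None:
        consumer = FakeConsumer()
    try:
        consumer.fetch()
    except Exception:
        if consumer is not None:
            consumer.close()  # release the socket before dropping the reference
            consumer = None
```

After a failed `run_once()`, the instance has been closed and the reference cleared, so no socket is leaked across retries.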

Re: Kafka sending messages with zero copy

2015-01-26 Thread Rajiv Kurian
for this proposal, it would be great if you can upload some implementation patch for the CAS idea and show some memory usage / perf differences. Guozhang On Sun, Dec 14, 2014 at 9:27 PM, Rajiv Kurian ra...@signalfuse.com wrote: Resuscitating this thread. I've done some more experiments

Re: SimpleConsumer leaks sockets on an UnresolvedAddressException

2015-01-26 Thread Rajiv Kurian
) { logger.error(e); // Assume UnresolvedAddressException. if (consumer != null) { simpleConsumer.close(); simpleConsumer = null; } } } } On Mon, Jan 26, 2015 at 2:27 PM, Rajiv Kurian ra...@signalfuse.com wrote: Here is my typical flow: void run

Re: Trying to figure out kafka latency issues

2014-12-30 Thread Rajiv Kurian
about that other log message. It says 4 bytes written to log xyz but what it is actually logging is the number of messages not the number of bytes, so that is quite misleading and a bug. -Jay On Mon, Dec 29, 2014 at 6:37 PM, Rajiv Kurian ra...@signalfuse.com wrote: Thanks Jay. I just

Re: Trying to figure out kafka latency issues

2014-12-30 Thread Rajiv Kurian
Edit: I had to set kafka.request.logger=TRACE to see the request timings. On Tue, Dec 30, 2014 at 4:37 PM, Rajiv Kurian ra...@signalfuse.com wrote: Got it. I had to enable to see the logs. Here are two files with the timings I got from the period I had logging enabled: Producer Requests

Re: Trying to figure out kafka latency issues

2014-12-30 Thread Rajiv Kurian
the pauses occur. Alternately if you can find a reproducible test case we can turn into a JIRA someone else may be willing to dive in. -Jay On Tue, Dec 30, 2014 at 4:37 PM, Rajiv Kurian ra...@signalfuse.com wrote: Got it. I had to enable to see the logs. Here are two files

Re: Trying to figure out kafka latency issues

2014-12-29 Thread Rajiv Kurian
things better :) Yeah I have no experience with the Java client so can't really help there. Good luck! -Original Message- From: Rajiv Kurian [ra...@signalfuse.com] Received: Sunday, 21 Dec 2014, 12:25PM To: users@kafka.apache.org [users@kafka.apache.org

Re: Trying to figure out kafka latency issues

2014-12-29 Thread Rajiv Kurian
again! On Mon, Dec 29, 2014 at 10:22 AM, Rajiv Kurian ra...@signalfuse.com wrote: Thanks Jay. Will check (1) and (2) and get back to you. The test is not stand-alone now. It might be a bit of work to extract it to a stand-alone executable. It might take me a bit of time to get that going

Re: Trying to figure out kafka latency issues

2014-12-29 Thread Rajiv Kurian
). Thanks! On Mon, Dec 29, 2014 at 3:02 PM, Rajiv Kurian ra...@signalfuse.com wrote: Hi Jay, Re (1) - I am not sure how to do this? Actually I am not sure what this means. Is this the time every write/fetch request is received on the broker? Do I need to enable some specific log level

Re: Trying to figure out kafka latency issues

2014-12-29 Thread Rajiv Kurian
In case the attachments don't work out here is an imgur link - http://imgur.com/NslGpT3,Uw6HFow#0 On Mon, Dec 29, 2014 at 3:13 PM, Rajiv Kurian ra...@signalfuse.com wrote: Never mind about (2). I see these stats are already being output by the kafka producer. I've attached a couple

Re: Trying to figure out kafka latency issues

2014-12-29 Thread Rajiv Kurian
) and doesn't seem to be waiting on memory which is somewhat surprising to me (the NaN is an artifact of how we compute that stat--no allocations took place in the time period measured so it is kind of 0/0). -Jay On Mon, Dec 29, 2014 at 4:54 PM, Rajiv Kurian ra...@signalfuse.com wrote: In case

Re: Trying to figure out kafka latency issues

2014-12-28 Thread Rajiv Kurian
with the Java client so can't really help there. Good luck! -Original Message- From: Rajiv Kurian [ra...@signalfuse.com] Received: Sunday, 21 Dec 2014, 12:25PM To: users@kafka.apache.org [users@kafka.apache.org] Subject: Re: Trying to figure out kafka latency issues I'll take a look at the GC
