Re: Storing Kafka Message JSON to deep storage like S3

2016-12-06 Thread noah
If you are willing to setup Kafka Connect, my company has built this connector: https://github.com/spredfast/kafka-connect-s3

0.9 KafkaConsumer Memory Usage

2016-06-21 Thread noah
I'm using 0.9.0.1 consumers on 0.9.0.1 brokers. In a single Java service, we have 4 producers and 5 consumers. They are all KafkaProducer and KafkaConsumer instances (the new consumer.) Since the 0.9 upgrade, this service is now OOMing after being up for a few minutes. Heap dumps show >80MB of

Strange ZK Error precedes frequent rebalances

2015-10-14 Thread noah
A number of our developers are seeing errors like the one below in their console when running a consumer on their laptop. The error is always followed by logging indicating that the local consumer is rebalancing, and in the meantime we are not making much progress. I'm reading this as the

Re: Strange ZK Error precedes frequent rebalances

2015-10-14 Thread noah
sumer > > Gwen > > On Wed, Oct 14, 2015 at 1:47 PM, noah <iamn...@gmail.com> wrote: > > > A number of our developers are seeing errors like the one below in their > > console when running a consumer on their laptop. The error is always > > followed by logging i

Re: Frequent Consumer and Producer Disconnects

2015-09-26 Thread noah
Now, given you have a topic with 16 > partitions, and you're running 23 consumers, 7 of those consumer threads > are going to be idle because they do not own partitions. > > -Todd > > > On Fri, Sep 25, 2015 at 3:27 PM, noah <iamn...@gmail.com> wrote: > >> We're
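Todd's arithmetic above follows from Kafka's ownership rule: within a consumer group, each partition is assigned to at most one consumer thread, so any threads beyond the partition count get nothing. A minimal sketch of the calculation:

```java
// Within one consumer group, each partition is owned by at most one
// consumer thread, so threads beyond the partition count sit idle.
public class IdleConsumers {
    static int idleThreads(int consumerThreads, int partitions) {
        return Math.max(0, consumerThreads - partitions);
    }

    public static void main(String[] args) {
        // 23 consumer threads on a 16-partition topic -> 7 idle
        System.out.println(idleThreads(23, 16));
    }
}
```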

Frequent Consumer and Producer Disconnects

2015-09-24 Thread noah
We are having issues with producers and consumers frequently fully disconnecting (from both the brokers and ZK) and reconnecting without any apparent cause. On our production systems it can happen anywhere from every 10-15 seconds to 15-20 minutes. On our less beefy test systems and developer

Re: high level consumer timeout?

2015-09-23 Thread noah
ing out what is going on here, I'm in > any case quite thrilled that at least it seems to work now. :) Thanks! > -J > > -Original Message- > From: noah [mailto:iamn...@gmail.com] > Sent: 23 September 2015 09:44 > To: users@kafka.apache.org > Subject: Re: high level consumer time

Re: high level consumer timeout?

2015-09-23 Thread noah
Assuming this is a test case with a new topic/consumer groups for each run, do you have auto.offset.reset=smallest? This happens to me constantly in tests because my consumers end up missing the first message since the default is largest (in which case auto commit is a red herring.) On Wed, Sep
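For the old high-level consumer discussed here, the fix is a one-line config change. A minimal fragment (pre-0.9 property name; the new consumer renamed the values to `earliest`/`latest`):

```properties
# consumer.properties for the old (pre-0.9) high-level consumer:
# when the group has no committed offset, start from the earliest message,
# so a brand-new test consumer doesn't skip messages produced before it joined
auto.offset.reset=smallest
```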

Re: committing offsets

2015-09-22 Thread noah
If you are using the console consumer to check the offsets topic, remember that you need this line in consumer.properties: exclude.internal.topics=false On Tue, Sep 22, 2015 at 6:05 AM Joris Peeters wrote: > Ah, nice! Does not look like it is working, though. For
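The fragment in question, for reference (without it the console consumer silently filters out internal topics):

```properties
# consumer.properties: required to see internal topics
# such as __consumer_offsets from the console consumer
exclude.internal.topics=false
```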

Re: log.retention.hours not working?

2015-09-21 Thread noah
"minimum age of a log file to be eligible for deletion" The key word is minimum. If you only have 1k of logs, Kafka doesn't need to delete anything. Push more data through and, when it needs to, it will start deleting old logs. On Mon, Sep 21, 2015 at 8:58 PM allen chan
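A related subtlety worth adding: retention only ever deletes closed log segments, and with very little traffic the active segment may never roll. The relevant broker settings (real config keys; the values shown are the defaults of that era, given here only for illustration):

```properties
# broker config: a segment is only *eligible* for deletion after this age
log.retention.hours=168
# retention acts on closed segments; rolling sooner (by size or time)
# lets deletion actually happen on a low-traffic topic
log.segment.bytes=1073741824
log.roll.hours=168
```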

Tools/recommendations to debug performance issues?

2015-09-14 Thread noah
We're using 0.8.2.1 processing maybe 1 million messages per hour. Each message includes tracking information with a timestamp for when it was produced, and a timestamp for when it was consumed, to give us roughly the amount of time it spent in Kafka. On average this number is in the seconds and
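The tracking described above amounts to stamping each message at produce time and again at consume time, then taking the difference. A sketch of that bookkeeping (class and field names are illustrative, not from the original post):

```java
import java.time.Duration;
import java.time.Instant;

// Each message carries a produce timestamp; the consumer records its own
// timestamp on receipt. The difference approximates time spent in Kafka
// (modulo clock skew between producer and consumer hosts).
public class KafkaLatency {
    static Duration timeInKafka(Instant producedAt, Instant consumedAt) {
        return Duration.between(producedAt, consumedAt);
    }

    public static void main(String[] args) {
        Instant produced = Instant.parse("2015-09-14T12:00:00Z");
        Instant consumed = Instant.parse("2015-09-14T12:00:04Z");
        System.out.println(timeInKafka(produced, consumed).getSeconds());
    }
}
```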

Re: How to monitor lag when "kafka" is used as offset.storage?

2015-09-02 Thread noah
We use Burrow. There are REST endpoints you can use to get offsets and manually calculate lag, but if you are focused on alerting, I'd use its consumer statuses as they are a bit smarter than a simple lag calculation. On Wed, Sep 2, 2015 at 4:08 AM shahab
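The "simple lag calculation" mentioned here is just the broker's log-end offset minus the group's last committed offset, per partition. A minimal sketch (the offset values are made up for illustration):

```java
// Per-partition consumer lag as any monitor would compute it:
// messages written to the partition minus messages the group has committed.
public class ConsumerLag {
    static long lag(long logEndOffset, long committedOffset) {
        // clamp at zero: a commit can briefly race ahead of a stale end offset
        return Math.max(0, logEndOffset - committedOffset);
    }

    public static void main(String[] args) {
        System.out.println(lag(1500, 1342));
    }
}
```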

Re: Using Kafka as a persistent store

2015-07-10 Thread noah
I don't want to endorse this use of Kafka, but assuming you can give your message unique identifiers, I believe using log compaction will keep all unique messages forever. You can read about how consumer offsets stored in Kafka are managed using a compacted topic here:
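The compaction setup referred to above is a per-topic policy plus a broker-side switch (the cleaner was disabled by default in releases of that era):

```properties
# per-topic config: compaction keeps the latest message for each unique key
# indefinitely; older messages with the same key are eventually cleaned up
cleanup.policy=compact
# broker config: the log cleaner must be enabled for compaction to run
log.cleaner.enable=true
```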

Re: kafka consumer group API

2015-07-09 Thread noah
Hi! I did something similar. You can use the high level consumer but turn off auto commit and commit only what you are done with. Here's the code I used: https://github.com/iamnoah/kakfa-offsets-test On Thu, Jul 9, 2015 at 4:53 PM Shashank Singh shashank.ru...@gmail.com wrote: Hi Team I was
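The "turn off auto commit" part is a single old-consumer property; committing is then an explicit call from application code once processing is done:

```properties
# old high-level consumer: disable periodic auto commit so offsets are
# only advanced by an explicit commitOffsets() call after processing
auto.commit.enable=false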

Re: How to monitor consuming rate and lag?

2015-06-30 Thread noah
If you are committing offsets to Kafka, try Burrow: https://github.com/linkedin/Burrow On Tue, Jun 30, 2015 at 3:41 AM Shady Xu shad...@gmail.com wrote: Hi all, I'm now using https://github.com/airbnb/kafka-statsd-metrics2 to monitor our Kafka cluster. But there are not metrics about

Re: How to fetch offset in SimpleConsumer using Java

2015-06-29 Thread noah
I believe clientGroup is your consumer group id. You must've picked a value to commit with, so it needs to be the same one. On Mon, Jun 29, 2015 at 12:50 AM Xiang Zhou (Samuel) zhou...@gmail.com wrote: Hi, I use the following snippets to try to fetch the offset in a SimpleConsumer I have

Re: Message loss due to zookeeper ensemble doesn't work

2015-06-26 Thread noah
I think you have it backwards. If you don't write your consumer offsets, the worst case is that consumers will read some messages a second time. If your messages are idempotent, then you won't lose or corrupt any data. When the ZK cluster comes back up you can start writing offsets again. However,
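The idempotence argument above can be made concrete: if each message carries a unique id, a consumer that replays a batch after an offset rollback simply skips what it has already handled. A sketch under that assumption (names are illustrative, not the poster's code):

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Idempotent consumption: duplicates delivered after an offset rollback
// are detected by message id and skipped, so nothing is double-applied.
public class IdempotentConsumer {
    private final Set<String> seen = new HashSet<>();
    private int processed = 0;

    void handle(String messageId) {
        if (!seen.add(messageId)) {
            return; // already processed on a previous pass
        }
        processed++;
    }

    public static void main(String[] args) {
        IdempotentConsumer c = new IdempotentConsumer();
        // the same batch delivered twice, as after losing committed offsets
        for (String id : List.of("m1", "m2", "m3", "m1", "m2", "m3")) {
            c.handle(id);
        }
        System.out.println(c.processed);
    }
}
```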

Re: Manual Offset Commits with High Level Consumer skipping messages

2015-06-22 Thread noah
: loop on consume - process - commit offset every N messages. So we can make sure there is no weird race condition. Thanks, Jiangjie (Becket) Qin On 6/21/15, 6:23 AM, noah iamn...@gmail.com wrote: On Sun, Jun 21, 2015 at 1:10 AM Jiangjie Qin j...@linkedin.com.invalid wrote: Hey Noah

Re: Manual Offset Commits with High Level Consumer skipping messages

2015-06-21 Thread noah
On Sun, Jun 21, 2015 at 1:10 AM Jiangjie Qin j...@linkedin.com.invalid wrote: Hey Noah, Carl is right about the offset. The offset to be committed should be the largest-consumed-offset + 1. But this should not break the at least once guarantee. From what I can see, your consumer should
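The convention Jiangjie describes is that a committed offset names the *next* message to read, not the last one consumed. A one-line sketch:

```java
// Kafka's commit convention: the committed offset is the position of the
// next message to read, i.e. largest consumed offset + 1.
public class CommitPosition {
    static long offsetToCommit(long largestConsumedOffset) {
        return largestConsumedOffset + 1;
    }

    public static void main(String[] args) {
        // after consuming offsets 0..41, commit 42; a restart resumes at 42
        System.out.println(offsetToCommit(41));
    }
}
```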

Re: Manual Offset Commits with High Level Consumer skipping messages

2015-06-19 Thread noah
? I.e. where do you get the metadata for the consumed messages? On Thu, Jun 18, 2015 at 11:21 PM, noah iamn...@gmail.com wrote: We are in a situation where we need at least once delivery. We have a thread that pulls messages off the consumer, puts them in a queue where they go through a few

Manual Offset Commits with High Level Consumer skipping messages

2015-06-18 Thread noah
We are in a situation where we need at least once delivery. We have a thread that pulls messages off the consumer, puts them in a queue where they go through a few async steps, and then after the final step, we want to commit the offset to the messages we have completed. There may be items we have
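With async steps, messages complete out of order, so for at-least-once delivery the commit must never advance past an unfinished message. One way to sketch the required bookkeeping (illustrative only, not the original poster's code): track in-flight offsets, and only commit up to the lowest one still outstanding.

```java
import java.util.TreeSet;

// Messages finish the async pipeline out of order, but only a contiguous
// prefix of offsets may be committed; otherwise a crash could permanently
// skip a message that was still mid-pipeline.
public class CompletedPrefixTracker {
    private final TreeSet<Long> outstanding = new TreeSet<>();
    private long highestStarted = -1;

    void started(long offset) {
        outstanding.add(offset);
        highestStarted = Math.max(highestStarted, offset);
    }

    void completed(long offset) {
        outstanding.remove(offset);
    }

    // Safe commit position: every offset below it has finished processing.
    long safeCommitOffset() {
        return outstanding.isEmpty() ? highestStarted + 1 : outstanding.first();
    }

    public static void main(String[] args) {
        CompletedPrefixTracker t = new CompletedPrefixTracker();
        for (long o = 0; o < 5; o++) t.started(o);
        t.completed(1);
        t.completed(0);
        t.completed(3); // offset 2 still in flight
        // safe to commit 2: offsets 2..4 would be reprocessed after a crash,
        // which at-least-once delivery permits
        System.out.println(t.safeCommitOffset());
    }
}
```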