Re: broker randomly shuts down

allen chan Thu, 30 Jun 2016 17:07:40 -0700

Anyone else have ideas?

This is still happening. I moved ZooKeeper off the server onto its own
dedicated VMs.
Kafka starts with 4G of heap and was nowhere near consuming that much when
it crashed.
I bumped up the ZooKeeper timeout settings, but that has not solved it.


I also disconnected all the producers and consumers. This points to something
between Kafka and ZooKeeper, right?

Again, the logs are no help as to why Kafka decided to shut itself down:
https://gist.github.com/allenmchan/f9331e54bb4fd77cc5bc0b031a7a6206
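
For anyone who wants to rule the OOM-killer theory (raised further down the
thread) in or out, here is a minimal sketch that scans kernel log output for
typical OOM-killer messages. The function names find_oom_events and read_dmesg
are made up for illustration, and the matched substrings are assumptions: the
exact wording differs between kernel versions, so an empty result is a hint,
not proof.

```python
import re
import subprocess

# Substrings the Linux kernel commonly logs when the OOM killer fires.
# Assumption: exact wording varies by kernel version, so adjust as needed.
OOM_PATTERN = re.compile(
    r"invoked oom-killer|Out of memory|Killed process", re.IGNORECASE
)


def find_oom_events(lines):
    """Return the lines that look like OOM-killer activity."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]


def read_dmesg():
    """Run dmesg and return its output as a list of lines (may need root)."""
    result = subprocess.run(
        ["dmesg"], capture_output=True, text=True, check=False
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    # Print every kernel log line that matches the OOM pattern.
    for event in find_oom_events(read_dmesg()):
        print(event)
```

The same find_oom_events function can be pointed at a saved file such as
/var/log/messages by passing it the open file object instead of the dmesg
output.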




On Thu, Jun 2, 2016 at 4:22 PM, Russ Lavoie <russlav...@gmail.com> wrote:

> What about in dmesg?  I have run into this issue and it was the OOM
> killer.  I also ran into a heap issue using too much of the direct memory
> (JVM).  Reducing the fetcher threads helped with that problem.
> On Jun 2, 2016 12:19 PM, "allen chan" <allen.michael.c...@gmail.com>
> wrote:
>
> > Hi Tom,
> >
> > That is one of the first things that I checked. Active memory never goes
> > above 50% of overall available. File cache uses the rest of the memory, but
> > I do not think that triggers the OOM killer.
> > Either way, there are no entries in /var/log/messages (CentOS) to show an
> > OOM is happening.
> >
> > Thanks
> >
> > On Thu, Jun 2, 2016 at 5:36 AM, Tom Crayford <tcrayf...@heroku.com> wrote:
> >
> > > That looks like somebody is killing the process. I'd suspect either the
> > > Linux OOM killer or something else automatically killing the JVM for some
> > > reason.
> > >
> > > For the OOM killer, assuming you're on Ubuntu, it's pretty easy to find in
> > > /var/log/syslog (depending on your setup). I don't know about other
> > > operating systems.
> > >
> > > On Thu, Jun 2, 2016 at 5:54 AM, allen chan <allen.michael.c...@gmail.com>
> > > wrote:
> > >
> > > > I have an issue where my brokers randomly shut themselves down.
> > > > I turned on debug in log4j.properties but still do not see a reason why
> > > > the shutdown is happening.
> > > >
> > > > Anyone seen this behavior before?
> > > >
> > > > version 0.10.0
> > > > log4j.properties
> > > >     log4j.rootLogger=DEBUG, kafkaAppender
> > > > * I tried TRACE level but I do not see any additional log messages
> > > >
> > > > snippet of log around shutdown
> > > > [2016-06-01 15:11:51,374] DEBUG Got ping response for sessionid:
> > > > 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > [2016-06-01 15:11:53,376] DEBUG Got ping response for sessionid:
> > > > 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > [2016-06-01 15:11:55,377] DEBUG Got ping response for sessionid:
> > > > 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > [2016-06-01 15:11:57,380] DEBUG Got ping response for sessionid:
> > > > 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > [2016-06-01 15:11:59,383] DEBUG Got ping response for sessionid:
> > > > 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > [2016-06-01 15:12:01,386] DEBUG Got ping response for sessionid:
> > > > 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > [2016-06-01 15:12:03,389] DEBUG Got ping response for sessionid:
> > > > 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > [2016-06-01 15:12:04,121] INFO [Group Metadata Manager on Broker 2]:
> > > > Removed 0 expired offsets in 0 milliseconds.
> > > > (kafka.coordinator.GroupMetadataManager)
> > > > [2016-06-01 15:12:04,121] INFO [Group Metadata Manager on Broker 2]:
> > > > Removed 0 expired offsets in 0 milliseconds.
> > > > (kafka.coordinator.GroupMetadataManager)
> > > > [2016-06-01 15:12:05,390] DEBUG Got ping response for sessionid:
> > > > 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > [2016-06-01 15:12:07,393] DEBUG Got ping response for sessionid:
> > > > 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > [2016-06-01 15:12:09,396] DEBUG Got ping response for sessionid:
> > > > 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > [2016-06-01 15:12:11,399] DEBUG Got ping response for sessionid:
> > > > 0x2550a693b470001 after 1ms (org.apache.zookeeper.ClientCnxn)
> > > > [2016-06-01 15:12:13,334] INFO [Kafka Server 2], shutting down
> > > > (kafka.server.KafkaServer)
> > > > [2016-06-01 15:12:13,334] INFO [Kafka Server 2], shutting down
> > > > (kafka.server.KafkaServer)
> > > > [2016-06-01 15:12:13,336] INFO [Kafka Server 2], Starting controlled
> > > > shutdown (kafka.server.KafkaServer)
> > > > [2016-06-01 15:12:13,336] INFO [Kafka Server 2], Starting controlled
> > > > shutdown (kafka.server.KafkaServer)
> > > > [2016-06-01 15:12:13,338] DEBUG Added sensor with name
> > > > connections-closed:
> > > > (org.apache.kafka.common.metrics.Metrics)
> > > > [2016-06-01 15:12:13,338] DEBUG Added sensor with name
> > > > connections-created:
> > > > (org.apache.kafka.common.metrics.Metrics)
> > > > [2016-06-01 15:12:13,338] DEBUG Added sensor with name
> > > > bytes-sent-received:
> > > > (org.apache.kafka.common.metrics.Metrics)
> > > > [2016-06-01 15:12:13,338] DEBUG Added sensor with name bytes-sent:
> > > > (org.apache.kafka.common.metrics.Metrics)
> > > > [2016-06-01 15:12:13,339] DEBUG Added sensor with name
> > > > bytes-received:
> > > > (org.apache.kafka.common.metrics.Metrics)
> > > > [2016-06-01 15:12:13,339] DEBUG Added sensor with name select-time:
> > > > (org.apache.kafka.common.metrics.Metrics)
> > > >
> > > > --
> > > > Allen Michael Chan
> > > >
> > >
> >
> >
> >
> > --
> > Allen Michael Chan
> >
>



-- 
Allen Michael Chan
