Is there anything wrong with brokers around that time? E.g. Broker restart?
The logs you pasted are actually from the replica fetchers. Could you paste
the related entries from controller.log?
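
For example, assuming a default log4j setup where the controller log is
written to controller.log under the Kafka log directory (the path below is
just a placeholder), something like this should pull out the relevant
entries:

  grep -iE "preferred replica|leader election" /path/to/kafka/logs/controller.log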

Thanks.

Jiangjie (Becket) Qin

On 3/9/15, 10:32 AM, "Zakee" <kzak...@netzero.net> wrote:

>Correction: The rebalance actually kept happening until about 24 hours
>after the start, and that is where the errors below were found. Ideally the
>rebalance should not have happened at all.
>
>
>Thanks
>Zakee
>
>
>
>> On Mar 9, 2015, at 10:28 AM, Zakee <kzak...@netzero.net> wrote:
>> 
>>> Hmm, that sounds like a bug. Can you paste the log of leader rebalance
>>> here?
>> Thanks for your suggestions.
>> It looks like the rebalance actually happened only once, soon after I
>> started with a clean cluster and data was pushed. It has not happened
>> again so far, and the partition leader counts on the brokers have not
>> changed since then. One of the brokers constantly shows 0 for its
>> partition leader count. Is that normal?
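>> (For context, the per-broker leader count can be read off the output of
>> kafka-topics --describe, e.g.:
>>   bin/kafka-topics.sh --zookeeper <zk:2181> --describe | grep -c "Leader: 3"
>> where broker id 3 and the ZooKeeper address are just placeholders.)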
>> 
>> Also, I still see lots of the errors below (~69k of them) in the logs
>> since the restart. Is there any reason other than a rebalance for these
>> errors?
>> 
>> [2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-2-5], Error for partition [Topic-11,7] to broker 5:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
>> [2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-1-5], Error for partition [Topic-2,25] to broker 5:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
>> [2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-2-5], Error for partition [Topic-2,21] to broker 5:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
>> [2015-03-07 14:23:28,963] ERROR [ReplicaFetcherThread-1-5], Error for partition [Topic-22,9] to broker 5:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
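>> (In case anyone wants to reproduce that count, a plain grep over the
>> broker log should do it; the log path is just a placeholder:
>>   grep -c NotLeaderForPartitionException /path/to/kafka/logs/server.log )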
>> 
>>> Some other things to check are:
>>> 1. The actual property name is auto.leader.rebalance.enable, not
>>> auto.leader.rebalance. You probably already know this; just
>>> double-checking.
>> Yes 
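>> (i.e., server.properties on each broker has: auto.leader.rebalance.enable=false)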
>> 
>>> 2. In ZooKeeper, can you verify that the path
>>> /admin/preferred_replica_election does not exist?
>> ls /admin
>> [delete_topics]
>> ls /admin/preferred_replica_election
>> Node does not exist: /admin/preferred_replica_election
>> 
>> 
>> Thanks
>> Zakee
>> 
>> 
>> 
>>> On Mar 7, 2015, at 10:49 PM, Jiangjie Qin <j...@linkedin.com.INVALID>
>>>wrote:
>>> 
>>> Hmm, that sounds like a bug. Can you paste the log of leader rebalance
>>> here?
>>> Some other things to check are:
>>> 1. The actual property name is auto.leader.rebalance.enable, not
>>> auto.leader.rebalance. You probably already know this; just
>>> double-checking.
>>> 2. In ZooKeeper, can you verify that the path
>>> /admin/preferred_replica_election does not exist?
>>> 
>>> Jiangjie (Becket) Qin
>>> 
>>> On 3/7/15, 10:24 PM, "Zakee" <kzak...@netzero.net> wrote:
>>> 
>>>> I started with a clean cluster and started pushing data. It still does
>>>> the rebalance at random intervals even though
>>>> auto.leader.rebalance.enable is set to false.
>>>> 
>>>> Thanks
>>>> Zakee
>>>> 
>>>> 
>>>> 
>>>>> On Mar 6, 2015, at 3:51 PM, Jiangjie Qin <j...@linkedin.com.INVALID>
>>>>> wrote:
>>>>> 
>>>>> Yes, the rebalance should not happen in that case. That is a little
>>>>> bit strange. Could you try to launch a clean Kafka cluster with
>>>>> auto.leader.rebalance.enable disabled and try pushing data?
>>>>> When leader migration occurs, a NotLeaderForPartitionException is
>>>>> expected.
>>>>> 
>>>>> Jiangjie (Becket) Qin
>>>>> 
>>>>> 
>>>>> On 3/6/15, 3:14 PM, "Zakee" <kzak...@netzero.net> wrote:
>>>>> 
>>>>>> Yes, Jiangjie, I do see lots of these entries, "Starting preferred
>>>>>> replica leader election for partitions", in the logs. I also see a lot
>>>>>> of Produce request failure warnings with the
>>>>>> NotLeaderForPartitionException.
>>>>>> 
>>>>>> I tried setting auto.leader.rebalance.enable to false, but I am still
>>>>>> noticing the rebalance happening. My understanding was that the
>>>>>> rebalance would not happen when this is set to false.
>>>>>> 
>>>>>> Thanks
>>>>>> Zakee
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On Feb 25, 2015, at 5:17 PM, Jiangjie Qin
>>>>>>><j...@linkedin.com.INVALID>
>>>>>>> wrote:
>>>>>>> 
>>>>>>> I don't think num.replica.fetchers will help in this case. Increasing
>>>>>>> the number of fetcher threads only helps when a broker has a large
>>>>>>> amount of incoming data and more replica fetcher threads are needed
>>>>>>> to keep up. We usually use only 1-2 per broker. In your case, though,
>>>>>>> it looks like leader migration is causing the issue.
>>>>>>> Do you see anything else in the log, like a preferred leader election?
>>>>>>> 
>>>>>>> Jiangjie (Becket) Qin
>>>>>>> 
>>>>>>> On 2/25/15, 5:02 PM, "Zakee" <kzak...@netzero.net> wrote:
>>>>>>> 
>>>>>>>> Thanks, Jiangjie.
>>>>>>>> 
>>>>>>>> Yes, I do see under-replicated partitions, usually spiking every
>>>>>>>> hour. Is there anything I could try to reduce that?
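>>>>>>>> (In case it helps, the topic tool can list exactly which partitions
>>>>>>>> are under-replicated at a given moment; the ZooKeeper address is a
>>>>>>>> placeholder:
>>>>>>>>   bin/kafka-topics.sh --zookeeper <zk:2181> --describe --under-replicated-partitions )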
>>>>>>>> 
>>>>>>>> How does "num.replica.fetchers" affect replica sync? Currently it
>>>>>>>> is configured as 7 on each of the 5 brokers.
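>>>>>>>> (i.e., server.properties on each of the 5 brokers currently has:
>>>>>>>>   num.replica.fetchers=7 )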
>>>>>>>> 
>>>>>>>> -Zakee
>>>>>>>> 
>>>>>>>> On Wed, Feb 25, 2015 at 4:17 PM, Jiangjie Qin
>>>>>>>> <j...@linkedin.com.invalid>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> These messages are usually caused by leader migration. I think as
>>>>>>>>> long as you don't see this lasting forever and ending up with a
>>>>>>>>> bunch of under-replicated partitions, it should be fine.
>>>>>>>>> 
>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>> 
>>>>>>>>> On 2/25/15, 4:07 PM, "Zakee" <kzak...@netzero.net> wrote:
>>>>>>>>> 
>>>>>>>>>> I need to know whether I should be worried about these or ignore
>>>>>>>>>> them.
>>>>>>>>>> 
>>>>>>>>>> I see tons of these exceptions/warnings in the broker logs, and I
>>>>>>>>>> am not sure what causes them or what could be done to fix them.
>>>>>>>>>> 
>>>>>>>>>> ERROR [ReplicaFetcherThread-3-5], Error for partition [TestTopic] to broker 5:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
>>>>>>>>>> [2015-02-25 11:01:41,785] ERROR [ReplicaFetcherThread-3-5], Error for partition [TestTopic] to broker 5:class kafka.common.NotLeaderForPartitionException (kafka.server.ReplicaFetcherThread)
>>>>>>>>>> [2015-02-25 11:01:41,785] WARN [Replica Manager on Broker 2]: Fetch request with correlation id 950084 from client ReplicaFetcherThread-1-2 on partition [TestTopic,2] failed due to Leader not local for partition [TestTopic,2] on broker 2 (kafka.server.ReplicaManager)
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> Any ideas?
>>>>>>>>>> 
>>>>>>>>>> -Zakee
>>>>>>> 
>>>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>> 
>
