Anything going on in the Kafka broker logs?

On Fri, Apr 21, 2017 at 12:24 PM, Ali Nazemian <[email protected]>
wrote:

> Although this is a test platform with a much lower spec than production,
> it should be enough for indexing 600 docs per second. I have seen benchmark
> results of 150-200k docs per second with this spec! I haven't played with
> tuning the template yet, but I still think the current rate does not make
> sense at all.
>
> I have changed the batch size to 100. Throughput has dropped, but there is
> still a very high rate of failure!
>
> Please find the screenshots for the enrichments:
> http://imgur.com/a/ceC8f
> http://imgur.com/a/sBQwM
>
> On Sat, Apr 22, 2017 at 2:08 AM, Casey Stella <[email protected]> wrote:
>
>> Ok, yeah, those latencies are pretty high.  I think what's happening is
>> that the tuples aren't being acked fast enough and are timing out.  How
>> taxed is your ES box?  Can you drop the batch size down to maybe 100 and
>> see what happens?
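
One quick way to gauge how taxed Elasticsearch is: look at its bulk thread
pool for queued and rejected requests. A rough sketch in Java follows; it
assumes an ES node reachable at localhost:9200 (adjust host/port for your
cluster), and the exact column layout of the _cat output differs between ES
versions:

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class EsBulkPressure {
        public static void main(String[] args) throws Exception {
            // Assumption: ES is reachable on localhost:9200; change as needed.
            // A growing "queue" or a non-zero "rejected" count on the bulk
            // thread pool usually means the cluster cannot keep up with the
            // bulk writes coming from the indexing topology.
            URL url = new URL("http://localhost:9200/_cat/thread_pool?v");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream()))) {
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line);
                }
            }
        }
    }
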
>>
>> On Fri, Apr 21, 2017 at 12:05 PM, Ali Nazemian <[email protected]>
>> wrote:
>>
>>> Please find the bolt section of the Storm UI for the indexing topology:
>>>
>>> http://imgur.com/a/tFkmO
>>>
>>> As you can see, an HDFS error has also appeared, which is not important
>>> right now.
>>>
>>> On Sat, Apr 22, 2017 at 1:59 AM, Casey Stella <[email protected]>
>>> wrote:
>>>
>>>> What's curious is that the enrichment topology is showing the same
>>>> issues, but my mind went to ES as well.
>>>>
>>>> On Fri, Apr 21, 2017 at 11:57 AM, Ryan Merriman <[email protected]>
>>>> wrote:
>>>>
>>>>> Yes, which bolt is reporting all those failures?  My theory is that
>>>>> there is some ES tuning that needs to be done.
>>>>>
>>>>> On Fri, Apr 21, 2017 at 10:53 AM, Casey Stella <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Could I see a little more of that screen?  Specifically what the
>>>>>> bolts look like.
>>>>>>
>>>>>> On Fri, Apr 21, 2017 at 11:51 AM, Ali Nazemian <[email protected]
>>>>>> > wrote:
>>>>>>
>>>>>>> Please find the Storm UI screenshot below:
>>>>>>>
>>>>>>> http://imgur.com/FhIrGFd
>>>>>>>
>>>>>>>
>>>>>>> On Sat, Apr 22, 2017 at 1:41 AM, Ali Nazemian <[email protected]
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> Hi Casey,
>>>>>>>>
>>>>>>>> - topology.message.timeout: It was 30s at first. I have increased
>>>>>>>> it to 300s, no changes!
>>>>>>>> - It is a very basic geo-enrichment and a simple rule for threat
>>>>>>>> triage!
>>>>>>>> - No, not at all.
>>>>>>>> - I have changed that to find the best value. It is 5000, which is
>>>>>>>> about 5 MB.
>>>>>>>> - I have changed the number of executors for the Storm acker
>>>>>>>> thread, and I have also changed the value of
>>>>>>>> topology.max.spout.pending, still no changes!
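
For reference, the knobs above map onto Storm's Java Config API roughly as in
the sketch below (org.apache.storm.Config in Storm 1.x). In Metron these
settings normally live in the topology properties and flux files rather than
in code, and the values here are placeholders, not recommendations:

    import org.apache.storm.Config;

    public class TuningSketch {
        public static void main(String[] args) {
            Config conf = new Config();
            // Tuples not fully acked within this window are replayed by the
            // spout and show up as "failed" in the Storm UI.
            conf.setMessageTimeoutSecs(300);   // topology.message.timeout.secs
            // Caps the number of un-acked tuples in flight per spout task; a
            // large value feeding a slow ES bulk writer is a common source of
            // tuple timeouts.
            conf.setMaxSpoutPending(500);      // topology.max.spout.pending
            // Acker executors; setting this to 0 disables tuple tracking
            // entirely (failures disappear, but so does replay).
            conf.setNumAckers(1);              // topology.acker.executors
            // Config extends HashMap, so this just prints the settings.
            System.out.println(conf);
        }
    }
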
>>>>>>>>
>>>>>>>> On Sat, Apr 22, 2017 at 1:24 AM, Casey Stella <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Also,
>>>>>>>>> * what's your setting for topology.message.timeout?
>>>>>>>>> * You said you're seeing this in indexing and enrichment; what
>>>>>>>>> enrichments do you have in place?
>>>>>>>>> * Is ES being taxed heavily?
>>>>>>>>> * What's your ES batch size for the sensor?
>>>>>>>>>
>>>>>>>>> On Fri, Apr 21, 2017 at 10:46 AM, Casey Stella <[email protected]
>>>>>>>>> > wrote:
>>>>>>>>>
>>>>>>>>>> So you're seeing failures in the Storm topology but no errors in
>>>>>>>>>> the logs.  Would you mind sending over a screenshot of the indexing
>>>>>>>>>> topology from the Storm UI?  You might not be able to paste the
>>>>>>>>>> image on the mailing list, so maybe an imgur link would be in order.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> Casey
>>>>>>>>>>
>>>>>>>>>> On Fri, Apr 21, 2017 at 10:34 AM, Ali Nazemian <
>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Ryan,
>>>>>>>>>>>
>>>>>>>>>>> No, I cannot see any error inside the indexing error topic.
>>>>>>>>>>> Also, the number of tuples emitted and transferred to the error
>>>>>>>>>>> indexing bolt is zero!
>>>>>>>>>>>
>>>>>>>>>>> On Sat, Apr 22, 2017 at 12:29 AM, Ryan Merriman <
>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Do you see any errors in the error* index in Elasticsearch?
>>>>>>>>>>>> There are several catch blocks across the different topologies
>>>>>>>>>>>> that transform errors into JSON objects and forward them on to
>>>>>>>>>>>> the indexing topology.  If you're not seeing anything in the
>>>>>>>>>>>> worker logs, it's likely the errors were captured there instead.
>>>>>>>>>>>>
>>>>>>>>>>>> Ryan
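
For a quick check of that error* index, a rough sketch along the same lines,
again assuming ES on localhost:9200 and the default error* index naming (both
may differ in your install):

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class ErrorIndexCheck {
        public static void main(String[] args) throws Exception {
            // Assumption: ES on localhost:9200 and error documents indexed
            // under indices matching error*. If no matching index exists yet,
            // the wildcard search simply returns no hits.
            URL url = new URL("http://localhost:9200/error*/_search?size=5&pretty");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(conn.getInputStream()))) {
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line);  // dump a few error documents
                }
            }
        }
    }
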
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Apr 21, 2017 at 9:19 AM, Ali Nazemian <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> No, everything is fine at the log level. Also, when I checked
>>>>>>>>>>>>> resource consumption on the workers, there were still plenty of
>>>>>>>>>>>>> resources available!
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Apr 21, 2017 at 10:04 PM, Casey Stella <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Seeing anything in the Storm logs for the workers?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Apr 21, 2017 at 07:41 Ali Nazemian <
>>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> After I tried to tune the Metron performance, I noticed that
>>>>>>>>>>>>>>> the rate of failure for the indexing/enrichment topologies is
>>>>>>>>>>>>>>> very high (about 95%). However, I can see the messages in
>>>>>>>>>>>>>>> Elasticsearch. I have tried increasing the timeout value for
>>>>>>>>>>>>>>> the acknowledgement, but it didn't fix the problem. I can set
>>>>>>>>>>>>>>> the number of acker executors to 0 to temporarily fix the
>>>>>>>>>>>>>>> problem, which is not a good idea at all. Do you have any idea
>>>>>>>>>>>>>>> what could have caused such an issue? The percentage of
>>>>>>>>>>>>>>> failures decreases when I reduce the parallelism, but even
>>>>>>>>>>>>>>> without any parallelism it is still high!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>> Ali
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> A.Nazemian
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> A.Nazemian
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> A.Nazemian
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> A.Nazemian
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>>
>>> --
>>> A.Nazemian
>>>
>>
>>
>
>
> --
> A.Nazemian
>
