Re: Storm 1.2.1 - Excessive workerbeats causing long GC and thus disconnecting zookeeper

2020-03-05 Thread Ethan Li
So you are seeing 65MB on the UI. The UI only shows assigned memory, not actual memory usage.

As I mentioned earlier, -Xmx%HEAP-MEM%m in worker.childopts is designed to be
replaced with the total memory assigned to the worker
(https://github.com/apache/storm/blob/1.x-branch/storm-core/src/jvm/org/apache/storm/daemon/supervisor/BasicContainer.java#L392).
Your config worker.childopts: "-Xmx2048m -XX:+PrintGCDetails ..." will make every
worker use 2GB, but it will not show up on the UI as 2GB because the UI doesn't
read "-Xmx2048m" from worker.childopts.
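
For illustration only (not from this thread): letting the supervisor substitute the
assigned heap instead of hard-coding -Xmx2048m would look roughly like this in
storm.yaml; the flags after the placeholder are just the ones you already use.

# %HEAP-MEM% is replaced by the supervisor with the worker's assigned memory,
# i.e. the same number the UI displays as "assigned memory"
worker.childopts: "-Xmx%HEAP-MEM%m -XX:+PrintGCDetails -Xloggc:artifacts/gc.log -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=artifacts/heapdump"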

The assigned memory on the UI is the sum of the assigned memory of all the
executors in the worker. The amount of assigned memory for a worker depends on
how it is scheduled and which executors end up in the worker.

For example, by default, every instance/executor is configured with 128MB of
memory (https://github.com/apache/storm/blob/1.x-branch/conf/defaults.yaml#L276).
If 4 executors are scheduled in one worker, then the assigned memory for that
worker is 512MB.
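
As a sketch (the key name is the one on the defaults.yaml line linked above; the
value is only an example), raising the per-executor on-heap assignment is what
makes the "assigned memory" number on the UI grow:

# default is 128.0; with 4 executors per worker this would show 4 x 512 = 2048MB assigned
topology.component.resources.onheap.memory.mb: 512.0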


Hope that helps. 


> On Feb 17, 2020, at 8:55 AM, Narasimhan Chengalvarayan 
>  wrote:
> 
> Hi Ethan Li,
> 
> 
> Sorry for the late reply. Please find the output where it is showing
> -Xmx2048m for worker heap . But in  storm ui we are seeing the worker
> allocated memory as 65MB for each worker.
> 
> java -server -Dlogging.sensitivity=S3 -Dlogfile.name=worker.log
> -Dstorm.home=/opt/storm/apache-storm-1.2.1
> -Dworkers.artifacts=/var/log/storm/workers-artifacts
> -Dstorm.id=Topology_334348-43-1580365369
> -Dworker.id=f1e3e060-0b32-4ecd-8c34-c486258264a4 -Dworker.port=6707
> -Dstorm.log.dir=/var/log/storm
> -Dlog4j.configurationFile=/opt/storm/apache-storm-1.2.1/log4j2/worker.xml
> -DLog4jContextSelector=org.apache.logging.log4j.core.selector.BasicContextSelector
> -Dstorm.local.dir=/var/log/storm/tmp -Xmx2048m -XX:+PrintGCDetails
> -Xloggc:artifacts/gc.log -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10
> -XX:GCLogFileSize=1M -XX:+HeapDumpOnOutOfMemoryError
> -XX:HeapDumpPath=artifacts/heapdump
> -Djava.library.path=/var/log/storm/tmp/supervisor/stormdist/Topology_334348-43-1580365369/resources/Linux-amd64:/var/log/storm/tmp/supervisor/stormdist/Topology_334348-43-1580365369/resources:/usr/local/lib:/opt/local/lib:/usr/lib
> -Dstorm.conf.file= -Dstorm.options=
> -Djava.io.tmpdir=/var/log/storm/tmp/workers/f1e3e060-0b32-4ecd-8c34-c486258264a4/tmp
> -cp 
> /opt/storm/apache-storm-1.2.1/lib/*:/opt/storm/apache-storm-1.2.1/extlib/*:/opt/storm/apache-storm-1.2.1/conf:/var/log/storm/tmp/supervisor/stormdist/Topology_334348-43-1580365369/stormjar.jar
> org.apache.storm.daemon.worker Topology_334348-43-1580365369
> 7fe05c2b-ebcf-491b-a8cc-2565834b5988 6707
> f1e3e060-0b32-4ecd-8c34-c486258264a4
> 
> On Tue, 4 Feb 2020 at 04:36, Ethan Li  wrote:
>> 
>> This is where the worker launch command is composed:
>> 
>> https://github.com/apache/storm/blob/1.x-branch/storm-core/src/jvm/org/apache/storm/daemon/supervisor/BasicContainer.java#L653-L671
>> 
>> Since your worker.childopts is set, and topology.worker.childopts is empty,
>> 
>> 
>> worker.childopts: "-Xmx2048m -XX:+PrintGCDetails
>> -Xloggc:artifacts/gc.log -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps
>> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10
>> -XX:GCLogFileSize=1M -XX:+HeapDumpOnOutOfMemoryError
>> -XX:HeapDumpPath=artifacts/heapdump”
>> 
>> 
>> The command to launch the worker process should have -Xmx2048m.
>> 
>> I don’t see why it would be 65MB. And what do you mean by coming as 65MB 
>> only? Is it only committed 65MB? Or is the max only 65MB?
>> 
>> Could you submit the topology and show the result of “ps -aux |grep 
>> --ignore-case worker”? This will show you the JVM parameters of the worker 
>> process.
>> 
>> 
>> (BTW, -Xmx%HEAP-MEM%m in worker.childopts is designed to be replaced 
>> https://github.com/apache/storm/blob/1.x-branch/storm-core/src/jvm/org/apache/storm/daemon/supervisor/BasicContainer.java#L392)
>> 
>> 
>> 
>> 
>> On Jan 30, 2020, at 2:12 AM, Narasimhan Chengalvarayan 
>>  wrote:
>> 
>> Hi Ethan,
>> 
>> 
>> Please find the configuration detail
>> 
>> **
>> 
>> #Licensed to the Apache Software Foundation (ASF) under one
>> # or more contributor license agreements.  See the NOTICE file
>> # distributed with this work for additional information
>> # regarding copyright ownership.  The ASF licenses this file
>> # to you under the Apache License, Version 2.0 (the
>> # "License"); you may not use this file except in compliance
>> # with the License.  You may obtain a copy of the License at
>> #
>> # http://www.apache.org/licenses/LICENSE-2.0
>> #
>> # Unless required by applicable law or agreed to in writing, software
>> # distributed 

Re: Storm 1.2.1 - Excessive workerbeats causing long GC and thus disconnecting zookeeper

2020-02-17 Thread Narasimhan Chengalvarayan
Hi Ethan Li,


Sorry for the late reply. Please find the output below, which shows
-Xmx2048m for the worker heap. But in the Storm UI we are seeing the
allocated memory as 65MB for each worker.

 java -server -Dlogging.sensitivity=S3 -Dlogfile.name=worker.log
-Dstorm.home=/opt/storm/apache-storm-1.2.1
-Dworkers.artifacts=/var/log/storm/workers-artifacts
-Dstorm.id=Topology_334348-43-1580365369
-Dworker.id=f1e3e060-0b32-4ecd-8c34-c486258264a4 -Dworker.port=6707
-Dstorm.log.dir=/var/log/storm
-Dlog4j.configurationFile=/opt/storm/apache-storm-1.2.1/log4j2/worker.xml
-DLog4jContextSelector=org.apache.logging.log4j.core.selector.BasicContextSelector
-Dstorm.local.dir=/var/log/storm/tmp -Xmx2048m -XX:+PrintGCDetails
-Xloggc:artifacts/gc.log -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10
-XX:GCLogFileSize=1M -XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=artifacts/heapdump
-Djava.library.path=/var/log/storm/tmp/supervisor/stormdist/Topology_334348-43-1580365369/resources/Linux-amd64:/var/log/storm/tmp/supervisor/stormdist/Topology_334348-43-1580365369/resources:/usr/local/lib:/opt/local/lib:/usr/lib
-Dstorm.conf.file= -Dstorm.options=
-Djava.io.tmpdir=/var/log/storm/tmp/workers/f1e3e060-0b32-4ecd-8c34-c486258264a4/tmp
-cp 
/opt/storm/apache-storm-1.2.1/lib/*:/opt/storm/apache-storm-1.2.1/extlib/*:/opt/storm/apache-storm-1.2.1/conf:/var/log/storm/tmp/supervisor/stormdist/Topology_334348-43-1580365369/stormjar.jar
org.apache.storm.daemon.worker Topology_334348-43-1580365369
7fe05c2b-ebcf-491b-a8cc-2565834b5988 6707
f1e3e060-0b32-4ecd-8c34-c486258264a4

On Tue, 4 Feb 2020 at 04:36, Ethan Li  wrote:
>
> This is where the worker launch command is composed:
>
> https://github.com/apache/storm/blob/1.x-branch/storm-core/src/jvm/org/apache/storm/daemon/supervisor/BasicContainer.java#L653-L671
>
> Since your worker.childopts is set, and topology.worker.childopts is empty,
>
>
> worker.childopts: "-Xmx2048m -XX:+PrintGCDetails
> -Xloggc:artifacts/gc.log -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10
> -XX:GCLogFileSize=1M -XX:+HeapDumpOnOutOfMemoryError
> -XX:HeapDumpPath=artifacts/heapdump”
>
>
> The command to launch the worker process should have -Xmx2048m.
>
> I don’t see why it would be 65MB. And what do you mean by coming as 65MB 
> only? Is it only committed 65MB? Or is the max only 65MB?
>
> Could you submit the topology and show the result of “ps -aux |grep 
> --ignore-case worker”? This will show you the JVM parameters of the worker 
> process.
>
>
> (BTW, -Xmx%HEAP-MEM%m in worker.childopts is designed to be replaced 
> https://github.com/apache/storm/blob/1.x-branch/storm-core/src/jvm/org/apache/storm/daemon/supervisor/BasicContainer.java#L392)
>
>
>
>
> On Jan 30, 2020, at 2:12 AM, Narasimhan Chengalvarayan 
>  wrote:
>
> Hi Ethan,
>
>
> Please find the configuration detail
>
> **
>
> #Licensed to the Apache Software Foundation (ASF) under one
> # or more contributor license agreements.  See the NOTICE file
> # distributed with this work for additional information
> # regarding copyright ownership.  The ASF licenses this file
> # to you under the Apache License, Version 2.0 (the
> # "License"); you may not use this file except in compliance
> # with the License.  You may obtain a copy of the License at
> #
> # http://www.apache.org/licenses/LICENSE-2.0
> #
> # Unless required by applicable law or agreed to in writing, software
> # distributed under the License is distributed on an "AS IS" BASIS,
> # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> # See the License for the specific language governing permissions and
> # limitations under the License.
>
> ### These MUST be filled in for a storm configuration
> storm.zookeeper.servers:
> - "ZK1"
> - "ZK2"
> - "ZK3"
>
> nimbus.seeds: ["host1","host2"]
> ui.port : 8081
> storm.log.dir: "/var/log/storm"
> storm.local.dir: "/var/log/storm/tmp"
> supervisor.slots.ports:
> - 6700
> - 6701
> - 6702
> - 6703
> - 6704
> - 6705
> - 6706
> - 6707
> - 6708
> - 6709
> - 6710
> - 6711
> - 6712
> - 6713
> - 6714
> - 6715
> - 6716
> - 6717
> worker.heap.memory.mb: 1639
> topology.worker.max.heap.size.mb: 1639
> worker.childopts: "-Xmx2048m -XX:+PrintGCDetails
> -Xloggc:artifacts/gc.log -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10
> -XX:GCLogFileSize=1M -XX:+HeapDumpOnOutOfMemoryError
> -XX:HeapDumpPath=artifacts/heapdump"
> worker.gc.childopts: ""
>
> topology.min.replication.count: 2
> #
> #
> # # These may optionally be filled in:
> #
> ## List of custom serializations
> # topology.kryo.register:
> # - org.mycompany.MyType
> # - org.mycompany.MyType2: org.mycompany.MyType2Serializer
> #
> ## List of custom kryo decorators
> # topology.kryo.decorators:
> # - org.mycompany.MyDecorator
> #
> ## Locations of the 

Re: Storm 1.2.1 - Excessive workerbeats causing long GC and thus disconnecting zookeeper

2020-02-03 Thread Ethan Li
This is where the worker launch command is composed:

https://github.com/apache/storm/blob/1.x-branch/storm-core/src/jvm/org/apache/storm/daemon/supervisor/BasicContainer.java#L653-L671

Since your worker.childopts is set, and topology.worker.childopts is empty,


> worker.childopts: "-Xmx2048m -XX:+PrintGCDetails
> -Xloggc:artifacts/gc.log -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10
> -XX:GCLogFileSize=1M -XX:+HeapDumpOnOutOfMemoryError
> -XX:HeapDumpPath=artifacts/heapdump"

The command to launch the worker process should have -Xmx2048m.  
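
For context, both of those settings end up on that launch command. A rough sketch of
how they could be split (the per-topology flag is purely illustrative, not from this
thread):

worker.childopts: "-Xmx%HEAP-MEM%m -XX:+PrintGCDetails"    # cluster-wide defaults
topology.worker.childopts: "-XX:+UseG1GC"                  # extra flags for a single topology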

I don’t see why it would be 65MB. And what do you mean by coming as 65MB only? 
Is it only committed 65MB? Or is the max only 65MB?

Could you submit the topology and show the result of "ps -aux | grep
--ignore-case worker"? This will show you the JVM parameters of the worker
process.
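
For example, something along these lines pulls out just the heap flag from the
running worker JVMs (assuming GNU grep; the bracket trick simply keeps grep from
matching itself):

ps aux | grep -i '[o]rg.apache.storm.daemon.worker' | grep -o -- '-Xmx[^ ]*'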


(BTW, -Xmx%HEAP-MEM%m in worker.childopts is designed to be replaced:
https://github.com/apache/storm/blob/1.x-branch/storm-core/src/jvm/org/apache/storm/daemon/supervisor/BasicContainer.java#L392)




> On Jan 30, 2020, at 2:12 AM, Narasimhan Chengalvarayan 
>  wrote:
> 
> Hi Ethan,
> 
> 
> Please find the configuration detail
> 
> **
> 
> #Licensed to the Apache Software Foundation (ASF) under one
> # or more contributor license agreements.  See the NOTICE file
> # distributed with this work for additional information
> # regarding copyright ownership.  The ASF licenses this file
> # to you under the Apache License, Version 2.0 (the
> # "License"); you may not use this file except in compliance
> # with the License.  You may obtain a copy of the License at
> #
> # http://www.apache.org/licenses/LICENSE-2.0
> #
> # Unless required by applicable law or agreed to in writing, software
> # distributed under the License is distributed on an "AS IS" BASIS,
> # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
> # See the License for the specific language governing permissions and
> # limitations under the License.
> 
> ### These MUST be filled in for a storm configuration
> storm.zookeeper.servers:
> - "ZK1"
> - "ZK2"
> - "ZK3"
> 
> nimbus.seeds: ["host1","host2"]
> ui.port : 8081
> storm.log.dir: "/var/log/storm"
> storm.local.dir: "/var/log/storm/tmp"
> supervisor.slots.ports:
> - 6700
> - 6701
> - 6702
> - 6703
> - 6704
> - 6705
> - 6706
> - 6707
> - 6708
> - 6709
> - 6710
> - 6711
> - 6712
> - 6713
> - 6714
> - 6715
> - 6716
> - 6717
> worker.heap.memory.mb: 1639
> topology.worker.max.heap.size.mb: 1639
> worker.childopts: "-Xmx2048m -XX:+PrintGCDetails
> -Xloggc:artifacts/gc.log -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10
> -XX:GCLogFileSize=1M -XX:+HeapDumpOnOutOfMemoryError
> -XX:HeapDumpPath=artifacts/heapdump"
> worker.gc.childopts: ""
> 
> topology.min.replication.count: 2
> #
> #
> # # These may optionally be filled in:
> #
> ## List of custom serializations
> # topology.kryo.register:
> # - org.mycompany.MyType
> # - org.mycompany.MyType2: org.mycompany.MyType2Serializer
> #
> ## List of custom kryo decorators
> # topology.kryo.decorators:
> # - org.mycompany.MyDecorator
> #
> ## Locations of the drpc servers
> # drpc.servers:
> # - "server1"
> # - "server2"
> 
> ## Metrics Consumers
> # topology.metrics.consumer.register:
> #   - class: "org.apache.storm.metric.LoggingMetricsConsumer"
> # parallelism.hint: 1
> #   - class: "org.mycompany.MyMetricsConsumer"
> # parallelism.hint: 1
> # argument:
> #   - endpoint: "metrics-collector.mycompany.org"
> 
> *
> 
> 
> On Thu, 30 Jan 2020 at 03:07, Ethan Li  wrote:
>> 
>> I am not sure. Can you provide your configs?
>> 
>> 
>> 
>>> On Jan 28, 2020, at 6:33 PM, Narasimhan Chengalvarayan 
>>>  wrote:
>>> 
>>> Hi Team,
>>> 
>>> Do you have any idea, In storm apache 1.1.0 we have set worker size as
>>> 2 GB , Once we upgrade to 1.2.1 .It was coming as 65MB only. please
>>> help us .DO we need to follow different configuration setting for
>>> storm 1.2.1 or it is a bug.
>>> 
>>> On Mon, 27 Jan 2020 at 16:44, Narasimhan Chengalvarayan
>>>  wrote:
 
 Hi Team,
 
 In storm 1.2.1 version, worker memory is showing as 65MB. But we have
 set worker memory has 2GB.
 
 On Fri, 24 Jan 2020 at 01:25, Ethan Li  wrote:
> 
> 
> 1) What is stored in Workerbeats znode?
> 
> 
> Worker periodically sends heartbeat to zookeeper under workerbeats node.
> 
> 2) Which settings control the frequency of workerbeats 

Re: Storm 1.2.1 - Excessive workerbeats causing long GC and thus disconnecting zookeeper

2020-01-30 Thread Narasimhan Chengalvarayan
Hi Ethan,


Please find the configuration details

**

#Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements.  See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership.  The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License.  You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

### These MUST be filled in for a storm configuration
storm.zookeeper.servers:
 - "ZK1"
 - "ZK2"
 - "ZK3"

nimbus.seeds: ["host1","host2"]
ui.port : 8081
storm.log.dir: "/var/log/storm"
storm.local.dir: "/var/log/storm/tmp"
supervisor.slots.ports:
- 6700
- 6701
- 6702
- 6703
- 6704
- 6705
- 6706
- 6707
- 6708
- 6709
- 6710
- 6711
- 6712
- 6713
- 6714
- 6715
- 6716
- 6717
worker.heap.memory.mb: 1639
topology.worker.max.heap.size.mb: 1639
worker.childopts: "-Xmx2048m -XX:+PrintGCDetails
-Xloggc:artifacts/gc.log -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10
-XX:GCLogFileSize=1M -XX:+HeapDumpOnOutOfMemoryError
-XX:HeapDumpPath=artifacts/heapdump"
worker.gc.childopts: ""

topology.min.replication.count: 2
#
#
# # These may optionally be filled in:
#
## List of custom serializations
# topology.kryo.register:
# - org.mycompany.MyType
# - org.mycompany.MyType2: org.mycompany.MyType2Serializer
#
## List of custom kryo decorators
# topology.kryo.decorators:
# - org.mycompany.MyDecorator
#
## Locations of the drpc servers
# drpc.servers:
# - "server1"
# - "server2"

## Metrics Consumers
# topology.metrics.consumer.register:
#   - class: "org.apache.storm.metric.LoggingMetricsConsumer"
# parallelism.hint: 1
#   - class: "org.mycompany.MyMetricsConsumer"
# parallelism.hint: 1
# argument:
#   - endpoint: "metrics-collector.mycompany.org"

*


On Thu, 30 Jan 2020 at 03:07, Ethan Li  wrote:
>
> I am not sure. Can you provide your configs?
>
>
>
> > On Jan 28, 2020, at 6:33 PM, Narasimhan Chengalvarayan 
> >  wrote:
> >
> > Hi Team,
> >
> > Do you have any idea, In storm apache 1.1.0 we have set worker size as
> > 2 GB , Once we upgrade to 1.2.1 .It was coming as 65MB only. please
> > help us .DO we need to follow different configuration setting for
> > storm 1.2.1 or it is a bug.
> >
> > On Mon, 27 Jan 2020 at 16:44, Narasimhan Chengalvarayan
> >  wrote:
> >>
> >> Hi Team,
> >>
> >> In storm 1.2.1 version, worker memory is showing as 65MB. But we have
> >> set worker memory has 2GB.
> >>
> >> On Fri, 24 Jan 2020 at 01:25, Ethan Li  wrote:
> >>>
> >>>
> >>> 1) What is stored in Workerbeats znode?
> >>>
> >>>
> >>> Worker periodically sends heartbeat to zookeeper under workerbeats node.
> >>>
> >>> 2) Which settings control the frequency of workerbeats update
> >>>
> >>>
> >>>
> >>> https://github.com/apache/storm/blob/1.x-branch/storm-core/src/jvm/org/apache/storm/Config.java#L1534-L1539
> >>> task.heartbeat.frequency.secs Default to 3
> >>>
> >>> 3)What will be the impact if the frequency is reduced
> >>>
> >>>
> >>> Nimbus get the worker status from workerbeat znode to know if executors 
> >>> on workers are alive or not.
> >>> https://github.com/apache/storm/blob/1.x-branch/storm-core/src/jvm/org/apache/storm/Config.java#L595-L601
> >>> If heartbeat exceeds nimbus.task.timeout.secs (default to 30), nimbus 
> >>> will think the certain executor is dead and try to reschedule.
> >>>
> >>> To reduce the issue on zookeeper, a pacemaker component was introduced. 
> >>> https://github.com/apache/storm/blob/master/docs/Pacemaker.md
> >>> You might want to use it too.
> >>>
> >>> Thanks
> >>>
> >>>
> >>> On Dec 10, 2019, at 4:36 PM, Surajeet Dev  
> >>> wrote:
> >>>
> >>> We upgraded Storm version to 1.2.1 , and since then have been 
> >>> consistently observing Zookeeper session timeouts .
> >>>
> >>> On analysis , we observed that there is high frequency of updates on 
> >>> workerbeats znode with data upto size of 50KB. This causes the Garbage 
> >>> Collector to kickoff lasting more than 15 secs , resulting in Zookeper 
> >>> session timeout
> >>>
> >>> I understand , increasing the session timeout will alleviate the issue , 
> >>> but we have already done that twice
> >>>
> >>> My questions are:
> >>>
> >>> 1) What is stored in Workerbeats znode?
> >>> 2) Which settings control the frequency of workerbeats update
> >>> 3)What will be the impact if the frequency 

Re: Storm 1.2.1 - Excessive workerbeats causing long GC and thus disconnecting zookeeper

2020-01-29 Thread Ethan Li
I am not sure. Can you provide your configs? 



> On Jan 28, 2020, at 6:33 PM, Narasimhan Chengalvarayan 
>  wrote:
> 
> Hi Team,
> 
> Do you have any idea, In storm apache 1.1.0 we have set worker size as
> 2 GB , Once we upgrade to 1.2.1 .It was coming as 65MB only. please
> help us .DO we need to follow different configuration setting for
> storm 1.2.1 or it is a bug.
> 
> On Mon, 27 Jan 2020 at 16:44, Narasimhan Chengalvarayan
>  wrote:
>> 
>> Hi Team,
>> 
>> In storm 1.2.1 version, worker memory is showing as 65MB. But we have
>> set worker memory has 2GB.
>> 
>> On Fri, 24 Jan 2020 at 01:25, Ethan Li  wrote:
>>> 
>>> 
>>> 1) What is stored in Workerbeats znode?
>>> 
>>> 
>>> Worker periodically sends heartbeat to zookeeper under workerbeats node.
>>> 
>>> 2) Which settings control the frequency of workerbeats update
>>> 
>>> 
>>> 
>>> https://github.com/apache/storm/blob/1.x-branch/storm-core/src/jvm/org/apache/storm/Config.java#L1534-L1539
>>> task.heartbeat.frequency.secs Default to 3
>>> 
>>> 3)What will be the impact if the frequency is reduced
>>> 
>>> 
>>> Nimbus get the worker status from workerbeat znode to know if executors on 
>>> workers are alive or not.
>>> https://github.com/apache/storm/blob/1.x-branch/storm-core/src/jvm/org/apache/storm/Config.java#L595-L601
>>> If heartbeat exceeds nimbus.task.timeout.secs (default to 30), nimbus will 
>>> think the certain executor is dead and try to reschedule.
>>> 
>>> To reduce the issue on zookeeper, a pacemaker component was introduced. 
>>> https://github.com/apache/storm/blob/master/docs/Pacemaker.md
>>> You might want to use it too.
>>> 
>>> Thanks
>>> 
>>> 
>>> On Dec 10, 2019, at 4:36 PM, Surajeet Dev  wrote:
>>> 
>>> We upgraded Storm version to 1.2.1 , and since then have been consistently 
>>> observing Zookeeper session timeouts .
>>> 
>>> On analysis , we observed that there is high frequency of updates on 
>>> workerbeats znode with data upto size of 50KB. This causes the Garbage 
>>> Collector to kickoff lasting more than 15 secs , resulting in Zookeper 
>>> session timeout
>>> 
>>> I understand , increasing the session timeout will alleviate the issue , 
>>> but we have already done that twice
>>> 
>>> My questions are:
>>> 
>>> 1) What is stored in Workerbeats znode?
>>> 2) Which settings control the frequency of workerbeats update
>>> 3)What will be the impact if the frequency is reduced
>>> 
>>> 
>>> 
>> 
>> 
>> --
>> Thanks
>> C.Narasimhan
>> 09739123245
> 
> 
> 
> -- 
> Thanks
> C.Narasimhan
> 09739123245



Re: Storm 1.2.1 - Excessive workerbeats causing long GC and thus disconnecting zookeeper

2020-01-28 Thread Narasimhan Chengalvarayan
Hi Team,

Do you have any idea? In Apache Storm 1.1.0 we had set the worker size to
2 GB. Once we upgraded to 1.2.1, it is showing up as only 65MB. Please
help us: do we need a different configuration setting for Storm 1.2.1, or
is it a bug?

On Mon, 27 Jan 2020 at 16:44, Narasimhan Chengalvarayan
 wrote:
>
> Hi Team,
>
> In storm 1.2.1 version, worker memory is showing as 65MB. But we have
> set worker memory has 2GB.
>
> On Fri, 24 Jan 2020 at 01:25, Ethan Li  wrote:
> >
> >
> > 1) What is stored in Workerbeats znode?
> >
> >
> > Worker periodically sends heartbeat to zookeeper under workerbeats node.
> >
> > 2) Which settings control the frequency of workerbeats update
> >
> >
> >
> > https://github.com/apache/storm/blob/1.x-branch/storm-core/src/jvm/org/apache/storm/Config.java#L1534-L1539
> > task.heartbeat.frequency.secs Default to 3
> >
> > 3)What will be the impact if the frequency is reduced
> >
> >
> > Nimbus get the worker status from workerbeat znode to know if executors on 
> > workers are alive or not.
> > https://github.com/apache/storm/blob/1.x-branch/storm-core/src/jvm/org/apache/storm/Config.java#L595-L601
> > If heartbeat exceeds nimbus.task.timeout.secs (default to 30), nimbus will 
> > think the certain executor is dead and try to reschedule.
> >
> > To reduce the issue on zookeeper, a pacemaker component was introduced. 
> > https://github.com/apache/storm/blob/master/docs/Pacemaker.md
> > You might want to use it too.
> >
> > Thanks
> >
> >
> > On Dec 10, 2019, at 4:36 PM, Surajeet Dev  wrote:
> >
> > We upgraded Storm version to 1.2.1 , and since then have been consistently 
> > observing Zookeeper session timeouts .
> >
> > On analysis , we observed that there is high frequency of updates on 
> > workerbeats znode with data upto size of 50KB. This causes the Garbage 
> > Collector to kickoff lasting more than 15 secs , resulting in Zookeper 
> > session timeout
> >
> > I understand , increasing the session timeout will alleviate the issue , 
> > but we have already done that twice
> >
> > My questions are:
> >
> > 1) What is stored in Workerbeats znode?
> > 2) Which settings control the frequency of workerbeats update
> > 3)What will be the impact if the frequency is reduced
> >
> >
> >
>
>
> --
> Thanks
> C.Narasimhan
> 09739123245



-- 
Thanks
C.Narasimhan
09739123245


Re: Storm 1.2.1 - Excessive workerbeats causing long GC and thus disconnecting zookeeper

2020-01-27 Thread Narasimhan Chengalvarayan
Hi Team,

In Storm 1.2.1, worker memory is showing as 65MB, but we have set the
worker memory to 2GB.

On Fri, 24 Jan 2020 at 01:25, Ethan Li  wrote:
>
>
> 1) What is stored in Workerbeats znode?
>
>
> Worker periodically sends heartbeat to zookeeper under workerbeats node.
>
> 2) Which settings control the frequency of workerbeats update
>
>
>
> https://github.com/apache/storm/blob/1.x-branch/storm-core/src/jvm/org/apache/storm/Config.java#L1534-L1539
> task.heartbeat.frequency.secs Default to 3
>
> 3)What will be the impact if the frequency is reduced
>
>
> Nimbus get the worker status from workerbeat znode to know if executors on 
> workers are alive or not.
> https://github.com/apache/storm/blob/1.x-branch/storm-core/src/jvm/org/apache/storm/Config.java#L595-L601
> If heartbeat exceeds nimbus.task.timeout.secs (default to 30), nimbus will 
> think the certain executor is dead and try to reschedule.
>
> To reduce the issue on zookeeper, a pacemaker component was introduced. 
> https://github.com/apache/storm/blob/master/docs/Pacemaker.md
> You might want to use it too.
>
> Thanks
>
>
> On Dec 10, 2019, at 4:36 PM, Surajeet Dev  wrote:
>
> We upgraded Storm version to 1.2.1 , and since then have been consistently 
> observing Zookeeper session timeouts .
>
> On analysis , we observed that there is high frequency of updates on 
> workerbeats znode with data upto size of 50KB. This causes the Garbage 
> Collector to kickoff lasting more than 15 secs , resulting in Zookeper 
> session timeout
>
> I understand , increasing the session timeout will alleviate the issue , but 
> we have already done that twice
>
> My questions are:
>
> 1) What is stored in Workerbeats znode?
> 2) Which settings control the frequency of workerbeats update
> 3)What will be the impact if the frequency is reduced
>
>
>


-- 
Thanks
C.Narasimhan
09739123245


Re: Storm 1.2.1 - Excessive workerbeats causing long GC and thus disconnecting zookeeper

2020-01-23 Thread Ethan Li

> 1) What is stored in Workerbeats znode?

Workers periodically send heartbeats to ZooKeeper under the workerbeats znode.

> 2) Which settings control the frequency of workerbeats update


https://github.com/apache/storm/blob/1.x-branch/storm-core/src/jvm/org/apache/storm/Config.java#L1534-L1539

task.heartbeat.frequency.secs (default: 3)
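
As a tuning sketch only (values are made up, not recommendations): slowing the
heartbeat reduces writes to the workerbeats znode, but the interval must stay well
below nimbus.task.timeout.secs or executors will be treated as dead:

task.heartbeat.frequency.secs: 10   # default 3; fewer workerbeat writes to ZooKeeper
nimbus.task.timeout.secs: 60        # default 30; keep comfortably above the heartbeat interval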

> 3)What will be the impact if the frequency is reduced

Nimbus gets the worker status from the workerbeats znode to know whether the executors
on the workers are alive or not.
https://github.com/apache/storm/blob/1.x-branch/storm-core/src/jvm/org/apache/storm/Config.java#L595-L601
If a heartbeat exceeds nimbus.task.timeout.secs (default: 30), Nimbus will consider
that executor dead and try to reschedule it.

To reduce the load on ZooKeeper, a Pacemaker component was introduced:
https://github.com/apache/storm/blob/master/docs/Pacemaker.md
You might want to use it too.
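
If you try Pacemaker, the client side is configured in storm.yaml roughly as below
(the hostname is a placeholder; please verify the exact keys against the Pacemaker
doc for your release):

pacemaker.host: "pacemaker-host.example.com"
storm.cluster.state.store: "org.apache.storm.pacemaker.pacemaker_state_factory"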

Thanks


> On Dec 10, 2019, at 4:36 PM, Surajeet Dev  wrote:
> 
> We upgraded Storm version to 1.2.1 , and since then have been consistently 
> observing Zookeeper session timeouts . 
> 
> On analysis , we observed that there is high frequency of updates on 
> workerbeats znode with data upto size of 50KB. This causes the Garbage 
> Collector to kickoff lasting more than 15 secs , resulting in Zookeper 
> session timeout
> 
> I understand , increasing the session timeout will alleviate the issue , but 
> we have already done that twice 
> 
> My questions are:
> 
> 1) What is stored in Workerbeats znode?
> 2) Which settings control the frequency of workerbeats update
> 3)What will be the impact if the frequency is reduced
> 
> 



Storm 1.2.1 - Excessive workerbeats causing long GC and thus disconnecting zookeeper

2019-12-10 Thread Surajeet Dev
We upgraded Storm to 1.2.1, and since then we have been consistently
observing ZooKeeper session timeouts.

On analysis, we observed a high frequency of updates on the workerbeats
znode, with data up to 50KB in size. This causes garbage collection to kick
off, with pauses lasting more than 15 seconds, resulting in ZooKeeper
session timeouts.

I understand that increasing the session timeout will alleviate the issue,
but we have already done that twice.

My questions are:

1) What is stored in the Workerbeats znode?
2) Which settings control the frequency of workerbeats updates?
3) What will be the impact if the frequency is reduced?