Yes, I see the following messages, which I have not seen before:

2015-06-24T19:05:28.745+0000 b.s.d.supervisor [INFO]
fa3de772-cc61-4394-97e2-fcbd85190dd4 still hasn't started
2015-06-24T19:05:29.245+0000 b.s.d.supervisor [INFO]
fa3de772-cc61-4394-97e2-fcbd85190dd4 still hasn't started
2015-06-24T19:05:29.746+0000 b.s.d.supervisor [INFO]
fa3de772-cc61-4394-97e2-fcbd85190dd4 still hasn't started
2015-06-24T19:05:30.246+0000 b.s.d.supervisor [INFO]
fa3de772-cc61-4394-97e2-fcbd85190dd4 still hasn't started
2015-06-24T19:05:30.646+0000 b.s.d.supervisor [INFO] Removing code for
storm id tpch-q5-top-5-1435172243
2015-06-24T19:05:30.747+0000 b.s.d.supervisor [INFO]
fa3de772-cc61-4394-97e2-fcbd85190dd4 still hasn't started
2015-06-24T19:05:31.247+0000 b.s.d.supervisor [INFO]
fa3de772-cc61-4394-97e2-fcbd85190dd4 still hasn't started

2015-06-24T19:06:50.327+0000 b.s.d.supervisor [INFO] Worker
fa3de772-cc61-4394-97e2-fcbd85190dd4 failed to start
2015-06-24T19:06:50.329+0000 b.s.d.supervisor [INFO] Shutting down and
clearing state for id fa3de772-cc61-4394-97e2-fcbd85190dd4. Current
supervisor time: 1435172810. State: :not-started, Heartbeat: nil
2015-06-24T19:06:50.329+0000 b.s.d.supervisor [INFO] Shutting down
58e551ba-f944-4aec-9c8f-5621053021dd:fa3de772-cc61-4394-97e2-fcbd85190dd4
2015-06-24T19:06:50.330+0000 b.s.d.supervisor [INFO] Shut down
58e551ba-f944-4aec-9c8f-5621053021dd:fa3de772-cc61-4394-97e2-fcbd85190dd4
2015-06-24T19:08:39.743+0000 b.s.d.supervisor [INFO] Shutting down
supervisor 58e551ba-f944-4aec-9c8f-5621053021dd
2015-06-24T19:08:39.745+0000 b.s.event [INFO] Event manager interrupted
2015-06-24T19:08:39.745+0000 b.s.event [INFO] Event manager interrupted
2015-06-24T19:08:39.748+0000 o.a.s.z.ZooKeeper [INFO] Session:
0x24e26a304b50025 closed
2015-06-24T19:08:39.748+0000 o.a.s.z.ClientCnxn [INFO] EventThread shut down

But there is no indication of why the above is happening.
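One thing I cannot tell from the archive is whether the dash in front of Djava.net.preferIPv4Stack=true (in the childopts string quoted further down) was really missing or just eaten by line wrapping. If it was really missing, java would treat that token as the main-class name and the worker would exit immediately, which would match the "still hasn't started" loop above. A crude check, plain shell and nothing Storm-specific:

```shell
# Sanity-check a childopts string: any token WITHOUT a leading dash is
# handed to java as the main-class name, so the worker dies on launch.
# OPTS below is the string from this thread, joined as the archive shows it.
OPTS='-Xmx4096m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:NewSize=128m -XX:CMSInitiatingOccupancyFraction=70 -XX: -CMSConcurrentMTEnabled Djava.net.preferIPv4Stack=true'
for tok in $OPTS; do
  case $tok in
    -*) ;;                               # looks like a JVM flag, fine
    *)  echo "suspicious token: $tok" ;; # java would read this as the main class
  esac
done
# prints: suspicious token: Djava.net.preferIPv4Stack=true
```

Note this check still passes "-XX:" and "-CMSConcurrentMTEnabled" even though that pair also looks malformed (a stray space splitting one flag in two), so it is only a first-pass filter.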

Thanks,
Nick
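P.S. For reference, here is the supervisor-side storm.yaml entry I am aiming for, re-typed defensively: every option gets its leading dash, -XX:-CMSConcurrentMTEnabled is written without the stray space, and the duplicate -XX:+UseConcMarkSweepGC is dropped. The flag choices themselves are just the ones from this thread, not a recommendation:

```yaml
# storm.yaml on each supervisor node (sketch; values from this thread).
# A bare "Djava..." token would be read as the main class and the worker
# would die before ever heartbeating.
worker.childopts: "-Xmx4096m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
  -XX:NewSize=128m -XX:CMSInitiatingOccupancyFraction=70
  -XX:-CMSConcurrentMTEnabled -Djava.net.preferIPv4Stack=true"
```

If this still does not take effect, the worker launch line in the supervisor log should show which childopts were actually applied.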

2015-06-25 10:52 GMT-04:00 Nathan Leung <[email protected]>:

> Any problems in supervisor or nimbus logs?
>
> On Thu, Jun 25, 2015 at 10:49 AM, Nick R. Katsipoulakis <
> [email protected]> wrote:
>
>> I am using m4.xlarge instances, each supervisor configured with 4 workers.
>> Yes, they are listed.
>>
>> Nick
>>
>> 2015-06-25 10:47 GMT-04:00 Nathan Leung <[email protected]>:
>>
>>> How big are your EC2 instances? Are your supervisors listed in the
>>> Storm UI?
>>>
>>> On Thu, Jun 25, 2015 at 10:43 AM, Nick R. Katsipoulakis <
>>> [email protected]> wrote:
>>>
>>>> Nathan,
>>>>
>>>> I attempted to put the following line
>>>>
>>>> worker.childopts: "-Xmx4096m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
>>>> -XX:+UseConcMarkSweepGC -XX:NewSize=128m -XX:
>>>> CMSInitiatingOccupancyFraction=70 -XX: -CMSConcurrentMTEnabled
>>>> Djava.net.preferIPv4Stack=true"
>>>>
>>>> in the supervisor config files, but for some reason no workers were
>>>> spawned on those machines. To be more precise, I submitted my topology
>>>> (with storm jar...) and waited for it to start executing, but nothing
>>>> happened. Any ideas about what the reason might have been?
>>>>
>>>> Thanks,
>>>> Nick
>>>>
>>>> 2015-06-25 10:39 GMT-04:00 Nathan Leung <[email protected]>:
>>>>
>>>>> In general worker options need to be set in the supervisor config
>>>>> files.
>>>>>
>>>>> On Thu, Jun 25, 2015 at 10:07 AM, Nick R. Katsipoulakis <
>>>>> [email protected]> wrote:
>>>>>
>>>>>> Hello sy.pan,
>>>>>>
>>>>>> Thank you for the link. I will try the suggestions.
>>>>>>
>>>>>> Cheers,
>>>>>> Nick
>>>>>>
>>>>>> 2015-06-24 22:35 GMT-04:00 sy.pan <[email protected]>:
>>>>>>
>>>>>>> FYI:
>>>>>>>
>>>>>>>
>>>>>>> https://mail-archives.apache.org/mod_mbox/storm-user/201504.mbox/%3ccafbccrcadux8sl8d99tomrbg9hkmo3gkg-qdv-qkmc-6zxs...@mail.gmail.com%3E
>>>>>>>
>>>>>>>
>>>>>>> On June 25, 2015, at 02:14, Nick R. Katsipoulakis <[email protected]> wrote:
>>>>>>>
>>>>>>> Hello all,
>>>>>>>
>>>>>>> I am working on an EC2 Storm cluster, and I want the workers on the
>>>>>>> supervisor machines to use 4 GB of memory each, so I added the
>>>>>>> following line on the machine that hosts nimbus:
>>>>>>>
>>>>>>> worker.childopts: "-Xmx4096m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
>>>>>>> -XX:+UseConcMarkSweepGC -XX:NewSize=128m
>>>>>>> -XX:CMSInitiatingOccupancyFraction=70 -XX: -CMSConcurrentMTEnabled
>>>>>>> Djava.net.preferIPv4Stack=true"
>>>>>>> However, when I look at the workers' logs (on each of the other
>>>>>>> machines running a supervisor), I do not find the options above in the
>>>>>>> line that launches the worker. Instead, I find the following:
>>>>>>>
>>>>>>> 2015-06-24T17:52:45.349+0000 b.s.d.worker [INFO] Launching worker
>>>>>>> for tpch-q5-top-2-1435168361 on 
>>>>>>> 5568726d-ad65-4a7c-ba52-32eed83276ad:6703
>>>>>>> with id 829f36fc-eeb9-4eef-ae89-9fb6565e9108 and conf 
>>>>>>> {"dev.zookeeper.path"
>>>>>>> "/tmp/dev-storm-zookeeper", "topology.tick.tuple.freq.secs" nil,
>>>>>>> "topology.builtin.metrics.bucket.size.secs" 60,
>>>>>>> "topology.fall.back.on.java.serialization" true,
>>>>>>> "topology.max.error.report.per.interval" 5, "zmq.linger.millis" 5000,
>>>>>>> "topology.skip.missing.kryo.registrations" false,
>>>>>>> "storm.messaging.netty.client_worker_threads" 4, "ui.childopts" 
>>>>>>> "-Xmx768m",
>>>>>>> "storm.zookeeper.session.timeout" 20000, "nimbus.reassign" true,
>>>>>>> "topology.trident.batch.emit.interval.millis" 500, "
>>>>>>> storm.messaging.netty.flush.check.interval.ms" 10,
>>>>>>> "nimbus.monitor.freq.secs" 10, "logviewer.childopts" "-Xmx128m",
>>>>>>> "java.library.path" "/usr/local/lib:/opt/local/lib:/usr/lib", 
>>>>>>> "storm.home"
>>>>>>> "/opt/apache-storm-0.9.4", "topology.executor.send.buffer.size" 1024,
>>>>>>> "storm.local.dir" "/mnt/storm", "storm.messaging.netty.buffer_size"
>>>>>>> 10485760, "supervisor.worker.start.timeout.secs" 120,
>>>>>>> "topology.enable.message.timeouts" true, 
>>>>>>> "nimbus.cleanup.inbox.freq.secs"
>>>>>>> 600, "nimbus.inbox.jar.expiration.secs" 3600, "drpc.worker.threads" 64,
>>>>>>> "storm.meta.serialization.delegate"
>>>>>>> "backtype.storm.serialization.DefaultSerializationDelegate",
>>>>>>> "topology.worker.shared.thread.pool.size" 4, "nimbus.host" 
>>>>>>> "52.25.74.163",
>>>>>>> "storm.messaging.netty.min_wait_ms" 100, "storm.zookeeper.port" 2181,
>>>>>>> "transactional.zookeeper.port" nil, 
>>>>>>> "topology.executor.receive.buffer.size"
>>>>>>> 1024, "transactional.zookeeper.servers" nil, "storm.zookeeper.root"
>>>>>>> "/storm", "storm.zookeeper.retry.intervalceiling.millis" 30000,
>>>>>>> "supervisor.enable" true, "storm.messaging.netty.server_worker_threads" 
>>>>>>> 4,
>>>>>>> "storm.zookeeper.servers" ["172.31.28.73" "172.31.38.251" 
>>>>>>> "172.31.38.252"],
>>>>>>> "transactional.zookeeper.root" "/transactional", 
>>>>>>> "topology.acker.executors"
>>>>>>> nil, "topology.transfer.buffer.size" 1024, "topology.worker.childopts" 
>>>>>>> nil,
>>>>>>> "drpc.queue.size" 128, "worker.childopts" "-Xmx768m",
>>>>>>> "supervisor.heartbeat.frequency.secs" 5,
>>>>>>> "topology.error.throttle.interval.secs" 10, "zmq.hwm" 0, "drpc.port" 
>>>>>>> 3772,
>>>>>>> "supervisor.monitor.frequency.secs" 3, "drpc.childopts" "-Xmx768m",
>>>>>>> "topology.receiver.buffer.size" 8, "task.heartbeat.frequency.secs" 3,
>>>>>>> "topology.tasks" nil, "storm.messaging.netty.max_retries" 100,
>>>>>>> "topology.spout.wait.strategy"
>>>>>>> "backtype.storm.spout.SleepSpoutWaitStrategy",
>>>>>>> "nimbus.thrift.max_buffer_size" 1048576, "topology.max.spout.pending" 
>>>>>>> nil,
>>>>>>> "storm.zookeeper.retry.interval" 1000, "
>>>>>>> topology.sleep.spout.wait.strategy.time.ms" 1,
>>>>>>> "nimbus.topology.validator"
>>>>>>> "backtype.storm.nimbus.DefaultTopologyValidator", 
>>>>>>> "supervisor.slots.ports"
>>>>>>> [6700 6701 6702 6703], "topology.environment" nil, "topology.debug" 
>>>>>>> false,
>>>>>>> "nimbus.task.launch.secs" 120, "nimbus.supervisor.timeout.secs" 60,
>>>>>>> "topology.message.timeout.secs" 30, "task.refresh.poll.secs" 10,
>>>>>>> "topology.workers" 1, "supervisor.childopts" "-Xmx256m",
>>>>>>> "nimbus.thrift.port" 6627, "topology.stats.sample.rate" 0.05,
>>>>>>> "worker.heartbeat.frequency.secs" 1, "topology.tuple.serializer"
>>>>>>> "backtype.storm.serialization.types.ListDelegateSerializer",
>>>>>>> "topology.disruptor.wait.strategy"
>>>>>>> "com.lmax.disruptor.BlockingWaitStrategy", 
>>>>>>> "topology.multilang.serializer"
>>>>>>> "backtype.storm.multilang.JsonSerializer", "nimbus.task.timeout.secs" 
>>>>>>> 30,
>>>>>>> "storm.zookeeper.connection.timeout" 15000, "topology.kryo.factory"
>>>>>>> "backtype.storm.serialization.DefaultKryoFactory", 
>>>>>>> "drpc.invocations.port"
>>>>>>> 3773, "logviewer.port" 8000, "zmq.threads" 1, 
>>>>>>> "storm.zookeeper.retry.times"
>>>>>>> 5, "topology.worker.receiver.thread.count" 1, "storm.thrift.transport"
>>>>>>> "backtype.storm.security.auth.SimpleTransportPlugin",
>>>>>>> "topology.state.synchronization.timeout.secs" 60,
>>>>>>> "supervisor.worker.timeout.secs" 30, "nimbus.file.copy.expiration.secs"
>>>>>>> 600, "storm.messaging.transport" 
>>>>>>> "backtype.storm.messaging.netty.Context", "
>>>>>>> logviewer.appender.name" "A1", "storm.messaging.netty.max_wait_ms"
>>>>>>> 1000, "drpc.request.timeout.secs" 600, "storm.local.mode.zmq" false,
>>>>>>> "ui.port" 8080, "nimbus.childopts" "-Xmx1024m", "storm.cluster.mode"
>>>>>>> "distributed", "topology.max.task.parallelism" nil,
>>>>>>> "storm.messaging.netty.transfer.batch.size" 262144, "topology.classpath"
>>>>>>> nil}
>>>>>>>
>>>>>>> which, as you can see, uses topology.worker.childopts: nil and
>>>>>>> worker.childopts: -Xmx768m. My question is the following: do I need to
>>>>>>> add the line above to the storm.yaml files of my supervisor nodes in
>>>>>>> order to allow each worker JVM to use up to 4 GB of memory? Also, am I
>>>>>>> setting the right value for what I am trying to achieve?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Nick
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Nikolaos Romanos Katsipoulakis,
>>>>>> University of Pittsburgh, PhD candidate
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Nikolaos Romanos Katsipoulakis,
>>>> University of Pittsburgh, PhD candidate
>>>>
>>>
>>>
>>
>>
>> --
>> Nikolaos Romanos Katsipoulakis,
>> University of Pittsburgh, PhD candidate
>>
>
>


-- 
Nikolaos Romanos Katsipoulakis,
University of Pittsburgh, PhD candidate
