Igor, *maybe* you have the supervisor in a cgroup that is limiting the CPU
for the process? (I don't know anything about the HDP-packaged Storm.)
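
One rough way to check this, as a sketch only: it assumes cgroup v1 with the cpu controller mounted at /sys/fs/cgroup/cpu, which may not match your hosts. The helper names (cpu_cgroup_path, describe_quota, check_pid) are made up for illustration and are not part of Storm or HDP.

```python
from pathlib import Path

# Assumed cgroup v1 mount point for the cpu controller; adjust for your system.
CPU_CGROUP_ROOT = Path("/sys/fs/cgroup/cpu")


def cpu_cgroup_path(proc_cgroup_text):
    """Return the cpu controller's cgroup path from /proc/<pid>/cgroup text.

    Each cgroup v1 line looks like "4:cpu,cpuacct:/some/group".
    Returns None if no cpu controller line is present.
    """
    for line in proc_cgroup_text.splitlines():
        parts = line.split(":", 2)
        if len(parts) == 3 and "cpu" in parts[1].split(","):
            return parts[2]
    return None


def describe_quota(quota_us, period_us):
    """Turn cpu.cfs_quota_us / cpu.cfs_period_us into a readable summary."""
    if quota_us < 0:  # -1 means no CFS bandwidth limit is set
        return "no CPU quota"
    return "limited to %.2f CPUs" % (quota_us / period_us)


def check_pid(pid):
    """Report the CFS quota of the cgroup that the given process runs in."""
    text = Path("/proc/%d/cgroup" % pid).read_text()
    rel = cpu_cgroup_path(text)
    if rel is None:
        return "no cpu controller found (cgroup v2, or controller not mounted?)"
    group = CPU_CGROUP_ROOT / rel.lstrip("/")
    quota = int((group / "cpu.cfs_quota_us").read_text())
    period = int((group / "cpu.cfs_period_us").read_text())
    return "%s: %s" % (group, describe_quota(quota, period))
```

Calling check_pid() with the supervisor's PID on each node should show whether a hard quota applies. This only looks at the CFS quota; a low cpu.shares value in the same directory could also starve the process when the host is busy.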

- Erik

On Sun, Mar 27, 2016 at 11:30 AM, Andrey Dudin <[email protected]> wrote:

> Hi Igor.
> Try taking a thread dump and looking through it. I think you can find the
> problem in the dump.
>
> Also look at the network, CPU, and RAM load. Maybe your CPU, RAM, or network
> is overloaded.
>
> 2016-03-27 21:26 GMT+03:00 Igor Kuzmenko <[email protected]>:
>
>> Hello, I'm using Hortonworks Data Platform v2.3.4 with the included Storm
>> 0.10.
>> After I deploy my topology with the "*storm jar*" command, it takes about 3
>> min to distribute the code. The network is 10Gb/s, the topology jar is about
>> 150MB, and the cluster has 2 nodes with supervisors on them, so I assumed
>> this process shouldn't take much time. Actually, it didn't with Hortonworks
>> platform 2.3.0, and I can't understand what has happened since then. Here's
>> the change log
>> <http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_HDP_RelNotes/content/patch_storm.html>
>> of HDP. I hope you can help.
>>
>> Here are the logs.
>> Nimbus receives the jar pretty fast:
>> *2016-02-25 17:10:13.040 b.s.d.nimbus [INFO] Uploading file from client
>> to
>> /opt/hadoop/storm/nimbus/inbox/stormjar-7352b097-8829-4268-a81f-29d820c0f311.jar*
>> *2016-02-25 17:10:14.915 b.s.d.nimbus [INFO] Finished uploading file from
>> client:
>> /opt/hadoop/storm/nimbus/inbox/stormjar-7352b097-8829-4268-a81f-29d820c0f311.jar*
>> 2016-02-25 17:10:14.928 b.s.d.nimbus [INFO] [req 22] Access from:
>> principal: op:submitTopology
>> 2016-02-25 17:10:14.933 b.s.d.nimbus [INFO] Received topology submission
>> for CdrPerformanceTestTopology-phoenix-bolt-4 with conf
>> {"topology.max.task.parallelism" nil, "hbase.conf" {"hbase.rootdir" "hdfs://
>> sorm-master02.msk.mts.ru:8020/apps/hbase/data"},
>> "topology.submitter.principal" "", "topology.acker.executors" nil,
>> "topology.workers" 2, "topology.message.timeout.secs" 30, "topology.debug"
>> true, "topology.max.spout.pending" 10000, "storm.zookeeper.superACL" nil,
>> "topology.users" (), "topology.submitter.user" "", "topology.kryo.register"
>> {"org.apache.avro.util.Utf8" nil,
>> "ru.mts.sorm.schemes.avro.cdr.CbossCdrRecord" nil},
>> "topology.kryo.decorators" (), "storm.id"
>> "CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414", "topology.name"
>> "CdrPerformanceTestTopology-phoenix-bolt-4"}
>> 2016-02-25 17:10:14.936 b.s.d.nimbus [INFO] nimbus file
>> location:/opt/hadoop/storm/nimbus/stormdist/CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414
>> 2016-02-25 17:10:15.160 b.s.c.LocalFileSystemCodeDistributor [INFO]
>> Created meta file
>> /opt/hadoop/storm/nimbus/stormdist/CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414/storm-code-distributor.meta
>> upload successful.
>> 2016-02-25 17:10:15.164 b.s.zookeeper [INFO] Node already in queue for
>> leader lock.
>> 2016-02-25 17:10:15.165 b.s.d.nimbus [INFO] desired replication count of
>> 1 not achieved but we have hit the max wait time 60 so moving on with
>> replication count = 1
>> 2016-02-25 17:10:15.169 b.s.d.nimbus [INFO] Activating
>> CdrPerformanceTestTopology-phoenix-bolt-4:
>> CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414
>> 2016-02-25 17:10:15.186 b.s.s.EvenScheduler [INFO] Available slots:
>> (["edef7b4d-2358-41c6-8a03-a7638d0f68c6" 6700]
>> ["edef7b4d-2358-41c6-8a03-a7638d0f68c6" 6701]
>> ["0a6e4de5-dd83-4307-b7e6-1db60c4c0898" 6700]
>> ["0a6e4de5-dd83-4307-b7e6-1db60c4c0898" 6701])
>> 2016-02-25 17:10:15.189 b.s.d.nimbus [INFO] Setting new assignment for
>> topology id CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414:
>> #backtype.storm.daemon.common.Assignment{:master-code-dir
>> "/opt/hadoop/storm/nimbus/stormdist/CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414",
>> :node->host {"edef7b4d-2358-41c6-8a03-a7638d0f68c6" "
>> sorm-data06.msk.mts.ru", "0a6e4de5-dd83-4307-b7e6-1db60c4c0898" "
>> sorm-data07.msk.mts.ru"}, :executor->node+port {[8 8]
>> ["0a6e4de5-dd83-4307-b7e6-1db60c4c0898" 6700], [2 2]
>> ["0a6e4de5-dd83-4307-b7e6-1db60c4c0898" 6700], [7 7]
>> ["edef7b4d-2358-41c6-8a03-a7638d0f68c6" 6700], [3 3]
>> ["edef7b4d-2358-41c6-8a03-a7638d0f68c6" 6700], [1 1]
>> ["edef7b4d-2358-41c6-8a03-a7638d0f68c6" 6700], [6 6]
>> ["0a6e4de5-dd83-4307-b7e6-1db60c4c0898" 6700], [9 9]
>> ["edef7b4d-2358-41c6-8a03-a7638d0f68c6" 6700], [11 11]
>> ["edef7b4d-2358-41c6-8a03-a7638d0f68c6" 6700], [5 5]
>> ["edef7b4d-2358-41c6-8a03-a7638d0f68c6" 6700], [10 10]
>> ["0a6e4de5-dd83-4307-b7e6-1db60c4c0898" 6700], [4 4]
>> ["0a6e4de5-dd83-4307-b7e6-1db60c4c0898" 6700]}, :executor->start-time-secs
>> {[8 8] 1456409415, [2 2] 1456409415, [7 7] 1456409415, [3 3] 1456409415, [1
>> 1] 1456409415, [6 6] 1456409415, [9 9] 1456409415, [11 11] 1456409415, [5
>> 5] 1456409415, [10 10] 1456409415, [4 4] 1456409415}}
>>
>> But the supervisor gets stuck somewhere:
>> *2016-02-25 17:10:15.198 b.s.d.supervisor [INFO] Downloading code for
>> storm id* CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414 from
>> /opt/hadoop/storm/nimbus/stormdist/CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414
>> 2016-02-25 17:10:15.199 b.s.u.StormBoundedExponentialBackoffRetry [INFO]
>> The baseSleepTimeMs [2000] the maxSleepTimeMs [60000] the maxRetries [5]
>> 2016-02-25 17:10:15.205 b.s.u.StormBoundedExponentialBackoffRetry [INFO]
>> The baseSleepTimeMs [2000] the maxSleepTimeMs [60000] the maxRetries [5]
>> 2016-02-25 17:10:17.405 b.s.c.LocalFileSystemCodeDistributor [INFO]
>> Attempting to download meta file
>> /opt/hadoop/storm/nimbus/stormdist/CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414/stormjar.jar
>> from remote sorm-master02.msk.mts.ru:6627
>> 2016-02-25 *17:10:17.406* b.s.u.StormBoundedExponentialBackoffRetry
>> [INFO] The baseSleepTimeMs [2000] the maxSleepTimeMs [60000] the maxRetries
>> [5]
>> 2016-02-25 *17:13:29.002* b.s.c.LocalFileSystemCodeDistributor [INFO]
>> Attempting to download meta file
>> /opt/hadoop/storm/nimbus/stormdist/CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414/stormconf.ser
>> from remote sorm-master02.msk.mts.ru:6627
>> 2016-02-25 17:13:29.003 b.s.u.StormBoundedExponentialBackoffRetry [INFO]
>> The baseSleepTimeMs [2000] the maxSleepTimeMs [60000] the maxRetries [5]
>> 2016-02-25 17:13:31.148 b.s.c.LocalFileSystemCodeDistributor [INFO]
>> Attempting to download meta file
>> /opt/hadoop/storm/nimbus/stormdist/CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414/stormcode.ser
>> from remote sorm-master02.msk.mts.ru:6627
>> 2016-02-25 17:13:31.148 b.s.u.StormBoundedExponentialBackoffRetry [INFO]
>> The baseSleepTimeMs [2000] the maxSleepTimeMs [60000] the maxRetries [5]
>> *2016-02-25 17:13:34.124 b.s.d.supervisor [INFO] Finished downloading
>> code for storm id*
>> CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414 from
>> /opt/hadoop/storm/nimbus/stormdist/CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414
>>
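
The stall shows up in the two highlighted timestamps of the supervisor log above. A small sketch of the arithmetic follows. It assumes the whole gap between the stormjar.jar step and the stormconf.ser step is spent fetching the jar, which the log does not actually prove, and it uses the "about 150MB" figure from the message, so the rate is only approximate.

```python
from datetime import datetime

# Timestamps copied from the supervisor log above.
JAR_START = "2016-02-25 17:10:17.406"  # retry line logged right after the stormjar.jar attempt
NEXT_FILE = "2016-02-25 17:13:29.002"  # "Attempting to download ... stormconf.ser"
JAR_SIZE_MB = 150  # "about 150MB" per the original message


def seconds_between(start, end):
    """Elapsed seconds between two log timestamps with millisecond precision."""
    fmt = "%Y-%m-%d %H:%M:%S.%f"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()


elapsed = seconds_between(JAR_START, NEXT_FILE)
rate = JAR_SIZE_MB / elapsed
print("jar step took %.1f s, roughly %.2f MB/s" % (elapsed, rate))
```

That works out to a bit over three minutes for one file, well under 1 MB/s, while Nimbus took in the same jar from the client in under two seconds. The other two files took about two and three seconds each, so whatever is slow seems specific to the large jar.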
>>
>>
>
>
> --
> Best regards, Andrey Dudin
>