Hello, I'm using Hortonworks Data Platform v2.3.4 with the included Storm 0.10.
After deploying my topology with the "*storm jar*" command, it takes about 3
minutes to distribute the code. The network is 10 Gb/s, the topology jar is about 150 MB,
and the cluster has 2 nodes with supervisors on them, so I assume this process
shouldn't take much time. It didn't with Hortonworks platform
2.3.0, and I can't understand what has happened since then. Here's the change log
<http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.4/bk_HDP_RelNotes/content/patch_storm.html>
of HDP; I hope you can help.
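As a sanity check on the numbers above (a 150 MB jar over a 10 Gb/s link), the raw transfer should take a fraction of a second, so bandwidth alone can't explain a 3-minute distribution:

```python
# Back-of-envelope estimate: ideal transfer time for the topology jar.
# Figures are the ones from this post; real throughput will be lower,
# but nowhere near 3 minutes slower.
jar_size_bytes = 150 * 10**6   # ~150 MB topology jar
link_bps = 10 * 10**9          # 10 Gb/s network

ideal_seconds = jar_size_bytes * 8 / link_bps
print(ideal_seconds)  # 0.12
```

Even allowing an order of magnitude of protocol and disk overhead, the copy should finish in seconds, not minutes.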
Here are the logs.
Nimbus receives the jar pretty fast:
*2016-02-25 17:10:13.040 b.s.d.nimbus [INFO] Uploading file from client to
/opt/hadoop/storm/nimbus/inbox/stormjar-7352b097-8829-4268-a81f-29d820c0f311.jar*
*2016-02-25 17:10:14.915 b.s.d.nimbus [INFO] Finished uploading file from
client:
/opt/hadoop/storm/nimbus/inbox/stormjar-7352b097-8829-4268-a81f-29d820c0f311.jar*
2016-02-25 17:10:14.928 b.s.d.nimbus [INFO] [req 22] Access from:
principal: op:submitTopology
2016-02-25 17:10:14.933 b.s.d.nimbus [INFO] Received topology submission
for CdrPerformanceTestTopology-phoenix-bolt-4 with conf
{"topology.max.task.parallelism" nil, "hbase.conf" {"hbase.rootdir" "hdfs://
sorm-master02.msk.mts.ru:8020/apps/hbase/data"},
"topology.submitter.principal" "", "topology.acker.executors" nil,
"topology.workers" 2, "topology.message.timeout.secs" 30, "topology.debug"
true, "topology.max.spout.pending" 10000, "storm.zookeeper.superACL" nil,
"topology.users" (), "topology.submitter.user" "", "topology.kryo.register"
{"org.apache.avro.util.Utf8" nil,
"ru.mts.sorm.schemes.avro.cdr.CbossCdrRecord" nil},
"topology.kryo.decorators" (), "storm.id"
"CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414", "topology.name"
"CdrPerformanceTestTopology-phoenix-bolt-4"}
2016-02-25 17:10:14.936 b.s.d.nimbus [INFO] nimbus file
location:/opt/hadoop/storm/nimbus/stormdist/CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414
2016-02-25 17:10:15.160 b.s.c.LocalFileSystemCodeDistributor [INFO] Created
meta file
/opt/hadoop/storm/nimbus/stormdist/CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414/storm-code-distributor.meta
upload successful.
2016-02-25 17:10:15.164 b.s.zookeeper [INFO] Node already in queue for
leader lock.
2016-02-25 17:10:15.165 b.s.d.nimbus [INFO] desired replication count of 1
not achieved but we have hit the max wait time 60 so moving on with
replication count = 1
2016-02-25 17:10:15.169 b.s.d.nimbus [INFO] Activating
CdrPerformanceTestTopology-phoenix-bolt-4:
CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414
2016-02-25 17:10:15.186 b.s.s.EvenScheduler [INFO] Available slots:
(["edef7b4d-2358-41c6-8a03-a7638d0f68c6" 6700]
["edef7b4d-2358-41c6-8a03-a7638d0f68c6" 6701]
["0a6e4de5-dd83-4307-b7e6-1db60c4c0898" 6700]
["0a6e4de5-dd83-4307-b7e6-1db60c4c0898" 6701])
2016-02-25 17:10:15.189 b.s.d.nimbus [INFO] Setting new assignment for
topology id CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414:
#backtype.storm.daemon.common.Assignment{:master-code-dir
"/opt/hadoop/storm/nimbus/stormdist/CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414",
:node->host {"edef7b4d-2358-41c6-8a03-a7638d0f68c6" "sorm-data06.msk.mts.ru",
"0a6e4de5-dd83-4307-b7e6-1db60c4c0898" "sorm-data07.msk.mts.ru"},
:executor->node+port {[8 8] ["0a6e4de5-dd83-4307-b7e6-1db60c4c0898" 6700],
[2 2] ["0a6e4de5-dd83-4307-b7e6-1db60c4c0898" 6700], [7 7]
["edef7b4d-2358-41c6-8a03-a7638d0f68c6" 6700], [3 3]
["edef7b4d-2358-41c6-8a03-a7638d0f68c6" 6700], [1 1]
["edef7b4d-2358-41c6-8a03-a7638d0f68c6" 6700], [6 6]
["0a6e4de5-dd83-4307-b7e6-1db60c4c0898" 6700], [9 9]
["edef7b4d-2358-41c6-8a03-a7638d0f68c6" 6700], [11 11]
["edef7b4d-2358-41c6-8a03-a7638d0f68c6" 6700], [5 5]
["edef7b4d-2358-41c6-8a03-a7638d0f68c6" 6700], [10 10]
["0a6e4de5-dd83-4307-b7e6-1db60c4c0898" 6700], [4 4]
["0a6e4de5-dd83-4307-b7e6-1db60c4c0898" 6700]}, :executor->start-time-secs
{[8 8] 1456409415, [2 2] 1456409415, [7 7] 1456409415, [3 3] 1456409415, [1
1] 1456409415, [6 6] 1456409415, [9 9] 1456409415, [11 11] 1456409415, [5
5] 1456409415, [10 10] 1456409415, [4 4] 1456409415}}
But the supervisor gets stuck somewhere:
*2016-02-25 17:10:15.198 b.s.d.supervisor [INFO] Downloading code for storm
id* CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414 from
/opt/hadoop/storm/nimbus/stormdist/CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414
2016-02-25 17:10:15.199 b.s.u.StormBoundedExponentialBackoffRetry [INFO]
The baseSleepTimeMs [2000] the maxSleepTimeMs [60000] the maxRetries [5]
2016-02-25 17:10:15.205 b.s.u.StormBoundedExponentialBackoffRetry [INFO]
The baseSleepTimeMs [2000] the maxSleepTimeMs [60000] the maxRetries [5]
2016-02-25 17:10:17.405 b.s.c.LocalFileSystemCodeDistributor [INFO]
Attempting to download meta file
/opt/hadoop/storm/nimbus/stormdist/CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414/stormjar.jar
from remote sorm-master02.msk.mts.ru:6627
2016-02-25 *17:10:17.406* b.s.u.StormBoundedExponentialBackoffRetry [INFO]
The baseSleepTimeMs [2000] the maxSleepTimeMs [60000] the maxRetries [5]
2016-02-25 *17:13:29.002* b.s.c.LocalFileSystemCodeDistributor [INFO]
Attempting to download meta file
/opt/hadoop/storm/nimbus/stormdist/CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414/stormconf.ser
from remote sorm-master02.msk.mts.ru:6627
2016-02-25 17:13:29.003 b.s.u.StormBoundedExponentialBackoffRetry [INFO]
The baseSleepTimeMs [2000] the maxSleepTimeMs [60000] the maxRetries [5]
2016-02-25 17:13:31.148 b.s.c.LocalFileSystemCodeDistributor [INFO]
Attempting to download meta file
/opt/hadoop/storm/nimbus/stormdist/CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414/stormcode.ser
from remote sorm-master02.msk.mts.ru:6627
2016-02-25 17:13:31.148 b.s.u.StormBoundedExponentialBackoffRetry [INFO]
The baseSleepTimeMs [2000] the maxSleepTimeMs [60000] the maxRetries [5]
*2016-02-25 17:13:34.124 b.s.d.supervisor [INFO] Finished downloading code
for storm id* CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414 from
/opt/hadoop/storm/nimbus/stormdist/CdrPerformanceTestTopology-phoenix-bolt-4-2-1456409414
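The delay sits between the two highlighted timestamps: the supervisor logs its retry policy at *17:10:17.406* and the next line doesn't appear until *17:13:29.002*, i.e. while it is fetching stormjar.jar. The gap, computed from the timestamps in the log above:

```python
from datetime import datetime

# Timestamps copied from the supervisor log above.
fmt = "%Y-%m-%d %H:%M:%S.%f"
start = datetime.strptime("2016-02-25 17:10:17.406", fmt)
end = datetime.strptime("2016-02-25 17:13:29.002", fmt)

gap = (end - start).total_seconds()
print(gap)  # 191.596
```

So roughly 3.2 minutes are spent on the jar download alone, while the two .ser files right after it each take only a couple of seconds.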