Hey,

So, I updated my /etc/hosts files with the hostnames and IP addresses, but I still get the same error. You are right: somebody asked the same thing yesterday and got no answer.
Cheers,
Nick

Date: Wed, 24 Jun 2015 09:42:01 -0700
Message-ID: <cal1s2snqtc-v6bh9tdmthz7tf+_u5agiac42_ozfflpgvjf...@mail.gmail.com>
Subject: AWS Hostnames not resolving properly Netty-Client-ip-10
From: Dillian Murphey <[email protected]>
To: [email protected]

I'm seeing this reported elsewhere but finding no replies. Does anyone know what the problem is here with how hostnames are resolved by Storm when running on AWS? Clearly that is not a hostname, prefixed with Netty-Client.

2015-06-24T07:13:07.856+0000 b.s.m.n.Client [INFO] connection attempt 24 to Netty-Client-ip-10-9-255-20.us-west-2.compute.internal/10.9.255.20:6711 scheduled to run in 387 ms
2015-06-24T07:13:08.244+0000 b.s.m.n.Client [ERROR] connection attempt 24 to Netty-Client-ip-10-9-255-20.us-west-2.compute.internal/10.9.255.20:6711 failed: java.lang.RuntimeException: Returned channel was actually not established

I think a work-around is to manually edit the /etc/hosts file. I had a different issue, I believe, that prevented me from fully testing that.

2015-06-25 16:14 GMT-04:00 Fan Jiang <[email protected]>:
> I remember someone posted the same issue yesterday. The problem is that your
> host is somehow resolved as "Netty-Client-*", which is not pingable. You
> may modify /etc/hosts to map the hostnames to IP addresses appropriately if
> it is allowed.
>
> --
> Sincerely,
> Fan Jiang
>
>
> On Thu, Jun 25, 2015 at 3:57 PM, Nick R. Katsipoulakis <
> [email protected]> wrote:
>
>> Hello all,
>>
>> I apologize for the long message, but I have no idea what is going wrong
>> in my setup and I tried to give a lot of info about my cluster.
>> I have the following EC2 setup:
>>
>> 1) 3x m4.xlarge nodes for a 3-node ZooKeeper ensemble and a nimbus
>>
>> 2) 4x m4.xlarge nodes for my Supervisors.
>>
>> All of the machines are running Ubuntu Linux v14, OpenJDK v1.7 and Apache
>> Storm v0.9.4. The storm.yaml I currently have on the nimbus node
>> (only) has the following values:
>>
>> storm.home: "/opt/apache-storm-0.9.4"
>> storm.local.dir: "/mnt/storm"
>> storm.zookeeper.servers:
>> - "172.31.28.73"
>> - "172.31.38.251"
>> - "172.31.38.252"
>> storm.zookeeper.port: 2181
>> storm.zookeeper.root: "/storm"
>> storm.zookeeper.session.timeout: 20000
>> storm.zookeeper.connection.timeout: 15000
>> storm.zookeeper.retry.times: 5
>> storm.zookeeper.retry.interval: 1000
>> storm.zookeeper.retry.invervalceiling.millis: 30000
>> storm.cluster.mode: "distributed"
>> storm.local.mode.zmq: false
>> storm.thrift.transport: "backtype.storm.security.auth.SimpleTransportPlugin"
>> storm.messaging.transport: "backtype.storm.messaging.netty.Context"
>>
>> nimbus.host: "127.0.0.1"
>> nimbus.thrift.port: 6627
>> nimbus.thrift.max_buffer_size: 1048576
>> nimbus.thrift.threads: 256
>> nimbus.childopts: "-Xmx256m"
>> nimbus.task.timeout.secs: 30
>> nimbus.supervisor.timeout.secs: 60
>> nimbus.monitor.freq.secs: 10
>> nimbus.cleanup.inbox.freq.secs: 600
>> nimbus.inbox.jar.expiration.secs: 3600
>> nimbus.task.launch.secs: 120
>> nimbus.reassign: true
>> nimbus.file.copy.expiration.secs: 600
>> nimbus.topology.validator: "backtype.storm.nimbus.DefaultTopologyValidator"
>>
>> ui.port: 8080
>> ui.childopts: "-Xmx768m"
>> logviewer.port: 8000
>> logviewer.childopts: "-Xmx256m"
>> logviewer.appender.name: "A1"
>>
>> drpc.port: 3772
>> drpc.worker.threads: 64
>> drpc.queue.size: 128
>> drpc.invocations.port: 3773
>> drpc.request.timeout.secs: 600
>> drpc.childopts: "-Xmx768m"
>>
>> transactional.zookeeper.root: "/transactional"
>> transactional.zookeeper.servers: null
>> transactional.zookeeper.port: null
>>
>> supervisor.slots.ports:
>> - 6700
>> - 6701
>> - 6702
>> - 6703
>> supervisor.childopts: "-Xmx256m"
>> supervisor.worker.start.timeout.secs: 120
>> supervisor.worker.timeout.secs: 30
>> supervisor.monitor.frequency.secs: 3
>> supervisor.heartbeat.frequency.secs: 5
>> supervisor.enable: true
>>
>> worker.childopts: "-Xmx4096m -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
>> -XX:+UseConcMarkSweepGC -XX:NewSize=128m
>> -XX:CMSInitiatingOccupancyFraction=70 -XX: -CMSConcurrentMTEnabled
>> -Djava.net.preferIPv4Stack=true"
>> worker.heartbeat.frequency.secs: 1
>>
>> task.heartbeat.frequency.secs: 3
>> task.refresh.poll.secs: 10
>>
>> zmq.threads: 1
>> zmq.linger.millis: 5000
>> zmq.hwm: 0
>>
>> storm.messaging.netty.server_worker_threads: 4
>> storm.messaging.netty.client_worker_threads: 4
>> storm.messaging.netty.buffer_size: 10485760
>> storm.messaging.netty.max_retries: 100
>> storm.messaging.netty.max_wait_ms: 1000
>> storm.messaging.netty.min_wait_ms: 100
>> topology.enable.message.timeouts: true
>> topology.debug: false
>> topology.optimize: true
>> topology.workers: 1
>> topology.acker.executors: null
>> topology.tasks: null
>> topology.message.timeout.secs: 30
>> topology.skip.missing.kryo.registrations: false
>> topology.max.task.parallelism: null
>> topology.max.spout.pending: null
>> topology.state.synchronization.timeout.secs: 60
>> topology.stats.sample.rate: 0.05
>> topology.builtin.metrics.bucket.size.secs: 60
>> topology.fall.back.on.java.serialization: true
>> topology.worker.childopts: null
>> topology.executor.receive.buffer.size: 1024
>> topology.executor.send.buffer.size: 1024
>> topology.receiver.buffer.size: 8
>> topology.transfer.buffer.size: 1024
>> topology.tick.tuple.freq.secs: null
>> topology.worker.shared.thread.pool.size: 4
>> topology.disruptor.wait.strategy: "com.lmax.disruptor.BlockingWaitStrategy"
>> topology.spout.wait.strategy: "backtype.storm.spout.SleepSpoutWaitStrategy"
>> topology.sleep.spout.wait.strategy.time.ms: 1
>> topology.error.throttle.interval.secs: 10
>> topology.max.error.report.per.interval: 5
>> topology.kryo.factory: "backtype.storm.serialization.DefaultKryoFactory"
>> topology.tuple.serializer: "backtype.storm.serialization.types.ListDelegateSerializer"
>> topology.trident.batch.emit.interval.millis: 500
>>
>> dev.zookeeper.path: "/tmp/dev-storm-zookeeper"
>>
>> The problem is that every time I submit a topology, I get a lot of Netty
>> messages in my worker logs (found on the supervisor machines), and
>> many of them are similar to the following:
>>
>> 2015-06-25T19:42:32.534+0000 b.s.u.StormBoundedExponentialBackoffRetry
>> [INFO] The baseSleepTimeMs [1000] the maxSleepTimeMs [30000] the maxRetries
>> [5]
>> 2015-06-25T19:42:32.625+0000 o.a.s.c.f.i.CuratorFrameworkImpl [INFO]
>> Starting
>> 2015-06-25T19:42:32.629+0000 o.a.s.z.ZooKeeper [INFO] Initiating client
>> connection, connectString=172.31.28.73:2181,172.31.38.251:2181,
>> 172.31.38.252:2181 sessionTimeout=20000
>> watcher=org.apache.storm.curator.ConnectionState@5172aa5a
>> 2015-06-25T19:42:32.649+0000 o.a.s.z.ClientCnxn [INFO] Opening socket
>> connection to server 172.31.28.73/172.31.28.73:2181.
>> Will not attempt to authenticate using SASL (unknown error)
>> 2015-06-25T19:42:32.655+0000 o.a.s.z.ClientCnxn [INFO] Socket connection
>> established to 172.31.28.73/172.31.28.73:2181, initiating session
>> 2015-06-25T19:42:32.670+0000 o.a.s.z.ClientCnxn [INFO] Session
>> establishment complete on server 172.31.28.73/172.31.28.73:2181,
>> sessionid = 0x14e2b0caa01005f, negotiated timeout = 20000
>> 2015-06-25T19:42:32.672+0000 o.a.s.c.f.s.ConnectionStateManager [INFO]
>> State change: CONNECTED
>> 2015-06-25T19:42:32.674+0000 b.s.zookeeper [INFO] Zookeeper state update:
>> :connected:none
>> 2015-06-25T19:42:32.703+0000 o.a.s.z.ClientCnxn [INFO] EventThread shut
>> down
>> 2015-06-25T19:42:32.703+0000 o.a.s.z.ZooKeeper [INFO] Session:
>> 0x14e2b0caa01005f closed
>> 2015-06-25T19:42:32.705+0000 b.s.u.StormBoundedExponentialBackoffRetry
>> [INFO] The baseSleepTimeMs [1000] the maxSleepTimeMs [30000] the maxRetries
>> [5]
>> 2015-06-25T19:42:32.706+0000 o.a.s.c.f.i.CuratorFrameworkImpl [INFO]
>> Starting
>> 2015-06-25T19:42:32.716+0000 o.a.s.z.ZooKeeper [INFO] Initiating client
>> connection, connectString=172.31.28.73:2181,172.31.38.251:2181,
>> 172.31.38.252:2181/storm sessionTimeout=20000
>> watcher=org.apache.storm.curator.ConnectionState@3f308697
>> 2015-06-25T19:42:32.727+0000 o.a.s.z.ClientCnxn [INFO] Opening socket
>> connection to server 172.31.28.73/172.31.28.73:2181. Will not attempt to
>> authenticate using SASL (unknown error)
>> 2015-06-25T19:42:32.727+0000 o.a.s.z.ClientCnxn [INFO] Socket connection
>> established to 172.31.28.73/172.31.28.73:2181, initiating session
>> 2015-06-25T19:42:32.733+0000 o.a.s.z.ClientCnxn [INFO] Session
>> establishment complete on server 172.31.28.73/172.31.28.73:2181,
>> sessionid = 0x14e2b0caa010061, negotiated timeout = 20000
>> 2015-06-25T19:42:32.733+0000 o.a.s.c.f.s.ConnectionStateManager [INFO]
>> State change: CONNECTED
>> 2015-06-25T19:42:32.774+0000 b.s.d.worker [INFO] Reading Assignments.
>> 2015-06-25T19:42:32.838+0000 b.s.m.TransportFactory [INFO] Storm peer
>> transport plugin:backtype.storm.messaging.netty.Context
>> 2015-06-25T19:42:32.971+0000 b.s.d.worker [INFO] Launching receive-thread
>> for 58e551ba-f944-4aec-9c8f-5621053021dd:6703
>> 2015-06-25T19:42:32.983+0000 b.s.m.n.Server [INFO] Create Netty Server
>> Netty-server-localhost-6703, buffer_size: 10485760, maxWorkers: 4
>> 2015-06-25T19:42:33.011+0000 b.s.m.loader [INFO] Starting receive-thread:
>> [stormId: tpch-q5-top-6-1435261345, port: 6703, thread-id: 0 ]
>> 2015-06-25T19:42:33.041+0000 b.s.m.n.Client [INFO] creating Netty Client,
>> connecting to ip-172-31-19-254.us-west-2.compute.internal:6703, bufferSize:
>> 10485760
>> 2015-06-25T19:42:33.041+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
>> maxRetries too large (100). Pinning to 29
>> 2015-06-25T19:42:33.041+0000 b.s.u.StormBoundedExponentialBackoffRetry
>> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
>> [100]
>> 2015-06-25T19:42:33.042+0000 b.s.m.n.Client [INFO] connection attempt 1
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6703 scheduled to run in 0 ms
>> 2015-06-25T19:42:33.067+0000 b.s.m.n.Client [ERROR] connection attempt 1
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6703 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.068+0000 b.s.m.n.Client [INFO] connection attempt 2
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6703 scheduled to run in 103 ms
>> 2015-06-25T19:42:33.071+0000 b.s.m.n.Client [INFO] creating Netty Client,
>> connecting to ip-172-31-13-184.us-west-2.compute.internal:6703, bufferSize:
>> 10485760
>> 2015-06-25T19:42:33.071+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
>> maxRetries too large (100).
>> Pinning to 29
>> 2015-06-25T19:42:33.071+0000 b.s.u.StormBoundedExponentialBackoffRetry
>> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
>> [100]
>> 2015-06-25T19:42:33.076+0000 b.s.m.n.Client [INFO] connection attempt 1
>> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
>> 172.31.13.184:6703 scheduled to run in 0 ms
>> 2015-06-25T19:42:33.080+0000 b.s.m.n.Client [INFO] creating Netty Client,
>> connecting to ip-172-31-19-254.us-west-2.compute.internal:6702, bufferSize:
>> 10485760
>> 2015-06-25T19:42:33.080+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
>> maxRetries too large (100). Pinning to 29
>> 2015-06-25T19:42:33.080+0000 b.s.u.StormBoundedExponentialBackoffRetry
>> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
>> [100]
>> 2015-06-25T19:42:33.080+0000 b.s.m.n.Client [INFO] connection attempt 1
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6702 scheduled to run in 0 ms
>> 2015-06-25T19:42:33.081+0000 b.s.m.n.Client [INFO] creating Netty Client,
>> connecting to ip-172-31-13-184.us-west-2.compute.internal:6702, bufferSize:
>> 10485760
>> 2015-06-25T19:42:33.082+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
>> maxRetries too large (100). Pinning to 29
>> 2015-06-25T19:42:33.082+0000 b.s.u.StormBoundedExponentialBackoffRetry
>> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
>> [100]
>> 2015-06-25T19:42:33.082+0000 b.s.m.n.Client [INFO] connection attempt 1
>> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
>> 172.31.13.184:6702 scheduled to run in 0 ms
>> 2015-06-25T19:42:33.084+0000 b.s.m.n.Client [INFO] creating Netty Client,
>> connecting to ip-172-31-19-254.us-west-2.compute.internal:6701, bufferSize:
>> 10485760
>> 2015-06-25T19:42:33.084+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
>> maxRetries too large (100).
>> Pinning to 29
>> 2015-06-25T19:42:33.084+0000 b.s.u.StormBoundedExponentialBackoffRetry
>> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
>> [100]
>> 2015-06-25T19:42:33.084+0000 b.s.m.n.Client [INFO] connection attempt 1
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6701 scheduled to run in 0 ms
>> 2015-06-25T19:42:33.162+0000 b.s.m.n.Client [ERROR] connection attempt 1
>> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
>> 172.31.13.184:6703 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.162+0000 b.s.m.n.Client [INFO] creating Netty Client,
>> connecting to ip-172-31-13-184.us-west-2.compute.internal:6701, bufferSize:
>> 10485760
>> 2015-06-25T19:42:33.162+0000 b.s.m.n.Client [INFO] connection attempt 2
>> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
>> 172.31.13.184:6703 scheduled to run in 103 ms
>> 2015-06-25T19:42:33.163+0000 o.a.s.c.r.ExponentialBackoffRetry [WARN]
>> maxRetries too large (100).
>> Pinning to 29
>>
>> and
>>
>> 2015-06-25T19:42:33.176+0000 b.s.u.StormBoundedExponentialBackoffRetry
>> [INFO] The baseSleepTimeMs [100] the maxSleepTimeMs [1000] the maxRetries
>> [100]
>> 2015-06-25T19:42:33.176+0000 b.s.m.n.Client [INFO] connection attempt 1
>> to Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
>> 172.31.19.253:6700 scheduled to run in 0 ms
>> 2015-06-25T19:42:33.178+0000 b.s.m.n.Client [ERROR] connection attempt 1
>> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
>> 172.31.13.184:6700 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.189+0000 b.s.m.n.Client [ERROR] connection attempt 2
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6703 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.190+0000 b.s.m.n.Client [INFO] connection attempt 2
>> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
>> 172.31.13.184:6700 scheduled to run in 103 ms
>> 2015-06-25T19:42:33.191+0000 b.s.m.n.Client [INFO] connection attempt 3
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6703 scheduled to run in 105 ms
>> 2015-06-25T19:42:33.195+0000 b.s.m.n.Client [ERROR] connection attempt 1
>> to Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
>> 172.31.19.253:6700 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.195+0000 b.s.m.n.Client [INFO] connection attempt 2
>> to Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
>> 172.31.19.253:6700 scheduled to run in 102 ms
>> 2015-06-25T19:42:33.196+0000 b.s.m.n.Client [ERROR] connection attempt 1
>> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
>> 172.31.19.252:6702 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.196+0000 b.s.m.n.Client [INFO] connection attempt 2
>> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
>> 172.31.19.252:6702 scheduled to run in 102 ms
>> 2015-06-25T19:42:33.197+0000 b.s.m.n.Client [ERROR] connection attempt 1
>> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
>> 172.31.19.252:6700 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.198+0000 b.s.m.n.Client [INFO] connection attempt 2
>> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
>> 172.31.19.252:6700 scheduled to run in 103 ms
>> 2015-06-25T19:42:33.198+0000 b.s.m.n.Client [ERROR] connection attempt 1
>> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
>> 172.31.19.252:6703 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.198+0000 b.s.m.n.Client [ERROR] connection attempt 1
>> to Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
>> 172.31.19.253:6702 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.205+0000 b.s.m.n.Client [INFO] connection attempt 2
>> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
>> 172.31.19.252:6703 scheduled to run in 103 ms
>> 2015-06-25T19:42:33.198+0000 b.s.m.n.Client [INFO] connection established
>> to Netty-Client-ip-172-31-19-252.us-west-2.compute.internal/
>> 172.31.19.252:6701
>> 2015-06-25T19:42:33.206+0000 b.s.m.n.Client [INFO] connection attempt 2
>> to Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
>> 172.31.19.253:6702 scheduled to run in 102 ms
>> 2015-06-25T19:42:33.205+0000 b.s.m.n.Client [INFO] connection established
>> to Netty-Client-ip-172-31-19-253.us-west-2.compute.internal/
>> 172.31.19.253:6701
>> 2015-06-25T19:42:33.268+0000 b.s.m.n.Client [ERROR] connection attempt 2
>> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
>> 172.31.13.184:6703 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.272+0000 b.s.m.n.Client [INFO] connection attempt 3
>> to Netty-Client-ip-172-31-13-184.us-west-2.compute.internal/
>> 172.31.13.184:6703 scheduled to run in 105 ms
>> 2015-06-25T19:42:33.273+0000 b.s.m.n.Client [ERROR] connection attempt 2
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6701 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.273+0000 b.s.m.n.Client [INFO] connection attempt 3
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6701 scheduled to run in 105 ms
>> 2015-06-25T19:42:33.274+0000 b.s.m.n.Client [ERROR] connection attempt 2
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6702 failed: java.lang.RuntimeException: Returned channel
>> was actually not established
>> 2015-06-25T19:42:33.274+0000 b.s.m.n.Client [INFO] connection attempt 3
>> to Netty-Client-ip-172-31-19-254.us-west-2.compute.internal/
>> 172.31.19.254:6702 scheduled to run in 106 ms
>> 2015-06-25T19:42:33.275+0000 b.s
>>
>> Why am I getting the above? Initially, I thought that the input rate of
>> tuples in my topology was too high and that Netty's buffers were filling
>> up too fast. However, I submitted a debug topology
>> that sent one tuple every second and I still got the above messages.
>>
>> Am I doing something wrong in my configuration? Why am I seeing these
>> Netty messages, which clearly show that something is going
>> wrong? Any hint about my setup would be really helpful.
>>
>> Regards,
>> Nick
>>
>
>
--
Nikolaos Romanos Katsipoulakis, University of Pittsburgh, PhD candidate
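
For anyone following this thread, one way to test the /etc/hosts suggestion is to check, from each supervisor machine, whether the worker host names in the log resolve locally and whether the worker ports accept TCP connections. The sketch below is only an illustration under stated assumptions: it uses the Python standard library, it is not part of Storm, and the helper names resolve, can_connect and check are made up for this example.

```python
import socket


def resolve(host):
    """Return the sorted IP addresses the local resolver gives for host.

    An empty list means the name did not resolve on this machine
    (neither through /etc/hosts nor through DNS).
    """
    try:
        infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string.
    return sorted({info[4][0] for info in infos})


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and resolution failures.
        return False


def check(targets):
    """Return one (host, port, addresses, reachable) tuple per (host, port)."""
    return [(host, port, resolve(host), can_connect(host, port))
            for host, port in targets]
```

A possible use is to call check() with pairs taken from the log above, for example ("ip-172-31-19-254.us-west-2.compute.internal", 6703), and compare the results across machines. A name that resolves but a port that refuses connections would point away from name resolution and toward the worker process or the security group rules, although that reading is an assumption and not something this thread confirms.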
