Please create a Jira ticket. We will submit a pull request with a fix.

Andy Feng

Sent from my iPhone

> On Mar 4, 2014, at 6:32 PM, "李家宏" <[email protected]> wrote:
>
> Hi Andy Feng,
> there are 150 workers and 450 executors in my topology.
>
> Thanks for your reply
>
>
> 2014-03-04 23:13 GMT+08:00 Andrew Feng <[email protected]>:
>
>> How many workers do you have in your topology?
>>
>> Andy Feng
>>
>> Sent from my iPhone
>>
>>> On Mar 4, 2014, at 5:21 AM, "李家宏" <[email protected]> wrote:
>>>
>>> Hi all,
>>>
>>> When I submit a topology to a Storm 0.9.0.1 cluster, the following error occurs:
>>> ----------------------------------------------------------------------
>>> [INFO] Starting
>>> 2014-03-04 20:24:13 o.a.z.ZooKeeper [INFO] Initiating client connection, connectString=10.207.52.82:2181,10.207.52.83:2181,10.207.52.84:2181 sessionTimeout=20000 watcher=com.netflix.curator.ConnectionState@796cefa8
>>> 2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] Opening socket connection to server /10.207.52.83:2181
>>> 2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] Socket connection established to storm010207052083.cm3.tbsite.net/10.207.52.83:2181, initiating session
>>> 2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] Session establishment complete on server storm010207052083.cm3.tbsite.net/10.207.52.83:2181, sessionid = 0x2423f964207c973, negotiated timeout = 20000
>>> 2014-03-04 20:24:13 b.s.zookeeper [INFO] Zookeeper state update: :connected:none
>>> 2014-03-04 20:24:13 o.a.z.ZooKeeper [INFO] Session: 0x2423f964207c973 closed
>>> 2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] EventThread shut down
>>> 2014-03-04 20:24:13 c.n.c.f.i.CuratorFrameworkImpl [INFO] Starting
>>> 2014-03-04 20:24:13 o.a.z.ZooKeeper [INFO] Initiating client connection, connectString=10.207.52.82:2181,10.207.52.83:2181,10.207.52.84:2181/tmp/storm-0.9.0.1 sessionTimeout=20000 watcher=com.netflix.curator.ConnectionState@58f41393
>>> 2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] Opening socket connection to server /10.207.52.82:2181
>>> 2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] Socket connection established to storm010207052082.cm3.tbsite.net/10.207.52.82:2181, initiating session
>>> 2014-03-04 20:24:13 o.a.z.ClientCnxn [INFO] Session establishment complete on server storm010207052082.cm3.tbsite.net/10.207.52.82:2181, sessionid = 0x1423f964209c65f, negotiated timeout = 20000
>>> 2014-03-04 20:24:14 b.s.m.TransportFactory [INFO] Storm peer transport plugin:backtype.storm.messaging.netty.Context
>>> 2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
>>> 2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
>>> 2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
>>> 2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
>>> 2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
>>> 2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
>>> 2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
>>> 2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [2]
>>> 2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
>>> 2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
>>> 2014-03-04 20:24:14 b.s.m.n.Client [INFO] Reconnect ... [1]
>>> 2014-03-04 20:24:14 b.s.d.worker [ERROR] Error on initialization of server mk-worker
>>> org.jboss.netty.channel.ChannelException: Failed to create a selector.
>>>     at org.jboss.netty.channel.socket.nio.AbstractNioSelector.openSelector(AbstractNioSelector.java:337) ~[netty-3.6.3.Final.jar:na]
>>>     at org.jboss.netty.channel.socket.nio.AbstractNioSelector.<init>(AbstractNioSelector.java:95) ~[netty-3.6.3.Final.jar:na]
>>>     at org.jboss.netty.channel.socket.nio.AbstractNioWorker.<init>(AbstractNioWorker.java:51) ~[netty-3.6.3.Final.jar:na]
>>>     at org.jboss.netty.channel.socket.nio.NioWorker.<init>(NioWorker.java:45) ~[netty-3.6.3.Final.jar:na]
>>>     at org.jboss.netty.channel.socket.nio.NioWorkerPool.createWorker(NioWorkerPool.java:45) ~[netty-3.6.3.Final.jar:na]
>>>     at org.jboss.netty.channel.socket.nio.NioWorkerPool.createWorker(NioWorkerPool.java:28) ~[netty-3.6.3.Final.jar:na]
>>>     at org.jboss.netty.channel.socket.nio.AbstractNioWorkerPool.newWorker(AbstractNioWorkerPool.java:99) ~[netty-3.6.3.Final.jar:na]
>>>     at org.jboss.netty.channel.socket.nio.AbstractNioWorkerPool.init(AbstractNioWorkerPool.java:69) ~[netty-3.6.3.Final.jar:na]
>>>     at org.jboss.netty.channel.socket.nio.NioWorkerPool.<init>(NioWorkerPool.java:39) ~[netty-3.6.3.Final.jar:na]
>>>     at org.jboss.netty.channel.socket.nio.NioWorkerPool.<init>(NioWorkerPool.java:33) ~[netty-3.6.3.Final.jar:na]
>>>     at org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory.<init>(NioClientSocketChannelFactory.java:152) ~[netty-3.6.3.Final.jar:na]
>>>     at org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory.<init>(NioClientSocketChannelFactory.java:134) ~[netty-3.6.3.Final.jar:na]
>>>     at backtype.storm.messaging.netty.Client.<init>(Client.java:54) ~[storm-netty-0.9.0.1.jar:na]
>>>     at backtype.storm.messaging.netty.Context.connect(Context.java:36) ~[storm-netty-0.9.0.1.jar:na]
>>>     at backtype.storm.daemon.worker$mk_refresh_connections$this__5827$iter__5834__5838$fn__5839.invoke(worker.clj:250) ~[storm-core-0.9.0.1.jar:na]
>>>     at clojure.lang.LazySeq.sval(LazySeq.java:42) ~[clojure-1.4.0.jar:na]
>>>     at clojure.lang.LazySeq.seq(LazySeq.java:60) ~[clojure-1.4.0.jar:na]
>>>     at clojure.lang.Cons.next(Cons.java:39) ~[clojure-1.4.0.jar:na]
>>>     at clojure.lang.RT.next(RT.java:587) ~[clojure-1.4.0.jar:na]
>>>     at clojure.core$next.invoke(core.clj:64) ~[clojure-1.4.0.jar:na]
>>>     at clojure.core$dorun.invoke(core.clj:2726) ~[clojure-1.4.0.jar:na]
>>>     at clojure.core$doall.invoke(core.clj:2741) ~[clojure-1.4.0.jar:na]
>>>     at backtype.storm.daemon.worker$mk_refresh_connections$this__5827.invoke(worker.clj:244) ~[storm-core-0.9.0.1.jar:na]
>>>     at backtype.storm.daemon.worker$fn__5882$exec_fn__1229__auto____5883.invoke(worker.clj:357) ~[storm-core-0.9.0.1.jar:na]
>>>     at clojure.lang.AFn.applyToHelper(AFn.java:185) [clojure-1.4.0.jar:na]
>>>     at clojure.lang.AFn.applyTo(AFn.java:151) [clojure-1.4.0.jar:na]
>>>     at clojure.core$apply.invoke(core.clj:601) ~[clojure-1.4.0.jar:na]
>>>     at backtype.storm.daemon.worker$fn__5882$mk_worker__5938.doInvoke(worker.clj:329) [storm-core-0.9.0.1.jar:na]
>>>     at clojure.lang.RestFn.invoke(RestFn.java:512) [clojure-1.4.0.jar:na]
>>>     at backtype.storm.daemon.worker$_main.invoke(worker.clj:439) [storm-core-0.9.0.1.jar:na]
>>>     at clojure.lang.AFn.applyToHelper(AFn.java:172) [clojure-1.4.0.jar:na]
>>>     at clojure.lang.AFn.applyTo(AFn.java:151) [clojure-1.4.0.jar:na]
>>>     at backtype.storm.daemon.worker.main(Unknown Source) [storm-core-0.9.0.1.jar:na]
>>> Caused by: java.io.IOException: Too many open files
>>>     at sun.nio.ch.IOUtil.initPipe(Native Method) ~[na:1.6.0_38]
>>>     at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:49) ~[na:1.6.0_38]
>>>     at sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:18) ~[na:1.6.0_38]
>>>     at java.nio.channels.Selector.open(Selector.java:209) ~[na:1.6.0_38]
>>>     at org.jboss.netty.channel.socket.nio.AbstractNioSelector.openSelector(AbstractNioSelector.java:335) ~[netty-3.6.3.Final.jar:na]
>>>     ... 32 common frames omitted
>>> 2014-03-04 20:24:14 b.s.util [INFO] Halting process: ("Error on initialization")
>>> ----------------------------------------------------------------------
>>>
>>> This topology works fine on a Storm 0.8.0 cluster.
>>> Also:
>>> ulimit -n => 131072;
>>> sudo lsof | grep java | wc -l => 5000
>>> so it seems the number of open fds is not reaching the limit.
>>>
>>> What's the problem?
>>>
>>> Regards
>>>
>>> --
>>>
>>> ======================================================
>>>
>>> Gvain
>>>
>>> Email: [email protected]
>
>
>
> --
>
> ======================================================
>
> Gvain
>
> Email: [email protected]
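
A note on the two numbers quoted above: `ulimit -n` reports the limit of the shell it is run in, which may differ from the limit the worker process was actually started with, and `lsof | grep java | wc -l` counts entries across every Java process (including entries that are not file descriptors), so neither is directly comparable to a per-process limit. The sketch below is one way to check a single process on Linux by reading `/proc`. It is only an illustration, not a diagnosis of the error in this thread; the function names `parse_max_open_files` and `fd_usage` are hypothetical and not part of Storm or Netty.

```python
import os


def parse_max_open_files(limits_text):
    """Extract (soft, hard) from the text of /proc/<pid>/limits.

    A value of None stands for "unlimited".
    """
    for line in limits_text.splitlines():
        if line.startswith("Max open files"):
            # Row layout: "Max open files  <soft>  <hard>  files"
            fields = line.split()
            soft, hard = fields[3], fields[4]
            return (
                None if soft == "unlimited" else int(soft),
                None if hard == "unlimited" else int(hard),
            )
    raise ValueError("no 'Max open files' row found")


def fd_usage(pid):
    """Return (open_fd_count, soft_limit, hard_limit) for one process.

    Linux only: relies on /proc/<pid>/fd and /proc/<pid>/limits, and
    needs permission to read them (same user or root).
    """
    # Each entry in /proc/<pid>/fd is one open file descriptor. When
    # pid is the current process, the count includes the descriptor
    # used to list the directory itself.
    open_fds = len(os.listdir(f"/proc/{pid}/fd"))
    with open(f"/proc/{pid}/limits") as f:
        soft, hard = parse_max_open_files(f.read())
    return open_fds, soft, hard
```

Calling `fd_usage(<worker pid>)` on a supervisor node gives the per-worker descriptor count next to the limit that worker is actually running under, which is the comparison that matters for a "Too many open files" error.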
