Hi Eyko,

The memory leak issue has been solved. It was caused by a bug in our code, which kept generating unused buffers. It was not due to node cluster. From our production usage, node cluster is solid.
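For anyone debugging something similar, here is a minimal sketch (not our actual code) of one way to see which worker is growing: each worker periodically sends its memory figures to the master, and the master logs them. The names memorySnapshot, startMemoryReporting and watchWorker, and the "memory" message code, are made up for illustration. It is written in plain JavaScript and assumes only process.memoryUsage(), process.send() and the worker "message" event.

```javascript
// Hypothetical helper: reduce process.memoryUsage() to megabytes (1 decimal).
function memorySnapshot(usage) {
  const toMb = (bytes) => Math.round((bytes / 1024 / 1024) * 10) / 10;
  return {
    rssMb: toMb(usage.rss),
    heapUsedMb: toMb(usage.heapUsed),
    heapTotalMb: toMb(usage.heapTotal),
  };
}

// Worker side: send a snapshot to the master every intervalMs milliseconds.
function startMemoryReporting(intervalMs) {
  const timer = setInterval(() => {
    process.send({
      code: "memory", // assumed message code, not part of the original app
      stats: memorySnapshot(process.memoryUsage()),
    });
  }, intervalMs);
  timer.unref(); // reporting alone should not keep the worker alive
  return timer;
}

// Master side: log every "memory" message coming from one worker.
function watchWorker(worker, log) {
  worker.on("message", (msg) => {
    if (!msg || msg.code !== "memory") return; // ignore other messages
    const s = msg.stats;
    log(
      "worker " + worker.id +
      " rss=" + s.rssMb + "MB" +
      " heapUsed=" + s.heapUsedMb + "MB" +
      " heapTotal=" + s.heapTotalMb + "MB"
    );
  });
}

// Possible wiring (left as comments so loading this file has no side effects):
//   master:  cluster.on("online", (worker) => watchWorker(worker, console.log));
//   worker:  startMemoryReporting(10000);
```

If rss keeps climbing in one worker while heapUsed stays roughly flat, that hints at memory held outside the V8 heap, which is where Buffer data usually lives. It is only a hint, not proof of where the leak is.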

Regards,
ty

2012/12/5 Eyko Sioux <[email protected]>

> We're having a very similar problem. I was wondering if you made any
> progress on this? I hate having to kill a worker and then do another
> cluster.fork().
>
> On Sunday, 12 August 2012 16:50:16 UTC+1, Yi wrote:
>>
>> Hi
>>
>> We have recently experienced a strange issue with node cluster. It looks
>> like the built-in node cluster might have a serious memory leak when
>> handling a socket service. The details follow.
>>
>> We have a simple socket service built on node. It does some simple buffer
>> handling and distribution.
>> It has been running as a single node process, serving about 1k clients
>> with steady performance.
>>
>> Since the service is becoming more and more popular, we want to utilize
>> the 8-core server by clustering the node app and so serve more clients.
>>
>> We did that by simply adding a cluster wrapper around the existing node
>> app code. The code is below.
>>
>> But once the service ran on cluster, it showed a serious memory leak. And
>> to serve the same number of clients, the server load average is 10 times
>> that of a single node process!
>>
>> Here is a screenshot of the server stats:
>> http://zk-binary.b0.upaiyun.com/temp/node-cluster-leak.png
>>
>> Is there any way we can check the details of what's going on in the node
>> cluster?
>>
>>
>> # cluster code:
>>
>> cluster = require "cluster"
>> workers = require("os").cpus().length
>>
>> logger = require "./util/logger"
>>
>> if cluster.isMaster
>>   # Fork one worker per CPU, plus one extra
>>   cluster.fork() for i in [0...(workers + 1)]
>>
>>   serverWorkers = {}
>>   httpWorkers = {}
>>
>>   handleOnline = (worker) ->
>>     # TODO Is it necessary to check heartbeat of worker?
>>     logger.info "[production.handleOnline] worker #{worker.id} is online."
>>
>>     # Fill the server pool up to `workers` first; any further worker serves HTTP
>>     needToStartServer = Object.keys(serverWorkers).length < workers
>>     logger.info "needToStartServer: #{needToStartServer}"
>>     worker.send
>>       code: if needToStartServer then "server" else "http"
>>     (if needToStartServer then serverWorkers else httpWorkers)[worker.id] = true
>>
>>   handleExit = (worker, code, signal) ->
>>     # Remove worker from pools
>>     delete serverWorkers[worker.id]
>>     delete httpWorkers[worker.id]
>>
>>     exitCode = worker.process.exitCode
>>     logger.info "[production.handleExit] worker #{worker.id} died with code #{exitCode}."
>>     # Restart worker if it's dead
>>     cluster.fork()
>>
>>   cluster.on 'online', handleOnline
>>
>>   cluster.on 'exit', handleExit
>>
>> else
>>
>>   # Force express to run in production mode
>>   process.env.NODE_ENV = "production"
>>
>>   handleMasterMessage = (msg) ->
>>     switch msg.code
>>       when 'server'
>>         require "./server"
>>       when 'http'
>>         require "./http"
>>       else
>>         # Log the unrecognised code (the original logged msg.cmd, which is never set)
>>         logger.info "[production.handleMasterMessage] Unknown message code: #{msg.code}"
>>
>>   process.on 'message', handleMasterMessage
>>
>>
>> Regards,
>>
>> ty
>>
>> --
> Job Board: http://jobs.nodejs.org/
> Posting guidelines:
> https://github.com/joyent/node/wiki/Mailing-List-Posting-Guidelines
> You received this message because you are subscribed to the Google
> Groups "nodejs" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/nodejs?hl=en?hl=en
