We're having a very similar problem. Have you made any progress on this? I hate having to kill a worker and then call cluster.fork() again.
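For anyone landing on this thread, here is a minimal sketch of the kill-and-refork workaround mentioned above, in plain JavaScript rather than CoffeeScript. Each worker periodically reports process.memoryUsage() to the master, which logs it and kills any worker whose resident set size exceeds a limit; the 'exit' handler then forks a replacement. The names shouldRecycle, forkMonitored, startMaster and startWorker, the 'memory' message type, the 512 MiB limit and the 5 second interval are all assumptions made for illustration. This is not the code either poster runs, and it only contains a leak rather than diagnosing it.

```javascript
const cluster = require('cluster');

// Hypothetical limit (512 MiB resident set size); tune it for your service.
const MAX_RSS_BYTES = 512 * 1024 * 1024;
// Hypothetical reporting interval.
const REPORT_INTERVAL_MS = 5000;

// Decide whether a process.memoryUsage() sample is over the limit.
function shouldRecycle(usage, limit) {
  return usage.rss > limit;
}

// Master side: fork a worker, log its memory reports, kill it if it grows too big.
function forkMonitored() {
  const worker = cluster.fork();
  worker.on('message', (msg) => {
    if (!msg || msg.type !== 'memory') return;
    console.log(`worker ${worker.id} rss=${msg.usage.rss} heapUsed=${msg.usage.heapUsed}`);
    if (shouldRecycle(msg.usage, MAX_RSS_BYTES)) worker.kill();
  });
  return worker;
}

// Master side: start the pool and replace any worker that exits,
// whether it crashed or was killed by the check above.
function startMaster(workerCount) {
  for (let i = 0; i < workerCount; i++) forkMonitored();
  cluster.on('exit', () => forkMonitored());
}

// Worker side: send a memory sample to the master at a fixed interval.
function startWorker() {
  const timer = setInterval(() => {
    process.send({ type: 'memory', usage: process.memoryUsage() });
  }, REPORT_INTERVAL_MS);
  timer.unref(); // do not keep the process alive just for reporting
}
```

At the entry point you would call startMaster(n) when cluster.isMaster is true and startWorker() otherwise. The logged rss and heapUsed figures also give a rough per-worker view of where the memory is going, which speaks to the question in the quoted message about seeing what is happening inside the cluster.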
On Sunday, 12 August 2012 16:50:16 UTC+1, Yi wrote:
>
> Hi
>
> We have recently experienced a strange issue with node cluster. It looks
> like the built-in node cluster might have a serious memory leak when
> running a socket service. The details follow.
>
> We have a simple socket service built on node. It does some simple buffer
> handling and distribution.
> It has been running in a single node process, serving about 1k clients
> with steady performance.
>
> Since the service is becoming more and more popular, we want to utilize
> the 8-core server by clustering the node app so that it can serve more
> clients.
>
> We did that by simply adding a cluster wrapper around the existing node
> app code (shown below).
>
> But once the service runs on cluster, it shows a serious memory leak. And
> to serve the same number of clients, the server load average is 10 times
> that of a single node process!
>
> Here is a screenshot of the server stats:
> http://zk-binary.b0.upaiyun.com/temp/node-cluster-leak.png
>
> Is there any way we can check the details of what's going on in the node
> cluster?
>
>
> # cluster code:
>
>
> cluster = require "cluster"
> workers = require("os").cpus().length
>
> logger = require "./util/logger"
>
> if cluster.isMaster
>   # Fork one worker per CPU core, plus one extra
>   cluster.fork() for i in [0...(workers + 1)]
>
>   serverWorkers = {}
>   httpWorkers = {}
>
>   handleOnline = (worker) ->
>     # TODO Is it necessary to check heartbeat of worker?
>     logger.info "[production.handleOnline] worker #{worker.id} is online."
>
>     needToStartServer = Object.keys(serverWorkers).length < workers
>     logger.info "needToStartServer: #{needToStartServer}"
>     worker.send
>       code: if needToStartServer then "server" else "http"
>     (if needToStartServer then serverWorkers else httpWorkers)[worker.id] = true
>
>   handleExit = (worker, code, signal) ->
>     # Remove worker from pools
>     delete serverWorkers[worker.id]
>     delete httpWorkers[worker.id]
>
>     exitCode = worker.process.exitCode
>     logger.info "[production.handleExit] worker #{worker.id} died with code #{exitCode}."
>     # Restart worker if it's dead
>     cluster.fork()
>
>   cluster.on 'online', handleOnline
>
>   cluster.on 'exit', handleExit
>
> else
>
>   # Force express to run in production mode
>   process.env.NODE_ENV = "production"
>
>   handleMasterMessage = (msg) ->
>     switch msg.code
>       when 'server'
>         require "./server"
>         break
>       when 'http'
>         require "./http"
>         break
>       else
>         logger.info "[production.handleMasterMessage] Unknown message code: #{msg.code}"
>
>   process.on 'message', handleMasterMessage
>
>
> Regards,
>
> ty
