I don't think posting the whole code here would add much to the question. The process is very simple:

1. start 10 workers with child_process.fork;
2. read 400 random urls from the db;
3. pass each url to a random worker;
4. count the number of responses per second.
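For what it's worth, here is a stripped-down sketch of that setup (the db read is replaced by a hard-coded array, and the file names master.js / worker.js are just for illustration; the real code has the same shape):

// master.js -- fork 10 workers, hand each url to a random one,
// and print responses per second when everything has come back.
var fork = require('child_process').fork;

var urls = [ /* ...400 urls, loaded from the db in the real code... */ ];
var NUM_WORKERS = 10;
var workers = [];
var done = 0;
var start = Date.now();

for (var i = 0; i < NUM_WORKERS; i++) {
  var w = fork(__dirname + '/worker.js');
  w.on('message', function () {
    done++;
    if (done === urls.length) {
      var secs = (Date.now() - start) / 1000;
      console.log((done / secs).toFixed(1) + ' responses/sec');
      workers.forEach(function (w) { w.kill(); });
    }
  });
  workers.push(w);
}

// each url goes to a randomly chosen worker
urls.forEach(function (url) {
  workers[Math.floor(Math.random() * workers.length)].send({ url: url });
});

and the worker:

// worker.js -- fetch each url it receives and report back to the master.
var http = require('http');

process.on('message', function (msg) {
  http.get(msg.url, function (res) {
    res.on('data', function () {});          // drain the body
    res.on('end', function () {
      process.send({ url: msg.url, status: res.statusCode });
    });
  }).on('error', function () {
    process.send({ url: msg.url, error: true });
  });
});

Since each url goes to a worker picked at random, the load ends up spread roughly evenly across the 10 processes.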
I've increased kern.maxfiles, kern.maxfilesperproc, kern.ipc.somaxconn, and kern.ipc.maxsockets in sysctl.conf and rebooted. No effect. (The entries I used are sketched after the quoted message below.)

On Jun 1, 5:23 pm, Ben Noordhuis <[email protected]> wrote:
> On Fri, Jun 1, 2012 at 2:00 PM, Mick <[email protected]> wrote:
> > I have to scrape thousands of different websites, as fast as possible.
> > On a single node process I was able to fetch 10 urls per second.
> > Though if I fork the task to 10 worker processes, I can reach 64 reqs/sec.
> >
> > Why is that so?
> > Why am I limited to 10 reqs/sec in a single process and have to spawn
> > workers to reach 64 reqs/sec?
> >
> > - I am not reaching the max sockets per host (agent.maxSockets) limit:
> >   all urls are from unique hosts.
> > - I am not reaching the max file descriptor limit (AFAIK): my ulimit -n
> >   is 2560, and lsof shows that my scraper never uses more than 20 file
> >   descriptors.
> >
> > Is there any limit I don't know about? I am on Mac OS X.
>
> Can you post or link to your code?
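For reference, the sysctl.conf entries look roughly like this (the exact values here are illustrative, just meant to raise the defaults well above anything the scraper should hit):

# /etc/sysctl.conf -- illustrative values, not the literal ones I used
kern.maxfiles=65536
kern.maxfilesperproc=65536
kern.ipc.somaxconn=1024
kern.ipc.maxsockets=65536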
