----- Original Message -----
> Hi gentlemen,

Hi Pavel,

> I'm trying to test the performance of ATS v.4.0.2.
>
> Server under test has quad-core CPU with HT disabled. During test (1k

Given the range of platforms we support it's /always/ good to explicitly
state which platform (OS, version, kernel) you're running on, and also
exactly how you compiled it.

> user-agents, 1k origin servers, up to 6k requests per second with
> average size of 8kb) at mark of 2-2.5k requests per second I see the

Given the range of configurations we support it's always good to explicitly
state whether this is a forward, reverse or transparent proxy. (You only
mention later that caching is fully disabled.)

> signs of overloading (growing delay time, missed responses). The problem
> is that according to top output, CPU cycles are not under heavy load
> (which is strange for overloaded system). All the other parameters (ram,
> I/O, network) are far from saturation too. Top shows load at about
> 50-60% of one core for [ET_NET 0] process. traffic_server instances seem
> to be spread between all the cores, even if I'm trying to bind them
> mandatory to one or the two of the cores using taskset.

(At this point I can now guess with certainty that you're talking about
Linux, but I still don't know which distro/version, etc.)

> My alterations to default ats configuration (mostly following this
> guide: http://www.ogre.com/node/392):
>
> Cache is fully disabled:
> CONFIG proxy.config.http.cache.http INT 0
> Threads:
> CONFIG proxy.config.exec_thread.autoconfig INT 0
> CONFIG proxy.config.exec_thread.autoconfig.scale FLOAT 1
> CONFIG proxy.config.exec_thread.limit INT 4
> CONFIG proxy.config.accept_threads INT 2
> CONFIG proxy.config.cache.threads_per_disk INT 1
> CONFIG proxy.config.task_threads INT 4
>
> So my questions are the next:
> 1) Is there any known strategy to distribute ATS processes/threads by
> CPU cores? E.g. All the traffic_server threads bind to cpu0 and cpu1,
> all traffic_manager threads to cpu2 and networking interrupts to cpu3?
> 2) If so, how can this be done? I see some threads ignore 'taskset -a -p
> 1,2 <traffic_server pid>' and are being executed on any CPU core. Maybe
> configuration directives?
> 3) What is the better strategy for core configuration? Should sum of
> task, accept and network threads be equal to CPU cores number + 1? Or
> anything else? Maybe it's better to use 40 threads in sum for quad-core
> device?
> 4) Are the *thread* config options taken into account if
> proxy.config.http.cache.http is set to '1'?
> 5) What other options should have influence on system performance in
> case of cache-off test?
>
> TIA,
> Pavel
>
>
>

-- 
Igor Galić

Tel: +43 (0) 664 886 22 883
Mail: [email protected]
URL: http://brainsware.org/
GPG: 6880 4155 74BD FD7C B515 2EA5 4B1D 9E08 A097 C9AE
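
A note on the taskset command quoted in question 2: without -c (--cpu-list), taskset reads its argument as a hexadecimal bitmask rather than a list of CPU numbers, so '1,2' may not select CPUs 1 and 2 as intended. A CPU list needs 'taskset -a -c -p 1,2 <pid>', or the equivalent mask 0x6. The sketch below is only an illustration of the mask arithmetic and of per-thread pinning on Linux; the helper names cpu_list_to_mask and pin_all_threads are made up for this example and are not part of ATS. It also cannot say whether ATS 4.0.2 sets affinity on its own threads, which would override anything applied from outside.

```python
import os


def cpu_list_to_mask(cpus):
    """Convert CPU indices (e.g. [1, 2]) to the bitmask form taskset
    expects when -c is not given (e.g. 0x6)."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


def pin_all_threads(pid, cpus):
    """Set the CPU affinity of every thread that currently exists in
    process `pid` to the set `cpus`. Linux only (reads /proc).

    Hypothetical helper for illustration. Threads created later inherit
    the affinity of the thread that creates them, and a program that
    sets affinity itself can still override what is applied here.
    Returns the sorted list of thread IDs that were pinned.
    """
    pinned = []
    for name in os.listdir(f"/proc/{pid}/task"):
        tid = int(name)
        try:
            # On Linux, sched_setaffinity accepts a thread ID here.
            os.sched_setaffinity(tid, cpus)
        except ProcessLookupError:
            continue  # thread exited between listing and pinning
        pinned.append(tid)
    return sorted(pinned)
```

For example, cpu_list_to_mask([1, 2]) gives 6, i.e. the mask 0x6, and pin_all_threads(<traffic_server pid>, {1, 2}) (run with sufficient privileges) applies that set to each existing thread. Checking the result per thread, e.g. with 'taskset -a -c -p <pid>', shows whether any thread has been moved back afterwards.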
