Bob Beck [b...@openbsd.org] wrote:
> this is cool .. but
> 
> I would be interested in someone comparing server workloads, as
> opposed to interactive GUI response, using this.
> 
> I wouldn't be surprised that inspiration from BFS would produce
> better interactive response, my bigger concern
> would be does this adversely impact how we deal with non-interactive 
> workloads.

I've been testing it on my heavily loaded Zabbix server, which regularly
gets dozens of variables from over 5,000 devices. I get mixed results: the
load avg is higher, and the idle CPU time is generally higher too. I regularly
see 1-second (top 's 1') CPU idle of 50-70% where I regularly saw 20-50%
with the old scheduler. This is in top with all CPUs collapsed into one (top
'1'). I suspect that if I averaged it over time, the difference could be less
dramatic.
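
To put a number on the "averaged over time" idea, here is a minimal
sketch in Python. It assumes the idle percentages have already been
collected into a list by some means. The function name summarize_idle
and the sample values are made up for illustration; they are not
measurements from this box.

```python
from statistics import mean


def summarize_idle(samples):
    """Return (mean, min, max) for a list of CPU idle percentages."""
    if not samples:
        raise ValueError("need at least one sample")
    return mean(samples), min(samples), max(samples)


# Hypothetical 1-second idle readings, NOT real data from the Zabbix box.
old_scheduler = [20, 35, 50, 30, 45]
new_scheduler = [50, 70, 60, 55, 65]

for name, samples in (("old", old_scheduler), ("new", new_scheduler)):
    avg, lo, hi = summarize_idle(samples)
    print(f"{name}: mean idle {avg:.1f}% (min {lo}%, max {hi}%)")
```

Comparing means over a few minutes of samples would be less misleading
than eyeballing individual 1-second readings.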

I'm using Postgres and there is no heavily threaded stuff, so I have
no idea why the idle CPU seems to be higher. The load avg is a bit higher:
it seems to stay up around 48 longer with the new scheduler, and it also
shoots up and down more quickly. None of this is particularly well measured,
just a weird observation. With the old scheduler the load avg would fluctuate
between 32 and 42, and seemed to stay at a particular value for a longer
period of time (whatever that means).

Zabbix is a horrible pig, and to top it off my polling configuration is not
fine-tuned. The box is a "Xeon E2-1230 v3 @ 3.30GHz" with 16GB RAM and
two Samsung 845DC SSDs in softraid raid1. We use Postgres as a time-series
database. I'm looking at alternatives using collectd and graphite and
whatever; I really want to get away from Zabbix + Postgres.

This scheduler has a hack to help spinning threaded apps, which would
explain why there is less CPU usage during video playback, but I don't
know how to explain my Zabbix results. It takes at least 2 minutes after
boot before the Zabbix box stabilizes to the levels I'm describing
here.

If I set top to .1 second (top 's .1') then the new scheduler seems to
drive all CPUs to 0% idle for shorter periods of time, but more
frequently.
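
One way to check the "shorter but more frequent" impression is to
count runs of consecutive samples with no idle time. This is only a
sketch in Python: the function name zero_idle_bursts, the threshold
parameter, and the sample list are all hypothetical, and it assumes
the 0.1-second idle readings have been logged somewhere first.

```python
def zero_idle_bursts(samples, threshold=0.0):
    """Return the lengths of consecutive runs where idle <= threshold."""
    runs = []
    current = 0
    for idle in samples:
        if idle <= threshold:
            current += 1          # still inside a saturated burst
        elif current:
            runs.append(current)  # burst just ended, record its length
            current = 0
    if current:
        runs.append(current)      # trailing burst at end of the samples
    return runs


# Hypothetical 0.1-second idle readings, NOT real data.
samples = [0, 0, 40, 0, 60, 0, 0, 0]
bursts = zero_idle_bursts(samples)
print(f"{len(bursts)} bursts, lengths {bursts}")
```

More bursts with shorter lengths under the new scheduler would match
what I think I'm seeing in top.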

These are really rough observations. This box spawns lots of dirty
perl processes and also lots of fping processes for monitoring. I've
not spent the time to optimize this area at all. I was curious whether this
scheduler or other OS-level optimizations might take away some of the
costs here. With this rather stupid polling architecture, perhaps
copy-on-write is still the biggest win...

Chris
