Hello, Andy.

On Mon, Jun 24, 2013 at 11:49:05AM -0700, Andy Lutomirski wrote:
> > I have an idea where it should be headed in the long term but am not
> > sure about short-term solution. Given that the only sort wide-spread
> > use case is virt kthreads, maybe it just needs to be special cased for
> > now. Not sure.
>
> I'll be okay (I think) if I can reliably set affinities of these
> threads. I'm currently doing it with cgroups.
>
> That being said, I don't like the direction that kernel thread magic
> affinity is going. It may be great for cache performance and reducing
> random bounding, but I have a scheduling-jitter-sensitive workload and
> I don't care about overall system throughput. I need the kernel to
> stay the f!&k off my important cpus, and arranging for this to happen
> is becoming increasingly complicated.

Why is it becoming increasingly complicated? The biggest change
probably was the shared workqueue pool implementation, but that was
years ago, and workqueue has recently grown pool attributes, which add
more properly designed flexibility. For example, adding default
affinity for !per-cpu workqueues should be pretty easy now. But
anyways, if it's an issue, it should be examined and properly solved
rather than hacking up a hacky solution with cgroup.

> cgroups are most certainly something that a binary can be aware of.
> It's not like a sysctl knob at all -- it's per process. I have lots

No, it definitely is not. Sure, it is more granular than sysctl, but
that's it. It exposes control knobs which are directly tied into
kernel implementation details. It is not a properly designed
programming API by any stretch of the imagination.

It is an extreme failure on the kernel side that that part hasn't been
made crystal clear from the beginning. I don't know how intentional
it was, but the whole thing is completely botched. cgroup *never* was
held to the standard necessary for any widely available API, and many
of the controls it exposes are exactly at the level of sysctls. As
the interface was a filesystem, it could evade scrutiny, and the
hierarchical organization also gave the impression that it's something
which can be used directly by individual applications. It found a
loophole in the way we implement and police kernel APIs and then
exploited it like there's no tomorrow.

We are firmly bound to maintain what already has been exposed from the
kernel side and I'm not gonna break any of them, but the free-for-all
cgroup is broken and deprecated. It's gonna wither and fade away, and
any attempt to reverse that will be met with extreme prejudice.

> of binaries that have worked quite well for a couple years that move
> themselves into different cgroups. I have no problem with a unified
> hierarchy, but I need control of my little piece of the hierarchy.
>
> I don't care if the interface to do so changes, but the basic
> functionality is important.

Whether you care or not is completely irrelevant. Individual binaries
widely incorporating cgroup details automatically binds the kernel.
It becomes excruciatingly painful to back out after a certain point. I
don't think we're there yet given the overall immaturity and
brokenness of cgroups, and it's imperative that we back the hell out as
fast as possible before this insanity spreads any wider.

Thanks.

-- 
tejun
_______________________________________________
systemd-devel mailing list
systemd-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/systemd-devel