Re: [PATCH v3] cpuset: Enable cpuset controller in default hierarchy

2017-11-27 Thread Waiman Long
On 11/27/2017 04:42 PM, Tejun Heo wrote:
> Hello, Waiman.
>
> On Mon, Nov 27, 2017 at 04:19:57PM -0500, Waiman Long wrote:
>>> Let's start just with [e]cpus and [e]mems.  The flags interface looks
>>> fine but the implementations of these features are really bad and
>>> cgroup2 doesn't migrate resources for other controllers either anyway.
>> That is added because the mem_migrate feature is used in libvirt, I
>> think. I am thinking of add a "[EXPERIMENTAL]" tag to the flags to
>> indicate that it is subject to change.
> I see.  Do you happen to know what it's used for and why that's
> necessary just so that we can evaluate it better?  I'm not quite sure
> what adding [EXPERIMENTAL] tag would achieve.  If we expose the
> feature and people use it, we just have to keep it anyway.

The mem_migrate feature will probably enforce better NUMA locality as
the vCPU may move from one physical CPU to another if it is not pinned.

I want to add the experimental tag more in the sense that we are going
to add to the list of the flags in the future than removing an existing
one. Well, I guess we can just say it in the text instead of adding a tag.

Cheers,
Longman
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] cpuset: Enable cpuset controller in default hierarchy

2017-11-27 Thread Tejun Heo
Hello, Waiman.

On Mon, Nov 27, 2017 at 04:19:57PM -0500, Waiman Long wrote:
> > Let's start just with [e]cpus and [e]mems.  The flags interface looks
> > fine but the implementations of these features are really bad and
> > cgroup2 doesn't migrate resources for other controllers either anyway.
> 
> That is added because the mem_migrate feature is used in libvirt, I
> think. I am thinking of add a "[EXPERIMENTAL]" tag to the flags to
> indicate that it is subject to change.

I see.  Do you happen to know what it's used for and why that's
necessary just so that we can evaluate it better?  I'm not quite sure
what adding [EXPERIMENTAL] tag would achieve.  If we expose the
feature and people use it, we just have to keep it anyway.

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] cpuset: Enable cpuset controller in default hierarchy

2017-11-27 Thread Waiman Long
On 11/27/2017 04:04 PM, Tejun Heo wrote:
> Hello, Waiman.
>
> Sorry about the long delay.
>
> On Fri, Oct 06, 2017 at 05:10:30PM -0400, Waiman Long wrote:
>> +Cpuset Interface Files
>> +~~
>> +
>> +  cpuset.cpus
>> +A read-write multiple values file which exists on non-root
>> +cgroups.
>> +
>> +It lists the CPUs allowed to be used by tasks within this
>> +cgroup.  The CPU numbers are comma-separated numbers or
>> +ranges.  For example:
>> +
>> +  # cat cpuset.cpus
>> +  0-4,6,8-10
>> +
>> +An empty value indicates that the cgroup is using the same
>> +setting as the nearest cgroup ancestor with a non-empty
>> +"cpuset.cpus" or all the available CPUs if none is found.
>> +
>> +The value of "cpuset.cpus" stays constant until the next update
>> +and won't be affected by any CPU hotplug events.
>> +
>> +  cpuset.effective_cpus
> Can we do cpuset.ecpus in the fashion of euid, egid..?

Sure.
>> +  cpuset.effective_mems
> Ditto.

Sure.

>> +  cpuset.flags
>> +A read-write multiple values file which exists on non-root
>> +cgroups.
>> +
>> +It lists the flags that are set (with a '+' prefix) and those
>> +that are not set (with a '-' prefix).   The currently supported
>> +flag is:
>> +
>> +  mem_migrate
>> +When it is not set, an allocated memory page will
>> +stay in whatever node it was allocated independent
>> +of changes in "cpuset.mems".
>> +
>> +When it is set, tasks with memory pages not in
>> +"cpuset.mems" will have those pages migrated over to
>> +memory nodes specified in "cpuset.mems".  Any changes
>> +to "cpuset.mems" will cause pages in nodes that are
>> +no longer valid to be migrated over to the newly
>> +valid nodes.
> Let's start just with [e]cpus and [e]mems.  The flags interface looks
> fine but the implementations of these features are really bad and
> cgroup2 doesn't migrate resources for other controllers either anyway.

That is added because the mem_migrate feature is used in libvirt, I
think. I am thinking of add a "[EXPERIMENTAL]" tag to the flags to
indicate that it is subject to change.

Cheers,
Longman
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] cpuset: Enable cpuset controller in default hierarchy

2017-11-27 Thread Tejun Heo
Hello, Waiman.

Sorry about the long delay.

On Fri, Oct 06, 2017 at 05:10:30PM -0400, Waiman Long wrote:
> +Cpuset Interface Files
> +~~
> +
> +  cpuset.cpus
> + A read-write multiple values file which exists on non-root
> + cgroups.
> +
> + It lists the CPUs allowed to be used by tasks within this
> + cgroup.  The CPU numbers are comma-separated numbers or
> + ranges.  For example:
> +
> +   # cat cpuset.cpus
> +   0-4,6,8-10
> +
> + An empty value indicates that the cgroup is using the same
> + setting as the nearest cgroup ancestor with a non-empty
> + "cpuset.cpus" or all the available CPUs if none is found.
> +
> + The value of "cpuset.cpus" stays constant until the next update
> + and won't be affected by any CPU hotplug events.
> +
> +  cpuset.effective_cpus

Can we do cpuset.ecpus in the fashion of euid, egid..?

> +  cpuset.effective_mems

Ditto.

> +  cpuset.flags
> + A read-write multiple values file which exists on non-root
> + cgroups.
> +
> + It lists the flags that are set (with a '+' prefix) and those
> + that are not set (with a '-' prefix).   The currently supported
> + flag is:
> +
> +   mem_migrate
> + When it is not set, an allocated memory page will
> + stay in whatever node it was allocated independent
> + of changes in "cpuset.mems".
> +
> + When it is set, tasks with memory pages not in
> + "cpuset.mems" will have those pages migrated over to
> + memory nodes specified in "cpuset.mems".  Any changes
> + to "cpuset.mems" will cause pages in nodes that are
> + no longer valid to be migrated over to the newly
> + valid nodes.

Let's start just with [e]cpus and [e]mems.  The flags interface looks
fine but the implementations of these features are really bad and
cgroup2 doesn't migrate resources for other controllers either anyway.

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] cpuset: Enable cpuset controller in default hierarchy

2017-11-14 Thread Waiman Long
On 10/26/2017 02:12 PM, Waiman Long wrote:
> On 10/26/2017 10:39 AM, Tejun Heo wrote:
>> Hello, Waiman.
>>
>> On Wed, Oct 25, 2017 at 11:50:34AM -0400, Waiman Long wrote:
>>> Ping! Any comment on this patch?
>> Sorry about the lack of response.  Here are my two thoughts.
>>
>> 1. I'm not really sure about the memory part.  Mostly because of the
>>way it's configured and enforced is completely out of step with how
>>mm behaves in general.  I'd like to get more input from mm folks on
>>this.
> Yes, I also have doubt about which of the additional features are being
> actively used. That is why the current patch exposes only the memory_migrate
> flag in addition to the core *cpus and *mems control files. All the
> other v1 features are not exposed waiting for further investigation and
> feedback. One way to get more feedback is to have something that people
> can play with. Maybe we could somehow tag it as experimental so that we
> can change the interface later on, when necessary, if you have concern
> about setting the APIs in stone.
>
>> 2. I want to think more about how we expose the effective settings.
>>Not that anything is wrong with what cpuset does, but more that I
>>wanna ensure that it's something we can follow in other cases where
>>we have similar hierarchical property propagation.
> Currently, the effective setting is exposed via the effective_cpus and
> effective_mems control files. Unlike other controllers that control
> resources, cpuset is unique in the sense that it is propagating
> hierarchical constraints on CPUs and memory nodes down the tree. I
> understand your desire to have a unified framework that can be applied
> to most controllers, but I doubt cpuset is a good model in this regard.

What do you think we can do for the 4.16 development cycle? I really
like to see some kind of at least experimental support for cpuset v2.
That may be the best way to gather feedback and decide what to do next.

Cheers,
Longman
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] cpuset: Enable cpuset controller in default hierarchy

2017-10-26 Thread Christian Brauner
On Thu, Oct 26, 2017 at 02:12:01PM -0400, Waiman Long wrote:
> On 10/26/2017 10:39 AM, Tejun Heo wrote:
> > Hello, Waiman.
> >
> > On Wed, Oct 25, 2017 at 11:50:34AM -0400, Waiman Long wrote:
> >> Ping! Any comment on this patch?

Fwiw, I just saw this patch today for some weird reason.

> > Sorry about the lack of response.  Here are my two thoughts.
> >
> > 1. I'm not really sure about the memory part.  Mostly because of the
> >way it's configured and enforced is completely out of step with how
> >mm behaves in general.  I'd like to get more input from mm folks on
> >this.
> 
> Yes, I also have doubt about which of the additional features are being
> actively used. That is why the current patch exposes only the memory_migrate
> flag in addition to the core *cpus and *mems control files. All the
> other v1 features are not exposed waiting for further investigation and
> feedback. One way to get more feedback is to have something that people
> can play with. Maybe we could somehow tag it as experimental so that we
> can change the interface later on, when necessary, if you have concern
> about setting the APIs in stone.

This sounds like a reasonable approach to me. The cpuset controller is quite
important from a userspace (especially container) perspective. So making this
an experimental feature for a while to gather feedback seems worth it. I'd be
happy to carry/receive some experimental patches in a liblxc branch for cgroup
v2 to see where the current cpuset controller implementation currently gets us
and send/discuss patches where needed.

> 
> > 2. I want to think more about how we expose the effective settings.
> >Not that anything is wrong with what cpuset does, but more that I
> >wanna ensure that it's something we can follow in other cases where
> >we have similar hierarchical property propagation.
> 
> Currently, the effective setting is exposed via the effective_cpus and
> effective_mems control files. Unlike other controllers that control
> resources, cpuset is unique in the sense that it is propagating
> hierarchical constraints on CPUs and memory nodes down the tree. I
> understand your desire to have a unified framework that can be applied
> to most controllers, but I doubt cpuset is a good model in this regard.
> 
> Cheers,
> Longman
> 
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] cpuset: Enable cpuset controller in default hierarchy

2017-10-26 Thread Waiman Long
On 10/26/2017 10:39 AM, Tejun Heo wrote:
> Hello, Waiman.
>
> On Wed, Oct 25, 2017 at 11:50:34AM -0400, Waiman Long wrote:
>> Ping! Any comment on this patch?
> Sorry about the lack of response.  Here are my two thoughts.
>
> 1. I'm not really sure about the memory part.  Mostly because of the
>way it's configured and enforced is completely out of step with how
>mm behaves in general.  I'd like to get more input from mm folks on
>this.

Yes, I also have doubt about which of the additional features are being
actively used. That is why the current patch exposes only the memory_migrate
flag in addition to the core *cpus and *mems control files. All the
other v1 features are not exposed waiting for further investigation and
feedback. One way to get more feedback is to have something that people
can play with. Maybe we could somehow tag it as experimental so that we
can change the interface later on, when necessary, if you have concern
about setting the APIs in stone.

> 2. I want to think more about how we expose the effective settings.
>Not that anything is wrong with what cpuset does, but more that I
>wanna ensure that it's something we can follow in other cases where
>we have similar hierarchical property propagation.

Currently, the effective setting is exposed via the effective_cpus and
effective_mems control files. Unlike other controllers that control
resources, cpuset is unique in the sense that it is propagating
hierarchical constraints on CPUs and memory nodes down the tree. I
understand your desire to have a unified framework that can be applied
to most controllers, but I doubt cpuset is a good model in this regard.

Cheers,
Longman

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3] cpuset: Enable cpuset controller in default hierarchy

2017-10-25 Thread Waiman Long
On 10/06/2017 05:10 PM, Waiman Long wrote:
> Given the fact that thread mode had been merged into 4.14, it is now
> time to enable cpuset to be used in the default hierarchy (cgroup v2)
> as it is clearly threaded.
>
> The cpuset controller had experienced feature creep since its
> introduction more than a decade ago. Besides the core cpus and mems
> control files to limit cpus and memory nodes, there are a bunch of
> additional features that can be controlled from the userspace. Some of
> the features are of doubtful usefulness and may not be actively used.
>
> After examining the source code of some sample users like systemd,
> libvirt and lxc for their use of those additional features, only
> memory_migrate is used by libvirt.
>
> This patch enables cpuset controller in the default hierarchy with a
> minimal set of features. Currently, only memory_migrate is supported.
> We can certainly add more features to the default hierarchy if there
> is a real user need for them later on.
>
> For features that are actually flags which are set internally, they are
> being combined into a single "cpuset.flags" control file. That includes
> the memory_migrate feature which is the only flag that is currently
> supported. When the "cpuset.flags" file is read, it contains either
> "+mem_migrate" (enabled) or "-mem_migrate" (disabled).
>
> To enable it, use
>
>   # echo +mem_migrate > cpuset.flags
>
> To disable it, use
>
>   # echo -mem_migrate > cpuset.flags
>
> Note that the flag name is changed to "mem_migrate" for better naming
> consistency.
>
> v3:
>  - Further trim the additional features down to just memory_migrate.
>  - Update Documentation/cgroup-v2.txt.
>
> Signed-off-by: Waiman Long 

Ping! Any comment on this patch?

Cheers,
Longman

--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html