Re: [Devel] [PATCH 3/8] memcg, slab: never try to merge memcg caches

2014-02-04 Thread Glauber Costa
On Tue, Feb 4, 2014 at 7:27 PM, Vladimir Davydov vdavy...@parallels.com wrote: On 02/04/2014 07:11 PM, Michal Hocko wrote: On Tue 04-02-14 18:59:23, Vladimir Davydov wrote: On 02/04/2014 06:52 PM, Michal Hocko wrote: On Sun 02-02-14 20:33:48, Vladimir Davydov wrote: Suppose we are creating

Re: [Devel] [PATCH 1/8] memcg: export kmemcg cache id via cgroup fs

2014-02-03 Thread Glauber Costa
On Mon, Feb 3, 2014 at 10:57 AM, Vladimir Davydov vdavy...@parallels.com wrote: On 02/03/2014 10:21 AM, David Rientjes wrote: On Sun, 2 Feb 2014, Vladimir Davydov wrote: Per-memcg kmem caches are named as follows: global-cache-name(cgroup-kmem-id:cgroup-name) where cgroup-kmem-id is the
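The naming scheme quoted above, global-cache-name(cgroup-kmem-id:cgroup-name), can be sketched in plain user-space C. This is only an illustration of the format: memcg_cache_name() is a hypothetical helper, not the kernel function, and the sample names are made up.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not kernel code): formats a per-memcg cache name as
 * global-cache-name(cgroup-kmem-id:cgroup-name).
 * Returns 0 on success, -1 if formatting fails or the buffer is too small.
 */
static int memcg_cache_name(char *buf, size_t len, const char *global_name,
                            int kmem_id, const char *cgroup_name)
{
    int n = snprintf(buf, len, "%s(%d:%s)", global_name, kmem_id, cgroup_name);

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Called with "dentry", 3 and "web", the helper fills the buffer with dentry(3:web). Truncation is reported as an error rather than silently accepted, since a cut-off name could collide with another cache.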

Re: [Devel] [PATCH v14 16/18] vmpressure: in-kernel notifications

2013-12-20 Thread Glauber Costa
will stay away from converting the actual users, you are all welcome to do so. Signed-off-by: Glauber Costa glom...@openvz.org Signed-off-by: Vladimir Davydov vdavy...@parallels.com Acked-by: Anton Vorontsov an...@enomsg.org Acked-by: Pekka Enberg penb...@kernel.org Reviewed-by: Greg Thelen gthe

Re: [Devel] [PATCH v14 16/18] vmpressure: in-kernel notifications

2013-12-20 Thread Glauber Costa
One correction: int vmpressure_register_kernel_event(struct cgroup_subsys_state *css, - void (*fn)(void)) +void (*fn)(void *data, int level), void *data) { - struct vmpressure *vmpr = css_to_vmpressure(css); +
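The corrected prototype passes an opaque data pointer and a level to the callback. A user-space sketch of that callback shape follows; struct pressure_event, register_event(), notify_event() and record_max_level() are hypothetical names for illustration and are not the vmpressure code.

```c
#include <stddef.h>

/* Callback type mirroring the corrected prototype: opaque data plus a level. */
typedef void (*pressure_fn)(void *data, int level);

/* Hypothetical registration record: the callback and the data handed back to it. */
struct pressure_event {
    pressure_fn fn;
    void *data;
};

/* Store the callback and its data; returns -1 on a NULL event or callback. */
static int register_event(struct pressure_event *ev, pressure_fn fn, void *data)
{
    if (!ev || !fn)
        return -1;
    ev->fn = fn;
    ev->data = data;
    return 0;
}

/* Deliver one notification to the registered callback, if any. */
static void notify_event(const struct pressure_event *ev, int level)
{
    if (ev && ev->fn)
        ev->fn(ev->data, level);
}

/* Example callback: keeps the highest level seen in the caller's int. */
static void record_max_level(void *data, int level)
{
    int *max = data;

    if (level > *max)
        *max = level;
}
```

With the old void (*fn)(void) form a callback has no way to know which registration fired or how severe the event was; carrying data and level lets one function serve several registrants.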

Re: [Devel] [PATCH v14 16/18] vmpressure: in-kernel notifications

2013-12-20 Thread Glauber Costa
On Fri, Dec 20, 2013 at 8:44 PM, Luiz Capitulino lcapitul...@redhat.com wrote: On Fri, 20 Dec 2013 10:03:32 -0500 Luiz Capitulino lcapitul...@redhat.com wrote: The answer for all of your questions above can be summarized by noting that for the lack of other users (at the time), this patch

Re: [Devel] [PATCH v14 16/18] vmpressure: in-kernel notifications

2013-12-20 Thread Glauber Costa
On Fri, Dec 20, 2013 at 8:53 PM, Luiz Capitulino lcapitul...@redhat.com wrote: On Fri, 20 Dec 2013 20:46:05 +0400 Glauber Costa glom...@gmail.com wrote: On Fri, Dec 20, 2013 at 8:44 PM, Luiz Capitulino lcapitul...@redhat.com wrote: On Fri, 20 Dec 2013 10:03:32 -0500 Luiz Capitulino

Re: [Devel] [PATCH 4/6] memcg, slab: check and init memcg_cahes under slab_mutex

2013-12-19 Thread Glauber Costa
On Thu, Dec 19, 2013 at 11:07 AM, Vladimir Davydov vdavy...@parallels.com wrote: On 12/18/2013 09:41 PM, Michal Hocko wrote: On Wed 18-12-13 17:16:55, Vladimir Davydov wrote: The memcg_params::memcg_caches array can be updated concurrently from memcg_update_cache_size() and

Re: [Devel] [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-17 Thread Glauber Costa
On Mon, Dec 2, 2013 at 10:15 PM, Michal Hocko mho...@suse.cz wrote: [CCing Glauber - please do so in other posts for kmem related changes] On Mon 02-12-13 17:08:13, Vladimir Davydov wrote: The KMEM_ACCOUNTED_ACTIVATED was introduced by commit a8964b9b (memcg: use static branches when code not

Re: [Devel] [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-17 Thread Glauber Costa
On Mon, Dec 2, 2013 at 10:51 PM, Michal Hocko mho...@suse.cz wrote: On Mon 02-12-13 22:26:48, Glauber Costa wrote: On Mon, Dec 2, 2013 at 10:15 PM, Michal Hocko mho...@suse.cz wrote: [CCing Glauber - please do so in other posts for kmem related changes] On Mon 02-12-13 17:08:13, Vladimir

Re: [Devel] [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-17 Thread Glauber Costa
On Mon, Dec 2, 2013 at 11:21 PM, Vladimir Davydov vdavy...@parallels.com wrote: On 12/02/2013 10:26 PM, Glauber Costa wrote: On Mon, Dec 2, 2013 at 10:15 PM, Michal Hocko mho...@suse.cz wrote: [CCing Glauber - please do so in other posts for kmem related changes] On Mon 02-12-13 17:08:13

Re: [Devel] [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-17 Thread Glauber Costa
Hi, Glauber Hi. In memcg_update_kmem_limit() we do the whole process of limit initialization under a mutex so the situation we need protection from in tcp_update_limit() is impossible. BTW once set, the 'activated' flag is never cleared and never checked alone, only along with the 'active'

Re: [Devel] [PATCH] memcg: remove KMEM_ACCOUNTED_ACTIVATED

2013-12-17 Thread Glauber Costa
Could you do something clever with just one flag? Probably yes. But I doubt it would be that much cleaner, this is just the way that patching sites work. Thank you for spending your time to listen to me. Don't worry! I thank you for carrying this forward. Let me try to explain what is

Re: [Devel] [PATCH v13 00/16] kmemcg shrinkers

2013-12-17 Thread Glauber Costa
Please note that in contrast to previous versions this patch-set implements slab shrinking only when we hit the user memory limit so that kmem allocations will still fail if we are below the user memory limit, but close to the kmem limit. This is because the implementation of kmem-only

Re: [Devel] [PATCH v13 04/16] memcg: move memcg_caches_array_size() function

2013-12-17 Thread Glauber Costa
On Mon, Dec 9, 2013 at 12:05 PM, Vladimir Davydov vdavy...@parallels.com wrote: I need to move this up a bit, and I am doing it in a separate patch just to reduce churn in the patch that needs it. Signed-off-by: Vladimir Davydov vdavy...@parallels.com Reviewed-by: Glauber Costa glom...@openvz.org

Re: [Devel] [PATCH v13 05/16] vmscan: move call to shrink_slab() to shrink_zones()

2013-12-17 Thread Glauber Costa
On Mon, Dec 9, 2013 at 12:05 PM, Vladimir Davydov vdavy...@parallels.com wrote: This reduces the indentation level of do_try_to_free_pages() and removes extra loop over all eligible zones counting the number of on-LRU pages. Looks correct to me.

Re: [Devel] [PATCH v13 14/16] vmpressure: in-kernel notifications

2013-12-17 Thread Glauber Costa
On Mon, Dec 9, 2013 at 12:05 PM, Vladimir Davydov vdavy...@parallels.com wrote: From: Glauber Costa glom...@openvz.org During the past weeks, it became clear to us that the shrinker interface It has been more than a few weeks by now =)

Re: [Devel] [PATCH v13 13/16] vmscan: take at least one pass with shrinkers

2013-12-17 Thread Glauber Costa
On Tue, Dec 10, 2013 at 3:50 PM, Vladimir Davydov vdavy...@parallels.com wrote: On 12/10/2013 08:18 AM, Dave Chinner wrote: On Mon, Dec 09, 2013 at 12:05:54PM +0400, Vladimir Davydov wrote: From: Glauber Costa glom...@openvz.org In very low free kernel memory situations, it may be the case

Re: [Devel] Race in memcg kmem?

2013-12-17 Thread Glauber Costa
On Tue, Dec 10, 2013 at 5:59 PM, Vladimir Davydov vdavy...@parallels.com wrote: Hi, Looking through the per-memcg kmem_cache initialization code, I have a bad feeling that it is prone to a race. Before getting to fixing it, I'd like to ensure this race is not only in my imagination. Here it

Re: [Devel] [PATCH v13 11/16] mm: list_lru: add per-memcg lists

2013-12-17 Thread Glauber Costa
OK, as far as I can tell, this is introducing per-node, per-memcg LRU lists. Is that correct? If so, then that is not what Glauber and I originally intended for memcg LRUs. per-node LRUs are expensive in terms of memory and cross multiplying them by the number of memcgs in a system was not

Re: [Devel] [PATCH 1/2] memcg: fix memcg_size() calculation

2013-12-17 Thread Glauber Costa
On Mon, Dec 16, 2013 at 8:47 PM, Michal Hocko mho...@suse.cz wrote: On Sat 14-12-13 12:15:33, Vladimir Davydov wrote: The mem_cgroup structure contains nr_node_ids pointers to mem_cgroup_per_node objects, not the objects themselves. Ouch! This is 2k per node which is wasted. What a shame I
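The sizing point above (the structure ends in an array of pointers to per-node objects, not the objects themselves) can be shown with a small stand-alone sketch. The struct names and the 2048-byte payload are assumptions chosen to echo the "2k per node" remark; this is not the kernel's memcg_size().

```c
#include <stddef.h>

/* Hypothetical stand-in for a per-node object of roughly 2k. */
struct per_node_info {
    char payload[2048];
};

/* Hypothetical stand-in for a structure that ends in an array of POINTERS. */
struct group {
    int id;
    struct per_node_info *nodeinfo[];
};

/* Over-allocates: counts a whole per-node object for every array slot. */
static size_t group_size_wrong(size_t nr_nodes)
{
    return sizeof(struct group) + nr_nodes * sizeof(struct per_node_info);
}

/* Counts what the array really holds: one pointer per node. */
static size_t group_size(size_t nr_nodes)
{
    return sizeof(struct group) + nr_nodes * sizeof(struct per_node_info *);
}
```

The difference between the two results is the memory wasted per allocation: one object size minus one pointer size for every node.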

[Devel] [PATCH v7 00/11] per-cgroup cpu-stat

2013-05-29 Thread Glauber Costa
got nowhere in the past. Glauber Costa (8): don't call cpuacct_charge in stop_task.c sched: adjust exec_clock to use it as cpu usage metric cpuacct: don't actually do anything. sched: document the cpu cgroup. sched: account guest time per-cgroup as well. sched: record per-cgroup number

[Devel] [PATCH v7 01/11] don't call cpuacct_charge in stop_task.c

2013-05-29 Thread Glauber Costa
this call quite useless. Signed-off-by: Glauber Costa glom...@openvz.org CC: Mike Galbraith mgalbra...@suse.de CC: Peter Zijlstra a.p.zijls...@chello.nl CC: Thomas Gleixner t...@linutronix.de --- kernel/sched/stop_task.c | 1 - 1 file changed, 1 deletion(-) diff --git a/kernel/sched/stop_task.c b

[Devel] [PATCH v7 04/11] sched: adjust exec_clock to use it as cpu usage metric

2013-05-29 Thread Glauber Costa
the independent hierarchy walk executed by cpuacct. Signed-off-by: Glauber Costa glom...@openvz.org CC: Dave Jones da...@redhat.com CC: Ben Hutchings b...@decadent.org.uk CC: Peter Zijlstra a.p.zijls...@chello.nl CC: Paul Turner p...@google.com CC: Lennart Poettering lenn...@poettering.net CC: Kay

[Devel] [PATCH v7 06/11] sched: document the cpu cgroup.

2013-05-29 Thread Glauber Costa
The CPU cgroup is, so far, undocumented. Although data exists in the Documentation directory about its functioning, it is usually spread, and/or presented in the context of something else. This file consolidates all cgroup-related information about it. Signed-off-by: Glauber Costa glom

[Devel] [PATCH v7 07/11] sched: account guest time per-cgroup as well.

2013-05-29 Thread Glauber Costa
We already track multiple tick statistics per-cgroup, using the task_group_account_field facility. This patch accounts guest_time in that manner as well. Signed-off-by: Glauber Costa glom...@openvz.org CC: Peter Zijlstra a.p.zijls...@chello.nl CC: Paul Turner p...@google.com --- kernel/sched

[Devel] [PATCH v7 08/11] sched: Push put_prev_task() into pick_next_task()

2013-05-29 Thread Glauber Costa
. [ glom...@openvz.org: incorporated mailing list feedback ] Signed-off-by: Peter Zijlstra a.p.zijls...@chello.nl Signed-off-by: Glauber Costa glom...@openvz.org --- kernel/sched/core.c | 20 +++- kernel/sched/fair.c | 6 +- kernel/sched/idle_task.c | 6 +- kernel

[Devel] [PATCH v7 09/11] sched: record per-cgroup number of context switches

2013-05-29 Thread Glauber Costa
not likely, it seems a fair price to pay. 2. Those figures do not include switches from and to the idle or stop task. Those need to be recorded separately, which will happen in a follow up patch. Signed-off-by: Glauber Costa glom...@openvz.org CC: Peter Zijlstra a.p.zijls...@chello.nl CC: Paul

[Devel] [PATCH v7 10/11] sched: change nr_context_switches calculation.

2013-05-29 Thread Glauber Costa
classes are recorded in the root_task_group. One can easily derive the total figure by adding those quantities together. Signed-off-by: Glauber Costa glom...@openvz.org CC: Peter Zijlstra a.p.zijls...@chello.nl CC: Paul Turner p...@google.com --- kernel/sched/core.c | 17 +++-- kernel

[Devel] [PATCH v7 11/11] sched: introduce cgroup file stat_percpu

2013-05-29 Thread Glauber Costa
1996534 7205 1 cpu1 58800 0 1700 0 0 0 0 2848680 6510 1 cpu2 50500 0 1400 0 0 0 0 2350771 6183 1 cpu3 47200 0 1600 0 0 0 0 19766345 6277 2 Signed-off-by: Glauber Costa glom...@openvz.org CC: Peter Zijlstra a.p.zijls...@chello.nl CC: Paul Turner p...@google.com

[Devel] [PATCH v7 03/11] cgroup, sched: let cpu serve the same files as cpuacct

2013-05-29 Thread Glauber Costa
and creating a base on top of which cpu can implement proper optimization. [ glommer: don't call *_charge in stop_task.c ] Signed-off-by: Tejun Heo t...@kernel.org Signed-off-by: Glauber Costa glom...@openvz.org Cc: Peter Zijlstra pet...@infradead.org Cc: Michal Hocko mho...@suse.cz Cc: Kay Sievers

[Devel] [PATCH v7 05/11] cpuacct: don't actually do anything.

2013-05-29 Thread Glauber Costa
All the information we have that is needed for cpuusage (and cpuusage_percpu) is present in schedstats. It is already recorded in a sane hierarchical way. If we have CONFIG_SCHEDSTATS, we don't really need to do any extra work. All former functions become empty inlines. Signed-off-by: Glauber

[Devel] [PATCH v7 02/11] cgroup: implement CFTYPE_NO_PREFIX

2013-05-29 Thread Glauber Costa
files. Signed-off-by: Tejun Heo t...@kernel.org Cc: Peter Zijlstra pet...@infradead.org Cc: Glauber Costa glom...@openvz.org --- include/linux/cgroup.h | 1 + kernel/cgroup.c| 3 ++- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/include/linux/cgroup.h b/include/linux/cgroup.h

[Devel] [PATCH 0/3] Container fixups

2013-05-20 Thread Glauber Costa
(and it is not clear if it will ever be) With these patches, I can successfully run vzctl enter and ssh into containers running totally unmodified kernels for: centos, ubuntu and suse. Please comment Glauber Costa (3): hooks_ct: create devices inside container allow for distro-specific fix ups at creation

[Devel] [PATCH 2/3] allow for distro-specific fix ups at creation time.

2013-05-20 Thread Glauber Costa
From: Glauber Costa glom...@parallels.com We will need that infrastructure when running with Linux upstream, since some support is very unlikely to ever land in the Kernel. We need to do things like account for the fact that udev may kick in and destroy all the setup we have done for /dev. Since

[Devel] [PATCH 1/3] hooks_ct: create devices inside container

2013-05-20 Thread Glauber Costa
side. We can do it from the container side provided we do it before we chroot - and then the host side fs is still visible. The fact that we join a mount namespace will act to keep those mounts totally private, and exempt us from cleaning it up. Signed-off-by: Glauber Costa glom...@openvz.org

[Devel] [PATCH 3/3] hooks_ct: trick PAM to not bail out in loginuid failures

2013-05-20 Thread Glauber Costa
scripts. Signed-off-by: Glauber Costa glom...@openvz.org --- src/lib/hooks_ct.c | 44 1 file changed, 44 insertions(+) diff --git a/src/lib/hooks_ct.c b/src/lib/hooks_ct.c index f769865..a4f9425 100644 --- a/src/lib/hooks_ct.c +++ b/src/lib/hooks_ct.c

Re: [Devel] [PATCH 2/3] allow for distro-specific fix ups at creation time.

2013-05-20 Thread Glauber Costa
On 05/21/2013 12:32 AM, Kir Kolyshkin wrote: On 05/20/2013 05:49 AM, Glauber Costa wrote: From: Glauber Costa glom...@parallels.com We will need that infrastructure when running with Linux upstream, since some support is very unlikely to ever land in the Kernel. We need to do things like

[Devel] [PATCH v2 0/2] distro fixups

2013-05-20 Thread Glauber Costa
Kir, Here is a new attempt at implementing fixups scripts. They look nicer than the last version, and rely on a more generic and configurable script instead, that should make our lives a lot easier in the future. Please let me know what you think Glauber Costa (2): allow for distro-specific

[Devel] [PATCH v2 2/2] prestart: fixup legacy udev effects

2013-05-20 Thread Glauber Costa
/.autofsck. We can check if the file was modified (non-existent -> existent, or different modification time) and run our fixups after this. Signed-off-by: Glauber Costa glom...@openvz.org --- etc/dists/scripts/prestart.sh | 36 1 file changed, 36 insertions
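The check described above (the file went from non-existent to existent, or its modification time changed) is implemented in the patch as a shell script. The C sketch below only restates the same decision rule; struct file_snapshot, take_snapshot() and file_changed() are hypothetical names.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical record of what we knew about a file at one point in time. */
struct file_snapshot {
    bool exists;
    time_t mtime;
};

/* Capture existence and modification time; a failed stat() means "absent". */
static struct file_snapshot take_snapshot(const char *path)
{
    struct stat st;
    struct file_snapshot snap = { false, 0 };

    if (stat(path, &st) == 0) {
        snap.exists = true;
        snap.mtime = st.st_mtime;
    }
    return snap;
}

/* True if the file appeared, or exists in both snapshots with a new mtime. */
static bool file_changed(struct file_snapshot before, struct file_snapshot after)
{
    if (!before.exists && after.exists)
        return true;
    return before.exists && after.exists && before.mtime != after.mtime;
}
```

A caller would take one snapshot before the boot scripts run and one after, and run the fixups only when file_changed() reports true. A file that disappears is deliberately not treated as a change, since the description names only the two cases above.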

Re: [Devel] [PATCH v5 4/6] modify tar extraction to account for user namespace

2013-05-19 Thread Glauber Costa
On 05/19/2013 09:41 PM, Kir Kolyshkin wrote: + */ +#define VZ_DEFAULT_UID10 +#define VZ_DEFAULT_GID10 I assume these are no longer used, right? right

[Devel] [PATCH] allow for distro-specific fix ups at creation time.

2013-05-19 Thread Glauber Costa
From: Glauber Costa glom...@parallels.com We will need that infrastructure when running with Linux upstream, since some support is very unlikely to ever land in the Kernel. We need to do things like account for the fact that udev may kick in and destroy all the setup we have done for /dev. Since

Re: [Devel] [PATCH v5 6/6] allow for distro-specific fix ups at creation time.

2013-05-18 Thread Glauber Costa
+{ +char buf[STR_SIZE]; + +/* Distributions that don't need the fixup can stop right here */ +if (!actions || !actions->ct_fixup) +return 0; + +if (snprintf(buf, sizeof(buf), "%s/%s", root, "/etc/rc3.d/S00vz-fixups.sh") < 0) Again and again :( How this snprintf

Re: [Devel] [PATCH v4 2/7] add user mismatch test

2013-05-17 Thread Glauber Costa
On 05/17/2013 04:18 AM, Kir Kolyshkin wrote: +stat(res->fs.private, &private_stat); +if ((local_uid && (private_stat.st_uid != *local_uid)) || +(local_gid && (private_stat.st_gid != *local_gid))) { As I just commented at the very end of a previous patch, indeed it does
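The ownership test quoted in this review can be isolated into a small predicate. This is a sketch under the assumption that a NULL local_uid or local_gid means "not configured, skip that check", as the quoted condition suggests; owner_mismatch() is a hypothetical name, not a vzctl function.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical predicate: true when the private area's owner differs from a
 * configured uid or gid. A NULL pointer means that id is not configured and
 * is therefore not checked.
 */
static bool owner_mismatch(const struct stat *st, const uid_t *local_uid,
                           const gid_t *local_gid)
{
    return (local_uid && st->st_uid != *local_uid) ||
           (local_gid && st->st_gid != *local_gid);
}
```

A caller would fill the struct stat with stat() on the private area and refuse to start the container when the predicate returns true. Checking the return value of stat() before using the result is left to the caller.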

Re: [Devel] [PATCH v4 1/7] user namespace support for upstream containers

2013-05-17 Thread Glauber Costa
On 05/17/2013 04:11 AM, Kir Kolyshkin wrote: +if ((arg->userns_p != -1) && (read(arg->userns_p, &ret, sizeof(ret)) != sizeof(ret))) { +logger(-1, errno, "Cannot read from user namespace pipe"); We don't close arg->userns_p in case of error here. And it seems we do not close the other

Re: [Devel] [PATCH v4 3/7] Also pass cmd_p pointer to container open

2013-05-17 Thread Glauber Costa
On 05/17/2013 05:35 AM, Kir Kolyshkin wrote: This is kinda becoming over-engineered (and now I realized I should have said it earlier, when reviewing the patch that added the first param). I understand well why you need local_uid and local_gid from config and cmdline in ct_open(), but

[Devel] [PATCH v5 0/6] User namespace support for upstream containers

2013-05-17 Thread Glauber Costa
will go back to it ASAP. Glauber Costa (6): user namespace support for upstream containers add user mismatch test allow local uid and gid to be specified at container creation modify tar extraction to account for user namespace automatically add bridge venet0 when needed allow for distro

[Devel] [PATCH v5 1/6] user namespace support for upstream containers

2013-05-17 Thread Glauber Costa
From: Glauber Costa glom...@parallels.com This patch allows the execution of unprivileged containers running on top of an upstream Linux Kernel. We will run at whatever UID is found in the configuration file (so far empty, thus disabled). Signed-off-by: Glauber Costa glom...@parallels.com

[Devel] [PATCH v5 2/6] add user mismatch test

2013-05-17 Thread Glauber Costa
From: Glauber Costa glom...@parallels.com In theory, we won't be able to run if our private area is not owned by ourselves. We could, if it has very wide open security permissions, but we should never set up a container like that. Aside from a basic sanity check, this is intended to catch

[Devel] [PATCH v5 3/6] allow local uid and gid to be specified at container creation

2013-05-17 Thread Glauber Costa
From: Glauber Costa glom...@parallels.com It is a valid use case to run a container with host uid and gid different than the default. In particular, already deployed versions of vzctl are expected to have this value unset, effectively meaning they are not expecting user namespaces to be present

[Devel] [PATCH v5 4/6] modify tar extraction to account for user namespace

2013-05-17 Thread Glauber Costa
From: Glauber Costa glom...@parallels.com If we are running upstream with user namespaces, we need to create the container filesystem not with the ownership preserved, but reflecting the mapping we need to apply. Note that according to our documentation, we should ignore this if the user

[Devel] [PATCH v5 5/6] automatically add bridge venet0 when needed

2013-05-17 Thread Glauber Costa
From: Glauber Costa glom...@parallels.com The chosen architecture to deal with --ipadd with upstream containers is to create a veth pair and add the host side information to a bridge called venet0. This way, all the code that expects venet0 to exist can still work without modifications

[Devel] [PATCH v5 6/6] allow for distro-specific fix ups at creation time.

2013-05-17 Thread Glauber Costa
From: Glauber Costa glom...@parallels.com We will need that infrastructure when running with Linux upstream, since some support is very unlikely to ever land in the Kernel. We need to do things like account for the fact that udev may kick in and destroy all the setup we have done for /dev. Since

Re: [Devel] [PATCH 1/6] vzctl: split ct_env_create

2013-05-16 Thread Glauber Costa
On 05/16/2013 04:14 PM, Andrey Vagin wrote: + ret = ct_env_create_real(arg); + if (ret < 0) return VZ_RESOURCE_ERROR; - } Isn't it better to just keep the return values intact in create_real, and then return them as is if ret != 0?

Re: [Devel] [PATCH 2/6] vzctl: save PID of init in a state file

2013-05-16 Thread Glauber Costa
On 05/16/2013 04:14 PM, Andrey Vagin wrote: CRIU requires a pid of the init. Signed-off-by: Andrey Vagin ava...@openvz.org The way you coded it, it seems to me that we will always overwrite the pid file, which is fine: this way we won't run into the usual pid file already exists kinds of

Re: [Devel] [PATCH v3 4/9] user namespace support for upstream containers

2013-05-14 Thread Glauber Costa
On 05/14/2013 07:02 AM, Kir Kolyshkin wrote: Oh my, four cases of whitespace-at-eol which I had to fix manually. Sorry about that. I think I got so used to checkpatch and the like that I forget to verify all of those manually. Wanted to apply it nevertheless and fix some things myself, but

Re: [Devel] [PATCH v3 0/9] Upstream Linux support for userns

2013-05-14 Thread Glauber Costa
On 05/14/2013 07:03 AM, Kir Kolyshkin wrote: On 04/29/2013 10:16 PM, Glauber Costa wrote: Kir, Please review the following patchset. The main difference from last version is that we support running with userns disabled even if it is present. This effectively means that containers that were

Re: [Devel] [PATCH v3 7/9] modify tar extraction to account for user namespace

2013-05-14 Thread Glauber Costa
On 05/14/2013 07:17 AM, Kir Kolyshkin wrote: Hmm... If I understand it correctly, in case LOCAL_UID/LOCAL_GID is not set in the global configuration file, and not supplied from command line, here you apply the default values of 1. The problem I see these values are not saved into

Re: [Devel] [PATCH v3 6/9] allow local uid and gid to be specified at container creation

2013-05-14 Thread Glauber Costa
On 05/14/2013 07:09 AM, Kir Kolyshkin wrote: +When running with an upstream Linux Kernel that supports user namespaces (>= +3.8), the parameters \fB--local_uid\fR and \fB--local_gid\fR can be used to +select which \fIuid\fR and \fIgid\fR respectively will be used as a base user +in the host

Re: [Devel] [PATCH v3 7/9] modify tar extraction to account for user namespace

2013-05-14 Thread Glauber Costa
On 05/14/2013 07:17 AM, Kir Kolyshkin wrote: Hmm... If I understand it correctly, in case LOCAL_UID/LOCAL_GID is not set in the global configuration file, and not supplied from command line, here you apply the default values of 1. The problem I see these values are not saved into

[Devel] [PATCH v4 3/7] Also pass cmd_p pointer to container open

2013-05-14 Thread Glauber Costa
the command line. If that is the case, this value should take precedence. To achieve this, we should also pass the cmd_p information to the open functions. Signed-off-by: Glauber Costa glom...@openvz.org --- include/env.h | 8 +--- src/lib/env.c | 7 --- src/lib/hooks_ct.c | 2

[Devel] [PATCH v4 4/7] allow local uid and gid to be specified at container creation

2013-05-14 Thread Glauber Costa
From: Glauber Costa glom...@parallels.com It is a valid use case to run a container with host uid and gid different than the default. In particular, already deployed versions of vzctl are expected to have this value unset, effectively meaning they are not expecting user namespaces to be present

[Devel] [PATCH v4 5/7] modify tar extraction to account for user namespace

2013-05-14 Thread Glauber Costa
From: Glauber Costa glom...@parallels.com If we are running upstream with user namespaces, we need to create the container filesystem not with the ownership preserved, but reflecting the mapping we need to apply. Note that according to our documentation, we should ignore this if the user

[Devel] [PATCH v4 6/7] automatically add bridge venet0 when needed

2013-05-14 Thread Glauber Costa
From: Glauber Costa glom...@parallels.com The chosen architecture to deal with --ipadd with upstream containers is to create a veth pair and add the host side information to a bridge called venet0. This way, all the code that expects venet0 to exist can still work without modifications

[Devel] [PATCH v4 7/7] allow for distro-specific fix ups at creation time.

2013-05-14 Thread Glauber Costa
From: Glauber Costa glom...@parallels.com We will need that infrastructure when running with Linux upstream, since some support is very unlikely to ever land in the Kernel. We need to do things like account for the fact that udev may kick in and destroy all the setup we have done for /dev. Since

[Devel] [PATCH v4 1/7] user namespace support for upstream containers

2013-05-14 Thread Glauber Costa
From: Glauber Costa glom...@parallels.com This patch allows the execution of unprivileged containers running on top of an upstream Linux Kernel. We will run at whatever UID is found in the configuration file (so far empty, thus disabled). Signed-off-by: Glauber Costa glom...@parallels.com

[Devel] [PATCH v4 2/7] add user mismatch test

2013-05-14 Thread Glauber Costa
From: Glauber Costa glom...@parallels.com In theory, we won't be able to run if our private area is not owned by ourselves. We could, if it has very wide open security permissions, but we should never set up a container like that. Aside from a basic sanity check, this is intended to catch

[Devel] [PATCH v4 0/7] User namespace support for upstream containers

2013-05-14 Thread Glauber Costa
for userns presence very early during creation time. I think this very much clarifies our handling of the command line option. The documentation is also changed as you requested, for consistency. Glauber Costa (7): user namespace support for upstream containers add user mismatch test Also pass

Re: [Devel] [PATCH v3 7/9] modify tar extraction to account for user namespace

2013-05-13 Thread Glauber Costa
On 05/11/2013 03:53 AM, Igor M Podlesny wrote: On 30 April 2013 13:16, Glauber Costa glom...@openvz.org wrote: From: Glauber Costa glom...@parallels.com To work around that, we can employ a trick to allow container creation right now, as well as to avoid compatibility problems: we will resort

Re: [Devel] [PATCH 3/3] hooks_ct: fix ub limits setting for upstream containers

2013-05-13 Thread Glauber Costa
On 05/11/2013 05:07 AM, Igor M Podlesny wrote: On 30 April 2013 11:46, Glauber Costa glom...@openvz.org wrote: Currently, vps_setup_res() has an explicit test for state != VPS_STARTING before applying beancounter limits. This means that we can set limits without further problems when

Re: [Devel] [PATCH v3 4/9] user namespace support for upstream containers

2013-05-13 Thread Glauber Costa
On 05/11/2013 04:14 AM, Igor M Podlesny wrote: On 30 April 2013 13:16, Glauber Costa glom...@openvz.org wrote: @@ -576,7 +765,9 @@ int ct_do_open(vps_handler *h, vps_param *param) { int ret; char path[STR_SIZE]; + char upath[STR_SIZE]; struct stat st

Re: [Devel] [PATCH 3/3] hooks_ct: fix ub limits setting for upstream containers

2013-05-13 Thread Glauber Costa
On 05/13/2013 12:11 PM, Igor M Podlesny wrote: On 13 May 2013 16:11, Glauber Costa glom...@parallels.com wrote: On 05/13/2013 12:08 PM, Igor M Podlesny wrote: On 13 May 2013 15:50, Glauber Costa glom...@parallels.com wrote: [...] Aren't macros supposed to be UPPER CASE named? Yes

[Devel] [PATCH 0/3] Fixes for upstream containers

2013-05-10 Thread Glauber Costa
-vz option, while Patch #3 fixes a bug present in all versions. Thanks Glauber Costa (3): hooks_ct: fix gcc warning ub: compile ub support for non vz kernels as well hooks_ct: fix ub limits setting for upstream containers include/ub.h| 40 -- src/lib/Makefile.am | 3

[Devel] [PATCH 1/3] hooks_ct: fix gcc warning

2013-05-10 Thread Glauber Costa
Signed-off-by: Glauber Costa glom...@openvz.org --- src/lib/hooks_ct.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/lib/hooks_ct.c b/src/lib/hooks_ct.c index 03a18e7..63af536 100644 --- a/src/lib/hooks_ct.c +++ b/src/lib/hooks_ct.c @@ -247,7 +247,7 @@ static int

[Devel] [PATCH 2/3] ub: compile ub support for non vz kernels as well

2013-05-10 Thread Glauber Costa
From: Glauber Costa glom...@parallels.com Commit c9d9170b0 fixed a bug by not including the ub functions if VZ support was not present in the running kernel, and replacing them with empty stubs. This approach, however, proved to be too aggressive. We need to at least be able to read and write

[Devel] [PATCH 3/3] hooks_ct: fix ub limits setting for upstream containers

2013-05-10 Thread Glauber Costa
for it in its startup function. We should do the same, and call our version of setlimits ourselves when the container is coming up. Signed-off-by: Glauber Costa glom...@openvz.org --- src/lib/hooks_ct.c | 147 +++-- 1 file changed, 76 insertions

[Devel] [PATCH] cgroups: fix set command with beancounters upstream

2013-05-10 Thread Glauber Costa
amount minus two pages, which should effectively mean accounting turned on, but unlimited Signed-off-by: Glauber Costa glom...@openvz.org --- src/lib/cgroup.c | 13 + 1 file changed, 13 insertions(+) diff --git a/src/lib/cgroup.c b/src/lib/cgroup.c index 9185d46..ae7fe5c 100644

[Devel] [PATCH v3 0/9] Upstream Linux support for userns

2013-05-10 Thread Glauber Costa
is in place). Glauber Costa (9): host uid and gid parameters adjust fs_create parameter pass parameters to open user namespace support for upstream containers add user mismatch test allow local uid and gid to be specified at container creation modify tar extraction to account for user

[Devel] [PATCH v3 1/9] host uid and gid parameters

2013-05-10 Thread Glauber Costa
From: Glauber Costa glom...@parallels.com When running with an upstream Linux kernel that supports user namespaces, we will run the container using an unprivileged user in the system. That can be any user, and it serves as base to a 1:1 mapping between users in the container and users in the host

[Devel] [PATCH v3 2/9] adjust fs_create parameter

2013-05-10 Thread Glauber Costa
From: Glauber Costa glom...@parallels.com We need to pass more information to fs_create. Instead of adding arguments, it is preferred to pass the whole vps_p structure and unfold it inside the callee. Signed-off-by: Glauber Costa glom...@parallels.com --- src/lib/create.c | 12 +++- 1

[Devel] [PATCH v3 3/9] pass parameters to open

2013-05-10 Thread Glauber Costa
. Signed-off-by: Glauber Costa glom...@openvz.org --- include/env.h | 6 +++--- src/lib/env.c | 7 --- src/lib/hooks_ct.c | 2 +- src/lib/hooks_vz.c | 2 +- src/vzctl-actions.c | 2 +- 5 files changed, 10 insertions(+), 9 deletions(-) diff --git a/include/env.h b/include/env.h

[Devel] [PATCH v3 4/9] user namespace support for upstream containers

2013-05-10 Thread Glauber Costa
From: Glauber Costa glom...@parallels.com This patch allows the execution of unprivileged containers running on top of an upstream Linux Kernel. We will run at whatever UID is found in the configuration file (so far empty, thus disabled). Signed-off-by: Glauber Costa glom...@parallels.com

[Devel] [PATCH v3 5/9] add user mismatch test

2013-05-10 Thread Glauber Costa
From: Glauber Costa glom...@parallels.com In theory, we won't be able to run if our private area is not owned by ourselves. We could, if it has very wide open security permissions, but we should never set up a container like that. Aside from a basic sanity check, this is intended to catch

[Devel] [PATCH v3 7/9] modify tar extraction to account for user namespace

2013-05-10 Thread Glauber Costa
From: Glauber Costa glom...@parallels.com If we are running upstream with user namespaces, we need to create the container filesystem not with the ownership preserved, but reflecting the mapping we need to apply. Note that according to our documentation, we should ignore this if the user

[Devel] [PATCH v3 8/9] automatically add bridge venet0 when needed

2013-05-10 Thread Glauber Costa
From: Glauber Costa glom...@parallels.com The chosen architecture to deal with --ipadd with upstream containers is to create a veth pair and add the host side information to a bridge called venet0. This way, all the code that expects venet0 to exist can still work without modifications

[Devel] [PATCH v3 9/9] allow for distro-specific fix ups at creation time.

2013-05-10 Thread Glauber Costa
From: Glauber Costa glom...@parallels.com We will need that infrastructure when running with Linux upstream, since some support is very unlikely to ever land in the Kernel. We need to do things like account for the fact that udev may kick in and destroy all the setup we have done for /dev. Since

Re: [Devel] [PATCH] cgroups: fix set command with beancounters upstream

2013-04-30 Thread Glauber Costa
On 04/30/2013 01:48 PM, Kir Kolyshkin wrote: On 04/29/2013 10:12 PM, Glauber Costa wrote: The kernel memory controller cannot flip states from unlimited to limited if there are already tasks in it. Therefore, we always have to run with *some* value of kmem enabled. If we don't do it, we

Re: [Devel] [PATCH v2 2/8] adjust fs_create parameter

2013-04-01 Thread Glauber Costa
On 04/01/2013 07:44 PM, Kir Kolyshkin wrote: On 04/01/2013 08:37 AM, Glauber Costa wrote: On 04/01/2013 07:13 PM, Dmitry Guryanov wrote: On 130401 19:04:37, Glauber Costa wrote: On 04/01/2013 06:58 PM, Dmitry Guryanov wrote: On 130322 14:48:16, Glauber Costa wrote: We need to pass more

[Devel] [PATCH v2 0/8] upstream Linux support for userns

2013-03-22 Thread Glauber Costa
implements --ipadd (now all infrastructure is in place), we can ssh into containers due to issues related to the proc filesystem. Let me know if there are any issues, I'll happily fix them. Glauber Costa (8): host uid and gid parameters adjust fs_create parameter user namespace support

[Devel] [PATCH v2 1/8] host uid and gid parameters

2013-03-22 Thread Glauber Costa
will be used for both uid and gid. Signed-off-by: Glauber Costa glom...@parallels.com --- include/res.h | 8 include/vzctl_param.h | 3 +++ src/lib/config.c | 32 3 files changed, 43 insertions(+) diff --git a/include/res.h b/include/res.h index

[Devel] [PATCH v2 3/8] user namespace support for upstream containers

2013-03-22 Thread Glauber Costa
This patch allows the execution of unprivileged containers running on top of an upstream Linux Kernel. We will run at whatever UID is found in the configuration file. Signed-off-by: Glauber Costa glom...@parallels.com --- include/env.h | 1 + include/types.h| 1 + src/lib/hooks_ct.c

[Devel] [PATCH v2 4/8] modify tar extraction to account for user namespace

2013-03-22 Thread Glauber Costa
the offset manually. Signed-off-by: Glauber Costa glom...@parallels.com --- scripts/vps-create.in | 19 ++ src/lib/Makefile.am | 3 ++ src/lib/chown_preload.c | 93 + src/lib/create.c| 21 +++ vzctl.spec

[Devel] [PATCH v2 5/8] add user mismatch test

2013-03-22 Thread Glauber Costa
already created containers that will be owned by root:root, and will now try to run it unprivileged. Signed-off-by: Glauber Costa glom...@parallels.com --- src/lib/env.c | 13 + 1 file changed, 13 insertions(+) diff --git a/src/lib/env.c b/src/lib/env.c index 2da848d..ff4dad2 100644

[Devel] [PATCH v2 6/8] allow local uid and gid to be specified at container creation

2013-03-22 Thread Glauber Costa
It is a valid use case to run a container with host uid and gid different than the default. This patch provides and documents a way to do so. Signed-off-by: Glauber Costa glom...@parallels.com --- man/vzctl.8.in | 14 ++ src/vzctl-actions.c | 2 ++ src/vzctl.c | 1 + 3

[Devel] [PATCH v2 7/8] automatically add bridge venet0 when needed

2013-03-22 Thread Glauber Costa
that was actually already stated in the comments, but the code was removed before merging because --ipadd would not work without full unshare support anyway. This patch implements that. Signed-off-by: Glauber Costa glom...@parallels.com --- scripts/vps-functions.in | 7 +++ src/lib/hooks_ct.c

[Devel] [PATCH v2 8/8] allow for distro-specific fix ups at creation time.

2013-03-22 Thread Glauber Costa
operation done by /sbin/init and rc.sysinit, therefore allowing operation to continue freely. Signed-off-by: Glauber Costa glom...@parallels.com --- etc/dists/redhat.conf | 1 + etc/dists/scripts/fixups.sh | 43 +++ include/dist.h | 2

[Devel] [PATCH v2 0/6] Unprivileged containers with user namespaces

2013-03-12 Thread Glauber Costa
to be specified as uid/gid offset. It simplifies the code if conf_parse_ulong is used, and well, if anyone *really* wants to run privileged... We will apply the default value now only if the fields are unset. Glauber Costa (6): host uid and gid parameters adjust fs_create parameter user namespace

[Devel] [PATCH v2 1/6] host uid and gid parameters

2013-03-12 Thread Glauber Costa
will be used for both uid and gid. Signed-off-by: Glauber Costa glom...@parallels.com --- include/res.h | 8 include/vzctl_param.h | 3 +++ src/lib/config.c | 32 3 files changed, 43 insertions(+) diff --git a/include/res.h b/include/res.h index

[Devel] [PATCH v2 2/6] adjust fs_create parameter

2013-03-12 Thread Glauber Costa
We need to pass more information to fs_create. Instead of adding arguments, it is preferred to pass the whole vps_p structure and unfold it inside the callee. Signed-off-by: Glauber Costa glom...@parallels.com --- src/lib/create.c | 13 - 1 file changed, 8 insertions(+), 5 deletions

[Devel] [PATCH v2 3/6] user namespace support for upstream containers

2013-03-12 Thread Glauber Costa
This patch allows the execution of unprivileged containers running on top of an upstream Linux Kernel. We will run at whatever UID is found in the configuration file. Signed-off-by: Glauber Costa glom...@parallels.com --- include/types.h| 1 + src/lib/env.c | 16 ++ src/lib

[Devel] [PATCH v2 4/6] modify tar extraction to account for user namespace

2013-03-12 Thread Glauber Costa
the offset manually. Signed-off-by: Glauber Costa glom...@parallels.com --- scripts/vps-create.in | 19 ++ src/lib/Makefile.am | 3 ++ src/lib/chown_preload.c | 93 + src/lib/create.c| 21 +++ vzctl.spec
