Re: [systemd-devel] SD_BUS_VTABLE_CAPABILITY
Hi Andy, On Thu, Apr 16, 2015 at 12:30:28PM -0700, Andy Lutomirski wrote: On Thu, Apr 16, 2015 at 11:23 AM, Lennart Poettering lenn...@poettering.net wrote: [...] AFAICT this piece of kdbus code serves to enable a rather odd way to write privilege-separated services to change the time and kill processes. The cost is complex security code that, at best, fails secure in the presence of different user namespaces, and the cost also involves touching a global refcount for each message sent (this might be the *only* thing that would reference init_user_ns's refcount when sending). Oh yeah, the cost is also ABI crap -- if, say, my The global ref-counts on metadata is just a workaround due to userns and caps. I actually thought we already sorted that out? https://lkml.org/lkml/2015/3/25/702 Hmm there are other paths that refs user_ns, the mqueue notification perhaps ? Please note that we also have _per_ user quota accounting, the trade off of accouting prevents further performance penalties on other bus operations. Referring to performance critical code, this code path can just be ignored by to not opt-in for KDBUS_ATTACH_CAPS which is the default behaviour. Thanks! -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH v2] log: be more verbose if dbus job fails
keep things synced... +static const char* const service_result_table[_SERVICE_RESULT_MAX] = { +[SERVICE_SUCCESS] = success, +[SERVICE_FAILURE_RESOURCES] = resources, +[SERVICE_FAILURE_TIMEOUT] = timeout, +[SERVICE_FAILURE_EXIT_CODE] = exit-code, +[SERVICE_FAILURE_SIGNAL] = signal, +[SERVICE_FAILURE_CORE_DUMP] = core-dump, +[SERVICE_FAILURE_WATCHDOG] = watchdog, +[SERVICE_FAILURE_START_LIMIT] = start-limit +}; + +DEFINE_PRIVATE_STRING_TABLE_LOOKUP_FROM_STRING(service_result, ServiceResult); + +static const char* const service_result_error_message_table[_SERVICE_RESULT_MAX] = { +[SERVICE_FAILURE_RESOURCES] = configured resource limit was exceeded, +[SERVICE_FAILURE_TIMEOUT] = , +[SERVICE_FAILURE_EXIT_CODE] = ExecStart process exited with error code, +[SERVICE_FAILURE_SIGNAL] = fatal signal was delivered to ExecStart process, +[SERVICE_FAILURE_CORE_DUMP] = fatal signal was delivered to ExecStart process. Core dumped, fatal signal was delivered to ExecStart process and it dumped core? +[SERVICE_FAILURE_WATCHDOG] = service failed to sent watchdog ping, ...failed to send... +[SERVICE_FAILURE_START_LIMIT] = start of the service was attempted too often too quickly, +}; -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH 2/2] macro: allow assert_se() assertions to also be optimized out when NDEBUG is set
On Mon, Mar 30, 2015 at 07:32:35PM -0700, Shawn Landden wrote: On Mon, Mar 30, 2015 at 5:04 PM, Djalal Harouni tix...@opendz.org wrote: On Fri, Mar 27, 2015 at 09:51:26AM -0700, Shawn Landden wrote: On Fri, Mar 27, 2015 at 8:16 AM, Tom Gundersen t...@jklm.no wrote: [...] * Current expression may modify/interact with a global state which may cause a fatal error, and if the caller wants to know if that failed, then abort(), your patch just introduced a regression, without the explicit abort(), all callers now have to call abort(). Yeah, this is the point I think. I still think the patch makes sense though, it really don't see why there should be a distinction between assert_se() and assert() when it comes to logging and aborting. If assert_se() fails it indicates we may have messed up the global state, and if assert() fails it indicates that the global state may be messed up (by someone else). Either way the global state is potentially messed up and not logging/aborting seems as (un)safe in both situations. Another way of seeing it, intuitively I don't see why we should distinguish between assert_se(foo() = 0); and r = foo(); assert(r = 0); Yes. The use case is that embedded people don't want all the strings that the logging adds to their binaries when these are ASSERTS, they are only expected to catch programming errors. It is not to make it faster, I think that is negligible. Hmm embedded cases are real, I had to deal with some in the past. But not sure here since I didn't see any numbers before/after stripping, but perhaps you can start by updating the callers and their semantics if you think that the code there is robust and it's worth it... ? How about we leave the abort() and just drop the logging with NDEBUG set? Not sure if this is the best solution, you will just hide the path of code where we failed! it will not be easy to debug later. As said, the callers except to have a log error and abort, if you remove one of them you are altering callers, IMO they do not handle errors and they do not log errors, they rely on assert_se(). As you know in systemd we do log errors... I think the performance impact is negligible, I just want to drop all the strings. Perhaps we could replace the logging with backtrace() in this case? Don't know are you sure that if you add a backrack() routine there it will be better than the current assert_se(), actually without real numbers it is hard to say... IMO we should not touch this macro, the fix should be in the callers, so you have to make sure that callers are robust and can handle errors correctly (log them too perhaps)... -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH 2/2] macro: allow assert_se() assertions to also be optimized out when NDEBUG is set
Hi Shawn, On Tue, Mar 31, 2015 at 04:59:29PM -0700, Shawn Landden wrote: On Tue, Mar 31, 2015 at 2:40 PM, Djalal Harouni tix...@opendz.org wrote: On Tue, Mar 31, 2015 at 11:10:34AM -0700, Shawn Landden wrote: [...] The point is that assert() and assert_se() should only be used for programming errors, this is covered in CODING_STYLE. No, I don't see any mention of assert_se(). With this you are breaking the logic that commes with the following commits: 0c0cdb06c139b52ff1 787784c4c1b24a132 e80afdb3e4a123 288c0991d5a0b240 b680a194bf9ed4c6 bdf7026e9557349c 8d95631ea6c039a60b And the list is long... Please, check the code and the commit logs... You are also hiding the error logs. A simple grep -r assert_se src/ will show you perhaps the ones that can be improved without affecting the semantics of other callers. You do realize that some of these are tests to make sure that systemd works fine ? please read the commit logs, you will clearly see that the authors do *not* want assert_se() to be optimized, and some authors will even use assert_se() to handle errors in the core code. IOW this patch will *weaken* all the test infrastructure and perhaps other areas. How tests are supposed to log errors after this ? I submitted a new version of this patch, with numbers (10kB smaller) to the mailing list. Hmm thanks for the patch, this seems a valid number, but as I have said I still think this is not the best approach. You are affecting all the callers, if you want to do this then update the parts that really need to be updated, the real callers, IMHO before you update a function, first check callers. So here if you think that the caller worth it and it is robust, then yeh remove the assert_se() from there and make it handle the error correctly, hmm but perhaps others may think differently... I will let others comment. Well the patch I last submitted still calls abort() (did you see that patch?). but yeah I don't have strong opinions about this Yes I saw the patch and the numbers, thank you! I just didn't comment to let the thread clean, and see what others may think. Thanks! -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH 2/2] macro: allow assert_se() assertions to also be optimized out when NDEBUG is set
On Fri, Mar 27, 2015 at 09:51:26AM -0700, Shawn Landden wrote: On Fri, Mar 27, 2015 at 8:16 AM, Tom Gundersen t...@jklm.no wrote: [...] * Current expression may modify/interact with a global state which may cause a fatal error, and if the caller wants to know if that failed, then abort(), your patch just introduced a regression, without the explicit abort(), all callers now have to call abort(). Yeah, this is the point I think. I still think the patch makes sense though, it really don't see why there should be a distinction between assert_se() and assert() when it comes to logging and aborting. If assert_se() fails it indicates we may have messed up the global state, and if assert() fails it indicates that the global state may be messed up (by someone else). Either way the global state is potentially messed up and not logging/aborting seems as (un)safe in both situations. Another way of seeing it, intuitively I don't see why we should distinguish between assert_se(foo() = 0); and r = foo(); assert(r = 0); Yes. The use case is that embedded people don't want all the strings that the logging adds to their binaries when these are ASSERTS, they are only expected to catch programming errors. It is not to make it faster, I think that is negligible. Hmm embedded cases are real, I had to deal with some in the past. But not sure here since I didn't see any numbers before/after stripping, but perhaps you can start by updating the callers and their semantics if you think that the code there is robust and it's worth it... ? Thanks! -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH 2/2] macro: allow assert_se() assertions to also be optimized out when NDEBUG is set
Hi Shawn, On Thu, Mar 26, 2015 at 11:21:54PM -0700, Shawn Landden wrote: On Thu, Mar 26, 2015 at 5:47 PM, Djalal Harouni tix...@opendz.org wrote: On Fri, Mar 27, 2015 at 12:30:53AM +0100, Tom Gundersen wrote: On Thu, Mar 26, 2015 at 9:19 AM, Lennart Poettering lenn...@poettering.net wrote: On Tue, 24.03.15 11:11, Shawn Landden (sh...@churchofgit.com) wrote: Will result in slightly smaller binaries, and cuts out the branch, even if the expression is still executed. I am sorry, but the whole point of assert_se() is that it has side effects. That's why it is called that way (the se suffix stands for side effects), and is different from assert(). You are supposed to use it whenever the code enclosed in it shall not be suppressed if NDEBUG is defined. This patch of yours breaks that whole logic! Hm, am I reading the patch wrong? It looks good to me. It is simply dropping the branching and logging in the NDEBUG case, but the expression itself is still evaluated. Yep but it seems that abort() will never be called, log_assert_failed() = abort() And the logging mechanism is also one of those side effects. IMO unless there are real valid reasons for any change to these macors... changing anything here will just bring more bugs that we may never notice. Those are the side effects of assert(), the side effects of the expression are still evaluated. Yes but there are also the following points: * assert() is simple, assert_se() is more complex and it may modify other global sate. I don't think that we can break down side effects of assert_se() into: 1) side effects of assert() 2) side effects of other expressions. And then remove that part 1) So given the fact that asser_se() do not depend on NDEBUG, did you consider the following: * You don't have control over what the expression do, it may just call abort() or exit() in case of fatal errors, so even if you remove the explicit abort() call, expressions may just call it. * Some expressions were perhaps depending on the logging mechanism which is part of the side effect. Callers to assert_se() are perhaps using it for a reason, are you sure that you are not introducing a regression here ?! it will be difficult to debug the error if we don't have logs. * Current expression may modify/interact with a global state which may cause a fatal error, and if the caller wants to know if that failed, then abort(), your patch just introduced a regression, without the explicit abort(), all callers now have to call abort(). * Did you grep for assert_se() uses in the source ? I really do not think this is an improvement, we can't predict the semantics of expression here, perhaps we can with simple assert() but not with assert_se(). -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH 2/2] macro: allow assert_se() assertions to also be optimized out when NDEBUG is set
On Fri, Mar 27, 2015 at 12:30:53AM +0100, Tom Gundersen wrote: On Thu, Mar 26, 2015 at 9:19 AM, Lennart Poettering lenn...@poettering.net wrote: On Tue, 24.03.15 11:11, Shawn Landden (sh...@churchofgit.com) wrote: Will result in slightly smaller binaries, and cuts out the branch, even if the expression is still executed. I am sorry, but the whole point of assert_se() is that it has side effects. That's why it is called that way (the se suffix stands for side effects), and is different from assert(). You are supposed to use it whenever the code enclosed in it shall not be suppressed if NDEBUG is defined. This patch of yours breaks that whole logic! Hm, am I reading the patch wrong? It looks good to me. It is simply dropping the branching and logging in the NDEBUG case, but the expression itself is still evaluated. Yep but it seems that abort() will never be called, log_assert_failed() = abort() And the logging mechanism is also one of those side effects. IMO unless there are real valid reasons for any change to these macors... changing anything here will just bring more bugs that we may never notice. So in the NDEBUG case assert_se(foo()) becomes equivalent to foo(), which I guess makes sense (though I doubt it makes much of a difference). -t --- src/shared/macro.h | 12 ++-- 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/src/shared/macro.h b/src/shared/macro.h index 7f89951..02219ea 100644 --- a/src/shared/macro.h +++ b/src/shared/macro.h @@ -212,17 +212,17 @@ static inline unsigned long ALIGN_POWER2(unsigned long u) { (__x / __y + !!(__x % __y));\ }) -#define assert_se(expr) \ -do {\ -if (_unlikely_(!(expr)))\ -log_assert_failed(#expr, __FILE__, __LINE__, __PRETTY_FUNCTION__); \ -} while (false) \ - /* We override the glibc assert() here. */ #undef assert #ifdef NDEBUG +#define assert_se(expr) do {expr} while(false) #define assert(expr) do {} while(false) #else +#define assert_se(expr) \ +do {\ +if (_unlikely_(!(expr)))\ +log_assert_failed(#expr, __FILE__, __LINE__, __PRETTY_FUNCTION__); \ +} while (false) #define assert(expr) assert_se(expr) #endif -- 2.2.1.209.g41e5f3a ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel Lennart -- Lennart Poettering, Red Hat ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH] path-lookup: use secure_getenv()
Hi, On Mon, Mar 16, 2015 at 06:31:29PM +0100, David Herrmann wrote: Hi On Sun, Mar 15, 2015 at 12:36 PM, Ronny Chevalier chevalier.ro...@gmail.com wrote: 2015-03-15 3:27 GMT+01:00 Shawn Landden sh...@churchofgit.com: All these except user_data_home_dir() are certainly vectors for arbitrary code execution. These should use secure_getenv() --- Hi, I don't see why secure_getenv() is appropriate here? These functions are never used in the libraries systemd provides, they are mostly used by systemctl and the dbus manager. Can you provide more details? You're right, but on the other hand secure_getenv() is usually sufficient (we don't use setuid() nor fs-caps). So secure_getenv() wouldn't hurt. But I don't really care.. Yeh, perhaps just push them and forget about them even if they are called later from libraries or copy+past into a library call... ?! Thanks David ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH][RFC] bus-proxy: add support for GetConnectionCredentials method
On Thu, Feb 19, 2015 at 01:05:22PM +, Simon McVittie wrote: On 19/02/15 12:43, Lukasz Skalski wrote: GetConnectionCredentials method was added to dbus-1 specification more than one year ago. This method should return [...] as many credentials as possible for the process connected to the server, but at this moment only UnixUserID and ProcessID are defined by the specification. We should add support for next credentials after extending dbus-1 spec. As of dbus master (soon to be 1.9.12), LinuxSecurityLabel is also defined. It's the bytestring from SO_PEERSEC, whatever that means for the current LSM(s), with a trailing '\0' appended if there wasn't already one there. AppArmor, SELinux and Smack developers have all told me this is valid for their LSMs. Spec patches welcome for others, but I don't think there's a great deal of point in adding GetConnectionCredentials support for additional credentials that can be transferred over kdbus but not (securely) over AF_UNIX: anything with enough kdbus knowledge to know about those might as well be using kdbus directly. +r = get_creds_by_message(a, m, SD_BUS_CREDS_PID|SD_BUS_CREDS_EUID, creds, error); Can this ever return unknown (-1?) for creds-pid or creds-euid? So, I'm missing lot of bits, but pid can be 0, euid can perhaps be (uid_t)-1 which is also a valid value... that maps to the INVALID_UID -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH] test: bump KDBUS_CONN_MAX_MSGS_PER_USER value
Hi Lukasz, On Tue, Feb 17, 2015 at 11:37:53AM +0100, Lukasz Skalski wrote: diff --git a/test/test-message.c b/test/test-message.c index 03ac71e..0cae942 100644 --- a/test/test-message.c +++ b/test/test-message.c @@ -28,7 +28,7 @@ * maximum number of queued messages from the same indvidual user after the * the un-accounted value has been hit */ -#define KDBUS_CONN_MAX_MSGS_PER_USER 16 +#define KDBUS_CONN_MAX_MSGS_PER_USER 128 Actually, the limit bump is just a temporary solution, to inspect or fix other bugs related to Gnome. David is working on this, I will let him decide if he wants to apply this patch or let it as it... Thank you Lukasz! #define MAX_USER_TOTAL_MSGS (KDBUS_CONN_MAX_MSGS_UNACCOUNTED + \ KDBUS_CONN_MAX_MSGS_PER_USER) diff --git a/test/test-policy-ns.c b/test/test-policy-ns.c index 3437012..3c8b78e 100644 --- a/test/test-policy-ns.c +++ b/test/test-policy-ns.c @@ -38,7 +38,7 @@ #define MAX_CONN 64 #define POLICY_NAME foo.test.policy-test -#define KDBUS_CONN_MAX_MSGS_PER_USER16 +#define KDBUS_CONN_MAX_MSGS_PER_USER128 /** * Note: this test can be used to inspect policy_db-talk_access_hash -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] systemd-nspawn create container under unprivileged user
On Wed, Feb 11, 2015 at 05:06:56PM +0100, Lennart Poettering wrote: On Wed, 11.02.15 13:53, Djalal Harouni (tix...@opendz.org) wrote: On Tue, Feb 10, 2015 at 12:52:34PM +0100, Lennart Poettering wrote: On Thu, 05.02.15 02:03, Vasiliy Tolstov (v.tols...@selfip.ru) wrote: Hello! Does it possible to create container as regular user? Oh what capabilities i need to add to create container not using root? Invoking containers without privileges is not supported by nspawn, and this is unlikely to change, as I fail to see any strong usecase for this... If somebody can englighten me about the usecase for allowing containers to be run by unprivileged users, I'd be willing to change my mind though... A quick argument against it, IOW just wait and see! As unprivileged we don't have CAP_SYS_MODULE set, but inside unprivileged containers we are root, and a call to cap_get_flag() on CAP_SYS_MODULE will return CAP_SET! but hey in reality this is not true, we don't have CAP_SYS_MODULE... this will confuse programs running inside containers, we'll have to add more code paths for this special case... and not only CAP_SYS_MODULE, perhaps there are other cases... Well, but we could drop CAP_SYS_MODULE both before and after setting up the userns, so that the cap is missing fro the PID both inside and outside of it... Indeed, yes but still there are other obscure cases, like CAP_SYS_ADMIN, even if you have it, you won't be able to mount file systems like btrfs and others, only a subset of virtual filesystems support unprivileged user mounting... yeh we could drop it too, and it seems that systemd was adapted recently to work in this situation, but what about other code ? or if you want todo some sort of system replication inside container... I guess we'll endup trying to know if this is the real capability or the diminished version... or if we are inside a userns... Lennart -- Lennart Poettering, Red Hat -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] systemd-nspawn create container under unprivileged user
On Tue, Feb 10, 2015 at 12:52:34PM +0100, Lennart Poettering wrote: On Thu, 05.02.15 02:03, Vasiliy Tolstov (v.tols...@selfip.ru) wrote: Hello! Does it possible to create container as regular user? Oh what capabilities i need to add to create container not using root? Invoking containers without privileges is not supported by nspawn, and this is unlikely to change, as I fail to see any strong usecase for this... If somebody can englighten me about the usecase for allowing containers to be run by unprivileged users, I'd be willing to change my mind though... A quick argument against it, IOW just wait and see! As unprivileged we don't have CAP_SYS_MODULE set, but inside unprivileged containers we are root, and a call to cap_get_flag() on CAP_SYS_MODULE will return CAP_SET! but hey in reality this is not true, we don't have CAP_SYS_MODULE... this will confuse programs running inside containers, we'll have to add more code paths for this special case... and not only CAP_SYS_MODULE, perhaps there are other cases... -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH] Assorted typo fixes
On Mon, Jan 19, 2015 at 02:42:31PM +0200, Mantas Mikulėnas wrote: On Mon, Jan 19, 2015 at 2:26 PM, Djalal Harouni tix...@opendz.org wrote: Hi, On Mon, Jan 19, 2015 at 10:46:23AM +0100, Rémi Audebert wrote: Signed-off-by: Rémi Audebert ha...@lse.epita.fr Your email is in base64 format, the following was set: Content-Transfer-Encoding: base64 Classic tools do not understand it and we have to decode in order to inspect... please check this link [1] mutt and `git am` both understand it, what else does one need? :) yep :-) perhaps classic patch command ? and kernel checkpatch.pl ? on lkml base64 format is rejected... -- Mantas Mikulėnas graw...@gmail.com -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH] Assorted typo fixes
-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH 1/2] Add detect_userns to detect uid/gid shifts
On Thu, Jan 08, 2015 at 09:25:07PM +0100, Tom Gundersen wrote: On Thu, Jan 8, 2015 at 8:59 PM, Stéphane Graber stgra...@ubuntu.com wrote: On Thu, Jan 08, 2015 at 08:43:12PM +0100, Tom Gundersen wrote: On Thu, Jan 8, 2015 at 8:27 PM, Stéphane Graber stgra...@ubuntu.com wrote: +/* If both uid_map and gid_map don't exist or if they both match + * the full uid/gid range, then we're not inside a user namespace */ Hm, this is not necessarily true is it? In my naive test, it works just fine to set up a usernamespace with the identity mapping. Moreover, this appears to be functionally different from the initial user namespaces (somewhat counter-intuitively I might add), so treating the identity mapping as 'no user namespace' is probably not the right thing to do. Is there no better way to test for this? I'm not sure I understand what you mean there. When you unshare CLONE_NEWUSER, your uid_map and gid_map are empty. You can then write a map using your own uid and gid or use a privileged helper to write a larger map. The only case where you could be in a separate user namespace and still have the same map as the host is if you had a privileged helper write the whole host uid and gid map to your process' uid_map and gid_map, in which case, your process' uid 0 is mapped to the host uid 0 Yeah, I'm looking at privileged user namespaces, so you are setting them up as root (which I understand is different from your usecase where you want to do this as a regular user). To be clear, I used Michael Kerrisk's test tool [0] like this (as root): # ./userns_child_exec -U -M '0 0 4294967295' -G '0 0 4294967295' bash and while you are technically in a different namespace than the host, your accesses are identical and so it's not unreasonable to have detect_userns report this as the host namespace. It appears to me that this (your accesses are identical) is not true though. At least I found mknod(8) and mount(8) to work on the host, but not in the identity-mapped user namespace [1]. Had it been true, I would agree with you that we should just treat this the same as not being in a user namespace, but as this is not the case (or I'm doing something wrong in my test) we may want to be careful here. Obviously, the identity-mapped user namespace does not sound at all useful to me, so maybe we can ignore it (if there really is no other way), but we should document that at least. Cheers, Hmm Tom a quick glance suggest that at least you need a new mount namespace, IOW you have to own the mount namespace before mounting stuff there... So from that tool if you add -m then you should be able at least to mount tmpfs in test/ For mknode, I remember that there was something special or a security bug that perhaps forces nodev... (not sure) And yes we have to be careful, that's definitely not the same thing, you may have some capabilities set but still some operations are and should be restricted like _arbitrary_ modules loading... another thing, only a limited number of filesystems can be mounted from a user namespace, major ones do not support this... Tom [0]: lwn.net/Articles/539940/ [1]: [root@tomegun-x240 userns]# ./userns_child_exec -U -M '0 0 4294967295' -G '0 0 4294967295' bash [root@tomegun-x240 userns]# mknod null b 1 3 mknod: ‘null’: Operation not permitted [root@tomegun-x240 userns]# mount -t tmpfs none test/ mount: permission denied [root@tomegun-x240 userns]# exit exit [root@tomegun-x240 userns]# mknod null b 1 3 [root@tomegun-x240 userns]# mount -t tmpfs none test/ ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH 1/2] Add detect_userns to detect uid/gid shifts
On Thu, Jan 08, 2015 at 02:59:46PM -0500, Stéphane Graber wrote: On Thu, Jan 08, 2015 at 08:43:12PM +0100, Tom Gundersen wrote: On Thu, Jan 8, 2015 at 8:27 PM, Stéphane Graber stgra...@ubuntu.com wrote: This adds a new detect_userns function in virt.c which will check whether systemd is running in the host user namespace (single map of all available uids and gids) or is using a uid/gid map. The check makes sure that uid_map and gid_map are both exactly equal to the default host map (assuming 32bit uid_t) for a process running in the host namespace. --- src/shared/virt.c | 22 ++ src/shared/virt.h | 1 + 2 files changed, 23 insertions(+) diff --git a/src/shared/virt.c b/src/shared/virt.c index f10baab..3d94e1f 100644 --- a/src/shared/virt.c +++ b/src/shared/virt.c @@ -363,3 +363,25 @@ int detect_virtualization(const char **id) { return VIRTUALIZATION_NONE; } + +/* Detect whether we run in a uid/gid shifted namespace */ +int detect_userns(void) { +int r; +static const char host_id_map[] = 0 0 4294967295; +char *uid_map = NULL; +char *gid_map = NULL; + +/* Check if we are uid-shifted */ +r = read_one_line_file(/proc/self/uid_map, uid_map); +if (r == 0 !streq(uid_map, host_id_map)) +return 1; + +/* Check if we are gid-shifted */ +r = read_one_line_file(/proc/self/gid_map, gid_map); +if (r == 0 !streq(gid_map, host_id_map)) Minor nit: would be nicer to parse these strings into numbers rather than rely on the whitespace never changing, no? Also, would be a bit nicer not to use the magic number 4294967295 but some #defined constant instead. Sure, I'll do that. +return 1; + +/* If both uid_map and gid_map don't exist or if they both match + * the full uid/gid range, then we're not inside a user namespace */ ^^ Hm, this is not necessarily true is it? In my naive test, it works just fine to set up a usernamespace with the identity mapping. Moreover, this appears to be functionally different from the initial user namespaces (somewhat counter-intuitively I might add), so treating the identity mapping as 'no user namespace' is probably not the right thing to do. Is there no better way to test for this? I'm not sure I understand what you mean there. When you unshare CLONE_NEWUSER, your uid_map and gid_map are empty. You can then write a map using your own uid and gid or use a privileged helper to write a larger map. The only case where you could be in a separate user namespace and still have the same map as the host is if you had a privileged helper write the whole host uid and gid map to your process' uid_map and gid_map, in which case, your process' uid 0 is mapped to the host uid 0 and while you are technically in a different namespace than the host, your accesses are identical and so it's not unreasonable to have detect_userns report this as the host namespace. Yes, then that code comment is confusing. Thanks! Cheers, Tom -- Stéphane Graber Ubuntu developer http://www.ubuntu.com ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH v3] Do not clear parent mount flags when setting up namespaces
; +if (mount(NULL, x, NULL, orig_flags|MS_BIND|MS_REMOUNT|(ro ? MS_RDONLY : 0), NULL) 0) { /* Deal with mount points that are * obstructed by a later mount */ diff --git a/src/shared/util.h b/src/shared/util.h index a131a3c..d5aa988 100644 --- a/src/shared/util.h +++ b/src/shared/util.h @@ -1021,6 +1021,8 @@ union file_handle_union { int update_reboot_param_file(const char *param); +int get_mount_flags(const char *path, unsigned long *flags); + int umount_recursive(const char *target, int flags); int bind_remount_recursive(const char *prefix, bool ro); ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH] Do not clear parent mount flags when setting up namespaces
On Thu, Jan 01, 2015 at 06:41:17PM +, Topi Miettinen wrote: On 01/01/15 18:08, Dave Reisner wrote: On Thu, Jan 01, 2015 at 04:49:04PM +0200, Topi Miettinen wrote: Copy parent directory mount flags when setting up a namespace and don't accidentally clear mount flags later. Signed-off-by: Topi Miettinen toiwo...@gmail.com --- src/core/namespace.c | 4 ++-- src/shared/util.c| 20 ++-- src/shared/util.h| 2 ++ 3 files changed, 22 insertions(+), 4 deletions(-) diff --git a/src/core/namespace.c b/src/core/namespace.c index 5b408e0..400bc50 100644 --- a/src/core/namespace.c +++ b/src/core/namespace.c @@ -159,7 +159,7 @@ static int mount_dev(BindMount *m) { dev = strappenda(temporary_mount, /dev); (void)mkdir(dev, 0755); -if (mount(tmpfs, dev, tmpfs, MS_NOSUID|MS_STRICTATIME, mode=755) 0) { +if (mount(tmpfs, dev, tmpfs, get_mount_flags(/dev)|MS_NOSUID|MS_STRICTATIME, mode=755) 0) { r = -errno; goto fail; } @@ -282,7 +282,7 @@ static int mount_kdbus(BindMount *m) { root = strappenda(temporary_mount, /kdbus); (void)mkdir(root, 0755); -if (mount(tmpfs, root, tmpfs, MS_NOSUID|MS_STRICTATIME, mode=777) 0) { +if (mount(tmpfs, root, tmpfs, get_mount_flags(/kdbus)|MS_NOSUID|MS_STRICTATIME, mode=777) 0) { Shouldn't this be /sys/fs/bus/kdbus? We certainly don't mount kdbusfs in the root... Probably. I don't have kdbus here (sorry) and I don't quite get what the function is supposed to do. Yes kdbusfs by default should be mounted in /sys/fs/kdbus/ That mount_kdbus() function is dealing with kdbus custom endpoints /sys/fs/kdbus/bus/endpoint [1], the whole thing is mounted on top of tmpfs in order to hide the rest of the kdbufs tree The custom endpoints are created when creating services, this will be part of the namespaced apps setup where apps will only see a subset of names on a bus... the custom endpoint will be mounted on top of the default endpoint bus [2] However, I'm not up to date with this part of the code... in any case this mount_kdbus() function should be renamed to something like mount_kdbus_custom_ep()... [1] http://code.google.com/p/d-bus/source/browse/kdbus.txt#115 [2] http://lists.freedesktop.org/archives/systemd-devel/2014-October/023515.html -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH v2] Do not clear parent mount flags when setting up namespaces
Hi, On Thu, Jan 01, 2015 at 10:36:39PM +0200, Topi Miettinen wrote: Copy parent directory mount flags when setting up a namespace and don't accidentally clear mount flags later. As noted by Colin in the other email, there should be a git log message for why the change. Yes thank you, I see that in one of the replies of v1 of the patch you say why, so just perhaps use it in the commit log and code comment ? --- src/core/namespace.c | 4 ++-- src/shared/util.c| 19 +-- src/shared/util.h| 2 ++ 3 files changed, 21 insertions(+), 4 deletions(-) diff --git a/src/core/namespace.c b/src/core/namespace.c index 4b8dbdd..6859b6a 100644 --- a/src/core/namespace.c +++ b/src/core/namespace.c @@ -159,7 +159,7 @@ static int mount_dev(BindMount *m) { dev = strappenda(temporary_mount, /dev); (void)mkdir(dev, 0755); -if (mount(tmpfs, dev, tmpfs, MS_NOSUID|MS_STRICTATIME, mode=755) 0) { +if (mount(tmpfs, dev, tmpfs, get_mount_flags(/dev)|MS_NOSUID|MS_STRICTATIME, mode=755) 0) { There is no need for this function to be a parameter r = -errno; goto fail; } @@ -282,7 +282,7 @@ static int mount_kdbus(BindMount *m) { root = strappenda(temporary_mount, /kdbus); (void)mkdir(root, 0755); -if (mount(tmpfs, root, tmpfs, MS_NOSUID|MS_STRICTATIME, mode=777) 0) { +if (mount(tmpfs, root, tmpfs, get_mount_flags(/sys/fs/kdbus)|MS_NOSUID|MS_STRICTATIME, mode=777) 0) { r = -errno; goto fail; } diff --git a/src/shared/util.c b/src/shared/util.c index dfaf7f7..8ff5073 100644 --- a/src/shared/util.c +++ b/src/shared/util.c @@ -61,6 +61,7 @@ #include sys/personality.h #include sys/xattr.h #include libgen.h +#include sys/statvfs.h #undef basename #ifdef HAVE_SYS_AUXV_H @@ -6858,6 +6859,15 @@ int umount_recursive(const char *prefix, int flags) { return r ? r : n; } +unsigned long get_mount_flags(const char *path) { +struct statvfs buf; + +if (statvfs(path, buf) 0) +return 0; IMO here it should return an errno since this is a helper. In that case perhaps just open code the statvfs() or improve the helper ? Thanks! -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] Compatibility between D-Bus and kdbus
On Wed, Nov 26, 2014 at 01:25:18AM +0100, Lennart Poettering wrote: On Tue, 25.11.14 12:01, Thiago Macieira (thi...@kde.org) wrote: [...] === KDBUS_ATTACH_NAMES === Documentation for metadata says that userspace must cope with some metadata not being delivered. Can we at least require that KDBUS_ATTACH_NAMES be delivered if requested? If the cookie in the match rule isn't provided in the message reception, having the source's names would help solve the problem of the signal delivery. The timestamp should also be mandatory. Yes, they are mandatory. process credentials might be suppressed hwover, for example if they cannot be translated due to namespaces. Thanks. Could you clarify in the docs? Daniel, David? Could you add a note about this? Ok pushed a note about namespace issues, thanks! -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] Unprivileged poweroff
On Wed, Oct 22, 2014 at 12:59:45PM +0100, Simon McVittie wrote: On 22/10/14 12:37, Lennart Poettering wrote: When used with kdbus we actually do check for that client-side capability. THis is not available on dbus1 however, since we cannot determine the capability racefreely and thus safely ... because the kernel doesn't give us that ability on Unix sockets. See https://bugs.freedesktop.org/show_bug.cgi?id=83499 for more on what Unix socket semantics *do* allow socket-based D-Bus to rely on. Yes. From that link the privileged process should not trust the passed fds (including:stdin...) so we can blame it... but in the other hand, even if it closes the fds, there will always be a window for races... but hey it should also be careful what to do with these fds. Anyway running a suid in an untrusted environment is just asking for trouble. A solution requires new kernel features: either something like kdbus, or a way for a Unix socket client to prove to the server that it had a particular capability either at the time the socket opened (a new SCM_CAPABILITIES analogous to SCM_CREDS?) or at the time that a particular message was queued (subtle, probably best avoided). Hmm, when the message is queued you should just trust those capabilities (metadata), there is no another way! so if it is really unprivileged you are safe. For two reasons: 1) Even if privileged leaks the connection fd to unprivileged, you will get the capabilities of the unprivileged one. The real one that is sending messages. 2) Privileged should not trust the passed fd nor queue a message using this passed fd. No sane program should do this, otherwise it will just break every thing. And if we assume from that fdo bug id=83499: * connect to the D-Bus socket * race with dbus-daemon, and win: - queue up the entire handshake, plus a malicious message, in the outgoing socket buffer - make the D-Bus connection not close-on-exec - exec() a setuid process, or fd-pass the D-Bus connection to a more privileged process * dbus-daemon checks the sender's credentials (let's say uid 1000), and thinks it has the credentials of the setuid process (e.g. uid 0) * now dbus-daemon reads the malicious message, determines that uid 0 may send it, and forwards it to the recipient, with no indication that uid 1000 was actually responsible So: If the unprivileged queues up a message then exec() a suid and passes the fd to the privileged one, we sill get the capabilities of the process that did send the message, the unprivileged one. This way if we ever someday get SCM_CAPABILITIES, it should reflect the caps at the time the message was queued, *not* when the connection was opened. Here, the capabilities are metadata attached to the message, that can't be faked... However, when trying to talk to a service/well-known name from a pure D-Bus perspective, you talk to the interface that is offering the poweroff methode, then in this case the TALK check is performed against creds (uid/gid) at the *time* the connection was *created*. So if uid/gid do not match, good, unprivileged do not allow contacting the service at all. However if uid/gid matches or if the connection was created by a privileged, we allow the TALK to that service since it was privileged or the uid/gid did match when the connection was created. The privileged is *responsible* of the connection fd. Doing this allows privileged to create connections (store the cred), drop privs, then pass the connection fd to a normal process. This way some unprivileged code is able to talk to some restricted names... and the receiving service is able to inspect the *correct* capabilities of this unprivileged (capabilities gathered when the message was queued). Hope I'm not missing something, otherwise let me know, thank you Simon! S ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 1/2] test: some tests to enforce routing messages by connections ID
Add kdbus_fork_test_by_id() to test connections by id under different uids. Currently we succeed at this test. Update: 1) kdbus_msg_recv() to get the offset of the slice, so we can release it later. 2) kdbus_msg_recv_poll() to pass the offset argument to kdbus_msg_recv(). We do this to follow best practice and to be able to free the returned kdbus_msg and the slice pointed by that offset. Signed-off-by: Djalal Harouni tix...@opendz.org --- Hi Daniel, before applying please make sure that we want this. It follows what I've discussed in the other mail, otherwise just test it, it will give a better view of the routing policy. test/kdbus-util.c | 15 - test/kdbus-util.h | 7 ++- test/test-activator.c | 2 +- test/test-chat.c| 4 +- test/test-connection.c | 4 +- test/test-daemon.c | 2 +- test/test-fd.c | 2 +- test/test-message.c | 2 +- test/test-metadata-ns.c | 2 +- test/test-monitor.c | 7 ++- test/test-policy-ns.c | 160 test/test-sync.c| 2 +- 12 files changed, 166 insertions(+), 43 deletions(-) diff --git a/test/kdbus-util.c b/test/kdbus-util.c index c72b8fe..fe4565c 100644 --- a/test/kdbus-util.c +++ b/test/kdbus-util.c @@ -630,6 +630,9 @@ void kdbus_msg_free(struct kdbus_msg *msg) const struct kdbus_item *item; int nfds, i; + if (!msg) + return; + KDBUS_ITEM_FOREACH(item, msg, items) { switch (item-type) { /* close all memfds */ @@ -648,7 +651,9 @@ void kdbus_msg_free(struct kdbus_msg *msg) } } -int kdbus_msg_recv(struct kdbus_conn *conn, struct kdbus_msg **msg_out) +int kdbus_msg_recv(struct kdbus_conn *conn, + struct kdbus_msg **msg_out, + uint64_t *offset) { struct kdbus_cmd_recv recv = {}; struct kdbus_msg *msg; @@ -665,6 +670,9 @@ int kdbus_msg_recv(struct kdbus_conn *conn, struct kdbus_msg **msg_out) if (msg_out) { *msg_out = msg; + + if (offset) + *offset = recv.offset; } else { kdbus_msg_free(msg); @@ -677,13 +685,14 @@ int kdbus_msg_recv(struct kdbus_conn *conn, struct kdbus_msg **msg_out) } int kdbus_msg_recv_poll(struct kdbus_conn *conn, + unsigned int timeout_ms, struct kdbus_msg **msg_out, - unsigned int timeout_ms) + uint64_t *offset) { int ret; for (; timeout_ms; timeout_ms--) { - ret = kdbus_msg_recv(conn, NULL); + ret = kdbus_msg_recv(conn, msg_out, offset); if (ret == -EAGAIN) { usleep(1000); continue; diff --git a/test/kdbus-util.h b/test/kdbus-util.h index c2370e7..8e3d0a8 100644 --- a/test/kdbus-util.h +++ b/test/kdbus-util.h @@ -56,9 +56,10 @@ int kdbus_name_release(struct kdbus_conn *conn, const char *name); int kdbus_name_acquire(struct kdbus_conn *conn, const char *name, uint64_t flags); void kdbus_msg_free(struct kdbus_msg *msg); -int kdbus_msg_recv(struct kdbus_conn *conn, struct kdbus_msg **msg); -int kdbus_msg_recv_poll(struct kdbus_conn *conn, struct kdbus_msg **msg_out, - unsigned int timeout_ms); +int kdbus_msg_recv(struct kdbus_conn *conn, + struct kdbus_msg **msg, uint64_t *offset); +int kdbus_msg_recv_poll(struct kdbus_conn *conn, unsigned int timeout_ms, + struct kdbus_msg **msg_out, uint64_t *offset); int kdbus_free(const struct kdbus_conn *conn, uint64_t offset); void kdbus_msg_dump(const struct kdbus_conn *conn, const struct kdbus_msg *msg); diff --git a/test/test-activator.c b/test/test-activator.c index eb57b52..7bdcbf3 100644 --- a/test/test-activator.c +++ b/test/test-activator.c @@ -79,7 +79,7 @@ int kdbus_test_activator(struct kdbus_test_env *env) } if (fds[1].revents POLLIN) { - kdbus_msg_recv(env-conn, NULL); + kdbus_msg_recv(env-conn, NULL, NULL); break; } } diff --git a/test/test-chat.c b/test/test-chat.c index 11c140d..b2bcc31 100644 --- a/test/test-chat.c +++ b/test/test-chat.c @@ -86,7 +86,7 @@ int kdbus_test_chat(struct kdbus_test_env *env) if (count 2) kdbus_name_release(conn_a, foo.bar.baz); - ret = kdbus_msg_recv(conn_a, NULL); + ret = kdbus_msg_recv(conn_a, NULL, NULL); ASSERT_RETURN(ret == 0); ret = kdbus_msg_send(conn_a, NULL, 0xc000 | cookie++, @@ -95,7 +95,7 @@ int kdbus_test_chat(struct kdbus_test_env *env) } if (fds
[systemd-devel] [PATCH 2/2] test: Use 'CapBnd' string for capability bounding set
Signed-off-by: Djalal Harouni tix...@opendz.org --- test/kdbus-util.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/test/kdbus-util.c b/test/kdbus-util.c index fe4565c..b1c5864 100644 --- a/test/kdbus-util.c +++ b/test/kdbus-util.c @@ -569,7 +569,7 @@ void kdbus_msg_dump(const struct kdbus_conn *conn, const struct kdbus_msg *msg) for (i = 0; i n; i++) kdbus_printf(%08x, cap[(2 * n) + (n - i - 1)]); - kdbus_printf( CapInh=); + kdbus_printf( CapBnd=); for (i = 0; i n; i++) kdbus_printf(%08x, cap[(3 * n) + (n - i - 1)]); kdbus_printf(\n); -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH 2/2] policy: make policy checks work across user namespaces
On Tue, Sep 09, 2014 at 10:40:57AM +0200, Daniel Mack wrote: On 09/08/2014 03:50 PM, Djalal Harouni wrote: Yes there are compile time checks, and it is perhaps easier/consistent to read this way! but yes a union is also good. OK I'll update it. Nevermind - I amended the your patch accordingly and pushed it. Oh sorry, and thank you for fixing that :-) -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 1/2] test: update policy tests to handle user namespaces
Upstream kernels allow unprivileged users to create user namespaces and change their uid/gid. These patches update kdbus policy logic to handle this case and improve our current checks across user namespaces. So this patch adds: * kdbus_test_waitpid() to get exit code of childs. * kdbus_clone_userns_test() that performs the test inside a new user namespace. * Converts all the other tests to return CHECK_OK, CHECK_SKIP or CHECK_ERR so we are consistent. Currently we fail at kdbus_clone_userns_test() test #8. The next patch will fix this issue. Signed-off-by: Djalal Harouni tix...@opendz.org --- test/test-kdbus-policy.c | 503 +++ 1 file changed, 375 insertions(+), 128 deletions(-) diff --git a/test/test-kdbus-policy.c b/test/test-kdbus-policy.c index 8d7bcb5..a354290 100644 --- a/test/test-kdbus-policy.c +++ b/test/test-kdbus-policy.c @@ -12,6 +12,7 @@ #include fcntl.h #include pthread.h #include poll.h +#include sched.h #include stdlib.h #include stddef.h #include stdint.h @@ -22,6 +23,8 @@ #include sys/wait.h #include sys/prctl.h #include sys/ioctl.h +#include sys/eventfd.h +#include sys/syscall.h #include kdbus-util.h #include kdbus-enum.h @@ -29,6 +32,30 @@ #define MAX_CONN 64 #define POLICY_NAMEfoo.test.policy-test +static int ok_cnt; +static int skip_cnt; +static int fail_cnt; + +static void print_test_status(int test_status) +{ + switch (test_status) { + case CHECK_OK: + printf(OK); + ok_cnt++; + break; + case CHECK_SKIP: + printf(SKIPPED); + skip_cnt++; + break; + case CHECK_ERR: + default: + printf(ERROR); + fail_cnt++; + break; + } + + printf(\n); +} /** * The purpose of these tests: @@ -53,35 +80,59 @@ void kdbus_free_conn(struct conn *conn) } } +static void *kdbus_recv_echo(void *ptr) +{ + int ret; + struct conn *conn = ptr; + + ret = conn_recv(conn); + + return (void *)(long)ret; +} + /* Trigger kdbus_policy_set() */ static int kdbus_set_policy_talk(struct conn *conn, const char *name, uid_t id, unsigned int type) { + int ret; struct kdbus_policy_access access = { .type = type, .id = id, .access = KDBUS_POLICY_TALK, }; - return conn_update_policy(conn, name, access, 1); + ret = conn_update_policy(conn, name, access, 1); + if (ret 0) + return CHECK_ERR; + + return CHECK_OK; } -/* The policy access will be stored in a policy holder connection */ -static int kdbus_register_activator(char *bus, const char *name, - struct conn **c) +/* return CHECK_OK or CHECK_ERR on failure */ +static int kdbus_register_same_activator(char *bus, const char *name, +struct conn **c) { + int ret; struct conn *activator; activator = kdbus_hello_activator(bus, name, NULL, 0); - if (!activator) - return -errno; + if (activator) { + *c = activator; + fprintf(stderr, --- error was able to register name twice '%s'.\n, + name); + return CHECK_ERR; + } - *c = activator; + ret = -errno; + /* -EEXIST means test succeeded */ + if (ret == -EEXIST) + return CHECK_OK; - return 0; + return CHECK_ERR; } +/* return CHECK_OK or CHECK_ERR on failure */ static int kdbus_register_policy_holder(char *bus, const char *name, struct conn **conn) { @@ -99,54 +150,88 @@ static int kdbus_register_policy_holder(char *bus, const char *name, c = kdbus_hello_registrar(bus, name, access, 2, KDBUS_HELLO_POLICY_HOLDER); if (!c) - return -errno; + return CHECK_ERR; *conn = c; - return 0; + return CHECK_OK; } -static void *kdbus_recv_echo(void *ptr) +/* return CHECK_OK or CHECK_ERR on failure */ +static int kdbus_receiver_acquire_name(char *bus, const char *name, + struct conn **conn) { int ret; - int cnt = 3; - struct pollfd fd; - struct conn *conn = ptr; + struct conn *c; - fd.fd = conn-fd; - fd.events = POLLIN | POLLPRI | POLLHUP; - fd.revents = 0; + c = kdbus_hello(bus, 0, NULL, 0); + if (!c) + return CHECK_ERR; - while (cnt) { - cnt--; - ret = poll(fd, 1, 2000); - if (ret == 0) { - ret = -ETIMEDOUT; - break; - } + add_match_empty(c-fd); - if (ret 0
[systemd-devel] [PATCH 2/2] policy: make policy checks work across user namespaces
Use the global kuid_t and kgid_t to store uid/gid values, this way we our policy checks will work across user namespaces. Note that currently we ignore that the user is privileged in its own namespaces and the policy access kuid_t and kgid_t were mapped into that namespace. If this is requested we can add it later a la: fs/inode.c:inode_owner_or_capable() Add kdbus_policy_make_access() to convert the user provided info to the current user namespace. Userspace struct is not changed, only the kernel one. This patch fixes test #8 of test-kdbus-policy Signed-off-by: Djalal Harouni tix...@opendz.org --- policy.c | 72 +--- 1 file changed, 56 insertions(+), 16 deletions(-) diff --git a/policy.c b/policy.c index c4ac361..699806a 100644 --- a/policy.c +++ b/policy.c @@ -3,6 +3,7 @@ * Copyright (C) 2013 Greg Kroah-Hartman gre...@linuxfoundation.org * Copyright (C) 2013 Daniel Mack dan...@zonque.org * Copyright (C) 2013 Linux Foundation + * Copyright (C) 2014 Djalal Harouni * * kdbus is free software; you can redistribute it and/or modify it under * the terms of the GNU Lesser General Public License as published by the @@ -42,8 +43,8 @@ struct kdbus_policy_db_cache_entry { * struct kdbus_policy_db_entry_access - a database entry access item * @type: One of KDBUS_POLICY_ACCESS_* types * @access:Access to grant. One of KDBUS_POLICY_* - * @id:For KDBUS_POLICY_ACCESS_USER, the uid - * For KDBUS_POLICY_ACCESS_GROUP, the gid + * @uid: For KDBUS_POLICY_ACCESS_USER, the global uid + * @gid: For KDBUS_POLICY_ACCESS_GROUP, the global gid * @list: List entry item for the entry's list * * This is the internal version of struct kdbus_policy_db_access. @@ -51,7 +52,8 @@ struct kdbus_policy_db_cache_entry { struct kdbus_policy_db_entry_access { u8 type;/* USER, GROUP, WORLD */ u8 access; /* OWN, TALK, SEE */ - u64 id; /* uid, gid, 0 */ + kuid_t uid; /* global uid */ + kgid_t gid; /* global gid */ struct list_head list; }; @@ -189,30 +191,30 @@ static int kdbus_policy_check_access(const struct kdbus_policy_db_entry *e, { struct kdbus_policy_db_entry_access *a; struct group_info *group_info; - struct user_namespace *ns; - uid_t uid; int i; if (!e) return -EPERM; - ns = cred-user_ns; group_info = cred-group_info; - uid = from_kuid(ns, cred-uid); list_for_each_entry(a, e-access_list, list) { if (a-access = access) { switch (a-type) { case KDBUS_POLICY_ACCESS_USER: - if (a-id == uid) + if (uid_eq(cred-uid, a-uid)) return 0; break; case KDBUS_POLICY_ACCESS_GROUP: + if (gid_eq(cred-gid, a-gid)) + return 0; + for (i = 0; i group_info-ngroups; i++) { kgid_t gid = GROUP_AT(group_info, i); - if (a-id == from_kgid_munged(ns, gid)) + if (gid_eq(gid, a-gid)) return 0; } + break; case KDBUS_POLICY_ACCESS_WORLD: return 0; @@ -444,6 +446,49 @@ struct kdbus_policy_list_entry { struct list_head entry; }; +/* + * Convert user provided policy access to internal kdbus policy + * access + */ +static int +kdbus_policy_make_access(const struct kdbus_policy_access *uaccess, +struct kdbus_policy_db_entry_access **entry) +{ + int ret; + struct kdbus_policy_db_entry_access *a; + + a = kzalloc(sizeof(*a), GFP_KERNEL); + if (!a) + return -ENOMEM; + + ret = -EINVAL; + switch (uaccess-type) { + case KDBUS_POLICY_ACCESS_USER: + a-uid = make_kuid(current_user_ns(), uaccess-id); + if (!uid_valid(a-uid)) + goto err; + + break; + case KDBUS_POLICY_ACCESS_GROUP: + a-gid = make_kgid(current_user_ns(), uaccess-id); + if (!gid_valid(a-gid)) + goto err; + + break; + } + + a-type = uaccess-type; + a-access = uaccess-access; + + *entry = a; + + return 0; + +err: + kfree(a); + return ret; +} + /** * kdbus_policy_set() - set a connection's policy rules * @db:The policy database @@ -571,15 +616,10 @@ int kdbus_policy_set
Re: [systemd-devel] [PATCH 2/2] policy: make policy checks work across user namespaces
On Mon, Sep 08, 2014 at 03:27:42PM +0200, Daniel Mack wrote: On 09/08/2014 03:18 PM, Djalal Harouni wrote: * This is the internal version of struct kdbus_policy_db_access. @@ -51,7 +52,8 @@ struct kdbus_policy_db_cache_entry { struct kdbus_policy_db_entry_access { u8 type;/* USER, GROUP, WORLD */ u8 access; /* OWN, TALK, SEE */ - u64 id; /* uid, gid, 0 */ + kuid_t uid; /* global uid */ + kgid_t gid; /* global gid */ Such an entry can only either be referring to a user or group rule, determined by the 'type' field. Hence, having two members in the struct is overkill. I understand you did this to have the real kernel types in place, but we can put the two things in a union, right? Yes there are compile time checks, and it is perhaps easier/consistent to read this way! but yes a union is also good. OK I'll update it. BTW there is a *small* optimization that we can also add later to the case KDBUS_POLICY_ACCESS_GROUP: when we walk the additional group and match, instead of linear we can do a binary search since groups are sorted, there is kernel/groups.c:groups_search() which should do the job but the symbol is not exported... Ok will update/test for the union case and send it later, thank you! -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH 1/2] test: update policy tests to handle user namespaces
On Mon, Sep 08, 2014 at 03:32:21PM +0200, Daniel Mack wrote: On 09/08/2014 03:18 PM, Djalal Harouni wrote: Upstream kernels allow unprivileged users to create user namespaces and change their uid/gid. These patches update kdbus policy logic to handle this case and improve our current checks across user namespaces. So this patch adds: * kdbus_test_waitpid() to get exit code of childs. * kdbus_clone_userns_test() that performs the test inside a new user namespace. * Converts all the other tests to return CHECK_OK, CHECK_SKIP or CHECK_ERR so we are consistent. Currently we fail at kdbus_clone_userns_test() test #8. The next patch will fix this issue. Signed-off-by: Djalal Harouni tix...@opendz.org Applied, thanks! However, I will soon rework the entire test code again to have all of it integrated in one comprehensive self-test. But I'll resuse the code to set up the namespaces for that. Ok, with this patch I already converted the high functions that perform these tests to return CHECK_OK, CHECK_ERR and CHECK_SKIP, so you probably just have to add a global struct that will contain the necessary arguments for these functions and pass it. As it is done in test-kdbus one. Thank you! Daniel -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH] sd-bus: kdbus: monitor connections use the KDBUS_HELLO_MONITOR flag
--- Currently this bus_kernel_create_monitor() is not used. Patch compile tested. src/libsystemd/sd-bus/bus-kernel.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/libsystemd/sd-bus/bus-kernel.c b/src/libsystemd/sd-bus/bus-kernel.c index 3ca271c..1440e43 100644 --- a/src/libsystemd/sd-bus/bus-kernel.c +++ b/src/libsystemd/sd-bus/bus-kernel.c @@ -1547,7 +1547,7 @@ int bus_kernel_create_monitor(const char *bus) { hello = alloca0(sizeof(struct kdbus_cmd_hello)); hello-size = sizeof(struct kdbus_cmd_hello); -hello-conn_flags = KDBUS_HELLO_ACTIVATOR; +hello-conn_flags = KDBUS_HELLO_MONITOR; hello-pool_size = KDBUS_POOL_SIZE; if (ioctl(fd, KDBUS_CMD_HELLO, hello) 0) { -- 1.9.0 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH 3/3] kdbus: get some creds during meta append for optimization
On Wed, Aug 20, 2014 at 10:49:22PM +0200, Daniel Mack wrote: On 08/20/2014 06:16 PM, Djalal Harouni wrote: On Tue, Aug 19, 2014 at 09:15:35AM +0200, Daniel Mack wrote: Hmm, I'm not convinced this buys us anything really. After all, that struct has a single user only, and factoring out these fields doesn't necessarily lead to more readability. Hmm with your commit 7bde48f293f5, I guess we have to add a generic struct like that, or split into multiple structs, the aim would be to share the struct and operations that perform the: Translate and install fields into the receiver's slice when requesting the connection info or when the receiver reads the message ? Atm, I think the best way is to fully populate the items at creation time, and override them again when installing the message. That way, we at least make sure not to deliver bogus data in for connection info calls at we get rid of the logical dependency between metadata and the queue items. A long-hanging fruit for optimization would be to check the namespaces at delivery time, and only patch over data if they differ. That can still be done later. Yes, sounds good. Well, yes without numbers the optimization for uid/gid perhaps is not needed, but for the auxgroup perhaps ? we can allocate and get the auxgrps one time, then just memdup the data into the queue-auxgrps[] ? But their not necessarily equal in size. After all, in the final item, all values are u64, so we have to walk the list again. Just trying to figure out the best solution... Me too :) BTW Daniel, sorry for this stupid question: Can't the KDBUS_ATTACH_AUXGROUPS be part of the KDBUS_ATTACH_CREDS item? Will be easy to parse it in the same item with uid/gid... Hmm, I don't think so. We designed the metadata thing in a way that it allows users to request individual items in a fine-grained way, and I'm not convinced that every user who wants the creds item is also interested in auxgrp information. The other way around, it might be different story though. Ok, thank you for confirming. Thanks, Daniel -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 1/2] test: in msg_dump() fix kdbus_audit fields order
We have sessionid then loginuid in kdbus_audit. Signed-off-by: Djalal Harouni tix...@opendz.org --- test/kdbus-util.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/test/kdbus-util.c b/test/kdbus-util.c index f79d7ec..956fa6f 100644 --- a/test/kdbus-util.c +++ b/test/kdbus-util.c @@ -469,7 +469,7 @@ void msg_dump(const struct conn *conn, const struct kdbus_msg *msg) } case KDBUS_ITEM_AUDIT: - printf( +%s (%llu bytes) loginuid=%llu sessionid=%llu\n, + printf( +%s (%llu bytes) sessionid=%llu loginuid=%llu\n, enum_MSG(item-type), item-size, (unsigned long long)item-data64[0], (unsigned long long)item-data64[1]); -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 2/2] kdbus: do the audit loginuid translation as late as possible
Do the namespace translation just before pushing into the receiver's slice, so we map to the receiver's user namespace. Patch tested. Signed-off-by: Djalal Harouni tix...@opendz.org --- connection.c | 37 + metadata.c | 3 +++ metadata.h | 4 3 files changed, 44 insertions(+) diff --git a/connection.c b/connection.c index 9250dab..283a3fc 100644 --- a/connection.c +++ b/connection.c @@ -10,6 +10,7 @@ * your option) any later version. */ +#include linux/audit.h #include linux/device.h #include linux/file.h #include linux/fs.h @@ -67,6 +68,9 @@ struct kdbus_conn_reply; * @auxgrp_item_offset:The offset of the auxgrp item inside the slice, if * the user requested this metainfo in its attach flags. * 0 if unused. + * @audit_item_offset: The offset of the audit item inside the slice, if + * the user requested this metainfo in its attach flags. + * 0 if unused. * @uid: The UID to patch into the final message * @gid: The GID to patch into the final message * @pid: The PID to patch into the final message @@ -75,6 +79,8 @@ struct kdbus_conn_reply; * This information is translated into the user's * namespace when the message is installed. * @auxgroup_count:The number of items in @auxgrps. + * @loginuid: The audit login uid to patch into the final + * message */ struct kdbus_conn_queue { struct list_head entry; @@ -95,6 +101,7 @@ struct kdbus_conn_queue { int user; off_t creds_item_offset; off_t auxgrp_item_offset; + off_t audit_item_offset; /* to honor namespaces, we have to store the following here */ kuid_t uid; @@ -104,6 +111,8 @@ struct kdbus_conn_queue { kgid_t *auxgrps; unsigned int auxgrps_count; + + kuid_t loginuid; }; /** @@ -676,6 +685,12 @@ static int kdbus_conn_queue_alloc(struct kdbus_conn *conn, meta-auxgrps_item_off; } + if (meta-attached KDBUS_ATTACH_AUDIT) { + queue-loginuid = audit_get_loginuid(current); + queue-audit_item_offset = meta_off + + meta-audit_item_off; + } + ret = kdbus_pool_slice_copy(queue-slice, meta_off, kmsg-meta-data, kmsg-meta-size); @@ -983,6 +998,22 @@ static int kdbus_conn_creds_install(struct kdbus_conn_queue *queue) return ret; } +static int kdbus_conn_audit_install(struct kdbus_conn_queue *queue) +{ + int ret; + u64 loginuid; + off_t off = queue-audit_item_offset + + offsetof(struct kdbus_item, audit) + + offsetof(struct kdbus_audit, loginuid); + + loginuid = from_kuid_munged(current_user_ns(), queue-loginuid); + + ret = kdbus_pool_slice_copy_user(queue-slice, off, +loginuid, sizeof(loginuid)); + + return ret; +} + static int kdbus_conn_msg_install(struct kdbus_conn_queue *queue) { int *memfds = NULL; @@ -1036,6 +1067,12 @@ static int kdbus_conn_msg_install(struct kdbus_conn_queue *queue) goto exit_rewind_fds; } + if (queue-audit_item_offset) { + ret = kdbus_conn_audit_install(queue); + if (ret 0) + goto exit_rewind_fds; + } + kfree(fds); kfree(memfds); kdbus_pool_slice_flush(queue-slice); diff --git a/metadata.c b/metadata.c index 934aa62..dabc614 100644 --- a/metadata.c +++ b/metadata.c @@ -119,6 +119,9 @@ kdbus_meta_append_item(struct kdbus_meta *meta, u64 type, size_t payload_size) case KDBUS_ITEM_AUXGROUPS: meta-auxgrps_item_off = meta-size; break; + case KDBUS_ITEM_AUDIT: + meta-audit_item_off = meta-size; + break; } meta-size += extra_size; diff --git a/metadata.h b/metadata.h index 1bdb537..ea77783 100644 --- a/metadata.h +++ b/metadata.h @@ -26,6 +26,9 @@ * @auxgrps_item_off The offset of the auxgroups item in the * @data buffer field, if the user requested * this metainfo. 0 if unused. + * @audit_item_off The offset of the audit item in the @data + * buffer field, if the user requested this + * metainfo. 0 if unused. * * Used to collect and store connection metadata in a pre-compiled * buffer containing struct kdbus_item. @@ -39,6 +42,7 @@ struct kdbus_meta { off_t creds_item_off; off_t auxgrps_item_off; + off_t audit_item_off; }; struct
Re: [systemd-devel] [PATCH 3/3] kdbus: get some creds during meta append for optimization
On Tue, Aug 19, 2014 at 09:15:35AM +0200, Daniel Mack wrote: Hi Djalal, Thanks for applying the others. On 08/19/2014 03:43 AM, Djalal Harouni wrote: Some creds can be gathered during kdbus_meta_append() instead of kdbus_conn_queue_alloc() where they will be gathered for all the receivers and saved into each receivers queue before installing on the slices. By moving to kdbus_meta_append() it permits to do it only one time for all the receivers, and we reduce some latency in kdbus_conn_queue_alloc(). Another point is that all the receivers will get the same uid/gid even if the current sender changes its creds while we are still queueing for all receivers. It seems that we can do the same with auxgroups, but this patch does not implement it. Patch tested with the test-kdbus-metadata-ns Signed-off-by: Djalal Harouni tix...@opendz.org --- connection.c | 56 metadata.c | 6 +- metadata.h | 26 ++ 3 files changed, 67 insertions(+), 21 deletions(-) diff --git a/connection.c b/connection.c index 9250dab..49c5045 100644 --- a/connection.c +++ b/connection.c @@ -42,6 +42,24 @@ struct kdbus_conn_reply; /** + * struct kdbus_queue_creds - internal creds data for receiver's + * queue + * @uid: The UID to patch into the final message + * @gid: The GID to patch into the final message + * @pid: The PID to patch into the final message + * @tid: The TID to patch into the final message + * + * Creds that must be translated into the receiver's namespaces + * when the message is installed into its slice. + */ +struct kdbus_queue_creds { + kuid_t uid; + kgid_t gid; + struct pid *pid; + struct pid *tid; +}; Hmm, I'm not convinced this buys us anything really. After all, that struct has a single user only, and factoring out these fields doesn't necessarily lead to more readability. Hmm with your commit 7bde48f293f5, I guess we have to add a generic struct like that, or split into multiple structs, the aim would be to share the struct and operations that perform the: Translate and install fields into the receiver's slice when requesting the connection info or when the receiver reads the message ? [...] /** + * struct kdbus_meta_creds - internal creds data + * @uid:The UID of the sender + * @gid:The GID of the sender + * + * Creds used for optimizations. We need to gather the info during + * kdbus_meta_append() before pushing into the receiver's queue, + * this way we do it only one time for all the receivers. I'm not strictly against it, but I wonder if this optimization is really worth it. Do you think it's that expensive to access these fields in 'current'? Because this change certainly doesn't make the code easier to understand, so I'm hesitant about this change. Yes, I agree that the code would not be easy to read, and I failed to come up with a better solution, perhaps it needs more thoughts! Well, yes without numbers the optimization for uid/gid perhaps is not needed, but for the auxgroup perhaps ? we can allocate and get the auxgrps one time, then just memdup the data into the queue-auxgrps[] ? Just trying to figure out the best solution... BTW Daniel, sorry for this stupid question: Can't the KDBUS_ATTACH_AUXGROUPS be part of the KDBUS_ATTACH_CREDS item? Will be easy to parse it in the same item with uid/gid... -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH 1/2] test: in msg_dump() fix kdbus_audit fields order
On Wed, Aug 20, 2014 at 06:04:09PM +0200, Daniel Mack wrote: On 08/20/2014 05:58 PM, Djalal Harouni wrote: case KDBUS_ITEM_AUDIT: - printf( +%s (%llu bytes) loginuid=%llu sessionid=%llu\n, + printf( +%s (%llu bytes) sessionid=%llu loginuid=%llu\n, enum_MSG(item-type), item-size, (unsigned long long)item-data64[0], (unsigned long long)item-data64[1]); Why not use item-audit.sessionid and item-audit.loginid to make it more readable? Oh yes :-) sorry will do it. Daniel -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH] test: use audit.loginuid and audit.sessionid when dumping the audit item
Signed-off-by: Djalal Harouni tix...@opendz.org --- test/kdbus-util.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/test/kdbus-util.c b/test/kdbus-util.c index f79d7ec..5b3df7d 100644 --- a/test/kdbus-util.c +++ b/test/kdbus-util.c @@ -471,8 +471,8 @@ void msg_dump(const struct conn *conn, const struct kdbus_msg *msg) case KDBUS_ITEM_AUDIT: printf( +%s (%llu bytes) loginuid=%llu sessionid=%llu\n, enum_MSG(item-type), item-size, - (unsigned long long)item-data64[0], - (unsigned long long)item-data64[1]); + (unsigned long long)item-audit.loginuid, + (unsigned long long)item-audit.sessionid); break; case KDBUS_ITEM_CAPS: { -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] kdbus: merge 3.17 branch into master?
On Tue, Aug 19, 2014 at 08:56:14AM -0500, Greg KH wrote: On Tue, Aug 19, 2014 at 03:24:39PM +0200, Daniel Mack wrote: On 08/19/2014 02:47 PM, Greg KH wrote: In file included from /lib/modules/3.17.0-rc1+/build/include/uapi/linux/posix_types.h:4:0, from /lib/modules/3.17.0-rc1+/build/include/uapi/linux/types.h:13, from ../kdbus.h:23, Ok, got it now. No idea why it worked on my system, but a fix for now is to make cpp look at the system include file search pathes first, and then fall back to $(KERNELDIR)/include/uapi. Care to test again? Close, I now get a bunch of these warnings: cc1: note: obsolete option -I- used, please use -iquote instead and then finally a failure of: kdbus-util.c:173:16: error: ‘__NR_memfd_create’ undeclared (first use in this function) ret = syscall(__NR_memfd_create, name, MFD_ALLOW_SEALING); I also confirm, HEAD gives these errors! However yesterday before any change to Makefile, I got the same error that was first reported by Greg, and to compile and test kdbus, I just added export KERNELDIR to the root Makefile, didn't bother to search... -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] kdbus: merge 3.17 branch into master?
On Tue, Aug 19, 2014 at 04:08:15PM +0200, Daniel Mack wrote: On 08/19/2014 03:56 PM, Greg KH wrote: On Tue, Aug 19, 2014 at 03:24:39PM +0200, Daniel Mack wrote: On 08/19/2014 02:47 PM, Greg KH wrote: In file included from /lib/modules/3.17.0-rc1+/build/include/uapi/linux/posix_types.h:4:0, from /lib/modules/3.17.0-rc1+/build/include/uapi/linux/types.h:13, from ../kdbus.h:23, Ok, got it now. No idea why it worked on my system, but a fix for now is to make cpp look at the system include file search pathes first, and then fall back to $(KERNELDIR)/include/uapi. Care to test again? Close, I now get a bunch of these warnings: cc1: note: obsolete option -I- used, please use -iquote instead and then finally a failure of: kdbus-util.c:173:16: error: ‘__NR_memfd_create’ undeclared (first use in this function) ret = syscall(__NR_memfd_create, name, MFD_ALLOW_SEALING); Ok, cpp is really confused about which headers to look at now. I guess we have to rely on the kernel headers being available under $(KERNELDIR)/usr/include. Could you please try if temporarily reverting my top-most commit and then doing a 'make headers_install' in your kernel repo (before you build kdbus) fixes it? Daniel, I confirm here. I just did that and it builds. Thanks -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 0/3] kdbus: metadata improvements
Hi, Patch 1 and 2 should be applied. For patch 3 it can be discussed :-) Please apply on top of the test series that contain the test-kdbus-metadta-ns tool. Thanks! ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 1/3] metadata: document creds_item_off and auxgrps_item_off fields
Signed-off-by: Djalal Harouni tix...@opendz.org --- metadata.c | 3 ++- metadata.h | 8 +++- 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/metadata.c b/metadata.c index eb286aa..3dff9ac 100644 --- a/metadata.c +++ b/metadata.c @@ -130,7 +130,8 @@ kdbus_meta_append_item(struct kdbus_meta *meta, u64 type, size_t payload_size) * kdbus_meta_append_data() - append given raw data to metadata object * @meta: Metadata object * @type: KDBUS_ITEM_* type - * @data: pointer to data to copy from + * @data: pointer to data to copy from. If it is NULL + * then just make space in the metadata buffer. * @len: number of bytes to copy * * Return: 0 on success, negative errno on failure. diff --git a/metadata.h b/metadata.h index a142aea..1bdb537 100644 --- a/metadata.h +++ b/metadata.h @@ -16,10 +16,16 @@ /** * struct kdbus_meta - metadata buffer * @attached: Flags for already attached data - * @domain:Domain the metadata belongs to + * @domain:Domain the metadata belongs to * @data: Allocated buffer * @size: Number of bytes used * @allocated_size:Size of buffer + * @creds_item_off The offset of the creds item in the @data + * buffer field, if the user requested this + * metainfo in its attach flags. 0 if unused. + * @auxgrps_item_off The offset of the auxgroups item in the + * @data buffer field, if the user requested + * this metainfo. 0 if unused. * * Used to collect and store connection metadata in a pre-compiled * buffer containing struct kdbus_item. -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 2/3] connection: move the install creds into the slice to its own function
Move the install creds into the receiver's slice to its own function kdbus_conn_creds_install(). Use from_kuid_munged(), so the uid mapping never fails. Signed-off-by: Djalal Harouni tix...@opendz.org --- connection.c | 32 1 file changed, 20 insertions(+), 12 deletions(-) diff --git a/connection.c b/connection.c index 99ac2a1..9250dab 100644 --- a/connection.c +++ b/connection.c @@ -964,6 +964,25 @@ remove_unused: return ret; } +static int kdbus_conn_creds_install(struct kdbus_conn_queue *queue) +{ + int ret; + struct kdbus_creds creds = {}; + struct user_namespace *current_ns = current_user_ns(); + off_t off = queue-creds_item_offset + + offsetof(struct kdbus_item, creds); + + creds.uid = from_kuid_munged(current_ns, queue-uid); + creds.gid = from_kgid_munged(current_ns, queue-gid); + creds.pid = pid_nr_ns(queue-pid, task_active_pid_ns(current)); + creds.tid = pid_nr_ns(queue-tid, task_active_pid_ns(current)); + + ret = kdbus_pool_slice_copy_user(queue-slice, off, +creds, sizeof(creds)); + + return ret; +} + static int kdbus_conn_msg_install(struct kdbus_conn_queue *queue) { int *memfds = NULL; @@ -989,18 +1008,7 @@ static int kdbus_conn_msg_install(struct kdbus_conn_queue *queue) } if (queue-creds_item_offset) { - struct kdbus_creds creds; - size_t size = sizeof(__u64) * 4; - off_t off = queue-creds_item_offset + - offsetof(struct kdbus_item, creds); - - creds.uid = from_kuid(current_user_ns(), queue-uid); - creds.gid = from_kgid(current_user_ns(), queue-gid); - creds.pid = pid_nr_ns(queue-pid, task_active_pid_ns(current)); - creds.tid = pid_nr_ns(queue-tid, task_active_pid_ns(current)); - - ret = kdbus_pool_slice_copy_user(queue-slice, off, -creds, size); + ret = kdbus_conn_creds_install(queue); if (ret 0) goto exit_rewind_fds; } -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 3/3] kdbus: get some creds during meta append for optimization
Some creds can be gathered during kdbus_meta_append() instead of kdbus_conn_queue_alloc() where they will be gathered for all the receivers and saved into each receivers queue before installing on the slices. By moving to kdbus_meta_append() it permits to do it only one time for all the receivers, and we reduce some latency in kdbus_conn_queue_alloc(). Another point is that all the receivers will get the same uid/gid even if the current sender changes its creds while we are still queueing for all receivers. It seems that we can do the same with auxgroups, but this patch does not implement it. Patch tested with the test-kdbus-metadata-ns Signed-off-by: Djalal Harouni tix...@opendz.org --- connection.c | 56 metadata.c | 6 +- metadata.h | 26 ++ 3 files changed, 67 insertions(+), 21 deletions(-) diff --git a/connection.c b/connection.c index 9250dab..49c5045 100644 --- a/connection.c +++ b/connection.c @@ -42,6 +42,24 @@ struct kdbus_conn_reply; /** + * struct kdbus_queue_creds - internal creds data for receiver's + * queue + * @uid: The UID to patch into the final message + * @gid: The GID to patch into the final message + * @pid: The PID to patch into the final message + * @tid: The TID to patch into the final message + * + * Creds that must be translated into the receiver's namespaces + * when the message is installed into its slice. + */ +struct kdbus_queue_creds { + kuid_t uid; + kgid_t gid; + struct pid *pid; + struct pid *tid; +}; + +/** * struct kdbus_conn_queue - messages waiting to be read * @entry: Entry in the connection's list * @prio_node: Entry in the priority queue tree @@ -67,10 +85,8 @@ struct kdbus_conn_reply; * @auxgrp_item_offset:The offset of the auxgrp item inside the slice, if * the user requested this metainfo in its attach flags. * 0 if unused. - * @uid: The UID to patch into the final message - * @gid: The GID to patch into the final message - * @pid: The PID to patch into the final message - * @tid: The TID to patch into the final message + * @qcreds The creds that must be translated before + * installing. * @auxgrps: An array storing the sender's aux groups, in kgid_t. * This information is translated into the user's * namespace when the message is installed. @@ -97,10 +113,7 @@ struct kdbus_conn_queue { off_t auxgrp_item_offset; /* to honor namespaces, we have to store the following here */ - kuid_t uid; - kgid_t gid; - struct pid *pid; - struct pid *tid; + struct kdbus_queue_creds qcreds; kgid_t *auxgrps; unsigned int auxgrps_count; @@ -466,10 +479,11 @@ static void kdbus_conn_queue_remove(struct kdbus_conn *conn, static void kdbus_conn_queue_cleanup(struct kdbus_conn_queue *queue) { - if (queue-pid) - put_pid(queue-pid); - if (queue-tid) - put_pid(queue-tid); + struct kdbus_queue_creds *qcreds = queue-qcreds; + if (qcreds-pid) + put_pid(qcreds-pid); + if (qcreds-tid) + put_pid(qcreds-tid); if (queue-auxgrps) kfree(queue-auxgrps); @@ -618,6 +632,7 @@ static int kdbus_conn_queue_alloc(struct kdbus_conn *conn, /* append message metadata/credential items */ if (meta_off 0) { struct kdbus_meta *meta = kmsg-meta; + struct kdbus_queue_creds *qcreds = queue-qcreds; /* * If the receiver requested credential information, store the @@ -626,10 +641,10 @@ static int kdbus_conn_queue_alloc(struct kdbus_conn *conn, */ if (meta-attached KDBUS_ATTACH_CREDS) { /* store kernel-view of the credentials */ - queue-uid = current_uid(); - queue-gid = current_gid(); - queue-pid = get_task_pid(current, PIDTYPE_PID); - queue-tid = get_task_pid(current-group_leader, + qcreds-uid = meta-mcreds.uid; + qcreds-gid = meta-mcreds.gid; + qcreds-pid = get_task_pid(current, PIDTYPE_PID); + qcreds-tid = get_task_pid(current-group_leader, PIDTYPE_PID); queue-creds_item_offset = meta_off + @@ -968,14 +983,15 @@ static int kdbus_conn_creds_install(struct kdbus_conn_queue *queue) { int ret; struct kdbus_creds creds = {}; + struct kdbus_queue_creds *qcreds = queue-qcreds; struct user_namespace
Re: [systemd-devel] compile with clang broken
On Fri, Aug 15, 2014 at 10:55:57AM +0200, David Herrmann wrote: Hi On Thu, Aug 14, 2014 at 8:07 PM, Lennart Poettering lenn...@poettering.net wrote: On Fri, 18.07.14 16:02, Thomas H.P. Andersen (pho...@gmail.com) wrote: 1716f6dcf54d4c181c2e2558e3d5414f54c8d9ca (resolved: add LLMNR support for looking up names) broke the build on clang. src/resolve/resolved-manager.c:553:43: error: non-const static data member must be initialized out of line uint8_t buffer[CMSG_SPACE(MAX(sizeof(struct in_pktinfo), sizeof(struct in6_pktinfo))) Moving the MAX(...) to a separate line fixes that problem but another error then happens: src/resolve/resolved-manager.c:554:25: error: fields must have a constant size: 'variable length array in structure' extension will never be supported uint8_t buffer[CMSG_SPACE(size) We have encountered the same problem before and Lennart was able to write the code in a different way. Would this be possible here too? My sugegstion here would be to maybe rewrite the MAX() macro to use __builtin_constant_p() so that it becomes constant if the params are constant, and only uses code block when it isn't. Or so... http://lists.freedesktop.org/archives/systemd-devel/2014-August/021912.html Hm, I don't know whether that works. See the description here: https://gcc.gnu.org/onlinedocs/gcc-4.3.3/gcc/Other-Builtins.html What you propose is something like my attached patch, I guess? Along the lines of: (__builtin_constant_p(A) __builtin_constant_p(B)) ? ((A) (B) ? (A) : (B)) : ({ OLD_MAX }) Thing is, the ELSE case is not considered a compile-time constant by LLVM. Therefore, the whole expression is not considered a compile-time constant. I don't know whether conditions with __builtin_constant_p() are evaluated at the parser-step. The GCC example replaces the ELSE case with -1, effectively making both compile-time constants. Sorry I didn't follow the thread, but: Actually and IIRC it is more complicated, __builtin_constant_p() can be computed during the SSA (optimization) passes on the GIMPLE form of the code... so it depends on the code and parameters passed to __builitin_constant_p(), not only preprocessor constants. https://gcc.gnu.org/onlinedocs/gccint/Tree-SSA.html -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 1/2] test: sync the policy tests with the recent activators and policy holders changes
Recent commit 7015a1e6746e0c2 prevents special-purpose connections from owning names, so update the test-kdbus-policy tests to follow and test these changes. Create a new policy holder connection which will register the policy for an X name, and make the first conn_db[0] connection own that name. Signed-off-by: Djalal Harouni tix...@opendz.org --- test/test-kdbus-policy.c | 25 +++-- 1 file changed, 19 insertions(+), 6 deletions(-) diff --git a/test/test-kdbus-policy.c b/test/test-kdbus-policy.c index a2430f2..e0bd619 100644 --- a/test/test-kdbus-policy.c +++ b/test/test-kdbus-policy.c @@ -39,8 +39,9 @@ /** * Check a list of connections against conn_db[0] - * conn_db[0] will be the policy holder and it will set - * different policy accesses. + * conn_db[0] will own the name foo.test.policy-test and the + * policy holder connection for this name will update the policy + * entries, so different use cases can be tested. */ static struct conn **conn_db; @@ -269,6 +270,7 @@ static int kdbus_check_policy(char *bus) int i; int ret; struct conn *activator = NULL; + struct conn *policy_holder = NULL; conn_db = calloc(MAX_CONN, sizeof(struct conn *)); if (!conn_db) @@ -276,8 +278,9 @@ static int kdbus_check_policy(char *bus) memset(conn_db, 0, MAX_CONN * sizeof(struct conn *)); - ret = kdbus_register_policy_holder(bus, POLICY_NAME, conn_db[0]); - printf(-- TEST 1) register '%s' as policy holder , + ret = kdbus_register_policy_holder(bus, POLICY_NAME, + policy_holder); + printf(-- TEST 1) register a policy holder for '%s' , POLICY_NAME); if (ret 0) { printf(FAILED\n); @@ -307,6 +310,15 @@ static int kdbus_check_policy(char *bus) } } + /* The name receiver */ + conn_db[0] = kdbus_hello(bus, 0, NULL, 0); + if (!conn_db[0]) { + ret = -1; + goto out_free_connections; + } + + add_match_empty(conn_db[0]-fd); + ret = name_acquire(conn_db[0], POLICY_NAME, 0); printf(-- TEST 3) acquire '%s' name. , POLICY_NAME); if (ret 0) { @@ -344,9 +356,9 @@ static int kdbus_check_policy(char *bus) * Restrict the policy and purge cache entries where the * conn_db[0] is the destination. */ - ret = kdbus_set_policy_talk(conn_db[0], POLICY_NAME, + ret = kdbus_set_policy_talk(policy_holder, POLICY_NAME, geteuid(), KDBUS_POLICY_ACCESS_USER); - printf(-- TEST 6) restricting policy '%s' TALK access , + printf(-- TEST 6) restricting '%s' policy TALK access , POLICY_NAME); if (ret 0) { printf(FAILED\n); @@ -374,6 +386,7 @@ static int kdbus_check_policy(char *bus) out_free_connections: kdbus_free_conn(activator); + kdbus_free_conn(policy_holder); for (i = 0; i MAX_CONN; i++) kdbus_free_conn(conn_db[i]); -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 2/2] handle: return -EOPNOTSUPP instead of -EPERM if an operation is not supported
If userspace calls in with the wrong connection type, just return -EOPNOTSUPP instead of -EPERM. This will not confuse unprivileged and privileged processes, and permits to identify legitimate -EPERM errors. This just converts errors introduced in commit 7015a1e6746 Signed-off-by: Djalal Harouni tix...@opendz.org --- handle.c | 20 ++-- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/handle.c b/handle.c index 4b8376c..2e6502b 100644 --- a/handle.c +++ b/handle.c @@ -537,7 +537,7 @@ static long kdbus_handle_ioctl_ep_connected(struct file *file, unsigned int cmd, switch (cmd) { case KDBUS_CMD_BYEBYE: if (conn-type != KDBUS_CONN_CONNECTED) { - ret = -EPERM; + ret = -EOPNOTSUPP; break; } @@ -548,7 +548,7 @@ static long kdbus_handle_ioctl_ep_connected(struct file *file, unsigned int cmd, /* acquire a well-known name */ if (conn-type != KDBUS_CONN_CONNECTED) { - ret = -EPERM; + ret = -EOPNOTSUPP; break; } @@ -573,7 +573,7 @@ static long kdbus_handle_ioctl_ep_connected(struct file *file, unsigned int cmd, /* release a well-known name */ if (conn-type != KDBUS_CONN_CONNECTED) { - ret = -EPERM; + ret = -EOPNOTSUPP; break; } @@ -638,7 +638,7 @@ static long kdbus_handle_ioctl_ep_connected(struct file *file, unsigned int cmd, /* update the properties of a connection */ if (conn-type != KDBUS_CONN_CONNECTED) { - ret = -EPERM; + ret = -EOPNOTSUPP; break; } @@ -657,7 +657,7 @@ static long kdbus_handle_ioctl_ep_connected(struct file *file, unsigned int cmd, if (conn-type != KDBUS_CONN_CONNECTED conn-type != KDBUS_CONN_MONITOR) { - ret = -EPERM; + ret = -EOPNOTSUPP; break; } @@ -676,7 +676,7 @@ static long kdbus_handle_ioctl_ep_connected(struct file *file, unsigned int cmd, if (conn-type != KDBUS_CONN_CONNECTED conn-type != KDBUS_CONN_MONITOR) { - ret = -EPERM; + ret = -EOPNOTSUPP; break; } @@ -694,7 +694,7 @@ static long kdbus_handle_ioctl_ep_connected(struct file *file, unsigned int cmd, struct kdbus_kmsg *kmsg = NULL; if (conn-type != KDBUS_CONN_CONNECTED) { - ret = -EPERM; + ret = -EOPNOTSUPP; break; } @@ -727,7 +727,7 @@ static long kdbus_handle_ioctl_ep_connected(struct file *file, unsigned int cmd, if (conn-type != KDBUS_CONN_CONNECTED conn-type != KDBUS_CONN_MONITOR) { - ret = -EPERM; + ret = -EOPNOTSUPP; break; } @@ -754,7 +754,7 @@ static long kdbus_handle_ioctl_ep_connected(struct file *file, unsigned int cmd, u64 cookie; if (conn-type != KDBUS_CONN_CONNECTED) { - ret = -EPERM; + ret = -EOPNOTSUPP; break; } @@ -775,7 +775,7 @@ static long kdbus_handle_ioctl_ep_connected(struct file *file, unsigned int cmd, if (conn-type != KDBUS_CONN_CONNECTED conn-type != KDBUS_CONN_MONITOR) { - ret = -EPERM; + ret = -EOPNOTSUPP; break; } -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 0/3] more improvements on connection types checks
Hi, This goes on top of the previous one: http://lists.freedesktop.org/archives/systemd-devel/2014-August/021747.html Kay, sorry it should be perhaps just be a one series, but I just noticed those bugs, so just send quick fixes. If you want me to resend as a one series, I will do it, no problem. So with this we pass the test-kdbus-policy checks, but It seems that the test-kdbus still fails, will test it later. Thanks! ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 1/3] handle: allow KDBUS_CMD_CONN_UPDATE ioctl for policy holders
Allow KDBUS_CMD_CONN_UPDATE for KDBUS_CONN_POLICY_HOLDER connections. Signed-off-by: Djalal Harouni tix...@opendz.org --- handle.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/handle.c b/handle.c index 2e6502b..ac68681 100644 --- a/handle.c +++ b/handle.c @@ -636,8 +636,8 @@ static long kdbus_handle_ioctl_ep_connected(struct file *file, unsigned int cmd, case KDBUS_CMD_CONN_UPDATE: /* update the properties of a connection */ - - if (conn-type != KDBUS_CONN_CONNECTED) { + if (conn-type != KDBUS_CONN_CONNECTED + conn-type != KDBUS_CONN_POLICY_HOLDER) { ret = -EOPNOTSUPP; break; } -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 2/3] connection: improve kdbus_cmd_conn_update() connection type checks
Do another round of connection type checks inside the KDBUS_ITEM iterator. We need this since we do not want to allow ordinary connections to update policy entries that belong to another policy holder connection. We also do it for the attach flags since only ordinary connections are interessted in it. And update a kdbus_policy_set() call to only pass a one name per policy-holding connection Signed-off-by: Djalal Harouni tix...@opendz.org --- connection.c | 21 - 1 file changed, 16 insertions(+), 5 deletions(-) diff --git a/connection.c b/connection.c index c2d71a8..342c870 100644 --- a/connection.c +++ b/connection.c @@ -1792,7 +1792,8 @@ exit: } /** - * kdbus_conn_update() - update flags for a connection + * kdbus_cmd_conn_update() - update the attach-flags of a connection or + * the policy entries of a policy holding one * @conn: Connection * @cmd: The command as passed in by the ioctl * @@ -1815,11 +1816,22 @@ int kdbus_cmd_conn_update(struct kdbus_conn *conn, switch (item-type) { case KDBUS_ITEM_ATTACH_FLAGS: + /* Only ordinary connections may update their +* attach-flags */ + if (conn-type != KDBUS_CONN_CONNECTED) + return -EOPNOTSUPP; + flags_provided = true; attach_flags = item-data64[0]; break; + case KDBUS_ITEM_NAME: case KDBUS_ITEM_POLICY_ACCESS: + /* Only policy holders may update their policy +* entries */ + if (conn-type != KDBUS_CONN_POLICY_HOLDER) + return -EOPNOTSUPP; + policy_provided = true; break; } @@ -1972,13 +1984,12 @@ int kdbus_conn_new(struct kdbus_ep *ep, } /* -* Policy holders may install any number of names, and -* are allowed to use wildcards as well. +* Policy holders may install one name, and are +* allowed to use wildcards. */ ret = kdbus_policy_set(bus-policy_db, hello-items, KDBUS_ITEMS_SIZE(hello, items), - is_policy_holder ? 0 : 1, - is_policy_holder, conn); + 1, is_policy_holder, conn); if (ret 0) goto exit_free_conn; } -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 3/3] test: split conn_update() into update attach-flags and update policy
Since ordinary connections are only interested in the attach-flags and policy holders in policies, split conn_update() into: 1) conn_update_attach_flags() 2) conn_update_policy() This way we use the conn_update_policy() function in test-kdbus-policy with a policy-holding connection and we pass all the tests. This prevents messing up with the attach-flags. Signed-off-by: Djalal Harouni tix...@opendz.org --- test/kdbus-util.c| 62 +++- test/kdbus-util.h| 7 +++--- test/test-kdbus-policy.c | 2 +- 3 files changed, 45 insertions(+), 26 deletions(-) diff --git a/test/kdbus-util.c b/test/kdbus-util.c index e04aab1..c93390e 100644 --- a/test/kdbus-util.c +++ b/test/kdbus-util.c @@ -602,19 +602,15 @@ int name_list(struct conn *conn, uint64_t flags) return 0; } -int conn_update(struct conn *conn, const char *name, - const struct kdbus_policy_access *access, - size_t num_access, uint64_t flags) +int conn_update_attach_flags(struct conn *conn, uint64_t flags) { + int ret; + size_t size; struct kdbus_cmd_update *update; struct kdbus_item *item; - size_t i, size; - int ret; size = sizeof(struct kdbus_cmd_update); size += KDBUS_ITEM_SIZE(sizeof(uint64_t)); - size += KDBUS_ITEM_SIZE(strlen(name) + 1); - size += num_access * KDBUS_ITEM_SIZE(sizeof(struct kdbus_policy_access)); update = malloc(size); if (!update) { @@ -628,25 +624,47 @@ int conn_update(struct conn *conn, const char *name, item = update-items; - /* -* normally having flags == 0 is valid, but just keep -* HELLO flags of kdbus_hello(), don't check them. -*/ item-type = KDBUS_ITEM_ATTACH_FLAGS; item-size = KDBUS_ITEM_HEADER_SIZE + sizeof(uint64_t); - item-data64[0] = flags ? flags : KDBUS_ATTACH_TIMESTAMP | - KDBUS_ATTACH_CREDS | - KDBUS_ATTACH_NAMES | - KDBUS_ATTACH_COMM | - KDBUS_ATTACH_EXE | - KDBUS_ATTACH_CMDLINE | - KDBUS_ATTACH_CAPS | - KDBUS_ATTACH_CGROUP | - KDBUS_ATTACH_SECLABEL | - KDBUS_ATTACH_AUDIT | - KDBUS_ATTACH_CONN_NAME; + item-data64[0] = flags; item = KDBUS_ITEM_NEXT(item); + ret = ioctl(conn-fd, KDBUS_CMD_CONN_UPDATE, update); + if (ret 0) { + ret = -errno; + fprintf(stderr, error conn update: %d (%m)\n, ret); + } + + free(update); + + return ret; +} + +int conn_update_policy(struct conn *conn, const char *name, + const struct kdbus_policy_access *access, + size_t num_access) +{ + struct kdbus_cmd_update *update; + struct kdbus_item *item; + size_t i, size; + int ret; + + size = sizeof(struct kdbus_cmd_update); + size += KDBUS_ITEM_SIZE(strlen(name) + 1); + size += num_access * KDBUS_ITEM_SIZE(sizeof(struct kdbus_policy_access)); + + update = malloc(size); + if (!update) { + ret = -errno; + fprintf(stderr, error malloc: %d (%m)\n, ret); + return ret; + } + + memset(update, 0, size); + update-size = size; + + item = update-items; + item-type = KDBUS_ITEM_NAME; item-size = KDBUS_ITEM_HEADER_SIZE + strlen(name) + 1; strcpy(item-str, name); diff --git a/test/kdbus-util.h b/test/kdbus-util.h index 615f318..ba94d7b 100644 --- a/test/kdbus-util.h +++ b/test/kdbus-util.h @@ -55,9 +55,10 @@ struct conn *kdbus_hello_activator(const char *path, const char *name, size_t num_access); struct kdbus_item *make_policy_name(const char *name); struct kdbus_item *make_policy_access(__u64 type, __u64 bits, __u64 id); -int conn_update(struct conn *conn, const char *name, - const struct kdbus_policy_access *access, - size_t num_access, uint64_t flags); +int conn_update_attach_flags(struct conn *conn, uint64_t flags); +int conn_update_policy(struct conn *conn, const char *name, + const struct kdbus_policy_access *access, + size_t num_access); void add_match_empty(int fd); diff --git a/test/test-kdbus-policy.c b/test/test-kdbus-policy.c index e0bd619..4f8e763 100644 --- a/test/test-kdbus-policy.c +++ b/test/test-kdbus-policy.c @@ -64,7 +64,7 @@ static int kdbus_set_policy_talk(struct conn *conn, .access = KDBUS_POLICY_TALK, }; - return conn_update(conn, name, access, 1, 0); + return conn_update_policy(conn, name
Re: [systemd-devel] [PATCH 0/5] kdbus: allow multiple policies
On Thu, Jul 31, 2014 at 10:38:47PM +0200, Kay Sievers wrote: [...] Still I see three points here from how much pressure and job should the policy holding connection do! 1) Register policy entries (handled internally), no communication 2) Register policy entries + do basic communication based on ID 3) Register policy entries + acquire name or names + do communication based on names... Policy holders and activators can never communicate. Activator connection can get messages queued, but they cannot be received by the activator connection. Please, another point here: Currently the policy holding connection is able to own a well-known name, which makes it able to communicate and receive messages through this name. IIRC this was discussed last time in this list, but it is still unclear at least for me, and in different places of the code we only check for activators when doing send/recv validation. So I guess we should also block policy holders from owning well-known names ? hmm, then add the policy holders to the block X connections from sending or receiving... . Thanks! -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH 0/5] kdbus: allow multiple policies
Hi, On Thu, Jul 31, 2014 at 10:38:47PM +0200, Kay Sievers wrote: On Thu, Jul 31, 2014 at 8:57 PM, Djalal Harouni tix...@opendz.org wrote: (Cc'ed Lennart) On Thu, Jul 31, 2014 at 05:40:53PM +0200, Kay Sievers wrote: On Wed, Jul 23, 2014 at 6:34 PM, Djalal Harouni tix...@opendz.org wrote: This series adds the infrastructure to test and upload multiple policies. The last #5 patch allows to upload multiple policies per connection What is the reason for this? A policy holding connection (which matches a busname unit) shouldn't only be in charge of one single name? Hmm, I thought that what you describe is for activators? Yes, activators can only ever have one name, but policy holders should too. Ok. I was following the current kdbus spec/code here! From kdbus.h source: @KDBUS_HELLO_POLICY_HOLDER: Special-purpose connection which registers policy entries for one or multiple names. The provided names are not activated, and are not registered with the name database I'm going to fix that, it does not sound correct. We have one .busname file per name and it will get really complicated to start stop a busname, when it has multiple names per connection. We should really avoid that and require one connection per name and allow only name. I do agree. And the logic in policy.c is set that multiple policy entries can have the same owner which is the policy holding connection: struct kdbus_policy_db_entry { char *name; struct hlist_node hentry; struct list_head access_list; const void *owner; bool wildcard:1; }; Only custom endpoints will carry list of policies for different names. It looks and acts a little bit like a firewall. I see the point now! This should allow one single connection to handle multiple policy entries, I think the design is good, since policy entries are then handled internally by kdbus, so only one single connection resources to achieve this, which can be reliable on private domains,buses... Resources don't really matter here, they are cheap. We need to keep things simple and robust, and managing lists of names in userspace when the configuration changes would get really complicated. Yes, I was also worried about configuration changes. Well you are right here. I did like the fact that on a private bus, a connection might hold multiple policies for multiple names, but yes there will be problems when updating/deleting multiple policies... The policy holding connection acts like create policy entries, and invalidate them when it is closed! Yes, all that should already work with systemd. Still I see three points here from how much pressure and job should the policy holding connection do! 1) Register policy entries (handled internally), no communication 2) Register policy entries + do basic communication based on ID 3) Register policy entries + acquire name or names + do communication based on names... Policy holders and activators can never communicate. Activator connection can get messages queued, but they cannot be received by the activator connection. Ok, I will give it a review and send patches for kdbus and the test-kdbus-policy tests, since now in test-kdbus-policy a policy holder is able to receive messages. So I'll updated based on these comments, thanks! All of these points are from the perspective: if the policy holding connection dies, all the registered policy entries will go away, thus invalidating all communications to the registered names... Should we care ? It should all work that way already. Yes, but I'll give it a review and write tests for these cases. And for that patch change, it will block activators from having multiple names, it will fail, but it will succeed for policy holders. Which is intentional. Sorry for the misleading text in the docs. Hopefully it got catched now :-) The todo for the policy holders is: * Should we set a maximum value for how many names/policies a policy holder is allowed to upload. This is needed since we do not want to consume all the entries (kdbus_policy_db-entries_hash). Yes, everything that consumes kernel memory needs to be limited. Ok, so this needs a proper discussion, will send about it and Cc Lennart too. We need to limit the size of the uploaded policy, just like we limit the amount of messages. So perhaps do not apply this series until we have a proper limit. * Update/test the kdbus_policy_set() to work with multiple names. This is meant for custom endpoints, not for policy-holder connections? Currently it is aimed to work for both of them! I've tested for policy-holder connections and it works with this series applied, but only when we create new connections. It will not work when we try to update or register new properties, policies or names. So from the source: connection.c:1978 it will fail
Re: [systemd-devel] [PATCH 0/5] kdbus: allow multiple policies
(Cc'ed Lennart) On Thu, Jul 31, 2014 at 05:40:53PM +0200, Kay Sievers wrote: On Wed, Jul 23, 2014 at 6:34 PM, Djalal Harouni tix...@opendz.org wrote: This series adds the infrastructure to test and upload multiple policies. The last #5 patch allows to upload multiple policies per connection What is the reason for this? A policy holding connection (which matches a busname unit) shouldn't only be in charge of one single name? Hmm, I thought that what you describe is for activators? I was following the current kdbus spec/code here! From kdbus.h source: @KDBUS_HELLO_POLICY_HOLDER: Special-purpose connection which registers policy entries for one or multiple names. The provided names are not activated, and are not registered with the name database And the logic in policy.c is set that multiple policy entries can have the same owner which is the policy holding connection: struct kdbus_policy_db_entry { char *name; struct hlist_node hentry; struct list_head access_list; const void *owner; bool wildcard:1; }; This should allow one single connection to handle multiple policy entries, I think the design is good, since policy entries are then handled internally by kdbus, so only one single connection resources to achieve this, which can be reliable on private domains,buses... The policy holding connection acts like create policy entries, and invalidate them when it is closed! Still I see three points here from how much pressure and job should the policy holding connection do! 1) Register policy entries (handled internally), no communication 2) Register policy entries + do basic communication based on ID 3) Register policy entries + acquire name or names + do communication based on names... All of these points are from the perspective: if the policy holding connection dies, all the registered policy entries will go away, thus invalidating all communications to the registered names... Should we care ? And for that patch change, it will block activators from having multiple names, it will fail, but it will succeed for policy holders. Hope I'm not out of scope! The todo for the policy holders is: * Should we set a maximum value for how many names/policies a policy holder is allowed to upload. This is needed since we do not want to consume all the entries (kdbus_policy_db-entries_hash). Yes, everything that consumes kernel memory needs to be limited. Ok, so this needs a proper discussion, will send about it and Cc Lennart too. So perhaps do not apply this series until we have a proper limit. * Update/test the kdbus_policy_set() to work with multiple names. This is meant for custom endpoints, not for policy-holder connections? Currently it is aimed to work for both of them! I've tested for policy-holder connections and it works with this series applied, but only when we create new connections. It will not work when we try to update or register new properties, policies or names. So from the source: connection.c:1978 it will fail, but with this patch series it will succeed. connection.c:1801 it will fail in kdbus_cmd_conn_update() Since it allows only to add/update one policy entry. So this one will be a todo item. And for custom endpoints, I'm sure that kdbus_policy_set() is broken, and it was perhaps never tested since custom endpoints are still broken, we can't create them, at least that was the case the last time I checked. Kay I'm trying to have a kdbus_policy_set() version only for connections, so it will work for policy holding connections that handle multiple names! this will help to fix the sending cache issue I've been talking about. So please, do confirm that a policy holding connection should be able to register multiple policies for multiple names, before I give it more time. Thank you! The kdbus_policy_set() for custom endpoints can be worked out later... Thanks, Kay -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH v2 0/7] kdbus: improve user domain accounting
Hi, This is series v2 of: http://lists.freedesktop.org/archives/systemd-devel/2014-July/021526.html This series improves user domain accounting and fixes some bugs. It should go on top of the kdbus: allow multiple policies series: http://lists.freedesktop.org/archives/systemd-devel/2014-July/021514.html The v2 fixes the race previously discussed but not fixed in v1 and which was inherited from kdbus_domain_user_find_or_new(). If kdbus_domain_user_find_or_new() is called with same parameters and if we do not find a previously linked domain user, then we release the domain lock, and both threads will race to assign a different ID for the same uid, thus, invalidating the users array. We fix this in this series by using a new kdbus_domain_user_account() function which takes the same domain lock to perform both find a previously domain user or create and link a new one. Summary: Patches 1, 2 and 3 are preparation patches to improve the code. Patch 4 fixes kdbus_domain_user_find_or_new(), callers assume that kdbus_domain_user_find_or_new() will fail only with -ENOMEM, but it may fail with -ESHUTDOWN, so adapt the code to return the appropriate errors and fix all callers. This patch also fixes the discussed race in kdbus_domain_user_find_or_new(). Patch 5 improves the bus user quota and reduces the number of domain locks that might be taken in that path. Currently kdbus_bus_new() execution path might take the domain lock 3 or 4 times, reduce this to max 1 or 2 times. The domain is the upper layer, reducing the locks should reduce some overheads. Patch 6 fixes a user quota accounting corruption. Patch 7 kill dead code. Thanks ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH v2 1/7] domain: add kdbus_domain_user_assign_id() to assign IDs to domain users
Signed-off-by: Djalal Harouni tix...@opendz.org --- domain.c | 33 + 1 file changed, 33 insertions(+) diff --git a/domain.c b/domain.c index c4912fa..af9d986 100644 --- a/domain.c +++ b/domain.c @@ -419,6 +419,39 @@ int kdbus_domain_make_user(struct kdbus_cmd_make *cmd, char **name) } /** + * kdbus_domain_user_assign_id() - allocate ID and assign in it to the + *domain user + * @domain:The domain of the user + * @user The kdbus_domain_user object of the user + * + * Returns 0 if ID in [0, INT_MAX] is successfully assigned to the + * domain user. Negative errno on failure. + * + * The user index is used in arrays for accounting user quota in + * receiver queues. + * + * Caller must have the domain lock held and must ensure that the + * domain was not disconnected. + */ +static int kdbus_domain_user_assign_id(struct kdbus_domain *domain, + struct kdbus_domain_user *user) +{ + int ret; + + /* +* Allocate the smallest possible index for this user; used +* in arrays for accounting user quota in receiver queues. +*/ + ret = idr_alloc(domain-user_idr, user, 0, 0, GFP_KERNEL); + if (ret 0) + return ret; + + user-idr = ret; + + return 0; +} + +/** * kdbus_domain_user_find_or_new() - get a kdbus_domain_user object in a domain * @domain:The domain * @uid: The uid of the user; INVALID_UID for an anonymous -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH v2 2/7] domain: add __kdbus_domain_user_account() to account domain users
Add __kdbus_domain_user_account() to account and link users into a domain. Signed-off-by: Djalal Harouni tix...@opendz.org --- domain.c | 70 domain.h | 4 2 files changed, 74 insertions(+) diff --git a/domain.c b/domain.c index af9d986..11d7439 100644 --- a/domain.c +++ b/domain.c @@ -452,6 +452,76 @@ static int kdbus_domain_user_assign_id(struct kdbus_domain *domain, } /** + * __kdbus_domain_user_account() - account a kdbus_domain_user object + *into the specified domain + * @domain:The domain of the user + * @uid: The uid of the user; INVALID_UID for an + * anonymous user like a custom endpoint + * @user Pointer to a reference where the accounted + * domain user will be stored. + * + * Return: 0 on success, negative errno on failure. + * + * On success: if there is a uid matching, then use the already + * accounted kdbus_domain_user, increment its reference counter and + * return it in the 'user' argument. Otherwise, allocate a new one, + * link it into the domain, then return it. + * + * On failure: the 'user' argument is not updated. + * + * Caller must have the domain lock held and must ensure that the + * domain was not disconnected. + */ +int __kdbus_domain_user_account(struct kdbus_domain *domain, + kuid_t uid, + struct kdbus_domain_user **user) +{ + int ret; + struct kdbus_domain_user *tmp_user; + struct kdbus_domain_user *u = NULL; + + /* find uid and reference it */ + if (uid_valid(uid)) { + hash_for_each_possible(domain-user_hash, tmp_user, + hentry, __kuid_val(uid)) { + if (!uid_eq(tmp_user-uid, uid)) + continue; + + u = kdbus_domain_user_ref(tmp_user); + goto out; + } + } + + ret = -ENOMEM; + u = kzalloc(sizeof(*u), GFP_KERNEL); + if (!u) + return ret; + + kref_init(u-kref); + u-domain = kdbus_domain_ref(domain); + u-uid = uid; + atomic_set(u-buses, 0); + atomic_set(u-connections, 0); + + /* Assign user ID and link into domain */ + ret = kdbus_domain_user_assign_id(domain, u); + if (ret 0) + goto exit_free; + + /* UID hash map */ + hash_add(domain-user_hash, u-hentry, __kuid_val(u-uid)); + +out: + *user = u; + return 0; + +exit_free: + kdbus_domain_unref(u-domain); + kfree(u); + return ret; +} + +/** * kdbus_domain_user_find_or_new() - get a kdbus_domain_user object in a domain * @domain:The domain * @uid: The uid of the user; INVALID_UID for an anonymous diff --git a/domain.h b/domain.h index 9c477db..fd2940b 100644 --- a/domain.h +++ b/domain.h @@ -99,6 +99,10 @@ int kdbus_domain_new(struct kdbus_domain *parent, const char *name, int kdbus_domain_make_user(struct kdbus_cmd_make *cmd, char **name); struct kdbus_domain *kdbus_domain_find_by_major(unsigned int major); +int __kdbus_domain_user_account(struct kdbus_domain *domain, + kuid_t uid, + struct kdbus_domain_user **user); + struct kdbus_domain_user *kdbus_domain_user_find_or_new(struct kdbus_domain *domain, kuid_t uid); struct kdbus_domain_user *kdbus_domain_user_ref(struct kdbus_domain_user *u); -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH v2 3/7] domain: add kdbus_domain_user_account()
Add kdbus_domain_user_account() to account and link users into a domain. This function will take the domain lock, and it will be used as a replacement for kdbus_domain_user_find_or_new(). Signed-off-by: Djalal Harouni tix...@opendz.org --- domain.c | 36 domain.h | 4 2 files changed, 40 insertions(+) diff --git a/domain.c b/domain.c index 11d7439..1e2c3c9 100644 --- a/domain.c +++ b/domain.c @@ -522,6 +522,42 @@ exit_free: } /** + * kdbus_domain_user_account() - account a kdbus_domain_user object + * into the specified domain + * @domain:The domain of the user + * @uid: The uid of the user; INVALID_UID for an + * anonymous user like a custom endpoint + * @user Pointer to a reference where the accounted + * domain user will be stored. + * + * Return: 0 on success, negative errno on failure. + * + * On success: if there is a uid matching, then use the already + * accounted kdbus_domain_user, increment its reference counter and + * return it in the 'user' argument. Otherwise, allocate a new one, + * link it into the domain, then return it. + * + * On failure: the 'user' argument is not updated. + * + * This function will first check if the domain was not disconnected. + */ +int kdbus_domain_user_account(struct kdbus_domain *domain, + kuid_t uid, + struct kdbus_domain_user **user) +{ + int ret = -ESHUTDOWN; + + mutex_lock(domain-lock); + + if (!domain-disconnected) + ret = __kdbus_domain_user_account(domain, uid, user); + + mutex_unlock(domain-lock); + + return ret; +} + +/** * kdbus_domain_user_find_or_new() - get a kdbus_domain_user object in a domain * @domain:The domain * @uid: The uid of the user; INVALID_UID for an anonymous diff --git a/domain.h b/domain.h index fd2940b..c81589e 100644 --- a/domain.h +++ b/domain.h @@ -103,6 +103,10 @@ int __kdbus_domain_user_account(struct kdbus_domain *domain, kuid_t uid, struct kdbus_domain_user **user); +int kdbus_domain_user_account(struct kdbus_domain *domain, + kuid_t uid, + struct kdbus_domain_user **user); + struct kdbus_domain_user *kdbus_domain_user_find_or_new(struct kdbus_domain *domain, kuid_t uid); struct kdbus_domain_user *kdbus_domain_user_ref(struct kdbus_domain_user *u); -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH v2 4/7] kdbus: improve user quota accounting by using kdbus_domain_user_account()
Currently kdbus_domain_user_find_or_new() is used to find a user domain or create a new one and link it into the domain. kdbus_domain_user_find_or_new() may fail due to memory allocation errors or if the domain was shutdown, but since callers will receive only a NULL pointer on failure, they assume -ENOMEM and ignore -ESHUTDOWN. Fix this in kdbus_domain_user_account() by returning the appropriate error code. There are also some races with kdbus_domain_user_find_or_new(), if it is called with the same parameters and if we do not find a previously linked domain user, then both threads will race to assign a different ID for the same uid, thus, invalidating the users array. We fix this in the new kdbus_domain_user_account() and __kdbus_domain_user_account() by taking the domain lock only one time. Replace some kdbus_domain_user_find_or_new() calls with kdbus_domain_user_account(). The last one in bus.c is updated in the next patch. Signed-off-by: Djalal Harouni tix...@opendz.org --- connection.c | 12 ++-- handle.c | 8 2 files changed, 10 insertions(+), 10 deletions(-) diff --git a/connection.c b/connection.c index 1658a92..8838029 100644 --- a/connection.c +++ b/connection.c @@ -2092,12 +2092,12 @@ int kdbus_conn_new(struct kdbus_ep *ep, */ if (ep-user) conn-user = kdbus_domain_user_ref(ep-user); - else - conn-user = kdbus_domain_user_find_or_new(ep-bus-domain, - current_fsuid()); - if (!conn-user) { - ret = -ENOMEM; - goto exit_free_meta; + else { + ret = kdbus_domain_user_account(ep-bus-domain, + current_fsuid(), + conn-user); + if (ret 0) + goto exit_free_meta; } /* lock order: domain - bus - ep - names - conn */ diff --git a/handle.c b/handle.c index bf32c6e..d8e1dc6 100644 --- a/handle.c +++ b/handle.c @@ -464,11 +464,11 @@ static long kdbus_handle_ioctl_ep(struct file *file, unsigned int cmd, * endpoint users do not share the budget with the ordinary * users created for a UID. */ - ep-user = kdbus_domain_user_find_or_new( - handle-ep-bus-domain, INVALID_UID); - if (!ep-user) { + ret = kdbus_domain_user_account( + handle-ep-bus-domain, + INVALID_UID, ep-user); + if (ret 0) { kdbus_ep_unref(ep); - ret = -ENOMEM; break; } -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH v2 5/7] bus: improve user quota accounting and domain locking
Currently kdbus_bus_new() execution path might take the domain lock three times on success, four times on failure. kdbus_bus_new(): = kdbus_domain_user_find_or_new(): takes it 2 times (and it is racy) + kdbus_bus_new(): take it an extra time to account the user and link the bus into the domain bus_list. And as discussed in the previous patch kdbus_domain_user_find_or_new() is racy, so convert to __kdbus_domain_user_account() This allows also to improve the execution path to take the domain lock only one time in case of success. On failure the domain lock will be taken two times. kdbus_bus_new(): = take domain lock = check if domain is still active/connected = __kdbus_domain_user_account() ... Signed-off-by: Djalal Harouni tix...@opendz.org --- bus.c | 13 + 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/bus.c b/bus.c index d09e3c6..9e2f41b 100644 --- a/bus.c +++ b/bus.c @@ -269,13 +269,6 @@ int kdbus_bus_new(struct kdbus_domain *domain, if (ret 0) goto exit_free_reg; - /* account the bus against the user */ - b-user = kdbus_domain_user_find_or_new(domain, uid); - if (!b-user) { - ret = -ENOMEM; - goto exit_ep_unref; - } - /* link into domain */ mutex_lock(domain-lock); if (domain-disconnected) { @@ -283,6 +276,11 @@ int kdbus_bus_new(struct kdbus_domain *domain, goto exit_unref_user_unlock; } + /* account the bus against the user */ + ret = __kdbus_domain_user_account(domain, uid, b-user); + if (ret 0) + goto exit_unref_user_unlock; + if (!capable(CAP_IPC_OWNER) atomic_inc_return(b-user-buses) KDBUS_USER_MAX_BUSES) { atomic_dec(b-user-buses); @@ -300,7 +298,6 @@ int kdbus_bus_new(struct kdbus_domain *domain, exit_unref_user_unlock: mutex_unlock(domain-lock); kdbus_domain_user_unref(b-user); -exit_ep_unref: kdbus_ep_unref(b-ep); exit_free_reg: kdbus_name_registry_free(b-name_registry); -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH v2 6/7] connection: fix user quota accounting corruption
First use kzalloc to allocate the users array, so we do not reference unintialized values. And free the old conn-msg_users array not the newly allocated 'users' one. Patch tested, and users will hit the KDBUS_CONN_MAX_MSGS_PER_USER limit and fail with -ENOBUFS Signed-off-by: Djalal Harouni tix...@opendz.org --- connection.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/connection.c b/connection.c index 8838029..3cd84ce 100644 --- a/connection.c +++ b/connection.c @@ -636,13 +636,13 @@ static int kdbus_conn_queue_user_quota(struct kdbus_conn *conn, unsigned int i; i = 8 + KDBUS_ALIGN8(user); - users = kmalloc(sizeof(unsigned int) * i, GFP_KERNEL); + users = kzalloc(sizeof(unsigned int) * i, GFP_KERNEL); if (!users) return -ENOMEM; memcpy(users, conn-msg_users, sizeof(unsigned int) * conn-msg_users_max); - kfree(users); + kfree(conn-msg_users); conn-msg_users = users; conn-msg_users_max = i; } -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH v2 7/7] domain: remove dead kdbus_domain_user_find_or_new()
Signed-off-by: Djalal Harouni tix...@opendz.org --- domain.c | 67 domain.h | 2 -- 2 files changed, 69 deletions(-) diff --git a/domain.c b/domain.c index 1e2c3c9..eeb73ca 100644 --- a/domain.c +++ b/domain.c @@ -557,73 +557,6 @@ int kdbus_domain_user_account(struct kdbus_domain *domain, return ret; } -/** - * kdbus_domain_user_find_or_new() - get a kdbus_domain_user object in a domain - * @domain:The domain - * @uid: The uid of the user; INVALID_UID for an anonymous - * user like a custom endpoint - * - * Return: a kdbus_domain_user, either freshly allocated or with the reference - * counter increased. In case of memory allocation failure, NULL is returned. - */ -struct kdbus_domain_user -*kdbus_domain_user_find_or_new(struct kdbus_domain *domain, kuid_t uid) -{ - struct kdbus_domain_user *u; - int ret; - - /* find uid and reference it */ - if (uid_valid(uid)) { - mutex_lock(domain-lock); - hash_for_each_possible(domain-user_hash, u, - hentry, __kuid_val(uid)) { - if (!uid_eq(u-uid, uid)) - continue; - - kref_get(u-kref); - mutex_unlock(domain-lock); - return u; - } - mutex_unlock(domain-lock); - } - - /* allocate a new user */ - u = kzalloc(sizeof(*u), GFP_KERNEL); - if (!u) - return NULL; - - kref_init(u-kref); - u-domain = kdbus_domain_ref(domain); - u-uid = uid; - atomic_set(u-buses, 0); - atomic_set(u-connections, 0); - - /* link into domain */ - mutex_lock(domain-lock); - if (domain-disconnected) { - mutex_unlock(domain-lock); - kfree(u); - return NULL; - } - - /* -* Allocate the smallest possible index for this user; used -* in arrays for accounting user quota in receiver queues. -*/ - ret = idr_alloc(domain-user_idr, u, 0, 0, GFP_KERNEL); - if (ret 0) { - mutex_unlock(domain-lock); - return NULL; - } - u-idr = ret; - - /* UID hash map */ - hash_add(domain-user_hash, u-hentry, __kuid_val(u-uid)); - mutex_unlock(domain-lock); - - return u; -} - static void __kdbus_domain_user_free(struct kref *kref) { struct kdbus_domain_user *user = diff --git a/domain.h b/domain.h index c81589e..0577e5d 100644 --- a/domain.h +++ b/domain.h @@ -107,8 +107,6 @@ int kdbus_domain_user_account(struct kdbus_domain *domain, kuid_t uid, struct kdbus_domain_user **user); -struct kdbus_domain_user -*kdbus_domain_user_find_or_new(struct kdbus_domain *domain, kuid_t uid); struct kdbus_domain_user *kdbus_domain_user_ref(struct kdbus_domain_user *u); struct kdbus_domain_user *kdbus_domain_user_unref(struct kdbus_domain_user *u); #endif -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 0/5] kdbus: allow multiple policies
This series adds the infrastructure to test and upload multiple policies. The last #5 patch allows to upload multiple policies per connection The todo for the policy holders is: * Should we set a maximum value for how many names/policies a policy holder is allowed to upload. This is needed since we do not want to consume all the entries (kdbus_policy_db-entries_hash). * Update/test the kdbus_policy_set() to work with multiple names. ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 5/5] connection: allow policy holders to install multiple names
Signed-off-by: Djalal Harouni tix...@opendz.org --- connection.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/connection.c b/connection.c index 85ffa5a..1658a92 100644 --- a/connection.c +++ b/connection.c @@ -1905,7 +1905,11 @@ int kdbus_conn_new(struct kdbus_ep *ep, if (!is_activator !is_policy_holder) return -EINVAL; - if (name) + /* +* activators are allowed to install only +* one name. +*/ + if (name is_activator) return -EINVAL; if (!kdbus_item_validate_nul(item)) -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 0/7] kdbus: improve user domain accounting
Hi, This series improves user domain accounting and fixes some bugs. On top of the kdbus: allow multiple policies series: http://lists.freedesktop.org/archives/systemd-devel/2014-July/021514.html Patches 1, 2, 3 and 4 are preparation patches to improve the code. Patch 5 fixes kdbus_domain_user_find_or_new(), callers assume that kdbus_domain_user_find_or_new() will fail only with -ENOMEM, but it may fail with -ESHUTDOWN, so adapt the code to return the appropriate errors and fix all callers. Patch 6 makes use of the new helpers in order to reduce the number of domain locks taken in kdbus_bus_new(). The domain is the upper layer, reducing the locks should reduce some overheads. In order to reduce the number of locks in this patch 6, we make the domain-user_idr start from 1, so if a user-idr == 0 we assume that it was not accounted. This logic was used in patch 5 which contains the detailed explanation. Patch 7 fixes a user quota accounting corruption. Thanks! ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 1/7] domain: add __kdbus_domain_user_account() to account and link users
Add __kdbus_domain_user_account() to account and link users into a domain. Signed-off-by: Djalal Harouni tix...@opendz.org --- domain.c | 32 1 file changed, 32 insertions(+) diff --git a/domain.c b/domain.c index c4912fa..a321f31 100644 --- a/domain.c +++ b/domain.c @@ -419,6 +419,38 @@ int kdbus_domain_make_user(struct kdbus_cmd_make *cmd, char **name) } /** + * __kdbus_domain_user_account() - account a kdbus_domain_user object + *into the specified domain + * @domain:The domain of the user + * @user The kdbus_domain_user object of the user + * + * Returns 0 on success, -ENOMEM on failure. + * + * Caller must have the domain lock held and must ensure that the + * domain was not disconnected. + */ +int __kdbus_domain_user_account(struct kdbus_domain *domain, + struct kdbus_domain_user *user) +{ + int ret; + + /* +* Allocate the smallest possible index for this user; used +* in arrays for accounting user quota in receiver queues. +*/ + ret = idr_alloc(domain-user_idr, user, 1, 0, GFP_KERNEL); + if (ret 0) + return ret; + + user-idr = ret; + + /* UID hash map */ + hash_add(domain-user_hash, user-hentry, __kuid_val(user-uid)); + + return 0; +} + +/** * kdbus_domain_user_find_or_new() - get a kdbus_domain_user object in a domain * @domain:The domain * @uid: The uid of the user; INVALID_UID for an anonymous -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 2/7] domain: add the lock protected version of user accounting
Add the lock protected version of __kdbus_domain_user_account(). It will check if the domain is still active before linking users. Signed-off-by: Djalal Harouni tix...@opendz.org --- domain.c | 24 1 file changed, 24 insertions(+) diff --git a/domain.c b/domain.c index a321f31..86fde55 100644 --- a/domain.c +++ b/domain.c @@ -451,6 +451,30 @@ int __kdbus_domain_user_account(struct kdbus_domain *domain, } /** + * kdbus_domain_user_account() - account a kdbus_domain_user object + * into the specified domain + * @domain:The domain of the user + * @user The kdbus_domain_user object of the user + * + * Returns 0 on success. In case of memory allocation failure return + * -ENOMEM, or -ESHUTDOWN if the domain was disconnected. + */ +int kdbus_domain_user_account(struct kdbus_domain *domain, + struct kdbus_domain_user *user) +{ + int ret = -ESHUTDOWN; + + mutex_lock(domain-lock); + + if (!domain-disconnected) + ret = __kdbus_domain_user_account(domain, user); + + mutex_unlock(domain-lock); + + return ret; +} + +/** * kdbus_domain_user_find_or_new() - get a kdbus_domain_user object in a domain * @domain:The domain * @uid: The uid of the user; INVALID_UID for an anonymous -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 3/7] domain: add kdbus_domain_user_new()
Add kdbus_domain_user_new() to allocate kdbus_domain_user objects. Signed-off-by: Djalal Harouni tix...@opendz.org --- domain.c | 27 +++ 1 file changed, 27 insertions(+) diff --git a/domain.c b/domain.c index 86fde55..18dc2a7 100644 --- a/domain.c +++ b/domain.c @@ -419,6 +419,33 @@ int kdbus_domain_make_user(struct kdbus_cmd_make *cmd, char **name) } /** + * kdbus_domain_user_new() - create a new kdbus_domain_user object + * @domain The domain of the user + * @uidThe uid of the user; INVALID_UID for an + * anonymous user like custom endpoint + * + * Return: a kdbus_domain_user freshly allocated. In case of memory + * allocation failure, NULL is returned. + */ +struct kdbus_domain_user * +kdbus_domain_user_new(struct kdbus_domain *domain, kuid_t uid) +{ + struct kdbus_domain_user *u; + + u = kzalloc(sizeof(*u), GFP_KERNEL); + if (!u) + return NULL; + + kref_init(u-kref); + u-domain = kdbus_domain_ref(domain); + u-uid = uid; + atomic_set(u-buses, 0); + atomic_set(u-connections, 0); + + return u; +} + +/** * __kdbus_domain_user_account() - account a kdbus_domain_user object *into the specified domain * @domain:The domain of the user -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 4/7] domain: add kdbus_domain_user_find()
Add kdbus_domain_user_find() to look up domain users Signed-off-by: Djalal Harouni tix...@opendz.org --- domain.c | 32 1 file changed, 32 insertions(+) diff --git a/domain.c b/domain.c index 18dc2a7..a5abb2d 100644 --- a/domain.c +++ b/domain.c @@ -446,6 +446,38 @@ kdbus_domain_user_new(struct kdbus_domain *domain, kuid_t uid) } /** + * kdbus_domain_user_find() - find a domain user with a given uid + * @domain The domain of the user + * @uidThe uid of the user; INVALID_UID for an + * anonymous user like custom endpoint + * + * Return: a kdbus_domain_user with the reference counter increased. + * If the domain user can't be found, NULL is returned. + */ +struct kdbus_domain_user * +kdbus_domain_user_find(struct kdbus_domain *domain, kuid_t uid) +{ + struct kdbus_domain_user *u; + struct kdbus_domain_user *user = NULL; + + if (!uid_valid(uid)) + return user; + + mutex_lock(domain-lock); + hash_for_each_possible(domain-user_hash, u, + hentry, __kuid_val(uid)) { + if (!uid_eq(u-uid, uid)) + continue; + + user = kdbus_domain_user_ref(u); + break; + } + mutex_unlock(domain-lock); + + return user; +} + +/** * __kdbus_domain_user_account() - account a kdbus_domain_user object *into the specified domain * @domain:The domain of the user -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 5/7] kdbus: improve user domain accounting
Currently kdbus_domain_user_find_or_new() is used to find a user domain or create a new one and link it into the domain. kdbus_domain_user_find_or_new() may fail due to memory allocation errors or if the domain was shutdown, but since callers will receive only a NULL pointer on failure, they assume -ENOMEM and ignore -ESHUTDOWN. To fix this we make kdbus_domain_user_find_or_new() return an ERR_PTR() on failure, and we update all callers. The patch also simplifies kdbus_domain_user_find_or_new() by using the previously added helpers. Another point is that by using kdbus_domain_user_account() inside kdbus_domain_user_find_or_new() we make user index start from 1 instead of 0, this allows: Other parts of the code to directly call kdbus_domain_user_account() and to be sure that user was really accounted and its index is 1 instead of 0. 0 is the default value given to user-idr after user = kzalloc(), if we want other parts of the code to direclty account users, we must differenciate the two values: 1) 0 as a valid user index user-idr = idr_alloc(domain-user_idr, user, 0, 0, GFP_KERNEL) And 2) 0 after kzalloc() struct kdbus_domain_user *user = kzalloc() So convert the former to 1, and adapt other parts of the code to treat 1 as the starting id of user indexes. Signed-off-by: Djalal Harouni tix...@opendz.org --- bus.c| 4 ++-- connection.c | 19 +++ domain.c | 76 +--- handle.c | 4 ++-- 4 files changed, 40 insertions(+), 63 deletions(-) diff --git a/bus.c b/bus.c index d09e3c6..c329bef 100644 --- a/bus.c +++ b/bus.c @@ -271,8 +271,8 @@ int kdbus_bus_new(struct kdbus_domain *domain, /* account the bus against the user */ b-user = kdbus_domain_user_find_or_new(domain, uid); - if (!b-user) { - ret = -ENOMEM; + if (IS_ERR(b-user)) { + ret = PTR_ERR(b-user); goto exit_ep_unref; } diff --git a/connection.c b/connection.c index 1658a92..c432286 100644 --- a/connection.c +++ b/connection.c @@ -60,7 +60,7 @@ struct kdbus_conn_reply; * @dst_name_id: The sequence number of the name this message is * addressed to, 0 for messages sent to an ID * @reply: The reply block if a reply to this message is expected. - * @user: Index in per-user message counter, -1 for unused + * @user: Index in per-user message counter, 0 for unused */ struct kdbus_conn_queue { struct list_head entry; @@ -78,7 +78,7 @@ struct kdbus_conn_queue { u64 cookie; u64 dst_name_id; struct kdbus_conn_reply *reply; - int user; + unsigned int user; }; /** @@ -397,10 +397,10 @@ static void kdbus_conn_queue_remove(struct kdbus_conn *conn, conn-msg_count--; /* user quota */ - if (queue-user = 0) { + if (queue-user 0) { BUG_ON(conn-msg_users[queue-user] == 0); conn-msg_users[queue-user]--; - queue-user = -1; + queue-user = 0; } /* the queue is empty, remove the user quota accounting */ @@ -471,8 +471,6 @@ static int kdbus_conn_queue_alloc(struct kdbus_conn *conn, if (!queue) return -ENOMEM; - queue-user = -1; - /* copy message properties we need for the queue management */ queue-src_id = kmsg-msg.src_id; queue-cookie = kmsg-msg.cookie; @@ -2092,12 +2090,13 @@ int kdbus_conn_new(struct kdbus_ep *ep, */ if (ep-user) conn-user = kdbus_domain_user_ref(ep-user); - else + else { conn-user = kdbus_domain_user_find_or_new(ep-bus-domain, current_fsuid()); - if (!conn-user) { - ret = -ENOMEM; - goto exit_free_meta; + if (IS_ERR(conn-user)) { + ret = PTR_ERR(conn-user); + goto exit_free_meta; + } } /* lock order: domain - bus - ep - names - conn */ diff --git a/domain.c b/domain.c index a5abb2d..81a9667 100644 --- a/domain.c +++ b/domain.c @@ -3,6 +3,7 @@ * Copyright (C) 2013 Greg Kroah-Hartman gre...@linuxfoundation.org * Copyright (C) 2013 Daniel Mack dan...@zonque.org * Copyright (C) 2013 Linux Foundation + * Copyright (C) 2014 Djalal Harouni * * kdbus is free software; you can redistribute it and/or modify it under * the terms of the GNU Lesser General Public License as published by the @@ -540,64 +541,38 @@ int kdbus_domain_user_account(struct kdbus_domain *domain, * user like a custom endpoint * * Return: a kdbus_domain_user, either freshly allocated or with the reference - * counter increased. In case of memory allocation failure, NULL is returned. + * counter increased. On failure, an ERR_PTR() that contains either -ENOMEM
[systemd-devel] [PATCH 6/7] bus: call __kdbus_domain_user_account() and avoid an extra domain lock
kdbus_bus_new() worst case will take the domain lock 3 times: kdbus_bus_new() = kdbus_domain_user_find_or_new(): will take it 2 times + kdbus_bus_new(): will take it an extra time to account the user and link the bus into the domain bus_list. We can reduce the worst case to take the domain lock 2 times by replacing the kdbus_domain_user_find_or_new() with kdbus_domain_user_find(): take the lock 1 time kdbus_domain_user_new() + kdbus_bus_new(): take the lock 1 time and use the unlocked version __kdbus_domain_user_account() to account the user. Signed-off-by: Djalal Harouni tix...@opendz.org --- bus.c| 21 + domain.h | 12 2 files changed, 29 insertions(+), 4 deletions(-) diff --git a/bus.c b/bus.c index c329bef..77adf25 100644 --- a/bus.c +++ b/bus.c @@ -219,6 +219,7 @@ int kdbus_bus_new(struct kdbus_domain *domain, struct kdbus_bus *b; char prefix[16]; int ret; + bool new_user = false; BUG_ON(*bus); @@ -269,10 +270,15 @@ int kdbus_bus_new(struct kdbus_domain *domain, if (ret 0) goto exit_free_reg; - /* account the bus against the user */ - b-user = kdbus_domain_user_find_or_new(domain, uid); - if (IS_ERR(b-user)) { - ret = PTR_ERR(b-user); + /* account the bus against the user or create a new one */ + b-user = kdbus_domain_user_find(domain, uid); + if (!b-user) { + b-user = kdbus_domain_user_new(domain, uid); + new_user = true; + } + + if (!b-user) { + ret = -ENOMEM; goto exit_ep_unref; } @@ -283,6 +289,13 @@ int kdbus_bus_new(struct kdbus_domain *domain, goto exit_unref_user_unlock; } + /* New user: account the bus against the user */ + if (new_user) { + ret = __kdbus_domain_user_account(domain, b-user); + if (ret 0) + goto exit_unref_user_unlock; + } + if (!capable(CAP_IPC_OWNER) atomic_inc_return(b-user-buses) KDBUS_USER_MAX_BUSES) { atomic_dec(b-user-buses); diff --git a/domain.h b/domain.h index 9c477db..b17e023 100644 --- a/domain.h +++ b/domain.h @@ -99,6 +99,18 @@ int kdbus_domain_new(struct kdbus_domain *parent, const char *name, int kdbus_domain_make_user(struct kdbus_cmd_make *cmd, char **name); struct kdbus_domain *kdbus_domain_find_by_major(unsigned int major); +struct kdbus_domain_user * +kdbus_domain_user_new(struct kdbus_domain *domain, kuid_t uid); + +struct kdbus_domain_user * +kdbus_domain_user_find(struct kdbus_domain *domain, kuid_t uid); + +int __kdbus_domain_user_account(struct kdbus_domain *domain, + struct kdbus_domain_user *user); + +int kdbus_domain_user_account(struct kdbus_domain *domain, + struct kdbus_domain_user *user); + struct kdbus_domain_user *kdbus_domain_user_find_or_new(struct kdbus_domain *domain, kuid_t uid); struct kdbus_domain_user *kdbus_domain_user_ref(struct kdbus_domain_user *u); -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH 2/3] nspawn: use Barrier API instead of eventfd-util
On Thu, Jul 17, 2014 at 11:30:26AM +0200, David Herrmann wrote: Hi On Mon, Jul 14, 2014 at 3:28 AM, Djalal Harouni tix...@opendz.org wrote: ppoll is atomic and it is handled by the kernel, so perhaps setting/restoring sigmask can be done easily! and for nspawn: IMO we need to receive SIGCHLD which implies EINTR. I say EINTR since not only for blocking read or infinite poll, but perhaps for all the other functions that the parent may do to setup the environment of the container, currently nspawn will set network interfaces before moving them into the container, it will also register the machine, and perhaps other operations... So having EINTR errors is useful here not only for direct reads, but for all the other calls that might block! IOW I think that nspawn should have an empty sig handler for SIGCHLD. Barrier reads already use poll and pipe to handle remote abortion since it can *not* be done by eventfd, yes this is perfect but for nspawn we can also achieve the same by combining eventfd and SICCHLD! What do you think if we make Barrier use: eventfd+pipe and/or eventfd+SIGCHLD ? Most complex fork/clone code should receive SIGCHLD, and think about nspawn! we do want it to be as lightweigh as possible, having 4 fds by default (2 eventfd + heavy pipe) may hit some resource limits quickly! compared to: 2 eventfd + empty sig handler! My first attempt was to use a signalfd on SIGCHLD + edge-triggered. If I don't read from the signalfd and only use it to wake up and wall waitid(WNOWAIT), I won't interfere with other signalfds. However, this wasn't really more lightweight than the pipe-method so i ditched it. Ok. Regarding dropping the pipe: pipe2() is _really_ fast. I mean, we're fork()ing and running like thousands of syscalls just during container setup. I cannot see how dropping one light pipe2 call is beneficial here? We also destroy the pipe before running the real container. So it's really just during setup. Yes, compared to fork() and all the other stuff, pipe2() is fast. My concern was about the other resources that pipe needs and the fd limit. Of course, it depends on nspawn future and plans, 2 or 4 fds sure it will affect systems that will run multiple nspawn instances... but perhaps this is not an issue for nspawn! Otherwise I'm ok with having a pipe as a mechanism to detect container failure, and a good point for general cases: it does not interfere with signal handlers Thanks! -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH 2/3] nspawn: use Barrier API instead of eventfd-util
with priviliges. - * After we got the notification we can make the process - * join its cgroup which might limit what it can do */ -r = eventfd_child_succeeded(eventfds[1]); -eventfds[1] = safe_close(eventfds[1]); - -if (r = 0) { +/* wait for child-setup to be done */ +if (barrier_place_and_sync(barrier)) { int ifi = 0; r = move_network_interfaces(pid); @@ -3458,10 +3445,7 @@ int main(int argc, char *argv[]) { /* Notify the child that the parent is ready with all * its setup, and that the child can now hand over * control to the code to run inside the container. */ -r = eventfd_send_state(eventfds[0], EVENTFD_PARENT_SUCCEEDED); -eventfds[0] = safe_close(eventfds[0]); -if (r 0) -goto finish; +barrier_place(barrier); k = process_pty(master, mask, arg_boot ? pid : 0, SIGRTMIN+3); if (k 0) { diff --git a/src/shared/eventfd-util.c b/src/shared/eventfd-util.c deleted file mode 100644 index 27b7cf7..000 --- a/src/shared/eventfd-util.c +++ /dev/null @@ -1,169 +0,0 @@ -/*-*- Mode: C; c-basic-offset: 8; indent-tabs-mode: nil -*-*/ - -/*** - This file is part of systemd. - - Copyright 2014 Djalal Harouni - - systemd is free software; you can redistribute it and/or modify it - under the terms of the GNU Lesser General Public License as published by - the Free Software Foundation; either version 2.1 of the License, or - (at your option) any later version. - - systemd is distributed in the hope that it will be useful, but - WITHOUT ANY WARRANTY; without even the implied warranty of - MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU - Lesser General Public License for more details. - - You should have received a copy of the GNU Lesser General Public License - along with systemd; If not, see http://www.gnu.org/licenses/. -***/ - -#include assert.h -#include errno.h -#include unistd.h -#include sys/eventfd.h -#include sys/syscall.h - -#include eventfd-util.h -#include util.h - - -/* - * Use this to create processes that need to setup a full context - * and sync it with their parents using cheap mechanisms. - * - * This will create two blocking eventfd(s). A pair for the parent and - * the other for the child so they can be used as a notify mechanism. - * Each process will gets its copy of the parent and child eventfds. - * - * This is useful in case: - * 1) If the parent fails or dies, the child must die. - * 2) Child will install PR_SET_PDEATHSIG as soon as possible. - * 3) Parent and child need to sync using less resources. - * 4) If parent is not able to install a SIGCHLD handler: - *parent will wait using a blocking eventfd_read() or - *eventfd_child_succeeded() call on the child eventfd. - * - ** If the child setup succeeded, child should notify with an - * EVENTFD_CHILD_SUCCEEDED, parent will continue. - ** If the child setup failed, child should notify with an - * EVENTFD_CHILD_FAILED before any _exit(). This avoids blocking - * the parent. - * - * 5) If parent is able to install a SIGCHLD handler: - *An empty signal handler without SA_RESTART will do it, since the - *blocking eventfd_read() or eventfd_parent_succeeded() of the - *parent will be interrupted by SIGCHLD and the call will fail with - *EINTR. This is useful in case the child dies abnormaly and did - *not have a chance to notify its parent using EVENTFD_CHILD_FAILED. - * - * 6) Call wait*() in the main instead of the signal handler in order - *to: 1) reduce side effects and 2) have a better handling for - *child termination in order to reduce various race conditions. - * - * - * The return value of clone_with_eventfd() is the same of clone(). - * On success the eventfds[] will contain the two eventfd(s). These - * file descriptors can be closed later with safe_close(). On failure, - * a negative value is returned in the caller's context, and errno will - * be set appropriately. - * - * Extra preliminary work: - * 1) Child can wait before starting its setup by using the - *eventfd_recv_start() call on the parent eventfd, in that case the - *parent must notify with EVENTFD_START, after doing any preliminary - *work. - * - * Note: this function depends on systemd internal functions - * safe_close() and it should be used only by direct binaries, no - * libraries. - */ -pid_t clone_with_eventfd(int flags, int eventfds[2]) { -pid_t pid; - -assert(eventfds); - -eventfds[0] = eventfd(EVENTFD_INIT, EFD_CLOEXEC
Re: [systemd-devel] How to Listen for SessionRemoved Signal
On Thu, Jul 10, 2014 at 11:18:20PM -0500, Kurt von Laven wrote: Hello folks, Apologies if anybody is getting this message twice; I originally posted on d...@lists.freedesktop.org and was directed here. I am attempting to listen to the SessionRemoved signal from the login manager http://www.freedesktop.org/wiki/Software/systemd/logind/. I have no trouble in the case where the user logs out and waits for ~15 seconds before shutting down, but when the user shuts down the system without first logging out I don't hear the SessionRemoved signal in time. I would've thought that inhibiting shutdown (by calling the Inhibit method of the login manager http://www.freedesktop.org/wiki/Software/systemd/inhibit/) would be the appropriate way to keep my process alive long enough to record the SessionRemoved signal before my process gets killed. Apparently that is not the case though. I am fairly certain that I am actually inhibiting shutdown, because my process shows up when I call ListInhibitors in D-Feet https://wiki.gnome.org/action/show/Apps/DFeet. As Mantas said, this is wrong, your target here is shutdown... and your process should only do small operations to record something or clean things up... So there is the 'PrepareForShutdown' signal, assume SessionRemoved, then release the inhibitor lock ? http://www.freedesktop.org/wiki/Software/systemd/logind/ Thanks -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] kdbus performance regression by ~70% on 3.15 kernels ?
On Tue, Jul 01, 2014 at 09:26:01AM +0200, Daniel Mack wrote: On 06/28/2014 01:55 AM, Steven Noonan wrote: Was going to try to repro this perf regression as well, but instead got kdbus to oops (via test-kdbus-benchmark): $ test/test-kdbus-benchmark -- opening /dev/kdbus/control -- creating bus '1000-testbus' -- opening bus connection /dev/kdbus/1000-testbus/bus -- Our peer ID for /dev/kdbus/1000-testbus/bus: 1 -- bus uuid: 'b65bfdd23d3e4696aae2992a0857aa33' -- opening bus connection /dev/kdbus/1000-testbus/bus -- Our peer ID for /dev/kdbus/1000-testbus/bus: 2 -- bus uuid: 'b65bfdd23d3e4696aae2992a0857aa33' name_acquire(): flags after call: 0x0 Killed $ ... [ 33.558642] Stack: [ 33.558650] a11f45d6 8800c82cff50 0010 [ 33.558681] 7fff80d23800 8800c82cff50 8800c82cfef8 [ 33.558712] 811d386a 7fff80d23800 0010 8803f9d5d400 [ 33.558743] Call Trace: [ 33.558757] [a11f45d6] ? kdbus_memfd_writev+0x66/0xa0 [kdbus] This bug is fixed upstream with 7da2745e (memfd: switch to f_op-{read,write}_iter). This fix was neccessary due to vfs internal refactoring that came in for 3.16-rc1. Yes, Bisecting between v3.15 and 3.16-rc1 can hence be a little tricky. You might have to revert this particular patch for some bisect iterations in case you get build breakage. Indeed, revert and switch to the old aio_{read,write} to compare, since my suspicions points to this API change, and it's not easy to get it right, the old generic_file_aio_{read,write} was removed... so I'm really stuck... but I'll try to continue and try to report. I don't know if memfd needs to do something special, or just convert! I really don't know... It might also be a good idea to investigate whether one specific meta data attachments causes the regression. Ok will do, thanks! Thanks, Daniel -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH] nspawn: When exiting with an error, make the error code meaningful.
Hi, You are right, commit 113cea8 introduced this regression, we guarantee that nspawn returns the exit code of the program executed in the container in case nspwan wont fail. My bad, I missed this point... sorry! Ok will comment on this patch, the other one is wrong, since we mix -errno with exit status, 'status.si_status' contains the low 8 bits of the value returned by child, in other words it is equivalent to WIFEXITED() of waitpid(), so that patch converts any exit status to an -errno code... IMHO this is the right patch. On Sat, Jun 28, 2014 at 12:09:56PM -0400, Luke Shumaker wrote: This is accomplished by having wait_for_container() return a positive error code when we would like that error code to make its way to the user. This is at odds with the CODING_STYLE rule that error codes should be returned as negative values. This is not odd, we already do that! When I did convert to wait_for_container(), I remember I checked all calls to wait_for_terminate() to see what others do, and there was the: src/shared/util.c:wait_for_terminate_and_warn() Which also returns the positive 'status.si_status' code to caller, and it is used in all places... I checked that function, but was played by the fact that if the child fails to setup the container we just fail with _exit(EXIT_FAILURE), no specific code, so I converted to -1 and introduced the regression... :-/ So if you are able to send a v2 of this patch, here are some comments: Please improve the commit log, note the commit that caused the regression, and remove the mention to CODING_STYLE, since we already do this in systemd, and for nspawn: the positive code returned by wait_for_container() is the exit code of the program executed in the container, it can be positive or EXIT_SUCCESS, and this is already documented. It will be easy to buy it this way :-) --- src/nspawn/nspawn.c | 21 + 1 file changed, 13 insertions(+), 8 deletions(-) diff --git a/src/nspawn/nspawn.c b/src/nspawn/nspawn.c index 0a8dc0c..42f939b 100644 --- a/src/nspawn/nspawn.c +++ b/src/nspawn/nspawn.c @@ -2645,12 +2645,17 @@ static int change_uid_gid(char **_home) { } /* - * Return 0 in case the container is being rebooted, has been shut - * down or exited successfully. On failures a negative value is - * returned. + * Return values: + * 0 : The container was terminated by a signal, or failed for an + * unknown reason. No change is made to the container argument. wait_for_container() failed to get the state of the container, the container was terminated by a signal or failed for an unknown reason. + * 0 : The container terminated with an error code, which is the + * return value. No change is made to the container argument. The exit code of the program executed in the container is returned. Quoted from systemd-nspawn man page. + * 0 : The container is being rebooted, has been shut down or exited + * successfully. The container argument has been set to either + * CONTAINER_TERMINATED or CONTAINER_REBOOTED. Nice! * - * The status of the container CONTAINER_TERMINATED or - * CONTAINER_REBOOTED will be saved in the container argument + * That is, success is indicated by a return value of zero, and an + * error is indicated by a non-zero value. */ static int wait_for_container(pid_t pid, ContainerStatus *container) { int r; @@ -2672,7 +2677,6 @@ static int wait_for_container(pid_t pid, ContainerStatus *container) { A minor thing: in wait_for_container() could you add please a log_warning() if wait_for_terminate() fails, as it is done in wait_for_terminate_and_warn(). } else { log_error(Container %s failed with error code %i., arg_machine, status.si_status); -r = -1; } break; @@ -3299,8 +3303,9 @@ check_container_status: r = wait_for_container(pid, container_status); pid = 0; -if (r 0) { -r = EXIT_FAILURE; +if (r != 0) { Add a code comment, if (r 0) explicitly set to EXIT_FAILURE, otherwise return the exit code of the containered process. +if (r 0) +r = EXIT_FAILURE; break; } else if (container_status == CONTAINER_TERMINATED) break; Thank you for the report and the patch! -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH 3/4] connection: use the already cached metadata if KDBUS_HELLO_CACHE_META is set
On Fri, Jun 27, 2014 at 01:02:19PM +0200, Daniel Mack wrote: On 06/27/2014 12:46 PM, Kay Sievers wrote: On Fri, Jun 27, 2014 at 12:32 PM, Djalal Harouni tix...@opendz.org wrote: For connections with the KDBUS_HELLO_CACHE_META flag dup the metadata/credentials from handle or from the HELLO cmd, and use it to construct kdbus kmsg object, this improves benchmark by ~50% The KDBUS_HELLO_CACHE_META flag is only for privileged bus users, others will fail with -EPERM. Privileged bus users can do what ever they want. Metadata contains timestamps, global message sequence numbers, PIDs, none of that should be cached or faked, I think. By no means, even for 'trusted' connections. The entire concept of metadata breaks if we cache things here. Yes, I do agree, that was a quick hack to see how much we gain... The thing is that for privileged processes or connections we already support faking creds and seclabal, and in the kernel there is already support for the no_new_privs bit: https://www.kernel.org/doc/Documentation/prctl/no_new_privs.txt So I was exploring things, don't know if it would worth it to make kdbus smarter and check the no_new_privs bit if set, and cache some fields... Anyway, yes I do realize, providing real time metadata is part of the design and really a nice *race-free* feature. Thanks for the comments! Daniel -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH] nspawn: When exiting with an error, make the error code meaningful.
On Mon, Jun 30, 2014 at 01:54:57AM +0100, Djalal Harouni wrote: On Sun, Jun 29, 2014 at 07:59:38PM -0400, Luke Shumaker wrote: At Sun, 29 Jun 2014 12:31:13 +0100, Djalal Harouni wrote: On Sat, Jun 28, 2014 at 12:09:56PM -0400, Luke Shumaker wrote: This is accomplished by having wait_for_container() return a positive error code when we would like that error code to make its way to the user. This is at odds with the CODING_STYLE rule that error codes should be returned as negative values. This is not odd, we already do that! When I did convert to wait_for_container(), I remember I checked all calls to wait_for_terminate() to see what others do, and there was the: src/shared/util.c:wait_for_terminate_and_warn() Which also returns the positive 'status.si_status' code to caller, and it is used in all places... I missed that wait_for_terminate_and_warn() also did that! With that in consideration, shouldn't the check of the return value from wait_for_terminate_and_warn() at src/quotacheck/quotacheck.c:110 be '== 0' instead of '= 0'? Yes, I'm not familiar with quotacheck, a quick man-page check didn't show anything about return values, so you are right to ask, I would say this is a bug! especially if execv() fails, then we should return EXIT_FAILURE You could send a patch for it too, thanks in advance! So if you are able to send a v2 of this patch, here are some comments: I'm putting one together now; thanks for the feedback! + * Return values: + * 0 : The container was terminated by a signal, or failed for an + * unknown reason. No change is made to the container argument. wait_for_container() failed to get the state of the container, the container was terminated by a signal or failed for an unknown reason. You mean wait_for_terminate()? No, I mean wait_for_container(), wait_for_terminate() is abstracted here, the doc should reflect the definition of the function, or you can just remove it and say we failed to get the state of container... Ok, I saw you just sent the patches, never mind! Thanks! -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [RFC 4/8] HACK0: allow meta information customizable
On Fri, Jun 27, 2014 at 11:24:48AM +0200, Daniel Mack wrote: On 06/27/2014 11:06 AM, AKASHI Takahiro wrote: On 06/27/2014 01:32 AM, Daniel Mack wrote: On 06/26/2014 12:33 PM, AKASHI Takahiro wrote: Assuming that attaching meta is necessary but its also expensive, it might be a good idea to have and check meta info not per message, but per connection. (In this case, we may have to invent higher-level protocol/concept). That's impossible because the information can change during a connection's lifetime, and we make the promise of sending accurate, up-to-date information along with each message, if the user requested it. Caching meta info might be a help, I have no specific idea though. No, that's what I'm saying. We cannot cache anything as any information may change at any time. The rule here is simple: if you want metadata, you have to be aware that they come at a price. If you don't need them, don't request them :) Well, Daniel I think we can do something here! We can use the cached metadata from handle or from HELLO and only for privileged bus users! Privileged bus users can do what ever they want, and per kdbus definition this include normal users which is nice! so we can perhap do that! If you don't trust the bus owner nor the bus! you should not connect to that bus. I've already patches that do this! will send them in minutes, we have metadata for free and bechmark improvment hmm say ~50% Thanks, Daniel ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [RFC 4/8] HACK0: allow meta information customizable
On Fri, Jun 27, 2014 at 11:44:06AM +0200, Daniel Mack wrote: On 06/27/2014 11:33 AM, Djalal Harouni wrote: On Fri, Jun 27, 2014 at 11:24:48AM +0200, Daniel Mack wrote: No, that's what I'm saying. We cannot cache anything as any information may change at any time. The rule here is simple: if you want metadata, you have to be aware that they come at a price. If you don't need them, don't request them :) Well, Daniel I think we can do something here! We can use the cached metadata from handle or from HELLO and only for privileged bus users! Privileged bus users can do what ever they want, and per kdbus definition this include normal users which is nice! so we can perhap do that! Did you consider the fact that a kdbus handle (the fd) can be passed from one userspace task to another? Tasks may also modify their name, drop capabilities, change their effective uid and all the like. Metadata must match the environment in which the _message_ has been generated, regardless what was the case when the bus user originally connected to the bus. Yes, I do agree that metadata *must* match the environment at the moment of generating and sending the message. In the other hand say we have a busy privileged process (that we can trust) that will setup its enviroment then open or make a HELLO cmd, and will not change its capabilities, others unprivileged processes or connections that do connect to this bus will for sure trust the provided metadata and have to! For the kdbus handle and the fd that can be passed from a task to another, yes you have a point. Need to investigate this more, applications should not just connect to any bus, in the other hand I'm making this only available to bus privileged users, so this follows the same scheme of kdbus activators, policy holders, monitors... And also is this really useful ? sure we need to explore it. I've already patches that do this! will send them in minutes, we have metadata for free and bechmark improvment hmm say ~50% Ok. Thanks, Daniel -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 0/4] kdbus: improve benchmark by using cached metadata
Hi, First this is just to investigate things! and if it follows kdbus use cases. This is in the spirit of the late kdbus benchmark threads. Do not apply, just trying to investigate things and if it is really useful. I did benchmarks and the improvement is say ~50%, but did not post numbers, sorry currently I only have kvm guests under hand! After applying the patches, give test-kdbus-benchmark a try (really sorry). Collecting metadata for kdbus is a heavy operation, in the other hand it is very useful and part of kdbus design. To reduce calls to kdbus_meta_append() and the metadata collection overhead, introduce the KDBUS_HELLO_CACHE_META flag that will only be available to privileged bus users. This flag will permit source connections to use the already cached metadata from the handle or from HELLO instead of collecting the metadata on each send operation. This improves benchmark tests by ~50%. This will be available only to privileged bus users, in other words bus owners. Applications that do not trust the bus owner should not connect to that bus, bus owners are able to monitor to set policies and even impersonate somebody else they can provide custom credential data. This will have dramatic performance improvements on cases where you have a privileged connection that performs broadcast, other unprivileged connections are interested on metadata but will not cause the source connection to collect its data on every send/broadcast round. Use cases where you a privileged process that will setup its environment then opens the bus in order to serve multiple connections and perform broadcasting will benefit from this. It allows to provide metadata and reduce latency. Things to investigate is: Is this really useful Security Add tests Thanks! ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 1/4] kdbus: add KDBUS_HELLO_CACHE_META to use the cached metadata
Collecting metadata for kdbus is a heavy operation, in the other hand it is very useful and part of kdbus design. To reduce calls to kdbus_meta_append() and the metadata collection overhead, introduce the KDBUS_HELLO_CACHE_META flag that will only be available to privileged bus users. This flag will permit source connections to use the already cached metadata from the handle or from HELLO instead of collecting the metadata on each send operation. This improves benchmark tests by ~50%. This will be available only to privileged bus users, in other words bus owners. Applications that do not trust the bus owner should not connect to that bus, bus owners are able to monitor to set policies... This will also have dramatic performance improvements on cases where you have a privileged connection that performs broadcast, other unprivileged connections are interested on metadata but will not cause the source connection to collect its data on every broadcast round. This allows to provide metadata and reduce latency. Signed-off-by: Djalal Harouni tix...@opendz.org --- kdbus.h | 6 ++ 1 file changed, 6 insertions(+) diff --git a/kdbus.h b/kdbus.h index 0b189cb..f637203 100644 --- a/kdbus.h +++ b/kdbus.h @@ -486,12 +486,18 @@ enum kdbus_policy_type { * registered with the name database * @KDBUS_HELLO_MONITOR: Special-purpose connection to monitor * bus traffic + * @KDBUS_HELLO_CACHE_META:Instead of collecting and sending the + * metadata of current process, the connection + * will use the cached metadata and credentials + * from the handle or from HELLO. Available + * only to privileged bus users. */ enum kdbus_hello_flags { KDBUS_HELLO_ACCEPT_FD = 1 0, KDBUS_HELLO_ACTIVATOR = 1 1, KDBUS_HELLO_POLICY_HOLDER = 1 2, KDBUS_HELLO_MONITOR = 1 3, + KDBUS_HELLO_CACHE_META = 1 4, }; /** -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 2/4] metadata: add kdbus_meta_memdup() to dup a metadata object
This is a preparation patch thats add kdbus_meta_memdup() to memdup a metadata object. This is useful to connections with a KDBUS_HELLO_CACHE_META flag. Signed-off-by: Djalal Harouni tix...@opendz.org --- metadata.c | 27 +++ metadata.h | 3 ++- 2 files changed, 29 insertions(+), 1 deletion(-) diff --git a/metadata.c b/metadata.c index 0db5392..6516402 100644 --- a/metadata.c +++ b/metadata.c @@ -70,6 +70,33 @@ void kdbus_meta_free(struct kdbus_meta *meta) kfree(meta); } +/** + * kdbus_meta_memdup() - duplicate metadata object + * + * @src: metadata to duplicate + * @meta: New metadata object + * + * Return: 0 on success, negative errno on failure. + */ +int kdbus_meta_memdup(struct kdbus_meta *src, struct kdbus_meta **meta) +{ + struct kdbus_meta *m; + + m = kmemdup(src, sizeof(*m), GFP_KERNEL); + if (!m) + return -ENOMEM; + + m-data = kmemdup(src-data, src-allocated_size, GFP_KERNEL); + if (!m-data) { + kfree(m); + return -ENOMEM; + } + + *meta = m; + + return 0; +} + static struct kdbus_item * kdbus_meta_append_item(struct kdbus_meta *meta, size_t extra_size) { diff --git a/metadata.h b/metadata.h index 77b7303..766dade 100644 --- a/metadata.h +++ b/metadata.h @@ -16,7 +16,7 @@ /** * struct kdbus_meta - metadata buffer * @attached: Flags for already attached data - * @domain:Domain the metadata belongs to + * @domain:Domain the metadata belongs to * @data: Allocated buffer * @size: Number of bytes used * @allocated_size:Size of buffer @@ -35,6 +35,7 @@ struct kdbus_meta { struct kdbus_conn; int kdbus_meta_new(struct kdbus_meta **meta); +int kdbus_meta_memdup(struct kdbus_meta *src, struct kdbus_meta **meta); int kdbus_meta_append_data(struct kdbus_meta *meta, u64 type, const void *buf, size_t len); int kdbus_meta_append(struct kdbus_meta *meta, -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 3/4] connection: use the already cached metadata if KDBUS_HELLO_CACHE_META is set
For connections with the KDBUS_HELLO_CACHE_META flag dup the metadata/credentials from handle or from the HELLO cmd, and use it to construct kdbus kmsg object, this improves benchmark by ~50% The KDBUS_HELLO_CACHE_META flag is only for privileged bus users, others will fail with -EPERM. Privileged bus users can do what ever they want. This will also have a valid use case where you have a privileged process that serves multiple connections and perhaps broadcasting, normal unprivileged connections should just trust what they receive and should not be able to cause extra work for the privileged process. Signed-off-by: Djalal Harouni tix...@opendz.org --- connection.c | 22 +++--- connection.h | 2 +- 2 files changed, 20 insertions(+), 4 deletions(-) diff --git a/connection.c b/connection.c index 85ffa5a..f460ad1 100644 --- a/connection.c +++ b/connection.c @@ -1230,7 +1230,16 @@ int kdbus_conn_kmsg_send(struct kdbus_ep *ep, /* non-kernel senders append credentials/metadata */ if (conn_src) { - ret = kdbus_meta_new(kmsg-meta); + /* +* If KDBUS_HELLO_CACHE_META then dup and use the +* cached metadata/credentials from handle or from +* HELLO cmd, otherwise allocate a new one. +*/ + if (conn_src-flags KDBUS_HELLO_CACHE_META) + ret = kdbus_meta_memdup(conn_src-meta, kmsg-meta); + else + ret = kdbus_meta_new(kmsg-meta); + if (ret 0) return ret; } @@ -1888,9 +1897,16 @@ int kdbus_conn_new(struct kdbus_ep *ep, if (is_activator is_policy_holder) return -EINVAL; - /* only privileged connections can activate and monitor */ + /* +* Only privileged connections can be/do : +* 1) Activators +* 2) Policy holders +* 3) Monitors +* 4) Use the cached metadata +*/ if (!kdbus_bus_uid_is_privileged(bus) - (is_activator || is_policy_holder || is_monitor)) + (is_activator || is_policy_holder || is_monitor || +hello-conn_flags KDBUS_HELLO_CACHE_META)) return -EPERM; KDBUS_ITEMS_FOREACH(item, hello-items, diff --git a/connection.h b/connection.h index 46f7b6e..c8b1e25 100644 --- a/connection.h +++ b/connection.h @@ -47,7 +47,7 @@ * @work: Delayed work to handle timeouts * @match_db: Subscription filter to broadcast messages * @meta: Active connection creator's metadata/credentials, - * either from the handle of from HELLO + * either from the handle or from HELLO * @owner_meta:The connection's metadata/credentials supplied by * HELLO * @pool: The user's buffer to receive messages -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 4/4] kdbus-benchmark: use KDBUS_HELLO_CACHE_META to improve benchmark
Use the new KDBUS_HELLO_CACHE_META flag to improve benchmark by ~50% This reduces latency and allows sending metadata at the same time. Signed-off-by: Djalal Harouni tix...@opendz.org --- test/test-kdbus-benchmark.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/test/test-kdbus-benchmark.c b/test/test-kdbus-benchmark.c index f3b49e1..7d3c6d5 100644 --- a/test/test-kdbus-benchmark.c +++ b/test/test-kdbus-benchmark.c @@ -268,7 +268,8 @@ int main(int argc, char *argv[]) if (!conn_a) return EXIT_FAILURE; - conn_b = kdbus_hello(bus, 0); + /* conn_b = kdbus_hello(bus, 0); */ + conn_b = kdbus_hello(bus, KDBUS_HELLO_CACHE_META); if (!conn_b) return EXIT_FAILURE; -- 1.9.3 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] kdbus performance regression by ~70% on 3.15 kernels ?
Hi, Just to let you know that I did notice a regression by ~70% when running test-kdbus-benchmark on a kvm guest (that's what I've under hands now) I know sorry, but still a kdbus on kvm is a valid case, I don't know if this affects real machine or only kvm guests will be able to confirm it next week unless someone do! If you are able to test it in a real machine and confirm that it affects them too, thank you! I've managed to bisect this to: 3.15.0-rc1 good 3.15.0-rc5 bad I Will continue later this day! Perhaps currently any produced benchmark results for kdbus are not accurate...! Busy kvm guest with a fedora filesystem, kernel 3.15.0-rc1 $ ./test/test-kdbus-benchmark -- opening /dev/kdbus/control -- creating bus '1000-testbus' -- opening bus connection /dev/kdbus/1000-testbus/bus -- Our peer ID for /dev/kdbus/1000-testbus/bus: 1 -- bus uuid: 'beb14de480b240ccaa1574ffac606ca8' -- opening bus connection /dev/kdbus/1000-testbus/bus -- Our peer ID for /dev/kdbus/1000-testbus/bus: 2 -- bus uuid: 'beb14de480b240ccaa1574ffac606ca8' name_acquire(): flags after call: 0x0 -- entering poll loop ... stats: 17927 packets processed, latency (usecs) min/max/avg 42/1277/47 stats: 18152 packets processed, latency (usecs) min/max/avg 42/251/46 stats: 18057 packets processed, latency (usecs) min/max/avg 42/441/46 stats: 18152 packets processed, latency (usecs) min/max/avg 42/300/46 stats: 18100 packets processed, latency (usecs) min/max/avg 42/297/46 stats: 18064 packets processed, latency (usecs) min/max/avg 42/239/46 Busy kvm guest with same fedora filesystem (same processes) but kernel 3.15.0-rc5: $ ./test/test-kdbus-benchmark -- opening /dev/kdbus/control -- creating bus '1000-testbus' -- opening bus connection /dev/kdbus/1000-testbus/bus -- Our peer ID for /dev/kdbus/1000-testbus/bus: 1 -- bus uuid: 'c1433addbdbd4f2e8561c9921d57d112' -- opening bus connection /dev/kdbus/1000-testbus/bus -- Our peer ID for /dev/kdbus/1000-testbus/bus: 2 -- bus uuid: 'c1433addbdbd4f2e8561c9921d57d112' name_acquire(): flags after call: 0x0 -- entering poll loop ... stats: 5590 packets processed, latency (usecs) min/max/avg 113/10274/131 stats: 5507 packets processed, latency (usecs) min/max/avg 113/10314/132 stats: 5550 packets processed, latency (usecs) min/max/avg 113/12435/132 Thanks -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] kdbus performance regression by ~70% on 3.15 kernels ?
On Fri, Jun 27, 2014 at 01:04:00PM +0200, Daniel Mack wrote: On 06/27/2014 12:51 PM, Djalal Harouni wrote: Just to let you know that I did notice a regression by ~70% when running test-kdbus-benchmark on a kvm guest (that's what I've under hands now) I know sorry, but still a kdbus on kvm is a valid case, I don't know if this affects real machine or only kvm guests will be able to confirm it next week unless someone do! If you are able to test it in a real machine and confirm that it affects them too, thank you! I've managed to bisect this to: 3.15.0-rc1 good 3.15.0-rc5 bad I Will continue later this day! Please do. I'm not currently aware of such a regression. What about 3.16-rc2? Will test this afternoon, and report back! Thanks, Daniel -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] kdbus performance regression by ~70% on 3.15 kernels ?
On Fri, Jun 27, 2014 at 12:23:05PM +0100, Djalal Harouni wrote: On Fri, Jun 27, 2014 at 01:04:00PM +0200, Daniel Mack wrote: On 06/27/2014 12:51 PM, Djalal Harouni wrote: Just to let you know that I did notice a regression by ~70% when running test-kdbus-benchmark on a kvm guest (that's what I've under hands now) I know sorry, but still a kdbus on kvm is a valid case, I don't know if this affects real machine or only kvm guests will be able to confirm it next week unless someone do! If you are able to test it in a real machine and confirm that it affects them too, thank you! I've managed to bisect this to: 3.15.0-rc1 good 3.15.0-rc5 bad I Will continue later this day! Please do. I'm not currently aware of such a regression. What about 3.16-rc2? A bit late, sorry! I was wrong on the 3.15.0-rc5 sorry that was a fedora rawhide kernel got confused by the naming and 'rc5'... but yes fedora rawhide affected! so something backported perhaps... Anyway for upstream tests: 3.15.0-rc5 and 3.15.0-rc7 are good 3.16-rc1 and 3.16-rc2 are bad So I confirm there is a regression somewhere. -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] kdbus performance regression by ~70% on 3.15 kernels ?
On Fri, Jun 27, 2014 at 02:28:56PM -0700, Greg KH wrote: On Fri, Jun 27, 2014 at 10:19:03PM +0100, Djalal Harouni wrote: On Fri, Jun 27, 2014 at 12:23:05PM +0100, Djalal Harouni wrote: On Fri, Jun 27, 2014 at 01:04:00PM +0200, Daniel Mack wrote: On 06/27/2014 12:51 PM, Djalal Harouni wrote: Just to let you know that I did notice a regression by ~70% when running test-kdbus-benchmark on a kvm guest (that's what I've under hands now) I know sorry, but still a kdbus on kvm is a valid case, I don't know if this affects real machine or only kvm guests will be able to confirm it next week unless someone do! If you are able to test it in a real machine and confirm that it affects them too, thank you! I've managed to bisect this to: 3.15.0-rc1 good 3.15.0-rc5 bad I Will continue later this day! Please do. I'm not currently aware of such a regression. What about 3.16-rc2? A bit late, sorry! I was wrong on the 3.15.0-rc5 sorry that was a fedora rawhide kernel got confused by the naming and 'rc5'... but yes fedora rawhide affected! so something backported perhaps... Anyway for upstream tests: 3.15.0-rc5 and 3.15.0-rc7 are good 3.16-rc1 and 3.16-rc2 are bad So I confirm there is a regression somewhere. Can you run 'git bisect' on the kernel tree to try to track down the problem commit? Yes of course! I'm planning to do so Thanks! thanks, greg k-h -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] kdbus performance regression by ~70% on 3.15 kernels ?
On Fri, Jun 27, 2014 at 04:55:30PM -0700, Steven Noonan wrote: On Fri, Jun 27, 2014 at 3:14 PM, Djalal Harouni tix...@opendz.org wrote: On Fri, Jun 27, 2014 at 02:28:56PM -0700, Greg KH wrote: On Fri, Jun 27, 2014 at 10:19:03PM +0100, Djalal Harouni wrote: On Fri, Jun 27, 2014 at 12:23:05PM +0100, Djalal Harouni wrote: On Fri, Jun 27, 2014 at 01:04:00PM +0200, Daniel Mack wrote: On 06/27/2014 12:51 PM, Djalal Harouni wrote: Just to let you know that I did notice a regression by ~70% when running test-kdbus-benchmark on a kvm guest (that's what I've under hands now) I know sorry, but still a kdbus on kvm is a valid case, I don't know if this affects real machine or only kvm guests will be able to confirm it next week unless someone do! If you are able to test it in a real machine and confirm that it affects them too, thank you! I've managed to bisect this to: 3.15.0-rc1 good 3.15.0-rc5 bad I Will continue later this day! Please do. I'm not currently aware of such a regression. What about 3.16-rc2? A bit late, sorry! I was wrong on the 3.15.0-rc5 sorry that was a fedora rawhide kernel got confused by the naming and 'rc5'... but yes fedora rawhide affected! so something backported perhaps... Anyway for upstream tests: 3.15.0-rc5 and 3.15.0-rc7 are good 3.16-rc1 and 3.16-rc2 are bad So I confirm there is a regression somewhere. Can you run 'git bisect' on the kernel tree to try to track down the problem commit? Yes of course! I'm planning to do so Thanks! Was going to try to repro this perf regression as well, but instead got kdbus to oops (via test-kdbus-benchmark): $ test/test-kdbus-benchmark -- opening /dev/kdbus/control -- creating bus '1000-testbus' -- opening bus connection /dev/kdbus/1000-testbus/bus -- Our peer ID for /dev/kdbus/1000-testbus/bus: 1 -- bus uuid: 'b65bfdd23d3e4696aae2992a0857aa33' -- opening bus connection /dev/kdbus/1000-testbus/bus -- Our peer ID for /dev/kdbus/1000-testbus/bus: 2 -- bus uuid: 'b65bfdd23d3e4696aae2992a0857aa33' name_acquire(): flags after call: 0x0 Killed $ [ 32.853967] kdbus: initialized [ 33.557785] BUG: unable to handle kernel NULL pointer dereference at (null) [ 33.557819] IP: [ (null)] (null) [ 33.557837] PGD c58a5067 PUD c81cd067 PMD 0 [ 33.557856] Oops: 0010 [#1] SMP [ 33.557870] Modules linked in: kdbus(O) snd_hda_codec_hdmi tun hid_generic snd_hda_codec_realtek snd_hda_codec_generic usbhid hid kvm_amd kvm crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd snd_hda_intel snd_hda_controller radeon microcode snd_hda_codec snd_hwdep broadcom snd_pcm snd_timer serio_raw tg3 fam15h_power snd ttm soundcore libphy edac_core i2c_piix4 edac_mce_amd k10temp tpm_tis tpm acpi_cpufreq wmi evdev processor usbip_host(C) usbip_core(C) ext4 crc16 jbd2 mbcache sd_mod ata_generic pata_acpi crc_t10dif crct10dif_common ahci pata_jmicron libahci pata_atiixp crc32c_intel ehci_pci ehci_hcd xhci_hcd libata firewire_ohci usbcore scsi_mod usb_common firewire_core crc_itu_t i915 video intel_gtt i2c_algo_bit drm_kms_helper [ 33.558231] drm i2c_core e1000e ptp pps_core ipmi_poweroff ipmi_msghandler button [ 33.558267] CPU: 1 PID: 1393 Comm: test-kdbus-benc Tainted: G C O 3.16.0-rc2-ec2-00222-g3493860 #1 [ 33.558335] task: 8803e7811d80 ti: 8800c82cc000 task.ti: 8800c82cc000 [ 33.558364] RIP: 0010:[] [ (null)] (null) [ 33.558398] RSP: 0018:8800c82cfe40 EFLAGS: 00010246 [ 33.558419] RAX: 81636400 RBX: 880406dcfe40 RCX: [ 33.558447] RDX: 0001 RSI: 8800c82cfe88 RDI: 8800c82cfe98 [ 33.558475] RBP: 8800c82cfe78 R08: 7fff80d23810 R09: 8803f9d5cc00 [ 33.558503] R10: 8803e7811d80 R11: 0246 R12: 8800c82cfe98 [ 33.558532] R13: 880406dcfe48 R14: 8800c82cfe88 R15: 0001 [ 33.558566] FS: 7f97e69bf700() GS:88042dc4() knlGS: [ 33.558595] CS: 0010 DS: ES: CR0: 8005003b [ 33.558616] CR2: CR3: 36e2 CR4: 000407e0 [ 33.558642] Stack: [ 33.558650] a11f45d6 8800c82cff50 0010 [ 33.558681] 7fff80d23800 8800c82cff50 8800c82cfef8 [ 33.558712] 811d386a 7fff80d23800 0010 8803f9d5d400 [ 33.558743] Call Trace: [ 33.558757] [a11f45d6] ? kdbus_memfd_writev+0x66/0xa0 [kdbus] [ 33.558785] [811d386a] do_sync_write+0x5a
Re: [systemd-devel] [PATCH 07/12] policy: use the db-entries_hash to access the policy db entries
Hi, On Fri, Jun 20, 2014 at 07:12:13PM +0100, Djalal Harouni wrote: On Fri, Jun 20, 2014 at 08:01:04PM +0200, Daniel Mack wrote: On 06/20/2014 07:28 PM, Daniel Mack wrote: On 06/20/2014 06:50 PM, Djalal Harouni wrote: Use the db-entries_hash to access the policy db entries instead of the db-send_access_hash which is just a cache for send entries. Ah, you're purging the other entries in patch #12. Alright then, now it makes sense. Indeed, I've tested it and the cache is cleared. Applied both #7 and #9 now. Thank you Daniel, There is a still another series related to the cache I just need to test it. When we update the TALK POLICY of a connection that is already referenced as a *destination* in the cache we must purge all its entries from the cache, since permission have been changed! we need to redo the permission checks. This should be done in kdbus_policy_set() when we update the TALK policy but since kdbus_policy_set() can be called by an endpoint as an owner, not only a connection, I did split the code in a new function... Anyway I'll test it send it tomorrow. Just an update: I'm still working on the TALK cache issue, I've added the infrastracture to clear the TALK cache (db-send_access_hash) when we update the TALK policy of the destination connection, yes we need to clear the previously cached entries where it is listed as destination since its TALK permission have been changed (issue #1). However we don't want to clear *all* the cache, only entries that apply to the newly updated policies! Since policy holders connections might have several policies and multiple names, the best solution would be naturaly to clear only cached entries related to the updated policy and name! and keep other cached entries related to other policies and names... while working on it, I hit another two bugs: Issue #2: currently it seems we can't have a policy holder connection with multiple policies and names! This check: https://code.google.com/p/d-bus/source/browse/connection.c#1908 Will prevent policy holders to register multiple names and policies, it should prevent activators, since we have a one name per activator! but not policy holders. So I've removed this check, I've added the necessary test infrastructure to be able to register multiple policies and names, but I'm stuck in which I think a kdbus item alignment problem! I'll try to fix and start with this series! Issue #3 (not confirmed) related to issue #2 and issue #1 of course is that in function: __kdbus_policy_check_talk_access() https://code.google.com/p/d-bus/source/browse/policy.c#254 That function do not receive the name of the policy to check TALK access against! the other OWN and SEE check access functions do, but not the 'TALK' one. The current behaviour will walk the 'conn_dst-names_list' and perform the TALK access against all names, if one name allows the TALK then we have access! but in that case perhaps the source connection was aiming to send to a different name which restrcit access, not this one! So IMHO that function and all its callers should be updated to receive the name of the destination as a third parameter, and the 'name' should also be saved in the TALK cache, this way we are sure that the cached entries belong to the name being checked! So we have two issues here. So what do you think Daniel, Kay ? (Hope I'm not getting it wrong) I'll try to have tests for issue #2 ASAP, and will probably handle issue #1 since I've already added the infrastructure, and if you confirm issue #3 I'll perhaps also work on it later too (if you don't beat me :-) Thank you! -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
Re: [systemd-devel] [PATCH 0/12] kdbus: policy tests and fixes
Hi Daniel, (Really sorry for my late response, my university/research job...) On Fri, Jun 20, 2014 at 07:21:25PM +0200, Daniel Mack wrote: Hi Djalal, On 06/20/2014 06:49 PM, Djalal Harouni wrote: This series adds the test-kdbus-policy test. The first patches are prepration then you have the test. Later there are several fixes and improvments, I've performed all the tests with success. Very nice, thanks a lot for doing this! You are welcome! just reporting and fixing kdbus+systemd things that I find... I'll comment on the individual patches. Thank you for applying the series. I still have another series which deals with the send access cache, will send it soon, or perhaps tomorrow it should go on top of this. Ok, great. Please Kay, Daniel allow me this question: The policy holders are just connections that register policy entries! They dont register names, so the registered policy entry wont take any effect unless you acquire (register into database) its name ! That's correct. The idea here is to close the gap between name acquisition and the policy being applied, and the owner of a name should not be the same instance that decides who's allowed to own it, who may talk to it or see it. Ok, I see that! that permits to reserve policies without races! X registers (reserve name) the policy, but later Y is the real owner... In the same time it will allow the *same* instance to own a name and to decides who's allowed to own it, access it... IIUC since currently we can do that. Likewise, a connection can only own a name on the system bus if there's a policy rule that allows just that, and the rule has to be added beforehand by the bus owner. Ok, that's nice! It also allow other general purpose use cases where you need kdbus features and at the same time bus separation. We need here two operations: 1) register as a policy holder 2) acquire the name to be able to send to that name and to activate the policy rules. Is this the intended behaviour ? Yes, exactly, and installing a policy is a privileged operation. We thought a lot about the design here, and I think this is a good and clean solution. Did you understand that right away? Is there anything Yes, installing a policy holder or an activator For the activator that was easy, for the policy holder I think we need to document this in the kdbus.h perhaps in the 'KDBUS_HELLO_POLICY_HOLDER' section add: and to activate that name... and to be able to send to it you must acquire it. ? illogical about the idea you're concerned about? We're open to suggestions. After all, the code is not yet in production :) Thank you for the explanation. Yes I thought that policy holders are like activators, only activators are for systemd and policy holders are for all other cases! So I was wondering why I can't reach the policy holder and its name? I misunderstood the concept at first, then I found we need to acquire that name to activate it! and the acquire name operation will be subject to the OWN access policy. So this is clear now, and it will for sure prevent own names races! that's good design. So perhaps just add small note into the 'KDBUS_HELLO_POLICY_HOLDER' section ? I was also worried about spoofing activators and policies but hopefully it seems there is no chance for that. Unrelated notes: There is also some concern about LSM! I hope that kdbus will stay clean. Ah and kdbus as it is now, can fit other general purpose use cases, not only D-Bus... one example: Send a *big* chunk of data using kdbus *features* + *metadata*, but later receiver can just *drop* it if it does not fit for an X reason! with kdbus this is a cheap operation... I should also mention this: There is also another hidden positive point for kdbus, by using -ioctl() instead of -read() or -write() file operations you close some windows against creds and privileges changes! Currently there are checks against *current* creds at the time of the ioctl() call! However we all know that there are some setuid programs that read()/write() from stdin/stdout... which can be dupped by unprivileged parent, thus if the unprivileged parent is *able* to -open() and *not* able to -read()/write() he will dup fd and pass it to the exec suid program to pass privileges checks and do -read()/write(). So some kernel interfaces should worry about these cases! and the only protection should be to do checks against *creds* that *opened* the file descriptor not against *current* creds which may point to the privileged setuid process. In the other hand you dont find setuid programs that do ioclt() on stdin/stdout... that would be stupid! So being a char device with its own ioctl() just makes things simpler. Thanks Thanks, Daniel -- Djalal Harouni http://opendz.org ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 0/12] kdbus: policy tests and fixes
Hi, This series adds the test-kdbus-policy test. The first patches are prepration then you have the test. Later there are several fixes and improvments, I've performed all the tests with success. I still have another series which deals with the send access cache, will send it soon, or perhaps tomorrow it should go on top of this. Ah and there is also another series which discuss some policy design... probably next week when time permits. Please Kay, Daniel allow me this question: The policy holders are just connections that register policy entries! They dont register names, so the registered policy entry wont take any effect unless you acquire (register into database) its name ! We need here two operations: 1) register as a policy holder 2) acquire the name to be able to send to that name and to activate the policy rules. Is this the intended behaviour ? Thank you! ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 02/12] test: make msg_send() return -errno
Make msg_send() return negative errno not EXIT_FAILURE. We need this since we will add tests that fork() and drop privileges and we want all the error codes, especially -EPERM. After a quick grep it seems the return value of msg_send() is never used, so it is safe to convert it, this wont break any test. Signed-off-by: Djalal Harouni tix...@opendz.org --- test/kdbus-util.c | 15 ++- 1 file changed, 10 insertions(+), 5 deletions(-) diff --git a/test/kdbus-util.c b/test/kdbus-util.c index 988aa8c..b7dc057 100644 --- a/test/kdbus-util.c +++ b/test/kdbus-util.c @@ -189,20 +189,23 @@ int msg_send(const struct conn *conn, strcpy(m.name, my-name-is-nice); ret = ioctl(conn-fd, KDBUS_CMD_MEMFD_NEW, m); if (ret 0) { + ret = -errno; fprintf(stderr, KDBUS_CMD_MEMFD_NEW failed: %m\n); - return EXIT_FAILURE; + return ret; } memfd = m.cmd.fd; if (write(memfd, kdbus memfd 1234567, 19) != 19) { + ret = -errno; fprintf(stderr, writing to memfd failed: %m\n); - return EXIT_FAILURE; + return ret; } ret = ioctl(memfd, KDBUS_CMD_MEMFD_SEAL_SET, true); if (ret 0) { + ret = -errno; fprintf(stderr, memfd sealing failed: %m\n); - return EXIT_FAILURE; + return ret; } size += KDBUS_ITEM_SIZE(sizeof(struct kdbus_memfd)); @@ -213,8 +216,9 @@ int msg_send(const struct conn *conn, msg = malloc(size); if (!msg) { + ret = -errno; fprintf(stderr, unable to malloc()!?\n); - return EXIT_FAILURE; + return ret; } memset(msg, 0, size); @@ -269,8 +273,9 @@ int msg_send(const struct conn *conn, ret = ioctl(conn-fd, KDBUS_CMD_MSG_SEND, msg); if (ret 0) { + ret = -errno; fprintf(stderr, error sending message: %d err %d (%m)\n, ret, errno); - return EXIT_FAILURE; + return ret; } if (memfd = 0) -- 1.9.0 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 03/12] test: add simple helper to drop privileges
This is needed since we will add tests to fork() + drop privileges Signed-off-by: Djalal Harouni tix...@opendz.org --- test/kdbus-util.c | 29 + test/kdbus-util.h | 1 + 2 files changed, 30 insertions(+) diff --git a/test/kdbus-util.c b/test/kdbus-util.c index b7dc057..965c95d 100644 --- a/test/kdbus-util.c +++ b/test/kdbus-util.c @@ -20,6 +20,7 @@ #include errno.h #include assert.h #include poll.h +#include grp.h #include sys/ioctl.h #include sys/mman.h @@ -625,3 +626,31 @@ void add_match_empty(int fd) if (ret 0) fprintf(stderr, --- error adding conn match: %d (%m)\n, ret); } + +int drop_privileges(uid_t uid, gid_t gid) +{ + int ret; + + ret = setgroups(0, NULL); + if (ret 0) { + ret = -errno; + fprintf(stderr, error setgroups: %d (%m)\n, ret); + return ret; + } + + ret = setresgid(gid, gid, gid); + if (ret 0) { + ret = -errno; + fprintf(stderr, error setresgid: %d (%m)\n, ret); + return ret; + } + + ret = setresuid(uid, uid, uid); + if (ret 0) { + ret = -errno; + fprintf(stderr, error setresuid: %d (%m)\n, ret); + return ret; + } + + return ret; +} diff --git a/test/kdbus-util.h b/test/kdbus-util.h index 9771622..dd7d7b6 100644 --- a/test/kdbus-util.h +++ b/test/kdbus-util.h @@ -55,3 +55,4 @@ struct kdbus_item *make_policy_name(const char *name); struct kdbus_item *make_policy_access(__u64 type, __u64 bits, __u64 id); void add_match_empty(int fd); +int drop_privileges(uid_t uid, gid_t gid); -- 1.9.0 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 04/12] test: add conn_update() to test KDBUS_CMD_CONN_UPDATE
Add conn_update() to perform KDBUS_CMD_CONN_UPDATE ioctl() calls. Signed-off-by: Djalal Harouni tix...@opendz.org --- test/kdbus-util.c | 74 +++ test/kdbus-util.h | 4 +++ 2 files changed, 78 insertions(+) diff --git a/test/kdbus-util.c b/test/kdbus-util.c index 965c95d..6bb3bbf 100644 --- a/test/kdbus-util.c +++ b/test/kdbus-util.c @@ -1,6 +1,7 @@ /* * Copyright (C) 2013 Daniel Mack * Copyright (C) 2013 Kay Sievers + * Copyright (C) 2014 Djalal Harouni * * kdbus is free software; you can redistribute it and/or modify it under * the terms of the GNU Lesser General Public License as published by the @@ -606,6 +607,79 @@ int name_list(struct conn *conn, uint64_t flags) return 0; } +int conn_update(struct conn *conn, const char *name, + const struct kdbus_policy_access *access, + size_t num_access, uint64_t flags) +{ + struct kdbus_cmd_update *update; + struct kdbus_item *item; + size_t i, size; + int ret; + + size = sizeof(struct kdbus_cmd_update); + size += KDBUS_ITEM_SIZE(sizeof(uint64_t)); + size += KDBUS_ITEM_SIZE(strlen(name) + 1); + size += num_access * KDBUS_ITEM_SIZE(sizeof(struct kdbus_policy_access)); + + update = malloc(size); + if (!update) { + ret = -errno; + fprintf(stderr, error malloc: %d (%m)\n, ret); + return ret; + } + + memset(update, 0, size); + update-size = size; + + item = update-items; + + /* +* normally having flags == 0 is valid, but just keep +* HELLO flags of __kdbus_hello(), don't check them. +*/ + item-type = KDBUS_ITEM_ATTACH_FLAGS; + item-size = KDBUS_ITEM_HEADER_SIZE + sizeof(uint64_t); + item-data64[0] = flags ? flags : KDBUS_ATTACH_TIMESTAMP | + KDBUS_ATTACH_CREDS | + KDBUS_ATTACH_NAMES | + KDBUS_ATTACH_COMM | + KDBUS_ATTACH_EXE | + KDBUS_ATTACH_CMDLINE | + KDBUS_ATTACH_CAPS | + KDBUS_ATTACH_CGROUP | + KDBUS_ATTACH_SECLABEL | + KDBUS_ATTACH_AUDIT | + KDBUS_ATTACH_CONN_NAME; + item = KDBUS_ITEM_NEXT(item); + + item-type = KDBUS_ITEM_NAME; + item-size = KDBUS_ITEM_HEADER_SIZE + strlen(name) + 1; + strcpy(item-str, name); + item = KDBUS_ITEM_NEXT(item); + + for (i = 0; i num_access; i++) { + item-size = KDBUS_ITEM_HEADER_SIZE + +sizeof(struct kdbus_policy_access); + item-type = KDBUS_ITEM_POLICY_ACCESS; + + item-policy_access.type = access[i].type; + item-policy_access.access = access[i].access; + item-policy_access.id = access[i].id; + + item = KDBUS_ITEM_NEXT(item); + } + + ret = ioctl(conn-fd, KDBUS_CMD_CONN_UPDATE, update); + if (ret 0) { + ret = -errno; + fprintf(stderr, error conn update: %d (%m)\n, ret); + } + + free(update); + + return ret; +} + void add_match_empty(int fd) { struct { diff --git a/test/kdbus-util.h b/test/kdbus-util.h index dd7d7b6..0fcfb72 100644 --- a/test/kdbus-util.h +++ b/test/kdbus-util.h @@ -53,6 +53,10 @@ struct conn *kdbus_hello_activator(const char *path, const char *name, size_t num_access); struct kdbus_item *make_policy_name(const char *name); struct kdbus_item *make_policy_access(__u64 type, __u64 bits, __u64 id); +int conn_update(struct conn *conn, const char *name, + const struct kdbus_policy_access *access, + size_t num_access, uint64_t flags); + void add_match_empty(int fd); int drop_privileges(uid_t uid, gid_t gid); -- 1.9.0 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 05/12] test: add the test-kdbus-policy test
Add the test-kdbus-policy that performs: 1) Register a policy holder 2) Try to register the same name as an activator: kdbus will break here due to a corrupted db-entries_hash in kdbus_policy_set(). Will be fixed in the next patches. 3) Acquire the name for the policy holder 4) Create and test the connections 5) Fork and drop privileges, then create and test connections which should be cached in the send cache after the tests. Here we inspect the send cache and we have located several bugs, which will be fixed in the next patches. 6) Call kdbus_set_policy_talk() to test KDBUS_CMD_CONN_UPDATE ioctl and restrict TALK access to KDBUS_POLICY_ACCESS_USER 7) Redo test 5), now connections should all fail with -EPERM since TALK access was restricted. To test this we need the right capabilities to perform the setuid() and drop privileges, so this test just check if it was exec by root. Signed-off-by: Djalal Harouni tix...@opendz.org --- .gitignore | 1 + test/Makefile| 3 +- test/test-kdbus-policy.c | 456 +++ 3 files changed, 459 insertions(+), 1 deletion(-) create mode 100644 test/test-kdbus-policy.c diff --git a/.gitignore b/.gitignore index f441632..e2bdc63 100644 --- a/.gitignore +++ b/.gitignore @@ -18,3 +18,4 @@ test/test-kdbus-chat test/test-kdbus-timeout test/test-kdbus-prio test/test-kdbus-sync +test/test-kdbus-policy diff --git a/test/Makefile b/test/Makefile index 83cb736..f8117c8 100644 --- a/test/Makefile +++ b/test/Makefile @@ -20,7 +20,8 @@ TESTS= \ test-kdbus-chat \ test-kdbus-timeout \ test-kdbus-sync \ - test-kdbus-prio + test-kdbus-prio \ + test-kdbus-policy all: $(TESTS) diff --git a/test/test-kdbus-policy.c b/test/test-kdbus-policy.c new file mode 100644 index 000..6099087 --- /dev/null +++ b/test/test-kdbus-policy.c @@ -0,0 +1,456 @@ +/* + * Copyright (C) 2014 Djalal Harouni + * + * kdbus is free software; you can redistribute it and/or modify it under + * the terms of the GNU Lesser General Public License as published by the + * Free Software Foundation; either version 2.1 of the License, or (at + * your option) any later version. + */ + +#include stdio.h +#include string.h +#include fcntl.h +#include pthread.h +#include poll.h +#include stdlib.h +#include stddef.h +#include stdint.h +#include stdbool.h +#include unistd.h +#include errno.h +#include signal.h +#include sys/wait.h +#include sys/prctl.h +#include sys/ioctl.h + +#include kdbus-util.h +#include kdbus-enum.h + +#define MAX_CONN 64 +#define POLICY_NAMEfoo.test.policy-test + + +/** + * The purpose of these tests: + * 1) Check KDBUS_POLICY_TALK + * 2) Check the cache state: kdbus_policy_db-send_access_hash + * Should be extended + */ + +/** + * Check a list of connections against conn_db[0] + * conn_db[0] will be the policy holder and it will set + * different policy accesses. + */ +static struct conn **conn_db; + +void kdbus_free_conn(struct conn *conn) +{ + if (conn) { + close(conn-fd); + free(conn); + } +} + +/* Trigger kdbus_policy_set() */ +static int kdbus_set_policy_talk(struct conn *conn, +const char *name, +uid_t id, unsigned int type) +{ + struct kdbus_policy_access access = { + .type = type, + .id = id, + .access = KDBUS_POLICY_TALK, + }; + + return conn_update(conn, name, access, 1, 0); +} + +/* The policy access will be stored in a policy holder connection */ +static int kdbus_register_activator(char *bus, const char *name, + struct conn **c) +{ + struct conn *activator; + + activator = kdbus_hello_activator(bus, name, NULL, 0); + if (!activator) + return -errno; + + *c = activator; + + return 0; +} + +static int kdbus_register_policy_holder(char *bus, const char *name, + struct conn **conn) +{ + struct conn *c; + struct kdbus_policy_access access[2]; + + access[0].type = KDBUS_POLICY_ACCESS_USER; + access[0].access = KDBUS_POLICY_OWN; + access[0].id = geteuid(); + + access[1].type = KDBUS_POLICY_ACCESS_WORLD; + access[1].access = KDBUS_POLICY_TALK; + access[1].id = geteuid(); + + c = kdbus_hello_registrar(bus, name, access, 2, + KDBUS_HELLO_POLICY_HOLDER); + if (!c) + return -errno; + + *conn = c; + + return 0; +} + +static void *kdbus_recv_echo(void *ptr) +{ + int ret; + int cnt = 3; + struct pollfd fd; + struct conn *conn = ptr; + + fd.fd = conn-fd; + fd.events = POLLIN | POLLPRI | POLLHUP; + fd.revents = 0; + + while (cnt) { + cnt--; + ret = poll(fd, 1, 2000); + if (ret
[systemd-devel] [PATCH 06/12] connection: update attach_flags only if KDBUS_ITEM_ATTACH_FLAGS is provided
Fix a bug introcuded in commit d92d68414fab which fixed another bug. conn-attach_flags should only be update if KDBUS_ITEM_ATTACH_FLAGS was provided. Signed-off-by: Djalal Harouni tix...@opendz.org --- connection.c | 5 - 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/connection.c b/connection.c index 3e8c5de..542f677 100644 --- a/connection.c +++ b/connection.c @@ -1784,6 +1784,7 @@ int kdbus_cmd_conn_update(struct kdbus_conn *conn, { const struct kdbus_item *item; bool policy_provided = false; + bool flags_provided = false; u64 attach_flags = 0; int ret; @@ -1795,6 +1796,7 @@ int kdbus_cmd_conn_update(struct kdbus_conn *conn, switch (item-type) { case KDBUS_ITEM_ATTACH_FLAGS: + flags_provided = true; attach_flags = item-data64[0]; break; case KDBUS_ITEM_NAME: @@ -1807,7 +1809,8 @@ int kdbus_cmd_conn_update(struct kdbus_conn *conn, if (!KDBUS_ITEMS_END(item, cmd-items, KDBUS_ITEMS_SIZE(cmd, items))) return -EINVAL; - conn-attach_flags = attach_flags; + if (flags_provided) + conn-attach_flags = attach_flags; if (!policy_provided) return 0; -- 1.9.0 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 07/12] policy: use the db-entries_hash to access the policy db entries
Use the db-entries_hash to access the policy db entries instead of the db-send_access_hash which is just a cache for send entries. Signed-off-by: Djalal Harouni tix...@opendz.org --- policy.c | 7 --- 1 file changed, 4 insertions(+), 3 deletions(-) diff --git a/policy.c b/policy.c index bf49f68..79d6fa4 100644 --- a/policy.c +++ b/policy.c @@ -373,7 +373,7 @@ static void __kdbus_policy_remove_owner(struct kdbus_policy_db *db, struct hlist_node *tmp; int i; - hash_for_each_safe(db-send_access_hash, i, tmp, e, hentry) + hash_for_each_safe(db-entries_hash, i, tmp, e, hentry) if (e-owner == owner) { hash_del(e-hentry); kdbus_policy_entry_free(e); @@ -394,7 +394,8 @@ void kdbus_policy_remove_owner(struct kdbus_policy_db *db, } /** - * kdbus_policy_remove_conn() - remove all entries related to a connection + * kdbus_policy_remove_conn() - remove all cached entries related to + * a connection * @db:The policy database * @conn: The connection which items to remove */ @@ -482,7 +483,7 @@ int kdbus_policy_set(struct kdbus_policy_db *db, * At the same time, the lookup mechanism won't find any collisions * when looking for already exising names. */ - hash_for_each_safe(db-send_access_hash, i, tmp, e, hentry) + hash_for_each_safe(db-entries_hash, i, tmp, e, hentry) if (e-owner == owner) { struct kdbus_policy_list_entry *l; -- 1.9.0 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 08/12] policy: kdbus_policy_set() make sure we restore the right entries
If kdbus_policy_set() fails we try to restore the entries that were previously saved in a list, however due to a typo that logic was trying to access a previously freed entry which will just break things... So fix the typo 'l-e' instead of 'e'. This fixes a bug triggered by test-kdbus-policy and makes the code able to restore previously saved entries in case of errors. Signed-off-by: Djalal Harouni tix...@opendz.org --- policy.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/policy.c b/policy.c index 79d6fa4..9cf7f67 100644 --- a/policy.c +++ b/policy.c @@ -601,7 +601,7 @@ exit: /* restore original entries from list */ list_for_each_entry_safe(l, l_tmp, list, entry) { - kdbus_policy_add_one(db, e); + kdbus_policy_add_one(db, l-e); kfree(l); } } -- 1.9.0 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 09/12] policy: kdbus_policy_set() use another variable to save entries
In kdbus_policy_set() function, we use the 'e' variable to reference each entry of the 'db-entries_hash', so at the end the variable 'e' will for sure point to a valid one. Next in the KDBUS_ITEMS_FOREACH() iterator and if we fail at the first KDBUS_ITEM_VALID() test, we jmp to exit: Which contains the following: if (e) kdbus_policy_entry_free(e); Here 'e' points to a valid entry and it will be freed, so even we restore all the other entries from that list, there will be always one missing, the last one pointed by that 'e' variable. To fix this, just use another 'tmp_entry' variable to reference hash entries. Signed-off-by: Djalal Harouni tix...@opendz.org --- policy.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/policy.c b/policy.c index 9cf7f67..601d2a8 100644 --- a/policy.c +++ b/policy.c @@ -467,6 +467,7 @@ int kdbus_policy_set(struct kdbus_policy_db *db, const void *owner) { struct kdbus_policy_db_entry *e = NULL; + struct kdbus_policy_db_entry *tmp_entry = NULL; struct kdbus_policy_db_entry_access *a; const struct kdbus_item *item; struct hlist_node *tmp; @@ -483,8 +484,8 @@ int kdbus_policy_set(struct kdbus_policy_db *db, * At the same time, the lookup mechanism won't find any collisions * when looking for already exising names. */ - hash_for_each_safe(db-entries_hash, i, tmp, e, hentry) - if (e-owner == owner) { + hash_for_each_safe(db-entries_hash, i, tmp, tmp_entry, hentry) + if (tmp_entry-owner == owner) { struct kdbus_policy_list_entry *l; l = kzalloc(sizeof(*l), GFP_KERNEL); @@ -493,9 +494,9 @@ int kdbus_policy_set(struct kdbus_policy_db *db, goto exit; } - l-e = e; + l-e = tmp_entry; list_add_tail(l-entry, list); - hash_del(e-hentry); + hash_del(tmp_entry-hentry); } /* Walk the list of items and look for new policies */ -- 1.9.0 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 10/12] policy: kdbus_policy_set() fix a use after free bug
If we try to register the same name twice then kdbus_policy_add_one() will fail with -EEXIST In kdbus_policy_set() we have two calls to kdbus_policy_add_one() if they fail we clean things up with kdbus_policy_entry_free(), but since we failed ret == -EEXIST ,we hit the error path where we have another: if (e) kdbus_policy_entry_free(e); We have a use after free bug here, Since 'e' is freed but never set to NULL. To fix this we can set that 'e' to NULL after each kdbus_policy_entry_free() call, but it is better to just clean things up in a one place, in the error path and remove the other kdbus_policy_entry_free() calls. Thix fixes the bug triggered by test-kdbus-policy when we try to register the same name twice. Signed-off-by: Djalal Harouni tix...@opendz.org --- policy.c | 9 ++--- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/policy.c b/policy.c index 601d2a8..d75c2ef 100644 --- a/policy.c +++ b/policy.c @@ -512,10 +512,8 @@ int kdbus_policy_set(struct kdbus_policy_db *db, if (e) { ret = kdbus_policy_add_one(db, e); - if (ret 0) { - kdbus_policy_entry_free(e); + if (ret 0) goto exit; - } } if (max_policies ++count max_policies) { @@ -584,11 +582,8 @@ int kdbus_policy_set(struct kdbus_policy_db *db, goto exit; } - if (e) { + if (e) ret = kdbus_policy_add_one(db, e); - if (ret 0) - kdbus_policy_entry_free(e); - } exit: if (ret 0) { -- 1.9.0 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel
[systemd-devel] [PATCH 11/12] policy: kdbus_policy_check_own_access() returns 0 on success not true
kdbus_policy_check_own_access() returns 0 if access is granted, otherwise a negative errno. So fix this by returning 0. We did not hit this since callers were checking negative values for errors. Signed-off-by: Djalal Harouni tix...@opendz.org --- policy.c | 13 +++-- 1 file changed, 11 insertions(+), 2 deletions(-) diff --git a/policy.c b/policy.c index d75c2ef..58ab6a5 100644 --- a/policy.c +++ b/policy.c @@ -231,7 +231,7 @@ static int kdbus_policy_check_access(const struct kdbus_policy_db_entry *e, * @conn: The connection to check * @name: The name to check * - * Return: t0 if the connection is allowed to own the name, -EPERM otherwise + * Return: 0 if the connection is allowed to own the name, -EPERM otherwise */ int kdbus_policy_check_own_access(struct kdbus_policy_db *db, const struct kdbus_conn *conn, @@ -307,8 +307,17 @@ int kdbus_policy_check_talk_access(struct kdbus_policy_db *db, unsigned int hash = 0; int ret; + /* +* user-uid maps to a fsuid at the time of a KDBUS_CMD_HELLO +* cmd, if they equal allow the TALK access, otherwise we +* proceed and perform checks against current's cred. +* +* By using the user-uid check first we reduce the exposure to +* creds changes. Privileged processes should be careful about +* what to do with a file descriptor. +*/ if (uid_eq(conn_src-user-uid, conn_dst-user-uid)) - return true; + return 0; /* * If there was a positive match for these two connections before, -- 1.9.0 ___ systemd-devel mailing list systemd-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/systemd-devel