On 3/16/23 13:00, Aaron Conole wrote:
> Open vSwitch generally tries to let the underlying operating system
> managed the low level details of hardware, for example DMA mapping,
> bus arbitration, etc. However, when using DPDK, the underlying
> operating system yields control of many of these details to userspace
> for management.
>
> In the case of some DPDK port drivers, configuring rte_flow or even
> allocating resources may require access to iopl/ioperm calls, which
> are guarded by the CAP_SYS_RAWIO privilege on linux systems. These
> calls are dangerous, and can allow a process to completely compromise
> a system. However, they are needed in the case of some userspace
> driver code which manages the hardware (for example, the mlx
> implementation of backend support for rte_flow).
>
> Here, we create an opt-in flag passed to the command line to allow
> this access. We need to do this before ever accessing the database,
> because we want to drop all privileges asap, and cannot wait for
> a connection to the database to be established and functional before
> dropping. There may be distribution specific ways to do capability
> management as well (using for example, systemd), but they are not
> as universal to the vswitchd as a flag.
>
> Reviewed-by: Simon Horman <[email protected]>
> Signed-off-by: Aaron Conole <[email protected]>
> ---
> NEWS | 4 ++++
> lib/daemon-unix.c | 31 +++++++++++++++++++++++--------
> lib/daemon-windows.c | 6 ++++--
> lib/daemon.c | 2 +-
> lib/daemon.h | 4 ++--
> ovsdb/ovsdb-client.c | 6 +++---
> ovsdb/ovsdb-server.c | 4 ++--
> tests/test-netflow.c | 2 +-
> tests/test-sflow.c | 2 +-
> tests/test-unixctl.c | 2 +-
> utilities/ovs-ofctl.c | 4 ++--
> utilities/ovs-testcontroller.c | 4 ++--
> vswitchd/ovs-vswitchd.8.in | 9 +++++++++
> vswitchd/ovs-vswitchd.c | 11 ++++++++++-
> 14 files changed, 65 insertions(+), 26 deletions(-)
>
> diff --git a/NEWS b/NEWS
> index 72b9024e6d..8771ee618a 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -17,6 +17,10 @@ Post-v3.1.0
> in order to create OVSDB sockets with access mode of 0770.
> - QoS:
> * Added new configuration option 'jitter' for a linux-netem QoS type.
> + - DPDK:
> + * ovs-vswitchd will keep the CAP_SYS_RAWIO capability when started
> + with the --hw-rawio-access command line option. This allows the
> + process extra privileges when mapping physical interconnect memory.
>
>
> v3.1.0 - 16 Feb 2023
> diff --git a/lib/daemon-unix.c b/lib/daemon-unix.c
> index 1a7ba427d7..5ec75106bc 100644
> --- a/lib/daemon-unix.c
> +++ b/lib/daemon-unix.c
> @@ -88,7 +88,8 @@ static bool switch_user = false;
> static uid_t uid;
> static gid_t gid;
> static char *user = NULL;
> -static void daemon_become_new_user__(bool access_datapath);
> +static void daemon_become_new_user__(bool access_datapath,
> + bool access_hardware_ports);
>
> static void check_already_running(void);
> static int lock_pidfile(FILE *, int command);
> @@ -443,13 +444,13 @@ monitor_daemon(pid_t daemon_pid)
> * daemonize_complete()) or that it failed to start up (by exiting with a
> * nonzero exit code). */
> void
> -daemonize_start(bool access_datapath)
> +daemonize_start(bool access_datapath, bool access_hardware_ports)
> {
> assert_single_threaded();
> daemonize_fd = -1;
>
> if (switch_user) {
> - daemon_become_new_user__(access_datapath);
> + daemon_become_new_user__(access_datapath, access_hardware_ports);
> switch_user = false;
> }
>
> @@ -807,7 +808,8 @@ daemon_become_new_user_unix(void)
> /* Linux specific implementation of daemon_become_new_user()
> * using libcap-ng. */
> static void
> -daemon_become_new_user_linux(bool access_datapath OVS_UNUSED)
> +daemon_become_new_user_linux(bool access_datapath OVS_UNUSED,
> + bool access_hardware_ports OVS_UNUSED)
> {
> #if defined __linux__ && HAVE_LIBCAPNG
> int ret;
> @@ -827,6 +829,18 @@ daemon_become_new_user_linux(bool access_datapath
> OVS_UNUSED)
> ret = capng_update(CAPNG_ADD, cap_sets, CAP_NET_ADMIN)
> || capng_update(CAPNG_ADD, cap_sets, CAP_NET_RAW)
> || capng_update(CAPNG_ADD, cap_sets,
> CAP_NET_BROADCAST);
> +#ifdef DPDK_NETDEV
> + if (access_hardware_ports && !ret) {
> + ret = capng_update(CAPNG_ADD, cap_sets, CAP_SYS_RAWIO);
> + VLOG_INFO("The Linux capability CAP_SYS_RAWIO "
> + "is enabled.");
I think, we shouldn't really log if the capng_update() failed.
I could wrap the log in if (ret) {} while applying the change,
if that makes sense. What do you think?
Best regards, Ilya Maximets.
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev