Re: [Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering
Hi,

On Wednesday, 13 March 2019 14:29:28 CET Marc-André Lureau wrote:
> > On Wednesday, 13 March 2019 00:03:26 CET Marek Olšák wrote:
> > > The env var workaround is fine.
> > >
> > > Thread affinity is used for cache topology related optimizations. I think
> > > it's a mistake to treat it only as a resource allocation tool.
> >
> > For a shorter term solution to the problem.
> > One Idea that comes into my mind:
> >
> > Can we check the currently set thread affinity mask if it still contains the
> > cpu we are aiming for and narrow the mask down to our cpu if we can do
> > that by narrowing. If we would need to assign our thread to a cpu that
> > we are not bound anymore just do nothing.
>
> getaffinity() is also blocked by current qemu policy.
> It should be possible to allow a narrower setaffinity(), with some complex
> rule.
>
> > That would obviously require that we can still call into
> > pthread_setaffinity_np
> > without being just killed straight away because we touch something that
> > somebody else wants to control. And that we even succeed in just narrowing
> > down the allowed set of cpus.
> > Marc-Andre, would that still work with qemu then?
>
> For now, Daniel proposed "seccomp: don't kill process for resource
> control syscalls": the resource control syscalls will return -1/EPERM.

Thanks a lot!! I saw the patch!

Nevertheless, I think you should consider the point of view of the calling
application. If it is not wise, security-wise, to allow something, please
find a different solution than just killing processes.

The only workaround I had in reserve would have been to start a child
process that calls a well-known and always-available system call, to test
whether it gets killed or succeeds. We could have solved the special case
in mesa with environment variables, but the basic problem still remains.
best
Mathias

> > Of course this still leaves a small race condition open if somebody changes
> > the affinitiy mask of the current thread in between our call to
> > pthread_getaffinity_np and pthread_setaffinity_np from the outside of our
> > linux task. Then we may experience a non narrowing set affinity operation
> > anymore because of an other set operation that came in between and we may
> > get killed then.
> > ... which is an other argument against just killing. But ok ...
> > IMO this condition happens sufficiently seldom to accept that.
> >
> > Could that solve our problem??
> >
> > best
> > Mathias
> >
> > > Marek
> > >
> > > On Tue, Mar 12, 2019, 1:59 AM Marc-André Lureau wrote:
> > >
> > > > Hi
> > > >
> > > > On Fri, Mar 1, 2019 at 12:13 PM Mathias Fröhlich wrote:
> > > > >
> > > > > On Friday, 1 March 2019 12:15:08 CET Eero Tamminen wrote:
> > > > > > Hi,
> > > > > >
> > > > > > On 1.3.2019 11.12, Michel Dänzer wrote:
> > > > > > > On 2019-02-28 8:41 p.m., Marek Olšák wrote:
> > > > > > >>> On Thu, Feb 28, 2019 at 1:37 PM Eero Tamminen <
> > > > eero.t.tammi...@intel.com>
> > > > > > Why distro versions of Qemu filter sched_setaffinity() syscall?
> > > > > > >>>
> > > > > > >>> (https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889)
> > > > > > >>>
> > > > > > >>> Daniel Berrange (berrange) wrote on 2019-02-27: #19
> > > > > > >>>
> > > > > > >>> "IMHO that mesa change is not valid. It is settings its
> > > > > > >>> affinity to run on all threads which is definitely *NOT*
> > > > > > >>> something we want to be allowed. Management applications want
> > > > > > >>> to control which CPUs QEMU runs on, and as such Mesa should
> > > > > > >>> honour the CPU placement that the QEMU process has.
> > > > > > >>> > > > > > > >>> This is a great example of why QEMU wants to use seccomp to > > > > > > >>> block > > > > > > >>> affinity changes to prevent something silently trying to use > > > > > > >>> more > > > > CPUs > > > > > > >>> than are assigned to this QEMU." > > > > > > >>> > > > > > > >> > > > > > > >> Mesa uses thread affinity to optimize memory access performance > > > > > > >> on > > > > some > > > > > > >> CPUs (see util_pin_thread_to_L3). Other places in Mesa need to > > > > restore the > > > > > > >> original thread affinity for some child threads. Additionally, if > > > > games > > > > > > >> limit the thread affinity, Mesa needs to restore the full thread > > > > affinity > > > > > > >> for some of its child threads. > > > > > > > > > > > > > > The last part sounds like Mesa clearly overstepping its authority. > > > > > > > > > > > > > > > > > > > > >> In essence, the thread affinity should only be considered a hint > > > > for the > > > > > > >> kernel for optimal performance. There is no reason to kill the > > > > process if > > > > > > >> it's disallowed. Just ignore the call or modify the thread mask > > > > > > >> to > > > > make it > > > > > > >> legal. > > > > > > > > > > > > > > The
Re: [Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering
On Wed, Mar 13, 2019 at 02:29:28PM +0100, Marc-André Lureau wrote:
> Hi
>
> On Wed, Mar 13, 2019 at 8:53 AM Mathias Fröhlich wrote:
> >
> > Marek, Marc-Andre,
> >
> > On Wednesday, 13 March 2019 00:03:26 CET Marek Olšák wrote:
> > > The env var workaround is fine.
> > >
> > > Thread affinity is used for cache topology related optimizations. I think
> > > it's a mistake to treat it only as a resource allocation tool.
> >
> > For a shorter term solution to the problem.
> > One Idea that comes into my mind:
> >
> > Can we check the currently set thread affinity mask if it still contains the
> > cpu we are aiming for and narrow the mask down to our cpu if we can do
> > that by narrowing. If we would need to assign our thread to a cpu that
> > we are not bound anymore just do nothing.
>
> getaffinity() is also blocked by current qemu policy.

I think we could consider that a bug. Blocking this syscall while still
allowing read of /proc/self/status achieves little from a security pov as
the affinity is still visible. It is just protecting against a bug in the
impl of getaffinity in the kernel which is unlikely to be worth caring
about & a bug in /proc impl is probably more likely!

Regards,
Daniel
--
|: https://berrange.com       -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org        -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
___
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering
Hi

On Wed, Mar 13, 2019 at 8:53 AM Mathias Fröhlich wrote:
>
> Marek, Marc-Andre,
>
> On Wednesday, 13 March 2019 00:03:26 CET Marek Olšák wrote:
> > The env var workaround is fine.
> >
> > Thread affinity is used for cache topology related optimizations. I think
> > it's a mistake to treat it only as a resource allocation tool.
>
> For a shorter term solution to the problem.
> One Idea that comes into my mind:
>
> Can we check the currently set thread affinity mask if it still contains the
> cpu we are aiming for and narrow the mask down to our cpu if we can do
> that by narrowing. If we would need to assign our thread to a cpu that
> we are not bound anymore just do nothing.

getaffinity() is also blocked by current qemu policy. It should be possible
to allow a narrower setaffinity(), with some complex rule.

> That would obviously require that we can still call into
> pthread_setaffinity_np
> without being just killed straight away because we touch something that
> somebody else wants to control. And that we even succeed in just narrowing
> down the allowed set of cpus.
> Marc-Andre, would that still work with qemu then?

For now, Daniel proposed "seccomp: don't kill process for resource
control syscalls": the resource control syscalls will return -1/EPERM.

> Of course this still leaves a small race condition open if somebody changes
> the affinitiy mask of the current thread in between our call to
> pthread_getaffinity_np and pthread_setaffinity_np from the outside of our
> linux task. Then we may experience a non narrowing set affinity operation
> anymore because of an other set operation that came in between and we may
> get killed then.
> ... which is an other argument against just killing. But ok ...
> IMO this condition happens sufficiently seldom to accept that.
>
> Could that solve our problem??
> > best > Mathias > > > > > > Marek > > > > On Tue, Mar 12, 2019, 1:59 AM Marc-André Lureau > > wrote: > > > > > Hi > > > > > > On Fri, Mar 1, 2019 at 12:13 PM Mathias Fröhlich > > > wrote: > > > > > > > > On Friday, 1 March 2019 12:15:08 CET Eero Tamminen wrote: > > > > > Hi, > > > > > > > > > > On 1.3.2019 11.12, Michel Dänzer wrote: > > > > > > On 2019-02-28 8:41 p.m., Marek Olšák wrote: > > > > > >>> On Thu, Feb 28, 2019 at 1:37 PM Eero Tamminen < > > > eero.t.tammi...@intel.com> > > > > > Why distro versions of Qemu filter sched_setaffinity() syscall? > > > > > >>> > > > > > >>> (https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889) > > > > > >>> > > > > > >>> Daniel Berrange (berrange) wrote on 2019-02-27: #19 > > > > > >>> > > > > > >>> "IMHO that mesa change is not valid. It is settings its affinity > > > > > >>> to > > > > > >>> run on all threads which is definitely *NOT* something we want to > > > be > > > > > >>> allowed. Management applications want to control which CPUs QEMU > > > runs > > > > > >>> on, and as such Mesa should honour the CPU placement that the QEMU > > > > > >>> process has. > > > > > >>> > > > > > >>> This is a great example of why QEMU wants to use seccomp to block > > > > > >>> affinity changes to prevent something silently trying to use more > > > CPUs > > > > > >>> than are assigned to this QEMU." > > > > > >>> > > > > > >> > > > > > >> Mesa uses thread affinity to optimize memory access performance on > > > some > > > > > >> CPUs (see util_pin_thread_to_L3). Other places in Mesa need to > > > restore the > > > > > >> original thread affinity for some child threads. Additionally, if > > > games > > > > > >> limit the thread affinity, Mesa needs to restore the full thread > > > affinity > > > > > >> for some of its child threads. > > > > > > > > > > > > The last part sounds like Mesa clearly overstepping its authority. 
> > > > > > > > > > > > > > > > > >> In essence, the thread affinity should only be considered a hint > > > for the > > > > > >> kernel for optimal performance. There is no reason to kill the > > > process if > > > > > >> it's disallowed. Just ignore the call or modify the thread mask to > > > make it > > > > > >> legal. > > > > > > > > > > > > The fundamental issue here is that Mesa is using the thread affinity > > > API > > > > > > for something else than it's intended for. If there was an API for > > > what > > > > > > Mesa wants (encouraging certain sets of threads to run on > > > topologically > > > > > > close cores), there should be no need to block that. > > > > > > > > > > Why such process needs to be killed instead the request being masked > > > > > suitably, is there some program that breaks subtly if affinity request > > > > > is masked (and that being worse than the program being killed)? > > > > > > > > But that is still a situation that could be nicely handled with a > > > > EPERM error return. Way better than just kill a process. > > > > That 'badly affected' program still can call abort then. > > > > But nicely working programs don't get just killed then!! > > > > > > > > > Returning an error seems less
Re: [Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering
Hi, On 12.3.2019 10.59, Marc-André Lureau wrote: On Fri, Mar 1, 2019 at 12:13 PM Mathias Fröhlich wrote: On Friday, 1 March 2019 12:15:08 CET Eero Tamminen wrote: On 1.3.2019 11.12, Michel Dänzer wrote: On 2019-02-28 8:41 p.m., Marek Olšák wrote: On Thu, Feb 28, 2019 at 1:37 PM Eero Tamminen Why distro versions of Qemu filter sched_setaffinity() syscall? (https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889) Daniel Berrange (berrange) wrote on 2019-02-27: #19 "IMHO that mesa change is not valid. It is settings its affinity to run on all threads which is definitely *NOT* something we want to be allowed. Management applications want to control which CPUs QEMU runs on, and as such Mesa should honour the CPU placement that the QEMU process has. This is a great example of why QEMU wants to use seccomp to block affinity changes to prevent something silently trying to use more CPUs than are assigned to this QEMU." Mesa uses thread affinity to optimize memory access performance on some CPUs (see util_pin_thread_to_L3). Other places in Mesa need to restore the original thread affinity for some child threads. Additionally, if games limit the thread affinity, Mesa needs to restore the full thread affinity for some of its child threads. The last part sounds like Mesa clearly overstepping its authority. In essence, the thread affinity should only be considered a hint for the kernel for optimal performance. There is no reason to kill the process if it's disallowed. Just ignore the call or modify the thread mask to make it legal. The fundamental issue here is that Mesa is using the thread affinity API for something else than it's intended for. If there was an API for what Mesa wants (encouraging certain sets of threads to run on topologically close cores), there should be no need to block that. 
Why such process needs to be killed instead the request being masked
suitably, is there some program that breaks subtly if affinity request
is masked (and that being worse than the program being killed)?

But that is still a situation that could be nicely handled with a
EPERM error return. Way better than just kill a process.
That 'badly affected' program still can call abort then.
But nicely working programs don't get just killed then!!

Returning an error seems less secure that prohibiting it completely.
And it may lead to subtle bugs in rarely tested code paths.

It's legitimate that QEMU and management layers want to prevent
arbitrary code from changing resource allocation etc.

They can do that by no-oping the system call, or masking the parts they
don't want to be modified. As that affects only (potentially) performance,
not functionality, it seems to me better than outright killing a process.
(As with killing, there should probably be some way to log things that
were ignored/masked.)

There are no easy way I can think of for mesa (and other libraries) to
probe the seccomp filters and associated action.

So we need a way to tell mesa not to call setaffinity() (and other
syscalls). MESA_NO_THREAD_AFFINITY or MESA_NO_SYSCALLS=setaffinity,...
seem like a relatively easy way to go.

- Eero
Re: [Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering
Marek, Marc-Andre,

On Wednesday, 13 March 2019 00:03:26 CET Marek Olšák wrote:
> The env var workaround is fine.
>
> Thread affinity is used for cache topology related optimizations. I think
> it's a mistake to treat it only as a resource allocation tool.

For a shorter term solution to the problem, one idea that comes to my mind:

Can we check whether the currently set thread affinity mask still contains
the cpu we are aiming for, and narrow the mask down to our cpu if we can do
that by narrowing? If we would need to assign our thread to a cpu that we
are no longer bound to, just do nothing.

That would obviously require that we can still call into
pthread_setaffinity_np without being killed straight away because we touch
something that somebody else wants to control. And that we even succeed in
just narrowing down the allowed set of cpus.
Marc-Andre, would that still work with qemu then?

Of course this still leaves a small race condition open if somebody changes
the affinity mask of the current thread in between our calls to
pthread_getaffinity_np and pthread_setaffinity_np from outside of our linux
task. Then we may experience a set affinity operation that is no longer
narrowing, because of another set operation that came in between, and we may
get killed then.
... which is another argument against just killing. But ok ...
IMO this condition happens sufficiently seldom to accept that.

Could that solve our problem??

best
Mathias

> Marek
>
> On Tue, Mar 12, 2019, 1:59 AM Marc-André Lureau wrote:
>
> > Hi
> >
> > On Fri, Mar 1, 2019 at 12:13 PM Mathias Fröhlich wrote:
> > >
> > > On Friday, 1 March 2019 12:15:08 CET Eero Tamminen wrote:
> > > > Hi,
> > > >
> > > > On 1.3.2019 11.12, Michel Dänzer wrote:
> > > > > On 2019-02-28 8:41 p.m., Marek Olšák wrote:
> > > > >>> On Thu, Feb 28, 2019 at 1:37 PM Eero Tamminen <
> > eero.t.tammi...@intel.com>
> > > > Why distro versions of Qemu filter sched_setaffinity() syscall?
> > > > >>> > > > > >>> (https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889) > > > > >>> > > > > >>> Daniel Berrange (berrange) wrote on 2019-02-27: #19 > > > > >>> > > > > >>> "IMHO that mesa change is not valid. It is settings its affinity to > > > > >>> run on all threads which is definitely *NOT* something we want to > > be > > > > >>> allowed. Management applications want to control which CPUs QEMU > > runs > > > > >>> on, and as such Mesa should honour the CPU placement that the QEMU > > > > >>> process has. > > > > >>> > > > > >>> This is a great example of why QEMU wants to use seccomp to block > > > > >>> affinity changes to prevent something silently trying to use more > > CPUs > > > > >>> than are assigned to this QEMU." > > > > >>> > > > > >> > > > > >> Mesa uses thread affinity to optimize memory access performance on > > some > > > > >> CPUs (see util_pin_thread_to_L3). Other places in Mesa need to > > restore the > > > > >> original thread affinity for some child threads. Additionally, if > > games > > > > >> limit the thread affinity, Mesa needs to restore the full thread > > affinity > > > > >> for some of its child threads. > > > > > > > > > > The last part sounds like Mesa clearly overstepping its authority. > > > > > > > > > > > > > > >> In essence, the thread affinity should only be considered a hint > > for the > > > > >> kernel for optimal performance. There is no reason to kill the > > process if > > > > >> it's disallowed. Just ignore the call or modify the thread mask to > > make it > > > > >> legal. > > > > > > > > > > The fundamental issue here is that Mesa is using the thread affinity > > API > > > > > for something else than it's intended for. If there was an API for > > what > > > > > Mesa wants (encouraging certain sets of threads to run on > > topologically > > > > > close cores), there should be no need to block that. 
> > > > > > > > Why such process needs to be killed instead the request being masked > > > > suitably, is there some program that breaks subtly if affinity request > > > > is masked (and that being worse than the program being killed)? > > > > > > But that is still a situation that could be nicely handled with a > > > EPERM error return. Way better than just kill a process. > > > That 'badly affected' program still can call abort then. > > > But nicely working programs don't get just killed then!! > > > > > > Returning an error seems less secure that prohibiting it completely. > > And it may lead to subtle bugs in rarely tested code paths. > > > > It's legitimate that QEMU and management layers want to prevent > > arbitrary code from changing resource allocation etc. > > > > There are no easy way I can think of for mesa (and other libraries) to > > probe the seccomp filters and associated action. > > > > So we need a way to tell mesa not to call setaffinity() (and other > > syscalls). MESA_NO_THREAD_AFFINITY or MESA_NO_SYSCALLS=setaffinity,... > > seem like a relatively easy way to go. > > > > thanks > > > > >
Re: [Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering
Hi, On Tuesday, 12 March 2019 09:59:17 CET Marc-André Lureau wrote: > Hi > > On Fri, Mar 1, 2019 at 12:13 PM Mathias Fröhlich > wrote: > > > > On Friday, 1 March 2019 12:15:08 CET Eero Tamminen wrote: > > > Hi, > > > > > > On 1.3.2019 11.12, Michel Dänzer wrote: > > > > On 2019-02-28 8:41 p.m., Marek Olšák wrote: > > > >>> On Thu, Feb 28, 2019 at 1:37 PM Eero Tamminen > > > >>> > > > Why distro versions of Qemu filter sched_setaffinity() syscall? > > > >>> > > > >>> (https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889) > > > >>> > > > >>> Daniel Berrange (berrange) wrote on 2019-02-27: #19 > > > >>> > > > >>> "IMHO that mesa change is not valid. It is settings its affinity to > > > >>> run on all threads which is definitely *NOT* something we want to be > > > >>> allowed. Management applications want to control which CPUs QEMU runs > > > >>> on, and as such Mesa should honour the CPU placement that the QEMU > > > >>> process has. > > > >>> > > > >>> This is a great example of why QEMU wants to use seccomp to block > > > >>> affinity changes to prevent something silently trying to use more CPUs > > > >>> than are assigned to this QEMU." > > > >>> > > > >> > > > >> Mesa uses thread affinity to optimize memory access performance on some > > > >> CPUs (see util_pin_thread_to_L3). Other places in Mesa need to restore > > > >> the > > > >> original thread affinity for some child threads. Additionally, if games > > > >> limit the thread affinity, Mesa needs to restore the full thread > > > >> affinity > > > >> for some of its child threads. > > > > > > > > The last part sounds like Mesa clearly overstepping its authority. > > > > > > > > > > > >> In essence, the thread affinity should only be considered a hint for > > > >> the > > > >> kernel for optimal performance. There is no reason to kill the process > > > >> if > > > >> it's disallowed. Just ignore the call or modify the thread mask to > > > >> make it > > > >> legal. 
> > > > The fundamental issue here is that Mesa is using the thread affinity API
> > > > for something else than it's intended for. If there was an API for what
> > > > Mesa wants (encouraging certain sets of threads to run on topologically
> > > > close cores), there should be no need to block that.
> > >
> > > Why such process needs to be killed instead the request being masked
> > > suitably, is there some program that breaks subtly if affinity request
> > > is masked (and that being worse than the program being killed)?
> >
> > But that is still a situation that could be nicely handled with a
> > EPERM error return. Way better than just kill a process.
> > That 'badly affected' program still can call abort then.
> > But nicely working programs don't get just killed then!!
>
> Returning an error seems less secure that prohibiting it completely.
> And it may lead to subtle bugs in rarely tested code paths.
>
> It's legitimate that QEMU and management layers want to prevent
> arbitrary code from changing resource allocation etc.

I *never* saw this api as resource allocation. The call dates back to IRIX
threads (the api *before* pthreads on those very old OpenGL dinosaurs from
Silicon Graphics), where it was used in contexts exactly like what Marek
intended: to make use of hardware topology, so that specific hardware can
be accessed faster from specific cpus, or to aim for cache locality between
threads that exchange lots of data with each other. Think of high
performance computing applications for cache locality, or VR middleware
libraries for hardware OpenGL access.

That api was replaced on that ancient operating system by pthreads, which
contained the equivalent api call. And later on the linux pthread
implementation gained the equivalent pthread_setaffinity_np call when SMP
linux systems came into wider use.
This means that if you just kill an application that tries to optimize for
valid uses, using an API that has worked for that purpose for years, you
will simply break existing APIs.

Besides breaking existing behavior, just think of what you are doing from
an application's view. As an application writer I believe that I want to
change this property. Now I know that if I touch that specific property I
may just be dead. So then I want to know whether I can touch this property
without being dead. But there is no such way to find that out. Well, then
this basically means the api is unusable, because I can't tolerate that I
am just dead when trying to change something for good!!!

What you want as an application writer is to change that value and, if
that does not work, handle that accordingly. Where 'handle that' can range
from, but is not limited to: either silently handling that internally in
the application, maybe using an other algorithm that fits that case, or
presenting the user some user interface message that you need to bind
threads to cpus and the operating system does not allow that and that re
Re: [Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering
On Tue, Mar 12, 2019 at 9:59 AM Marc-André Lureau wrote: > > Hi > > On Fri, Mar 1, 2019 at 12:13 PM Mathias Fröhlich > wrote: > > > > On Friday, 1 March 2019 12:15:08 CET Eero Tamminen wrote: > > > Hi, > > > > > > On 1.3.2019 11.12, Michel Dänzer wrote: > > > > On 2019-02-28 8:41 p.m., Marek Olšák wrote: > > > >>> On Thu, Feb 28, 2019 at 1:37 PM Eero Tamminen > > > >>> > > > Why distro versions of Qemu filter sched_setaffinity() syscall? > > > >>> > > > >>> (https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889) > > > >>> > > > >>> Daniel Berrange (berrange) wrote on 2019-02-27: #19 > > > >>> > > > >>> "IMHO that mesa change is not valid. It is settings its affinity to > > > >>> run on all threads which is definitely *NOT* something we want to be > > > >>> allowed. Management applications want to control which CPUs QEMU runs > > > >>> on, and as such Mesa should honour the CPU placement that the QEMU > > > >>> process has. > > > >>> > > > >>> This is a great example of why QEMU wants to use seccomp to block > > > >>> affinity changes to prevent something silently trying to use more CPUs > > > >>> than are assigned to this QEMU." > > > >>> > > > >> > > > >> Mesa uses thread affinity to optimize memory access performance on some > > > >> CPUs (see util_pin_thread_to_L3). Other places in Mesa need to restore > > > >> the > > > >> original thread affinity for some child threads. Additionally, if games > > > >> limit the thread affinity, Mesa needs to restore the full thread > > > >> affinity > > > >> for some of its child threads. > > > > > > > > The last part sounds like Mesa clearly overstepping its authority. > > > > > > > > > > > >> In essence, the thread affinity should only be considered a hint for > > > >> the > > > >> kernel for optimal performance. There is no reason to kill the process > > > >> if > > > >> it's disallowed. Just ignore the call or modify the thread mask to > > > >> make it > > > >> legal. 
> > > > The fundamental issue here is that Mesa is using the thread affinity API
> > > > for something else than it's intended for. If there was an API for what
> > > > Mesa wants (encouraging certain sets of threads to run on topologically
> > > > close cores), there should be no need to block that.
> > >
> > > Why such process needs to be killed instead the request being masked
> > > suitably, is there some program that breaks subtly if affinity request
> > > is masked (and that being worse than the program being killed)?
> >
> > But that is still a situation that could be nicely handled with a
> > EPERM error return. Way better than just kill a process.
> > That 'badly affected' program still can call abort then.
> > But nicely working programs don't get just killed then!!
>
> Returning an error seems less secure that prohibiting it completely.
> And it may lead to subtle bugs in rarely tested code paths.
>
> It's legitimate that QEMU and management layers want to prevent
> arbitrary code from changing resource allocation etc.
>
> There are no easy way I can think of for mesa (and other libraries) to
> probe the seccomp filters and associated action.
>
> So we need a way to tell mesa not to call setaffinity() (and other
> syscalls). MESA_NO_THREAD_AFFINITY or MESA_NO_SYSCALLS=setaffinity,...
> seem like a relatively easy way to go.

I strongly believe we should not be going the route of adding another
environment variable, primarily because this adds a big pitfall for users
who find their software does not work and then have to spend significant
effort to figure out the environment variable to get things working (as
well as the associated cost of some users not getting that far and filing
bugs, or proclaiming on other sites that this thing is buggy for them).

As such I'd strongly appreciate it if people looked further than the
immediate crash and figured out a way to make graceful degradation happen
without user intervention.
Is there really no way to figure out that calling setaffinity is going to
kill the process, and what would be needed to add a method?

> thanks
>
> --
> Marc-André Lureau
> ___
> mesa-dev mailing list
> mesa-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering
The env var workaround is fine. Thread affinity is used for cache topology related optimizations. I think it's a mistake to treat it only as a resource allocation tool. Marek On Tue, Mar 12, 2019, 1:59 AM Marc-André Lureau wrote: > Hi > > On Fri, Mar 1, 2019 at 12:13 PM Mathias Fröhlich > wrote: > > > > On Friday, 1 March 2019 12:15:08 CET Eero Tamminen wrote: > > > Hi, > > > > > > On 1.3.2019 11.12, Michel Dänzer wrote: > > > > On 2019-02-28 8:41 p.m., Marek Olšák wrote: > > > >>> On Thu, Feb 28, 2019 at 1:37 PM Eero Tamminen < > eero.t.tammi...@intel.com> > > > Why distro versions of Qemu filter sched_setaffinity() syscall? > > > >>> > > > >>> (https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889) > > > >>> > > > >>> Daniel Berrange (berrange) wrote on 2019-02-27: #19 > > > >>> > > > >>> "IMHO that mesa change is not valid. It is settings its affinity to > > > >>> run on all threads which is definitely *NOT* something we want to > be > > > >>> allowed. Management applications want to control which CPUs QEMU > runs > > > >>> on, and as such Mesa should honour the CPU placement that the QEMU > > > >>> process has. > > > >>> > > > >>> This is a great example of why QEMU wants to use seccomp to block > > > >>> affinity changes to prevent something silently trying to use more > CPUs > > > >>> than are assigned to this QEMU." > > > >>> > > > >> > > > >> Mesa uses thread affinity to optimize memory access performance on > some > > > >> CPUs (see util_pin_thread_to_L3). Other places in Mesa need to > restore the > > > >> original thread affinity for some child threads. Additionally, if > games > > > >> limit the thread affinity, Mesa needs to restore the full thread > affinity > > > >> for some of its child threads. > > > > > > > > The last part sounds like Mesa clearly overstepping its authority. > > > > > > > > > > > >> In essence, the thread affinity should only be considered a hint > for the > > > >> kernel for optimal performance. 
There is no reason to kill the > process if > > > >> it's disallowed. Just ignore the call or modify the thread mask to > make it > > > >> legal. > > > > > > > > The fundamental issue here is that Mesa is using the thread affinity > API > > > > for something else than it's intended for. If there was an API for > what > > > > Mesa wants (encouraging certain sets of threads to run on > topologically > > > > close cores), there should be no need to block that. > > > > > > Why such process needs to be killed instead the request being masked > > > suitably, is there some program that breaks subtly if affinity request > > > is masked (and that being worse than the program being killed)? > > > > But that is still a situation that could be nicely handled with a > > EPERM error return. Way better than just kill a process. > > That 'badly affected' program still can call abort then. > > But nicely working programs don't get just killed then!! > > > Returning an error seems less secure that prohibiting it completely. > And it may lead to subtle bugs in rarely tested code paths. > > It's legitimate that QEMU and management layers want to prevent > arbitrary code from changing resource allocation etc. > > There are no easy way I can think of for mesa (and other libraries) to > probe the seccomp filters and associated action. > > So we need a way to tell mesa not to call setaffinity() (and other > syscalls). MESA_NO_THREAD_AFFINITY or MESA_NO_SYSCALLS=setaffinity,... > seem like a relatively easy way to go. > > thanks > > > -- > Marc-André Lureau > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering
Hi On Fri, Mar 1, 2019 at 12:13 PM Mathias Fröhlich wrote: > > On Friday, 1 March 2019 12:15:08 CET Eero Tamminen wrote: > > Hi, > > > > On 1.3.2019 11.12, Michel Dänzer wrote: > > > On 2019-02-28 8:41 p.m., Marek Olšák wrote: > > >>> On Thu, Feb 28, 2019 at 1:37 PM Eero Tamminen > > >>> > > Why distro versions of Qemu filter sched_setaffinity() syscall? > > >>> > > >>> (https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889) > > >>> > > >>> Daniel Berrange (berrange) wrote on 2019-02-27: #19 > > >>> > > >>> "IMHO that mesa change is not valid. It is settings its affinity to > > >>> run on all threads which is definitely *NOT* something we want to be > > >>> allowed. Management applications want to control which CPUs QEMU runs > > >>> on, and as such Mesa should honour the CPU placement that the QEMU > > >>> process has. > > >>> > > >>> This is a great example of why QEMU wants to use seccomp to block > > >>> affinity changes to prevent something silently trying to use more CPUs > > >>> than are assigned to this QEMU." > > >>> > > >> > > >> Mesa uses thread affinity to optimize memory access performance on some > > >> CPUs (see util_pin_thread_to_L3). Other places in Mesa need to restore > > >> the > > >> original thread affinity for some child threads. Additionally, if games > > >> limit the thread affinity, Mesa needs to restore the full thread affinity > > >> for some of its child threads. > > > > > > The last part sounds like Mesa clearly overstepping its authority. > > > > > > > > >> In essence, the thread affinity should only be considered a hint for the > > >> kernel for optimal performance. There is no reason to kill the process if > > >> it's disallowed. Just ignore the call or modify the thread mask to make > > >> it > > >> legal. > > > > > > The fundamental issue here is that Mesa is using the thread affinity API > > > for something else than it's intended for. 
If there was an API for what > > > Mesa wants (encouraging certain sets of threads to run on topologically > > > close cores), there should be no need to block that. > > > > Why such process needs to be killed instead the request being masked > > suitably, is there some program that breaks subtly if affinity request > > is masked (and that being worse than the program being killed)? > > But that is still a situation that could be nicely handled with a > EPERM error return. Way better than just kill a process. > That 'badly affected' program still can call abort then. > But nicely working programs don't get just killed then!! Returning an error seems less secure than prohibiting it completely. And it may lead to subtle bugs in rarely tested code paths. It's legitimate that QEMU and management layers want to prevent arbitrary code from changing resource allocation etc. There is no easy way I can think of for mesa (and other libraries) to probe the seccomp filters and associated action. So we need a way to tell mesa not to call setaffinity() (and other syscalls). MESA_NO_THREAD_AFFINITY or MESA_NO_SYSCALLS=setaffinity,... seems like a relatively easy way to go. thanks -- Marc-André Lureau ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
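The alternative being weighed here, having the resource-control syscalls fail with an error instead of killing the process, corresponds to seccomp's SECCOMP_RET_ERRNO action. A minimal raw-BPF sketch follows (the helper name is hypothetical; a production filter, such as QEMU's libseccomp-based one, would also validate the seccomp_data arch field before trusting the syscall number):

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <unistd.h>
#include <linux/filter.h>
#include <linux/seccomp.h>
#include <sys/prctl.h>
#include <sys/syscall.h>

/* Install a filter that makes sched_setaffinity(2) fail with EPERM
 * while allowing every other syscall. */
static int install_affinity_eperm_filter(void)
{
   struct sock_filter filter[] = {
      /* Load the syscall number from seccomp_data. */
      BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, nr)),
      /* sched_setaffinity? fall through to the ERRNO return; else skip it. */
      BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, __NR_sched_setaffinity, 0, 1),
      BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ERRNO | (EPERM & SECCOMP_RET_DATA)),
      BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
   };
   struct sock_fprog prog = {
      .len = sizeof(filter) / sizeof(filter[0]),
      .filter = filter,
   };

   /* Required so an unprivileged process may install the filter. */
   if (prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) != 0)
      return -1;
   return prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &prog);
}
```

With such a policy, callers that treat affinity as a hint keep working, at the cost of the "rarely tested error path" concern raised above.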
Re: [Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering
Hi Drew On Mon, Mar 4, 2019 at 8:01 PM Drew Davenport wrote: > > We ran into a similar issue with this syscall in Chrome OS. This patch > addressed the issue: > https://lists.freedesktop.org/archives/mesa-dev/2019-February/215721.html > > Does it help in this case as well? That doesn't work, since the default SIGSYS handler will core dump. Furthermore, qemu will start using SCMP_ACT_KILL_PROCESS, which will kill the entire process immediately. > > On Wed, Feb 27, 2019 at 4:13 PM wrote: > > > > From: Marc-André Lureau > > > > Since commit d877451b48a59ab0f9a4210fc736f51da5851c9a ("util/u_queue: > > add UTIL_QUEUE_INIT_SET_FULL_THREAD_AFFINITY"), mesa calls > > sched_setaffinity syscall. Unfortunately, qemu crashes with SIGSYS > > when sandboxing is enabled (by default with libvirt), as this syscall > > is filtered. > > > > There doesn't seem to be a way to check for the seccomp rule other > > than doing a call, which may result in various behaviour depending on > > seccomp actions. There is a PTRACE_SECCOMP_GET_FILTER, but it is > > low-level and a priviledged operation (but there might be a way to use > > it?). A safe way would be to try the call in a subprocess, > > unfortunately, qemu also prohibits fork(). Also this could be subject > > to TOCTOU. > > > > There seems to be few solutions, but the issue can be considered a > > regression for various libvirt/Boxes users. > > > > Introduce MESA_NO_THREAD_AFFINITY environment variable to prevent the > > offending call. Wrap pthread_setaffinity_np() in a utility function > > u_pthread_setaffinity_np(), returning a EACCESS error if the variable > > is set. > > > > Note: one call is left with a FIXME, as I didn't investigate how to > > build and test it, help welcome! 
> > > > See also: > > https://bugs.freedesktop.org/show_bug.cgi?id=109695 > > > > Signed-off-by: Marc-André Lureau > > --- > > .../drivers/swr/rasterizer/core/threads.cpp | 1 + > > src/util/u_queue.c| 2 +- > > src/util/u_thread.h | 15 ++- > > 3 files changed, 16 insertions(+), 2 deletions(-) > > > > diff --git a/src/gallium/drivers/swr/rasterizer/core/threads.cpp > > b/src/gallium/drivers/swr/rasterizer/core/threads.cpp > > index e30c1170568..d10c79512a1 100644 > > --- a/src/gallium/drivers/swr/rasterizer/core/threads.cpp > > +++ b/src/gallium/drivers/swr/rasterizer/core/threads.cpp > > @@ -364,6 +364,7 @@ void bindThread(SWR_CONTEXT* pContext, > > CPU_ZERO(); > > CPU_SET(threadId, ); > > > > +/* FIXME: use u_pthread_setaffinity_np() if possible */ > > int err = pthread_setaffinity_np(thread, sizeof(cpu_set_t), ); > > if (err != 0) > > { > > diff --git a/src/util/u_queue.c b/src/util/u_queue.c > > index 3812c824b6d..dea8d2bb4ae 100644 > > --- a/src/util/u_queue.c > > +++ b/src/util/u_queue.c > > @@ -249,7 +249,7 @@ util_queue_thread_func(void *input) > >for (unsigned i = 0; i < CPU_SETSIZE; i++) > > CPU_SET(i, ); > > > > - pthread_setaffinity_np(pthread_self(), sizeof(cpuset), ); > > + u_pthread_setaffinity_np(pthread_self(), sizeof(cpuset), ); > > } > > #endif > > > > diff --git a/src/util/u_thread.h b/src/util/u_thread.h > > index a46c18d3db2..a4e6dbae5d7 100644 > > --- a/src/util/u_thread.h > > +++ b/src/util/u_thread.h > > @@ -70,6 +70,19 @@ static inline void u_thread_setname( const char *name ) > > (void)name; > > } > > > > +#if defined(HAVE_PTHREAD_SETAFFINITY) > > +static inline int u_pthread_setaffinity_np(pthread_t thread, size_t > > cpusetsize, > > + const cpu_set_t *cpuset) > > +{ > > + if (getenv("MESA_NO_THREAD_AFFINITY")) { > > + errno = EACCES; > > + return -1; > > + } > > + > > + return pthread_setaffinity_np(thread, cpusetsize, cpuset); > > +} > > +#endif > > + > > /** > > * An AMD Zen CPU consists of multiple modules where each module has its > > 
own L3 > > * cache. Inter-thread communication such as locks and atomics between > > modules > > @@ -89,7 +102,7 @@ util_pin_thread_to_L3(thrd_t thread, unsigned L3_index, > > unsigned cores_per_L3) > > CPU_ZERO(); > > for (unsigned i = 0; i < cores_per_L3; i++) > >CPU_SET(L3_index * cores_per_L3 + i, ); > > - pthread_setaffinity_np(thread, sizeof(cpuset), ); > > + u_pthread_setaffinity_np(thread, sizeof(cpuset), ); > > #endif > > } > > > > -- > > 2.21.0 > > > > ___ > > mesa-dev mailing list > > mesa-dev@lists.freedesktop.org > > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
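As a sketch of how callers would treat the wrapper's failure purely as a lost optimization: the wrapper body below is reproduced from the patch above so the example is self-contained, while pin_thread_best_effort is a hypothetical helper, not Mesa API:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <stdlib.h>

/* Opt-out knob from the patch: skip the syscall entirely so a seccomp
 * SIGSYS/KILL action is never triggered. */
static inline int u_pthread_setaffinity_np(pthread_t thread, size_t cpusetsize,
                                           const cpu_set_t *cpuset)
{
   if (getenv("MESA_NO_THREAD_AFFINITY")) {
      errno = EACCES;
      return -1;
   }
   return pthread_setaffinity_np(thread, cpusetsize, cpuset);
}

/* Affinity is only a performance hint here, so any failure is ignored. */
static void pin_thread_best_effort(pthread_t thread, unsigned cpu)
{
   cpu_set_t cpuset;
   CPU_ZERO(&cpuset);
   CPU_SET(cpu, &cpuset);
   (void)u_pthread_setaffinity_np(thread, sizeof(cpuset), &cpuset);
}
```

A sandboxed user would then run e.g. `MESA_NO_THREAD_AFFINITY=1 qemu …`, and the pinning silently degrades to whatever placement the kernel chooses.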
Re: [Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering
We ran into a similar issue with this syscall in Chrome OS. This patch addressed the issue: https://lists.freedesktop.org/archives/mesa-dev/2019-February/215721.html Does it help in this case as well? On Wed, Feb 27, 2019 at 4:13 PM wrote: > > From: Marc-André Lureau > > Since commit d877451b48a59ab0f9a4210fc736f51da5851c9a ("util/u_queue: > add UTIL_QUEUE_INIT_SET_FULL_THREAD_AFFINITY"), mesa calls > sched_setaffinity syscall. Unfortunately, qemu crashes with SIGSYS > when sandboxing is enabled (by default with libvirt), as this syscall > is filtered. > > There doesn't seem to be a way to check for the seccomp rule other > than doing a call, which may result in various behaviour depending on > seccomp actions. There is a PTRACE_SECCOMP_GET_FILTER, but it is > low-level and a priviledged operation (but there might be a way to use > it?). A safe way would be to try the call in a subprocess, > unfortunately, qemu also prohibits fork(). Also this could be subject > to TOCTOU. > > There seems to be few solutions, but the issue can be considered a > regression for various libvirt/Boxes users. > > Introduce MESA_NO_THREAD_AFFINITY environment variable to prevent the > offending call. Wrap pthread_setaffinity_np() in a utility function > u_pthread_setaffinity_np(), returning a EACCESS error if the variable > is set. > > Note: one call is left with a FIXME, as I didn't investigate how to > build and test it, help welcome! 
> > See also: > https://bugs.freedesktop.org/show_bug.cgi?id=109695 > > Signed-off-by: Marc-André Lureau > --- > .../drivers/swr/rasterizer/core/threads.cpp | 1 + > src/util/u_queue.c| 2 +- > src/util/u_thread.h | 15 ++- > 3 files changed, 16 insertions(+), 2 deletions(-) > > diff --git a/src/gallium/drivers/swr/rasterizer/core/threads.cpp > b/src/gallium/drivers/swr/rasterizer/core/threads.cpp > index e30c1170568..d10c79512a1 100644 > --- a/src/gallium/drivers/swr/rasterizer/core/threads.cpp > +++ b/src/gallium/drivers/swr/rasterizer/core/threads.cpp > @@ -364,6 +364,7 @@ void bindThread(SWR_CONTEXT* pContext, > CPU_ZERO(); > CPU_SET(threadId, ); > > +/* FIXME: use u_pthread_setaffinity_np() if possible */ > int err = pthread_setaffinity_np(thread, sizeof(cpu_set_t), ); > if (err != 0) > { > diff --git a/src/util/u_queue.c b/src/util/u_queue.c > index 3812c824b6d..dea8d2bb4ae 100644 > --- a/src/util/u_queue.c > +++ b/src/util/u_queue.c > @@ -249,7 +249,7 @@ util_queue_thread_func(void *input) >for (unsigned i = 0; i < CPU_SETSIZE; i++) > CPU_SET(i, ); > > - pthread_setaffinity_np(pthread_self(), sizeof(cpuset), ); > + u_pthread_setaffinity_np(pthread_self(), sizeof(cpuset), ); > } > #endif > > diff --git a/src/util/u_thread.h b/src/util/u_thread.h > index a46c18d3db2..a4e6dbae5d7 100644 > --- a/src/util/u_thread.h > +++ b/src/util/u_thread.h > @@ -70,6 +70,19 @@ static inline void u_thread_setname( const char *name ) > (void)name; > } > > +#if defined(HAVE_PTHREAD_SETAFFINITY) > +static inline int u_pthread_setaffinity_np(pthread_t thread, size_t > cpusetsize, > + const cpu_set_t *cpuset) > +{ > + if (getenv("MESA_NO_THREAD_AFFINITY")) { > + errno = EACCES; > + return -1; > + } > + > + return pthread_setaffinity_np(thread, cpusetsize, cpuset); > +} > +#endif > + > /** > * An AMD Zen CPU consists of multiple modules where each module has its own > L3 > * cache. 
Inter-thread communication such as locks and atomics between > modules > @@ -89,7 +102,7 @@ util_pin_thread_to_L3(thrd_t thread, unsigned L3_index, > unsigned cores_per_L3) > CPU_ZERO(); > for (unsigned i = 0; i < cores_per_L3; i++) >CPU_SET(L3_index * cores_per_L3 + i, ); > - pthread_setaffinity_np(thread, sizeof(cpuset), ); > + u_pthread_setaffinity_np(thread, sizeof(cpuset), ); > #endif > } > > -- > 2.21.0 > > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering
FYI, starting with AMD Ryzen, multithreaded apps and libs pretty much have to change thread affinity to get good performance out of multithreading. Marek On Thu, Feb 28, 2019, 11:41 AM Marek Olšák wrote: > On Thu, Feb 28, 2019 at 11:13 AM Marc-André Lureau < > marcandre.lur...@gmail.com> wrote: > >> Hi Eero! >> >> (ex-colleagues, long time ago!) >> >> On Thu, Feb 28, 2019 at 1:37 PM Eero Tamminen >> wrote: >> > >> > Hi, >> > >> > On 28.2.2019 11.57, Marc-André Lureau wrote: >> > > On Thu, Feb 28, 2019 at 1:17 AM Marek Olšák wrote: >> > >> I'd rather have something more robust than an env var, like catching >> SIGSYS. >> > >> > SIGSYS is info for the invoking parent, not the (Mesa) process doing the >> > syscall. >> > >> > From "man 2 seccomp": >> > >> > The process terminates as though killed by a SIGSYS signal. Even if a >> > signal handler has been registered for SIGSYS, the handler will be >> > ignored in this case and the process always terminates. To a parent >> > process that is waiting on this process (using waitpid(2) or similar), >> > the returned wstatus will indicate that its child was terminated as >> > though by a SIGSYS signal. >> > >> > >> > > With current qemu in most distros, it defaults to SIGSYS (we switched >> > > away from SCMP_ACT_KILL, which had other problems). With more recent >> > > qemu/libseccomp, it will default to SCMP_ACT_KILL_PROCESS. In those >> > > KILL action cases, mesa will not be able to catch the failing >> > > syscalls. >> > >> > Qemu / libvirt isn't the only thing using seccomp. >> > >> > For example Docker enables seccomp filters (along with capability >> > restrictions) for the invoked containers unless that is explicitly >> > disabled: >> > https://docs.docker.com/engine/security/seccomp/ >> > >> > What actually gets filtered, is trivially changeable on Docker command >> > line by giving a JSON file specifying the syscall filtering. 
>> > >> > Default policy seems to be white-listing affinity syscall: >> > >> https://github.com/moby/moby/blob/master/profiles/seccomp/default.json >> > >> > >> > Why distro versions of Qemu filter sched_setaffinity() syscall? >> > >> > >> >> (https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889) >> >> Daniel Berrange (berrange) wrote on 2019-02-27: #19 >> >> "IMHO that mesa change is not valid. It is settings its affinity to >> run on all threads which is definitely *NOT* something we want to be >> allowed. Management applications want to control which CPUs QEMU runs >> on, and as such Mesa should honour the CPU placement that the QEMU >> process has. >> >> This is a great example of why QEMU wants to use seccomp to block >> affinity changes to prevent something silently trying to use more CPUs >> than are assigned to this QEMU." >> > > Mesa uses thread affinity to optimize memory access performance on some > CPUs (see util_pin_thread_to_L3). Other places in Mesa need to restore the > original thread affinity for some child threads. Additionally, if games > limit the thread affinity, Mesa needs to restore the full thread affinity > for some of its child threads. > > In essence, the thread affinity should only be considered a hint for the > kernel for optimal performance. There is no reason to kill the process if > it's disallowed. Just ignore the call or modify the thread mask to make it > legal. > > Marek > ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering
On Friday, 1 March 2019 12:15:08 CET Eero Tamminen wrote: > Hi, > > On 1.3.2019 11.12, Michel Dänzer wrote: > > On 2019-02-28 8:41 p.m., Marek Olšák wrote: > >>> On Thu, Feb 28, 2019 at 1:37 PM Eero Tamminen > Why distro versions of Qemu filter sched_setaffinity() syscall? > >>> > >>> (https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889) > >>> > >>> Daniel Berrange (berrange) wrote on 2019-02-27: #19 > >>> > >>> "IMHO that mesa change is not valid. It is settings its affinity to > >>> run on all threads which is definitely *NOT* something we want to be > >>> allowed. Management applications want to control which CPUs QEMU runs > >>> on, and as such Mesa should honour the CPU placement that the QEMU > >>> process has. > >>> > >>> This is a great example of why QEMU wants to use seccomp to block > >>> affinity changes to prevent something silently trying to use more CPUs > >>> than are assigned to this QEMU." > >>> > >> > >> Mesa uses thread affinity to optimize memory access performance on some > >> CPUs (see util_pin_thread_to_L3). Other places in Mesa need to restore the > >> original thread affinity for some child threads. Additionally, if games > >> limit the thread affinity, Mesa needs to restore the full thread affinity > >> for some of its child threads. > > > > The last part sounds like Mesa clearly overstepping its authority. > > > > > >> In essence, the thread affinity should only be considered a hint for the > >> kernel for optimal performance. There is no reason to kill the process if > >> it's disallowed. Just ignore the call or modify the thread mask to make it > >> legal. > > > > The fundamental issue here is that Mesa is using the thread affinity API > > for something else than it's intended for. If there was an API for what > > Mesa wants (encouraging certain sets of threads to run on topologically > > close cores), there should be no need to block that. 
> > Why such process needs to be killed instead the request being masked > suitably, is there some program that breaks subtly if affinity request > is masked (and that being worse than the program being killed)? But that is still a situation that could be nicely handled with an EPERM error return. Way better than just killing a process. That 'badly affected' program can still call abort then. But nicely working programs don't get just killed then!! best Mathias > > > - eero > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering
Hi, On 1.3.2019 11.12, Michel Dänzer wrote: On 2019-02-28 8:41 p.m., Marek Olšák wrote: On Thu, Feb 28, 2019 at 1:37 PM Eero Tamminen Why distro versions of Qemu filter sched_setaffinity() syscall? (https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889) Daniel Berrange (berrange) wrote on 2019-02-27: #19 "IMHO that mesa change is not valid. It is settings its affinity to run on all threads which is definitely *NOT* something we want to be allowed. Management applications want to control which CPUs QEMU runs on, and as such Mesa should honour the CPU placement that the QEMU process has. This is a great example of why QEMU wants to use seccomp to block affinity changes to prevent something silently trying to use more CPUs than are assigned to this QEMU." Mesa uses thread affinity to optimize memory access performance on some CPUs (see util_pin_thread_to_L3). Other places in Mesa need to restore the original thread affinity for some child threads. Additionally, if games limit the thread affinity, Mesa needs to restore the full thread affinity for some of its child threads. The last part sounds like Mesa clearly overstepping its authority. In essence, the thread affinity should only be considered a hint for the kernel for optimal performance. There is no reason to kill the process if it's disallowed. Just ignore the call or modify the thread mask to make it legal. The fundamental issue here is that Mesa is using the thread affinity API for something else than it's intended for. If there was an API for what Mesa wants (encouraging certain sets of threads to run on topologically close cores), there should be no need to block that. Why such process needs to be killed instead the request being masked suitably, is there some program that breaks subtly if affinity request is masked (and that being worse than the program being killed)? - eero ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering
On 2019-02-28 8:41 p.m., Marek Olšák wrote: > On Thu, Feb 28, 2019 at 11:13 AM Marc-André Lureau < > marcandre.lur...@gmail.com> wrote: >> On Thu, Feb 28, 2019 at 1:37 PM Eero Tamminen >> wrote: >>> >>> Why distro versions of Qemu filter sched_setaffinity() syscall? >> >> (https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889) >> >> Daniel Berrange (berrange) wrote on 2019-02-27: #19 >> >> "IMHO that mesa change is not valid. It is settings its affinity to >> run on all threads which is definitely *NOT* something we want to be >> allowed. Management applications want to control which CPUs QEMU runs >> on, and as such Mesa should honour the CPU placement that the QEMU >> process has. >> >> This is a great example of why QEMU wants to use seccomp to block >> affinity changes to prevent something silently trying to use more CPUs >> than are assigned to this QEMU." >> > > Mesa uses thread affinity to optimize memory access performance on some > CPUs (see util_pin_thread_to_L3). Other places in Mesa need to restore the > original thread affinity for some child threads. Additionally, if games > limit the thread affinity, Mesa needs to restore the full thread affinity > for some of its child threads. The last part sounds like Mesa clearly overstepping its authority. > In essence, the thread affinity should only be considered a hint for the > kernel for optimal performance. There is no reason to kill the process if > it's disallowed. Just ignore the call or modify the thread mask to make it > legal. The fundamental issue here is that Mesa is using the thread affinity API for something else than it's intended for. If there was an API for what Mesa wants (encouraging certain sets of threads to run on topologically close cores), there should be no need to block that. 
-- Earthling Michel Dänzer | https://www.amd.com Libre software enthusiast | Mesa and X developer ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
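An affinity-free way to obtain the information Mesa actually wants (which cores share an L3) is the cache topology the kernel already exports in sysfs. This is a sketch assuming the standard sysfs cache interface; the helper name and the hard-coded cache index are illustrative:

```c
#include <stdio.h>

/* Read which CPUs share a last-level cache with `cpu` into buf, e.g.
 * "0-7".  index3 is typically the unified L3 on x86; a robust version
 * would inspect each index's "level" and "type" files instead of
 * hard-coding the index.  Returns 0 on success, -1 on failure. */
static int l3_siblings(unsigned cpu, char *buf, size_t len)
{
   char path[128];
   FILE *f;

   snprintf(path, sizeof(path),
            "/sys/devices/system/cpu/cpu%u/cache/index3/shared_cpu_list", cpu);
   f = fopen(path, "r");
   if (!f)
      return -1;
   if (!fgets(buf, (int)len, f)) {
      fclose(f);
      return -1;
   }
   fclose(f);
   return 0;
}
```

Such topology data could inform a cooperative placement API of the kind suggested above, without the process ever needing to widen its own affinity mask.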
Re: [Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering
On Thu, Feb 28, 2019 at 11:13 AM Marc-André Lureau < marcandre.lur...@gmail.com> wrote: > Hi Eero! > > (ex-colleagues, long time ago!) > > On Thu, Feb 28, 2019 at 1:37 PM Eero Tamminen > wrote: > > > > Hi, > > > > On 28.2.2019 11.57, Marc-André Lureau wrote: > > > On Thu, Feb 28, 2019 at 1:17 AM Marek Olšák wrote: > > >> I'd rather have something more robust than an env var, like catching > SIGSYS. > > > > SIGSYS is info for the invoking parent, not the (Mesa) process doing the > > syscall. > > > > From "man 2 seccomp": > > > > The process terminates as though killed by a SIGSYS signal. Even if a > > signal handler has been registered for SIGSYS, the handler will be > > ignored in this case and the process always terminates. To a parent > > process that is waiting on this process (using waitpid(2) or similar), > > the returned wstatus will indicate that its child was terminated as > > though by a SIGSYS signal. > > > > > > > With current qemu in most distros, it defaults to SIGSYS (we switched > > > away from SCMP_ACT_KILL, which had other problems). With more recent > > > qemu/libseccomp, it will default to SCMP_ACT_KILL_PROCESS. In those > > > KILL action cases, mesa will not be able to catch the failing > > > syscalls. > > > > Qemu / libvirt isn't the only thing using seccomp. > > > > For example Docker enables seccomp filters (along with capability > > restrictions) for the invoked containers unless that is explicitly > > disabled: > > https://docs.docker.com/engine/security/seccomp/ > > > > What actually gets filtered, is trivially changeable on Docker command > > line by giving a JSON file specifying the syscall filtering. > > > > Default policy seems to be white-listing affinity syscall: > > > https://github.com/moby/moby/blob/master/profiles/seccomp/default.json > > > > > > Why distro versions of Qemu filter sched_setaffinity() syscall? 
> > > > > > (https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889) > > Daniel Berrange (berrange) wrote on 2019-02-27: #19 > > "IMHO that mesa change is not valid. It is settings its affinity to > run on all threads which is definitely *NOT* something we want to be > allowed. Management applications want to control which CPUs QEMU runs > on, and as such Mesa should honour the CPU placement that the QEMU > process has. > > This is a great example of why QEMU wants to use seccomp to block > affinity changes to prevent something silently trying to use more CPUs > than are assigned to this QEMU." > Mesa uses thread affinity to optimize memory access performance on some CPUs (see util_pin_thread_to_L3). Other places in Mesa need to restore the original thread affinity for some child threads. Additionally, if games limit the thread affinity, Mesa needs to restore the full thread affinity for some of its child threads. In essence, the thread affinity should only be considered a hint for the kernel for optimal performance. There is no reason to kill the process if it's disallowed. Just ignore the call or modify the thread mask to make it legal. Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering
Hi, Just catching up on this thread now. My main question is where the issue is occurring. Is it some sort of CI system or something along those lines? We don't really consider SWR in an emulated environment to be an intended use case. Generally it is used as the rendering backend for data visualization, which is typically running on the host OS. -Alok > -Original Message- > From: mesa-dev [mailto:mesa-dev-boun...@lists.freedesktop.org] On > Behalf Of Marc-André Lureau > Sent: Thursday, February 28, 2019 10:14 AM > To: Tamminen, Eero T > Cc: ML mesa-dev > Subject: Re: [Mesa-dev] [PATCH] RFC: Workaround for > pthread_setaffinity_np() seccomp filtering > > Hi Eero! > > (ex-colleagues, long time ago!) > > On Thu, Feb 28, 2019 at 1:37 PM Eero Tamminen > wrote: > > > > Hi, > > > > On 28.2.2019 11.57, Marc-André Lureau wrote: > > > On Thu, Feb 28, 2019 at 1:17 AM Marek Olšák > wrote: > > >> I'd rather have something more robust than an env var, like catching > SIGSYS. > > > > SIGSYS is info for the invoking parent, not the (Mesa) process doing > > the syscall. > > > > From "man 2 seccomp": > > > > The process terminates as though killed by a SIGSYS signal. Even if a > > signal handler has been registered for SIGSYS, the handler will be > > ignored in this case and the process always terminates. To a parent > > process that is waiting on this process (using waitpid(2) or similar), > > the returned wstatus will indicate that its child was terminated as > > though by a SIGSYS signal. > > > > > > > With current qemu in most distros, it defaults to SIGSYS (we > > > switched away from SCMP_ACT_KILL, which had other problems). With > > > more recent qemu/libseccomp, it will default to > > > SCMP_ACT_KILL_PROCESS. In those KILL action cases, mesa will not be > > > able to catch the failing syscalls. > > > > Qemu / libvirt isn't the only thing using seccomp.
> > > > For example Docker enables seccomp filters (along with capability > > restrictions) for the invoked containers unless that is explicitly > > disabled: > > https://docs.docker.com/engine/security/seccomp/ > > > > What actually gets filtered, is trivially changeable on Docker command > > line by giving a JSON file specifying the syscall filtering. > > > > Default policy seems to be white-listing affinity syscall: > > > > > https://github.com/moby/moby/blob/master/profiles/seccomp/default.jso > n > > > > > > Why distro versions of Qemu filter sched_setaffinity() syscall? > > > > > > (https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889) > > Daniel Berrange (berrange) wrote on 2019-02-27: #19 > > "IMHO that mesa change is not valid. It is settings its affinity to run on all > threads which is definitely *NOT* something we want to be allowed. > Management applications want to control which CPUs QEMU runs on, and as > such Mesa should honour the CPU placement that the QEMU process has. > > This is a great example of why QEMU wants to use seccomp to block affinity > changes to prevent something silently trying to use more CPUs than are > assigned to this QEMU." > > > > -- > Marc-André Lureau > ___ > mesa-dev mailing list > mesa-dev@lists.freedesktop.org > https://lists.freedesktop.org/mailman/listinfo/mesa-dev ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering
Hi Eero! (ex-colleagues, long time ago!) On Thu, Feb 28, 2019 at 1:37 PM Eero Tamminen wrote: > > Hi, > > On 28.2.2019 11.57, Marc-André Lureau wrote: > > On Thu, Feb 28, 2019 at 1:17 AM Marek Olšák wrote: > >> I'd rather have something more robust than an env var, like catching > >> SIGSYS. > > SIGSYS is info for the invoking parent, not the (Mesa) process doing the > syscall. > > From "man 2 seccomp": > > The process terminates as though killed by a SIGSYS signal. Even if a > signal handler has been registered for SIGSYS, the handler will be > ignored in this case and the process always terminates. To a parent > process that is waiting on this process (using waitpid(2) or similar), > the returned wstatus will indicate that its child was terminated as > though by a SIGSYS signal. > > > > With current qemu in most distros, it defaults to SIGSYS (we switched > > away from SCMP_ACT_KILL, which had other problems). With more recent > > qemu/libseccomp, it will default to SCMP_ACT_KILL_PROCESS. In those > > KILL action cases, mesa will not be able to catch the failing > > syscalls. > > Qemu / libvirt isn't the only thing using seccomp. > > For example Docker enables seccomp filters (along with capability > restrictions) for the invoked containers unless that is explicitly > disabled: > https://docs.docker.com/engine/security/seccomp/ > > What actually gets filtered, is trivially changeable on Docker command > line by giving a JSON file specifying the syscall filtering. > > Default policy seems to be white-listing affinity syscall: > https://github.com/moby/moby/blob/master/profiles/seccomp/default.json > > > Why distro versions of Qemu filter sched_setaffinity() syscall? > > (https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1815889) Daniel Berrange (berrange) wrote on 2019-02-27: #19 "IMHO that mesa change is not valid. It is settings its affinity to run on all threads which is definitely *NOT* something we want to be allowed. 
Management applications want to control which CPUs QEMU runs on, and as such Mesa should honour the CPU placement that the QEMU process has. This is a great example of why QEMU wants to use seccomp to block affinity changes to prevent something silently trying to use more CPUs than are assigned to this QEMU." -- Marc-André Lureau ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org https://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering
Hi,

On 28.2.2019 11.57, Marc-André Lureau wrote:
> On Thu, Feb 28, 2019 at 1:17 AM Marek Olšák wrote:
> > I'd rather have something more robust than an env var, like catching SIGSYS.

SIGSYS is info for the invoking parent, not the (Mesa) process doing the syscall. From "man 2 seccomp":

    The process terminates as though killed by a SIGSYS signal. Even if a
    signal handler has been registered for SIGSYS, the handler will be
    ignored in this case and the process always terminates. To a parent
    process that is waiting on this process (using waitpid(2) or similar),
    the returned wstatus will indicate that its child was terminated as
    though by a SIGSYS signal.

> With current qemu in most distros, it defaults to SIGSYS (we switched away
> from SCMP_ACT_KILL, which had other problems). With more recent
> qemu/libseccomp, it will default to SCMP_ACT_KILL_PROCESS. In those KILL
> action cases, mesa will not be able to catch the failing syscalls.

Qemu / libvirt isn't the only thing using seccomp. For example, Docker enables seccomp filters (along with capability restrictions) for the invoked containers unless that is explicitly disabled:
https://docs.docker.com/engine/security/seccomp/

What actually gets filtered is trivially changeable on the Docker command line by giving a JSON file specifying the syscall filtering. The default policy seems to whitelist the affinity syscall:
https://github.com/moby/moby/blob/master/profiles/seccomp/default.json

Why do distro versions of QEMU filter the sched_setaffinity() syscall?

	- Eero

> Marek
>
> On Wed, Feb 27, 2019 at 6:13 PM wrote:
> > From: Marc-André Lureau
> >
> > Since commit d877451b48a59ab0f9a4210fc736f51da5851c9a ("util/u_queue:
> > add UTIL_QUEUE_INIT_SET_FULL_THREAD_AFFINITY"), mesa calls the
> > sched_setaffinity syscall. Unfortunately, qemu crashes with SIGSYS
> > when sandboxing is enabled (by default with libvirt), as this syscall
> > is filtered.
> > [...]
Re: [Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering
Hi

On Thu, Feb 28, 2019 at 1:17 AM Marek Olšák wrote:
>
> I'd rather have something more robust than an env var, like catching SIGSYS.

With current qemu in most distros, it defaults to SIGSYS (we switched away from SCMP_ACT_KILL, which had other problems). With more recent qemu/libseccomp, it will default to SCMP_ACT_KILL_PROCESS. In those KILL action cases, mesa will not be able to catch the failing syscalls.

> Marek
>
> On Wed, Feb 27, 2019 at 6:13 PM wrote:
>>
>> From: Marc-André Lureau
>>
>> Since commit d877451b48a59ab0f9a4210fc736f51da5851c9a ("util/u_queue:
>> add UTIL_QUEUE_INIT_SET_FULL_THREAD_AFFINITY"), mesa calls the
>> sched_setaffinity syscall. Unfortunately, qemu crashes with SIGSYS
>> when sandboxing is enabled (by default with libvirt), as this syscall
>> is filtered.
>> [...]

-- 
Marc-André Lureau
Re: [Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering
I'd rather have something more robust than an env var, like catching SIGSYS.

Marek

On Wed, Feb 27, 2019 at 6:13 PM wrote:
> From: Marc-André Lureau
>
> Since commit d877451b48a59ab0f9a4210fc736f51da5851c9a ("util/u_queue:
> add UTIL_QUEUE_INIT_SET_FULL_THREAD_AFFINITY"), mesa calls the
> sched_setaffinity syscall. Unfortunately, qemu crashes with SIGSYS
> when sandboxing is enabled (by default with libvirt), as this syscall
> is filtered.
> [...]
[Mesa-dev] [PATCH] RFC: Workaround for pthread_setaffinity_np() seccomp filtering
From: Marc-André Lureau

Since commit d877451b48a59ab0f9a4210fc736f51da5851c9a ("util/u_queue:
add UTIL_QUEUE_INIT_SET_FULL_THREAD_AFFINITY"), mesa calls the
sched_setaffinity syscall. Unfortunately, qemu crashes with SIGSYS
when sandboxing is enabled (by default with libvirt), as this syscall
is filtered.

There doesn't seem to be a way to check for the seccomp rule other
than doing a call, which may result in various behaviours depending on
the seccomp action. There is PTRACE_SECCOMP_GET_FILTER, but it is
low-level and a privileged operation (though there might be a way to
use it?). A safe way would be to try the call in a subprocess;
unfortunately, qemu also prohibits fork(). Also, this could be subject
to TOCTOU.

There seem to be few solutions, but the issue can be considered a
regression for various libvirt/Boxes users.

Introduce a MESA_NO_THREAD_AFFINITY environment variable to prevent
the offending call. Wrap pthread_setaffinity_np() in a utility
function u_pthread_setaffinity_np(), returning an EACCES error if the
variable is set.

Note: one call is left with a FIXME, as I didn't investigate how to
build and test it, help welcome!

See also: https://bugs.freedesktop.org/show_bug.cgi?id=109695

Signed-off-by: Marc-André Lureau
---
 .../drivers/swr/rasterizer/core/threads.cpp |  1 +
 src/util/u_queue.c                          |  2 +-
 src/util/u_thread.h                         | 15 ++++++++++++++-
 3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/src/gallium/drivers/swr/rasterizer/core/threads.cpp b/src/gallium/drivers/swr/rasterizer/core/threads.cpp
index e30c1170568..d10c79512a1 100644
--- a/src/gallium/drivers/swr/rasterizer/core/threads.cpp
+++ b/src/gallium/drivers/swr/rasterizer/core/threads.cpp
@@ -364,6 +364,7 @@ void bindThread(SWR_CONTEXT* pContext,
     CPU_ZERO(&cpuset);
     CPU_SET(threadId, &cpuset);
 
+    /* FIXME: use u_pthread_setaffinity_np() if possible */
     int err = pthread_setaffinity_np(thread, sizeof(cpu_set_t), &cpuset);
     if (err != 0)
     {
diff --git a/src/util/u_queue.c b/src/util/u_queue.c
index 3812c824b6d..dea8d2bb4ae 100644
--- a/src/util/u_queue.c
+++ b/src/util/u_queue.c
@@ -249,7 +249,7 @@ util_queue_thread_func(void *input)
       for (unsigned i = 0; i < CPU_SETSIZE; i++)
          CPU_SET(i, &cpuset);
 
-      pthread_setaffinity_np(pthread_self(), sizeof(cpuset), &cpuset);
+      u_pthread_setaffinity_np(pthread_self(), sizeof(cpuset), &cpuset);
    }
 #endif
 
diff --git a/src/util/u_thread.h b/src/util/u_thread.h
index a46c18d3db2..a4e6dbae5d7 100644
--- a/src/util/u_thread.h
+++ b/src/util/u_thread.h
@@ -70,6 +70,19 @@ static inline void u_thread_setname( const char *name )
    (void)name;
 }
 
+#if defined(HAVE_PTHREAD_SETAFFINITY)
+static inline int u_pthread_setaffinity_np(pthread_t thread, size_t cpusetsize,
+                                           const cpu_set_t *cpuset)
+{
+   if (getenv("MESA_NO_THREAD_AFFINITY")) {
+      errno = EACCES;
+      return -1;
+   }
+
+   return pthread_setaffinity_np(thread, cpusetsize, cpuset);
+}
+#endif
+
 /**
  * An AMD Zen CPU consists of multiple modules where each module has its own L3
  * cache. Inter-thread communication such as locks and atomics between modules
@@ -89,7 +102,7 @@ util_pin_thread_to_L3(thrd_t thread, unsigned L3_index, unsigned cores_per_L3)
    CPU_ZERO(&cpuset);
    for (unsigned i = 0; i < cores_per_L3; i++)
       CPU_SET(L3_index * cores_per_L3 + i, &cpuset);
-   pthread_setaffinity_np(thread, sizeof(cpuset), &cpuset);
+   u_pthread_setaffinity_np(thread, sizeof(cpuset), &cpuset);
 #endif
 }
 
-- 
2.21.0