Re: [Qemu-block] Limiting coroutine stack usage
On Wed, Mar 07, 2018 at 09:36:38PM +0100, Peter Lieven wrote: > Am 06.03.2018 um 12:51 schrieb Stefan Hajnoczi: > > On Tue, Feb 20, 2018 at 06:04:02PM +0100, Peter Lieven wrote: > >> I remember we discussed a long time ago to limit the stack usage of all > >> functions that are executed in a coroutine > >> context to a very low value to be able to safely limit the coroutine stack > >> size as well. > >> > >> I checked through all functions in block/, migration/ and nbd/ and there > >> are only very few larger or unbound stack > >> allocations that can easily be fixed. > >> > >> Now my question: Is there an easy way to add a cflag like > >> -Wstack-usage=2048 to all objects in a given directory only? > >> I tried to add a llimit to the whole project, but fixing this will be a > >> larger task. > > 2KB is fine for QEMU code but actual coroutine stack sizes will have to > > be at least 8KB, I guess, in order for third-party libraries to work > > (e.g. curl, rbd). PATH_MAX is 4KB on Linux. > > > > Nested event loops in QEMU code can also result in deep call stacks. > > This happens when aio_poll() invokes an fd handler or BH that also > > invokes aio_poll(). > > The plan was to limit the stack usage only as a compiler option. I would > leave the coroutine stack size at 1MB > for now until we have a way to identify the worst case usage. I'm not sure we'll be able to confidently set a small stack size, but a compile-time check doesn't hurt. Maybe someday we'll switch to stackless coroutines. Clang now has coroutines support and maybe gcc will get it too (for C++20). If the compiler transforms the code then coroutine state can be kept in a smaller data area that is not a call stack. A single per-thread call stack is used to run any coroutine and only during coroutine execution. Stefan signature.asc Description: PGP signature
Re: [Qemu-block] Limiting coroutine stack usage
Am 06.03.2018 um 12:51 schrieb Stefan Hajnoczi: > On Tue, Feb 20, 2018 at 06:04:02PM +0100, Peter Lieven wrote: >> I remember we discussed a long time ago to limit the stack usage of all >> functions that are executed in a coroutine >> context to a very low value to be able to safely limit the coroutine stack >> size as well. >> >> I checked through all functions in block/, migration/ and nbd/ and there are >> only very few larger or unbound stack >> allocations that can easily be fixed. >> >> Now my question: Is there an easy way to add a cflag like -Wstack-usage=2048 >> to all objects in a given directory only? >> I tried to add a llimit to the whole project, but fixing this will be a >> larger task. > 2KB is fine for QEMU code but actual coroutine stack sizes will have to > be at least 8KB, I guess, in order for third-party libraries to work > (e.g. curl, rbd). PATH_MAX is 4KB on Linux. > > Nested event loops in QEMU code can also result in deep call stacks. > This happens when aio_poll() invokes an fd handler or BH that also > invokes aio_poll(). The plan was to limit the stack usage only as a compiler option. I would leave the coroutine stack size at 1MB for now until we have a way to identify the worst case usage. Peter
Re: [Qemu-block] Limiting coroutine stack usage
On Tue, Feb 20, 2018 at 06:04:02PM +0100, Peter Lieven wrote: > I remember we discussed a long time ago to limit the stack usage of all > functions that are executed in a coroutine > context to a very low value to be able to safely limit the coroutine stack > size as well. > > I checked through all functions in block/, migration/ and nbd/ and there are > only very few larger or unbound stack > allocations that can easily be fixed. > > Now my question: Is there an easy way to add a cflag like -Wstack-usage=2048 > to all objects in a given directory only? > I tried to add a llimit to the whole project, but fixing this will be a > larger task. 2KB is fine for QEMU code but actual coroutine stack sizes will have to be at least 8KB, I guess, in order for third-party libraries to work (e.g. curl, rbd). PATH_MAX is 4KB on Linux. Nested event loops in QEMU code can also result in deep call stacks. This happens when aio_poll() invokes an fd handler or BH that also invokes aio_poll(). Stefan signature.asc Description: PGP signature
Re: [Qemu-block] Limiting coroutine stack usage
On 22/02/2018 18:06, John Snow wrote: > > > On 02/22/2018 05:57 AM, Kevin Wolf wrote: >> Am 20.02.2018 um 22:54 hat Paolo Bonzini geschrieben: >>> On 20/02/2018 18:04, Peter Lieven wrote: Hi, I remember we discussed a long time ago to limit the stack usage of all functions that are executed in a coroutine context to a very low value to be able to safely limit the coroutine stack size as well. >>> >>> IIRC the only issue was that hw/ide/atapi.c has mutual recursion between >>> ide_atapi_cmd_reply_end -> ide_transfer_start -> ahci_start_transfer -> >>> ide_atapi_cmd_reply_end. >>> >>> But perhaps it's not an issue, somebody needs to audit the code. >> >> I think John intended to get rid of the recursion sometime, but I doubt >> he has had the time so far. >> > > It hasn't been a priority for me. > > Paolo tried to fix ATAPI by adding a BH callback, but that added the > possibility of a migration halfway through a data transfer IIRC. > > If anyone wants to tackle it, I'll dig up Paolo's patches. A better possibility is to make it into tail recursion first and then a while loop. Maybe introducing some kind of ide_transfer_start_norecurse that returns "true" if you have a start_transfer callback (so you need to do another iteration immediately) and "false" if you don't. I'll take a look... Paolo
Re: [Qemu-block] Limiting coroutine stack usage
On 02/22/2018 05:57 AM, Kevin Wolf wrote: > Am 20.02.2018 um 22:54 hat Paolo Bonzini geschrieben: >> On 20/02/2018 18:04, Peter Lieven wrote: >>> Hi, >>> >>> I remember we discussed a long time ago to limit the stack usage of all >>> functions that are executed in a coroutine >>> context to a very low value to be able to safely limit the coroutine >>> stack size as well. >> >> IIRC the only issue was that hw/ide/atapi.c has mutual recursion between >> ide_atapi_cmd_reply_end -> ide_transfer_start -> ahci_start_transfer -> >> ide_atapi_cmd_reply_end. >> >> But perhaps it's not an issue, somebody needs to audit the code. > > I think John intended to get rid of the recursion sometime, but I doubt > he has had the time so far. > It hasn't been a priority for me. Paolo tried to fix ATAPI by adding a BH callback, but that added the possibility of a migration halfway through a data transfer IIRC. If anyone wants to tackle it, I'll dig up Paolo's patches. --js
Re: [Qemu-block] Limiting coroutine stack usage
Am 22.02.2018 um 12:32 schrieb Kevin Wolf: > Am 22.02.2018 um 12:01 hat Peter Lieven geschrieben: >> Am 22.02.2018 um 11:57 schrieb Kevin Wolf: >>> Am 20.02.2018 um 22:54 hat Paolo Bonzini geschrieben: On 20/02/2018 18:04, Peter Lieven wrote: > Hi, > > I remember we discussed a long time ago to limit the stack usage of all > functions that are executed in a coroutine > context to a very low value to be able to safely limit the coroutine > stack size as well. IIRC the only issue was that hw/ide/atapi.c has mutual recursion between ide_atapi_cmd_reply_end -> ide_transfer_start -> ahci_start_transfer -> ide_atapi_cmd_reply_end. But perhaps it's not an issue, somebody needs to audit the code. >>> I think John intended to get rid of the recursion sometime, but I doubt >>> he has had the time so far. >> Apart from this is is possible to define special cflags in the >> Makefile.objs just for a subdirectory? I have patches ready to make >> the block layer files and other coroutine users compile with >> -Wstack-size=2048. But I do not want to specify each file separately. > Our Makefiles have lines like this: > > iscsi.o-cflags := $(LIBISCSI_CFLAGS) > > I don't think there is a direct mechanism to apply cflags to a whole > directory or just to block-obj-y/block-obj-m, but just looping over them > could work. I'm not a Makefile expert at all, but after some toying with > a simple example, something like this might work: > > $(foreach x,$(block-obj-y),$(eval $x-cflags += -Wstack-size=2048)) Thanks for the hint. If noone comes up with a better idea I will try this. Peter
Re: [Qemu-block] Limiting coroutine stack usage
Am 22.02.2018 um 12:01 hat Peter Lieven geschrieben: > Am 22.02.2018 um 11:57 schrieb Kevin Wolf: > > Am 20.02.2018 um 22:54 hat Paolo Bonzini geschrieben: > >> On 20/02/2018 18:04, Peter Lieven wrote: > >>> Hi, > >>> > >>> I remember we discussed a long time ago to limit the stack usage of all > >>> functions that are executed in a coroutine > >>> context to a very low value to be able to safely limit the coroutine > >>> stack size as well. > >> IIRC the only issue was that hw/ide/atapi.c has mutual recursion between > >> ide_atapi_cmd_reply_end -> ide_transfer_start -> ahci_start_transfer -> > >> ide_atapi_cmd_reply_end. > >> > >> But perhaps it's not an issue, somebody needs to audit the code. > > I think John intended to get rid of the recursion sometime, but I doubt > > he has had the time so far. > > Apart from this is is possible to define special cflags in the > Makefile.objs just for a subdirectory? I have patches ready to make > the block layer files and other coroutine users compile with > -Wstack-size=2048. But I do not want to specify each file separately. Our Makefiles have lines like this: iscsi.o-cflags := $(LIBISCSI_CFLAGS) I don't think there is a direct mechanism to apply cflags to a whole directory or just to block-obj-y/block-obj-m, but just looping over them could work. I'm not a Makefile expert at all, but after some toying with a simple example, something like this might work: $(foreach x,$(block-obj-y),$(eval $x-cflags += -Wstack-size=2048)) > Limiting the coroutine size to much could also lead to trouble in some > third party libraries that are called from a coroutine context, or > not? Yes, this is true. Kevin
Re: [Qemu-block] Limiting coroutine stack usage
Am 22.02.2018 um 11:57 schrieb Kevin Wolf: > Am 20.02.2018 um 22:54 hat Paolo Bonzini geschrieben: >> On 20/02/2018 18:04, Peter Lieven wrote: >>> Hi, >>> >>> I remember we discussed a long time ago to limit the stack usage of all >>> functions that are executed in a coroutine >>> context to a very low value to be able to safely limit the coroutine >>> stack size as well. >> IIRC the only issue was that hw/ide/atapi.c has mutual recursion between >> ide_atapi_cmd_reply_end -> ide_transfer_start -> ahci_start_transfer -> >> ide_atapi_cmd_reply_end. >> >> But perhaps it's not an issue, somebody needs to audit the code. > I think John intended to get rid of the recursion sometime, but I doubt > he has had the time so far. Apart from this is is possible to define special cflags in the Makefile.objs just for a subdirectory? I have patches ready to make the block layer files and other coroutine users compile with -Wstack-size=2048. But I do not want to specify each file separately. Limiting the coroutine size to much could also lead to trouble in some third party libraries that are called from a coroutine context, or not? Thanks, Peter
Re: [Qemu-block] Limiting coroutine stack usage
Am 20.02.2018 um 22:54 hat Paolo Bonzini geschrieben: > On 20/02/2018 18:04, Peter Lieven wrote: > > Hi, > > > > I remember we discussed a long time ago to limit the stack usage of all > > functions that are executed in a coroutine > > context to a very low value to be able to safely limit the coroutine > > stack size as well. > > IIRC the only issue was that hw/ide/atapi.c has mutual recursion between > ide_atapi_cmd_reply_end -> ide_transfer_start -> ahci_start_transfer -> > ide_atapi_cmd_reply_end. > > But perhaps it's not an issue, somebody needs to audit the code. I think John intended to get rid of the recursion sometime, but I doubt he has had the time so far. Kevin
Re: [Qemu-block] Limiting coroutine stack usage
On 20/02/2018 18:04, Peter Lieven wrote: > Hi, > > I remember we discussed a long time ago to limit the stack usage of all > functions that are executed in a coroutine > context to a very low value to be able to safely limit the coroutine > stack size as well. IIRC the only issue was that hw/ide/atapi.c has mutual recursion between ide_atapi_cmd_reply_end -> ide_transfer_start -> ahci_start_transfer -> ide_atapi_cmd_reply_end. But perhaps it's not an issue, somebody needs to audit the code. > I checked through all functions in block/, migration/ and nbd/ and there > are only very few larger or unbound stack > allocations that can easily be fixed. Yeah, the really large allocations are very few and mostly have to do with networking (https://wiki.qemu.org/BiteSizedTasks#Large_frames). Unfortunately the link to the original list has died together with the gmane archives. Paolo
[Qemu-block] Limiting coroutine stack usage
Hi, I remember we discussed a long time ago to limit the stack usage of all functions that are executed in a coroutine context to a very low value to be able to safely limit the coroutine stack size as well. I checked through all functions in block/, migration/ and nbd/ and there are only very few larger or unbound stack allocations that can easily be fixed. Now my question: Is there an easy way to add a cflag like -Wstack-usage=2048 to all objects in a given directory only? I tried to add a llimit to the whole project, but fixing this will be a larger task. Thanks, Peter