Re: [PATCH 3/3] [media] v4l2: disable filesystem-dax mapping support
On Tue, Nov 7, 2017 at 12:39 PM, Mauro Carvalho Chehabwrote: > Em Tue, 7 Nov 2017 09:43:41 -0800 > Dan Williams escreveu: > >> On Tue, Nov 7, 2017 at 12:33 AM, Mauro Carvalho Chehab >> wrote: >> > Em Mon, 06 Nov 2017 16:57:28 -0800 >> > Dan Williams escreveu: >> > >> >> V4L2 memory registrations are incompatible with filesystem-dax that >> >> needs the ability to revoke dma access to a mapping at will, or >> >> otherwise allow the kernel to wait for completion of DMA. The >> >> filesystem-dax implementation breaks the traditional solution of >> >> truncate of active file backed mappings since there is no page-cache >> >> page we can orphan to sustain ongoing DMA. >> >> >> >> If v4l2 wants to support long lived DMA mappings it needs to arrange to >> >> hold a file lease or use some other mechanism so that the kernel can >> >> coordinate revoking DMA access when the filesystem needs to truncate >> >> mappings. >> > >> > >> > Not sure if I understand this your comment here... what happens >> > if FS_DAX is enabled? The new err = get_user_pages_longterm() >> > would cause DMA allocation to fail? >> >> Correct, any attempt to specify a filesystem-dax mapping range to >> get_user_pages_longterm will fail with EOPNOTSUPP. In the future we >> want to add something like a 'struct file_lock *' argument to >> get_user_pages_longterm so that the kernel has a handle to revoke >> access to the returned pages. Once we have a safe way for the kernel >> to undo elevated page counts we can stop failing the longterm vs >> filesystem-dax case. > > Argh! Perhaps we should make it depend on BROKEN while not fixed :-/ Small consolation, but we do warn that filesystem-dax is still considered experimental when mounting a filesystem with "-o dax" >> Here is more background on why _longterm gup is a problem for filesystem-dax: >> >> https://lwn.net/Articles/737273/ >> >> > If so, that doesn't sound >> > right. Instead, mm should somehow mark this mapping to be out >> > of FS_DAX control range. >> >> DAX is currently global setting for the entire backing device of the >> filesystem, so any mapping of any file when the "-o dax" mount option >> is set is in the "FS_DAX control range". In other words there's >> currently no way to prevent FS_DAX mappings from being exposed to V4L2 >> outside of CONFIG_FS_DAX=n. > > Grrr... > >> > Also, it is not only videobuf-dma-sg.c that does long lived >> > DMA mappings. VB2 also does that (and videobuf-vmalloc). >> >> Without finding the code videobuf-vmalloc sounds like it should be ok >> if the kernel is allocating memory separate from a file-backed DAX >> mapping. > > videobuf-vmalloc do DMA mapping for pages allocated via vmalloc(), > via vmalloc_user()/remap_vmalloc_range(). Ok, that's completely safe since filesystem-dax mappings are not involved in a vmalloc backed virtual address range. > There aren't much drivers using VB1 anymore, but a change at VB2 > will likely break support for almost all webcams if fs DAX is > in usage. Yes, unless / until we can switch userspace to using a new memory registration api that includes a way for the kernel to revoke access to a dax mapping. Another mitigation is following through on support for moving dax support from a global mount flag to a per-inode flag to at least prevent dax from leaking to use cases that need explicit coordination. >> Where is the VB2 get_user_pages call? > > Before changeset 3336c24f25ec, the logic for get_user_pages() were > at drivers/media/v4l2-core/videobuf2-dma-sg.c. Now, the logic > it uses is inside mm/frame_vector.c. Ok, I'll take a look.
Re: [PATCH 3/3] [media] v4l2: disable filesystem-dax mapping support
Em Tue, 7 Nov 2017 09:43:41 -0800 Dan Williamsescreveu: > On Tue, Nov 7, 2017 at 12:33 AM, Mauro Carvalho Chehab > wrote: > > Em Mon, 06 Nov 2017 16:57:28 -0800 > > Dan Williams escreveu: > > > >> V4L2 memory registrations are incompatible with filesystem-dax that > >> needs the ability to revoke dma access to a mapping at will, or > >> otherwise allow the kernel to wait for completion of DMA. The > >> filesystem-dax implementation breaks the traditional solution of > >> truncate of active file backed mappings since there is no page-cache > >> page we can orphan to sustain ongoing DMA. > >> > >> If v4l2 wants to support long lived DMA mappings it needs to arrange to > >> hold a file lease or use some other mechanism so that the kernel can > >> coordinate revoking DMA access when the filesystem needs to truncate > >> mappings. > > > > > > Not sure if I understand this your comment here... what happens > > if FS_DAX is enabled? The new err = get_user_pages_longterm() > > would cause DMA allocation to fail? > > Correct, any attempt to specify a filesystem-dax mapping range to > get_user_pages_longterm will fail with EOPNOTSUPP. In the future we > want to add something like a 'struct file_lock *' argument to > get_user_pages_longterm so that the kernel has a handle to revoke > access to the returned pages. Once we have a safe way for the kernel > to undo elevated page counts we can stop failing the longterm vs > filesystem-dax case. Argh! Perhaps we should make it depend on BROKEN while not fixed :-/ > Here is more background on why _longterm gup is a problem for filesystem-dax: > > https://lwn.net/Articles/737273/ > > > If so, that doesn't sound > > right. Instead, mm should somehow mark this mapping to be out > > of FS_DAX control range. > > DAX is currently global setting for the entire backing device of the > filesystem, so any mapping of any file when the "-o dax" mount option > is set is in the "FS_DAX control range". In other words there's > currently no way to prevent FS_DAX mappings from being exposed to V4L2 > outside of CONFIG_FS_DAX=n. Grrr... > > Also, it is not only videobuf-dma-sg.c that does long lived > > DMA mappings. VB2 also does that (and videobuf-vmalloc). > > Without finding the code videobuf-vmalloc sounds like it should be ok > if the kernel is allocating memory separate from a file-backed DAX > mapping. videobuf-vmalloc do DMA mapping for pages allocated via vmalloc(), via vmalloc_user()/remap_vmalloc_range(). There aren't much drivers using VB1 anymore, but a change at VB2 will likely break support for almost all webcams if fs DAX is in usage. > Where is the VB2 get_user_pages call? Before changeset 3336c24f25ec, the logic for get_user_pages() were at drivers/media/v4l2-core/videobuf2-dma-sg.c. Now, the logic it uses is inside mm/frame_vector.c. Thanks, Mauro
Re: [PATCH 3/3] [media] v4l2: disable filesystem-dax mapping support
On Tue, Nov 7, 2017 at 12:33 AM, Mauro Carvalho Chehabwrote: > Em Mon, 06 Nov 2017 16:57:28 -0800 > Dan Williams escreveu: > >> V4L2 memory registrations are incompatible with filesystem-dax that >> needs the ability to revoke dma access to a mapping at will, or >> otherwise allow the kernel to wait for completion of DMA. The >> filesystem-dax implementation breaks the traditional solution of >> truncate of active file backed mappings since there is no page-cache >> page we can orphan to sustain ongoing DMA. >> >> If v4l2 wants to support long lived DMA mappings it needs to arrange to >> hold a file lease or use some other mechanism so that the kernel can >> coordinate revoking DMA access when the filesystem needs to truncate >> mappings. > > > Not sure if I understand this your comment here... what happens > if FS_DAX is enabled? The new err = get_user_pages_longterm() > would cause DMA allocation to fail? Correct, any attempt to specify a filesystem-dax mapping range to get_user_pages_longterm will fail with EOPNOTSUPP. In the future we want to add something like a 'struct file_lock *' argument to get_user_pages_longterm so that the kernel has a handle to revoke access to the returned pages. Once we have a safe way for the kernel to undo elevated page counts we can stop failing the longterm vs filesystem-dax case. Here is more background on why _longterm gup is a problem for filesystem-dax: https://lwn.net/Articles/737273/ > If so, that doesn't sound > right. Instead, mm should somehow mark this mapping to be out > of FS_DAX control range. DAX is currently global setting for the entire backing device of the filesystem, so any mapping of any file when the "-o dax" mount option is set is in the "FS_DAX control range". In other words there's currently no way to prevent FS_DAX mappings from being exposed to V4L2 outside of CONFIG_FS_DAX=n. > Also, it is not only videobuf-dma-sg.c that does long lived > DMA mappings. VB2 also does that (and videobuf-vmalloc). Without finding the code videobuf-vmalloc sounds like it should be ok if the kernel is allocating memory separate from a file-backed DAX mapping. Where is the VB2 get_user_pages call?
Re: [PATCH 3/3] [media] v4l2: disable filesystem-dax mapping support
Em Mon, 06 Nov 2017 16:57:28 -0800 Dan Williamsescreveu: > V4L2 memory registrations are incompatible with filesystem-dax that > needs the ability to revoke dma access to a mapping at will, or > otherwise allow the kernel to wait for completion of DMA. The > filesystem-dax implementation breaks the traditional solution of > truncate of active file backed mappings since there is no page-cache > page we can orphan to sustain ongoing DMA. > > If v4l2 wants to support long lived DMA mappings it needs to arrange to > hold a file lease or use some other mechanism so that the kernel can > coordinate revoking DMA access when the filesystem needs to truncate > mappings. Not sure if I understand this your comment here... what happens if FS_DAX is enabled? The new err = get_user_pages_longterm() would cause DMA allocation to fail? If so, that doesn't sound right. Instead, mm should somehow mark this mapping to be out of FS_DAX control range. Also, it is not only videobuf-dma-sg.c that does long lived DMA mappings. VB2 also does that (and videobuf-vmalloc). Regards, Mauro > > Reported-by: Jan Kara > Cc: Mauro Carvalho Chehab > Cc: linux-media@vger.kernel.org > Cc: > Signed-off-by: Dan Williams > --- > drivers/media/v4l2-core/videobuf-dma-sg.c |5 +++-- > 1 file changed, 3 insertions(+), 2 deletions(-) > > diff --git a/drivers/media/v4l2-core/videobuf-dma-sg.c > b/drivers/media/v4l2-core/videobuf-dma-sg.c > index 0b5c43f7e020..f412429cf5ba 100644 > --- a/drivers/media/v4l2-core/videobuf-dma-sg.c > +++ b/drivers/media/v4l2-core/videobuf-dma-sg.c > @@ -185,12 +185,13 @@ static int videobuf_dma_init_user_locked(struct > videobuf_dmabuf *dma, > dprintk(1, "init user [0x%lx+0x%lx => %d pages]\n", > data, size, dma->nr_pages); > > - err = get_user_pages(data & PAGE_MASK, dma->nr_pages, > + err = get_user_pages_longterm(data & PAGE_MASK, dma->nr_pages, >flags, dma->pages, NULL); > > if (err != dma->nr_pages) { > dma->nr_pages = (err >= 0) ? err : 0; > - dprintk(1, "get_user_pages: err=%d [%d]\n", err, dma->nr_pages); > + dprintk(1, "get_user_pages_longterm: err=%d [%d]\n", err, > + dma->nr_pages); > return err < 0 ? err : -EINVAL; > } > return 0; > Thanks, Mauro