Re: [PATCH 34/33] netfs: Pass flag rather than use in_softirq()
On 2021-02-18 14:02:36 [+], David Howells wrote: > How about the attached instead? Thank you for that flag. > David Sebastian
Re: [PATCH 34/33] netfs: Pass flag rather than use in_softirq()
On Thu, Feb 18, 2021 at 10:03 AM David Howells wrote: > > Christoph Hellwig wrote: > > > On Tue, Feb 16, 2021 at 09:29:31AM +, David Howells wrote: > > > Is there a better way to do it? The intent is to process the assessment > > > phase in the calling thread's context if possible rather than bumping over > > > to a worker thread. For synchronous I/O, for example, that's done in the > > > caller's thread. Maybe that's the answer - if it's known to be > > > asynchronous, I have to punt, but otherwise don't have to. > > > > Yes, i think you want an explicit flag instead. > > How about the attached instead? > > David > --- > commit 29b3e9eed616db01f15c7998c062b4e501ea6582 > Author: David Howells > Date: Mon Feb 15 21:56:43 2021 + > > netfs: Pass flag rather than use in_softirq() > > The in_softirq() in netfs_rreq_terminated() works fine for the cache being > on a normal disk, as the completion handlers may get called in softirq > context, but for an NVMe drive, the completion handler may get called in > IRQ context. > > Fix to pass a flag to netfs_subreq_terminated() to indicate whether we > think the function isn't being called from a context in which we can do > allocations, waits and I/O submissions (such as softirq or interrupt > context). If this flag is set, netfs lib has to punt to a worker thread > to > handle anything like that. > > The symptom involves warnings like the following appearing and the kernel > hanging: > > WARNING: CPU: 0 PID: 0 at kernel/softirq.c:175 > __local_bh_enable_ip+0x35/0x50 > ... > RIP: 0010:__local_bh_enable_ip+0x35/0x50 > ... > Call Trace: > > rxrpc_kernel_begin_call+0x7d/0x1b0 [rxrpc] > ? afs_rx_new_call+0x40/0x40 [kafs] > ? afs_alloc_call+0x28/0x120 [kafs] > afs_make_call+0x120/0x510 [kafs] > ? afs_rx_new_call+0x40/0x40 [kafs] > ? afs_alloc_flat_call+0xba/0x100 [kafs] > ? __kmalloc+0x167/0x2f0 > ? afs_alloc_flat_call+0x9b/0x100 [kafs] > afs_wait_for_operation+0x2d/0x200 [kafs] > afs_do_sync_operation+0x16/0x20 [kafs] > afs_req_issue_op+0x8c/0xb0 [kafs] > netfs_rreq_assess+0x125/0x7d0 [netfs] > ? cachefiles_end_operation+0x40/0x40 [cachefiles] > netfs_subreq_terminated+0x117/0x220 [netfs] > cachefiles_read_complete+0x21/0x60 [cachefiles] > iomap_dio_bio_end_io+0xdd/0x110 > blk_update_request+0x20a/0x380 > blk_mq_end_request+0x1c/0x120 > nvme_process_cq+0x159/0x1f0 [nvme] > nvme_irq+0x10/0x20 [nvme] > __handle_irq_event_percpu+0x37/0x150 > handle_irq_event+0x49/0xb0 > handle_edge_irq+0x7c/0x200 > asm_call_irq_on_stack+0xf/0x20 > > common_interrupt+0xad/0x120 > asm_common_interrupt+0x1e/0x40 > ... > > Reported-by: Marc Dionne > Signed-off-by: David Howells > cc: Matthew Wilcox > cc: linux...@kvack.org > cc: linux-cach...@redhat.com > cc: linux-...@lists.infradead.org > cc: linux-...@vger.kernel.org > cc: linux-c...@vger.kernel.org > cc: ceph-de...@vger.kernel.org > cc: v9fs-develo...@lists.sourceforge.net > cc: linux-fsde...@vger.kernel.org > > diff --git a/fs/afs/file.c b/fs/afs/file.c > index 8f28d4f4cfd7..6dcdbbfb48e2 100644 > --- a/fs/afs/file.c > +++ b/fs/afs/file.c > @@ -223,7 +223,7 @@ static void afs_fetch_data_notify(struct afs_operation > *op) > > if (subreq) { > __set_bit(NETFS_SREQ_CLEAR_TAIL, >flags); > - netfs_subreq_terminated(subreq, error ?: req->actual_len); > + netfs_subreq_terminated(subreq, error ?: req->actual_len, > false); > req->subreq = NULL; > } else if (req->done) { > req->done(req); > @@ -289,7 +289,7 @@ static void afs_req_issue_op(struct netfs_read_subrequest > *subreq) > > fsreq = afs_alloc_read(GFP_NOFS); > if (!fsreq) > - return netfs_subreq_terminated(subreq, -ENOMEM); > + return netfs_subreq_terminated(subreq, -ENOMEM, false); > > fsreq->subreq = subreq; > fsreq->pos = subreq->start + subreq->transferred; > @@ -304,7 +304,7 @@ static void afs_req_issue_op(struct netfs_read_subrequest > *subreq) > > ret = afs_fetch_data(fsreq->vnode, fsreq); > if (ret < 0) > - return netfs_subreq_terminated(subreq, ret); > + return netfs_subreq_terminated(subreq, ret, false); > } > > static int afs_symlink_readpage(struct page *page) > diff --git a/fs/cachefiles/rdwr2.c b/fs/cachefiles/rdwr2.c > index 4cea5a2a2d6e..40668bfe6688 100644 > --- a/fs/cachefiles/rdwr2.c > +++ b/fs/cachefiles/rdwr2.c > @@ -23,6 +23,7 @@ struct cachefiles_kiocb { > }; > netfs_io_terminated_t term_func; > void*term_func_priv; > + boolwas_async; > }; > > static inline void
Re: [PATCH 34/33] netfs: Pass flag rather than use in_softirq()
On Thu, Feb 18, 2021 at 10:03 AM David Howells wrote: > > Christoph Hellwig wrote: > > > On Tue, Feb 16, 2021 at 09:29:31AM +, David Howells wrote: > > > Is there a better way to do it? The intent is to process the assessment > > > phase in the calling thread's context if possible rather than bumping over > > > to a worker thread. For synchronous I/O, for example, that's done in the > > > caller's thread. Maybe that's the answer - if it's known to be > > > asynchronous, I have to punt, but otherwise don't have to. > > > > Yes, i think you want an explicit flag instead. > > How about the attached instead? > > David > --- > commit 29b3e9eed616db01f15c7998c062b4e501ea6582 > Author: David Howells > Date: Mon Feb 15 21:56:43 2021 + > > netfs: Pass flag rather than use in_softirq() > > The in_softirq() in netfs_rreq_terminated() works fine for the cache being > on a normal disk, as the completion handlers may get called in softirq > context, but for an NVMe drive, the completion handler may get called in > IRQ context. > > Fix to pass a flag to netfs_subreq_terminated() to indicate whether we > think the function isn't being called from a context in which we can do > allocations, waits and I/O submissions (such as softirq or interrupt > context). If this flag is set, netfs lib has to punt to a worker thread > to > handle anything like that. > > The symptom involves warnings like the following appearing and the kernel > hanging: > > WARNING: CPU: 0 PID: 0 at kernel/softirq.c:175 > __local_bh_enable_ip+0x35/0x50 > ... > RIP: 0010:__local_bh_enable_ip+0x35/0x50 > ... > Call Trace: > > rxrpc_kernel_begin_call+0x7d/0x1b0 [rxrpc] > ? afs_rx_new_call+0x40/0x40 [kafs] > ? afs_alloc_call+0x28/0x120 [kafs] > afs_make_call+0x120/0x510 [kafs] > ? afs_rx_new_call+0x40/0x40 [kafs] > ? afs_alloc_flat_call+0xba/0x100 [kafs] > ? __kmalloc+0x167/0x2f0 > ? afs_alloc_flat_call+0x9b/0x100 [kafs] > afs_wait_for_operation+0x2d/0x200 [kafs] > afs_do_sync_operation+0x16/0x20 [kafs] > afs_req_issue_op+0x8c/0xb0 [kafs] > netfs_rreq_assess+0x125/0x7d0 [netfs] > ? cachefiles_end_operation+0x40/0x40 [cachefiles] > netfs_subreq_terminated+0x117/0x220 [netfs] > cachefiles_read_complete+0x21/0x60 [cachefiles] > iomap_dio_bio_end_io+0xdd/0x110 > blk_update_request+0x20a/0x380 > blk_mq_end_request+0x1c/0x120 > nvme_process_cq+0x159/0x1f0 [nvme] > nvme_irq+0x10/0x20 [nvme] > __handle_irq_event_percpu+0x37/0x150 > handle_irq_event+0x49/0xb0 > handle_edge_irq+0x7c/0x200 > asm_call_irq_on_stack+0xf/0x20 > > common_interrupt+0xad/0x120 > asm_common_interrupt+0x1e/0x40 > ... > > Reported-by: Marc Dionne > Signed-off-by: David Howells > cc: Matthew Wilcox > cc: linux...@kvack.org > cc: linux-cach...@redhat.com > cc: linux-...@lists.infradead.org > cc: linux-...@vger.kernel.org > cc: linux-c...@vger.kernel.org > cc: ceph-de...@vger.kernel.org > cc: v9fs-develo...@lists.sourceforge.net > cc: linux-fsde...@vger.kernel.org > > diff --git a/fs/afs/file.c b/fs/afs/file.c > index 8f28d4f4cfd7..6dcdbbfb48e2 100644 > --- a/fs/afs/file.c > +++ b/fs/afs/file.c > @@ -223,7 +223,7 @@ static void afs_fetch_data_notify(struct afs_operation > *op) > > if (subreq) { > __set_bit(NETFS_SREQ_CLEAR_TAIL, >flags); > - netfs_subreq_terminated(subreq, error ?: req->actual_len); > + netfs_subreq_terminated(subreq, error ?: req->actual_len, > false); > req->subreq = NULL; > } else if (req->done) { > req->done(req); > @@ -289,7 +289,7 @@ static void afs_req_issue_op(struct netfs_read_subrequest > *subreq) > > fsreq = afs_alloc_read(GFP_NOFS); > if (!fsreq) > - return netfs_subreq_terminated(subreq, -ENOMEM); > + return netfs_subreq_terminated(subreq, -ENOMEM, false); > > fsreq->subreq = subreq; > fsreq->pos = subreq->start + subreq->transferred; > @@ -304,7 +304,7 @@ static void afs_req_issue_op(struct netfs_read_subrequest > *subreq) > > ret = afs_fetch_data(fsreq->vnode, fsreq); > if (ret < 0) > - return netfs_subreq_terminated(subreq, ret); > + return netfs_subreq_terminated(subreq, ret, false); > } > > static int afs_symlink_readpage(struct page *page) > diff --git a/fs/cachefiles/rdwr2.c b/fs/cachefiles/rdwr2.c > index 4cea5a2a2d6e..40668bfe6688 100644 > --- a/fs/cachefiles/rdwr2.c > +++ b/fs/cachefiles/rdwr2.c > @@ -23,6 +23,7 @@ struct cachefiles_kiocb { > }; > netfs_io_terminated_t term_func; > void*term_func_priv; > + boolwas_async; > }; > > static inline void
[PATCH 34/33] netfs: Pass flag rather than use in_softirq()
Christoph Hellwig wrote: > On Tue, Feb 16, 2021 at 09:29:31AM +, David Howells wrote: > > Is there a better way to do it? The intent is to process the assessment > > phase in the calling thread's context if possible rather than bumping over > > to a worker thread. For synchronous I/O, for example, that's done in the > > caller's thread. Maybe that's the answer - if it's known to be > > asynchronous, I have to punt, but otherwise don't have to. > > Yes, i think you want an explicit flag instead. How about the attached instead? David --- commit 29b3e9eed616db01f15c7998c062b4e501ea6582 Author: David Howells Date: Mon Feb 15 21:56:43 2021 + netfs: Pass flag rather than use in_softirq() The in_softirq() in netfs_rreq_terminated() works fine for the cache being on a normal disk, as the completion handlers may get called in softirq context, but for an NVMe drive, the completion handler may get called in IRQ context. Fix to pass a flag to netfs_subreq_terminated() to indicate whether we think the function isn't being called from a context in which we can do allocations, waits and I/O submissions (such as softirq or interrupt context). If this flag is set, netfs lib has to punt to a worker thread to handle anything like that. The symptom involves warnings like the following appearing and the kernel hanging: WARNING: CPU: 0 PID: 0 at kernel/softirq.c:175 __local_bh_enable_ip+0x35/0x50 ... RIP: 0010:__local_bh_enable_ip+0x35/0x50 ... Call Trace: rxrpc_kernel_begin_call+0x7d/0x1b0 [rxrpc] ? afs_rx_new_call+0x40/0x40 [kafs] ? afs_alloc_call+0x28/0x120 [kafs] afs_make_call+0x120/0x510 [kafs] ? afs_rx_new_call+0x40/0x40 [kafs] ? afs_alloc_flat_call+0xba/0x100 [kafs] ? __kmalloc+0x167/0x2f0 ? afs_alloc_flat_call+0x9b/0x100 [kafs] afs_wait_for_operation+0x2d/0x200 [kafs] afs_do_sync_operation+0x16/0x20 [kafs] afs_req_issue_op+0x8c/0xb0 [kafs] netfs_rreq_assess+0x125/0x7d0 [netfs] ? cachefiles_end_operation+0x40/0x40 [cachefiles] netfs_subreq_terminated+0x117/0x220 [netfs] cachefiles_read_complete+0x21/0x60 [cachefiles] iomap_dio_bio_end_io+0xdd/0x110 blk_update_request+0x20a/0x380 blk_mq_end_request+0x1c/0x120 nvme_process_cq+0x159/0x1f0 [nvme] nvme_irq+0x10/0x20 [nvme] __handle_irq_event_percpu+0x37/0x150 handle_irq_event+0x49/0xb0 handle_edge_irq+0x7c/0x200 asm_call_irq_on_stack+0xf/0x20 common_interrupt+0xad/0x120 asm_common_interrupt+0x1e/0x40 ... Reported-by: Marc Dionne Signed-off-by: David Howells cc: Matthew Wilcox cc: linux...@kvack.org cc: linux-cach...@redhat.com cc: linux-...@lists.infradead.org cc: linux-...@vger.kernel.org cc: linux-c...@vger.kernel.org cc: ceph-de...@vger.kernel.org cc: v9fs-develo...@lists.sourceforge.net cc: linux-fsde...@vger.kernel.org diff --git a/fs/afs/file.c b/fs/afs/file.c index 8f28d4f4cfd7..6dcdbbfb48e2 100644 --- a/fs/afs/file.c +++ b/fs/afs/file.c @@ -223,7 +223,7 @@ static void afs_fetch_data_notify(struct afs_operation *op) if (subreq) { __set_bit(NETFS_SREQ_CLEAR_TAIL, >flags); - netfs_subreq_terminated(subreq, error ?: req->actual_len); + netfs_subreq_terminated(subreq, error ?: req->actual_len, false); req->subreq = NULL; } else if (req->done) { req->done(req); @@ -289,7 +289,7 @@ static void afs_req_issue_op(struct netfs_read_subrequest *subreq) fsreq = afs_alloc_read(GFP_NOFS); if (!fsreq) - return netfs_subreq_terminated(subreq, -ENOMEM); + return netfs_subreq_terminated(subreq, -ENOMEM, false); fsreq->subreq = subreq; fsreq->pos = subreq->start + subreq->transferred; @@ -304,7 +304,7 @@ static void afs_req_issue_op(struct netfs_read_subrequest *subreq) ret = afs_fetch_data(fsreq->vnode, fsreq); if (ret < 0) - return netfs_subreq_terminated(subreq, ret); + return netfs_subreq_terminated(subreq, ret, false); } static int afs_symlink_readpage(struct page *page) diff --git a/fs/cachefiles/rdwr2.c b/fs/cachefiles/rdwr2.c index 4cea5a2a2d6e..40668bfe6688 100644 --- a/fs/cachefiles/rdwr2.c +++ b/fs/cachefiles/rdwr2.c @@ -23,6 +23,7 @@ struct cachefiles_kiocb { }; netfs_io_terminated_t term_func; void*term_func_priv; + boolwas_async; }; static inline void cachefiles_put_kiocb(struct cachefiles_kiocb *ki) @@ -43,10 +44,9 @@ static void cachefiles_read_complete(struct kiocb *iocb, long ret, long ret2) _enter("%ld,%ld", ret, ret2); if (ki->term_func) { - if (ret < 0) -