On Mon, 27 Nov 2017 11:17:54 +0200, Felipe Balbi <[email protected]>
wrote:
> as I said, the *only* thing that schedules from inside
> dwc3_gadget_ep_dequeue() is wait_event_lock_irq(). Which works fine
> unless usb_ep_dequeue() is called with locks held and IRQs disabled.
This I understand, yes.
What puzzles me is how to get it to be called with IRQs enabled.
At least, if ep_lock can be left unaquired in ffs_aio_cancel without
risking data inconsistencies when called without larger locks acquired
(this I am still not certain of yet), there are still 3 more locks to
go before dwc3_gadget_ep_dequeue could get called with IRQs enabled in
such call stack:
>>> [ 382.515903] #0: (rcu_callback){....}, at: [<c10b4ff0>]
>>> rcu_process_callbacks+0x260/0x440
Acquired with lock_acquire (via rcu_lock_acquire), which disables IRQs.
>>> [ 382.524572] #1: (rcu_read_lock_sched){....}, at: [<c1358ba0>]
>>> percpu_ref_switch_to_atomic_rcu+0xb0/0x130
Ditto.
>>> [ 382.534650] #2: (&(&ctx->ctx_lock)->rlock){....}, at: [<c11f0c73>]
>>> free_ioctx_users+0x23/0xd0
free_ioctx_users acquires ctx->ctx_lock using spin_lock_irq, so again
disabling IRQs.
> If my original suggestion didn't help, then there may be other bugs in
> f_fs.c.
So I believe something has to delay cancelling the AIO so it happens
in another call stack, with IRQs enabled.
I did the following, which is likely very naive.
I believe it opens a race-condition on io_data->work (complete vs.
cancel), and as I still do not understand what eps_lock is supposed to
protect I may be opening something else there.
I do not like that I cannot handle nor propagate any error returned by
usb_ep_dequeue.
But on the plus side, it did allow my program to get killed without
causing an immediate panic:
[ 73.396732] read descriptors
[ 73.403524] read strings
[ 92.407515] configfs-gadget gadget: high-speed config #1: c
[ 104.090661] ffs_aio_cancel_worker: 0
io_submit(0xb79ed000, 2, [{aio_lio_opcode=IOCB_CMD_PREADV, aio_fildes=7,
aio_buf=[{iov_base=0xb78ec008, iov_len=1048576}], aio_offset=0, aio_resfd=3},
{aio_lio_opcode=IOCB_CMD_PREADV, aio_fildes=7, aio_buf=[{iov_base=0xb77eb008,
iov_len=1048576}], aio_offset=0, aio_resfd=3}]) = 2
epoll_wait(8, 0x12c4558, 1023, -1) = -1 EINTR (Interrupted system call)
--- SIGTERM {si_signo=SIGTERM, si_code=SI_USER, si_pid=1887, si_uid=1000} ---
+++ killed by SIGTERM +++
Here is the naive patch (NOT for inclusion, but as an RFC):
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -1069,23 +1069,33 @@ ffs_epfile_open(struct inode *inode, struct file *file)
return 0;
}
+static void ffs_aio_cancel_worker(struct work_struct *work)
+{
+ struct ffs_io_data *io_data = container_of(work, struct ffs_io_data,
+ work);
+ int value;
+
+ ENTER();
+
+ value = usb_ep_dequeue(io_data->ep, io_data->req);
+ printk("ffs_aio_cancel_worker: %i", value);
+}
+
static int ffs_aio_cancel(struct kiocb *kiocb)
{
struct ffs_io_data *io_data = kiocb->private;
- struct ffs_epfile *epfile = kiocb->ki_filp->private_data;
+ struct ffs_data *ffs = io_data->ffs;
int value;
ENTER();
- spin_lock_irq(&epfile->ffs->eps_lock);
-
- if (likely(io_data && io_data->ep && io_data->req))
- value = usb_ep_dequeue(io_data->ep, io_data->req);
- else
+ if (likely(io_data && io_data->ep && io_data->req)) {
+ INIT_WORK(&io_data->work, ffs_aio_cancel_worker);
+ queue_work(ffs->io_completion_wq, &io_data->work);
+ value = -EINPROGRESS;
+ } else
value = -EINVAL;
- spin_unlock_irq(&epfile->ffs->eps_lock);
-
return value;
}
Regards,
--
Vincent Pelletier
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html