Re: [PATCH] fsnotify: don't call mutex_lock from TASK_INTERRUPTIBLE context

2014-11-04 Thread Andrew Morton
On Sat,  1 Nov 2014 23:51:38 -0400 Sasha Levin  wrote:

> Sleeping functions should only be called from TASK_RUNNING. The following
> code in fanotify_read():
> 
>   prepare_to_wait(&group->notification_waitq, &wait, TASK_INTERRUPTIBLE);
> 
> mutex_lock(&group->notification_mutex);
> 
> ends up calling mutex_lock(), a sleeping function, while in
> TASK_INTERRUPTIBLE, and triggers a warning:
> 
> [12326.092094] WARNING: CPU: 27 PID: 30207 at kernel/sched/core.c:7305 
> __might_sleep+0xd2/0x110()
> [12326.092878] do not call blocking ops when !TASK_RUNNING; state=1 set at 
> prepare_to_wait (./arch/x86/include/asm/current.h:14 kernel/sched/wait.c:179)
> [12326.093938] Modules linked in:
>
> ...
>

It's a fairly minor problem - if mutex_lock() hits contention we get
flipped into TASK_RUNNING and the schedule() immediately returns and we
take another trip around the loop.

fanotify_read() also calls copy_event_to_user()->copy_to_user() in
TASK_INTERRUPTIBLE state.  That's a bug, and it's why the first thing
handle_mm_fault() does is set TASK_RUNNING.
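
For context, the open-coded loop in fanotify_read() has roughly this shape
(a condensed paraphrase for illustration only, not the exact upstream code:
the O_NONBLOCK, error and buffer-advance handling is trimmed, kevent/ret/buf
stand in for the function's locals, and the helper names are the ones
visible in the fanotify_user.c hunks quoted further down):

	while (1) {
		/* Queue ourselves and switch to TASK_INTERRUPTIBLE. */
		prepare_to_wait(&group->notification_waitq, &wait,
				TASK_INTERRUPTIBLE);

		/* Sleeps on contention -> __might_sleep() complains,
		 * because we are no longer TASK_RUNNING here. */
		mutex_lock(&group->notification_mutex);
		kevent = get_one_event(group, count);
		mutex_unlock(&group->notification_mutex);

		if (!kevent) {
			if (signal_pending(current))
				break;
			schedule();	/* sleep until an event is queued */
			continue;
		}

		/* Still TASK_INTERRUPTIBLE; copy_to_user() may fault and sleep. */
		ret = copy_event_to_user(group, kevent, buf);
		/* ... check ret, advance buf ... */
	}
	finish_wait(&group->notification_waitq, &wait);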

> Instead of trying to fix fanotify_read() I've converted
> notification_mutex into a spinlock.  I didn't see a reason why it
> should be a mutex, nor did anything complain when I ran the same tests
> again.

This could be a latency problem - those lists can get very long.

I wonder if we can zap the prepare_to_wait()/finish_wait() and use
something like

wait_event_interruptible(notification_waitq, foo(group, count));

int foo(struct fsnotify_group *group, size_t count)
{
	int ret;

	mutex_lock(&group->notification_mutex);
	ret = get_one_event(group, count);
	mutex_unlock(&group->notification_mutex);
	return ret;
}
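
Wired into fanotify_read() that might look something like the below (an
untested sketch of the idea only, not a reviewed patch: the O_NONBLOCK and
partial-read handling is left out, kevent stands for the function's local
event pointer, and fanotify_dequeue_event() is a hypothetical name for a
variant of foo() that hands the dequeued event back through an output
parameter, since get_one_event() actually removes it from the list):

/* Dequeue one event (or error pointer) with the mutex held, and tell
 * wait_event_interruptible() whether there is anything to hand out yet. */
static bool fanotify_dequeue_event(struct fsnotify_group *group, size_t count,
				   struct fsnotify_event **event)
{
	mutex_lock(&group->notification_mutex);
	*event = get_one_event(group, count);
	mutex_unlock(&group->notification_mutex);

	return *event != NULL;
}

	/* In fanotify_read(): the mutex is now only taken from TASK_RUNNING,
	 * and wait_event_interruptible() owns the sleep/wake bookkeeping. */
	ret = wait_event_interruptible(group->notification_waitq,
			fanotify_dequeue_event(group, count, &kevent));
	if (ret)
		return ret;	/* -ERESTARTSYS: interrupted by a signal */
	if (IS_ERR(kevent))
		return PTR_ERR(kevent);

	ret = copy_event_to_user(group, kevent, buf);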


[PATCH] fsnotify: don't call mutex_lock from TASK_INTERRUPTIBLE context

2014-11-01 Thread Sasha Levin
Sleeping functions should only be called from TASK_RUNNING. The following
code in fanotify_read():

prepare_to_wait(&group->notification_waitq, &wait, TASK_INTERRUPTIBLE);

mutex_lock(&group->notification_mutex);

ends up calling mutex_lock(), a sleeping function, while in
TASK_INTERRUPTIBLE, and triggers a warning:

[12326.092094] WARNING: CPU: 27 PID: 30207 at kernel/sched/core.c:7305 
__might_sleep+0xd2/0x110()
[12326.092878] do not call blocking ops when !TASK_RUNNING; state=1 set at 
prepare_to_wait (./arch/x86/include/asm/current.h:14 kernel/sched/wait.c:179)
[12326.093938] Modules linked in:
[12326.094261] CPU: 27 PID: 30207 Comm: fanotify01 Not tainted 
3.18.0-rc2-next-20141031-sasha-00057-g9a0b11b-dirty #1435
[12326.095255]  0009  88003b563000 
88005bfbbc38
[12326.096019]  90dabf13  88005bfbbc98 
88005bfbbc88
[12326.096791]  8c1b12fa 88005bfbbc88 8c1f6112 
001d76c0
[12326.097610] Call Trace:
[12326.097881] dump_stack (lib/dump_stack.c:52)
[12326.098383] warn_slowpath_common (kernel/panic.c:432)
[12326.098973] ? __might_sleep (kernel/sched/core.c:7311)
[12326.099512] ? prepare_to_wait (./arch/x86/include/asm/current.h:14 
kernel/sched/wait.c:179)
[12326.100100] warn_slowpath_fmt (kernel/panic.c:446)
[12326.100704] ? check_chain_key (kernel/locking/lockdep.c:2190)
[12326.101319] ? prepare_to_wait (./arch/x86/include/asm/current.h:14 
kernel/sched/wait.c:179)
[12326.101870] ? prepare_to_wait (./arch/x86/include/asm/current.h:14 
kernel/sched/wait.c:179)
[12326.102421] __might_sleep (kernel/sched/core.c:7311)
[12326.102949] ? prepare_to_wait (./arch/x86/include/asm/current.h:14 
kernel/sched/wait.c:179)
[12326.103502] ? prepare_to_wait (kernel/sched/wait.c:181)
[12326.104060] mutex_lock_nested (kernel/locking/mutex.c:623)
[12326.104620] ? preempt_count_sub (kernel/sched/core.c:2641)
[12326.105324] ? _raw_spin_unlock_irqrestore 
(./arch/x86/include/asm/preempt.h:95 include/linux/spinlock_api_smp.h:161 
kernel/locking/spinlock.c:191)
[12326.105986] ? prepare_to_wait (kernel/sched/wait.c:181)
[12326.106542] fanotify_read (./arch/x86/include/asm/atomic.h:27 
include/linux/mutex.h:131 fs/notify/fanotify/fanotify_user.c:57 
fs/notify/fanotify/fanotify_user.c:273)
[12326.107070] ? abort_exclusive_wait (kernel/sched/wait.c:291)
[12326.107676] vfs_read (fs/read_write.c:430)
[12326.108169] SyS_read (fs/read_write.c:569 fs/read_write.c:562)
[12326.108652] tracesys_phase2 (arch/x86/kernel/entry_64.S:529)

Instead of trying to fix fanotify_read() I've converted notification_mutex
into a spinlock. I didn't see a reason why it should be a mutex, nor did
anything complain when I ran the same tests again.

Signed-off-by: Sasha Levin 
---
 fs/notify/fanotify/fanotify_user.c |   18 +-
 fs/notify/group.c  |2 +-
 fs/notify/inotify/inotify_user.c   |   16 
 fs/notify/notification.c   |   22 +++---
 include/linux/fsnotify_backend.h   |2 +-
 5 files changed, 30 insertions(+), 30 deletions(-)

diff --git a/fs/notify/fanotify/fanotify_user.c 
b/fs/notify/fanotify/fanotify_user.c
index c991616..f03bffc 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -49,12 +49,12 @@ struct kmem_cache *fanotify_perm_event_cachep __read_mostly;
  * enough to fit in "count". Return an error pointer if the count
  * is not large enough.
  *
- * Called with the group->notification_mutex held.
+ * Called with the group->notification_lock held.
  */
 static struct fsnotify_event *get_one_event(struct fsnotify_group *group,
size_t count)
 {
-   BUG_ON(!mutex_is_locked(&group->notification_mutex));
+   BUG_ON(!spin_is_locked(&group->notification_lock));
 
pr_debug("%s: group=%p count=%zd\n", __func__, group, count);
 
@@ -64,7 +64,7 @@ static struct fsnotify_event *get_one_event(struct 
fsnotify_group *group,
if (FAN_EVENT_METADATA_LEN > count)
return ERR_PTR(-EINVAL);
 
-   /* held the notification_mutex the whole time, so this is the
+   /* held the notification_lock the whole time, so this is the
 * same event we peeked above */
return fsnotify_remove_first_event(group);
 }
@@ -244,10 +244,10 @@ static unsigned int fanotify_poll(struct file *file, 
poll_table *wait)
int ret = 0;
 
poll_wait(file, &group->notification_waitq, wait);
-   mutex_lock(&group->notification_mutex);
+   spin_lock(&group->notification_lock);
if (!fsnotify_notify_queue_is_empty(group))
ret = POLLIN | POLLRDNORM;
-   mutex_unlock(&group->notification_mutex);
+   spin_unlock(&group->notification_lock);
 
return ret;
 }
@@ -269,9 +269,9 @@ static ssize_t fanotify_read(struct file *file, char __user 
*buf,
while (1) {
prepare_to_wait(&group->notification_waitq, &wait, 
TASK_INTERRUPTIBLE);

-   mutex_lock(&group->notification_mutex);
+   
