Re: [PATCH 1/2] btrfs: drop trace_btrfs_all_work_done() from normal_work_helper()

2016-12-21 Thread David Sterba
On Wed, Dec 21, 2016 at 04:45:06PM +0800, Qu Wenruo wrote:
> 
> 
> At 12/21/2016 04:28 PM, Sebastian Andrzej Siewior wrote:
> > On 2016-12-21 08:33:03 [+0800], Qu Wenruo wrote:
> >> The trace point only uses the pointer, and this helps us to pair with
> >> btrfs_work_queued/sched.
> >
> > | /* For situiations that the work is freed */
> > | DECLARE_EVENT_CLASS(btrfs__work__done,
> > |
> > | TP_PROTO(struct btrfs_work *work),
> > |
> > | TP_ARGS(work),
> > |
> > | TP_STRUCT__entry_btrfs(
> > | __field(void *, work)
> > | ),
> > |
> > | TP_fast_assign_btrfs(btrfs_work_owner(work),
> > | __entry->work   = work;
> > | ),
> > |
> > | TP_printk_btrfs("work->%p", __entry->work)
> > | );
> >
> > and btrfs_work_owner exapnds to:
> >
> > | struct btrfs_fs_info *
> > | btrfs_work_owner(struct btrfs_work *work)
> > | {
> > | return work->wq->fs_info;
> > | }
> >
> > voilà
> 
> Oh I got it, thanks very much.
> 
> The btrfs_work_owner() is newly introduced, no wonder I didn't know that.
> 
> 
> I think we can fix it by extracting fs_info pointer before running the 
> work, and using the extracted one in the trace point.

That sounds like an easy fix to preserve the tracepoint.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] btrfs: drop trace_btrfs_all_work_done() from normal_work_helper()

2016-12-21 Thread Sebastian Andrzej Siewior
On 2016-12-21 16:45:06 [+0800], Qu Wenruo wrote:
> > | DECLARE_EVENT_CLASS(btrfs__work__done,
> > |
> > | TP_PROTO(struct btrfs_work *work),
> > |
> > | TP_ARGS(work),
> > |
> > | TP_STRUCT__entry_btrfs(
> > | __field(void *, work)
> > | ),
> > |
> > | TP_fast_assign_btrfs(btrfs_work_owner(work),
> > | __entry->work   = work;
> > | ),
> > |
> > | TP_printk_btrfs("work->%p", __entry->work)
> > | );
> > 
> > and btrfs_work_owner exapnds to:
> > 
> > | struct btrfs_fs_info *
> > | btrfs_work_owner(struct btrfs_work *work)
> > | {
> > | return work->wq->fs_info;
> > | }
> > 
> > voilà
> 
> Oh I got it, thanks very much.
> 
> The btrfs_work_owner() is newly introduced, no wonder I didn't know that.

It was introduced in cb001095ca70 ("btrfs: plumb fs_info into
btrfs_work") which is v4.8-rc1. The usage in trace points started in
bc074524e123 ("btrfs: prefix fsid to all trace events") which is also
v4.8-rc1. So whatever is done to get it fixed, it should be pushed
stable for v4.8+.

> Thanks,
> Qu

Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] btrfs: drop trace_btrfs_all_work_done() from normal_work_helper()

2016-12-21 Thread Qu Wenruo



At 12/21/2016 04:28 PM, Sebastian Andrzej Siewior wrote:

On 2016-12-21 08:33:03 [+0800], Qu Wenruo wrote:

The trace point only uses the pointer, and this helps us to pair with
btrfs_work_queued/sched.


| /* For situiations that the work is freed */
| DECLARE_EVENT_CLASS(btrfs__work__done,
|
| TP_PROTO(struct btrfs_work *work),
|
| TP_ARGS(work),
|
| TP_STRUCT__entry_btrfs(
| __field(void *, work)
| ),
|
| TP_fast_assign_btrfs(btrfs_work_owner(work),
| __entry->work   = work;
| ),
|
| TP_printk_btrfs("work->%p", __entry->work)
| );

and btrfs_work_owner exapnds to:

| struct btrfs_fs_info *
| btrfs_work_owner(struct btrfs_work *work)
| {
| return work->wq->fs_info;
| }

voilà


Oh I got it, thanks very much.

The btrfs_work_owner() is newly introduced, no wonder I didn't know that.


I think we can fix it by extracting fs_info pointer before running the 
work, and using the extracted one in the trace point.


Thanks,
Qu





But I still don't understand why backtrace is triggered.
Since we're just recording a pointer, not touching it.

Would you please explain the problem with more details on how it trigger the
problem?


enabled all events played with the fs which was just an upgrade and git
tree sync + checkout so nothing special.



So I think we should either remove the tracepoint completely or change
the arguments to take something else than a potentially freed 'work'.


I'm mostly OK to remove the tracepoint, but such all_workd_done() trace
should still help to determine if it's a workqueue stalled.

Thanks,
Qu


Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html





--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] btrfs: drop trace_btrfs_all_work_done() from normal_work_helper()

2016-12-21 Thread Sebastian Andrzej Siewior
On 2016-12-21 08:33:03 [+0800], Qu Wenruo wrote:
> The trace point only uses the pointer, and this helps us to pair with
> btrfs_work_queued/sched.

| /* For situiations that the work is freed */
| DECLARE_EVENT_CLASS(btrfs__work__done,
| 
| TP_PROTO(struct btrfs_work *work),
| 
| TP_ARGS(work),
| 
| TP_STRUCT__entry_btrfs(
| __field(void *, work)
| ),
| 
| TP_fast_assign_btrfs(btrfs_work_owner(work),
| __entry->work   = work;
| ),
| 
| TP_printk_btrfs("work->%p", __entry->work)
| );

and btrfs_work_owner exapnds to:

| struct btrfs_fs_info *
| btrfs_work_owner(struct btrfs_work *work)
| {
| return work->wq->fs_info;
| }
 
voilà

> But I still don't understand why backtrace is triggered.
> Since we're just recording a pointer, not touching it.
> 
> Would you please explain the problem with more details on how it trigger the
> problem?

enabled all events played with the fs which was just an upgrade and git
tree sync + checkout so nothing special.

> > 
> > So I think we should either remove the tracepoint completely or change
> > the arguments to take something else than a potentially freed 'work'.
> 
> I'm mostly OK to remove the tracepoint, but such all_workd_done() trace
> should still help to determine if it's a workqueue stalled.
> 
> Thanks,
> Qu

Sebastian
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] btrfs: drop trace_btrfs_all_work_done() from normal_work_helper()

2016-12-20 Thread Qu Wenruo



At 12/21/2016 01:26 AM, David Sterba wrote:

Adding Qu to CC,

On Wed, Dec 14, 2016 at 03:05:29PM +0100, Sebastian Andrzej Siewior wrote:

For btrfs_scrubparity_helper() the ->func() is set to
scrub_parity_bio_endio_worker(). This functions invokes
scrub_free_parity() which kfrees() the `work' object. All is good as
long as trace events are not enabled because we boom with a backtrace
like this:
| Workqueue: btrfs-endio btrfs_endio_helper
| RIP: 0010:[]  [] 
trace_event_raw_event_btrfs__work__done+0x4e/0xa0
| Call Trace:
|  [] btrfs_scrubparity_helper+0x59d/0x780
|  [] btrfs_endio_helper+0x9/0x10
|  [] process_one_work+0x26e/0x7b0
|  [] worker_thread+0x46/0x560
|  [] kthread+0xee/0x110
|  [] ret_from_fork+0x2a/0x40

So in order to avoid this, I remove the trace point.

Signed-off-by: Sebastian Andrzej Siewior 
---
 fs/btrfs/async-thread.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
index e0f071f6b5a7..d0dfc3d2e199 100644
--- a/fs/btrfs/async-thread.c
+++ b/fs/btrfs/async-thread.c
@@ -318,8 +318,6 @@ static void normal_work_helper(struct btrfs_work *work)
set_bit(WORK_DONE_BIT, >flags);
run_ordered_work(wq);
}
-   if (!need_order)
-   trace_btrfs_all_work_done(work);


The comment in the function says we can't touch 'work' after the
callbacks. I don't see any way to use it in a tracepoint here. The
"all_work_done" pairs with a preceding trace_btrfs_work_sched in the
same function or from within run_ordered_work, also called after the
free callback.


The trace point only uses the pointer, and this helps us to pair with 
btrfs_work_queued/sched.


But I still don't understand why backtrace is triggered.
Since we're just recording a pointer, not touching it.

Would you please explain the problem with more details on how it trigger 
the problem?




So I think we should either remove the tracepoint completely or change
the arguments to take something else than a potentially freed 'work'.


I'm mostly OK to remove the tracepoint, but such all_workd_done() trace 
should still help to determine if it's a workqueue stalled.


Thanks,
Qu



I'm a bit puzzled by the comment in trace/events/btrfs.h

http://lxr.free-electrons.com/source/include/trace/events/btrfs.h#L1165

/* For situiations that the work is freed */
DECLARE_EVENT_CLASS(btrfs__work__done,

so we're expecing a freed pointer anyway? That sounds wrong.

I'll queue the patch for 4.10 as it fixes a crash.





--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] btrfs: drop trace_btrfs_all_work_done() from normal_work_helper()

2016-12-20 Thread David Sterba
Adding Qu to CC,

On Wed, Dec 14, 2016 at 03:05:29PM +0100, Sebastian Andrzej Siewior wrote:
> For btrfs_scrubparity_helper() the ->func() is set to
> scrub_parity_bio_endio_worker(). This functions invokes
> scrub_free_parity() which kfrees() the `work' object. All is good as
> long as trace events are not enabled because we boom with a backtrace
> like this:
> | Workqueue: btrfs-endio btrfs_endio_helper
> | RIP: 0010:[]  [] 
> trace_event_raw_event_btrfs__work__done+0x4e/0xa0
> | Call Trace:
> |  [] btrfs_scrubparity_helper+0x59d/0x780
> |  [] btrfs_endio_helper+0x9/0x10
> |  [] process_one_work+0x26e/0x7b0
> |  [] worker_thread+0x46/0x560
> |  [] kthread+0xee/0x110
> |  [] ret_from_fork+0x2a/0x40
> 
> So in order to avoid this, I remove the trace point.
> 
> Signed-off-by: Sebastian Andrzej Siewior 
> ---
>  fs/btrfs/async-thread.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
> index e0f071f6b5a7..d0dfc3d2e199 100644
> --- a/fs/btrfs/async-thread.c
> +++ b/fs/btrfs/async-thread.c
> @@ -318,8 +318,6 @@ static void normal_work_helper(struct btrfs_work *work)
>   set_bit(WORK_DONE_BIT, >flags);
>   run_ordered_work(wq);
>   }
> - if (!need_order)
> - trace_btrfs_all_work_done(work);

The comment in the function says we can't touch 'work' after the
callbacks. I don't see any way to use it in a tracepoint here. The
"all_work_done" pairs with a preceding trace_btrfs_work_sched in the
same function or from within run_ordered_work, also called after the
free callback.

So I think we should either remove the tracepoint completely or change
the arguments to take something else than a potentially freed 'work'.

I'm a bit puzzled by the comment in trace/events/btrfs.h

http://lxr.free-electrons.com/source/include/trace/events/btrfs.h#L1165

/* For situiations that the work is freed */
DECLARE_EVENT_CLASS(btrfs__work__done,

so we're expecing a freed pointer anyway? That sounds wrong.

I'll queue the patch for 4.10 as it fixes a crash.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] btrfs: drop trace_btrfs_all_work_done() from normal_work_helper()

2016-12-14 Thread Sebastian Andrzej Siewior
For btrfs_scrubparity_helper() the ->func() is set to
scrub_parity_bio_endio_worker(). This functions invokes
scrub_free_parity() which kfrees() the `work' object. All is good as
long as trace events are not enabled because we boom with a backtrace
like this:
| Workqueue: btrfs-endio btrfs_endio_helper
| RIP: 0010:[]  [] 
trace_event_raw_event_btrfs__work__done+0x4e/0xa0
| Call Trace:
|  [] btrfs_scrubparity_helper+0x59d/0x780
|  [] btrfs_endio_helper+0x9/0x10
|  [] process_one_work+0x26e/0x7b0
|  [] worker_thread+0x46/0x560
|  [] kthread+0xee/0x110
|  [] ret_from_fork+0x2a/0x40

So in order to avoid this, I remove the trace point.

Signed-off-by: Sebastian Andrzej Siewior 
---
 fs/btrfs/async-thread.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
index e0f071f6b5a7..d0dfc3d2e199 100644
--- a/fs/btrfs/async-thread.c
+++ b/fs/btrfs/async-thread.c
@@ -318,8 +318,6 @@ static void normal_work_helper(struct btrfs_work *work)
set_bit(WORK_DONE_BIT, >flags);
run_ordered_work(wq);
}
-   if (!need_order)
-   trace_btrfs_all_work_done(work);
 }
 
 void btrfs_init_work(struct btrfs_work *work, btrfs_work_func_t uniq_func,
-- 
2.11.0

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html