On Fri, 2013-03-29 at 08:42 +0100, Piotr Szymaniak wrote:
> On Thu, Mar 28, 2013 at 10:22:52PM +0300, Vyacheslav Dubeyko wrote:
> > Thank you for the additional details. Unfortunately, your sysrq-trigger
> > output is not complete, so I can't draw a conclusion about which
> > operation caused the issue on your side. Could you send me the
> > full log of the sysrq-trigger output?
>
> How do I generate a more verbose log?
>
I meant that the sysrq-trigger output should end with a summary of runnable
tasks and the locks held in the system, for example:
[ 8670.960040] runnable tasks:
[ 8670.960040]             task   PID         tree-key  switches  prio     exec-runtime         sum-exec        sum-sleep
[ 8670.960040] ----------------------------------------------------------------------------------------------------------
[ 8670.960040]      migration/1    14         0.002037      2223     0         0.002037      2308.610315         0.000000 /
[ 8670.960040]      kworker/1:1    29   2713578.453275     43727   120   2713578.453275      2098.177537   8659787.039783 /
[ 8670.960040] R           bash 11089        61.102602       142   120        61.102602        39.773082     46581.905519 /autogroup-171
[ 8670.960040]
[ 8670.960040]
[ 8670.960040] Showing all locks held in the system:
[ 8670.960040] 1 lock held by getty/892:
[ 8670.960040]  #0:  (&ldata->atomic_read_lock){+.+.+.}, at: [<ffffffff81460139>] n_tty_read+0x399/0x950
[ 8670.960040] 1 lock held by getty/901:
[ 8670.960040]  #0:  (&ldata->atomic_read_lock){+.+.+.}, at: [<ffffffff81460139>] n_tty_read+0x399/0x950
[ 8670.960040] 1 lock held by getty/927:
[ 8670.960040]  #0:  (&ldata->atomic_read_lock){+.+.+.}, at: [<ffffffff81460139>] n_tty_read+0x399/0x950
[ 8670.960040] 1 lock held by getty/930:
[ 8670.960040]  #0:  (&ldata->atomic_read_lock){+.+.+.}, at: [<ffffffff81460139>] n_tty_read+0x399/0x950
[ 8670.960040] 1 lock held by getty/937:
[ 8670.960040]  #0:  (&ldata->atomic_read_lock){+.+.+.}, at: [<ffffffff81460139>] n_tty_read+0x399/0x950
[ 8670.960040] 1 lock held by getty/1214:
[ 8670.960040]  #0:  (&ldata->atomic_read_lock){+.+.+.}, at: [<ffffffff81460139>] n_tty_read+0x399/0x950
[ 8670.960040] 1 lock held by bash/9572:
[ 8670.960040]  #0:  (&ldata->atomic_read_lock){+.+.+.}, at: [<ffffffff81460139>] n_tty_read+0x399/0x950
[ 8670.960040] 1 lock held by bash/9797:
[ 8670.960040]  #0:  (&ldata->atomic_read_lock){+.+.+.}, at: [<ffffffff81460139>] n_tty_read+0x399/0x950
[ 8670.960040] 1 lock held by bash/9879:
[ 8670.960040]  #0:  (&ldata->atomic_read_lock){+.+.+.}, at: [<ffffffff81460139>] n_tty_read+0x399/0x950
[ 8670.960040] 1 lock held by bash/10021:
[ 8670.960040]  #0:  (&ldata->atomic_read_lock){+.+.+.}, at: [<ffffffff81460139>] n_tty_read+0x399/0x950
[ 8670.960040] 1 lock held by bash/10224:
[ 8670.960040]  #0:  (&ldata->atomic_read_lock){+.+.+.}, at: [<ffffffff81460139>] n_tty_read+0x399/0x950
[ 8670.960040] 2 locks held by bash/11089:
[ 8670.960040]  #0:  (sysrq_key_table_lock){......}, at: [<ffffffff814689c2>] __handle_sysrq+0x32/0x190
[ 8670.960040]  #1:  (tasklist_lock){.?.?.-}, at: [<ffffffff810bf4e4>] debug_show_all_locks+0x44/0x1e0
[ 8670.960040]
[ 8670.960040] =============================================
So, I need the full sysrq-trigger output to understand the situation on
your side. Could you share it?
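For completeness, a minimal sketch of how the full output is usually
captured (assuming sysrq is enabled on your kernel):

    # enable all sysrq functions if your distro restricts them
    echo 1 > /proc/sys/kernel/sysrq
    # 't' dumps the state of all tasks; with scheduler and lock debugging
    # enabled, it ends with the runnable-tasks summary and lock list above
    echo t > /proc/sysrq-trigger
    # save the complete result from the kernel ring buffer
    dmesg > sysrq-output.txt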
> Btw, looking at the ps aux output, it seems that this flush has been
> hanging there almost since the first boot:
> root 937 88.2 0.0 0 0 ? S Feb17 50160:39 [flush-8:0]
>
Could you share "cat /proc/partitions" and "mount" outputs? I need to
understand what partition is processed by [flush-8:0].
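As a hint on reading the thread name: the "8:0" in [flush-8:0] is the
major:minor number of the backing block device, which on a typical setup
is /dev/sda (device names below are just an illustration):

    $ ls -l /dev/sda
    brw-rw---- 1 root disk 8, 0 ... /dev/sda   # major 8, minor 0 -> flush-8:0
    $ cat /proc/partitions                     # lists major/minor for every partition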
> And an uptime:
> 08:25:18 up 39 days, 10:57, 4 users, load average: 0.72, 0.89, 0.98
>
Yes, this can help pinpoint the reason for the issue on your side.
> Also, I don't know if this isn't some kind of regression. I've been using
> nilfs for, well, some time now (judging by the fs creation date, Aug 2011)
> and didn't notice any strange behaviour before. I think I won't be able to
> check this, but before that last boot I used (95% sure) vanilla kernel
> 3.4.4 and it was not flushing things, or I didn't notice. I could go to
> some older LTS kernel on another machine and check that.
>
>
> > I can easily reproduce the issue by deleting or truncating a big file
> > (100 - 500 GB). Please find the description in:
> > http://www.mail-archive.com/[email protected]/msg01504.html.
>
> In case this brings something new: I'm not using huge files like that (the
> flush above is on a 3.7G device). But if this reproduces the issue, it
> could be related. (:
>
Yes, it is really important to understand the situation on your side,
because you may have a different cause of the issue with similar symptoms.
So, we need to investigate your case more deeply, I think.
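For reference, a rough sketch of the reproduction from the message linked
above: create a very large file on the nilfs2 volume, then delete or
truncate it (the mount point and size here are assumptions):

    # write a large file to the nilfs2 mount, then remove it
    dd if=/dev/zero of=/mnt/nilfs/bigfile bs=1M count=102400   # ~100 GB
    rm /mnt/nilfs/bigfile
    # afterwards, watch whether the per-device flush thread stays busy in ps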
With the best regards,
Vyacheslav Dubeyko.
>
> Piotr Szymaniak.