Hello

> > > > I will check the memory as well.
> >
> > The memory check didn't report any error, but I replaced the whole
> > memory anyway.
>
> Oh, that is not what I meant.
> I was talking about "some other SOFTWARE component MAY overwrite
> something on memory". I never meant a HARDWARE error.
> If this misunderstanding forced you to buy new memory (hardware), I
> feel sorry.

No problem. I had the memory already. I misunderstood you.
> > I can reproduce it again and again on two machines, even with
> > replaced RAM. A new observation is that the command sometimes
> > unblocks again (after a few minutes), but not every time.
>
> I have tested your "while" loop test for about an hour, but found
> nothing wrong. But my system runs linux-3.3-rcN instead of 3.2.1.
I also have systems where I am unable to reproduce the error. It's very
strange.
> > Call Trace:
> > [<c102862b>] ? try_to_wake_up+0x17b/0x1f0
> > [<c1020dc0>] ? __wake_up_common+0x40/0x70
> > [<c10ac1ef>] ? iput+0x2f/0x1a0
> > [<c134b630>] schedule+0x30/0x50
> > [<f80bbdfd>] au_hn_alloc+0x24d/0x370 [aufs]
> > [<c10431a0>] ? wake_up_bit+0x60/0x60
> > [<f80bbb8c>] au_hn_free+0x1c/0x40 [aufs]
> > [<f80b36cb>] au_hiput+0xb/0x20 [aufs]
> > [<f80b380b>] au_iinfo_fin+0x12b/0x1a0 [aufs]
> > [<f80a13ab>] au_si_free+0xabb/0xc00 [aufs]
> > [<c10ac04c>] destroy_inode+0x2c/0x50
> > [<c10ac143>] evict+0xd3/0x150
> > [<c10ac27d>] iput+0xbd/0x1a0
> > [<c10aa1ef>] d_kill+0x9f/0xf0
> > [<c10aa3f0>] shrink_dentry_list+0x1b0/0x1d0
> > [<c10aa8ae>] shrink_dcache_sb+0x5e/0x90
> > [<c1098b25>] do_remount_sb+0x35/0x160
> > [<c1034110>] ? ns_capable+0x20/0x50
> > [<c10b1870>] do_mount+0x500/0x6e0
> > [<c101b840>] ? mm_fault_error+0x130/0x130
> > [<c10b0208>] ? copy_mount_options+0x98/0x110
> > [<c10b1ab6>] sys_mount+0x66/0xa0
> > [<c134d065>] syscall_call+0x7/0xb
>
> This call trace may be unreliable too.
> You can see destroy_inode() calling au_si_free() as well as
> au_hn_free() calling au_hn_alloc(), but there are no such calls in the
> source files. Look at the VFS function destroy_inode() in
> linux/fs/inode.c, and you can find what I mean.
>
> > You say that au_hn_alloc() cannot follow au_hn_free(). But how can
> > it be that I can reproduce it again and again? Which code is
> > executing the function "wake_up_bit()"?
>
> It is prefixed by '?' in the call trace, which means that the address
> is unreliable. In other words, if the function name is not prefixed by
> '?', it is reliable.
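
To apply the rule above to a long trace, the '?' frames can be filtered out mechanically. The following is only a minimal sketch in C: the function names frame_is_reliable() and print_reliable_frames() are hypothetical helpers made up for this mail, not part of aufs or the kernel, and the code assumes the "[<address>] symbol" line format shown in the trace above. It checks nothing beyond the '?' marker, so a frame it keeps can still be wrong.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Classify one line of a kernel call trace (hypothetical helper).
 * Returns 1 if the frame has no '?' before the symbol (reliable),
 *         0 if the symbol is prefixed by '?' (unreliable),
 *        -1 if the line does not look like a "[<address>] symbol" frame.
 * Leading mail quote markers ('>') and whitespace are skipped.
 */
int frame_is_reliable(const char *line)
{
    const char *p = line;

    /* Skip quote markers and whitespace left over from mail quoting. */
    while (*p == '>' || isspace((unsigned char)*p))
        p++;

    /* A frame starts with "[<" and its address ends with ">]". */
    if (strncmp(p, "[<", 2) != 0)
        return -1;
    p = strstr(p, ">]");
    if (p == NULL)
        return -1;
    p += 2;

    /* Skip the blanks between the address and the symbol (or '?'). */
    while (isspace((unsigned char)*p))
        p++;
    if (*p == '\0')
        return -1;

    return *p != '?';
}

/*
 * Copy only the reliable frames from `in` to `out` (hypothetical helper).
 * Returns the number of frames written.
 */
int print_reliable_frames(FILE *in, FILE *out)
{
    char buf[512];
    int count = 0;

    while (fgets(buf, sizeof buf, in) != NULL) {
        if (frame_is_reliable(buf) == 1) {
            fputs(buf, out);
            count++;
        }
    }
    return count;
}
```

Calling print_reliable_frames(stdin, stdout) from a small main() and piping the saved trace into it would leave only the frames without '?', for example schedule and au_hn_alloc from the trace above.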
Ok, I understand.
> > Is there anything else I can try?
>
> - MagicSysRq + A
> or
> - set the aufs module parameter 'debug' to 1, just before the mount
>   that hangs
>
> But I am afraid they may not help, since your call trace looks
> unbelievable to me.
>
> Another option is modifying aufs-util/mount.aufs.c.
> Around line 223, you will see
>	flags[AuFlush] = test_flush(opts);
>	if (flags[AuFlush] /* && !flags[Fake] */) {
>		err = au_plink(cwd, AuPlink_FLUSH,
>			       AuPlinkFlag_OPEN | AuPlinkFlag_CLOEXEC,
>			       &fd);
>		if (err)
>			AuFin(NULL);
>	}
>
> In your case, the function au_plink() is not called. But calling it may
> break the current situation. So I'd suggest that you set
> flags[AuFlush] to 1 regardless of the return value from test_flush().
I will try again, with this change and your patch from the other post.
I have some news:
There is a cron job which causes the problem. The cron job runs "rsync"
in *dry-run* mode between / and a branch (/mnt/overlay). I can also
trigger it with a simple "find /usr -type f".
Most of the time, when the command hangs, it does so only for a short
time: some seconds or a few minutes.
I will do some further research.
Thanks for your help.
Best regards
Elmar
