On Thu, 2014-02-27 at 00:52 +0000, Ken Moffat wrote:
> Hi,
>
> Short summary : on 3.13.5, rm -rf of an application source
> directory on an ext4 filesystem sometimes takes forever (probably
> isn't going anywhere), with one CPU pegged at all-but 100% utilization.
>
> I've nearly finished building a new system from source, to check
> various desktop packages in linuxfromscratch. On this build, much of
> it is things I don't normally use and I needed to upgrade my
> buildscripts, so most of it was built in chroot using 3.10.32. But
> late last night I booted the new system using 3.13.5 to finish the
> build. This morning I discovered that rm -rf for the icedtea source
> directory was still running, and had taken over 5 hours of CPU time
> (one CPU seemed to be running at close to 100%, the others had dropped
> to their slowest frequency). That script was running as root (yeah,
> but it's a new system) and it looks as if /etc/passwd~ had got
> trashed, because I could no longer su or login. Not sure if that is
> related, at this stage it might just be a side-effect of my scripts.
>
> Booted another system, chrooted, fixed up passwords. Started
> again after commenting out icedtea - I hadn't intended to build
> what was an old version, I'd just forgotten it was in this script -
> that's why I do things in userspace, not the kernel :-(
>
> Continued with remaining packages, but a couple of hours later I
> saw a similar "one CPU at 100%, rm -rf GConf source taking forever"
> problem. Dumped all the processes with Alt-SysRQ-T [ huge log ] but
> at that point 'rm' was merely 'ready' so I doubt there is anything
> useful to see in the log.
>
> Built 3.13.4, booted to that. So far, everything looks good - but
> I'm now building the _current_ version of icedtea, so if this isn't
> a new 3.13.5 problem I guess I'm fairly likely to see it tomorrow.
>
> Meanwhile, any suggestions about how I can debug this if I hit it
> again, please ?

I would start with strace to see if a task is looping in userspace, then
move on to perf top -g -p <pid> (or perf record/report) to peek at what
it's up to in the kernel. Once you have the where, trace_printk() is the
best thing since sliced bread (which ranks just below printk()).

-Mike
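
Before reaching for strace or perf, a quick look at where the spinning task is accumulating CPU time can hint at which tool to pick up first. The following is only a rough sketch, not part of the advice above: it assumes the Linux /proc/<pid>/stat layout, where utime and stime are fields 14 and 15, and the helper names parse_cpu_ticks() and sample() are made up for illustration.

```python
import os
import sys
import time


def parse_cpu_ticks(stat_line):
    """Return (utime, stime) in clock ticks from one /proc/<pid>/stat line."""
    # The comm field is wrapped in parentheses and may itself contain spaces
    # or parentheses, so split only what follows the LAST ')'.
    rest = stat_line[stat_line.rindex(")") + 1:].split()
    # rest[0] is field 3 (state), so field 14 (utime) is rest[11]
    # and field 15 (stime) is rest[12].
    return int(rest[11]), int(rest[12])


def sample(pid, interval=1.0):
    """Return (user_seconds, system_seconds) of CPU used over `interval`."""
    path = "/proc/%d/stat" % pid
    hz = os.sysconf("SC_CLK_TCK")  # clock ticks per second
    with open(path) as f:
        u0, s0 = parse_cpu_ticks(f.read())
    time.sleep(interval)
    with open(path) as f:
        u1, s1 = parse_cpu_ticks(f.read())
    return (u1 - u0) / hz, (s1 - s0) / hz


if __name__ == "__main__":
    # Usage (hypothetical script name): python cpusplit.py <pid>
    user, system = sample(int(sys.argv[1]))
    print("user %.2fs, system %.2fs" % (user, system))
```

If nearly all of the time shows up as system time, the task is most likely spinning inside the kernel, which is where perf top -g -p <pid> becomes the more useful tool. Mostly user time points back at userspace.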