On Fri, Jul 05, 2019 at 10:29:24PM +0200, Thomas Klausner wrote:
> From some debugging so far, the cause for the hang seems to be that
> the nvme driver is waiting for an interrupt that doesn't come.
>
> At least once I got it to get unstuck by calling "nvme_intr()" on the
> softc address from ddb.
On Mon, Jul 01, 2019 at 07:02:06AM -, Michael van Elst wrote:
> t...@giga.or.at (Thomas Klausner) writes:
>
> >So it looks like it could be a very extreme slowness instead of a
> >complete deadlock.
>
> When it stops, try to reduce kern.maxvnodes to something low (like 100),
> you can
Hello. If I were looking at this issue, I'd be looking at the perl
process stuck in bioloc, to see what it's doing. As I understand it,
processes stuck in tstile are a symptom, rather than a cause. That is, any
process that is waiting for access to some subsystem in an indirect manner
On Fri, Jun 28, 2019 at 09:42:08PM +0200, Thomas Klausner wrote:
> To reduce the bug surface, I've disconnected the wd0 device which was
> attached at ahcisata. This also removed the swap device, but the
> machine is far from needing to swap.
>
> After ~5 hours the machine is currently hanging in
t...@giga.or.at (Thomas Klausner) writes:
>So it looks like it could be a very extreme slowness instead of a
>complete deadlock.
When it stops, try to reduce kern.maxvnodes to something low (like 100),
you can restore it, if the machine wakes up.
If this is a memory shortage instead of a
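
For reference, the maxvnodes suggestion above could be sketched roughly as
follows. This is only an illustration, not something posted in the thread:
the helper names run and set_maxvnodes are made up, and DRY_RUN defaults
to 1 so that the commands are printed rather than executed. The only real
interfaces assumed are "sysctl -n" and "sysctl -w" on the kern.maxvnodes
variable mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: print the command when DRY_RUN=1 (the default),
# otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: set kern.maxvnodes to the numeric value in $1.
# Rejects empty or non-numeric arguments.
set_maxvnodes() {
    case "$1" in
        ''|*[!0-9]*) echo "usage: set_maxvnodes NUMBER" >&2; return 1 ;;
    esac
    run sysctl -w "kern.maxvnodes=$1"
}
```

On a live system one would presumably record the old value first with
old=$(sysctl -n kern.maxvnodes), call set_maxvnodes 100 with DRY_RUN=0
while the machine is stuck, and call set_maxvnodes "$old" once it wakes up.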
To reduce the bug surface, I've disconnected the wd0 device which was
attached at ahcisata. This also removed the swap device, but the
machine is far from needing to swap.
After ~5 hours the machine is currently hanging in tstile again. I
noticed the bulk build wasn't progressing (in a perl
This time it recovered! It took an hour or so, but the tstile blocked
processes are now gone (finished) and I got my console shell (with the
rm) back.
Nothing in dmesg or /var/log/messages.
So it looks like it could be a very extreme slowness instead of a
complete deadlock.
Thomas
On Fri, Jun
With dmesg this time.
On Fri, Jun 28, 2019 at 11:39:05AM +0200, Thomas Klausner wrote:
> Hi Frank!
>
> I checked some process states in ddb.
>
> "master", the 2 "bjam" and at least one "cp" hanging in tstile have:
> sleepq_block()
> turnstile_block()
> rw_vector_enter()
> genfs_lock()
>
When I tried to get a core, I saw:
> reboot 0x104
dumping to dev 168,2 (offset=73677660, size=33524130)
dump ahcisata0 port 5: clearing WDCTL_RST failed for drive 0
wddump: device timed out
i/o error
rebooting...
Thomas
On Fri, Jun 28, 2019 at 11:39:05AM +0200, Thomas Klausner wrote:
> Hi
On Fri, Jun 28, 2019 at 11:44:37AM +0100, Robert Swindells wrote:
>
> Thomas Klausner wrote:
> >I've set up a new machine for bulk building. I have tried various
> >things, but in the end it always hangs in tstile.
> >
> >First try was what I currently use: tmpfs sandboxes with nullfs
> >mounted
Thomas Klausner wrote:
>I've set up a new machine for bulk building. I have tried various
>things, but in the end it always hangs in tstile.
>
>First try was what I currently use: tmpfs sandboxes with nullfs
>mounted /bin, /lib, ... When it hung, the suspicion was that it's
>nullfs' fault. (The
Hi Frank!
I checked some process states in ddb.
"master", the 2 "bjam" and at least one "cp" hanging in tstile have:
sleepq_block()
turnstile_block()
rw_vector_enter()
genfs_lock()
VOP_LOCK()
vn_lock()
namei_tryemulroot()
namei()
check_exec()
execve_loadvm()
execve1()
syscall()
These look quite
Hi Thomas,
Glad that this is observed elsewhere.
Maybe the following bugs match your observations:
kern/54207 [serious/high]:
-current locks up solidly when pkgsrc building
adapta-gtk-theme-3.95.0.11
Looks like a locking issue in layerfs* (nullfs). (AMD 1800X, 64GB)