Re: unkillable process consuming 100% cpu

2019-11-13 Thread Steve Kargl
On Mon, Nov 11, 2019 at 01:22:09PM +0100, Hans Petter Selasky wrote:

> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index a6e0a16ae..0697d70f4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c

Are you using ports/graphics/drm-devel-kmod?
This file does not exist in drm-current-kmod.

> @@ -236,6 +238,12 @@ static int amdgpu_amdkfd_remove_eviction_fence(struct 
> amdgpu_bo *bo,

Using 'nm *.ko | grep eviction_fence' in /boot/modules shows
that none of the modules contain amdgpu_amdkfd_remove_eviction_fence().

-- 
Steve
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: unkillable process consuming 100% cpu

2019-11-13 Thread Steve Kargl
On Wed, Nov 13, 2019 at 04:22:19PM +0100, Hans Petter Selasky wrote:
> On 2019-11-13 15:52, Steve Kargl wrote:
> >  at /usr/src/sys/amd64/amd64/trap.c:743
> > #7  0x808b0468 in trap (frame=0xfe00b460e0c0)
> >  at /usr/src/sys/amd64/amd64/trap.c:407
> > #8  
> > #9  0x in ?? ()
> > #10 0x817d2c0f in radeon_ttm_tt_to_gtt (ttm=0xf80061eeb248)
> >  at 
> > /usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/radeon/radeon_ttm.c:720
> > #11 radeon_ttm_tt_set_userptr (ttm=0xf80061eeb248, addr=1,
> >  flags=2147483647)
> 
> Hi,
> 
> I don't see any function call here. Can you try to double check the 
> backtrace?
> 
> Which version of FreeBSD is this?
> 

% uname -a (trimmed)
FreeBSD 13.0-CURRENT r353571

% kgdb /usr/lib/debug/boot/kernel/kernel.debug vmcore.2
% bt
...
#7  0x808b0468 in trap (frame=0xfe00b460e0c0)
at /usr/src/sys/amd64/amd64/trap.c:407
#8  
#9  0x in ?? ()
#10 0x817d2c0f in radeon_ttm_tt_to_gtt (ttm=0xf80061eeb248)
at 
/usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/radeon/radeon_ttm.c:720
#11 radeon_ttm_tt_set_userptr (ttm=0xf80061eeb248, addr=1, 
flags=2147483647)
at 
/usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/radeon/radeon_ttm.c:804
#12 0x817adc9b in radeon_is_px (dev=0xf8017fe84e00)
at 
/usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/radeon/radeon_device.c:156

Looking at radeon_ttm.c, line 720 is the if-stmt in this function

static struct radeon_ttm_tt *radeon_ttm_tt_to_gtt(struct ttm_tt *ttm)
{
 if (!ttm || ttm->func != &radeon_backend_func)
  return NULL;
 return (struct radeon_ttm_tt *)ttm;
}

(kgdb) p ttm->func
$2 = (struct ttm_backend_func *) 0x231
(kgdb) p &radeon_backend_func
$4 = (struct ttm_backend_func *) 0x8186d870 

AFAIK, 0x231 is not a valid address.
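
For what it's worth, the check on line 720 only compares ttm->func with
the address of the driver's table; it never follows the pointer.  A
minimal userspace sketch of that pattern is below.  The names base_tt,
driver_tt, backend_ops and to_driver_tt are hypothetical stand-ins for
illustration, not the drm-current-kmod definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a driver's table of backend functions. */
struct backend_ops {
	int unused;
};

/* Hypothetical stand-in for the generic object handed to the driver. */
struct base_tt {
	const struct backend_ops *func;	/* identifies the owning driver */
};

/* Driver-private object that embeds the generic one as first member. */
struct driver_tt {
	struct base_tt base;
	int userflags;
};

static const struct backend_ops driver_ops = { 0 };

/*
 * Downcast only when the ops pointer matches this driver's table.
 * The ops pointer is compared, never dereferenced, so a garbage value
 * in tt->func makes this return NULL instead of faulting here.
 */
static struct driver_tt *
to_driver_tt(struct base_tt *tt)
{
	if (tt == NULL || tt->func != &driver_ops)
		return (NULL);
	return ((struct driver_tt *)tt);
}
```

With a func value that is not the driver's table, the sketch returns
NULL, so a caller has to check the result before using it.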

(kgdb) p *ttm
$5 = {bdev = 0x819021ef, func = 0x231, dummy_read_page = 0x0, 
  pages = 0xf800612c, page_flags = 2173789980, num_pages = 0, 
  sg = 0x0, glob = 0x2a, swap_storage = 0xf8017fe84e00, 
  caching_state = (unknown: 145613312), 
  state = (tt_unbound | tt_unpopulated | unknown: 4294965248)}

Moving to frame 12 suggests that the stack is corrupt (whether
from the dump or the crash, I don't know).

(kgdb) frame 12
#12 0x817adc9b in radeon_is_px (dev=0xf8017fe84e00)
at 
/usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/radeon/radeon_device.c:156
156 if (rdev->flags & RADEON_IS_PX)
(kgdb) p *dev
Cannot access memory at address 0xf8017fe84e00
(kgdb) p rdev
$25 = (struct radeon_device *) 0x0


-- 
Steve


Re: unkillable process consuming 100% cpu

2019-11-13 Thread Hans Petter Selasky

On 2019-11-13 15:52, Steve Kargl wrote:

 at /usr/src/sys/amd64/amd64/trap.c:743
#7  0x808b0468 in trap (frame=0xfe00b460e0c0)
 at /usr/src/sys/amd64/amd64/trap.c:407
#8  
#9  0x in ?? ()
#10 0x817d2c0f in radeon_ttm_tt_to_gtt (ttm=0xf80061eeb248)
 at 
/usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/radeon/radeon_ttm.c:720
#11 radeon_ttm_tt_set_userptr (ttm=0xf80061eeb248, addr=1,
 flags=2147483647)


Hi,

I don't see any function call here. Can you try to double check the 
backtrace?


Which version of FreeBSD is this?

--HPS


Re: unkillable process consuming 100% cpu

2019-11-13 Thread Steve Kargl
On Wed, Nov 13, 2019 at 09:10:06AM +0100, Hans Petter Selasky wrote:
> On 2019-11-13 01:30, Steve Kargl wrote:
> > 
> > I installed the 2nd seqlock.diff, rebuilt drm-current-kmod-4.16.g20191023,
> > rebooted, and have been pounding on the system with workloads that are
> > similar to what the system was doing during the lockups.  So far, I
> > cannot get the system to lock up.  Looks like your patch fixes (or at
> > least helps).  Thanks for taking a look at the problem.
> > 
> 
> Can you apply the kdb.diff on top and check dmesg for prints?
> 

I could not find the amdgpu_amdkfd_gpuvm.c file when I went looking.
Is it autogenerated?

I also spoke too soon. I got a panic after my reply above.

Fatal trap 12: page fault while in kernel mode
cpuid = 5; apic id = 15
fault virtual address   = 0x0
fault code  = supervisor read instruction, page not present
instruction pointer = 0x20:0x0
stack pointer   = 0x28:0xfe00b460e188
frame pointer   = 0x28:0xfe00b460e1c0
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 877 (X:rcs0)
trap number = 12
panic: page fault
cpuid = 5

db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe00b460dde0
vpanic() at vpanic+0x17e/frame 0xfe00b460de40
panic() at panic+0x43/frame 0xfe00b460dea0
trap_fatal() at trap_fatal+0x388/frame 0xfe00b460df10
trap_pfault() at trap_pfault+0x4f/frame 0xfe00b460df80
trap() at trap+0x288/frame 0xfe00b460e0b0
calltrap() at calltrap+0x8/frame 0xfe00b460e0b0
--- trap 0xc, rip = 0, rsp = 0xfe00b460e188, rbp = 0xfe00b460e1c0 ---
??() at 0/frame 0xfe00b460e1c0
radeon_cs_ioctl() at radeon_cs_ioctl+0xa0b/frame 0xfe00b460e640
drm_ioctl_kernel() at drm_ioctl_kernel+0xf1/frame 0xfe00b460e680
drm_ioctl() at drm_ioctl+0x279/frame 0xfe00b460e770
linux_file_ioctl() at linux_file_ioctl+0x298/frame 0xfe00b460e7d0
kern_ioctl() at kern_ioctl+0x284/frame 0xfe00b460e840
sys_ioctl() at sys_ioctl+0x157/frame 0xfe00b460e910
amd64_syscall() at amd64_syscall+0x273/frame 0xfe00b460ea30
fast_syscall_common() at fast_syscall_common+0x101/frame 0xfe00b460ea30
--- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x200cc6bfa, rsp = 
0x7fffbfffde98, rbp = 0x7fffbfffdec0 ---
Uptime: 5h9m5s
Dumping 1472 out of 16327 MB:..2%..11%..21%..31%..41%..52%..61%..71%..81%..91%

__curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
warning: Source file is more recent than executable.
55  __asm("movq %%gs:%P1,%0" : "=r" (td) : "n" (offsetof(struct 
pcpu,
(kgdb) #0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:392
#2  0x805de452 in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:479
#3  0x805de8a6 in vpanic (fmt=, ap=)
at /usr/src/sys/kern/kern_shutdown.c:908
#4  0x805de6c3 in panic (fmt=)
at /usr/src/sys/kern/kern_shutdown.c:835
#5  0x808b0d58 in trap_fatal (frame=0xfe00b460e0c0, eva=0)
at /usr/src/sys/amd64/amd64/trap.c:925
#6  0x808b0daf in trap_pfault (frame=0xfe00b460e0c0, 
usermode=, signo=, ucode=)
at /usr/src/sys/amd64/amd64/trap.c:743
#7  0x808b0468 in trap (frame=0xfe00b460e0c0)
at /usr/src/sys/amd64/amd64/trap.c:407
#8  
#9  0x in ?? ()
#10 0x817d2c0f in radeon_ttm_tt_to_gtt (ttm=0xf80061eeb248)
at 
/usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/radeon/radeon_ttm.c:720
#11 radeon_ttm_tt_set_userptr (ttm=0xf80061eeb248, addr=1, 
flags=2147483647)
at 
/usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/radeon/radeon_ttm.c:804
#12 0x817adc9b in radeon_is_px (dev=0xf8017fe84e00)
at 
/usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/radeon/radeon_device.c:156
#13 0x818a9e81 in drm_ioctl_kernel (linux_file=, 
func=0xfe00b460e428, kdata=0xfe00b31eb000, flags=1521620552)
at /usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/drm_ioctl.c:760
#14 0x818aa129 in drm_ioctl (filp=0xf80061198e00, 
cmd=, arg=65536)
at /usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm/drm_ioctl.c:856
#15 0x807c8098 in linux_file_ioctl_sub (fp=, 
filp=, fop=, cmd=, 
data=, td=)
at /usr/src/sys/compat/linuxkpi/common/src/linux_compat.c:965
#16 linux_file_ioctl (fp=, cmd=, 
data=, cred=, td=0xf800612c)
at /usr/src/sys/compat/linuxkpi/common/src/linux_compat.c:1558
#17 0x8063ed34 in fo_ioctl (fp=, com=3223348326, 
data=0x7fff, active_cred=0xfe001f7e6250, td=0xf800612c)
at /usr/src/sys/sys/file.h:340
#18 kern_ioctl (td=, fd=9, com=3223348326, 
data=0x7fff )
at /usr/src/sys/kern/sys_generic.c:801
#19 0x8063ea37 in sys_ioctl (td=0xfff

Re: unkillable process consuming 100% cpu

2019-11-13 Thread Hans Petter Selasky

On 2019-11-13 01:30, Steve Kargl wrote:

On Tue, Nov 12, 2019 at 06:48:22PM +0100, Hans Petter Selasky wrote:

On 2019-11-12 18:31, Steve Kargl wrote:

Can you open the radeonkms.ko in gdb83 from ports and type:

l *(radeon_gem_busy_ioctl+0x30)


% /boot/modules/radeonkms.ko
(gdb) l  *(radeon_gem_busy_ioctl+0x30)
0xa12b0 is in radeon_gem_busy_ioctl 
(/usr/ports/graphics/drm-current-kmod/work/kms-drm-2d2852e/drivers/gpu/drm/radeon/radeon_gem.c:453).
448 
/usr/ports/graphics/drm-current-kmod/work/kms-drm-2d2852e/drivers/gpu/drm/radeon/radeon_gem.c:
 No such file or directory.
(gdb)


As expected.



I installed the 2nd seqlock.diff, rebuilt drm-current-kmod-4.16.g20191023,
rebooted, and have been pounding on the system with workloads that are
similar to what the system was doing during the lockups.  So far, I
cannot get the system to lock up.  Looks like your patch fixes (or at
least helps).  Thanks for taking a look at the problem.



Can you apply the kdb.diff on top and check dmesg for prints?

--HPS


Re: unkillable process consuming 100% cpu

2019-11-12 Thread Steve Kargl
On Tue, Nov 12, 2019 at 06:48:22PM +0100, Hans Petter Selasky wrote:
> On 2019-11-12 18:31, Steve Kargl wrote:
> >> Can you open the radeonkms.ko in gdb83 from ports and type:
> >>
> >> l *(radeon_gem_busy_ioctl+0x30)
> >>
> > % /boot/modules/radeonkms.ko
> > (gdb) l  *(radeon_gem_busy_ioctl+0x30)
> > 0xa12b0 is in radeon_gem_busy_ioctl 
> > (/usr/ports/graphics/drm-current-kmod/work/kms-drm-2d2852e/drivers/gpu/drm/radeon/radeon_gem.c:453).
> > 448 
> > /usr/ports/graphics/drm-current-kmod/work/kms-drm-2d2852e/drivers/gpu/drm/radeon/radeon_gem.c:
> >  No such file or directory.
> > (gdb)
> 
> As expected.
> 

I installed the 2nd seqlock.diff, rebuilt drm-current-kmod-4.16.g20191023,
rebooted, and have been pounding on the system with workloads that are
similar to what the system was doing during the lockups.  So far, I
cannot get the system to lock up.  Looks like your patch fixes (or at
least helps).  Thanks for taking a look at the problem.

-- 
Steve


Re: unkillable process consuming 100% cpu

2019-11-12 Thread Hans Petter Selasky

On 2019-11-12 18:31, Steve Kargl wrote:

Can you open the radeonkms.ko in gdb83 from ports and type:

l *(radeon_gem_busy_ioctl+0x30)


% /boot/modules/radeonkms.ko
(gdb) l  *(radeon_gem_busy_ioctl+0x30)
0xa12b0 is in radeon_gem_busy_ioctl 
(/usr/ports/graphics/drm-current-kmod/work/kms-drm-2d2852e/drivers/gpu/drm/radeon/radeon_gem.c:453).
448 
/usr/ports/graphics/drm-current-kmod/work/kms-drm-2d2852e/drivers/gpu/drm/radeon/radeon_gem.c:
 No such file or directory.
(gdb)


As expected.

--HPS


Re: unkillable process consuming 100% cpu

2019-11-12 Thread Steve Kargl
On Mon, Nov 11, 2019 at 10:34:23AM +0100, Hans Petter Selasky wrote:
> Hi,
> 
> Can you open the radeonkms.ko in gdb83 from ports and type:
> 
> l *(radeon_gem_busy_ioctl+0x30)
> 

% /boot/modules/radeonkms.ko
(gdb) l  *(radeon_gem_busy_ioctl+0x30)
0xa12b0 is in radeon_gem_busy_ioctl 
(/usr/ports/graphics/drm-current-kmod/work/kms-drm-2d2852e/drivers/gpu/drm/radeon/radeon_gem.c:453).
448 
/usr/ports/graphics/drm-current-kmod/work/kms-drm-2d2852e/drivers/gpu/drm/radeon/radeon_gem.c:
 No such file or directory.
(gdb) 

-- 
Steve


Re: unkillable process consuming 100% cpu

2019-11-11 Thread Steve Kargl
On Mon, Nov 11, 2019 at 02:22:55PM +0100, Hans Petter Selasky wrote:
> On 2019-11-08 23:09, Steve Kargl wrote:
> > Here's 'procstat -kk' for the stuck process with the long line wrapped.
> 
> Can you run this command a couple of times and see if the backtrace changes?
> 
> --HPS

I was AFK for a few days.  I'll try all your suggestions
tomorrow.  The two lock ups occurred while using chrome
to watch/listen to youtube and using libreoffice to prepare
a presentation.  I'll see if I can reproduce the issue.

-- 
Steve


Re: unkillable process consuming 100% cpu

2019-11-11 Thread Hans Petter Selasky

On 2019-11-08 23:09, Steve Kargl wrote:

Here's 'procstat -kk' for the stuck process with the long line wrapped.


Can you run this command a couple of times and see if the backtrace changes?

--HPS


Re: unkillable process consuming 100% cpu

2019-11-11 Thread Konstantin Belousov
On Mon, Nov 11, 2019 at 01:22:09PM +0100, Hans Petter Selasky wrote:
> On 2019-11-11 11:44, Hans Petter Selasky wrote:
> > Seems like we can optimise away one more write memory barrier.
> > 
> > If you are building from ports, simply:
> > 
> > cd work/kms-drm*
> > cat seqlock.diff | patch -p1
> > 
> 
> Hi,
> 
> Here is one more debug patch you can try. See if you get that print 
> added in the patch in dmesg.
> 
> --HPS
> 

> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c 
> b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> index a6e0a16ae..0697d70f4 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
> @@ -31,6 +31,8 @@
>  #include "amdgpu_vm.h"
>  #include "amdgpu_amdkfd.h"
>  
> +#include 
> +
>  /* Special VM and GART address alignment needed for VI pre-Fiji due to
>   * a HW bug.
>   */
> @@ -236,6 +238,12 @@ static int amdgpu_amdkfd_remove_eviction_fence(struct 
> amdgpu_bo *bo,
>   *ef_count = 0;
>   }
>  
> + if (resv != NULL &&
> + (struct thread *)SX_OWNER(resv->lock.base.sx.sx_lock) != curthread) 
> {
This really should be spelled as sx_xlocked().

> + printf("Called unlocked\n");
> + kdb_backtrace();
> + }
> +
>   old = reservation_object_get_list(resv);
>   if (!old)
>   return 0;
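
To spell out the sx_xlocked() remark above: the caller asks the lock
through a predicate instead of decoding the owner word itself.  A
minimal userspace model of that shape follows.  struct model_sx,
MODEL_FLAGMASK and both helpers are hypothetical names, and the flag
layout is an assumption for illustration only; this is not the sx(9)
implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: the lock word holds the owner's identity in its
 * upper bits and state flags in the low bits.
 */
#define	MODEL_FLAGMASK	((uintptr_t)0x7)

struct model_sx {
	uintptr_t word;		/* 0 means unlocked in this model */
};

/* Ask the lock whether 'self' holds it, hiding the word layout. */
static int
model_xlocked(const struct model_sx *sx, uintptr_t self)
{
	return (self != 0 && (sx->word & ~MODEL_FLAGMASK) == self);
}

/* Debug check in the style of the patch: true if called unlocked. */
static int
model_called_unlocked(const struct model_sx *sx, uintptr_t self)
{
	return (sx != NULL && !model_xlocked(sx, self));
}
```

The benefit is that only the predicate needs to know which bits are
flags, so the debug check keeps working if the layout changes.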




Re: unkillable process consuming 100% cpu

2019-11-11 Thread Hans Petter Selasky

On 2019-11-11 11:44, Hans Petter Selasky wrote:

Seems like we can optimise away one more write memory barrier.

If you are building from ports, simply:

cd work/kms-drm*
cat seqlock.diff | patch -p1



Hi,

Here is one more debug patch you can try. See if you get that print 
added in the patch in dmesg.


--HPS

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index a6e0a16ae..0697d70f4 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -31,6 +31,8 @@
 #include "amdgpu_vm.h"
 #include "amdgpu_amdkfd.h"
 
+#include 
+
 /* Special VM and GART address alignment needed for VI pre-Fiji due to
  * a HW bug.
  */
@@ -236,6 +238,12 @@ static int amdgpu_amdkfd_remove_eviction_fence(struct amdgpu_bo *bo,
 		*ef_count = 0;
 	}
 
+	if (resv != NULL &&
+	(struct thread *)SX_OWNER(resv->lock.base.sx.sx_lock) != curthread) {
+		printf("Called unlocked\n");
+		kdb_backtrace();
+	}
+
 	old = reservation_object_get_list(resv);
 	if (!old)
 		return 0;


Re: unkillable process consuming 100% cpu

2019-11-11 Thread Hans Petter Selasky

Seems like we can optimise away one more write memory barrier.

If you are building from ports, simply:

cd work/kms-drm*
cat seqlock.diff | patch -p1

--HPS
diff --git a/linuxkpi/gplv2/include/linux/reservation.h b/linuxkpi/gplv2/include/linux/reservation.h
index b975f792c..0ce922a0e 100644
--- a/linuxkpi/gplv2/include/linux/reservation.h
+++ b/linuxkpi/gplv2/include/linux/reservation.h
@@ -94,7 +94,7 @@ reservation_object_init(struct reservation_object *obj)
 {
 	ww_mutex_init(&obj->lock, &reservation_ww_class);
 
-	__seqcount_init(&obj->seq, reservation_seqcount_string, &reservation_seqcount_class);
+	seqcount_init(&obj->seq);
 	RCU_INIT_POINTER(obj->fence, NULL);
 	RCU_INIT_POINTER(obj->fence_excl, NULL);
 	obj->staged = NULL;
diff --git a/linuxkpi/gplv2/include/linux/seqlock.h b/linuxkpi/gplv2/include/linux/seqlock.h
index e86351810..115ad5e68 100644
--- a/linuxkpi/gplv2/include/linux/seqlock.h
+++ b/linuxkpi/gplv2/include/linux/seqlock.h
@@ -1,410 +1,148 @@
 #ifndef __LINUX_SEQLOCK_H
-#define __LINUX_SEQLOCK_H
-/*
- * Reader/writer consistent mechanism without starving writers. This type of
- * lock for data where the reader wants a consistent set of information
- * and is willing to retry if the information changes. There are two types
- * of readers:
- * 1. Sequence readers which never block a writer but they may have to retry
- *if a writer is in progress by detecting change in sequence number.
- *Writers do not wait for a sequence reader.
- * 2. Locking readers which will wait if a writer or another locking reader
- *is in progress. A locking reader in progress will also block a writer
- *from going forward. Unlike the regular rwlock, the read lock here is
- *exclusive so that only one locking reader can get it.
- *
- * This is not as cache friendly as brlock. Also, this may not work well
- * for data that contains pointers, because any writer could
- * invalidate a pointer that a reader was following.
- *
- * Expected non-blocking reader usage:
- * 	do {
- *	seq = read_seqbegin(&foo);
- * 	...
- *  } while (read_seqretry(&foo, seq));
- *
- *
- * On non-SMP the spin locks disappear but the writer still needs
- * to increment the sequence variables because an interrupt routine could
- * change the state of the data.
- *
- * Based on x86_64 vsyscall gettimeofday 
- * by Keith Owens and Andrea Arcangeli
- */
+#define	__LINUX_SEQLOCK_H
 
 #include 
 #include 
-#include 
 #include 
 #include 
+#include 
 #include 
 
-
-/*
- * Version using sequence counter only.
- * This can be used when code has its own mutex protecting the
- * updating starting before the write_seqcountbeqin() and ending
- * after the write_seqcount_end().
- */
 typedef struct seqcount {
-	unsigned sequence;
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-	struct lockdep_map dep_map;
-#endif
+	volatile unsigned sequence;
 } seqcount_t;
 
-
-#define lockdep_init_map(a, b, c, d)
-
-static inline void __seqcount_init(seqcount_t *s, const char *name,
-	  struct lock_class_key *key)
+static inline void
+seqcount_init(seqcount_t *s)
 {
-	/*
-	 * Make sure we are not reinitializing a held lock:
-	 */
-	lockdep_init_map(&s->dep_map, name, key, 0);
 	s->sequence = 0;
 }
 
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-# define SEQCOUNT_DEP_MAP_INIT(lockname) \
-		.dep_map = { .name = #lockname } \
-
-# define seqcount_init(s)\
-	do {		\
-		static struct lock_class_key __key;	\
-		__seqcount_init((s), #s, &__key);	\
-	} while (0)
+#define	__seqcount_init(a,b,c) \
+	seqcount_init(a)
 
-static inline void seqcount_lockdep_reader_access(seqcount_t *s)
-{
-	seqcount_t *l = (seqcount_t *)s;
-	unsigned long flags;
-
-	local_irq_save(flags);
-	seqcount_acquire_read(&l->dep_map, 0, 0, _RET_IP_);
-	seqcount_release(&l->dep_map, 1, _RET_IP_);
-	local_irq_restore(flags);
+#define	SEQCNT_ZERO(lockname) {			\
+	.sequence = 0\
 }
 
-#else
-# define SEQCOUNT_DEP_MAP_INIT(lockname)
-# define seqcount_init(s) __seqcount_init(s, NULL, NULL)
-# define seqcount_lockdep_reader_access(x)
-#endif
-
-#define SEQCNT_ZERO(lockname) { .sequence = 0, SEQCOUNT_DEP_MAP_INIT(lockname)}
-
-
-/**
- * __read_seqcount_begin - begin a seq-read critical section (without barrier)
- * @s: pointer to seqcount_t
- * Returns: count to be passed to read_seqcount_retry
- *
- * __read_seqcount_begin is like read_seqcount_begin, but has no smp_rmb()
- * barrier. Callers should ensure that smp_rmb() or equivalent ordering is
- * provided before actually loading any of the variables that are to be
- * protected in this critical section.
- *
- * Use carefully, only in critical code, and comment how the barrier is
- * provided.
- */
-static inline unsigned __read_seqcount_begin(seqcount_t *s)
+static inline unsigned
+__read_seqcount_begin(seqcount_t *s)
 {
 	unsigned ret;
 
 repeat:
-	ret = READ_ONCE(s->sequence);
+	ret = s->sequence;
 	if (unlikely(ret & 1)) {
 		cpu_relax();
 		goto repeat;
 	}
-	return ret;
+	return (ret);
 }
 
-/**
- * raw_read_seqcount - Read the raw seq

Re: unkillable process consuming 100% cpu

2019-11-11 Thread Hans Petter Selasky

On 2019-11-11 10:34, Hans Petter Selasky wrote:

Hi,

Can you open the radeonkms.ko in gdb83 from ports and type:

l *(radeon_gem_busy_ioctl+0x30)



Hi,

I suspect there is a memory race in the seqlock framework. Can you try 
the attached patch and re-build?
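
For reference, the reader/writer protocol a sequence counter is meant to
provide can be modelled in userspace with C11 atomics.  This is only a
sketch: the model_* names are hypothetical, the memory orderings are my
choice for the model rather than what the patch uses, and writers are
assumed to be serialized by some other lock.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userspace model; not the linuxkpi seqlock.h code. */
struct model_seqcount {
	atomic_uint sequence;	/* odd while a write is in progress */
};

struct model_pair {
	struct model_seqcount seq;
	atomic_int a;
	atomic_int b;
};

static unsigned
model_read_begin(struct model_seqcount *s)
{
	unsigned ret;

	/* Spin until the count is even, i.e. no writer is in progress. */
	while ((ret = atomic_load_explicit(&s->sequence,
	    memory_order_acquire)) & 1)
		;
	return (ret);
}

static int
model_read_retry(struct model_seqcount *s, unsigned start)
{
	/* Order the data loads before re-reading the counter. */
	atomic_thread_fence(memory_order_acquire);
	return (atomic_load_explicit(&s->sequence,
	    memory_order_relaxed) != start);
}

static void
model_write_begin(struct model_seqcount *s)
{
	atomic_fetch_add_explicit(&s->sequence, 1, memory_order_relaxed);
	/* Order the counter update before the data stores. */
	atomic_thread_fence(memory_order_release);
}

static void
model_write_end(struct model_seqcount *s)
{
	/* Publish the data stores before the count becomes even again. */
	atomic_fetch_add_explicit(&s->sequence, 1, memory_order_release);
}

/* Writer: bracket the update so readers can detect it. */
static void
model_pair_write(struct model_pair *p, int a, int b)
{
	model_write_begin(&p->seq);
	atomic_store_explicit(&p->a, a, memory_order_relaxed);
	atomic_store_explicit(&p->b, b, memory_order_relaxed);
	model_write_end(&p->seq);
}

/* Reader: retry until both values come from the same generation. */
static void
model_pair_read(struct model_pair *p, int *a, int *b)
{
	unsigned seq;

	do {
		seq = model_read_begin(&p->seq);
		*a = atomic_load_explicit(&p->a, memory_order_relaxed);
		*b = atomic_load_explicit(&p->b, memory_order_relaxed);
	} while (model_read_retry(&p->seq, seq));
}
```

If either ordering step is missing, a reader can see a stable counter
around a half-updated pair, which is the kind of race suspected here.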


Is this issue easily reproducible?

--HPS
diff --git a/linuxkpi/gplv2/include/linux/reservation.h b/linuxkpi/gplv2/include/linux/reservation.h
index b975f792c..0ce922a0e 100644
--- a/linuxkpi/gplv2/include/linux/reservation.h
+++ b/linuxkpi/gplv2/include/linux/reservation.h
@@ -94,7 +94,7 @@ reservation_object_init(struct reservation_object *obj)
 {
 	ww_mutex_init(&obj->lock, &reservation_ww_class);
 
-	__seqcount_init(&obj->seq, reservation_seqcount_string, &reservation_seqcount_class);
+	seqcount_init(&obj->seq);
 	RCU_INIT_POINTER(obj->fence, NULL);
 	RCU_INIT_POINTER(obj->fence_excl, NULL);
 	obj->staged = NULL;
diff --git a/linuxkpi/gplv2/include/linux/seqlock.h b/linuxkpi/gplv2/include/linux/seqlock.h
index e86351810..940bd8e90 100644
--- a/linuxkpi/gplv2/include/linux/seqlock.h
+++ b/linuxkpi/gplv2/include/linux/seqlock.h
@@ -1,410 +1,149 @@
 #ifndef __LINUX_SEQLOCK_H
-#define __LINUX_SEQLOCK_H
-/*
- * Reader/writer consistent mechanism without starving writers. This type of
- * lock for data where the reader wants a consistent set of information
- * and is willing to retry if the information changes. There are two types
- * of readers:
- * 1. Sequence readers which never block a writer but they may have to retry
- *if a writer is in progress by detecting change in sequence number.
- *Writers do not wait for a sequence reader.
- * 2. Locking readers which will wait if a writer or another locking reader
- *is in progress. A locking reader in progress will also block a writer
- *from going forward. Unlike the regular rwlock, the read lock here is
- *exclusive so that only one locking reader can get it.
- *
- * This is not as cache friendly as brlock. Also, this may not work well
- * for data that contains pointers, because any writer could
- * invalidate a pointer that a reader was following.
- *
- * Expected non-blocking reader usage:
- * 	do {
- *	seq = read_seqbegin(&foo);
- * 	...
- *  } while (read_seqretry(&foo, seq));
- *
- *
- * On non-SMP the spin locks disappear but the writer still needs
- * to increment the sequence variables because an interrupt routine could
- * change the state of the data.
- *
- * Based on x86_64 vsyscall gettimeofday 
- * by Keith Owens and Andrea Arcangeli
- */
+#define	__LINUX_SEQLOCK_H
 
 #include 
 #include 
-#include 
 #include 
 #include 
+#include 
 #include 
 
-
-/*
- * Version using sequence counter only.
- * This can be used when code has its own mutex protecting the
- * updating starting before the write_seqcountbeqin() and ending
- * after the write_seqcount_end().
- */
 typedef struct seqcount {
-	unsigned sequence;
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-	struct lockdep_map dep_map;
-#endif
+	volatile unsigned sequence;
 } seqcount_t;
 
-
-#define lockdep_init_map(a, b, c, d)
-
-static inline void __seqcount_init(seqcount_t *s, const char *name,
-	  struct lock_class_key *key)
+static inline void
+seqcount_init(seqcount_t *s)
 {
-	/*
-	 * Make sure we are not reinitializing a held lock:
-	 */
-	lockdep_init_map(&s->dep_map, name, key, 0);
 	s->sequence = 0;
 }
 
-#ifdef CONFIG_DEBUG_LOCK_ALLOC
-# define SEQCOUNT_DEP_MAP_INIT(lockname) \
-		.dep_map = { .name = #lockname } \
-
-# define seqcount_init(s)\
-	do {		\
-		static struct lock_class_key __key;	\
-		__seqcount_init((s), #s, &__key);	\
-	} while (0)
+#define	__seqcount_init(a,b,c) \
+	seqcount_init(a)
 
-static inline void seqcount_lockdep_reader_access(seqcount_t *s)
-{
-	seqcount_t *l = (seqcount_t *)s;
-	unsigned long flags;
-
-	local_irq_save(flags);
-	seqcount_acquire_read(&l->dep_map, 0, 0, _RET_IP_);
-	seqcount_release(&l->dep_map, 1, _RET_IP_);
-	local_irq_restore(flags);
+#define	SEQCNT_ZERO(lockname) {			\
+	.sequence = 0\
 }
 
-#else
-# define SEQCOUNT_DEP_MAP_INIT(lockname)
-# define seqcount_init(s) __seqcount_init(s, NULL, NULL)
-# define seqcount_lockdep_reader_access(x)
-#endif
-
-#define SEQCNT_ZERO(lockname) { .sequence = 0, SEQCOUNT_DEP_MAP_INIT(lockname)}
-
-
-/**
- * __read_seqcount_begin - begin a seq-read critical section (without barrier)
- * @s: pointer to seqcount_t
- * Returns: count to be passed to read_seqcount_retry
- *
- * __read_seqcount_begin is like read_seqcount_begin, but has no smp_rmb()
- * barrier. Callers should ensure that smp_rmb() or equivalent ordering is
- * provided before actually loading any of the variables that are to be
- * protected in this critical section.
- *
- * Use carefully, only in critical code, and comment how the barrier is
- * provided.
- */
-static inline unsigned __read_seqcount_begin(seqcount_t *s)
+static inline unsigned
+__read_seqcount_begin(seqcount_t *s)
 {
 	unsigned ret;
 
 repeat:
-	ret = READ_ONCE(s->sequence);
+	ret = s->sequenc

Re: unkillable process consuming 100% cpu

2019-11-11 Thread Hans Petter Selasky

Hi,

Can you open the radeonkms.ko in gdb83 from ports and type:

l *(radeon_gem_busy_ioctl+0x30)

--HPS


Re: unkillable process consuming 100% cpu

2019-11-08 Thread Steve Kargl
On Thu, Nov 07, 2019 at 03:32:23PM -0500, Mark Johnston wrote:
> On Thu, Nov 07, 2019 at 12:29:19PM -0800, Steve Kargl wrote:
> > I haven't seen anyone post about an unkillable process
> > (even by root), which consumes 100% cpu.
> > 
> > last pid:  4592;  load averages:  1.24,  1.08,  0.74   up 13+20:21:20  12:26:29
> > 68 processes:  2 running, 66 sleeping
> > CPU:  0.1% user,  0.0% nice, 12.6% system,  0.0% interrupt, 87.2% idle
> > Mem: 428M Active, 11G Inact, 138M Laundry, 2497M Wired, 1525M Buf, 2377M Free
> > Swap: 16G Total, 24M Used, 16G Free
> > 
> >   PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
> > 69092 kargl         2  45    0   342M   148M CPU2     2  12:51 100.07% chrome
> > 
> > 
> > Neither of these has an effect.
> > 
> > kill -1 69092
> > kill -9 69069
> > 
> > Attempts to attach gdb831 to -p 69092 leads to hung xterm.
> 
> Could you please show us the output of "procstat -kk 69092"?

Just had another lock-up.  A forced 'shutdown -r now' from a
remote terminal led to a console message about an unkillable
process.

Here's 'procstat -kk' for the stuck process with the long line wrapped.

  PID    TID COMM   TDNAME  KSTACK
  877 100161 Xorg   -   radeon_gem_busy_ioctl+0x30
drm_ioctl_kernel+0xf1
drm_ioctl+0x279
linux_file_ioctl+0x298
kern_ioctl+0x284
sys_ioctl+0x157
amd64_syscall+0x273
fast_syscall_common+0x101 
  877 100344 Xorg   X:rcs0  mi_switch+0xcb
sleepq_catch_signals+0x35d
sleepq_wait_sig+0xc
_sleep+0x1bd
umtxq_sleep+0x132
do_wait+0x3d6
__umtx_op_wait_uint_private+0x7e
amd64_syscall+0x273
fast_syscall_common+0x101 


It looks like radeonkms+drm is getting stuck.

-- 
Steve


Re: unkillable process consuming 100% cpu

2019-11-07 Thread Steve Kargl
On Thu, Nov 07, 2019 at 03:32:23PM -0500, Mark Johnston wrote:
> On Thu, Nov 07, 2019 at 12:29:19PM -0800, Steve Kargl wrote:
> > I haven't seen anyone post about an unkillable process
> > (even by root), which consumes 100% cpu.
> > 
> > last pid:  4592;  load averages:  1.24,  1.08,  0.74   up 13+20:21:20  12:26:29
> > 68 processes:  2 running, 66 sleeping
> > CPU:  0.1% user,  0.0% nice, 12.6% system,  0.0% interrupt, 87.2% idle
> > Mem: 428M Active, 11G Inact, 138M Laundry, 2497M Wired, 1525M Buf, 2377M Free
> > Swap: 16G Total, 24M Used, 16G Free
> > 
> >   PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
> > 69092 kargl         2  45    0   342M   148M CPU2     2  12:51 100.07% chrome
> > 
> > 
> > Neither of these has an effect.
> > 
> > kill -1 69092
> > kill -9 69069
> > 
> > Attempts to attach gdb831 to -p 69092 leads to hung xterm.
> 
> Could you please show us the output of "procstat -kk 69092"?

Unfortunately, no.  I just rebooted the system to kill 69092.
During 'shutdown -r now', a message appeared on the console
warning that some processes would not die.  Then 'shutdown
-r now' hung the console. :(

Before rebooting I did try a number of ps and procstat commands; 69092 was

chrome: --type=gpu-process --field-trial-handle=long-string-of-number
--gpu-preferences=long-string-with-IAs

So, it seems that drm-current-kmod may not be happy.

For the record, uname gives FreeBSD 13.0-CURRENT r353571



-- 
Steve


Re: unkillable process consuming 100% cpu

2019-11-07 Thread Mark Johnston
On Thu, Nov 07, 2019 at 12:29:19PM -0800, Steve Kargl wrote:
> I haven't seen anyone post about an unkillable process
> (even by root), which consumes 100% cpu.
> 
> last pid:  4592;  load averages:  1.24,  1.08,  0.74   up 13+20:21:20  
> 12:26:29
> 68 processes:  2 running, 66 sleeping
> CPU:  0.1% user,  0.0% nice, 12.6% system,  0.0% interrupt, 87.2% idle
> Mem: 428M Active, 11G Inact, 138M Laundry, 2497M Wired, 1525M Buf, 2377M Free
> Swap: 16G Total, 24M Used, 16G Free
> 
>   PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
> 69092 kargl         2  45    0   342M   148M CPU2     2  12:51 100.07% chrome
> 
> 
> Neither of these have an effect.
> 
> kill -1 69092
> kill -9 69069
> 
> Attempts to attach gdb831 to -p 69092 leads to hung xterm.

Could you please show us the output of "procstat -kk 69092"?


unkillable process consuming 100% cpu

2019-11-07 Thread Steve Kargl
I haven't seen anyone post about an unkillable process
(even by root), which consumes 100% cpu.

last pid:  4592;  load averages:  1.24,  1.08,  0.74   up 13+20:21:20  12:26:29
68 processes:  2 running, 66 sleeping
CPU:  0.1% user,  0.0% nice, 12.6% system,  0.0% interrupt, 87.2% idle
Mem: 428M Active, 11G Inact, 138M Laundry, 2497M Wired, 1525M Buf, 2377M Free
Swap: 16G Total, 24M Used, 16G Free

  PID USERNAME    THR PRI NICE   SIZE    RES STATE    C   TIME    WCPU COMMAND
69092 kargl         2  45    0   342M   148M CPU2     2  12:51 100.07% chrome


Neither of these have an effect.

kill -1 69092
kill -9 69069

Attempts to attach gdb831 with -p 69092 lead to a hung xterm.

-- 
Steve


Re: Running linux ldconfig on tmpfs results in unkillable process

2011-01-19 Thread Beat Gätzi
On 19.01.2011 13:24, Kostik Belousov wrote:
> On Tue, Jan 18, 2011 at 05:40:14PM +0100, Beat Gätzi wrote:
>> On 18.01.2011 17:13, Kostik Belousov wrote:
>>> On Tue, Jan 18, 2011 at 04:34:10PM +0100, Beat Gätzi wrote:
>>>> On 18.01.2011 15:46, Kostik Belousov wrote:
>>>>> On Tue, Jan 18, 2011 at 03:16:27PM +0100, Beat Gätzi wrote:
>>>>>> Hi,
>>>>>>
>>>>>> I've a tinderbox which uses tmpfs to build ports. Every time I build a
>>>>>> port which executes linux ldconfig it results in an unkillable process
>>>>>> which uses 100% CPU. The problem is reproduceable without tinderbox:
>>>>>>
>>>>>> # uname -a
>>>>>> FreeBSD daedalus.network.local 9.0-CURRENT FreeBSD 9.0-CURRENT #3
>>>>>> r216761: Tue Dec 28 15:32:26 CET 2010
>>>>>> root@daedalus.network.local:/usr/obj/usr/src/sys/GENERIC  i386
>>>>>> # mkdir /compat/test
>>>>>> # mount -t tmpfs tmpfs /compat/test
>>>>>> # cp -Rp /compat/linux/* /compat/test/
>>>>>> # mount -t linprocfs linprocfs /compat/test/proc
>>>>>> # /compat/linux/sbin/ldconfig -r /compat/test/
>>>>>> # pgrep ldconfig
>>>>>> 3449
>>>>>> # procstat -i 3449 | grep KILL
>>>>>>  3449 ldconfig KILL ---
>>>>>> # kill -9 3449
>>>>>> # procstat -i 3449 | grep KILL
>>>>>>  3449 ldconfig KILL P--
>>>>>>
>>>>>> From top(1):
>>>>>>  PID USERNAME THR PRI NICE  SIZE   RES STATE  C   TIME    WCPU COMMAND
>>>>>> 3449 root       1  44    0  992K  712K CPU1   1  10:06 100.00% ldconfig
>>>>>>
>>>>>> When I reboot the machine it hangs after "All buffers synced.".
>>>>>>
>>>>>> I've uploaded some additional output of procstat and ktrace here:
>>>>>> http://people.freebsd.org/~beat/logs/linux-ldconfig-tmpfs.txt
>>>>>>
>>>>>> Anyone knows how to fix this?
>>>>> kdump for the trace of the linux binary is a garbage. You need to
>>>>> use linux_kdump (from ports).
>>>>>
>>>>> I think that your process is looping in the kernel, you can confirm this
>>>>> by dropping in the ddb and doing "bt ".
>>>>
>>>> I've uploaded a screenshot from the output of bt  in ddb:
>>>> http://people.freebsd.org/~beat/logs/linux-ldconfig-tmpfs-bt.jpg
>>>
>>> Please try this.
>>>
>>> diff --git a/sys/compat/linux/linux_file.c b/sys/compat/linux/linux_file.c
>>> index 9ff1cf0..44ad193 100644
>>> --- a/sys/compat/linux/linux_file.c
>>> +++ b/sys/compat/linux/linux_file.c
>>> @@ -369,7 +369,6 @@ getdents_common(struct thread *td, struct 
>>> linux_getdents64_args *args,
>>> lbuf = malloc(LINUX_MAXRECLEN, M_TEMP, M_WAITOK | M_ZERO);
>>> vn_lock(vp, LK_SHARED | LK_RETRY);
>>>  
>>> -again:
>>> aiov.iov_base = buf;
>>> aiov.iov_len = buflen;
>>> auio.uio_iov = &aiov;
>>> @@ -506,8 +505,10 @@ again:
>>> break;
>>> }
>>>  
>>> -   if (outp == (caddr_t)args->dirent)
>>> -   goto again;
>>> +   if (outp == (caddr_t)args->dirent) {
>>> +   nbytes = resid;
>>> +   goto eof;
>>> +   }
>>>  
>>> fp->f_offset = off;
>>> if (justone)
>>> diff --git a/sys/fs/tmpfs/tmpfs_subr.c b/sys/fs/tmpfs/tmpfs_subr.c
>>> index 84a2038..62dd0bf 100644
>>> --- a/sys/fs/tmpfs/tmpfs_subr.c
>>> +++ b/sys/fs/tmpfs/tmpfs_subr.c
>>> @@ -827,9 +827,10 @@ tmpfs_dir_getdents(struct tmpfs_node *node, struct uio 
>>> *uio, off_t *cntp)
>>> /* Copy the new dirent structure into the output buffer and
>>>  * advance pointers. */
>>> error = uiomove(&d, d.d_reclen, uio);
>>> -
>>> -   (*cntp)++;
>>> -   de = TAILQ_NEXT(de, td_entries);
>>> +   if (error == 0) {
>>> +   (*cntp)++;
>>> +   de = TAILQ_NEXT(de, td_entries);
>>> +   }
>>> } while (error == 0 && uio->uio_resid > 0 && de != NULL);
>>>  
>>> /* Update the offset and cache. */
>>

Re: Running linux ldconfig on tmpfs results in unkillable process

2011-01-19 Thread Kostik Belousov
On Tue, Jan 18, 2011 at 05:40:14PM +0100, Beat Gätzi wrote:
> On 18.01.2011 17:13, Kostik Belousov wrote:
> > On Tue, Jan 18, 2011 at 04:34:10PM +0100, Beat Gätzi wrote:
> >> On 18.01.2011 15:46, Kostik Belousov wrote:
> >>> On Tue, Jan 18, 2011 at 03:16:27PM +0100, Beat Gätzi wrote:
> >>>> Hi,
> >>>>
> >>>> I've a tinderbox which uses tmpfs to build ports. Every time I build a
> >>>> port which executes linux ldconfig it results in an unkillable process
> >>>> which uses 100% CPU. The problem is reproduceable without tinderbox:
> >>>>
> >>>> # uname -a
> >>>> FreeBSD daedalus.network.local 9.0-CURRENT FreeBSD 9.0-CURRENT #3
> >>>> r216761: Tue Dec 28 15:32:26 CET 2010
> >>>> root@daedalus.network.local:/usr/obj/usr/src/sys/GENERIC  i386
> >>>> # mkdir /compat/test
> >>>> # mount -t tmpfs tmpfs /compat/test
> >>>> # cp -Rp /compat/linux/* /compat/test/
> >>>> # mount -t linprocfs linprocfs /compat/test/proc
> >>>> # /compat/linux/sbin/ldconfig -r /compat/test/
> >>>> # pgrep ldconfig
> >>>> 3449
> >>>> # procstat -i 3449 | grep KILL
> >>>>  3449 ldconfig KILL ---
> >>>> # kill -9 3449
> >>>> # procstat -i 3449 | grep KILL
> >>>>  3449 ldconfig KILL P--
> >>>>
> >>>> From top(1):
> >>>>  PID USERNAME THR PRI NICE  SIZE   RES STATE  C   TIME    WCPU COMMAND
> >>>> 3449 root       1  44    0  992K  712K CPU1   1  10:06 100.00% ldconfig
> >>>>
> >>>> When I reboot the machine it hangs after "All buffers synced.".
> >>>>
> >>>> I've uploaded some additional output of procstat and ktrace here:
> >>>> http://people.freebsd.org/~beat/logs/linux-ldconfig-tmpfs.txt
> >>>>
> >>>> Anyone knows how to fix this?
> >>> kdump for the trace of the linux binary is a garbage. You need to
> >>> use linux_kdump (from ports).
> >>>
> >>> I think that your process is looping in the kernel, you can confirm this
> >>> by dropping in the ddb and doing "bt ".
> >>
> >> I've uploaded a screenshot from the output of bt  in ddb:
> >> http://people.freebsd.org/~beat/logs/linux-ldconfig-tmpfs-bt.jpg
> > 
> > Please try this.
> > 
> > diff --git a/sys/compat/linux/linux_file.c b/sys/compat/linux/linux_file.c
> > index 9ff1cf0..44ad193 100644
> > --- a/sys/compat/linux/linux_file.c
> > +++ b/sys/compat/linux/linux_file.c
> > @@ -369,7 +369,6 @@ getdents_common(struct thread *td, struct 
> > linux_getdents64_args *args,
> > lbuf = malloc(LINUX_MAXRECLEN, M_TEMP, M_WAITOK | M_ZERO);
> > vn_lock(vp, LK_SHARED | LK_RETRY);
> >  
> > -again:
> > aiov.iov_base = buf;
> > aiov.iov_len = buflen;
> > auio.uio_iov = &aiov;
> > @@ -506,8 +505,10 @@ again:
> > break;
> > }
> >  
> > -   if (outp == (caddr_t)args->dirent)
> > -   goto again;
> > +   if (outp == (caddr_t)args->dirent) {
> > +   nbytes = resid;
> > +   goto eof;
> > +   }
> >  
> > fp->f_offset = off;
> > if (justone)
> > diff --git a/sys/fs/tmpfs/tmpfs_subr.c b/sys/fs/tmpfs/tmpfs_subr.c
> > index 84a2038..62dd0bf 100644
> > --- a/sys/fs/tmpfs/tmpfs_subr.c
> > +++ b/sys/fs/tmpfs/tmpfs_subr.c
> > @@ -827,9 +827,10 @@ tmpfs_dir_getdents(struct tmpfs_node *node, struct uio 
> > *uio, off_t *cntp)
> > /* Copy the new dirent structure into the output buffer and
> >  * advance pointers. */
> > error = uiomove(&d, d.d_reclen, uio);
> > -
> > -   (*cntp)++;
> > -   de = TAILQ_NEXT(de, td_entries);
> > +   if (error == 0) {
> > +   (*cntp)++;
> > +   de = TAILQ_NEXT(de, td_entries);
> > +   }
> > } while (error == 0 && uio->uio_resid > 0 && de != NULL);
> >  
> > /* Update the offset and cache. */
> 
> This patch solves the problem.
> 
Thank you, but apparently this is not the end of the story.

I committed the linuxolator part of the change, but I think that the tmpfs
change is still incomplete. Strictly following getdirentries(2), tmpfs
must return EINVAL in the case when not even a single record can be returned.
Currently, it indicates E
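
The getdirentries(2) rule described above, together with the tmpfs hunk
(advance the cursor only after a successful copy), can be sketched in
userland C. This is only an illustration under stated assumptions:
struct rec and copy_records() are hypothetical names, not tmpfs code,
and a plain memory copy stands in for uiomove().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a directory record (not struct dirent). */
struct rec {
    size_t reclen;      /* bytes this record occupies in the output */
    const char *name;   /* entry name, copied into the record */
};

/*
 * Copy records recs[*idx .. n) into buf until the next one no longer fits.
 * *idx is advanced only after a record has been copied successfully.
 * Returns 0 and stores the byte count in *copied, or EINVAL when not even
 * the first requested record fits into buflen.
 */
static int
copy_records(const struct rec *recs, size_t n, size_t *idx,
    char *buf, size_t buflen, size_t *copied)
{
    size_t used = 0;

    assert(*idx <= n);
    *copied = 0;
    while (*idx < n) {
        const struct rec *r = &recs[*idx];

        assert(r->reclen > 0);
        if (r->reclen > buflen - used) {
            if (used == 0)
                return (EINVAL);    /* buffer too small for one record */
            break;                  /* partial result, resume later */
        }
        memset(buf + used, 0, r->reclen);
        strncpy(buf + used, r->name, r->reclen - 1);
        used += r->reclen;
        (*idx)++;                   /* advance only after success */
    }
    *copied = used;
    return (0);
}
```

A caller that receives EINVAL knows it must retry with a larger buffer,
and because *idx did not move, no entry is skipped on the retry.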

Re: Running linux ldconfig on tmpfs results in unkillable process

2011-01-18 Thread Beat Gätzi
On 18.01.2011 17:13, Kostik Belousov wrote:
> On Tue, Jan 18, 2011 at 04:34:10PM +0100, Beat Gätzi wrote:
>> On 18.01.2011 15:46, Kostik Belousov wrote:
>>> On Tue, Jan 18, 2011 at 03:16:27PM +0100, Beat Gätzi wrote:
>>>> Hi,
>>>>
>>>> I've a tinderbox which uses tmpfs to build ports. Every time I build a
>>>> port which executes linux ldconfig it results in an unkillable process
>>>> which uses 100% CPU. The problem is reproduceable without tinderbox:
>>>>
>>>> # uname -a
>>>> FreeBSD daedalus.network.local 9.0-CURRENT FreeBSD 9.0-CURRENT #3
>>>> r216761: Tue Dec 28 15:32:26 CET 2010
>>>> root@daedalus.network.local:/usr/obj/usr/src/sys/GENERIC  i386
>>>> # mkdir /compat/test
>>>> # mount -t tmpfs tmpfs /compat/test
>>>> # cp -Rp /compat/linux/* /compat/test/
>>>> # mount -t linprocfs linprocfs /compat/test/proc
>>>> # /compat/linux/sbin/ldconfig -r /compat/test/
>>>> # pgrep ldconfig
>>>> 3449
>>>> # procstat -i 3449 | grep KILL
>>>>  3449 ldconfig KILL ---
>>>> # kill -9 3449
>>>> # procstat -i 3449 | grep KILL
>>>>  3449 ldconfig KILL P--
>>>>
>>>> From top(1):
>>>>  PID USERNAME THR PRI NICE  SIZE   RES STATE  C   TIME    WCPU COMMAND
>>>> 3449 root       1  44    0  992K  712K CPU1   1  10:06 100.00% ldconfig
>>>>
>>>> When I reboot the machine it hangs after "All buffers synced.".
>>>>
>>>> I've uploaded some additional output of procstat and ktrace here:
>>>> http://people.freebsd.org/~beat/logs/linux-ldconfig-tmpfs.txt
>>>>
>>>> Anyone knows how to fix this?
>>> kdump for the trace of the linux binary is a garbage. You need to
>>> use linux_kdump (from ports).
>>>
>>> I think that your process is looping in the kernel, you can confirm this
>>> by dropping in the ddb and doing "bt ".
>>
>> I've uploaded a screenshot from the output of bt  in ddb:
>> http://people.freebsd.org/~beat/logs/linux-ldconfig-tmpfs-bt.jpg
> 
> Please try this.
> 
> diff --git a/sys/compat/linux/linux_file.c b/sys/compat/linux/linux_file.c
> index 9ff1cf0..44ad193 100644
> --- a/sys/compat/linux/linux_file.c
> +++ b/sys/compat/linux/linux_file.c
> @@ -369,7 +369,6 @@ getdents_common(struct thread *td, struct 
> linux_getdents64_args *args,
>   lbuf = malloc(LINUX_MAXRECLEN, M_TEMP, M_WAITOK | M_ZERO);
>   vn_lock(vp, LK_SHARED | LK_RETRY);
>  
> -again:
>   aiov.iov_base = buf;
>   aiov.iov_len = buflen;
>   auio.uio_iov = &aiov;
> @@ -506,8 +505,10 @@ again:
>   break;
>   }
>  
> - if (outp == (caddr_t)args->dirent)
> - goto again;
> + if (outp == (caddr_t)args->dirent) {
> + nbytes = resid;
> + goto eof;
> + }
>  
>   fp->f_offset = off;
>   if (justone)
> diff --git a/sys/fs/tmpfs/tmpfs_subr.c b/sys/fs/tmpfs/tmpfs_subr.c
> index 84a2038..62dd0bf 100644
> --- a/sys/fs/tmpfs/tmpfs_subr.c
> +++ b/sys/fs/tmpfs/tmpfs_subr.c
> @@ -827,9 +827,10 @@ tmpfs_dir_getdents(struct tmpfs_node *node, struct uio 
> *uio, off_t *cntp)
>   /* Copy the new dirent structure into the output buffer and
>* advance pointers. */
>   error = uiomove(&d, d.d_reclen, uio);
> -
> - (*cntp)++;
> - de = TAILQ_NEXT(de, td_entries);
> + if (error == 0) {
> + (*cntp)++;
> + de = TAILQ_NEXT(de, td_entries);
> + }
>   } while (error == 0 && uio->uio_resid > 0 && de != NULL);
>  
>   /* Update the offset and cache. */

This patch solves the problem.

Thanks a lot!
Beat


Re: Running linux ldconfig on tmpfs results in unkillable process

2011-01-18 Thread Kostik Belousov
On Tue, Jan 18, 2011 at 04:34:10PM +0100, Beat Gätzi wrote:
> On 18.01.2011 15:46, Kostik Belousov wrote:
> > On Tue, Jan 18, 2011 at 03:16:27PM +0100, Beat Gätzi wrote:
> >> Hi,
> >>
> >> I've a tinderbox which uses tmpfs to build ports. Every time I build a
> >> port which executes linux ldconfig it results in an unkillable process
> >> which uses 100% CPU. The problem is reproduceable without tinderbox:
> >>
> >> # uname -a
> >> FreeBSD daedalus.network.local 9.0-CURRENT FreeBSD 9.0-CURRENT #3
> >> r216761: Tue Dec 28 15:32:26 CET 2010
> >> root@daedalus.network.local:/usr/obj/usr/src/sys/GENERIC  i386
> >> # mkdir /compat/test
> >> # mount -t tmpfs tmpfs /compat/test
> >> # cp -Rp /compat/linux/* /compat/test/
> >> # mount -t linprocfs linprocfs /compat/test/proc
> >> # /compat/linux/sbin/ldconfig -r /compat/test/
> >> # pgrep ldconfig
> >> 3449
> >> # procstat -i 3449 | grep KILL
> >>  3449 ldconfig KILL ---
> >> # kill -9 3449
> >> # procstat -i 3449 | grep KILL
> >>  3449 ldconfig KILL P--
> >>
> >> From top(1):
> >>  PID USERNAME THR PRI NICE  SIZE   RES STATE  C   TIME    WCPU COMMAND
> >> 3449 root       1  44    0  992K  712K CPU1   1  10:06 100.00% ldconfig
> >>
> >> When I reboot the machine it hangs after "All buffers synced.".
> >>
> >> I've uploaded some additional output of procstat and ktrace here:
> >> http://people.freebsd.org/~beat/logs/linux-ldconfig-tmpfs.txt
> >>
> >> Anyone knows how to fix this?
> > kdump for the trace of the linux binary is a garbage. You need to
> > use linux_kdump (from ports).
> > 
> > I think that your process is looping in the kernel, you can confirm this
> > by dropping in the ddb and doing "bt ".
> 
> I've uploaded a screenshot from the output of bt  in ddb:
> http://people.freebsd.org/~beat/logs/linux-ldconfig-tmpfs-bt.jpg

Please try this.

diff --git a/sys/compat/linux/linux_file.c b/sys/compat/linux/linux_file.c
index 9ff1cf0..44ad193 100644
--- a/sys/compat/linux/linux_file.c
+++ b/sys/compat/linux/linux_file.c
@@ -369,7 +369,6 @@ getdents_common(struct thread *td, struct 
linux_getdents64_args *args,
lbuf = malloc(LINUX_MAXRECLEN, M_TEMP, M_WAITOK | M_ZERO);
vn_lock(vp, LK_SHARED | LK_RETRY);
 
-again:
aiov.iov_base = buf;
aiov.iov_len = buflen;
auio.uio_iov = &aiov;
@@ -506,8 +505,10 @@ again:
break;
}
 
-   if (outp == (caddr_t)args->dirent)
-   goto again;
+   if (outp == (caddr_t)args->dirent) {
+   nbytes = resid;
+   goto eof;
+   }
 
fp->f_offset = off;
if (justone)
diff --git a/sys/fs/tmpfs/tmpfs_subr.c b/sys/fs/tmpfs/tmpfs_subr.c
index 84a2038..62dd0bf 100644
--- a/sys/fs/tmpfs/tmpfs_subr.c
+++ b/sys/fs/tmpfs/tmpfs_subr.c
@@ -827,9 +827,10 @@ tmpfs_dir_getdents(struct tmpfs_node *node, struct uio 
*uio, off_t *cntp)
/* Copy the new dirent structure into the output buffer and
 * advance pointers. */
error = uiomove(&d, d.d_reclen, uio);
-
-   (*cntp)++;
-   de = TAILQ_NEXT(de, td_entries);
+   if (error == 0) {
+   (*cntp)++;
+   de = TAILQ_NEXT(de, td_entries);
+   }
} while (error == 0 && uio->uio_resid > 0 && de != NULL);
 
/* Update the offset and cache. */
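
For readers following the linux_file.c hunk above, here is a minimal
userland sketch in C of why a "retry when nothing was emitted" loop can
spin forever, and how treating that case as end-of-directory bounds it.
It assumes a backend read that makes no progress; dir_stub, read_stub,
getdents_retry and getdents_eof are hypothetical names, not kernel code,
and spin_limit exists only so that the sketch terminates.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical directory backend that never makes progress. */
struct dir_stub {
    int calls;          /* how many times the backend was asked */
};

/* Pretend to read directory entries; always produces zero bytes. */
static size_t
read_stub(struct dir_stub *d)
{
    d->calls++;
    return (0);
}

/*
 * Pre-patch shape: if nothing was emitted, jump back and read again.
 * With a backend that never advances, only spin_limit stops the loop.
 * Returns the number of backend calls made.
 */
static int
getdents_retry(struct dir_stub *d, int spin_limit)
{
    size_t emitted;

    assert(spin_limit > 0);
again:
    emitted = read_stub(d);
    if (emitted == 0 && d->calls < spin_limit)
        goto again;
    return (d->calls);
}

/*
 * Patched shape: if nothing was emitted, report zero bytes to the caller
 * (the "goto eof" path) instead of retrying.
 * Returns the number of bytes emitted.
 */
static int
getdents_eof(struct dir_stub *d)
{
    size_t emitted = read_stub(d);

    if (emitted == 0)
        return (0);
    return ((int)emitted);
}
```

In the retry variant the stub is called as many times as spin_limit
allows; in the patched variant it is called exactly once and the caller
sees an empty result.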




Re: Running linux ldconfig on tmpfs results in unkillable process

2011-01-18 Thread Beat Gätzi
On 18.01.2011 15:46, Kostik Belousov wrote:
> On Tue, Jan 18, 2011 at 03:16:27PM +0100, Beat Gätzi wrote:
>> Hi,
>>
>> I've a tinderbox which uses tmpfs to build ports. Every time I build a
>> port which executes linux ldconfig it results in an unkillable process
>> which uses 100% CPU. The problem is reproduceable without tinderbox:
>>
>> # uname -a
>> FreeBSD daedalus.network.local 9.0-CURRENT FreeBSD 9.0-CURRENT #3
>> r216761: Tue Dec 28 15:32:26 CET 2010
>> root@daedalus.network.local:/usr/obj/usr/src/sys/GENERIC  i386
>> # mkdir /compat/test
>> # mount -t tmpfs tmpfs /compat/test
>> # cp -Rp /compat/linux/* /compat/test/
>> # mount -t linprocfs linprocfs /compat/test/proc
>> # /compat/linux/sbin/ldconfig -r /compat/test/
>> # pgrep ldconfig
>> 3449
>> # procstat -i 3449 | grep KILL
>>  3449 ldconfig KILL ---
>> # kill -9 3449
>> # procstat -i 3449 | grep KILL
>>  3449 ldconfig KILL P--
>>
>> From top(1):
>>  PID USERNAME THR PRI NICE  SIZE   RES STATE  C   TIME    WCPU COMMAND
>> 3449 root       1  44    0  992K  712K CPU1   1  10:06 100.00% ldconfig
>>
>> When I reboot the machine it hangs after "All buffers synced.".
>>
>> I've uploaded some additional output of procstat and ktrace here:
>> http://people.freebsd.org/~beat/logs/linux-ldconfig-tmpfs.txt
>>
>> Anyone knows how to fix this?
> kdump for the trace of the linux binary is a garbage. You need to
> use linux_kdump (from ports).
> 
> I think that your process is looping in the kernel, you can confirm this
> by dropping in the ddb and doing "bt ".

I've uploaded a screenshot from the output of bt  in ddb:
http://people.freebsd.org/~beat/logs/linux-ldconfig-tmpfs-bt.jpg

Thanks,
Beat


Re: Running linux ldconfig on tmpfs results in unkillable process

2011-01-18 Thread Kostik Belousov
On Tue, Jan 18, 2011 at 03:16:27PM +0100, Beat Gätzi wrote:
> Hi,
> 
> I've a tinderbox which uses tmpfs to build ports. Every time I build a
> port which executes linux ldconfig it results in an unkillable process
> which uses 100% CPU. The problem is reproduceable without tinderbox:
> 
> # uname -a
> FreeBSD daedalus.network.local 9.0-CURRENT FreeBSD 9.0-CURRENT #3
> r216761: Tue Dec 28 15:32:26 CET 2010
> root@daedalus.network.local:/usr/obj/usr/src/sys/GENERIC  i386
> # mkdir /compat/test
> # mount -t tmpfs tmpfs /compat/test
> # cp -Rp /compat/linux/* /compat/test/
> # mount -t linprocfs linprocfs /compat/test/proc
> # /compat/linux/sbin/ldconfig -r /compat/test/
> # pgrep ldconfig
> 3449
> # procstat -i 3449 | grep KILL
>  3449 ldconfig KILL ---
> # kill -9 3449
> # procstat -i 3449 | grep KILL
>  3449 ldconfig KILL P--
> 
> From top(1):
>  PID USERNAME THR PRI NICE  SIZE   RES STATE  C   TIME    WCPU COMMAND
> 3449 root       1  44    0  992K  712K CPU1   1  10:06 100.00% ldconfig
> 
> When I reboot the machine it hangs after "All buffers synced.".
> 
> I've uploaded some additional output of procstat and ktrace here:
> http://people.freebsd.org/~beat/logs/linux-ldconfig-tmpfs.txt
> 
> Anyone knows how to fix this?
kdump output for a trace of a Linux binary is garbage. You need to
use linux_kdump (from ports).

I think that your process is looping in the kernel; you can confirm this
by dropping into ddb and doing "bt ".




Running linux ldconfig on tmpfs results in unkillable process

2011-01-18 Thread Beat Gätzi
Hi,

I've a tinderbox which uses tmpfs to build ports. Every time I build a
port which executes linux ldconfig it results in an unkillable process
which uses 100% CPU. The problem is reproducible without tinderbox:

# uname -a
FreeBSD daedalus.network.local 9.0-CURRENT FreeBSD 9.0-CURRENT #3
r216761: Tue Dec 28 15:32:26 CET 2010
root@daedalus.network.local:/usr/obj/usr/src/sys/GENERIC  i386
# mkdir /compat/test
# mount -t tmpfs tmpfs /compat/test
# cp -Rp /compat/linux/* /compat/test/
# mount -t linprocfs linprocfs /compat/test/proc
# /compat/linux/sbin/ldconfig -r /compat/test/
# pgrep ldconfig
3449
# procstat -i 3449 | grep KILL
 3449 ldconfig KILL ---
# kill -9 3449
# procstat -i 3449 | grep KILL
 3449 ldconfig KILL P--

From top(1):
 PID USERNAME THR PRI NICE  SIZE   RES STATE  C   TIME    WCPU COMMAND
3449 root       1  44    0  992K  712K CPU1   1  10:06 100.00% ldconfig

When I reboot the machine it hangs after "All buffers synced.".

I've uploaded some additional output of procstat and ktrace here:
http://people.freebsd.org/~beat/logs/linux-ldconfig-tmpfs.txt

Does anyone know how to fix this?

Thanks,
Beat
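
In the procstat -i output above, the KILL flags change from "---" to
"P--" after kill -9; as far as I read procstat(1), the first flag column
marks a signal that is pending. The sketch below is only a loose userland
analogy in C of a signal that has been sent but not yet acted upon: it
blocks a signal, raises it, and checks sigpending(). SIGKILL itself
cannot be blocked, so the function is meant to be called with something
like SIGUSR1. The name pending_while_blocked is hypothetical, and the
sketch assumes a single-threaded process.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

/*
 * Block sig, send it to ourselves, and report whether it shows up as
 * pending. Returns 1 if pending, 0 if not, -1 on error. Before returning,
 * the pending instance is discarded (by briefly ignoring the signal) and
 * the previous mask and disposition are restored.
 */
static int
pending_while_blocked(int sig)
{
    sigset_t block, old, pend;
    struct sigaction ign, prev;
    int was_pending;

    assert(sig != SIGKILL && sig != SIGSTOP);   /* these cannot be blocked */

    sigemptyset(&block);
    sigaddset(&block, sig);
    if (sigprocmask(SIG_BLOCK, &block, &old) != 0)
        return (-1);
    if (raise(sig) != 0)
        return (-1);
    if (sigpending(&pend) != 0)
        return (-1);
    was_pending = sigismember(&pend, sig);

    /* Ignoring a pending signal discards it, so unblocking is safe. */
    ign.sa_handler = SIG_IGN;
    sigemptyset(&ign.sa_mask);
    ign.sa_flags = 0;
    if (sigaction(sig, &ign, &prev) != 0)
        return (-1);
    if (sigprocmask(SIG_SETMASK, &old, NULL) != 0)
        return (-1);
    if (sigaction(sig, &prev, NULL) != 0)
        return (-1);
    return (was_pending);
}
```

The analogy is limited: here the signal stays pending because it is
blocked, whereas in the report above the process presumably never reaches
a point where the pending SIGKILL would be acted upon.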


Re: unkillable process - 'mdconfig -t vnode' on small file

2002-12-01 Thread Hiten Pandya
On Sat, Nov 30, 2002 at 06:24:06PM +0100, Michal Mertl wrote the words in effect of:
> Subject says it all.
> 
> I wanted to make a vnode-backed md(4) and forgot to specify the size; that is,
> after 'touch mdfile;mdconfig -a -t vnode -f mdfile' the mdconfig process can't
> be killed. Its wchan ('ps axO wchan|grep mdconf') is mddest.

Hello.

I recently reported this issue, and Ian Dowse had a fix to correct this
situation in the mddestroy() routine in src/sys/dev/md/md.c.  Please
update your tree, and rebuild.

Cheers.

-- 
Hiten Pandya ([EMAIL PROTECTED], [EMAIL PROTECTED])
http://www.unixdaemons.com/~hiten/

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message



Re: unkillable process - 'mdconfig -t vnode' on small file

2002-11-30 Thread Ian Dowse
In message <[EMAIL PROTECTED]>, Michal 
Mertl writes:
>Subject says it all.

Fixed in md.c revision 1.74 - this was discussed here a few days
ago, but I was just waiting for approval to commit the fix.

Ian




unkillable process - 'mdconfig -t vnode' on small file

2002-11-30 Thread Michal Mertl
Subject says it all.

I wanted to make a vnode-backed md(4) and forgot to specify the size; that is,
after 'touch mdfile;mdconfig -a -t vnode -f mdfile' the mdconfig process can't
be killed. Its wchan ('ps axO wchan|grep mdconf') is mddest.

-- 
Michal Mertl
[EMAIL PROTECTED]






Re: unkillable process :-\

2002-07-29 Thread Julian Elischer

ddb
ps   (find address of proc)
cont

cd /sys/i386/compile/MYKERNEL
gdb -k kernel.debug /dev/mem
print *(struct proc *){address}
go to the thread..
 do the same.


maybe others can tell you other things to look at while you are in
there


(if you cannot get to ddb you may need to traverse the allproc list
to find your process.)
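
Traversing a process list to find one pid can be sketched as below. This
is a simplified illustration in C, not the kernel's data structure:
struct proc_stub and find_by_pid are hypothetical names, and a bare
"next" pointer stands in for however the real list is linked.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced stand-in for a process list entry. */
struct proc_stub {
    int pid;                    /* process id */
    struct proc_stub *next;     /* next entry, NULL at the end */
};

/* Walk the list from head and return the entry with this pid, or NULL. */
static struct proc_stub *
find_by_pid(struct proc_stub *head, int pid)
{
    struct proc_stub *p;

    assert(pid >= 0);
    for (p = head; p != NULL; p = p->next) {
        if (p->pid == pid)
            return (p);
    }
    return (NULL);
}
```

In a debugger the same walk is done by hand: print the head, follow the
link field, and stop at the entry whose pid matches.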





On Mon, 29 Jul 2002, Mikhail Teterin wrote:

> KOffice's kword is stuck here... Can not be killed even with -9.
> Sits idle, with its window open, but not updating:
> 
>  UID   PID  PPID CPU PRI NI   VSZ  RSS MWCHAN STAT  TT   TIME COMMAND
> 1042 88248 1   0  96  0 119296 28105 -  WWs   pm0:00,00 kword /tmp/k
> 
> Machine is otherwise fine, uptime 12 days -- my work desktop. -current
> of Tue Jul 16 13:14:05 EDT 2002 vintage. Any clues?
> 
>   -mi
> 
> 
> To Unsubscribe: send mail to [EMAIL PROTECTED]
> with "unsubscribe freebsd-current" in the body of the message
> 





Re: unkillable process :-\

2002-07-29 Thread Dan Nelson

In the last episode (Jul 29), Mikhail Teterin said:
> KOffice's kword is stuck here... Can not be killed even with -9.
> Sits idle, with its window open, but not updating:
> 
>  UID   PID  PPID CPU PRI NI   VSZ  RSS MWCHAN STAT  TT   TIME COMMAND
> 1042 88248 1   0  96  0 119296 28105 -  WWs   pm0:00,00 kword /tmp/k

The W in STAT means it's been swapped out.  I don't know what WW means,
though.

-- 
Dan Nelson
[EMAIL PROTECTED]




unkillable process :-\

2002-07-29 Thread Mikhail Teterin

KOffice's kword is stuck here... Can not be killed even with -9.
Sits idle, with its window open, but not updating:

 UID   PID  PPID CPU PRI NI   VSZ  RSS MWCHAN STAT  TT   TIME COMMAND
1042 88248 1   0  96  0 119296 28105 -  WWs   pm0:00,00 kword /tmp/k

Machine is otherwise fine, uptime 12 days -- my work desktop. -current
of Tue Jul 16 13:14:05 EDT 2002 vintage. Any clues?

-mi

