Re: ci.freebsd.org 's FreeBSD-head-{amd64, i386}-test started failing after -r337332 (last good), inp_gcmoptions involved

2018-08-11 Thread Matthew Macy
Thanks I'll take a look.

On Sat, Aug 11, 2018 at 10:41 AM Li-Wen Hsu  wrote:

> With the VMs images on artifact.ci.freebsd.org, I can reproduce this with:
>
> root@:/usr/tests/sys/netinet # kyua debug fibs_test:slaac_on_nondefault_fib6
> fib is 1
> fib is 2
> net.inet6.ip6.forwarding: 0 -> 1
> net.inet6.ip6.rfc6204w3: 0 -> 1
> /sbin/pfctl
> /sbin/ipf
> ipf: IP Filter: v5.1.2 (608)
> setfib 1 ifconfig epair0a inet6 2001:db8:115e:fc32::2/64 fib 1
> setfib 2 ifconfig epair0b inet6 -ifdisabled accept_rtadv fib 2 up
> Executing command [ ifconfig epair0b ]
> Executing co
>
> Fatal trap 9: general protection fault while in kernel mode
> cpuid = 0; apic id = 00
> instruction pointer = 0x20:0x80ded513
> stack pointer   = 0x28:0xfe0012158860
> frame pointer   = 0x28:0xfe00121588a0
> code segment= base 0x0, limit 0xf, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags= interrupt enabled, resume, IOPL = 0
> current process = 0 (softirq_0)
> [ thread pid 0 tid 100013 ]
> Stopped at  inp_gcmoptions+0xe3:  movq  ll+0x33f(%rax),%r9
> db> bt
> Tracing pid 0 tid 100013 td 0xf800031de000
> inp_gcmoptions() at inp_gcmoptions+0xe3/frame 0xfe00121588a0
> epoch_call_task() at epoch_call_task+0x21a/frame 0xfe00121588f0
> gtaskqueue_run_locked() at gtaskqueue_run_locked+0x139/frame 0xfe0012158940
> gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0x88/frame 0xfe0012158970
> fork_exit() at fork_exit+0x84/frame 0xfe00121589b0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfe00121589b0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> db>
>
>
> Li-Wen
>
> On Mon, Aug 6, 2018 at 9:53 AM Alan Somers  wrote:
> >
> > I can't reproduce the failure.  On my VM, with a kernel from Aug-2, the
> > test passes.  But it sure seems to be consistent in Jenkins.
> >
> > On Sun, Aug 5, 2018 at 6:59 PM, Matthew Macy  wrote:
> >>
> >> That looks like it is tied to changes I made 3 months ago. I won't be
> >> at my desk until the end of the week, but if it's consistent I can take a
> >> look.
> >>
> >> -M
> >>
> >> On Sun, Aug 5, 2018 at 17:57 Li-Wen Hsu  wrote:
> >>>
> >>> On Sun, Aug 5, 2018 at 6:23 PM Mark Millard  wrote:
> >>> >
> >>> > amd64: #8493 was for -r337342 and #8492 (last good) was for -r337332.
> >>> > More recent builds also failed. -r337342 and later also failed for
> >>> > i386.
> >>> >
> >>> > All but a sys/gettimeofday.2 change after -r337332 through -r337342
> >>> > are from Brad Davis. It is unclear to me how the changes match up
> >>> > with the below example (from the log for amd64). They might not?
> >>> >
> >>> > For example (i386 is similar):
> >>> >
> >>> > https://ci.freebsd.org/job/FreeBSD-head-amd64-test/8493/consoleText
> >>> >
> >>> > sys/netinet/fibs_test:subnet_route_with_multiple_fibs_on_same_subnet  ->
> >>> >
> >>> > Fatal trap 9: general protection fault while in kernel mode
> >>> > cpuid = 0; apic id = 00
> >>> > instruction pointer = 0x20:0x80ded213
> >>> > stack pointer   = 0x28:0xfe002648c960
> >>> > frame pointer   = 0x28:0xfe002648c9a0
> >>> > code segment= base 0x0, limit 0xf, type 0x1b
> >>> > = DPL 0, pres 1, long 1, def32 0, gran 1
> >>> > processor eflags= interrupt enabled, resume, IOPL = 0
> >>> > current process = 0 (softirq_0)
> >>> > [ thread pid 0 tid 100013 ]
> >>> > Stopped at  inp_gcmoptions+0xe3:  movq  ll+0x33f(%rax),%r9
> >>>
> >>> I think this is because we are trying to enable more tests:
> >>> https://github.com/freebsd/freebsd-ci/pull/25
> >>>
> >>> I'm looking into that.  If I cannot resolve this quickly I will revert
> >>> it temporarily.
> >>>
> >>> Li-Wen
> >>>
> >>> --
> >>> Li-Wen Hsu 
> >>> https://lwhsu.org
> >>> ___
> >>> freebsd-current@freebsd.org mailing list
> >>> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> >>> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
> >
> >
>
>
> --
> Li-Wen Hsu 
> https://lwhsu.org
>
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: ci.freebsd.org 's FreeBSD-head-{amd64, i386}-test started failing after -r337332 (last good), inp_gcmoptions involved

2018-08-11 Thread Li-Wen Hsu
With the VMs images on artifact.ci.freebsd.org, I can reproduce this with:

root@:/usr/tests/sys/netinet # kyua debug fibs_test:slaac_on_nondefault_fib6
fib is 1
fib is 2
net.inet6.ip6.forwarding: 0 -> 1
net.inet6.ip6.rfc6204w3: 0 -> 1
/sbin/pfctl
/sbin/ipf
ipf: IP Filter: v5.1.2 (608)
setfib 1 ifconfig epair0a inet6 2001:db8:115e:fc32::2/64 fib 1
setfib 2 ifconfig epair0b inet6 -ifdisabled accept_rtadv fib 2 up
Executing command [ ifconfig epair0b ]
Executing co

Fatal trap 9: general protection fault while in kernel mode
cpuid = 0; apic id = 00
instruction pointer = 0x20:0x80ded513
stack pointer   = 0x28:0xfe0012158860
frame pointer   = 0x28:0xfe00121588a0
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 0 (softirq_0)
[ thread pid 0 tid 100013 ]
Stopped at  inp_gcmoptions+0xe3:  movq  ll+0x33f(%rax),%r9
db> bt
Tracing pid 0 tid 100013 td 0xf800031de000
inp_gcmoptions() at inp_gcmoptions+0xe3/frame 0xfe00121588a0
epoch_call_task() at epoch_call_task+0x21a/frame 0xfe00121588f0
gtaskqueue_run_locked() at gtaskqueue_run_locked+0x139/frame 0xfe0012158940
gtaskqueue_thread_loop() at gtaskqueue_thread_loop+0x88/frame 0xfe0012158970
fork_exit() at fork_exit+0x84/frame 0xfe00121589b0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe00121589b0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
db>


Li-Wen

On Mon, Aug 6, 2018 at 9:53 AM Alan Somers  wrote:
>
> I can't reproduce the failure.  On my VM, with a kernel from Aug-2, the test 
> passes.  But it sure seems to be consistent in Jenkins.
>
> On Sun, Aug 5, 2018 at 6:59 PM, Matthew Macy  wrote:
>>
>> That looks like it is tied to changes I made 3 months ago. I won't be at my 
>> desk until the end of the week, but if it's consistent I can take a look.
>>
>> -M
>>
>> On Sun, Aug 5, 2018 at 17:57 Li-Wen Hsu  wrote:
>>>
>>> On Sun, Aug 5, 2018 at 6:23 PM Mark Millard  wrote:
>>> >
>>> > amd64: #8493 was for -r337342 and #8492 (last good) was for -r337332.
>>> > More recent builds also failed. -r337342 and later also failed for
>>> > i386.
>>> >
>>> > All but a sys/gettimeofday.2 change after -r337332 through -r337342
>>> > are from Brad Davis. It is unclear to me how the changes match up
>>> > with the below example (from the log for amd64). They might not?
>>> >
>>> > For example (i386 is similar):
>>> >
>>> > https://ci.freebsd.org/job/FreeBSD-head-amd64-test/8493/consoleText
>>> >
>>> > sys/netinet/fibs_test:subnet_route_with_multiple_fibs_on_same_subnet  ->
>>> >
>>> > Fatal trap 9: general protection fault while in kernel mode
>>> > cpuid = 0; apic id = 00
>>> > instruction pointer = 0x20:0x80ded213
>>> > stack pointer   = 0x28:0xfe002648c960
>>> > frame pointer   = 0x28:0xfe002648c9a0
>>> > code segment= base 0x0, limit 0xf, type 0x1b
>>> > = DPL 0, pres 1, long 1, def32 0, gran 1
>>> > processor eflags= interrupt enabled, resume, IOPL = 0
>>> > current process = 0 (softirq_0)
>>> > [ thread pid 0 tid 100013 ]
>>> > Stopped at  inp_gcmoptions+0xe3:  movq  ll+0x33f(%rax),%r9
>>>
>>> I think this is because we are trying to enable more tests:
>>> https://github.com/freebsd/freebsd-ci/pull/25
>>>
>>> I'm looking into that.  If I cannot resolve this quickly I will revert
>>> it temporarily.
>>>
>>> Li-Wen
>>>
>>> --
>>> Li-Wen Hsu 
>>> https://lwhsu.org
>>> ___
>>> freebsd-current@freebsd.org mailing list
>>> https://lists.freebsd.org/mailman/listinfo/freebsd-current
>>> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
>
>


--
Li-Wen Hsu 
https://lwhsu.org


Re: panic after ifioctl/if_clone_destroy

2018-08-11 Thread Roman Bogorodskiy
  Hans Petter Selasky wrote:

> On 8/11/18 9:44 AM, Roman Bogorodskiy wrote:
> >Hans Petter Selasky wrote:
> > 
> >> On 08/06/18 21:43, Matthew Macy wrote:
> >>> The struct thread is typesafe. The problem is that the link is no longer
> >>> typesafe now that it’s not part of the thread. Thanks for pointing this
> >>> out. I’ll commit a fix later today.
> >>>
> >>
> >> Is there a patch yet?
> >>
> >> --HPS
> >>
> > 
> > This was committed in:
> > 
> > https://svnweb.freebsd.org/changeset/base/337525
> > 
> > However, I've just updated to r337595, and it still panics. Not sure if
> > that's related to the original issue though:
> > 
> > (kgdb) #0  doadump (textdump=0) at pcpu.h:230
> > #1  0x8043ddfb in db_dump (dummy=,
> >  dummy2=, dummy3=,
> >  dummy4=) at /usr/src/sys/ddb/db_command.c:574
> > #2  0x8043dbc9 in db_command (cmd_table=)
> >  at /usr/src/sys/ddb/db_command.c:481
> > #3  0x8043d944 in db_command_loop ()
> >  at /usr/src/sys/ddb/db_command.c:534
> > #4  0x80440b6f in db_trap (type=,
> >  code=) at /usr/src/sys/ddb/db_main.c:252
> > #5  0x80bdef83 in kdb_trap (type=9, code=0, tf=)
> >  at /usr/src/sys/kern/subr_kdb.c:693
> > #6  0x8107aee1 in trap_fatal (frame=0xfe00760dc8a0, eva=0)
> >  at /usr/src/sys/amd64/amd64/trap.c:906
> > #7  0x8107a3bd in trap (frame=0xfe00760dc8a0) at counter.h:87
> > #8  0x81054d05 in calltrap ()
> >  at /usr/src/sys/amd64/amd64/exception.S:232
> > #9  0x80ded513 in inp_gcmoptions (ctx=0xf80003079f20)
> >  at epoch_private.h:188
> > #10 0x80bd9cba in epoch_call_task (arg=)
> >  at /usr/src/sys/kern/subr_epoch.c:507
> > #11 0x80bdd0a9 in gtaskqueue_run_locked (queue=0xf800035be900)
> >  at /usr/src/sys/kern/subr_gtaskqueue.c:332
> > #12 0x80bdce28 in gtaskqueue_thread_loop (arg=)
> >  at /usr/src/sys/kern/subr_gtaskqueue.c:507
> > #13 0x80b530c4 in fork_exit (
> >  callout=0x80bdcda0 ,
> >  arg=0xfe00061a4038, frame=0xfffffe00760dcac0)
> >  at /usr/src/sys/kern/kern_fork.c:1057
> > #14 0x81055cde in fork_trampoline ()
> >  at /usr/src/sys/amd64/amd64/exception.S:990
> > #15 0x in ?? ()
> > Current language:  auto; currently minimal
> > (kgdb)
> > 
> > Full core.txt is here: 
> > https://people.freebsd.org/~novel/misc/core.20180811.txt
> > 
> > Roman Bogorodskiy
> > 
> 
> What is the full panic message? Are you loading // unloading any network 
> modules?
> 
> --HPS

Fatal trap 9: general protection fault while in kernel mode
cpuid = 2; apic id = 04
instruction pointer     = 0x20:0x80ded513
stack pointer   = 0x28:0xfe00760dc960
frame pointer   = 0x28:0xfe00760dc9a0
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 0 (softirq_2)

(more details in
https://people.freebsd.org/~novel/misc/core.20180811.txt)

Panic happens right after boot. I do have:

if_tap_load="YES"
if_bridge_load="YES"

in /boot/loader.conf.

Just as before, the panic happens after creating/renaming bridge and tap
interfaces. The last few lines before the panic (as can be seen in
core.20180811.txt linked above):

bridge0: Ethernet address: 02:af:41:48:c7:00
bridge0: changing name to 'virbr0'
tap0: Ethernet address: 00:bd:95:08:f7:00
tap0: link state changed to UP
tap0: changing name to 'virbr0-nic'
virbr0-nic: promiscuous mode enabled
virbr0: link state changed to UP
virbr0-nic: link state changed to DOWN
virbr0: link state changed to DOWN
bridge0: Ethernet address: 02:af:41:48:c7:00
bridge0: changing name to 'virbr-hostnet'
tap0: Ethernet address: 00:bd:e5:0b:f7:00
tap0: link state changed to UP
tap0: changing name to 'virbr-honet-nic'
virbr-honet-nic: promiscuous mode enabled
virbr-hostnet: link state changed to UP

Roman Bogorodskiy




Re: Unexpected results with 'mergemaster -Ui'

2018-08-11 Thread Graham Perrin
On 11/08/2018 09:02, Graham Perrin wrote:
> … Also today, with an earlier run of mergemaster, when I _did_ choose to 
> install a temporary file, the installation failed. I didn't keep a note of 
> the specifics but a file mode was mentioned.

For reference only

I built a slightly more recent world and kernel,
installed the kernel,
could not reproduce the installation failure with mergemaster.

(Side note: could not installworld, bugged as outlined at 
. I guess 
that the breakage will be fixed in due course.)



Re: panic after ifioctl/if_clone_destroy

2018-08-11 Thread Hans Petter Selasky

On 8/11/18 9:44 AM, Roman Bogorodskiy wrote:
>    Hans Petter Selasky wrote:
>
>> On 08/06/18 21:43, Matthew Macy wrote:
>>> The struct thread is typesafe. The problem is that the link is no longer
>>> typesafe now that it’s not part of the thread. Thanks for pointing this
>>> out. I’ll commit a fix later today.
>>>
>>
>> Is there a patch yet?
>>
>> --HPS
>>
>
> This was committed in:
>
> https://svnweb.freebsd.org/changeset/base/337525
>
> However, I've just updated to r337595, and it still panics. Not sure if
> that's related to the original issue though:
>
> (kgdb) #0  doadump (textdump=0) at pcpu.h:230
> #1  0x8043ddfb in db_dump (dummy=,
>  dummy2=, dummy3=,
>  dummy4=) at /usr/src/sys/ddb/db_command.c:574
> #2  0x8043dbc9 in db_command (cmd_table=)
>  at /usr/src/sys/ddb/db_command.c:481
> #3  0x8043d944 in db_command_loop ()
>  at /usr/src/sys/ddb/db_command.c:534
> #4  0x80440b6f in db_trap (type=,
>  code=) at /usr/src/sys/ddb/db_main.c:252
> #5  0x80bdef83 in kdb_trap (type=9, code=0, tf=)
>  at /usr/src/sys/kern/subr_kdb.c:693
> #6  0x8107aee1 in trap_fatal (frame=0xfe00760dc8a0, eva=0)
>  at /usr/src/sys/amd64/amd64/trap.c:906
> #7  0x8107a3bd in trap (frame=0xfe00760dc8a0) at counter.h:87
> #8  0x81054d05 in calltrap ()
>  at /usr/src/sys/amd64/amd64/exception.S:232
> #9  0x80ded513 in inp_gcmoptions (ctx=0xf80003079f20)
>  at epoch_private.h:188
> #10 0x80bd9cba in epoch_call_task (arg=)
>  at /usr/src/sys/kern/subr_epoch.c:507
> #11 0x80bdd0a9 in gtaskqueue_run_locked (queue=0xf800035be900)
>  at /usr/src/sys/kern/subr_gtaskqueue.c:332
> #12 0x80bdce28 in gtaskqueue_thread_loop (arg=)
>  at /usr/src/sys/kern/subr_gtaskqueue.c:507
> #13 0x80b530c4 in fork_exit (
>  callout=0x80bdcda0 ,
>  arg=0xfe00061a4038, frame=0xfe00760dcac0)
>  at /usr/src/sys/kern/kern_fork.c:1057
> #14 0x81055cde in fork_trampoline ()
>  at /usr/src/sys/amd64/amd64/exception.S:990
> #15 0x in ?? ()
> Current language:  auto; currently minimal
> (kgdb)
>
> Full core.txt is here: https://people.freebsd.org/~novel/misc/core.20180811.txt
>
> Roman Bogorodskiy
>

What is the full panic message? Are you loading // unloading any network 
modules?


--HPS


Re: ffs_truncate3 panics

2018-08-11 Thread Konstantin Belousov
On Sat, Aug 11, 2018 at 12:05:25PM +, Rick Macklem wrote:
> Konstantin Belousov wrote:
> >On Thu, Aug 09, 2018 at 08:38:50PM +, Rick Macklem wrote:
> >> >BTW, does NFS server use extended attributes ?  What for ?  Can you,
> >> >please, point out the code which does this ?
> >> For the pNFS service, there are two system namespace extended attributes
> >> for each file stored on the service.
> >> pnfsd.dsfile - Stores where the data for the file is. Can be displayed by
> >>  the pnfsdsfile(8) command.
> >>
> >> pnfsd.dsattr - Cached attributes that change when a file is written (size,
> >> mtime, change) so that the MDS doesn't have to do a Getattr on the data
> >> server for every client Getattr.
> >>
> >
> >My reading of the nfsd code + ffs extattr handling reminds me that you
> >already reported this issue some time ago.  I suspected ufs_balloc() at
> >that time.
> Yes. I had almost forgotten about them, because I have been testing with a
> couple of machines (not big, but amd64 with a few Gbytes of RAM) and they
> never hit the panic(). Recently, I've been using the 256Mbyte i386 and started
> seeing them again.
> 
> >Now I think that the situation with the stray buffers hanging on the
> >queue is legitimate, ffs_extread() might create such buffer and release
> >it to a clean queue, then removal of the file would see inode with no
> >allocated ext blocks but with the buffer.
> >
> >I think the easiest way to handle it is to always flush buffers and pages
> >in the ext attr range, regardless of the number of allocated ext blocks.
> >Patch below was not tested.
> [patch deleted for brevity]
> Well, the above sounds reasonable, but the patch didn't help.
> Here's a small portion of the log of a test run last night.
> - First, a couple of things about the printf()s. When they start with "CL=",
>   the printf() is at the start of ffs_truncate(). "" is a static counter of
>   calls to ffs_truncate(), so "same value" indicates same call.
> 
> 
> CL=31816 flags=0xc00 vtyp=1 bodirty=0 boclean=1 diextsiz=320
> buf at 0x429f260
> b_flags = 0x20001020, b_xflags=0x2, b_vflags=0x0
> b_error = 0, b_bufsize = 4096, b_bcount = 4096, b_resid = 0
> b_bufobj = (0xfa3f734), b_data = 0x4c9, b_blkno = -1, b_lblkno = -1, b_dep = 0
> b_kvabase = 0x4c9, b_kvasize = 32768
> 
> CL=34593 flags=0xc00 vtyp=1 bodirty=0 boclean=1 diextsiz=320
> buf at 0x429deb0
> b_flags = 0x20001020, b_xflags=0x2, b_vflags=0x0
> b_error = 0, b_bufsize = 4096, b_bcount = 4096, b_resid = 0
> b_bufobj = (0xfd3da94), b_data = 0x570, b_blkno = -1, b_lblkno = -1, b_dep = 0
> b_kvabase = 0x570, b_kvasize = 32768
> 
> FFST3=34593 vtyp=1 bodirty=0 boclean=1
> buf at 0x429deb0
> b_flags = 0x20001020, b_xflags=0x2, b_vflags=0x0
> b_error = 0, b_bufsize = 4096, b_bcount = 4096, b_resid = 0
> b_bufobj = (0xfd3da94), b_data = 0x570, b_blkno = -1, b_lblkno = -1, b_dep = 0
> b_kvabase = 0x570, b_kvasize = 32768
The problem with this buffer is that the BX_ALTDATA bit is not set.
This is the reason why vinvalbuf(V_ALT) skips it.
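
To illustrate the point about BX_ALTDATA, here is a minimal sketch. It assumes the b_xflags bit values below match sys/sys/buf.h; they are quoted from memory and should be checked against the tree. The helper name xflags_is_altdata() is hypothetical and only models the flag test, not the real vinvalbuf() code path.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * b_xflags bit values as recalled from FreeBSD's sys/sys/buf.h.
 * Assumption for illustration only; verify against your source tree.
 */
#define BX_VNDIRTY	0x00000001u	/* buffer is on the vnode's dirty list */
#define BX_VNCLEAN	0x00000002u	/* buffer is on the vnode's clean list */
#define BX_ALTDATA	0x00000040u	/* buffer holds extended attribute data */

/*
 * Hypothetical helper: returns true if a pass restricted to alternate
 * (extattr) data would consider a buffer with these b_xflags, and false
 * if it would skip the buffer.
 */
static bool
xflags_is_altdata(uint32_t b_xflags)
{
	return ((b_xflags & BX_ALTDATA) != 0);
}
```

With b_xflags=0x2, as in the buffer dumps quoted above, only the clean-list bit is set, so xflags_is_altdata(0x2) is false and a V_ALT-only pass would leave the buffer in place.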

> 
> So, the first one is what typically happens and there would be no panic().
>  The second/third would be a panic(), since the one that starts with "FFST3"
> is a printf() that replaces the panic() call.
> - Looking at the second/third, the number at the beginning is the same, so it
>   is the same call, but for some reason, between the start of the function and
>   where the ffs_truncate3 panic() test is, di_extsize has been set to 0, but
>   the buffer is still there (or has been re-created there by another thread?).
> 
> Looking at the code, I can't see how this could happen, since there is a
> vinvalbuf() call after the only place in the code that sets di_extsize == 0,
> from what I can see?
> I am going to add printf()s after the vinvalbuf() calls, to make sure they are
> happening and getting rid of the buffer.
> 
> If another thread could somehow (re)create the buffer concurrently with the
> ffs_truncate() call, that would explain it, I think?
The vnode is exclusively locked. Another thread must not be able to
instantiate a buffer under us.

> 
> Just a wild guess, but I suspect softdep_slowdown() is flipping, due to the
> small size of the machine and this makes the behaviour of ffs_truncate()
> confusing.

This is the patch that I posted a long time ago.  It is obviously related
to the missing BX_ALTDATA.  Can you add this patch to your kernel ?

diff --git a/sys/ufs/ffs/ffs_balloc.c b/sys/ufs/ffs/ffs_balloc.c
index 552c295753d..6d89a229ea7 100644
--- a/sys/ufs/ffs/ffs_balloc.c
+++ b/sys/ufs/ffs/ffs_balloc.c
@@ -682,8 +682,16 @@ ffs_balloc_ufs2(struct vnode *vp, off_t startoffset, int size,
 ffs_blkpref_ufs2(ip, lbn, (int)lbn,
 &dp->di_extb[0]), osize, nsize, flags,
 cred, &bp);
-   if (error)
+   if (error != 

Re: ffs_truncate3 panics

2018-08-11 Thread Rick Macklem
Konstantin Belousov wrote:
>On Thu, Aug 09, 2018 at 08:38:50PM +, Rick Macklem wrote:
>> >BTW, does NFS server use extended attributes ?  What for ?  Can you, please,
>> >point out the code which does this ?
>> For the pNFS service, there are two system namespace extended attributes for
>> each file stored on the service.
>> pnfsd.dsfile - Stores where the data for the file is. Can be displayed by the
>>  pnfsdsfile(8) command.
>>
>> pnfsd.dsattr - Cached attributes that change when a file is written (size,
>> mtime, change) so that the MDS doesn't have to do a Getattr on the data
>> server for every client Getattr.
>>
>
>My reading of the nfsd code + ffs extattr handling reminds me that you
>already reported this issue some time ago.  I suspected ufs_balloc() at
>that time.
Yes. I had almost forgotten about them, because I have been testing with a
couple of machines (not big, but amd64 with a few Gbytes of RAM) and they
never hit the panic(). Recently, I've been using the 256Mbyte i386 and started
seeing them again.

>Now I think that the situation with the stray buffers hanging on the
>queue is legitimate, ffs_extread() might create such buffer and release
>it to a clean queue, then removal of the file would see inode with no
>allocated ext blocks but with the buffer.
>
>I think the easiest way to handle it is to always flush buffers and pages
>in the ext attr range, regardless of the number of allocated ext blocks.
>Patch below was not tested.
[patch deleted for brevity]
Well, the above sounds reasonable, but the patch didn't help.
Here's a small portion of the log of a test run last night.
- First, a couple of things about the printf()s. When they start with "CL=",
  the printf() is at the start of ffs_truncate(). "" is a static counter of
  calls to ffs_truncate(), so "same value" indicates same call.
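
The counter scheme described above can be sketched as follows. This is a userland illustration, not the actual instrumentation: traced_call() and its two parameters are hypothetical stand-ins for ffs_truncate() and for the value of di_extsize at entry and at the point of the former panic() test.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for the instrumented function. The real
 * ffs_truncate() takes a vnode and other arguments; only the logging
 * pattern is modelled here.
 */
static unsigned long
traced_call(int extsize_at_entry, int extsize_at_check)
{
	static unsigned long ncalls;	/* shared by all calls; unsynchronized */
	unsigned long id = ++ncalls;	/* snapshot so both lines carry one id */

	printf("CL=%lu diextsiz=%d\n", id, extsize_at_entry);

	/* ... the body of the function would run here ... */

	if (extsize_at_check == 0)
		printf("FFST3=%lu\n", id);	/* where the panic() test was */
	return (id);
}
```

Because the id is copied into a local at entry, a "CL=" line and a later "FFST3=" line with the same number come from the same call, even if other calls are logged in between. In a kernel the unsynchronized increment could race between threads, so an atomic increment may be preferable there.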


CL=31816 flags=0xc00 vtyp=1 bodirty=0 boclean=1 diextsiz=320
buf at 0x429f260
b_flags = 0x20001020, b_xflags=0x2, b_vflags=0x0
b_error = 0, b_bufsize = 4096, b_bcount = 4096, b_resid = 0
b_bufobj = (0xfa3f734), b_data = 0x4c9, b_blkno = -1, b_lblkno = -1, b_dep = 0
b_kvabase = 0x4c9, b_kvasize = 32768

CL=34593 flags=0xc00 vtyp=1 bodirty=0 boclean=1 diextsiz=320
buf at 0x429deb0
b_flags = 0x20001020, b_xflags=0x2, b_vflags=0x0
b_error = 0, b_bufsize = 4096, b_bcount = 4096, b_resid = 0
b_bufobj = (0xfd3da94), b_data = 0x570, b_blkno = -1, b_lblkno = -1, b_dep = 0
b_kvabase = 0x570, b_kvasize = 32768

FFST3=34593 vtyp=1 bodirty=0 boclean=1
buf at 0x429deb0
b_flags = 0x20001020, b_xflags=0x2, b_vflags=0x0
b_error = 0, b_bufsize = 4096, b_bcount = 4096, b_resid = 0
b_bufobj = (0xfd3da94), b_data = 0x570, b_blkno = -1, b_lblkno = -1, b_dep = 0
b_kvabase = 0x570, b_kvasize = 32768

So, the first one is what typically happens and there would be no panic().
 The second/third would be a panic(), since the one that starts with "FFST3"
is a printf() that replaces the panic() call.
- Looking at the second/third, the number at the beginning is the same, so it is
  the same call, but for some reason, between the start of the function and
  where the ffs_truncate3 panic() test is, di_extsize has been set to 0, but the
  buffer is still there (or has been re-created there by another thread?).

Looking at the code, I can't see how this could happen, since there is a
vinvalbuf() call after the only place in the code that sets di_extsize == 0,
from what I can see?
I am going to add printf()s after the vinvalbuf() calls, to make sure they are
happening and getting rid of the buffer.

If another thread could somehow (re)create the buffer concurrently with the
ffs_truncate() call, that would explain it, I think?

Just a wild guess, but I suspect softdep_slowdown() is flipping, due to the
small size of the machine and this makes the behaviour of ffs_truncate()
confusing.

I'll post again when I have more info.
Thanks for looking at it, rick



Re: Unexpected results with 'mergemaster -Ui'

2018-08-11 Thread David Wolfskill
Of course, I then saw that you *had* an "ntpd" entry in your side
anyway.  Sorry for the noise.

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
Trump is gaslighting us: https://www.bbc.com/news/world-us-canada-44959300

See http://www.catwhisker.org/~david/publickey.gpg for my public key.




Re: Unexpected results with 'mergemaster -Ui'

2018-08-11 Thread David Wolfskill
On Sat, Aug 11, 2018 at 10:48:33AM +0200, Kurt Jaeger wrote:
> Hi!
> 
> >  line 133 onwards, for example.
> > 
> > I never before found any potential change to
> > /etc/group
> > 
> > Please, is it unusual?
> 
> No, it is normal, that happens if you have local changes.
> I suggest you discard the incoming /etc/group and go with your
> current version and the local changes.
> 

Caution:  if you do that, you will lose the (new) entry for the group
"ntpd", which is probably not what you want.

There was a recent(-ish) change that created the user & group "ntpd" (so
there should be a corresponding change to /etc/master.passwd, as well).

I recommend that you "merge" the changes: take the FreeBSD version ("r")
for the $FreeBSD: line and for the ntpd line; use yours ("l") for the
others, then -- optionally -- "view" the changed file (will use your
$PAGER -- from which you can invoke $EDITOR and make additional changes
if something doesn't look quite right), then "install" the changed file.

Tangentially related, I also recommend doing these updates within a
script(1) session, so you have a record in case there are questions
about what actually happened.

Peace,
david  (who updates daily)
-- 
David H. Wolfskill  da...@catwhisker.org
Trump is gaslighting us: https://www.bbc.com/news/world-us-canada-44959300

See http://www.catwhisker.org/~david/publickey.gpg for my public key.




Re: Unexpected results with 'mergemaster -Ui'

2018-08-11 Thread Kurt Jaeger
Hi!

>  line 133 onwards, for example.
> 
> I never before found any potential change to
> /etc/group
> 
> Please, is it unusual?

No, it is normal, that happens if you have local changes.
I suggest you discard the incoming /etc/group and go with your
current version and the local changes.

-- 
p...@freebsd.org +49 171 3101372  2 years to go !


Unexpected results with 'mergemaster -Ui'

2018-08-11 Thread Graham Perrin
 line 133 onwards, for example.

I never before found any potential change to
/etc/group

Please, is it unusual?

Lack of experience here. I have updated the system only a few times (maybe
fewer than a dozen); the result of mergemaster on this occasion took me by
surprise.

Results on all previous occasions were relatively terse, with much less to
consider/merge.



Also today, with an earlier run of mergemaster, when I _did_ choose to install 
a temporary file, the installation failed. I didn't keep a note of the 
specifics but a file mode was mentioned.


Re: panic after ifioctl/if_clone_destroy

2018-08-11 Thread Roman Bogorodskiy
  Hans Petter Selasky wrote:

> On 08/06/18 21:43, Matthew Macy wrote:
> > The struct thread is typesafe. The problem is that the link is no longer
> > typesafe now that it’s not part of the thread. Thanks for pointing this
> > out. I’ll commit a fix later today.
> > 
> 
> Is there a patch yet?
> 
> --HPS
> 

This was committed in:

https://svnweb.freebsd.org/changeset/base/337525

However, I've just updated to r337595, and it still panics. Not sure if
that's related to the original issue though:

(kgdb) #0  doadump (textdump=0) at pcpu.h:230
#1  0x8043ddfb in db_dump (dummy=,
dummy2=, dummy3=,
dummy4=) at /usr/src/sys/ddb/db_command.c:574
#2  0x8043dbc9 in db_command (cmd_table=)
at /usr/src/sys/ddb/db_command.c:481
#3  0x8043d944 in db_command_loop ()
at /usr/src/sys/ddb/db_command.c:534
#4  0x80440b6f in db_trap (type=,
code=) at /usr/src/sys/ddb/db_main.c:252
#5  0x80bdef83 in kdb_trap (type=9, code=0, tf=)
at /usr/src/sys/kern/subr_kdb.c:693
#6  0x8107aee1 in trap_fatal (frame=0xfe00760dc8a0, eva=0)
at /usr/src/sys/amd64/amd64/trap.c:906
#7  0x8107a3bd in trap (frame=0xfe00760dc8a0) at counter.h:87
#8  0x81054d05 in calltrap ()
at /usr/src/sys/amd64/amd64/exception.S:232
#9  0x80ded513 in inp_gcmoptions (ctx=0xf80003079f20)
at epoch_private.h:188
#10 0x80bd9cba in epoch_call_task (arg=)
at /usr/src/sys/kern/subr_epoch.c:507
#11 0x80bdd0a9 in gtaskqueue_run_locked (queue=0xf800035be900)
at /usr/src/sys/kern/subr_gtaskqueue.c:332
#12 0x80bdce28 in gtaskqueue_thread_loop (arg=)
at /usr/src/sys/kern/subr_gtaskqueue.c:507
#13 0x80b530c4 in fork_exit (
callout=0x80bdcda0 , 
arg=0xfe00061a4038, frame=0xfe00760dcac0)
at /usr/src/sys/kern/kern_fork.c:1057
#14 0x81055cde in fork_trampoline ()
at /usr/src/sys/amd64/amd64/exception.S:990
#15 0x in ?? ()
Current language:  auto; currently minimal
(kgdb) 

Full core.txt is here: https://people.freebsd.org/~novel/misc/core.20180811.txt

Roman Bogorodskiy

