Re: uuid/label based fstab during bsdinstall

2016-08-02 Thread Nathan Whitehorn
Unfortunately, the glabel and (especially) GPT partition ID labelling 
has in-kernel race conditions that make it impossible to rely on in the 
installer. This has been an open problem since FreeBSD 9; hopefully it 
will be solved soon.

-Nathan

On 08/02/16 20:35, Alive 4ever wrote:

Greetings, everyone.

I am testing the FreeBSD 11.0-BETA3 release on a virtual machine (qemu, with
edkII ovmf).

Currently, bsdinstall generates an fstab that refers to block devices by
driver based device path instead of by uuid or label.

The bsdinstall-generated fstab has a drawback. For example, when
switching from ide to virtio on qemu, FreeBSD can't find its root
partition because the path name has changed from 'da0' to 'vtbd0'.

I suggest adding an option during bsdinstall to select the fstab block
device naming scheme. The user would choose a scheme based on fs-uuid,
fs-label, geom label (glabel), gpt id, gpt label, or driver based
numbering (da0/vtbd0 style).

If offering a choice of fstab scheme is too hard to implement, it would be
better to just switch the default fstab generation to a label based scheme,
so that the FreeBSD kernel will be able to find its rootfs in different
circumstances.

I hope this will be implemented soon.

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"





uuid/label based fstab during bsdinstall

2016-08-02 Thread Alive 4ever
Greetings, everyone.

I am testing the FreeBSD 11.0-BETA3 release on a virtual machine (qemu, with
edkII ovmf).

Currently, bsdinstall generates an fstab that refers to block devices by
driver based device path instead of by uuid or label.

The bsdinstall-generated fstab has a drawback. For example, when
switching from ide to virtio on qemu, FreeBSD can't find its root
partition because the path name has changed from 'da0' to 'vtbd0'.

I suggest adding an option during bsdinstall to select the fstab block
device naming scheme. The user would choose a scheme based on fs-uuid,
fs-label, geom label (glabel), gpt id, gpt label, or driver based
numbering (da0/vtbd0 style).

If offering a choice of fstab scheme is too hard to implement, it would be
better to just switch the default fstab generation to a label based scheme,
so that the FreeBSD kernel will be able to find its rootfs in different
circumstances.

I hope this will be implemented soon.
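
To make the difference between the naming schemes concrete, here is a minimal C sketch that classifies an fstab device field as label based or driver based. The helper name is_label_based and the prefix table are assumptions made for illustration; the prefixes are /dev namespaces that GEOM label classes normally populate, the list is not exhaustive, and nothing here is part of bsdinstall.

```c
#include <stddef.h>
#include <string.h>

/*
 * /dev namespaces populated by GEOM labels (illustrative, not exhaustive).
 * Names under these prefixes do not depend on which disk driver attached.
 */
static const char *const label_prefixes[] = {
	"/dev/gpt/",	/* GPT partition label */
	"/dev/gptid/",	/* GPT partition UUID */
	"/dev/ufs/",	/* UFS volume label */
	"/dev/ufsid/",	/* UFS file system ID */
	"/dev/label/",	/* generic glabel(8) label */
};

/*
 * Hypothetical helper: return 1 if an fstab device field uses a label or
 * ID namespace, 0 if it looks like a driver based name such as /dev/da0p2.
 */
int
is_label_based(const char *spec)
{
	size_t i;
	size_t n = sizeof(label_prefixes) / sizeof(label_prefixes[0]);

	for (i = 0; i < n; i++) {
		/* Compare only the prefix; the label itself can be anything. */
		if (strncmp(spec, label_prefixes[i],
		    strlen(label_prefixes[i])) == 0)
			return (1);
	}
	return (0);
}
```

Under these assumptions "/dev/gpt/rootfs" is classified as label based, while "/dev/da0p2" and "/dev/vtbd0p2" are not; the latter two are the kind of name that changes when the emulated controller is switched.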



Re: Xen networking problems in -current with xn driver?

2016-08-02 Thread Kevin Oberman
On Tue, Aug 2, 2016 at 11:12 AM, Julian Elischer  wrote:

> I upgraded my VPS machine to today's current, and on reboot I couldn't get
> into it by network.
>
> A quick switch to the VNC console showed that it was up but that it
> couldn't get out.
>
>
> The xn interfaces said they were UP but attempts to get out were met with
> "network is down".
>
> If I did 'tcpdump -n -i xn0' (and xn1) then all was fine again.
>
> tcpdump saw packets, and in fact ipfw saw some packets coming in even
> before that but it was not possible to send.
>
>
> Has anyone seen similar?
>
> Some relevant parts of the dmesg output:
>
> [...]

A bit of a guess, but the obvious thing that I see is that when you start
tcpdump you are placing the interface in promiscuous mode. It looks like the
"device" fails to properly see packets addressed to it.
--
Kevin Oberman, Part time kid herder and retired Network Engineer
E-mail: rkober...@gmail.com
PGP Fingerprint: D03FB98AFA78E3B78C1694B318AB39EF1B055683


Xen networking problems in -current with xn driver?

2016-08-02 Thread Julian Elischer
I upgraded my VPS machine to today's current, and on reboot I couldn't 
get into it by network.


A quick switch to the VNC console showed that it was up but that it 
couldn't get out.



The xn interfaces said they were UP but attempts to get out were met 
with "network is down".


If I did 'tcpdump -n -i xn0' (and xn1) then all was fine again.

tcpdump saw packets, and in fact ipfw saw some packets coming in even 
before that but it was not possible to send.



Has anyone seen similar?

Some relevant parts of the dmesg output:


VT(vga): text 80x25
XEN: Hypervisor version 3.4 detected.
CPU: Intel(R) Xeon(R) CPU   E5620  @ 2.40GHz (2400.05-MHz 686-class CPU)

  Origin="GenuineIntel"  Id=0x206c2  Family=0x6  Model=0x2c Stepping=2
Features=0x1781fbff
Features2=0x80982201
  AMD Features=0x2010
  AMD Features2=0x1
Hypervisor: Origin = "XenVMMXenVMM"
real memory  = 536870912 (512 MB)
avail memory = 503783424 (480 MB)
Event timer "LAPIC" quality 400
ACPI APIC Table: 
WARNING: L1 data cache covers less APIC IDs than a core
0 < 1
WARNING: L2 data cache covers less APIC IDs than a core
0 < 1
WARNING: L3 data cache covers less APIC IDs than a core
0 < 1

ipfw2 (+ipv6) initialized, divert loadable, nat enabled, default to deny, logging disabled

xs_dev0:  on xenstore0
xenbusb_front0:  on xenstore0
xn0:  at device/vif/0 on xenbusb_front0
xn0: Ethernet address: 00:16:3e:01:99:54
xn1:  at device/vif/1 on xenbusb_front0
xn1: Ethernet address: 00:16:3e:01:9a:54
xenbusb_back0:  on xenstore0
xenballoon0:  on xenstore0
xctrl0:  on xenstore0
xn0: backend features: feature-sg feature-gso-tcp4
xn1: backend features: feature-sg feature-gso-tcp4
xbd0: 20480MB  at device/vbd/768 on xenbusb_front0
xbd0: attaching as ada0
xbd0: features: write_barrier
xbd0: synchronize cache commands enabled.



Re: EARLY_AP_STARTUP hangs during boot

2016-08-02 Thread John Baldwin
On Tuesday, August 02, 2016 09:03:10 AM Gary Jennejohn wrote:
> On Mon, 01 Aug 2016 13:19:16 -0700
> John Baldwin  wrote:
> 
> > On Monday, August 01, 2016 03:31:11 PM Gary Jennejohn wrote:
> > > On Mon, 1 Aug 2016 09:34:34 +0200
> > > Gary Jennejohn  wrote:
> > >   
> > > > On Sun, 31 Jul 2016 14:22:35 -0700
> > > > John Baldwin  wrote:
> > > >   
> > > > > On Sunday, July 31, 2016 11:29:14 AM Gary Jennejohn wrote:
> > > > > > On Sat, 30 Jul 2016 12:03:59 -0700
> > > > > > John Baldwin  wrote:
> > > > > >   
> > > > > > > On Saturday, July 30, 2016 09:44:22 AM Gary Jennejohn wrote:  
> > > > > > > > On Fri, 29 Jul 2016 13:17:42 -0700
> > > > > > > > John Baldwin  wrote:
> > > > > > > > 
> > > > > > > > > On Thursday, July 28, 2016 12:31:31 AM Gary Jennejohn wrote:  
> > > > > > > > >   
> > > > > > > > > > > Well, now I know that ULE is a prerequisite for 
> > > > > > > > > > EARLY_AP_STARTUP!  I
> > > > > > > > > > wasn't aware of that.  I prefer BSD and that's the 
> > > > > > > > > > scheduler I did
> > > > > > > > > > the first tests with.
> > > > > > > > > > 
> > > > > > > > > > But with the ULE scheduler the system comes up all the way.
> > > > > > > > > > 
> > > > > > > > > > It would be nice if the BSD scheduler could also be 
> > > > > > > > > > modified to
> > > > > > > > > > work with EARLY_AP_STARTUP.  
> > > > > > > > > 
> > > > > > > > > I wasn't able to reproduce your hang with 4BSD, but I think I 
> > > > > > > > > see a
> > > > > > > > > possible problem.  Try this:
> > > > > > > > > 
> > > > > > > > > diff --git a/sys/kern/sched_4bsd.c b/sys/kern/sched_4bsd.c
> > > > > > > > > index 7de56b6..d53331a 100644
> > > > > > > > > --- a/sys/kern/sched_4bsd.c
> > > > > > > > > +++ b/sys/kern/sched_4bsd.c
> > > > > > > > > @@ -327,7 +327,6 @@ maybe_preempt(struct thread *td)
> > > > > > > > >*  - The current thread has a higher (numerically 
> > > > > > > > > lower) or
> > > > > > > > >*equivalent priority.  Note that this prevents 
> > > > > > > > > curthread from
> > > > > > > > >*trying to preempt to itself.
> > > > > > > > > -  *  - It is too early in the boot for context switches 
> > > > > > > > > (cold is set).
> > > > > > > > >*  - The current thread has an inhibitor set or is in 
> > > > > > > > > the process of
> > > > > > > > >*exiting.  In this case, the current thread is 
> > > > > > > > > about to switch
> > > > > > > > >*out anyways, so there's no point in preempting.  
> > > > > > > > > If we did,
> > > > > > > > > @@ -348,7 +347,7 @@ maybe_preempt(struct thread *td)
> > > > > > > > >   ("maybe_preempt: trying to run 
> > > > > > > > > inhibited thread"));
> > > > > > > > >   pri = td->td_priority;
> > > > > > > > >   cpri = ctd->td_priority;
> > > > > > > > > - if (panicstr != NULL || pri >= cpri || cold /* || 
> > > > > > > > > dumping */ ||
> > > > > > > > > + if (panicstr != NULL || pri >= cpri /* || dumping */ ||
> > > > > > > > >   TD_IS_INHIBITED(ctd))
> > > > > > > > >   return (0);
> > > > > > > > >  #ifndef FULL_PREEMPTION
> > > > > > > > > @@ -1127,7 +1126,7 @@ forward_wakeup(int cpunum)
> > > > > > > > >   if ((!forward_wakeup_enabled) ||
> > > > > > > > >(forward_wakeup_use_mask == 0 && 
> > > > > > > > > forward_wakeup_use_loop == 0))
> > > > > > > > >   return (0);
> > > > > > > > > - if (!smp_started || cold || panicstr)
> > > > > > > > > + if (!smp_started || panicstr)
> > > > > > > > >   return (0);
> > > > > > > > >  
> > > > > > > > >   forward_wakeups_requested++;
> > > > > > > > > 
> > > > > > > > 
> > > > > > > > Thanks, but with this patch the kernel hangs in exactly the same
> > > > > > > > place as before - after the HPET output.
> > > > > > > > 
> > > > > > > > Maybe I'm missing some kernel option which ULE works around, or
> > > > > > > > something like that.
> > > > > > > 
> > > > > > > Hmm, ok.  Please add KTR_RUNQ and KTR_SMP to the KTR masks, that 
> > > > > > > is
> > > > > > > 'options KTR_COMPILE=(KTR_PROC|KTR_RUNQ|KTR_SMP)' and
> > > > > > > 'options KTR_MASK=(KTR_PROC|KTR_RUNQ|KTR_SMP)'
> > > > > > > 
> > > > > > > Please also add this patch (on top of the previous patch):
> > > > > > > 
> > > > > > > diff --git a/sys/kern/sched_4bsd.c b/sys/kern/sched_4bsd.c
> > > > > > > index 2973a23..bab2278 100644
> > > > > > > --- a/sys/kern/sched_4bsd.c
> > > > > > > +++ b/sys/kern/sched_4bsd.c
> > > > > > > @@ -1278,6 +1278,8 @@ sched_add(struct thread *td, int flags)
> > > > > > > KASSERT(td->td_flags & TDF_INMEM,
> > > > > > > ("sched_add: thread swapped out"));
> > > > > > >  
> > > > > > > +   CTR2(KTR_PROC, "sched_add: thread %d (%s)", td->td_tid,
> > > > > > > +   sched_tdname(td));
> > > > > > > 

Re: SVN r303643 breaks non-SMP compilation

2016-08-02 Thread Guido Falsi
On 08/02/16 05:06, Mateusz Guzik wrote:
> On Mon, Aug 01, 2016 at 09:49:03PM -0400, Michael Butler wrote:
>> In the non-SMP case, ADAPTIVE_MUTEXES is not defined and a subsequent
>> reference to mtx_delay causes compilation of kern_mutex.c to fail
>> because KDTRACE_HOOKS may be,
>>
> 
> Indeed, fixed in r303655.
> 
> Thanks for reporting.
> 

I've noticed another failure in the same file, caused by r303643.

It's failing to compile here due to errors about SYSINIT(9), it looks
like #include  is missing.

I have made a local patch which compiles and, after a reboot, seems to
work fine:

Index: head/sys/kern/kern_sx.c
===
--- head/sys/kern/kern_sx.c (revision 303658)
+++ head/sys/kern/kern_sx.c (working copy)
@@ -58,6 +58,7 @@

 #if defined(SMP) && !defined(NO_ADAPTIVE_SX)
 #include 
+#include 
 #endif

 #ifdef DDB


-- 
Guido Falsi 


amd64-xtoolchain-gcc: small kernel compilation issue

2016-08-02 Thread Andriy Gapon
/usr/src/sys/modules/vmm/../../amd64/vmm/vmm_dev.c: In function
'alloc_memseg':
/usr/src/sys/modules/vmm/../../amd64/vmm/vmm_dev.c:261:3: error: null
argument where non-null required (argument 1) [-Werror=nonnull]
   error = copystr(VM_MEMSEG_NAME(mseg), name, SPECNAMELEN + 1, 0);

This is with amd64-xtoolchain-gcc-0.1, which seems to install gcc
version 5.3.0, and optimization level set to O1.

It seems that in that case gcc is not smart enough to figure out that if
VM_MEMSEG_NAME(mseg) is not NULL in a condition, then it can not be NULL
in a block guarded by the condition.

So, the following trivial patch should not be necessary but makes gcc a
bit happier:
--- a/sys/amd64/vmm/vmm_dev.c
+++ b/sys/amd64/vmm/vmm_dev.c
@@ -258,7 +258,7 @@ alloc_memseg
 	if (VM_MEMSEG_NAME(mseg)) {
 		sysmem = false;
 		name = malloc(SPECNAMELEN + 1, M_VMMDEV, M_WAITOK);
-		error = copystr(VM_MEMSEG_NAME(mseg), name, SPECNAMELEN + 1, 0);
+		error = copystr(mseg->name, name, SPECNAMELEN + 1, 0);
 		if (error)
 			goto done;
 	}
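
The pattern behind the warning can be shown outside the kernel. This is a minimal sketch, assuming the macro expands to a conditional that yields either the name or NULL; the struct, the macro SKETCH_MEMSEG_NAME and the helper copy_name_if_set are hypothetical stand-ins, not the vmm code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the memory segment structure. */
struct memseg_sketch {
	char name[64];
};

/* Assumed shape of the macro: the name, or NULL when the name is empty. */
#define SKETCH_MEMSEG_NAME(m) ((m)->name[0] != '\0' ? (m)->name : NULL)

/*
 * Copy the segment name into dst when one is set.
 * Returns 1 if a name was copied, 0 otherwise.  len must be at least 1.
 */
int
copy_name_if_set(const struct memseg_sketch *m, char *dst, size_t len)
{
	if (SKETCH_MEMSEG_NAME(m)) {
		/*
		 * Inside the guard the name is known to be non-empty, so the
		 * field is used directly.  Expanding the macro again here
		 * would put a NULL arm of the conditional in front of a
		 * function that expects a non-null argument.
		 */
		strncpy(dst, m->name, len - 1);
		dst[len - 1] = '\0';
		return (1);
	}
	return (0);
}
```

Using the field inside the guarded block, as the patch does, keeps the behavior the same while removing the second conditional expression from the call site.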


-- 
Andriy Gapon


Re: CURRENT: [USB] : GEOM_PART: da4 was automatically resized.

2016-08-02 Thread Andrey V. Elsukov
On 01.08.16 12:05, O. Hartmann wrote:
> Every(!) USB drive which worked well with 11-CURRENT up to 11-BETA can no
> longer be accessed with 12-CURRENT (12.0-CURRENT FreeBSD 12.0-CURRENT #14
> r303475: Fri Jul 29 11:59:11 CEST 2016); it fails with the error shown below.
> 
> On USB flash drives I created myself, the suggested gpart command solved the
> problem, but I can not do this with drives I was given by a vendor or 
> supplier.

JFYI, r303637 fixes this problem for me.

-- 
WBR, Andrey V. Elsukov





Re: mfi driver performance too bad on LSI MegaRAID SAS 9260-8i

2016-08-02 Thread Borja Marcos

> On 01 Aug 2016, at 19:30, Michelle Sullivan  wrote:
> 
> There are reasons for using either…

Indeed, but my decision was to run ZFS. And getting an HBA in some 
configurations can be difficult because vendors insist on using 
RAID adapters. After all, that’s what most of their customers demand.

Fortunately, at least some Avago/LSI cards can work as HBAs pretty well. An 
example is the now venerable LSI2008.

> Nowadays it seems the conversations have degenerated into those like Windows 
> vs Linux vs Mac where everyone thinks their answer is the right one (just as 
> you suggested you (Borja Marcos) did with the Dell salesman), where in 
> reality each has its own advantages and disadvantages.

I know, but this is not the case here. It’s just quite frustrating to try to 
order a server with an HBA rather than a RAID adapter and receive an answer 
such as “the HBA option is not available”. That’s why people are zapping, 
flashing and, generally, torturing HBA cards rather cruelly ;)

So, in my case, it’s not about what’s better or worse. It’s just a simpler 
issue. Customer (myself) has made a decision, which can be right or wrong. 
Manufacturer fails to deliver what I need. If it was only one manufacturer, 
well, off with them, but the issue is widespread in industry. 

> Eg: I'm running 2 zfs servers on 'LSI 9260-16i's... big mistake! (the ZFS, 
> not LSI's)... one is a 'movie server' the other a 'postgresql database' 
> server...  The latter most would agree is a bad use of zfs, the die-hards 
> won't but then they don't understand database servers and how they work on 
> disk.  The former has mixed views, some argue that zfs is the only way to 
> ensure the movies will always work, personally I think of all the years 
> before zfs when my data on disk worked without failure until the disks 
> themselves failed... and RAID stopped that happening...  what suddenly 
> changed, are disks and ram suddenly not reliable at transferring data? .. 
> anyhow back to the issue there is another part with this particular hardware 
> that people just throw away…

Well, silent corruption can happen. I’ve seen it once, caused by a flaky HBA, 
and ZFS saved the cake. Yes, there were reliable replicas. Still, rebuilding 
would have been a pain in the ass. 

> The LSI 9260-* controllers have been designed to provide on hardware RAID.  
> The caching whether using the Cachecade SSD or just onboard ECC memory is 
> *ONLY* used when running some sort of RAID set and LVs... this is why LSI 
> recommend 'MegaCli -CfgEachDskRaid0' because it does enable caching..  A good 
> read on how to setup something similar is here: 
> https://calomel.org/megacli_lsi_commands.html (disclaimer, I haven't parsed 
> it all so the author could be clueless, but it seems to give generally good 
> advice.)  Going the way of 'JBOD' is a bad thing to do, just don't, 
> performance sucks. As for the recommended command above, can't comment 
> because currently I don't use it nor will I need to in the near future... but…

Actually it’s not a good idea to use heavy disk caching when running ZFS. Its 
reliability depends on being able to commit metadata to disk. So I don’t care 
about that caching option. Provided you have enough RAM, ZFS is very effective 
at caching data itself.

> If you (O Hartmann) want to use or need to use ZFS with any OS including 
> FreeBSD don't go with the LSI 92xx series controllers, it's just the wrong 
> thing to do..  Pick an HBA that is designed to give you direct access to the 
> drives not one you have to kludge and cajole.. Including LSI controllers with 
> caches that use the mfi driver, just not those that are not designed to work 
> in a non RAID mode (with or without the passthru command/mode above.)

As I said, the problem is, sometimes it’s not so easy to find the right HBA. 

> So moral of the story/choices.  Don't go with ZFS because people tell you it's 
> best, because it isn't, go with ZFS if it suits your hardware and 
> application, and if ZFS suits your application, get hardware for it.

Indeed, I second this. But really, "hardware for it" covers a rather broad 
category ;) ZFS can even manage to work on hardware _against_ it.






Borja.



Re: EARLY_AP_STARTUP hangs during boot

2016-08-02 Thread Gary Jennejohn
On Mon, 01 Aug 2016 13:19:16 -0700
John Baldwin  wrote:

> On Monday, August 01, 2016 03:31:11 PM Gary Jennejohn wrote:
> > On Mon, 1 Aug 2016 09:34:34 +0200
> > Gary Jennejohn  wrote:
> >   
> > > On Sun, 31 Jul 2016 14:22:35 -0700
> > > John Baldwin  wrote:
> > >   
> > > > On Sunday, July 31, 2016 11:29:14 AM Gary Jennejohn wrote:
> > > > > On Sat, 30 Jul 2016 12:03:59 -0700
> > > > > John Baldwin  wrote:
> > > > >   
> > > > > > On Saturday, July 30, 2016 09:44:22 AM Gary Jennejohn wrote:  
> > > > > > > On Fri, 29 Jul 2016 13:17:42 -0700
> > > > > > > John Baldwin  wrote:
> > > > > > > 
> > > > > > > > On Thursday, July 28, 2016 12:31:31 AM Gary Jennejohn wrote:
> > > > > > > > 
> > > > > > > > > > Well, now I know that ULE is a prerequisite for 
> > > > > > > > > EARLY_AP_STARTUP!  I
> > > > > > > > > wasn't aware of that.  I prefer BSD and that's the scheduler 
> > > > > > > > > I did
> > > > > > > > > the first tests with.
> > > > > > > > > 
> > > > > > > > > But with the ULE scheduler the system comes up all the way.
> > > > > > > > > 
> > > > > > > > > It would be nice if the BSD scheduler could also be modified 
> > > > > > > > > to
> > > > > > > > > work with EARLY_AP_STARTUP.  
> > > > > > > > 
> > > > > > > > I wasn't able to reproduce your hang with 4BSD, but I think I 
> > > > > > > > see a
> > > > > > > > possible problem.  Try this:
> > > > > > > > 
> > > > > > > > diff --git a/sys/kern/sched_4bsd.c b/sys/kern/sched_4bsd.c
> > > > > > > > index 7de56b6..d53331a 100644
> > > > > > > > --- a/sys/kern/sched_4bsd.c
> > > > > > > > +++ b/sys/kern/sched_4bsd.c
> > > > > > > > @@ -327,7 +327,6 @@ maybe_preempt(struct thread *td)
> > > > > > > >  *  - The current thread has a higher (numerically 
> > > > > > > > lower) or
> > > > > > > >  *equivalent priority.  Note that this prevents 
> > > > > > > > curthread from
> > > > > > > >  *trying to preempt to itself.
> > > > > > > > -*  - It is too early in the boot for context switches 
> > > > > > > > (cold is set).
> > > > > > > >  *  - The current thread has an inhibitor set or is in 
> > > > > > > > the process of
> > > > > > > >  *exiting.  In this case, the current thread is 
> > > > > > > > about to switch
> > > > > > > >  *out anyways, so there's no point in preempting.  
> > > > > > > > If we did,
> > > > > > > > @@ -348,7 +347,7 @@ maybe_preempt(struct thread *td)
> > > > > > > > ("maybe_preempt: trying to run 
> > > > > > > > inhibited thread"));
> > > > > > > > pri = td->td_priority;
> > > > > > > > cpri = ctd->td_priority;
> > > > > > > > -   if (panicstr != NULL || pri >= cpri || cold /* || 
> > > > > > > > dumping */ ||
> > > > > > > > +   if (panicstr != NULL || pri >= cpri /* || dumping */ ||
> > > > > > > > TD_IS_INHIBITED(ctd))
> > > > > > > > return (0);
> > > > > > > >  #ifndef FULL_PREEMPTION
> > > > > > > > @@ -1127,7 +1126,7 @@ forward_wakeup(int cpunum)
> > > > > > > > if ((!forward_wakeup_enabled) ||
> > > > > > > >  (forward_wakeup_use_mask == 0 && 
> > > > > > > > forward_wakeup_use_loop == 0))
> > > > > > > > return (0);
> > > > > > > > -   if (!smp_started || cold || panicstr)
> > > > > > > > +   if (!smp_started || panicstr)
> > > > > > > > return (0);
> > > > > > > >  
> > > > > > > > forward_wakeups_requested++;
> > > > > > > > 
> > > > > > > 
> > > > > > > Thanks, but with this patch the kernel hangs in exactly the same
> > > > > > > place as before - after the HPET output.
> > > > > > > 
> > > > > > > Maybe I'm missing some kernel option which ULE works around, or
> > > > > > > something like that.
> > > > > > 
> > > > > > Hmm, ok.  Please add KTR_RUNQ and KTR_SMP to the KTR masks, that is
> > > > > > 'options KTR_COMPILE=(KTR_PROC|KTR_RUNQ|KTR_SMP)' and
> > > > > > 'options KTR_MASK=(KTR_PROC|KTR_RUNQ|KTR_SMP)'
> > > > > > 
> > > > > > Please also add this patch (on top of the previous patch):
> > > > > > 
> > > > > > diff --git a/sys/kern/sched_4bsd.c b/sys/kern/sched_4bsd.c
> > > > > > index 2973a23..bab2278 100644
> > > > > > --- a/sys/kern/sched_4bsd.c
> > > > > > +++ b/sys/kern/sched_4bsd.c
> > > > > > @@ -1278,6 +1278,8 @@ sched_add(struct thread *td, int flags)
> > > > > > KASSERT(td->td_flags & TDF_INMEM,
> > > > > > ("sched_add: thread swapped out"));
> > > > > >  
> > > > > > +   CTR2(KTR_PROC, "sched_add: thread %d (%s)", td->td_tid,
> > > > > > +   sched_tdname(td));
> > > > > > KTR_STATE2(KTR_SCHED, "thread", sched_tdname(td), "runq 
> > > > > > add",
> > > > > > "prio:%d", td->td_priority, KTR_ATTR_LINKED,
> > > > > > sched_tdname(curthread));
> > > > > > diff --git 

Re: [PATCH] randomized delay in locking primitives, take 2

2016-08-02 Thread Alfred Perlstein

Why is

+struct lock_delay_config {
+u_int initial;
+u_int step;
+u_int min;
+u_int max;
+};

missing comments for its members?  Are they documented anywhere else?

-Alfred
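
For readers following along, here is one plausible reading of the four members, sketched in userland C. The semantics in the comments are an assumption inferred from the field names and the default values in the patch, not documentation of the committed code; the names lock_delay_config_sketch, next_delay and pick_backoff are hypothetical.

```c
/* Commented copy of the structure; u_int is spelled out for userland. */
struct lock_delay_config_sketch {
	unsigned int initial;	/* delay used on the first failed attempt */
	unsigned int step;	/* amount added on each later attempt */
	unsigned int min;	/* lower bound on the spin count actually used */
	unsigned int max;	/* upper bound the delay is clamped to */
};

/* Start at initial, then grow linearly by step, clamped to max. */
unsigned int
next_delay(const struct lock_delay_config_sketch *lc, unsigned int delay)
{
	if (delay == 0)
		return (lc->initial);
	delay += lc->step;
	if (delay > lc->max)
		delay = lc->max;
	return (delay);
}

/*
 * Reduce a caller supplied random value modulo the current delay, then
 * raise the result to at least min.  This is the randomized part: two
 * CPUs with the same delay usually end up spinning for different times.
 */
unsigned int
pick_backoff(const struct lock_delay_config_sketch *lc, unsigned int delay,
    unsigned int random)
{
	unsigned int backoff;

	if (delay == 0)
		return (lc->min);
	backoff = random % delay;
	if (backoff < lc->min)
		backoff = lc->min;
	return (backoff);
}
```

With the static defaults from the patch (initial 1000, step 500, min 100, max 5000) this reading gives delays of 1000, 1500, 2000 and so on up to 5000, with the actual spin count drawn from below the current delay but never under 100.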


On 7/31/16 5:41 AM, Mateusz Guzik wrote:

On Sun, Jul 31, 2016 at 01:49:28PM +0300, Konstantin Belousov wrote:
[snip]

After an irc discussion, the following was produced (also available at:
https://people.freebsd.org/~mjg/lock_backoff_complete4.diff):

Differences:
- uint64_t usage was converted to u_int (also see r303584)
- currently unused features (cap limit and return value) were removed
- lock_delay args got packed into a dedicated structure

Note this patch requires the tree to be at least at r303584.

diff --git a/sys/kern/kern_mutex.c b/sys/kern/kern_mutex.c
index 0555a78..9b07b8b 100644
--- a/sys/kern/kern_mutex.c
+++ b/sys/kern/kern_mutex.c
@@ -55,6 +55,7 @@ __FBSDID("$FreeBSD$");
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
@@ -138,6 +139,36 @@ struct lock_class lock_class_mtx_spin = {
  #endif
  };
  
+#ifdef ADAPTIVE_MUTEXES

+static SYSCTL_NODE(_debug, OID_AUTO, mtx, CTLFLAG_RD, NULL, "mtx debugging");
+
+static struct lock_delay_config mtx_delay = {
+   .initial= 1000,
+   .step   = 500,
+   .min= 100,
+   .max= 5000,
+};
+
+SYSCTL_INT(_debug_mtx, OID_AUTO, delay_initial, CTLFLAG_RW, &mtx_delay.initial,
+0, "");
+SYSCTL_INT(_debug_mtx, OID_AUTO, delay_step, CTLFLAG_RW, &mtx_delay.step,
+0, "");
+SYSCTL_INT(_debug_mtx, OID_AUTO, delay_min, CTLFLAG_RW, &mtx_delay.min,
+0, "");
+SYSCTL_INT(_debug_mtx, OID_AUTO, delay_max, CTLFLAG_RW, &mtx_delay.max,
+0, "");
+
+static void
+mtx_delay_sysinit(void *dummy)
+{
+
+   mtx_delay.initial = mp_ncpus * 25;
+   mtx_delay.min = mp_ncpus * 5;
+   mtx_delay.max = mp_ncpus * 25 * 10;
+}
+LOCK_DELAY_SYSINIT(mtx_delay_sysinit);
+#endif
+
  /*
   * System-wide mutexes
   */
@@ -408,8 +439,10 @@ __mtx_lock_sleep(volatile uintptr_t *c, uintptr_t tid, int 
opts,
int contested = 0;
uint64_t waittime = 0;
  #endif
+#if defined(ADAPTIVE_MUTEXES) || defined(KDTRACE_HOOKS)
+   struct lock_delay_arg lda;
+#endif
  #ifdef KDTRACE_HOOKS
-   u_int spin_cnt = 0;
u_int sleep_cnt = 0;
int64_t sleep_time = 0;
int64_t all_time = 0;
@@ -418,6 +451,9 @@ __mtx_lock_sleep(volatile uintptr_t *c, uintptr_t tid, int 
opts,
if (SCHEDULER_STOPPED())
return;
  
+#if defined(ADAPTIVE_MUTEXES) || defined(KDTRACE_HOOKS)
+   lock_delay_arg_init(&lda, &mtx_delay);
+#endif
m = mtxlock2mtx(c);
  
  	if (mtx_owned(m)) {

@@ -451,7 +487,7 @@ __mtx_lock_sleep(volatile uintptr_t *c, uintptr_t tid, int 
opts,
if (m->mtx_lock == MTX_UNOWNED && _mtx_obtain_lock(m, tid))
break;
  #ifdef KDTRACE_HOOKS
-   spin_cnt++;
+   lda.spin_cnt++;
  #endif
  #ifdef ADAPTIVE_MUTEXES
/*
@@ -471,12 +507,8 @@ __mtx_lock_sleep(volatile uintptr_t *c, uintptr_t tid, int 
opts,
"spinning", "lockname:\"%s\"",
m->lock_object.lo_name);
while (mtx_owner(m) == owner &&
-   TD_IS_RUNNING(owner)) {
-   cpu_spinwait();
-#ifdef KDTRACE_HOOKS
-   spin_cnt++;
-#endif
-   }
+   TD_IS_RUNNING(owner))
+   lock_delay(&lda);
KTR_STATE0(KTR_SCHED, "thread",
sched_tdname((struct thread *)tid),
"running");
@@ -570,7 +602,7 @@ __mtx_lock_sleep(volatile uintptr_t *c, uintptr_t tid, int 
opts,
/*
 * Only record the loops spinning and not sleeping.
 */
-   if (spin_cnt > sleep_cnt)
+   if (lda.spin_cnt > sleep_cnt)
LOCKSTAT_RECORD1(adaptive__spin, m, all_time - sleep_time);
  #endif
  }
diff --git a/sys/kern/kern_rwlock.c b/sys/kern/kern_rwlock.c
index d4cae61..363b042 100644
--- a/sys/kern/kern_rwlock.c
+++ b/sys/kern/kern_rwlock.c
@@ -44,6 +44,7 @@ __FBSDID("$FreeBSD$");
  #include 
  #include 
  #include 
+#include 
  #include 
  #include 
  #include 
@@ -65,15 +66,6 @@ PMC_SOFT_DECLARE( , , lock, failed);
   */
  #define   rwlock2rw(c)(__containerof(c, struct rwlock, rw_lock))
  
-#ifdef ADAPTIVE_RWLOCKS

-static int rowner_retries = 10;
-static int rowner_loops = 1;
-static SYSCTL_NODE(_debug, OID_AUTO, rwlock, CTLFLAG_RD, NULL,
-"rwlock debugging");
-SYSCTL_INT(_debug_rwlock, OID_AUTO, retry, CTLFLAG_RW, &rowner_retries, 0, "");
-SYSCTL_INT(_debug_rwlock, OID_AUTO, loops, CTLFLAG_RW, &rowner_loops, 0, "");
-#endif
-
  #ifdef DDB
  #include 
  
@@ -100,6 +92,41 @@ struct lock_class lock_class_rw = {

  #endif
  };
  
+#ifdef ADAPTIVE_RWLOCKS