Re: How to hotplug a PCI device (such as VF) on FreeBSD

2019-03-25 Thread John Baldwin
On 3/25/19 7:05 AM, Ian Lepore wrote:
> On Mon, 2019-03-25 at 08:49 +, Hongxiong Xian (Wicresoft North
> America Ltd) wrote:
>> Hi,
>>
>> I'm looking for a way to refresh the pci device list.
>> In Linux, we can remove a particular pci device, and then after
>> preforming a "rescan" the device will appear again.
>> For example, disable/rescind PCI (such as VF) :
>> echo 1 >  /sys/bus/pci/devices/0001\:00\:02.0/remove
>> # Get the device back
>> echo 1 > /sys/bus/pci/rescan
>>
>> I'm looking for a similar way in FreeBSD. Does the FreeBSD support
>> the hotplug of a PCI device?  Thanks in advance!
>>
>>
> 
> I think 'devctl rescan' will do that, 'man devctl' for details.

For VFs you can create/remote them using iovctl on the PF device.

You can also use 'devctl rescan' to force a rescan of a PCI bus as
Ian noted.  For native PCI-express hotplug you should not need to
do manual rescans (though FreeBSD does not support PCI-e hotplug
via Thunderbolt).

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Optimization bug with floating-point?

2019-03-14 Thread John Baldwin
On 3/14/19 1:08 PM, Steve Kargl wrote:
> On Fri, Mar 15, 2019 at 05:50:37AM +1100, Peter Jeremy wrote:
>> On 2019-Mar-13 23:30:07 -0700, Steve Kargl 
>>  wrote:
>>> AFAICT, all libm float routines need to be modified to conditional
>>> include ieeefp.h and call fpsetprec(FP_PD).  This will work around
>>> issues is FP and libm.  FreeBSD needs to issue an erratum about 
>>> the numerical issues with clang.
>>
>> I vaguely recall looking into the x87 initialisation a long time ago
>> and STR that the startup code (either crtX or in the kernel) does
>> a fninit() to set the precision.  I don't recall exactly where.
>>
>> IMO, calling fpsetprec() in every libm float function is overkill. It
>> should be enough to fpsetprec() before main() and add a note in the
>> man pages that libm is built to use the default FPU configuration and
>> changing the configuration (precision or rounding) may result in larger
>> errors.
> 
> My understanding of the situation is that FreeBSD i386/387 sets
> the FPU to 53-bit precision (whether at start up or first access
> is immaterial).  This was done long ago to prevent issues with
> different optimization levels leaving different intermediate
> results is registers with extended precision.  You can observe
> the problem with the toy program I posted and clang.  Compile it
> with -O0 and -O2.  With the former you have max ULP of 2.9 (the
> desired result); with the latter you have a max ULP of 23.xxx.
> I have observed a 6 billion ULP issue when running my testsuite.
> As pointed out by John Baldwin, GCC is aware of the FPU setting.
> The problem with clang is that it seems to unconditionally assume
> the FPU is set to 64-bit precision.   It is unclear if clang is
> generated the desired result for float routines in libm.  The
> only to gaurantee the desired resut is to use fpsetprec(FP_PD),
> or fix clang to take into account the FPU environment.

OTOH, note that every other OS in 32-bit mode uses 64-bit precision,
and amd64 also uses 64-bit precision by default IIUC.  FreeBSD/i386
is definitely unique in this regard.  Linux doesn't do it, none of
the other BSD's do it (only Dragonfly does b/c they inherited it
from FreeBSD).  None of Solaris, Windows, etc. do it either if the
gcc sources are to be trusted as a reference.

That said, I think it must have to do with how clang vs GCC is
handling saving the values in memory and whether or not it does
truncation to 53 bits when stored in memory somehow.  I was trying
to poke around in GCC's sources to figure out if it was doing anything
differently, but I couldn't find a difference in terms of function
pointers, etc.  The only difference is is the constants used in a set
of structures.  I haven't tried to track down what those struct
member values control though.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Optimization bug with floating-point?

2019-03-14 Thread John Baldwin
On 3/14/19 12:20 PM, Konstantin Belousov wrote:
> On Fri, Mar 15, 2019 at 05:50:37AM +1100, Peter Jeremy wrote:
>> On 2019-Mar-13 23:30:07 -0700, Steve Kargl 
>>  wrote:
>>> AFAICT, all libm float routines need to be modified to conditional
>>> include ieeefp.h and call fpsetprec(FP_PD).  This will work around
>>> issues is FP and libm.  FreeBSD needs to issue an erratum about 
>>> the numerical issues with clang.
>>
>> I vaguely recall looking into the x87 initialisation a long time ago
>> and STR that the startup code (either crtX or in the kernel) does
>> a fninit() to set the precision.  I don't recall exactly where.
> At boot, a clean initial FPU state is stored in fpu_initialstate.
> Then on first FPU access from userspace  (first for the given process
> context), this saved state is copied into hardware registers.  The
> quirk is that for i386 binaries on amd64, we adjust fpu control word
> to what is expected by i386 binaries.
> 
>>
>> IMO, calling fpsetprec() in every libm float function is overkill. It
>> should be enough to fpsetprec() before main() and add a note in the
>> man pages that libm is built to use the default FPU configuration and
>> changing the configuration (precision or rounding) may result in larger
>> errors.
> Changing default precision in crt1 would break the ABI.

So what I don't understand then is what is gcc doing different than clang
in this case.  I assume neither GCC _nor_ clang are adjusting the FPU in
compiler-generated code, and in fact as Steve's earlier tests shows, the
precision is set to PD by default when a clang-built binary is run.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Optimization bug with floating-point?

2019-03-13 Thread John Baldwin
On 3/13/19 9:40 AM, Steve Kargl wrote:
> On Wed, Mar 13, 2019 at 09:32:57AM -0700, John Baldwin wrote:
>> On 3/13/19 8:16 AM, Steve Kargl wrote:
>>> On Tue, Mar 12, 2019 at 07:45:41PM -0700, Steve Kargl wrote:
>>>>
>>>> gcc8 --version
>>>> gcc8 (FreeBSD Ports Collection) 8.3.0
>>>>
>>>> gcc8 -fno-builtin -o z a.c -lm && ./z
>>>> gcc8 -O -fno-builtin -o z a.c -lm && ./z
>>>> gcc8 -O2 -fno-builtin -o z a.c -lm && ./z
>>>> gcc8 -O3 -fno-builtin -o z a.c -lm && ./z
>>>>
>>>> Max ULP: 2.297073
>>>> Count: 0   (# of ULP that exceed 21)
>>>>
>>>
>>> clang agrees with gcc8 if one changes ...
>>>
>>>> int
>>>> main(void)
>>>> {
>>>>double re, im, u, ur, ui;
>>>>float complex f;
>>>>float x, y;
>>>
>>> this line to "volatile float x, y".
>>
>> So it seems to be a regression in clang 7 vs clang 6?
>>
> 
> /usr/local/bin/clang60 has the same problem.  
> 
> % /usr/local/bin/clang60 -o z -O2 a.c -lm && ./z
>   Maximum ULP: 23.061242
> # of ULP > 21: 39
> 
> Adding volatile as in the above "fixes" the problem.
> 
> AFAICT, this a i386/387 code generation problem.  Perhaps,
> an alignment issue?

Oh, I misread your earlier e-mail to say that clang60 worked.

One issue I'm aware of is that clang does not have any support for the
special arrangement FreeBSD/i386 uses where it uses different precision
for registers vs in-memory for some of the floating point types (GCC has
a special hack that is only used on FreeBSD for this but isn't used on
any other OS's).  I wonder if that could be a factor?  Volatile probably
forces a round trip between memory which might explain why this is the
case.

I wonder what your test program does on i386 Linux with GCC?

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Optimization bug with floating-point?

2019-03-13 Thread John Baldwin
On 3/13/19 8:16 AM, Steve Kargl wrote:
> On Tue, Mar 12, 2019 at 07:45:41PM -0700, Steve Kargl wrote:
>>
>> gcc8 --version
>> gcc8 (FreeBSD Ports Collection) 8.3.0
>>
>> gcc8 -fno-builtin -o z a.c -lm && ./z
>> gcc8 -O -fno-builtin -o z a.c -lm && ./z
>> gcc8 -O2 -fno-builtin -o z a.c -lm && ./z
>> gcc8 -O3 -fno-builtin -o z a.c -lm && ./z
>>
>> Max ULP: 2.297073
>> Count: 0   (# of ULP that exceed 21)
>>
> 
> clang agrees with gcc8 if one changes ...
> 
>> int
>> main(void)
>> {
>>double re, im, u, ur, ui;
>>float complex f;
>>float x, y;
> 
> this line to "volatile float x, y".

So it seems to be a regression in clang 7 vs clang 6?

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: r343567 aka PAE vs non-PAE merge breaks i386 freebsd

2019-03-01 Thread John Baldwin
On 3/1/19 5:03 AM, Rodney W. Grimes wrote:
>> On 2/28/19 10:32 AM, Steve Kargl wrote:
> ( ... trimmed ... )
> 
>>> The BIOS does have a enable/disable button for virtualization.
>>> During the great drm-legacy-kmod event of the last month, enabling
>>> virtualization locks up a i386 FreeBSD kernel very quickly.
>>> Perhaps, virtualization works under amd64.  Guess I'll burn
>>> an image onto a memstick an d give it a whirl.
>>
>> bhyve is definitely amd64-only.  We don't have any support for bhyve on i386
>> kernels and likely never will.  However, if an i386 chroot works, it's 
>> probably
>> faster than an i386 VM anyway.
> 
> bhyve/vmm.ko does not come into play here at all, the real question
> is why does our i386 kernel "lock up" simply because a newer CPU
> feature appears, it should not do that, as far as I am aware turing
> VT-x on does not or should not in anyway change the "i386" behavior
> or a machine.   What am I missing?

I think we don't know enough about this bug report to know what causes
the hang.

>>>> However, an amd64 kernel is going to be a more stable, better
>>>> supported kernel for running i386 binaries than an i386 kernel
>>>> at this point, and that will become even more true in the future.
>>>
>>> This is interesting as well.  Does this mean that amd64 is now 
>>> the only tier 1 platform and all other architectures are after
>>> thoughts?
>>
>> i386 is still marked as tier 1.  However, it's becoming increasingly harder 
>> to
>> maintain that level of support for the kernel.  core@ is currently exploring
>> some ideas about how to make our tiering for i386 more closely reflect what 
>> we
>> as a project are able to provide.  Originally we were considering a proposal 
>> to
>> demote all of i386 to tier 2, but after some initial conversations we think a
>> better model is to keep the i386 user ABI as tier 1 and only demote the i386
>> kernel.  However, we still need to think about what that looks like and 
>> update
>> our tiering language to reflect what that looks like.  I think the short 
>> version
>> is that we might no longer guarantee i386-specific fixes for kernel SAs, but
>> there are probably additional wrinkles that will arise as that is fleshed out
>> further.
> 
> Is core talking to the stake holders about this issue?  IMHO this topic
> should be an open discussion some place with all parties involved, not
> just core deciding what is or is not a tier 1 and/or how to fix our
> tier 1 situation with i386 (which I do agree needs to change, but
> to what I have not a solid idea.)

As you are well aware, core@ has talked to some stakeholders already
(including you) which has already resulted in some changes to what core@
is considering to propose to developers.  However, it is ultimately
core@ who makes tiering decisions.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: r343567 aka PAE vs non-PAE merge breaks i386 freebsd

2019-02-28 Thread John Baldwin
On 2/28/19 11:24 AM, Cy Schubert wrote:
> On February 28, 2019 11:21:24 AM PST, Steve Kargl 
>  wrote:
>> On Thu, Feb 28, 2019 at 11:14:51AM -0800, Cy Schubert wrote:
>>> On February 28, 2019 11:06:46 AM PST, Conrad Meyer 
>> wrote:
>>>> On Thu, Feb 28, 2019 at 10:32 AM Steve Kargl
>>>>  wrote:
>>>>> This is interesting as well.  Does this mean that amd64 is now
>>>>> the only tier 1 platform and all other architectures are after
>>>>> thoughts?
>>>>
>>>> This has been the de facto truth for years.  i386 is mostly only
>>>> supported by virtue of sharing code with amd64.  There are efforts
>> to
>>>> promote arm64 to Tier 1, but it isn't there yet.  Power8+ might be
>>>> another good alternative Tier 1 candidate eventually.  None have
>>>> anything like the developer popularity that amd64 enjoys.
>>>>
>>>
>>> We deprecated and removed support for 386 and 486 processors. We
>> should consider removing support for low end Pentium as well. I'm
>> specifically thinking of removing the workarounds like F00F. Are there
>> any processors that are still vulnerable to this?
>>>
>>
>> Ahem, sys/i386/conf/GENERIC contains "cpu I486_CPU".
>> Is that a typo?
> 
> I stand corrected. We should remove that.

No, it doesn't need removing per my other mail.  While there is some cruft in a
few files for older CPUs (mostly just initcpu.c and identcpu.c) it is quite
small and in code that doesn't change.  To effect any type of substantial "win"
in reducing code complexity for i386, you'd have to do something like require 
PAE
(so that the kernel could assume PAE instead of a bunch of runtime checks as it
does now).   That would also give you working 64-bit atomics on i386.  However,
removing I486_CPU alone doesn't buy you anything.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: r343567 aka PAE vs non-PAE merge breaks i386 freebsd

2019-02-28 Thread John Baldwin
On 2/28/19 11:14 AM, Cy Schubert wrote:
> On February 28, 2019 11:06:46 AM PST, Conrad Meyer  wrote:
>> On Thu, Feb 28, 2019 at 10:32 AM Steve Kargl
>>  wrote:
>>> This is interesting as well.  Does this mean that amd64 is now
>>> the only tier 1 platform and all other architectures are after
>>> thoughts?
>>
>> This has been the de facto truth for years.  i386 is mostly only
>> supported by virtue of sharing code with amd64.  There are efforts to
>> promote arm64 to Tier 1, but it isn't there yet.  Power8+ might be
>> another good alternative Tier 1 candidate eventually.  None have
>> anything like the developer popularity that amd64 enjoys.
>>
>> Conrad
>> ___
>> freebsd-current@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-current
>> To unsubscribe, send any mail to
>> "freebsd-current-unsubscr...@freebsd.org"
> 
> We deprecated and removed support for 386 and 486 processors. We should 
> consider removing support for low end Pentium as well. I'm specifically 
> thinking of removing the workarounds like F00F. Are there any processors that 
> are still vulnerable to this?

We have only removed support for 386 since it didn't support cmpxchg.  We still
nominally support 486s.  I don't know how well FreeBSD 13 would run on a 486, 
but
in theory the code is still there and the binaries shouldn't die with illegal
instruction faults.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: r343567 aka PAE vs non-PAE merge breaks i386 freebsd

2019-02-28 Thread John Baldwin
On 2/28/19 10:32 AM, Steve Kargl wrote:
> On Thu, Feb 28, 2019 at 09:09:38AM -0800, John Baldwin wrote:
>> You can do all your tests directly on amd64 by just adding
>> "-m32" to compile i386 binaries against the libraries in /usr/lib32
>> and you will generate the same i386 binaries as if you were building
>> on an i386 system.  If you are a bit more paranoid, you can install
>> an i386 world and chroot into it and use that to test i386.  I do this
>> myself (-m32) for testing i386 things.  I also run i386 VMs under
>> bhyve on amd64 hosts.  I'm not sure your laptop's CPU can run i386
>> VMs though, and you don't need a VM to test userland-only changes
>> (I'm usually trying to test kernel changes).
> 
> Interesting.  Did not know I could use -m32 with any reliability.
> Setting up my testing environment may be a challenge as I use
> mpfr/gmp, so would need -m32 aware versions of those libraries.
> I normally install whatever the port collection offers for mpfr/gmp.
> I suppose I would need to install those independently to get
> multilib support.  I would also need to compile 2 versions of my
> testing framework (ie., additional support library).

-m32 didn't used to work in early amd64 (like 6.x or maybe 7.x), but it has 
worked
reliably for several branches now.  That said, if you want to use i386 
packages, I
think a chroot is probably a safer route as in the chroot you can use pkg to 
install
i386 packages just as if it was an i386 machine.

> The BIOS does have a enable/disable button for virtualization.
> During the great drm-legacy-kmod event of the last month, enabling
> virtualization locks up a i386 FreeBSD kernel very quickly.
> Perhaps, virtualization works under amd64.  Guess I'll burn
> an image onto a memstick an d give it a whirl.

bhyve is definitely amd64-only.  We don't have any support for bhyve on i386
kernels and likely never will.  However, if an i386 chroot works, it's probably
faster than an i386 VM anyway.

>> However, an amd64 kernel is going to be a more stable, better
>> supported kernel for running i386 binaries than an i386 kernel
>> at this point, and that will become even more true in the future.
> 
> This is interesting as well.  Does this mean that amd64 is now 
> the only tier 1 platform and all other architectures are after
> thoughts?

i386 is still marked as tier 1.  However, it's becoming increasingly harder to
maintain that level of support for the kernel.  core@ is currently exploring
some ideas about how to make our tiering for i386 more closely reflect what we
as a project are able to provide.  Originally we were considering a proposal to
demote all of i386 to tier 2, but after some initial conversations we think a
better model is to keep the i386 user ABI as tier 1 and only demote the i386
kernel.  However, we still need to think about what that looks like and update
our tiering language to reflect what that looks like.  I think the short version
is that we might no longer guarantee i386-specific fixes for kernel SAs, but
there are probably additional wrinkles that will arise as that is fleshed out
further.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: r343567 aka PAE vs non-PAE merge breaks i386 freebsd

2019-02-28 Thread John Baldwin
On 2/23/19 8:39 AM, Steve Kargl wrote:
> On Sat, Feb 23, 2019 at 08:32:23AM -0800, Conrad Meyer wrote:
>> On Sat, Feb 23, 2019 at 12:44 AM Steve Kargl
>>  wrote:
>>> Ideas?
>>> ...
>>> +CPU: Intel(R) Core(TM)2 Duo CPU T7250  @ 2.00GHz (1995.04-MHz 
>>> 686-class CPU)
>>>Origin="GenuineIntel"  Id=0x6fd  Family=0x6  Model=0xf  Stepping=13
>>
>> https://ark.intel.com/content/www/us/en/ark/products/31728/intel-core-2-duo-processor-t7250-2m-cache-2-00-ghz-800-mhz-fsb.html
>>
>>> Intel® Virtualization Technology (VT-x) ‡  Yes
>>> Intel® 64 ‡   Yes
>>
>>> Merom is the first Intel mobile processor to feature Intel 64 architecture.
>>
>> So, as a workaround, maybe run amd64?
> 
> This is the only i386 FreeBSD system that I have.  This
> is the system where all the libm changes I've made have
> been tested.  i386 floating point is different than 
> amd64 floating point.  See npx.c and the history of any
> of the long double functions that I've worked on.  If
> this laptop does not run i386, there will be no testing
> of libm changes on the architecture.

Yes, but we set the initial FPU control word for 32-bit binaries to match i386 
when
running i386 binaries under an amd64 kernel.

See these comments in sys/x86/include/fpu.h with which you are likely familiar:

/*
 * The hardware default control word for i387's and later coprocessors is
 * 0x37F, giving:
 *
 *  round to nearest
 *  64-bit precision
 *  all exceptions masked.
 *
 * FreeBSD/i386 uses 53 bit precision for things like fadd/fsub/fsqrt etc
 * because of the difference between memory and fpu register stack arguments.
 * If its using an intermediate fpu register, it has 80/64 bits to work
 * with.  If it uses memory, it has 64/53 bits to work with.  However,
 * gcc is aware of this and goes to a fair bit of trouble to make the
 * best use of it.
 *
 * This is mostly academic for AMD64, because the ABI prefers the use
 * SSE2 based math.  For FreeBSD/amd64, we go with the default settings.
 */
#define __INITIAL_FPUCW__   0x037F
#define __INITIAL_FPUCW_I386__  0x127F
#define __INITIAL_NPXCW__   __INITIAL_FPUCW_I386__
#define __INITIAL_MXCSR__   0x1F80
#define __INITIAL_MXCSR_MASK__  0xFFBF

And this code in ia32_setregs() in sys/amd64/ia32/ia32_signal.c to set the
initial register values for i386 processes:

/*
 * Clear registers on exec
 */
void
ia32_setregs(struct thread *td, struct image_params *imgp, u_long stack)
{
...
pcb->pcb_initial_fpucw = __INITIAL_FPUCW_I386__;
...
}

This matches what exec_setregs() in sys/i386/i386/machdep.c does:

/*
 * Reset registers to default values on exec.
 */
void
exec_setregs(struct thread *td, struct image_params *imgp, u_long stack)
{
...
pcb->pcb_initial_npxcw = __INITIAL_NPXCW__;
...
}

You can do all your tests directly on amd64 by just adding "-m32" to compile 
i386
binaries against the libraries in /usr/lib32 and you will generate the same i386
binaries as if you were building on an i386 system.  If you are a bit more 
paranoid,
you can install an i386 world and chroot into it and use that to test i386.  I 
do
this myself (-m32) for testing i386 things.  I also run i386 VMs under bhyve on
amd64 hosts.  I'm not sure your laptop's CPU can run i386 VMs though, and you 
don't
need a VM to test userland-only changes (I'm usually trying to test kernel 
changes).

However, an amd64 kernel is going to be a more stable, better supported kernel 
for
running i386 binaries than an i386 kernel at this point, and that will become 
even
more true in the future.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Panic in sys_fstatat()

2019-02-14 Thread John Baldwin
On 2/14/19 12:38 PM, Steve Kargl wrote:
> On Thu, Feb 14, 2019 at 12:26:01PM -0800, John Baldwin wrote:
>> On 2/13/19 6:47 PM, Steve Kargl wrote:
>>> #16 0x00ff58bb in trap (frame=0x2e7b6880) at 
>>> /usr/src/sys/i386/i386/trap.c:519
>>> #17 0xffc0315d in ?? ()
>>> #18 0x2e7b6880 in ?? ()
>>> #19 0x00d1de64 in lookup (ndp=0x2e7b6a50)
>>> at /usr/src/sys/kern/vfs_lookup.c:710
>>> #20 0x00d1d763 in namei (ndp=0x2e7b6a50) at 
>>> /usr/src/sys/kern/vfs_lookup.c:487
>>> #21 0x00d372c5 in kern_statat (td=0x3c5dc700, flag=0, fd=-100, 
>>> path=0x2cced08e , 
>>> pathseg=UIO_USERSPACE, sbp=0x2e7b6b18, hook=0x0)
>>> at /usr/src/sys/kern/vfs_syscalls.c:2307
>>> #22 0x00d37c99 in sys_fstatat (td=0x3c5dc700, uap=0x3c5dc988)
>>> at /usr/src/sys/kern/vfs_syscalls.c:2284
>>> #23 0x00ff69fa in syscallenter (td=)
>>> at /usr/src/sys/i386/i386/../../kern/subr_syscall.c:135
>>> #24 syscall (frame=0x2e7b6ce8) at /usr/src/sys/i386/i386/trap.c:1144
>>> #25 0xffc033a7 in ?? ()
>>> #26 0x2e7b6ce8 in ?? ()
>>> Backtrace stopped: Cannot access memory at address 0xfbafbbbc
>>> (kgdb) 
>>
>> Frame 18 is probably the root problem, though it doesn't look like kgdb is
>> able to unwind it correctly.  Looking at frame 19 might help though.  It
>> seems like a NULL pointer dereference when invoking VOP_LOCK.
>>
> 
> I can't look at this until tonight (about 6-7 hours).
> Anything in frame 19 that you would be particularly
> interested in?

Just what source line it is and what the value of the arguments passed to the
function it is calling are.  Probably it's vn_lock() or VOP_LOCK() and it's
most likely the 'vp' that is NULL, but it would be good to see all the args
passed to the function if possible.

-- 
John Baldwin


___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Panic in sys_fstatat()

2019-02-14 Thread John Baldwin
f69fa in syscallenter (td=)
> at /usr/src/sys/i386/i386/../../kern/subr_syscall.c:135
> #24 syscall (frame=0x2e7b6ce8) at /usr/src/sys/i386/i386/trap.c:1144
> #25 0xffc033a7 in ?? ()
> #26 0x2e7b6ce8 in ?? ()
> Backtrace stopped: Cannot access memory at address 0xfbafbbbc
> (kgdb) 

Frame 18 is probably the root problem, though it doesn't look like kgdb is
able to unwind it correctly.  Looking at frame 19 might help though.  It
seems like a NULL pointer dereference when invoking VOP_LOCK.

-- 
John Baldwin


___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: kernel config question

2019-01-04 Thread John Baldwin
On 1/3/19 10:40 PM, Kevin Oberman wrote:
> On Wed, Jan 2, 2019 at 5:02 PM Robert Huff  wrote:
> 
>>
>> John Baldwin writes:
>>
>>>  -[8] In order to have a kernel that can run the 4.x binaries needed
>> to
>>>  -do an installworld, you must include the COMPAT_FREEBSD4 option in
>>>  -your kernel.  Failure to do so may leave you with a system that is
>>>  -hard to boot to recover. A similar kernel option COMPAT_FREEBSD5 is
>>>  -required to run the 5.x binaries on more recent kernels.  And so on
>>>  -for COMPAT_FREEBSD6 and COMPAT_FREEBSD7.
>>>  +[8] The new kernel must be able to run existing binaries used by
>>>  +an installworld.  When upgrading across major versions, the new
>>>  +kernel's configuration must include the correct COMPAT_FREEBSD
>>>  +option for existing binaries (e.g. COMPAT_FREEBSD11 to run 11.x
>>>  +binaries).  Failure to do so may leave you with a system that is
>>>  +hard to boot to recover.  A GENERIC kernel will include suitable
>>>  +compatibility options to run binaries from older branches.
>>
>> Maybe not perfect, but _much_ better.
>> Thanks.
>>
>>
>> Respectfully,
>>
>>
>> Robert Huff
>>
> Some ports may require compat ports. E.g. plexmediaserver requires
> compat9x. Oddly, compat9x requires compat10x, so I need 9, 10, and 11.
> 
> Now that 10 is EOL, I wish Plex would start building their blobs against 11.

While that is true, that isn't quite relevant to this note which is specific
to the buildworld + installworld upgrade process.

In general COMPAT_FREEBSD and the compatx packages require all the
newer compat options and packages when you are more than 1 branch away
from the running system.  That is, to run 9.x binaries on a 12.0 kernel you
need COMPAT_FREEBSD9, COMPAT_FREEBSD10, and COMPAT_FREEBSD11.  The same can
be true of compat packages except that you only need to step up to whatever
the verison of your userland binaries are.  If you had a 9.x jail on a
12.0 host you wouldn't need any compat packages, just the kernel options.
For a 10.x jail on a 12.0 host you would only need compat9x, etc.

-- 
John Baldwin


___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: kernel config question

2019-01-02 Thread John Baldwin
On 1/2/19 1:31 PM, Robert Huff wrote:
> 
> John Baldwin writes:
> 
>>  >>  [8] In order to have a kernel that can run the 4.x binaries
>>  >>  needed to do an installworld, you must include the
>>  >>  COMPAT_FREEBSD4 option in your kernel. [...]
>>
>>  > No, COMPAT_FREEBSD4 is not needed. Maybe COMPAT_FREEBSD11 is needed.
>>  
>>  Yes, that text needs to be made more generic to say that you will need
>>  COMPAT_FREEBSD.  Though we've also had some major branches that
>>  didn't get a COMPAT_FREEBSD option.
> 
>   Are any of those still supported?

I'm not sure, but I mean more that you can't assume we will always have a
COMPAT_FREEBSD.  There was a COMPAT_FREEBSD11.  It looks like we actually
only skipped COMPAT_FREEBSD8 to date.  Perhaps we can just avoid worrying
about the lack of COMPAT_FREEBSD.

The text does say "and so on" for newer versions, but it's probably not
clear.  How about this:

Index: UPDATING
===
--- UPDATING(revision 342703)
+++ UPDATING(working copy)
@@ -1901,12 +1901,13 @@ COMMON ITEMS:
can be deleted by "make delete-old-libs", but you have to make
sure that no program is using those libraries anymore.
 
-   [8] In order to have a kernel that can run the 4.x binaries needed to
-   do an installworld, you must include the COMPAT_FREEBSD4 option in
-   your kernel.  Failure to do so may leave you with a system that is
-   hard to boot to recover. A similar kernel option COMPAT_FREEBSD5 is
-   required to run the 5.x binaries on more recent kernels.  And so on
-   for COMPAT_FREEBSD6 and COMPAT_FREEBSD7.
+   [8] The new kernel must be able to run existing binaries used by
+   an installworld.  When upgrading across major versions, the new
+   kernel's configuration must include the correct COMPAT_FREEBSD
+   option for existing binaries (e.g. COMPAT_FREEBSD11 to run 11.x
+   binaries).  Failure to do so may leave you with a system that is
+   hard to boot to recover.  A GENERIC kernel will include suitable
+   compatibility options to run binaries from older branches.
 
Make sure that you merge any new devices from GENERIC since the
last time you updated your kernel config file.


-- 
John Baldwin


___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: kernel config question

2019-01-02 Thread John Baldwin
On 1/2/19 12:02 PM, Kurt Jaeger wrote:
> Hi!
> 
>> FreeBSD 12.0-CURRENT #0 r331659: Thu Mar 29 12:31:36 EDT 2018 amd64
>>
>>  to CURRENT (as of last midnight.
>>  Does this, in src/UPDATING:
>>
>>  [8] In order to have a kernel that can run the 4.x binaries
>>  needed to do an installworld, you must include the
>>  COMPAT_FREEBSD4 option in your kernel. [...]
> 
>>   (It seems ... irrational ... one would need compatibility stuff
>>  going back to FreeBSD 4 to rebuild/update FreeBSD 13.)
> 
> No, COMPAT_FREEBSD4 is not needed. Maybe COMPAT_FREEBSD11 is needed.

Yes, that text needs to be made more generic to say that you will need
COMPAT_FREEBSD.  Though we've also had some major branches that
didn't get a COMPAT_FREEBSD option.
-- 
John Baldwin


___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: buildworld falure: truncated or malformed archive

2019-01-02 Thread John Baldwin
On 12/31/18 12:08 PM, Warner Losh wrote:
> On Mon, Dec 31, 2018, 1:36 PM Ryan Stone  
>> Does this mean that it's currently impossible to build a world with
>> debug symbols?
>>
> 
> Yes. Clang is simply too big until 64 bit offset support goes in.

We actually build clang (and llvm) with stripped down debug symbols by
default.  If you don't put any DEBUG_* foo in src.conf you will get debug
symbols for all of the world including the limited ones for clang.  (We
use -gline-tables-only or some such for clang).

-- 
John Baldwin


___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: 12.0 - (zpool upgrade ) /bootpool/boot/* new location for /boot/* ?

2018-12-17 Thread John Baldwin
On 12/14/18 4:45 PM, Bruce Cantrall wrote:
> Hi, First time poster.
> Did the location for the files in /boot move to /bootpool/boot but the zpool 
> upgrade did not know this in 12.0?
> 
> # uname -a
> FreeBSD filestore1b.phishline.com 12.0-RELEASE FreeBSD 12.0-RELEASE r341666 
> GENERIC  amd64
> 
> # zpool upgrade zroot
> This system supports ZFS pool feature flags.
> 
> Enabled the following features on 'zroot':
>   large_dnode
>   spacemap_v2
> 
> If you boot from pool 'zroot', don't forget to update boot code.
> Assuming you use GPT partitioning and da0 is your boot disk
> the following command will do it:
> 
> gpart bootcode -b /boot/pmbr -p /boot/gptzfsboot -i 1 da0
> 
> ?
> (/boot does not exist but is now in /bootpool/boot)
> 
> #  gpart bootcode -b /bootpool/boot/pmbr -p /bootpool/boot/gptzfsboot -i 1 da0
> 
> (SNIP)
> This system supports ZFS pool feature flags.
> 
> Enabled the following features on 'bootpool':
>   large_dnode
>   spacemap_v2
> 
> # freebsd-update upgrade
> src component not installed, skipped
> Cannot identify running kernel
> 
> Worked Ok when I added the /bootpool directory to the path.

zpool upgrade's message is not very smart.  I think it just uses hardcoded paths
and a hardcode drive.  (For example, it always says da0 even if you have
a mirror in which case you should update the bootcode on all of the devices.)

I have a fresh install of 12.0-RC3 here that still has the files in /boot
and doesn't have a /bootpool directory at all.

-- 
John Baldwin


___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Composite PCI devices in FreeBSD (mfd in Linux)

2018-12-10 Thread John Baldwin
On 12/10/18 12:19 PM, Ian Lepore wrote:
> On Mon, 2018-12-10 at 14:42 -0500, Anthony Jenkins wrote:
>> On 12/10/18 1:26 PM, John Baldwin wrote:
>>>
>>> On 12/10/18 9:00 AM, Anthony Jenkins wrote:
>>>>
>>>> Hi all,
>>>>
>>>> I'm trying to port an Intel PCI I2C controller from Linux to
>>>> FreeBSD.
>>>> Linux represents this device as an MFD (multi-function device),
>>>> meaning
>>>> it has these "sub-devices" that can be handed off to other
>>>> drivers to
>>>> actually attach devices to the system.  The Linux "super" PCI
>>>> device is
>>>> the intel-lpss-pci.c, and the "sub" device is i2c-designware-
>>>> platdrv.c,
>>>> which represents the DesignWare driver's "platform" attachment to
>>>> the
>>>> Linux system.  FreeBSD also has a DesignWare I2C controller
>>>> driver,
>>>> ig4(4), but it only has PCI and ACPI bus attachment
>>>> implementations.
>>>>
>>>> I have a port of the Linux intel-lpss driver to FreeBSD, but now
>>>> I'm
>>>> trying to figure out the best way to give FreeBSD's ig4(4) driver
>>>> access
>>>> to my lpss(4) device.  I'm thinking I could add an ig4_lpss.c
>>>> describing
>>>> the "attachment" of an ig4(4) to an lpss(4).  Its probe() method
>>>> would
>>>> scan the "lpss" devclass for devices, and its attach() method
>>>> would
>>>> attach itself as a child to the lpss device and "grab" the
>>>> portion of
>>>> PCI memory and the IRQ that the lpss PCI device got.
>>>>
>>>> Is this the "FreeBSD Way (TM)" of handling this type of device? 
>>>> If not,
>>>> can you recommend an existing FreeBSD driver I can model my code
>>>> after?
>>>> If my approach is acceptable, how do I fully describe the ig4(4)
>>>> device's attachment to the system?  Is simply making it a child
>>>> of
>>>> lpss(4) sufficient?  It's "kind of" a PCI device (it is
>>>> controlled via
>>>> access to a PCI memory region and an IRQ), but it's a sub-device
>>>> of an
>>>> actual PCI device (lpss(4)) attached to PCI.
>>>> How would my ig4_lpss attachment get information from the lpss(4)
>>>> driver
>>>> about what it probed?
>>> There are some existing PCI drivers that act as "virtual" busses
>>> that attach
>>> child devices.  For example, vga_pci.c can have drm, agp, and
>>> acpi_video
>>> child devices.  There are also some SMBus drivers that are also
>>> PCI-ISA
>>> bridges and thus create separate child devices.
>> Yeah I was hoping to avoid using video PCI devices as a model, as 
>> complex as they've gotten recently.  I'll check out its bus glue
>> logic.
>>
>>>
>>> For a virtual bus like this, you need to figure out how your child
>>> devices
>>> will be enumerated.  A simple way is to let child devices use an
>>> identify
>>> routine that looks at each parent device and decides if a child
>>> device
>>> for that driver makes sense.  It can then add a child device in the
>>> identify routine.
>> Really an lpss parent PCI parent device can only have the following:
>>
>>   * one of {I2C, UART, SPI} controller
>>   * optionally an IDMA64 controller
>>
>> so I was thinking a child ig4(4) device would attach to lpss iff
>>
>>   * the lpss device detected an I2C controller
>>   * no other ig4 device is already attached
>>
>> I haven't fiddled with identify() yet, will look at that tonight.
>>
> 
> If this is just another "bus" an ig4 instance can attach to, I'd think
> the recipe would be to add another DRIVER_MODULE() to ig4_iic.c naming
> ig4_lpss as the parent. Then add a new ig4_lpss.c modeled after the
> existing pci and acpi attachment code, its DRIVER_MODULE() would name
> lpss as parent, and its probe routine would return BUS_PROBE_NOWILDCARD
> (attach only if specifically added by the parent).
> 
> Then there would be a new lpss driver that does the resource managment
> stuff mentioned above, and if it detects configuration for I2C it would
> do a device_add_child(lpssdev, "ig4_lpss", -1) followed by
> bus_generic_attach(). There'd be no need for identify() in the child in
> that case, I think.
> 
> But t

Re: Composite PCI devices in FreeBSD (mfd in Linux)

2018-12-10 Thread John Baldwin
On 12/10/18 9:00 AM, Anthony Jenkins wrote:
> Hi all,
> 
> I'm trying to port an Intel PCI I2C controller from Linux to FreeBSD.  
> Linux represents this device as an MFD (multi-function device), meaning 
> it has these "sub-devices" that can be handed off to other drivers to 
> actually attach devices to the system.  The Linux "super" PCI device is 
> the intel-lpss-pci.c, and the "sub" device is i2c-designware-platdrv.c, 
> which represents the DesignWare driver's "platform" attachment to the 
> Linux system.  FreeBSD also has a DesignWare I2C controller driver, 
> ig4(4), but it only has PCI and ACPI bus attachment implementations.
> 
> I have a port of the Linux intel-lpss driver to FreeBSD, but now I'm 
> trying to figure out the best way to give FreeBSD's ig4(4) driver access 
> to my lpss(4) device.  I'm thinking I could add an ig4_lpss.c describing 
> the "attachment" of an ig4(4) to an lpss(4).  Its probe() method would 
> scan the "lpss" devclass for devices, and its attach() method would 
> attach itself as a child to the lpss device and "grab" the portion of 
> PCI memory and the IRQ that the lpss PCI device got.
> 
> Is this the "FreeBSD Way (TM)" of handling this type of device?  If not, 
> can you recommend an existing FreeBSD driver I can model my code after?
> If my approach is acceptable, how do I fully describe the ig4(4) 
> device's attachment to the system?  Is simply making it a child of 
> lpss(4) sufficient?  It's "kind of" a PCI device (it is controlled via 
> access to a PCI memory region and an IRQ), but it's a sub-device of an 
> actual PCI device (lpss(4)) attached to PCI.
> How would my ig4_lpss attachment get information from the lpss(4) driver 
> about what it probed?

There are some existing PCI drivers that act as "virtual" busses that attach
child devices.  For example, vga_pci.c can have drm, agp, and acpi_video
child devices.  There are also some SMBus drivers that are also PCI-ISA
bridges and thus create separate child devices.

For a virtual bus like this, you need to figure out how your child devices
will be enumerated.  A simple way is to let child devices use an identify
routine that looks at each parent device and decides if a child device
for that driver makes sense.  It can then add a child device in the
identify routine.  To handle things like resources, you want to have
bus_*_resource methods that let your child device use the normal bus_*
functions to allocate resources.  At the simplest end you don't need to
permit any sharing of BARs among multiple children so you can just proxy
the requests in the "real" PCI driver.  (vga_pci.c does this)  If you need
the BARs to be shared you have a couple of options such as just using a
refcount on the BAR resource but letting multiple devices allocate the same
BAR.  If you want to enforce exclusivity (once a device allocates part of
a BAR then other children shouldn't be permitted to do so), then you will
need a more complicated solution.

Hopefully that gives you a starting point?

-- 
John Baldwin


___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: axp288 on Intel HW

2018-11-16 Thread John Baldwin
On 11/16/18 12:51 AM, Johannes Lundberg wrote:
> Hi
> 
> I have a Lenovo Ideapad Miix 310 that has a Intel CherryTrail CPU and it
> runs FreeBSD quite nicely (with accelerated graphics). What I'm missing is
> battery life status..
> 
> I can get this information using smb (for some reason i2c just returns
> error sending start condition)
> smbmsg -f /dev/smb6 -s 0x68 -c 0xB9 -i 1 -F %d
> 
> In emergency this works but it would be nice to have a kernel driver for
> it.
> 
> I found the axp2xx driver for Allwinner in the tree. Would it be possible
> to share this with amd64 with not too much effort?
> 
> If not, all I'm really interested in is battery status so I might just hack
> together a simple driver for that report values to hw.acpi.battery (because
> I don't think there is a sysctl for battery info that aggregates different
> sources?)
> 
> Datasheet for the pmic can be found here
> http://download.bbs.ickey.cn/201707/cfe88ee7ef01eed7a4586ce340165da0.pdf

Have you looked to see why ACPI isn't reporting battery status?  Maybe we
need to be invoking some specific method like DSM or OSC before your battery
devices work?  Maybe your battery devices don't have a _STA method and you
need the change in https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=227191?

-- 
John Baldwin


___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Has anything changed from 11.2 to 12.0 in PCI MSI/MSIX path?

2018-11-01 Thread John Baldwin
On 10/30/18 1:22 AM, Rajesh Kumar wrote:
> Hi John,
> 
> Thanks for your updates.  I assume you are talking about having a unified 
> intr_machdep.h compared to having seperate amd64 and i386 versions.
> 
> Can you please update this thread once all changes are MFC complete or tag me 
> in necessary place? So that I can give a try in my board once it's ready.

I just committed r340016 which merges r338360 along with followup fixes to
stable/11.
 
> On Mon, Oct 29, 2018 at 11:08 PM John Baldwin  <mailto:j...@freebsd.org>> wrote:
> 
> On 10/25/18 10:24 AM, Rajesh Kumar wrote:
> > Hi John,
> >
> > Thanks a lot. It helps. I backported the changes to 11.2 and tried 
> booting in my board with success without any need for the said tunables.
> >
> > I see those changes are marked for MFC after 2 Weeks. But I don't see 
> them still in stable/11 branch.  So, will it be taken into stable/11 branch 
> by any chance? If not, can the backported changes be submitted for review to 
> take into stable/11 branch?
> 
> I'm working on the MFC.  The current patch I've tested an MFC of is the 
> one to
> unify sys/x86/include/intr_machdep.h as a precursor to MFC'ing this 
> change.
> 
> > On Thu, Oct 25, 2018 at 1:17 AM John Baldwin  <mailto:j...@freebsd.org> <mailto:j...@freebsd.org 
> <mailto:j...@freebsd.org>>> wrote:
> >
> >     On 10/24/18 3:40 AM, Rajesh Kumar wrote:
> >     > Hi,
> >     >
> >     > I have a amd64 based board. When I tried to boot 11.1 (or) 11.2 
> in that, I
> >     > needed the following tunables to be set from loader prompt to get 
> it booted
> >     > (otherwise machine reboots continuously).
> >     >
> >     > hw.usb.xhci.msi=0
> >     > hw.usb.xhci.msix=0
> >     > hw.pci.enable_msi=0
> >     > hw.pci.enable_msix=0
> >     >
> >     > But, when I tried with 12.0 - ALPHA4, I could able to get it 
> booted without
> >     > any tunables.  So, has anything changed significantly on PCI 
> MSI/MSI-X
> >     > path?
> >     >
> >     > Note: I have a forum topic with my observations about the issue on
> >     > 11.1/11.2 in the following thread
> >     > 
> https://forums.freebsd.org/threads/freebsd-11-1-installation-fails-and-rebooting.65814/
> >     >
> >     > Let me know if you need any details.
> >
> >     I believe this was fixed by r338360.
> >
> >     --
> >     John Baldwin
> >
> 
> 
> -- 
> John Baldwin
> 


-- 
John Baldwin


___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Has anything changed from 11.2 to 12.0 in PCI MSI/MSIX path?

2018-10-29 Thread John Baldwin
On 10/25/18 10:24 AM, Rajesh Kumar wrote:
> Hi John,
> 
> Thanks a lot. It helps. I backported the changes to 11.2 and tried booting in 
> my board with success without any need for the said tunables.
> 
> I see those changes are marked for MFC after 2 Weeks. But I don't see them 
> still in stable/11 branch.  So, will it be taken into stable/11 branch by any 
> chance? If not, can the backported changes be submitted for review to take 
> into stable/11 branch? 

I'm working on the MFC.  The current patch I've tested an MFC of is the one to
unify sys/x86/include/intr_machdep.h as a precursor to MFC'ing this change.
 
> On Thu, Oct 25, 2018 at 1:17 AM John Baldwin  <mailto:j...@freebsd.org>> wrote:
> 
> On 10/24/18 3:40 AM, Rajesh Kumar wrote:
> > Hi,
> >
> > I have a amd64 based board. When I tried to boot 11.1 (or) 11.2 in 
> that, I
> > needed the following tunables to be set from loader prompt to get it 
> booted
> > (otherwise machine reboots continuously).
> >
> > hw.usb.xhci.msi=0
> > hw.usb.xhci.msix=0
> > hw.pci.enable_msi=0
> > hw.pci.enable_msix=0
> >
> > But, when I tried with 12.0 - ALPHA4, I could able to get it booted 
> without
> > any tunables.  So, has anything changed significantly on PCI MSI/MSI-X
> > path?
> >
> > Note: I have a forum topic with my observations about the issue on
> > 11.1/11.2 in the following thread
> > 
> https://forums.freebsd.org/threads/freebsd-11-1-installation-fails-and-rebooting.65814/
> >
> > Let me know if you need any details.
> 
> I believe this was fixed by r338360.
> 
> -- 
> John Baldwin
> 


-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: savecore: BFD: BFD 2.17.50 [FreeBSD] 2007-07-03 assertion fail /usr/src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elf64-x86-64.c:276

2018-10-25 Thread John Baldwin
On 10/25/18 2:14 AM, Marcin Cieslak wrote:
> On Wed, 24 Oct 2018, John Baldwin wrote:
> 
>> On 10/23/18 10:58 AM, Marcin Cieslak wrote:
>>> This GDB was configured as "amd64-marcel-freebsd"...BFD: 
>>> /boot/kernel/kernel: invalid relocation type 37
>>> BFD: BFD 2.17.50 [FreeBSD] 2007-07-03 assertion fail 
>>> /usr/src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elf64-x86-64.c:276
>>>
>>> The kernel has been built on 11.1 with LD=/usr/bin/ld.lld
>>>
>>> Is this something that matters at all?
>>
>> It is not something that is likely to be fixed.  If you pkg install gdb from
>> ports, is the kgdb it includes able to examine the crash dump?
> 
> Not really (using 8.2 from ports):
> 
> # /usr/local/bin/kgdb82 -n 5 /usr/obj/usr/src/sys/GENERIC/kernel.debug
> GNU gdb (GDB) 8.2 [GDB v8.2 for FreeBSD]
> Copyright (C) 2018 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.
> Type "show copying" and "show warranty" for details.
> This GDB was configured as "x86_64-portbld-freebsd11.1".
> Type "show configuration" for configuration details.
> For bug reporting instructions, please see:
> <http://www.gnu.org/software/gdb/bugs/>.
> Find the GDB manual and other documentation resources online at:
> <http://www.gnu.org/software/gdb/documentation/>.
> 
> For help, type "help".
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from /usr/obj/usr/src/sys/GENERIC/kernel.debug...done.
> thread.c:93: internal-error: struct thread_info *inferior_thread(): Assertion 
> `tp' failed.
> A problem internal to GDB has been detected,
> further debugging may prove unreliable.
> Quit this debugging session? (y or n) 

This usually means the kernel image you are using doesn't match the vmcore, so
it's reading garbage data from the vmcore since the symbol offsets are wrong.

You can test this by doing what crashinfo does: compare the version string in 
the
/var/crash/info.5 file with 'p version' in kgdb (or gdb) of the kernel without
a vmcore.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Has anything changed from 11.2 to 12.0 in PCI MSI/MSIX path?

2018-10-24 Thread John Baldwin
On 10/24/18 3:40 AM, Rajesh Kumar wrote:
> Hi,
> 
> I have a amd64 based board. When I tried to boot 11.1 (or) 11.2 in that, I
> needed the following tunables to be set from loader prompt to get it booted
> (otherwise machine reboots continuously).
> 
> hw.usb.xhci.msi=0
> hw.usb.xhci.msix=0
> hw.pci.enable_msi=0
> hw.pci.enable_msix=0
> 
> But, when I tried with 12.0 - ALPHA4, I could able to get it booted without
> any tunables.  So, has anything changed significantly on PCI MSI/MSI-X
> path?
> 
> Note: I have a forum topic with my observations about the issue on
> 11.1/11.2 in the following thread
> https://forums.freebsd.org/threads/freebsd-11-1-installation-fails-and-rebooting.65814/
> 
> Let me know if you need any details.

I believe this was fixed by r338360.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: savecore: BFD: BFD 2.17.50 [FreeBSD] 2007-07-03 assertion fail /usr/src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elf64-x86-64.c:276

2018-10-24 Thread John Baldwin
On 10/23/18 10:58 AM, Marcin Cieslak wrote:
> Hello, I have a freshly built 12.0-ALPHA10 (r339406) and the kernel
> panicked at some point (another mail coming on that).
> 
> I have a full dump partition enabled, but during savecore
> quite lot BFD assertion messages appear:
> 
> Tue Oct 23 18:45:53 CEST 2018
> 
> FreeBSD radziecki 12.0-ALPHA10 FreeBSD 12.0-ALPHA10 r339406 GENERIC  amd64
> 
> panic: 
> 
> GNU gdb 6.1.1 [FreeBSD]
> Copyright 2004 Free Software Foundation, Inc.
> GDB is free software, covered by the GNU General Public License, and you are
> welcome to change it and/or distribute copies of it under certain conditions.
> Type "show copying" to see the conditions.
> There is absolutely no warranty for GDB.  Type "show warranty" for details.
> This GDB was configured as "amd64-marcel-freebsd"...BFD: /boot/kernel/kernel: 
> invalid relocation type 37
> BFD: BFD 2.17.50 [FreeBSD] 2007-07-03 assertion fail 
> /usr/src/gnu/usr.bin/binutils/libbfd/../../../../contrib/binutils/bfd/elf64-x86-64.c:276
> 
> The kernel has been built on 11.1 with LD=/usr/bin/ld.lld
> 
> Is this something that matters at all?

It is not something that is likely to be fixed.  If you pkg install gdb from
ports, is the kgdb it includes able to examine the crash dump?

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: intr_machdep.c:176:2: error: use of undeclared identifier 'interrupt_sorted'

2018-09-17 Thread John Baldwin
On 9/17/18 11:32 AM, Michael Butler wrote:
> On 9/10/18 1:20 PM, John Baldwin wrote:
>> On 9/8/18 1:44 PM, Michael Butler wrote:
>>> On 9/8/18 3:43 PM, Konstantin Belousov wrote:
>>>> On Sat, Sep 08, 2018 at 02:07:41PM -0400, Michael Butler wrote:
>>>>> On 8/31/18 1:28 AM, Konstantin Belousov wrote:
>>>>>> On Fri, Aug 31, 2018 at 12:21:02AM -0400, Michael Butler wrote:
>>>>>
>>>>>  [ .. snip .. ]
>>>>>
>>>>>>> I see another problem after using Ian's workaround of moving the #ifdef
>>>>>>> SMP; it seems I now run out of kernel stack on an i386 (Pentium-III)
>>>>>>> machine with only 512MB of RAM:
>>>>>>>
>>>>>>> Aug 29 23:29:19 sarah kernel: vm_thread_new: kstack allocation failed
>>>>>>> Aug 29 23:29:26 sarah kernel: vm_thread_new: kstack allocation failed
>>>>>>> Aug 29 23:29:30 sarah kernel: vm_thread_new: kstack allocation failed
>>>>>>> Aug 29 23:29:38 sarah kernel: vm_thread_new: kstack allocation failed
>>>>>>> Aug 29 23:29:38 sarah kernel: vm_thread_new: kstack allocation failed
>>>>>>> Aug 29 23:29:40 sarah kernel: vm_thread_new: kstack allocation failed
>>>>>>
>>>>>> What is the kernel revision for "now".  What was the previous revision
>>>>>> where the kstack allocation failures did not happen.
>>>>>>
>>>>>> Also, what is the workload ?
>>>>>
>>>>> Sorry for the delay. Any version at or after SVN r338360 would either a)
>>>>> not boot at all or b) crash shortly after boot with a swarm of messages
>>>>> as above. It was stable before that.
>>>>>
>>>>> Unfortunately, this machine is remote and, being as old as it is, has no
>>>>> remote console facility. 'nextboot' has been my savior ;-)
>>>>>
>>>>> It is a 700MHz Pentium-III with 512MB of RAM and has 3 used interfaces,
>>>>> local ethernet (FXP), GIF for an IPv6 tunnel to HE and TAP for an
>>>>> OpenVPN endpoint. It has IPFW compiled into the kernel and acts as a
>>>>> router/firewall with few actual applications running.
>>>>>
>>>>> As another data point, I manually reversed both SVN r338360 and r338415
>>>>> (a related change) and it is now stable running at SVN r338520,
>>>>
>>>> It is very unprobable.  I do not see how could r338360 affect KVA 
>>>> allocation.
>>>> Double-check that you booted right kernels.
>>>>
>>>
>>> FreeBSD sarah.protected-networks.net 12.0-ALPHA5 FreeBSD 12.0-ALPHA5 #14
>>> r338520M: Thu Sep  6 21:35:31 EDT 2018
>>>
>>> 'svn diff' reports the only changes being the two reversals I noted above,
>>
>> Can you get the output of 'x num_io_irqs' at the DDB prompt after the
>> panic?
>>
> 
> SVN r338725 fixed this - thanks! :-)

Hmm, I'm not sure how that fixed this, but glad it is ok now.

-- 
John Baldwin


___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: devd on head -r338675 on aarch64 (Pine64+ 2GB example) gets during booting: "sh: /usr/libexec/hyperv/hyperv_vfattach: not found"

2018-09-14 Thread John Baldwin
On 9/14/18 11:24 AM, Mark Millard via freebsd-arm wrote:
> From the boot of the Pine64+ 2GB for -r338675:
> 
> . . .
> Starting devd.
> sh: /usr/libexec/hyperv/hyperv_vfattach: not found
> add host 127.0.0.1: gateway lo0 fib 0: route already in table
> . . .

I got the same error booting FreeBSD/riscv in qemu.

-- 
John Baldwin


___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Speed problems with both system openssl and security/openssl-devel

2018-09-13 Thread John Baldwin
On 9/13/18 1:27 AM, Lev Serebryakov wrote:
> Hello Kevin,
> 
> Thursday, September 13, 2018, 6:32:30 AM, you wrote:
> 
> 
>> This is probably not the issue, but aesni is not in the GENERIC kernel.  Are 
>> you sure aesni.ko is loaded?
>> % kldstat | grep aesni
>  I'm not using modules, as it is NanoBSD image build for minimal size ant
> maximal efficiency. But I have aesni in my kernel config for sure:
> 
> % grep aesni ~/nanobsd/gatevay.v3/J3160
> device   aesni

From my understanding of the OpenSSL code, it doesn't use the kernel driver
at all (the kernel driver is only needed for in-kernel crypto such as IPSec
or GELI).  AESNI are just instructions that can be used in userland, and
OpenSSL's AESNI acceleration is purely different routines in userland.
I would verify if AESNI shows up in the CPU features in dmesg first (if it
doesn't I'd check for a BIOS option disabling it).

-- 
John Baldwin


___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Enabling the WITH_REPRODUCIBLE_BUILD knob for 12.0-REL

2018-09-10 Thread John Baldwin
On 9/10/18 10:55 AM, Rodney W. Grimes wrote:
>> On 9/10/18 9:51 AM, Rodney W. Grimes wrote:
>>>> The FreeBSD base system is a reproducible build[1] with a minor
>>>> exception: the build metadata (timestamps, user, hostname, etc.)
>>>> included in the kernel and loader.
>>>>
>>>> With the default, non-reproducible build the kernel ident looks like:
>>>>
>>>> FreeBSD 12.0-ALPHA5 #4 r338195: Mon Jan 1 10:11:12 EDT 2018
>>>>user@hostname:/path/to/freebsd/src
>>>>
>>>> and the loader ident:
>>>>
>>>> FreeBSD/amd64 EFI loader, Revision 1.1
>>>> (Mon Jan 1 10:11:12 EDT 2018 user@hostname)
>>>>
>>>> With reproducible builds enabled the kernel ident looks like:
>>>>
>>>> FreeBSD 12.0-ALPHA5  r338195
>>>>
>>>> and the loader ident:
>>>>
>>>> FreeBSD/amd64 EFI loader, Revision 1.1
>>>>
>>>> I would like to enable the REPRODUCIBLE_BUILD knob by default for the
>>>> 12.0 release, and propose we do this by adding a step to switch the
>>>> default to the list of changes[2] that re@ commits to the branch as
>>>> part of the release process.
>>>
>>> Why not just turn this on and leave it on?
>>
>> For kernels not built against a pristine tree the extra info is useful to
>> have.  For better or worse, kgdb also parses the path to try to find
>> kernel.full (used by e.g. 'kgdb -n last'), so if you remove the path it
>> won't be able to find the matching kernel using its current logic.
> 
> So this means stable/12 users would have hassles getting kgdb to work?

No, this means that if you turn this option on in HEAD and leave it always
on (as I read your mail to say), then it would be a hassle for developers
on head.  On stable branches it would be nice to keep the info if people
are building kernels that aren't stock kernels (meaning modified source
trees).  For release kernels, crashinfo should work fine though even with
the extra information stripped.

For release builds the information is not really useful, it's only ever
useful if someone is building their own kernel for some reason (and even in
some of those cases it isn't all that useful).

-- 
John Baldwin


___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: intr_machdep.c:176:2: error: use of undeclared identifier 'interrupt_sorted'

2018-09-10 Thread John Baldwin
On 9/8/18 1:44 PM, Michael Butler wrote:
> On 9/8/18 3:43 PM, Konstantin Belousov wrote:
>> On Sat, Sep 08, 2018 at 02:07:41PM -0400, Michael Butler wrote:
>>> On 8/31/18 1:28 AM, Konstantin Belousov wrote:
>>>> On Fri, Aug 31, 2018 at 12:21:02AM -0400, Michael Butler wrote:
>>>
>>>  [ .. snip .. ]
>>>
>>>>> I see another problem after using Ian's workaround of moving the #ifdef
>>>>> SMP; it seems I now run out of kernel stack on an i386 (Pentium-III)
>>>>> machine with only 512MB of RAM:
>>>>>
>>>>> Aug 29 23:29:19 sarah kernel: vm_thread_new: kstack allocation failed
>>>>> Aug 29 23:29:26 sarah kernel: vm_thread_new: kstack allocation failed
>>>>> Aug 29 23:29:30 sarah kernel: vm_thread_new: kstack allocation failed
>>>>> Aug 29 23:29:38 sarah kernel: vm_thread_new: kstack allocation failed
>>>>> Aug 29 23:29:38 sarah kernel: vm_thread_new: kstack allocation failed
>>>>> Aug 29 23:29:40 sarah kernel: vm_thread_new: kstack allocation failed
>>>>
>>>> What is the kernel revision for "now".  What was the previous revision
>>>> where the kstack allocation failures did not happen.
>>>>
>>>> Also, what is the workload ?
>>>
>>> Sorry for the delay. Any version at or after SVN r338360 would either a)
>>> not boot at all or b) crash shortly after boot with a swarm of messages
>>> as above. It was stable before that.
>>>
>>> Unfortunately, this machine is remote and, being as old as it is, has no
>>> remote console facility. 'nextboot' has been my savior ;-)
>>>
>>> It is a 700MHz Pentium-III with 512MB of RAM and has 3 used interfaces,
>>> local ethernet (FXP), GIF for an IPv6 tunnel to HE and TAP for an
>>> OpenVPN endpoint. It has IPFW compiled into the kernel and acts as a
>>> router/firewall with few actual applications running.
>>>
>>> As another data point, I manually reversed both SVN r338360 and r338415
>>> (a related change) and it is now stable running at SVN r338520,
>>
>> It is very unprobable.  I do not see how could r338360 affect KVA allocation.
>> Double-check that you booted right kernels.
>>
> 
> FreeBSD sarah.protected-networks.net 12.0-ALPHA5 FreeBSD 12.0-ALPHA5 #14
> r338520M: Thu Sep  6 21:35:31 EDT 2018
> 
> 'svn diff' reports the only changes being the two reversals I noted above,

Can you get the output of 'x num_io_irqs' at the DDB prompt after the
panic?
-- 
John Baldwin


___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Enabling the WITH_REPRODUCIBLE_BUILD knob for 12.0-REL

2018-09-10 Thread John Baldwin
On 9/10/18 9:51 AM, Rodney W. Grimes wrote:
>> The FreeBSD base system is a reproducible build[1] with a minor
>> exception: the build metadata (timestamps, user, hostname, etc.)
>> included in the kernel and loader.
>>
>> With the default, non-reproducible build the kernel ident looks like:
>>
>> FreeBSD 12.0-ALPHA5 #4 r338195: Mon Jan 1 10:11:12 EDT 2018
>>user@hostname:/path/to/freebsd/src
>>
>> and the loader ident:
>>
>> FreeBSD/amd64 EFI loader, Revision 1.1
>> (Mon Jan 1 10:11:12 EDT 2018 user@hostname)
>>
>> With reproducible builds enabled the kernel ident looks like:
>>
>> FreeBSD 12.0-ALPHA5  r338195
>>
>> and the loader ident:
>>
>> FreeBSD/amd64 EFI loader, Revision 1.1
>>
>> I would like to enable the REPRODUCIBLE_BUILD knob by default for the
>> 12.0 release, and propose we do this by adding a step to switch the
>> default to the list of changes[2] that re@ commits to the branch as
>> part of the release process.
> 
> Why not just turn this on and leave it on?

For kernels not built against a pristine tree the extra info is useful to
have.  For better or worse, kgdb also parses the path to try to find
kernel.full (used by e.g. 'kgdb -n last'), so if you remove the path it
won't be able to find the matching kernel using its current logic.
crashinfo uses different logic so will still work fine (crashinfo looks
for all the things matching /boot/*/kernel and tries them all until it finds
a match).

-- 
John Baldwin


___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: intr_machdep.c:176:2: error: use of undeclared identifier 'interrupt_sorted'

2018-08-29 Thread John Baldwin
On 8/29/18 4:20 PM, Ian FREISLICH wrote:
> Hi
> 
> I see the definition of interrupt_sorted is #ifdefed out by #ifdef SMP
> at line 84.  My system is UP  so I'm not compiling an SMP kernel.
> 
> /usr/src/sys/x86/x86/intr_machdep.c:176:2: error: use of undeclared
> identifier 'interrupt_sorted'; did you mean 'interrupt_sources'?
>     interrupt_sorted = mallocarray(num_io_irqs,
> sizeof(*interrupt_sorted),
>     ^~~~
>     interrupt_sources
> /usr/src/sys/x86/x86/intr_machdep.c:83:24: note: 'interrupt_sources'
> declared here
> static struct intsrc **interrupt_sources;
>    ^
> /usr/src/sys/x86/x86/intr_machdep.c:176:54: error: use of undeclared
> identifier 'interrupt_sorted'; did you mean 'interrupt_sources'?
>     interrupt_sorted = mallocarray(num_io_irqs,
> sizeof(*interrupt_sorted),

Probably just needs #ifdef SMP around the mallocarray().  I'll test locallyon a 
UP kernel config.

-- 
John Baldwin


___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: svn commit: r338204 - in head: etc etc/defaults sbin/devfs

2018-08-23 Thread John Baldwin
On 8/22/18 8:37 PM, Mark Millard wrote:
> I'm just using this move as an example for some more
> general questions.
> 
> After this change when I look at:
> 
> https://www.freebsd.org/cgi/man.cgi?query=devfs.conf=0=5=FreeBSD+12-current=default=html
> 
> I see in the man page:
> 
> FILES
>  /etc/devfs.conf
>  /usr/share/examples/etc/devfs.conf
> 
> So . . .
> 
> Roughly when are the "FreeBSD+12-current" man pages going to
> track the moves? Once everything has been moved?
> 
> Are the examples also going to be moved/reorganized? Similar
> timing question to the above (if yes).

The installed location of the files doesn't change, only their location
in the source tree.  It does seem that share/examples has not been
handled to date, as they probably belong in the same package as the thing
they are samples of.

I really wish that the Makefiles were smart enough to use .PATH or
some such to reach over into ${SRCTOP}/etc to find the files without
requiring them to actually move in the tree since it's not very
intuitive where to find many of these files now.  (And the source
locations are starting to no longer mimic the layout on the host,
such as syslog.d being "flattened".)

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Newly upgraded -CURRENT box does not boot

2018-08-21 Thread John Baldwin
On 8/21/18 4:19 AM, Kyle Evans wrote:
> On Mon, Aug 20, 2018 at 4:27 PM, Brett Gmoser
>  wrote:
>> Hi there,
>>
>> I was told to e-mail these addresses with this.
>>
>> I did an `svn update` on /usr/src last night, build world and kernel as
>> usual. This morning I installed the kernel, booted into single user,
>> installed world and did mergemaster -Ui as usual. The new kernel had booted
>> fine. Upon reboot, the machine will no longer boot:
>>
>> Startup error in /boot/lua/loader.lua:
>> LUA ERROR: cannot open /boot/lua/loader.lua: no such file or directory
>>
>> can't load 'kernel'
>>
>> Many things in the bootloader do not work, including "boot kernel.old", "ls
>> /boot", and various other things (most if not all just result in "Command
>> failed"). Interestingly, "ls /mnt" works, other directories do not. That's
>> the only clue I have.
>>
>> I'm able to reboot in an installer image and mount the drive just fine.
>> Everything is there and is as expected, including /boot/lua/loader.lua.
>>
>> I re-installed everything in /usr/src/stand (chroot'd on the installer
>> image, and "cd /usr/src/stand && make clean all install"). This did not fix
>> the problem.
>>
>> Does anybody happen to have any ideas?
>>
> 
> To briefly follow up and summarize the current standing here following
> some more discussion/attempts to fix on IRC:
> 
> 1.) x86 BIOS boot
> 2.) Problem appears for both forthloader and lualoader
> 3.) Early March loader works, recent loader does not [Only tried
> loader from the past ~day]
> 4.) ls / works, ls /mnt works, ls /boot and other directories fails
> 5.) However, /boot is confirmed intact and populated by booting via
> 11.2 install media and inspecting local disk
> 
> We'll hopefully be having a bisect session tomorrow to figure out
> where exactly this broke so that maybe Brett has a chance to upgrade
> to 12.0, unless this sounds familiar to someone and the cause is
> obvious. =)

I would start with bisecting the changes to libi386/biosdisk.c.  Also,
comparing 'lsdev -v' output between old and new loaders might be a useful
step before starting on the bisecting.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: buildworld failure: Do not include ${SRCTOP}/sys when building bootstrap tools

2018-08-21 Thread John Baldwin
On 8/20/18 9:00 PM, O. Hartmann wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA512
> 
> Am Mon, 20 Aug 2018 21:24:21 +0200
> "O. Hartmann"  schrieb:
> 
>> -BEGIN PGP SIGNED MESSAGE-
>> Hash: SHA512
>>
>> Building NanoBSD world on CURRENT r338113 fails due to:
>>
>> [...]
>> cd /pool/sources/CURRENT/src/rescue/rescue/../../sbin/gbde &&  MK_TESTS=no
>> UPDATE_DEPENDFILE=no  _RECURSING_CRUNCH=1
>> MAKEOBJDIRPREFIX=/pool/nanobsd/amd64/ALERICH_amd64/pool/sources/CURRENT/src/amd64.amd64/rescue/rescue
>> make  MK_AUTO_OBJ=no  DIRPRFX=rescue/rescue/gbde/ -DRESCUE 
>> CRUNCH_CFLAGS=-DRESCUE
>> MK_AUTO_OBJ=no   obj make[5]: 
>> "/pool/sources/CURRENT/src/tools/build/mk/Makefile.boot"
>> line 18: Do not include ${SRCTOP}/sys when building bootstrap tools.  Copy 
>> the header to
>> ${WORLDTMP}/legacy in tools/build/Makefile instead.  Error was caused by 
>> Makefile
>> in /pool/sources/CURRENT/src/sbin/gbde *** [obj_crunchdir_gbde] Error code 1
>>
>> make[4]: stopped in /pool/sources/CURRENT/src/rescue/rescue
>> [...]
>>
>>
>> This problem occured during today's source updates since I was able to build 
>> the NanoBSD
>> image I intend to build yesterday ~ r338060.
>>
>> What is going wrong?
> 
> It seems the problem has been introduced after r338095, since r338095 builds 
> ok, while
> r338096 doesn't.

338096 added this check to catch a kind of error in our Makefiles.  Alex (cc'd) 
can
help with figuring out what the error is.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: LUA loader: bhyve now doesn't?

2018-08-19 Thread John Baldwin
On 8/19/18 5:28 PM, Kyle Evans wrote:
> On Sun, Aug 19, 2018 at 10:42 AM, Warner Losh  wrote:
>> On Sun, Aug 19, 2018 at 9:35 AM, Larry Rosenman  wrote:
>>
>>> On Sun, Aug 19, 2018 at 09:33:18AM -0600, Warner Losh wrote:
>>>> On Sun, Aug 19, 2018 at 9:22 AM, Larry Rosenman  wrote:
>>>>
>>>>> With today's change to LUA as the loader, I seem to have an issue with
>>>>> bhyhve:
>>>>>
>>>>> Consoles: userboot
>>>>>
>>>>> FreeBSD/amd64 User boot, Revision 1.1
>>>>> (Thu Nov 16 15:04:02 CST 2017 r...@borg.lerctr.org)
>>>>> Startup error in /boot/lua/loader.lua:
>>>>> LUA ERROR: cannot open /boot/lua/loader.lua: no such file or directory.
>>>>>
>>>>> /boot/kernel/kernel text=0x1063d88 data=0x12e930+0x283970
>>>>> syms=[0x8+0x14cf28+0x8+0x163e57]
>>>>> Hit [Enter] to boot immediately, or any other key for command prompt.
>>>>> Booting [/boot/kernel/kernel]...
>>>>>
>>>>> These VM's have been running for MONTHS.
>>>>>
>>>>> Ideas?
>>>>>
>>>>
>>>> There's no boot/lua/loader.lua.
>>>>
>>>> You can either fix that, or you can recompile with
>>>> LOADER_DEFAULT_INTERP=4th for the moment.
>>> actually on the host there is:
>>> borg.lerctr.org /home/ler $ ls -l /boot/lua/
>>> total 131
>>> -r--r--r--  1 root  wheel   3895 Aug 19 09:46 cli.lua
>>> -r--r--r--  1 root  wheel   3204 Aug 19 09:46 color.lua
>>> -r--r--r--  1 root  wheel  14024 Aug 19 09:46 config.lua
>>> -r--r--r--  1 root  wheel  10302 Aug 19 09:46 core.lua
>>> -r--r--r--  1 root  wheel   9986 Aug 19 09:46 drawer.lua
>>> -r--r--r--  1 root  wheel   3324 Aug 19 09:46 hook.lua
>>> -r--r--r--  1 root  wheel   2543 Aug 19 09:46 loader.lua
>>> -r--r--r--  1 root  wheel   2431 Aug 19 09:46 logo-beastie.lua
>>> -r--r--r--  1 root  wheel   2203 Aug 19 09:46 logo-beastiebw.lua
>>> -r--r--r--  1 root  wheel   1958 Aug 19 09:46 logo-fbsdbw.lua
>>> -r--r--r--  1 root  wheel   2399 Aug 19 09:46 logo-orb.lua
>>> -r--r--r--  1 root  wheel   2119 Aug 19 09:46 logo-orbbw.lua
>>> -r--r--r--  1 root  wheel  12010 Aug 19 09:46 menu.lua
>>> -r--r--r--  1 root  wheel   3941 Aug 19 09:46 password.lua
>>> -r--r--r--  1 root  wheel   2381 Aug 19 09:46 screen.lua
>>> borg.lerctr.org /home/ler $
>>>
>>> This is when booting the vm, and it's not on the vm's disk.
>>>
>>> So the bhyveload behavior *CHANGED*.
>>>
>>> POLA?
>>>
>>
>> Unlikely, but a couple of questions. Have you always used the LUA loader,
>> or is this a change with the recent default switch?
>>
>> And to be clear, you expect the host's file to be used for this, not the VM
>> filesystem?
>>
> 
> (CC'ing jhb@ and tychon@, who might have better insight)
> 
> If we can swing it, I think the best model here should have always
> been that userboot uses the host's scripts but the guest's
> loader.conf. The current model doesn't tolerate any mismatch between
> host and guest and looks unsustainable.

Err, normally guests read things out of the a guest disk image (think most
VMs like VirtualBox, etc.).  userboot.so is looking in the guest's disk image.
Now, userboot isn't memory limited like the BIOS boot, so if it's
possible to have userboot just include both lua and forth perhaps with
some auto-detection based on what is in /boot/loader.rc to determine
which interpreter to use, that is really the best path forward.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: kernel build failure

2018-08-19 Thread John Baldwin
On 8/14/18 1:35 AM, Matthew Macy wrote:
> On Mon, Aug 13, 2018 at 5:33 PM Rick Macklem  wrote:
> 
>> Rodney W. Grimes wrote:
>>>> On Sun, 12 Aug 2018 14:39-0700, Matthew Macy wrote:
>>>>
>>>>> Sorry guys, last time I touched ZFS I tried to push to make it an
>> option to
>>>>> statically link and was actually told that it wasn't something anyone
>> else
>>>>> wanted. The issue comes from ZFS not being in NOTES and thus not in
>> LINT.
>>>>
>>>> If consensus is that "options ZFS" is no longer valid, then maybe
>>>> UPDATING should reflect the fact.
>>>>
>>>> I can live with loading zfs.ko and opensolaris.ko at boot time, but I
>>>> think this is a step backwards.
>>>
>>> Please no, I can think of no sound reason that you should be
>>> forced to use modules.
>> I thought that ZFS was required to be a module because of the licensing
>> terms (they didn't want any CDDL code in the core kernel)?
>>
> 
> It can't be in _GENERIC_ for that reason. There's no reason it can't be in
> LINT or end users can't configure a CDDL tainted kernel.

It should definitely be in sys/conf/NOTES.  That may have just been oversight
of whoever finally fixed 'options ZFS' to work.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Fatal trap 12: page fault on Acer Chromebook 720 (peppy)

2018-08-17 Thread John Baldwin
On 8/17/18 9:54 AM, Michael Gmelin wrote:
> 
> 
>> On 17. Aug 2018, at 08:17, John Baldwin  wrote:
>>
>>> On 8/16/18 1:58 PM, Michael Gmelin wrote:
>>>
>>>
>>>> On 15. Aug 2018, at 15:55, Konstantin Belousov >>> <mailto:kostik...@gmail.com>> wrote:
>>>>
>>>>> On Wed, Aug 15, 2018 at 03:52:37PM +0200, Michael Gmelin wrote:
>>>>>
>>>>>
>>>>>>> On 15. Aug 2018, at 15:04, Konstantin Belousov >>>>>> <mailto:kostik...@gmail.com>> wrote:
>>>>>>>
>>>>>>> On Wed, Aug 15, 2018 at 12:51:06AM +0200, Michael Gmelin wrote:
>>>>>>> Reviving this old thread, since I just updated to r337818 and a similar
>>>>>>> problem is happening again. Since the fix in r334799 (review
>>>>>>> https://reviews.freebsd.org/D15675) (mp_)machdep.c have been touched,
>>>>>>> so maybe this is related
>>>>>>> (https://svnweb.freebsd.org/base?view=revision=334799).
>>>>>>>
>>>>>>> Please see the screenshot of the panic below:
>>>>>>> https://gist.github.com/grembo/78d0f2a100dd4f16775b85a118769658
>>>>>>>
>>>>>>> This is me not digging any deeper, hoping that this is something
>>>>>>> obvious. Please let me know if you need more input.
>>>>>>
>>>>>> I do not see how recent mp_machdep.c changes could affect this.
>>>>>> Can you try newest kernel but old loader ?
>>>>>
>>>>> I will try (but that will take a while). Oh, also, it still boots in save 
>>>>> mode/with smp disabled.
>>>>
>>>> Right, this is because the access to that address through DMAP is only
>>>> needed when configuring AP startup resources.
>>>>
>>>> Also, I think it is safe to suggest that the bisect is needed.
>>>
>>> Using an older loader didn’t help, but I identified the problem:
>>>
>>> https://svnweb.freebsd.org/base?view=revision=334952
>>>
>>> modified the code you introduced in
>>>
>>> https://svnweb.freebsd.org/base?view=revision=334799
>>>
>>> By correcting units to pages it also broke booting the Chromebook as a side 
>>> effect - so the previous fix just worked due to a bug it seems.
>>>
>>> Is there an easy way to output the content of physmap at that point 
>>> (debug.late_console=0 doesn’t work) - like an existing buffer I could use, 
>>> or would this be more elaborate (I did something complicated last time but 
>>> didn’t save it, so any simple solution would be preferred).
>>
>> How about reverting the commit for now so you get a working console
>> and print out the physmap array values along with Maxmem later in
>> the boot (or just use kgdb to examine them once the system is running)?
>>
> 
> This is before the system has a working console (part of calling getmem...), 
> disabling late console makes it hang, physmap changes afterwards, so running 
> kgdb later doesn’t help. Last time I kept a copy of physmap and logged it 
> later to know the original content. I can do that again, I just thought maybe 
> there is a simple mechanism I’m not aware of that would save me some time.

I thought we only modified phys_avail[], but saving a copy of physmap[] and
dumping it from kgdb is probably the simplest thing to do.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Fatal trap 12: page fault on Acer Chromebook 720 (peppy)

2018-08-17 Thread John Baldwin
On 8/16/18 1:58 PM, Michael Gmelin wrote:
> 
> 
> On 15. Aug 2018, at 15:55, Konstantin Belousov  <mailto:kostik...@gmail.com>> wrote:
> 
>> On Wed, Aug 15, 2018 at 03:52:37PM +0200, Michael Gmelin wrote:
>>>
>>>
>>>> On 15. Aug 2018, at 15:04, Konstantin Belousov >>> <mailto:kostik...@gmail.com>> wrote:
>>>>
>>>>> On Wed, Aug 15, 2018 at 12:51:06AM +0200, Michael Gmelin wrote:
>>>>> Reviving this old thread, since I just updated to r337818 and a similar
>>>>> problem is happening again. Since the fix in r334799 (review
>>>>> https://reviews.freebsd.org/D15675) (mp_)machdep.c have been touched,
>>>>> so maybe this is related
>>>>> (https://svnweb.freebsd.org/base?view=revision=334799).
>>>>>
>>>>> Please see the screenshot of the panic below:
>>>>> https://gist.github.com/grembo/78d0f2a100dd4f16775b85a118769658
>>>>>
>>>>> This is me not digging any deeper, hoping that this is something
>>>>> obvious. Please let me know if you need more input.
>>>>
>>>> I do not see how recent mp_machdep.c changes could affect this.
>>>> Can you try newest kernel but old loader ?
>>>
>>> I will try (but that will take a while). Oh, also, it still boots in save 
>>> mode/with smp disabled.
>>
>> Right, this is because the access to that address through DMAP is only
>> needed when configuring AP startup resources.
>>
>> Also, I think it is safe to suggest that the bisect is needed.
> 
> Using an older loader didn’t help, but I identified the problem:
> 
> https://svnweb.freebsd.org/base?view=revision=334952
> 
> modified the code you introduced in
> 
> https://svnweb.freebsd.org/base?view=revision=334799
> 
> By correcting units to pages it also broke booting the Chromebook as a side 
> effect - so the previous fix just worked due to a bug it seems.
> 
> Is there an easy way to output the content of physmap at that point 
> (debug.late_console=0 doesn’t work) - like an existing buffer I could use, or 
> would this be more elaborate (I did something complicated last time but 
> didn’t save it, so any simple solution would be preferred).

How about reverting the commit for now so you get a working console
and print out the physmap array values along with Maxmem later in
the boot (or just use kgdb to examine them once the system is running)?

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: programs like gdb core dump

2018-08-09 Thread John Baldwin
On 8/8/18 4:49 PM, Erich Dollansky wrote:
> Hi,
> 
> here we are:
> 
> http://sumeritec/FreeBSD/fortune.core
> http://sumeritec/FreeBSD/gdb.core
> 
> The fortune core is from the same source as the now running system. The
> gdb core should be but I am not 100% sure.
> 
> Revision: Revision: 337343

The core dumps don't really do me any good unfortunately without a binary,
but if you can open fortune.core under gdb for example, just getting the
stack trace along with 'info reg' is probably sufficient.
 
> Erich
> 
> 
> On Wed, 8 Aug 2018 08:57:06 -0700
> John Baldwin  wrote:
> 
>> On 8/7/18 7:00 PM, Erich Dollansky wrote:
>>> Hi,
>>>
>>> On Tue, 7 Aug 2018 11:59:11 -0700
>>> John Baldwin  wrote:
>>>   
>>>> On 8/6/18 8:11 PM, Erich Dollansky wrote:  
>>>>> On Mon, 6 Aug 2018 15:57:53 -0700
>>>>> John Baldwin  wrote:
>>>>> 
>>>>>> On 8/4/18 4:38 PM, Erich Dollansky wrote:
>>>   
>>>>>>> Bad system call (core dumped)  
>>>>>>
>>>>>> Did you upgrade from stable/11 with a world that is still
>>>>>> stable/11? If so, did you make sure your kernel config includes
>>>>>> COMPAT_FREEBSD11? (GENERIC should include this)
>>>>>>
>>>>>
>>>>> I never have had a machine running 11. This machine is on 12 since
>>>>> 2 or 3 years. I will check if this configuration was properly set
>>>>> on that machine.
>>>>
>>>> Ahh, a fairly old 12 world with a recent 12 kernel will still need
>>>> COMPAT_FREEBSD11.
>>>>  
>>>
>>> even when kernel and world are on '1200076' as provided by uname
>>> -U/-K, COMPAT_FREEBSD11 is required at the moment. The system is
>>> currently on r337343.  
>>
>> Hmm, plain 12.0 binaries that are up to date should not need
>> COMPAT_FREEBSD11. Do you have any of the core dumps from before handy?
>>
> 


-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Early kernel boot log?

2018-08-09 Thread John Baldwin
On 8/9/18 4:02 AM, Konstantin Belousov wrote:
> On Thu, Aug 09, 2018 at 10:26:06AM +0100, Johannes Lundberg wrote:
>> On Thu, Aug 9, 2018 at 9:29 AM Konstantin Belousov 
>> wrote:
>>
>>> On Thu, Aug 09, 2018 at 08:54:31AM +0100, Johannes Lundberg wrote:
>>>> Hi
>>>>
>>>> So I believe the reason I'm not seeing and printf output in dmesg is that
>>>> it is too early in some functions.
>>>> For example
>>>> machdep.s
>>>>  getmemsize()
>>>>  add_efi_map_entries()
>>>>  etc
>>>>
>>>> However, these functions do contain debug printf statements so if they're
>>>> logging to somewhere, where/how can I see this?
>>>>
>>>> I also tried booting in bhyve too see if I could get any output via
>>> serial
>>>> console but nothing there either.
>>> Disable efi console, only leaving comconsole around, then set
>>> debug.late_console=0
>>> in loader.
>>>
>>
>> Thanks for the tip. I found the comment in machdep.c that explains this
>> now.
>> However, running in bhyve with
>> console="comconsole" (not needed in bhyve I guess?)
>> debug.late_console=0
>>
>> Boot hangs after
>> Booting...
>> output.
>> Caused by late_console=0.
> 
> That early hangs are typically due to an exception occuring before
> IDT is set up and trap machinery operational.  Double-check that
> there is no any early framebuffer access, as a drastic measure remove
> all framebuffer drivers from your kernel config.
> 
> I do not remember, where gdb stubs added to bhyve ?  Is there a way
> to inspect the vm guest state in bhyve by other means ?

For this case the gdb stub in FreeBSD head should be sufficient.  You need
to add '-G 1234' to the command line when starting bhyve and then you can
use 'target remote localhost:1234' from either gdb or kgdb.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: programs like gdb core dump

2018-08-08 Thread John Baldwin
On 8/7/18 7:00 PM, Erich Dollansky wrote:
> Hi,
> 
> On Tue, 7 Aug 2018 11:59:11 -0700
> John Baldwin  wrote:
> 
>> On 8/6/18 8:11 PM, Erich Dollansky wrote:
>>> On Mon, 6 Aug 2018 15:57:53 -0700
>>> John Baldwin  wrote:
>>>   
>>>> On 8/4/18 4:38 PM, Erich Dollansky wrote:  
> 
>>>>> Bad system call (core dumped)
>>>>
>>>> Did you upgrade from stable/11 with a world that is still
>>>> stable/11? If so, did you make sure your kernel config includes
>>>> COMPAT_FREEBSD11? (GENERIC should include this)
>>>>  
>>>
>>> I never have had a machine running 11. This machine is on 12 since
>>> 2 or 3 years. I will check if this configuration was properly set
>>> on that machine.  
>>
>> Ahh, a fairly old 12 world with a recent 12 kernel will still need
>> COMPAT_FREEBSD11.
>>
> 
> even when kernel and world are on '1200076' as provided by uname -U/-K,
> COMPAT_FREEBSD11 is required at the moment. The system is currently on
> r337343.

Hmm, plain 12.0 binaries that are up to date should not need COMPAT_FREEBSD11.
Do you have any of the core dumps from before handy?

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: programs like gdb core dump

2018-08-07 Thread John Baldwin
On 8/6/18 8:11 PM, Erich Dollansky wrote:
> Hi,
> 
> On Mon, 6 Aug 2018 15:57:53 -0700
> John Baldwin  wrote:
> 
>> On 8/4/18 4:38 PM, Erich Dollansky wrote:
>>> Hi,
>>>
>>> I compiled me yesterday this system:
>>>
>>> 12.0-CURRENT FreeBSD 12.0-CURRENT #1 r337285:
>>>
>>> When restarting fortune core dumps. When trying to load the core
>>> dump, gdb core dumps.
>>>
>>> The message is always:
>>>
>>> Bad system call (core dumped)
>>>
>>> Trying to install ports results in the same effect.
>>>
>>> Erich  
>>
>> Did you upgrade from stable/11 with a world that is still stable/11?
>> If so, did you make sure your kernel config includes COMPAT_FREEBSD11?
>> (GENERIC should include this)
>>
> 
> I never have had a machine running 11. This machine is on 12 since 2 or
> 3 years. I will check if this configuration was properly set on that
> machine.

Ahh, a fairly old 12 world with a recent 12 kernel will still need
COMPAT_FREEBSD11.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: programs like gdb core dump

2018-08-06 Thread John Baldwin
On 8/4/18 4:38 PM, Erich Dollansky wrote:
> Hi,
> 
> I compiled me yesterday this system:
> 
> 12.0-CURRENT FreeBSD 12.0-CURRENT #1 r337285:
> 
> When restarting fortune core dumps. When trying to load the core dump,
> gdb core dumps.
> 
> The message is always:
> 
> Bad system call (core dumped)
> 
> Trying to install ports results in the same effect.
> 
> Erich

Did you upgrade from stable/11 with a world that is still stable/11?
If so, did you make sure your kernel config includes COMPAT_FREEBSD11?
(GENERIC should include this)

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: head -r336568 and -r336570 appears to have made ci.freebsg.org's FreeBSD-head-amd64-gcc fail either than it had been (error: operand type 'struct *' is incompatible with argument 1 of

2018-08-03 Thread John Baldwin
I decided that it was better to fix our stdatomic.h, so I have a review
to do that at https://reviews.freebsd.org/D16585

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: head -r336568 and -r336570 appears to have made ci.freebsg.org's FreeBSD-head-amd64-gcc fail either than it had been (error: operand type 'struct *' is incompatible with argument 1 of

2018-07-27 Thread John Baldwin
On 7/27/18 12:12 AM, Mark Millard wrote:
> I was looking too locally: the overall context has an outer #if
> as well that skips the section:
> 
> /*
>  * Keywords added in C11.
>  */
>  
> #if !defined(__STDC_VERSION__) || __STDC_VERSION__ < 201112L
> . . .
> #if !defined(__cplusplus) && !__has_extension(c_atomic) && \
> !__has_extension(cxx_atomic)
> /*
>  * No native support for _Atomic(). Place object in structure to prevent
>  * most forms of direct non-atomic access.
>  */
> #define _Atomic(T)  struct { T volatile __val; }
> #endif
> . . .
> #endif /* __STDC_VERSION__ || __STDC_VERSION__ < 201112L */

Yes.  It also means that if we didn't ship the compiler's stdatomic.h and
tried to build with -std=gnu11 or -std=c11 the compile would break.

Rather than requiring c11, another approach might be to fix sys/cdefs.h
and sys/stdatomic.h to actually work with modern GCC by having them not
use the struct for the _GCC_ATOMICS case, only for the _SYNC case.

I think that would fix all of the cases.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: head -r335782 (?) broke ci.freebsd.org's FreeBSD-head-amd64-gcc build (lib32 part of build)

2018-07-26 Thread John Baldwin
On 7/16/18 11:27 PM, Mark Millard wrote:
> On 2018-Jul-1, at 6:34 AM, Mark Millard  wrote:
> 
>> My brain finally engaged for showing exactly what files are included
>> for the gcc builds: the .meta files include that information explicitly
>> (along with other files that are opened during the operation).
>>
>> amd64 is as I reported, just one header file from gcc: float.h .
>>
>> powerpc64 builds Lex/Lexer.cpp without defining __ALTIVEC__ and so
>> is not including  . Building without __ALTIVEC__ might
>> be an error itself but would be a workaround for the altivec.h
>> file name aliasing vs. search-path problem.
>>
>> . . .
> 
> Going in a different direction, what of the unchanged Makefile.inc1
> code block:
> 
> .if ${WANT_COMPILER_TYPE} == gcc || \
> (defined(X_COMPILER_TYPE) && ${X_COMPILER_TYPE} == gcc)
> # GCC requires -isystem and -L when using a cross-compiler.  --sysroot
> # won't set header path and -L is used to ensure the base library path
> # is added before the port PREFIX library path.
> CD2CFLAGS+= -isystem ${XDDESTDIR}/usr/include -L${XDDESTDIR}/usr/lib
> # GCC requires -B to find /usr/lib/crti.o when using a cross-compiler
> # combined with --sysroot.
> CD2CFLAGS+= -B${XDDESTDIR}/usr/lib
> # Force using libc++ for external GCC.
> .if defined(X_COMPILER_TYPE) && \
> ${X_COMPILER_TYPE} == gcc && ${X_COMPILER_VERSION} >= 40800
> CD2CXXFLAGS+=   -isystem ${XDDESTDIR}/usr/include/c++/v1 -std=c++11 \
> -nostdinc++
> .endif
> .endif
> 
> Why is that pair of -isystem uses that gives the old search order
> okay? Or was the block just missed? (Similarly for other options
> listed above.)

Just missed.  They should probably also be removed.

> Note: Locally I've reverted the -r335782 changes in order for my use
> of devel/*-gcc as cross compilers to work where they used to (hopefully:
> still building), restoring the historical search order for the
> directories for now.

I finally got the approval 2 days ago to remove float.h from amd64-gcc so
you shouldn't need this reverted anymore once the OFED thing is
straightened out.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: head -r336568 and -r336570 appears to have made ci.freebsg.org's FreeBSD-head-amd64-gcc fail either than it had been (error: operand type 'struct *' is incompatible with argument 1 of

2018-07-26 Thread John Baldwin
On 7/26/18 10:55 AM, Mark Millard wrote:
> 
> 
> On 2018-Jul-26, at 10:21 AM, John Baldwin  wrote:
> 
>> On 7/25/18 6:52 PM, Mark Millard wrote:
>>>
>>>
>>> On 2018-Jul-25, at 2:10 PM, Mark Millard  wrote:
>>>
>>>
>>>
>>>> On 2018-Jul-25, at 10:09 AM, Mark Millard  wrote:
>>>>
>>>>> On 2018-Jul-25, at 8:39 AM, John Baldwin  wrote:
>>>>>
>>>>>> On 7/24/18 11:39 PM, Mark Millard wrote:
>>>>>>> On 2018-Jul-24, at 10:32 PM, Mark Millard  wrote:
>>>>>>>
>>>>>>>> https://ci.freebsd.org/job/FreeBSD-head-amd64-gcc/6597/consoleText
>>>>>>>> (head -r336573 after the prior 6596's -r336565 ):
>>>>>>>>
>>>>>>>> --- all_subdir_lib/ofed ---
>>>>>>>> In file included from /workspace/src/contrib/ofed/librdmacm/cma.h:43:0,
>>>>>>>> from /workspace/src/contrib/ofed/librdmacm/acm.c:42:
>>>>>>>> /workspace/src/contrib/ofed/librdmacm/cma.h: In function 
>>>>>>>> 'fastlock_init':
>>>>>>>> /workspace/src/contrib/ofed/librdmacm/cma.h:60:2: error: invalid 
>>>>>>>> initializer
>>>>>>>> atomic_store(>cnt, 0);
>>>>>>>> ^
>>>>>>>> In file included from /workspace/src/contrib/ofed/librdmacm/acm.c:42:0:
>>>>>>>> /workspace/src/contrib/ofed/librdmacm/cma.h: In function 
>>>>>>>> 'fastlock_acquire':
>>>>>>>> /workspace/src/contrib/ofed/librdmacm/cma.h:68:2: error: operand type 
>>>>>>>> 'struct  *' is incompatible with argument 1 of 
>>>>>>>> '__atomic_fetch_add'
>>>>>>>> if (atomic_fetch_add(>cnt, 1) > 0)
>>>>>>>> ^~
>>>>>>>> /workspace/src/contrib/ofed/librdmacm/cma.h: In function 
>>>>>>>> 'fastlock_release':
>>>>>>>> /workspace/src/contrib/ofed/librdmacm/cma.h:73:2: error: operand type 
>>>>>>>> 'struct  *' is incompatible with argument 1 of 
>>>>>>>> '__atomic_fetch_sub'
>>>>>>>> if (atomic_fetch_sub(>cnt, 1) > 1)
>>>>>>>> ^~
>>>>>>>> . . .
>>>>>>>> --- all_subdir_lib/ofed ---
>>>>>>>> *** [acm.o] Error code 1
>>>>>>>>
>>>>>>>>
>>>>>>>> https://ci.freebsd.org/job/FreeBSD-head-amd64-gcc/6621/consoleText ( 
>>>>>>>> for
>>>>>>>> -r336700 ) still shows this type of error.
>>>>>>>
>>>>>>>
>>>>>>> [I should have a subject with "head -r336568 through -r336570 . . .".]
>>>>>>>
>>>>>>> From what I can tell looking around having something like:
>>>>>>>
>>>>>>> if (atomic_fetch_add(>cnt, 1) > 0)
>>>>>>>
>>>>>>> involve a __atomic_fetch_add indicates that:
>>>>>>>
>>>>>>> /usr/local/lib/gcc/x86_64-unknown-freebsd12.0/6.4.0/include/stdatomic.h
>>>>>>>
>>>>>>> was in use instead of FreeBSD's stdatomic.h file.
>>>>>>>
>>>>>>> If this is right, then the issue may be tied to head -r335782
>>>>>>> implicitly changing the order of the include file directory
>>>>>>> searching for builds via the devel/*-gcc .
>>>>>>>
>>>>>>> (I reverted -r335782 in my environment some time ago and have
>>>>>>> not run into this problem in my context so far.)
>>>>>>
>>>>>> C11 atomics should work fine with compiler-provided headers since they
>>>>>> are a part of the language (and the system stdatomic.h simply attempts
>>>>>> to mimic the compiler-provided header in case it is missing).
>>>>>>
>>>>>> Simple standalone tests of _Atomic(int) with GCC don't trigger those
>>>>>> failures when using its stdatomic.h, so there is probably something else
>>>>>> going on with kernel includes being used while building the library,
>>>>>> etc.  The last time we had this issue with stdarg.h it was because a
>>>>>> header shared between the 

Re: head -r336568 and -r336570 appears to have made ci.freebsg.org's FreeBSD-head-amd64-gcc fail either than it had been (error: operand type 'struct *' is incompatible with argument 1 of

2018-07-26 Thread John Baldwin
On 7/25/18 6:52 PM, Mark Millard wrote:
> 
> 
> On 2018-Jul-25, at 2:10 PM, Mark Millard  wrote:
> 
> 
> 
>> On 2018-Jul-25, at 10:09 AM, Mark Millard  wrote:
>>
>>> On 2018-Jul-25, at 8:39 AM, John Baldwin  wrote:
>>>
>>>> On 7/24/18 11:39 PM, Mark Millard wrote:
>>>>> On 2018-Jul-24, at 10:32 PM, Mark Millard  wrote:
>>>>>
>>>>>> https://ci.freebsd.org/job/FreeBSD-head-amd64-gcc/6597/consoleText
>>>>>> (head -r336573 after the prior 6596's -r336565 ):
>>>>>>
>>>>>> --- all_subdir_lib/ofed ---
>>>>>> In file included from /workspace/src/contrib/ofed/librdmacm/cma.h:43:0,
>>>>>>  from /workspace/src/contrib/ofed/librdmacm/acm.c:42:
>>>>>> /workspace/src/contrib/ofed/librdmacm/cma.h: In function 'fastlock_init':
>>>>>> /workspace/src/contrib/ofed/librdmacm/cma.h:60:2: error: invalid 
>>>>>> initializer
>>>>>> atomic_store(>cnt, 0);
>>>>>> ^
>>>>>> In file included from /workspace/src/contrib/ofed/librdmacm/acm.c:42:0:
>>>>>> /workspace/src/contrib/ofed/librdmacm/cma.h: In function 
>>>>>> 'fastlock_acquire':
>>>>>> /workspace/src/contrib/ofed/librdmacm/cma.h:68:2: error: operand type 
>>>>>> 'struct  *' is incompatible with argument 1 of 
>>>>>> '__atomic_fetch_add'
>>>>>> if (atomic_fetch_add(>cnt, 1) > 0)
>>>>>> ^~
>>>>>> /workspace/src/contrib/ofed/librdmacm/cma.h: In function 
>>>>>> 'fastlock_release':
>>>>>> /workspace/src/contrib/ofed/librdmacm/cma.h:73:2: error: operand type 
>>>>>> 'struct  *' is incompatible with argument 1 of 
>>>>>> '__atomic_fetch_sub'
>>>>>> if (atomic_fetch_sub(>cnt, 1) > 1)
>>>>>> ^~
>>>>>> . . .
>>>>>> --- all_subdir_lib/ofed ---
>>>>>> *** [acm.o] Error code 1
>>>>>>
>>>>>>
>>>>>> https://ci.freebsd.org/job/FreeBSD-head-amd64-gcc/6621/consoleText ( for
>>>>>> -r336700 ) still shows this type of error.
>>>>>
>>>>>
>>>>> [I should have a subject with "head -r336568 through -r336570 . . .".]
>>>>>
>>>>> From what I can tell looking around having something like:
>>>>>
>>>>> if (atomic_fetch_add(>cnt, 1) > 0)
>>>>>
>>>>> involve a __atomic_fetch_add indicates that:
>>>>>
>>>>> /usr/local/lib/gcc/x86_64-unknown-freebsd12.0/6.4.0/include/stdatomic.h
>>>>>
>>>>> was in use instead of FreeBSD's stdatomic.h file.
>>>>>
>>>>> If this is right, then the issue may be tied to head -r335782
>>>>> implicitly changing the order of the include file directory
>>>>> searching for builds via the devel/*-gcc .
>>>>>
>>>>> (I reverted -r335782 in my environment some time ago and have
>>>>> not run into this problem in my context so far.)
>>>>
>>>> C11 atomics should work fine with compiler-provided headers since they
>>>> are a part of the language (and the system stdatomic.h simply attempts
>>>> to mimic the compiler-provided header in case it is missing).
>>>>
>>>> Simple standalone tests of _Atomic(int) with GCC don't trigger those
>>>> failures when using its stdatomic.h, so there is probably something else
>>>> going on with kernel includes being used while building the library,
>>>> etc.  The last time we had this issue with stdarg.h it was because a
>>>> header shared between the kernel and userland always used 
>>>> ''
>>>> which is correct for the kernel but not for userland.
>>>
>>> I did misread the headers. FreeBSD has the likes of:
>>>
>>> #if defined(__CLANG_ATOMICS)
>>> . . .
>>> #define atomic_fetch_add_explicit(object, operand, order)   
>>> \
>>> __c11_atomic_fetch_add(object, operand, order)
>>> . . .
>>> #elif defined(__GNUC_ATOMICS)
>>> . . .
>>> #define atomic_fetch_add_explicit(object, operand, order)   
>>> \
>>> __atomic_fetch_add(&(object)->__val, operand, order)
>>> . . .
>>> #endif
>>> . . .
>>> #define 

Re: head -r336568 and -r336570 appears to have made ci.freebsg.org's FreeBSD-head-amd64-gcc fail either than it had been (error: operand type 'struct *' is incompatible with argument 1 of

2018-07-25 Thread John Baldwin
On 7/25/18 10:09 AM, Mark Millard wrote:
> 
> 
> On 2018-Jul-25, at 8:39 AM, John Baldwin  wrote:
> 
>> On 7/24/18 11:39 PM, Mark Millard wrote:
>>> On 2018-Jul-24, at 10:32 PM, Mark Millard  wrote:
>>>
>>>> https://ci.freebsd.org/job/FreeBSD-head-amd64-gcc/6597/consoleText
>>>> (head -r336573 after the prior 6596's -r336565 ):
>>>>
>>>> --- all_subdir_lib/ofed ---
>>>> In file included from /workspace/src/contrib/ofed/librdmacm/cma.h:43:0,
>>>>from /workspace/src/contrib/ofed/librdmacm/acm.c:42:
>>>> /workspace/src/contrib/ofed/librdmacm/cma.h: In function 'fastlock_init':
>>>> /workspace/src/contrib/ofed/librdmacm/cma.h:60:2: error: invalid 
>>>> initializer
>>>> atomic_store(>cnt, 0);
>>>> ^
>>>> In file included from /workspace/src/contrib/ofed/librdmacm/acm.c:42:0:
>>>> /workspace/src/contrib/ofed/librdmacm/cma.h: In function 
>>>> 'fastlock_acquire':
>>>> /workspace/src/contrib/ofed/librdmacm/cma.h:68:2: error: operand type 
>>>> 'struct  *' is incompatible with argument 1 of 
>>>> '__atomic_fetch_add'
>>>> if (atomic_fetch_add(>cnt, 1) > 0)
>>>> ^~
>>>> /workspace/src/contrib/ofed/librdmacm/cma.h: In function 
>>>> 'fastlock_release':
>>>> /workspace/src/contrib/ofed/librdmacm/cma.h:73:2: error: operand type 
>>>> 'struct  *' is incompatible with argument 1 of 
>>>> '__atomic_fetch_sub'
>>>> if (atomic_fetch_sub(>cnt, 1) > 1)
>>>> ^~
>>>> . . .
>>>> --- all_subdir_lib/ofed ---
>>>> *** [acm.o] Error code 1
>>>>
>>>>
>>>> https://ci.freebsd.org/job/FreeBSD-head-amd64-gcc/6621/consoleText ( for
>>>> -r336700 ) still shows this type of error.
>>>
>>>
>>> [I should have a subject with "head -r336568 through -r336570 . . .".]
>>>
>>> From what I can tell looking around having something like:
>>>
>>> if (atomic_fetch_add(>cnt, 1) > 0)
>>>
>>> involve a __atomic_fetch_add indicates that:
>>>
>>> /usr/local/lib/gcc/x86_64-unknown-freebsd12.0/6.4.0/include/stdatomic.h
>>>
>>> was in use instead of FreeBSD's stdatomic.h file.
>>>
>>> If this is right, then the issue may be tied to head -r335782
>>> implicitly changing the order of the include file directory
>>> searching for builds via the devel/*-gcc .
>>>
>>> (I reverted -r335782 in my environment some time ago and have
>>> not run into this problem in my context so far.)
>>
>> C11 atomics should work fine with compiler-provided headers since they
>> are a part of the language (and the system stdatomic.h simply attempts
>> to mimic the compiler-provided header in case it is missing).
>>
>> Simple standalone tests of _Atomic(int) with GCC don't trigger those
>> failures when using its stdatomic.h, so there is probably something else
>> going on with kernel includes being used while building the library,
>> etc.  The last time we had this issue with stdarg.h it was because a
>> header shared between the kernel and userland always used 
>> ''
>> which is correct for the kernel but not for userland.
> 
> I did misread the headers. FreeBSD has the likes of:
> 
> #if defined(__CLANG_ATOMICS)
> . . .
> #define   atomic_fetch_add_explicit(object, operand, order)   
> \
>   __c11_atomic_fetch_add(object, operand, order)
> . . .
> #elif defined(__GNUC_ATOMICS)
> . . .
> #define   atomic_fetch_add_explicit(object, operand, order)   
> \
>   __atomic_fetch_add(&(object)->__val, operand, order)
> . . .
> #endif
> . . .
> #define   atomic_fetch_add(object, operand)   
> \
>   atomic_fetch_add_explicit(object, operand, memory_order_seq_cst)
> 
> so __atomic_fetch_add would occur.
> 
> But so far I do not see the problem with -r335782 reverted. I last built
> -r336693 last night via devel/amd64-gcc (via xtoolchain).
> 
> From what I can tell FreeBSD defines:
> 
> #if !defined(__CLANG_ATOMICS)
> #define   _Atomic(T)  struct { volatile T __val; }
> #endif

This looks wrong for modern GCC supporting C11 atomics.  What is happening is
that this is probably overriding the compiler's builtin _Atomic and then the
compiler's stdatomic.h which doesn't look for __val but expects 'object' to
be a plain int is then failing to compile.  Just including sys/cdefs.h in
my test program doesn't trigger the failure though.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: head -r336568 and -r336570 appears to have made ci.freebsg.org's FreeBSD-head-amd64-gcc fail either than it had been (error: operand type 'struct *' is incompatible with argument 1 of

2018-07-25 Thread John Baldwin
On 7/24/18 11:39 PM, Mark Millard wrote:
> On 2018-Jul-24, at 10:32 PM, Mark Millard  wrote:
> 
>> https://ci.freebsd.org/job/FreeBSD-head-amd64-gcc/6597/consoleText
>> (head -r336573 after the prior 6596's -r336565 ):
>>
>> --- all_subdir_lib/ofed ---
>> In file included from /workspace/src/contrib/ofed/librdmacm/cma.h:43:0,
>> from /workspace/src/contrib/ofed/librdmacm/acm.c:42:
>> /workspace/src/contrib/ofed/librdmacm/cma.h: In function 'fastlock_init':
>> /workspace/src/contrib/ofed/librdmacm/cma.h:60:2: error: invalid initializer
>>  atomic_store(>cnt, 0);
>>  ^
>> In file included from /workspace/src/contrib/ofed/librdmacm/acm.c:42:0:
>> /workspace/src/contrib/ofed/librdmacm/cma.h: In function 'fastlock_acquire':
>> /workspace/src/contrib/ofed/librdmacm/cma.h:68:2: error: operand type 
>> 'struct  *' is incompatible with argument 1 of 
>> '__atomic_fetch_add'
>>  if (atomic_fetch_add(>cnt, 1) > 0)
>>  ^~
>> /workspace/src/contrib/ofed/librdmacm/cma.h: In function 'fastlock_release':
>> /workspace/src/contrib/ofed/librdmacm/cma.h:73:2: error: operand type 
>> 'struct  *' is incompatible with argument 1 of 
>> '__atomic_fetch_sub'
>>  if (atomic_fetch_sub(>cnt, 1) > 1)
>>  ^~
>> . . .
>> --- all_subdir_lib/ofed ---
>> *** [acm.o] Error code 1
>>
>>
>> https://ci.freebsd.org/job/FreeBSD-head-amd64-gcc/6621/consoleText ( for
>> -r336700 ) still shows this type of error.
> 
> 
> [I should have a subject with "head -r336568 through -r336570 . . .".]
> 
> From what I can tell looking around having something like:
> 
> if (atomic_fetch_add(>cnt, 1) > 0)
> 
> involve a __atomic_fetch_add indicates that:
> 
> /usr/local/lib/gcc/x86_64-unknown-freebsd12.0/6.4.0/include/stdatomic.h
> 
> was in use instead of FreeBSD's stdatomic.h file.
> 
> If this is right, then the issue may be tied to head -r335782
> implicitly changing the order of the include file directory
> searching for builds via the devel/*-gcc .
> 
> (I reverted -r335782 in my environment some time ago and have
> not run into this problem in my context so far.)

C11 atomics should work fine with compiler-provided headers since they
are a part of the language (and the system stdatomic.h simply attempts
to mimic the compiler-provided header in case it is missing).

Simple standalone tests of _Atomic(int) with GCC don't trigger those
failures when using its stdatomic.h, so there is probably something else
going on with kernel includes being used while building the library,
etc.  The last time we had this issue with stdarg.h it was because a
header shared between the kernel and userland always used ''
which is correct for the kernel but not for userland.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: atomic changes break drm-next-kmod?

2018-07-05 Thread John Baldwin
On 7/5/18 12:36 PM, Konstantin Belousov wrote:
> On Thu, Jul 05, 2018 at 09:12:24PM +0200, Hans Petter Selasky wrote:
>> On 07/05/18 20:59, Hans Petter Selasky wrote:
>>> On 07/05/18 19:48, Pete Wright wrote:
>>>>
>>>>
>>>> On 07/05/2018 10:10, John Baldwin wrote:
>>>>> On 7/3/18 5:10 PM, Pete Wright wrote:
>>>>>>
>>>>>> On 07/03/2018 15:56, John Baldwin wrote:
>>>>>>> On 7/3/18 3:34 PM, Pete Wright wrote:
>>>>>>>> On 07/03/2018 15:29, John Baldwin wrote:
>>>>>>>>> That seems like kgdb is looking at the wrong CPU.  Can you use
>>>>>>>>> 'info threads' and look for threads not stopped in 'sched_switch'
>>>>>>>>> and get their backtraces?  You could also just do 'thread apply
>>>>>>>>> all bt' and put that file at a URL if that is easiest.
>>>>>>>>>
>>>>>>>> sure thing John - here's a gist of "thread apply all bt"
>>>>>>>>
>>>>>>>> https://gist.github.com/gem-pete/d8d7ab220dc8781f0827f965f09d43ed
>>>>>>> That doesn't look right at all.  Are you sure the kernel matches the
>>>>>>> vmcore?  Also, which kgdb version are you using?
>>>>>>>
>>>>>> yea i agree that doesn't look right at all.  here is my setup:
>>>>>>
>>>>>> $ which kgdb
>>>>>> /usr/bin/kgdb
>>>>>> $ kgdb
>>>>>> GNU gdb 6.1.1 [FreeBSD]
>>>>>> $ ls -lh /var/crash/vmcore.1
>>>>>> -rw---  1 root  wheel   1.6G Jul  3 15:03 /var/crash/vmcore.1
>>>>>> $ ls -l /usr/lib/debug/boot/kernel/kernel.debug
>>>>>> -r-xr-xr-x  1 root  wheel  87840496 Jul  3 13:54
>>>>>> /usr/lib/debug/boot/kernel/kernel.debug
>>>>>>
>>>>>> and i invoke kgdb like so:
>>>>>> $ sudo kgdb /usr/lib/debug/boot/kernel/kernel.debug /var/crash/vmcore.1
>>>>>>
>>>>>> here's a gist of my full gdb session:
>>>>>> http://termbin.com/krsn
>>>>>>
>>>>>> dunno - maybe i have a bad core dump?  regardless, more than happy to
>>>>>> help so let me know if i should try anything else or patches etc..
>>>>> Can you try installing gdb from ports and using /usr/local/bin/kgdb?
>>>>>
>>>>
>>>> that seems to have done the trick, at least the output looks more 
>>>> encouraging.
>>>>
>>>>   --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
>>>> KDB: enter: panic
>>>>
>>>> __curthread () at ./machine/pcpu.h:231
>>>> 231        __asm("movq %%gs:%1,%0" : "=r" (td)
>>>>
>>>>
>>>> here's my full kgdb session:
>>>> http://termbin.com/qa4f
>>>>
>>>> i don't see any threads not in "sched_switch" though :(
>>>
>>> Hi,
>>>
>>> The problem may be that the patch to enable atomic inlining of all 
>>> macros forgot to set the SMP keyword which means SMP is not defined at 
>>> all for KLD's so all non-kernel atomic usage is with MPLOCKED empty!
> Problem is that out-of-tree modules build does not have opt*.h files
> from the kernel.  UP config is a valid one, flipping some option's
> default value does not solve the problem.

Yes, but using the lock prefix in a generic module is ok (it will still
work, just not quite as fast) whereas the lack of lock is fatal on 
SMP.  I would amend Hans' patch slightly to honor the opt_* setting
for KLD_TIED (but that is only true if KLD_TIED means "built as part of
a kernel build, so has valid opt_foo.h headers" and not
'a standalone module where someone put MODULES_TIED=1 on the command line
to make').

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: atomic changes break drm-next-kmod?

2018-07-05 Thread John Baldwin
On 7/3/18 5:10 PM, Pete Wright wrote:
> 
> 
> On 07/03/2018 15:56, John Baldwin wrote:
>> On 7/3/18 3:34 PM, Pete Wright wrote:
>>>
>>> On 07/03/2018 15:29, John Baldwin wrote:
>>>> That seems like kgdb is looking at the wrong CPU.  Can you use
>>>> 'info threads' and look for threads not stopped in 'sched_switch'
>>>> and get their backtraces?  You could also just do 'thread apply
>>>> all bt' and put that file at a URL if that is easiest.
>>>>
>>>
>>> sure thing John - here's a gist of "thread apply all bt"
>>>
>>> https://gist.github.com/gem-pete/d8d7ab220dc8781f0827f965f09d43ed
>> That doesn't look right at all.  Are you sure the kernel matches the
>> vmcore?  Also, which kgdb version are you using?
>>
> 
> yea i agree that doesn't look right at all.  here is my setup:
> 
> $ which kgdb
> /usr/bin/kgdb
> $ kgdb
> GNU gdb 6.1.1 [FreeBSD]
> $ ls -lh /var/crash/vmcore.1
> -rw---  1 root  wheel   1.6G Jul  3 15:03 /var/crash/vmcore.1
> $ ls -l /usr/lib/debug/boot/kernel/kernel.debug
> -r-xr-xr-x  1 root  wheel  87840496 Jul  3 13:54 
> /usr/lib/debug/boot/kernel/kernel.debug
> 
> and i invoke kgdb like so:
> $ sudo kgdb /usr/lib/debug/boot/kernel/kernel.debug /var/crash/vmcore.1
> 
> here's a gist of my full gdb session:
> http://termbin.com/krsn
> 
> dunno - maybe i have a bad core dump?  regardless, more than happy to 
> help so let me know if i should try anything else or patches etc..

Can you try installing gdb from ports and using /usr/local/bin/kgdb?

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: atomic changes break drm-next-kmod?

2018-07-03 Thread John Baldwin
On 7/3/18 3:40 PM, Matthew Macy wrote:
> This seems like a clang inline asm bug. Could you try building the port with 
> a recent gcc against an unpatched HEAD?

I've already committed the patch to HEAD, but using 'e' is from the GCC docs, 
not clang docs:

https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints

The disassembly of one of the functions from the kmod using one of the affected 
atomic ops would
show if it is working correctly (there should now be a mov with a 64-bit 
immediate into a register
followed by the atomic op using a register operand).  You could also try using 
just 'r' to always
force the use of a register.  It would be less optimal than "er" but should 
function correctly.

> On Tue, Jul 3, 2018 at 15:38 Pete Wright  <mailto:p...@nomadlogic.org>> wrote:
> 
> 
> 
> On 07/03/2018 15:29, John Baldwin wrote:
> > That seems like kgdb is looking at the wrong CPU.  Can you use
> > 'info threads' and look for threads not stopped in 'sched_switch'
> > and get their backtraces?  You could also just do 'thread apply
> > all bt' and put that file at a URL if that is easiest.
> >
> 
> 
> sure thing John - here's a gist of "thread apply all bt"
> 
> https://gist.github.com/gem-pete/d8d7ab220dc8781f0827f965f09d43ed
> 
> cheers!
> -pete
> 
> -- 
> Pete Wright
> p...@nomadlogic.org <mailto:p...@nomadlogic.org>
> @nomadlogicLA
> 
> ___
> freebsd-current@freebsd.org <mailto:freebsd-current@freebsd.org> mailing 
> list
> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org 
> <mailto:freebsd-current-unsubscr...@freebsd.org>"
> 


-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: atomic changes break drm-next-kmod?

2018-07-03 Thread John Baldwin
On 7/3/18 3:34 PM, Pete Wright wrote:
> 
> 
> On 07/03/2018 15:29, John Baldwin wrote:
>> That seems like kgdb is looking at the wrong CPU.  Can you use
>> 'info threads' and look for threads not stopped in 'sched_switch'
>> and get their backtraces?  You could also just do 'thread apply
>> all bt' and put that file at a URL if that is easiest.
>>
> 
> 
> sure thing John - here's a gist of "thread apply all bt"
> 
> https://gist.github.com/gem-pete/d8d7ab220dc8781f0827f965f09d43ed

That doesn't look right at all.  Are you sure the kernel matches the
vmcore?  Also, which kgdb version are you using?

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: atomic changes break drm-next-kmod?

2018-07-03 Thread John Baldwin
On 7/3/18 3:20 PM, Pete Wright wrote:
> 
> 
> On 07/03/2018 15:12, Pete Wright wrote:
>>
>>
>> On 07/03/2018 14:17, Pete Wright wrote:
>>>
>>>
>>> On 07/03/2018 12:02, John Baldwin wrote:
>>>> On 7/3/18 11:28 AM, Niclas Zeising wrote:
>>>>> On 07/03/18 17:02, O. Hartmann wrote:
>>>>>> -BEGIN PGP SIGNED MESSAGE-
>>>>>> Hash: SHA512
>>>>>>
>>>>>> Am Tue, 3 Jul 2018 10:19:57 -0400
>>>>>> Michael Butler  schrieb:
>>>>>>
>>>>>>> It seems recent changes (SVN r335873?) may have broken 
>>>>>>> drm-next-kmod ..
>>>>>>>
>>>>>>> --- i915_drv.o ---
>>>>>>> In file included from i915_drv.c:30:
>>>>>>> In file included from
>>>>>>> /usr/ports/graphics/drm-next-kmod/work/kms-drm-a753215/linuxkpi/gplv2/include/linux/acpi.h:26:
>>>>>>>  
>>>>>>>
>>>>>>> In file included from
>>>>>>> /usr/ports/graphics/drm-next-kmod/work/kms-drm-a753215/linuxkpi/gplv2/include/linux/device.h:4:
>>>>>>>  
>>>>>>>
>>>>>>> In file included from
>>>>>>> /usr/src/sys/compat/linuxkpi/common/include/linux/device.h:35:
>>>>>>> In file included from
>>>>>>> /usr/src/sys/compat/linuxkpi/common/include/linux/types.h:37:
>>>>>>> In file included from /usr/src/sys/sys/systm.h:44:
>>>>>>> ./machine/atomic.h:450:29: error: invalid operand for instruction
>>>>>>> ATOMIC_ASM(clear,    long,  "andq %1,%0",  "ir", ~v);
>>>>>>>   ^
>>>>>>> :1:7: note: instantiated into assembly here
>>>>>>>   andq $9223372036854775807,40672(%r14)
>>>>>>>    ^
>>>>>>> 1 error generated.
>>>>>>> *** [i915_drv.o] Error code 1
>>>>>>>
>>>>>>> make[3]: stopped in
>>>>>>> /usr/ports/graphics/drm-next-kmod/work/kms-drm-a753215/i915
>>>>>>> --- i915_gem.o ---
>>>>>>> In file included from i915_gem.c:28:
>>>>>>> In file included from
>>>>>>> /usr/ports/graphics/drm-next-kmod/work/kms-drm-a753215/include/drm/drmP.h:38:
>>>>>>>  
>>>>>>>
>>>>>>> In file included from /usr/src/sys/sys/malloc.h:42:
>>>>>>> In file included from /usr/src/sys/sys/systm.h:44:
>>>>>>> ./machine/atomic.h:449:29: error: invalid operand for instruction
>>>>>>> ATOMIC_ASM(set,  long,  "orq %1,%0",   "ir",  v);
>>>>>>>   ^
>>>>>>> :1:6: note: instantiated into assembly here
>>>>>>>   orq $-9223372036854775808,40672(%r14)
>>>>>>>   ^~
>>>>>>> 1 error generated.
>>>>>>> *** [i915_gem.o] Error code 1
>>>>>>>
>>>>>>> ___
>>>>>>> freebsd-current@freebsd.org mailing list
>>>>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-current
>>>>>>> To unsubscribe, send any mail to 
>>>>>>> "freebsd-current-unsubscr...@freebsd.org"
>>>>>>
>>>>>> It breaks also graphics/drm-stable-kmod (see PR 229484,
>>>>>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=229484, same 
>>>>>> error as you described
>>>>>> above) and also emulators/virtualbox-ose-kmod. As long as CURRENT 
>>>>>> revision is < r335873,
>>>>>> those kmod compile well.
>>>>> We are looking into why both the drm ports fail.
>>>>> Regards
>>>>>
>>>> I haven't yet tested an amd64 kernel with this, but I think this 
>>>> change to sys/amd64/include/atomic.h
>>>> might fix it:
>>>>
>>>> Index: atomic.h
>>>> ===
>>>> --- atomic.h    (revision 335896)
>>>> +++ atomic.h    (working copy)
>>>> @@ -446,10 +446,10 @@ ATOMIC_ASM(clear,    int,   "andl

Re: atomic changes break drm-next-kmod?

2018-07-03 Thread John Baldwin
On 7/3/18 11:28 AM, Niclas Zeising wrote:
> On 07/03/18 17:02, O. Hartmann wrote:
>> -BEGIN PGP SIGNED MESSAGE-
>> Hash: SHA512
>>
>> Am Tue, 3 Jul 2018 10:19:57 -0400
>> Michael Butler  schrieb:
>>
>>> It seems recent changes (SVN r335873?) may have broken drm-next-kmod ..
>>>
>>> --- i915_drv.o ---
>>> In file included from i915_drv.c:30:
>>> In file included from
>>> /usr/ports/graphics/drm-next-kmod/work/kms-drm-a753215/linuxkpi/gplv2/include/linux/acpi.h:26:
>>> In file included from
>>> /usr/ports/graphics/drm-next-kmod/work/kms-drm-a753215/linuxkpi/gplv2/include/linux/device.h:4:
>>> In file included from
>>> /usr/src/sys/compat/linuxkpi/common/include/linux/device.h:35:
>>> In file included from
>>> /usr/src/sys/compat/linuxkpi/common/include/linux/types.h:37:
>>> In file included from /usr/src/sys/sys/systm.h:44:
>>> ./machine/atomic.h:450:29: error: invalid operand for instruction
>>> ATOMIC_ASM(clear,long,  "andq %1,%0",  "ir", ~v);
>>>  ^
>>> :1:7: note: instantiated into assembly here
>>>  andq $9223372036854775807,40672(%r14)
>>>   ^
>>> 1 error generated.
>>> *** [i915_drv.o] Error code 1
>>>
>>> make[3]: stopped in
>>> /usr/ports/graphics/drm-next-kmod/work/kms-drm-a753215/i915
>>> --- i915_gem.o ---
>>> In file included from i915_gem.c:28:
>>> In file included from
>>> /usr/ports/graphics/drm-next-kmod/work/kms-drm-a753215/include/drm/drmP.h:38:
>>> In file included from /usr/src/sys/sys/malloc.h:42:
>>> In file included from /usr/src/sys/sys/systm.h:44:
>>> ./machine/atomic.h:449:29: error: invalid operand for instruction
>>> ATOMIC_ASM(set,  long,  "orq %1,%0",   "ir",  v);
>>>  ^
>>> :1:6: note: instantiated into assembly here
>>>  orq $-9223372036854775808,40672(%r14)
>>>  ^~
>>> 1 error generated.
>>> *** [i915_gem.o] Error code 1
>>>
>>> ___
>>> freebsd-current@freebsd.org mailing list
>>> https://lists.freebsd.org/mailman/listinfo/freebsd-current
>>> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
>>
>>
>> It breaks also graphics/drm-stable-kmod (see PR 229484,
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=229484, same error as you 
>> described
>> above) and also emulators/virtualbox-ose-kmod. As long as CURRENT revision 
>> is < r335873,
>> those kmod compile well.
> 
> We are looking into why both the drm ports fail.
> Regards
> 

I haven't yet tested an amd64 kernel with this, but I think this change to 
sys/amd64/include/atomic.h
might fix it:

Index: atomic.h
===
--- atomic.h(revision 335896)
+++ atomic.h(working copy)
@@ -446,10 +446,10 @@ ATOMIC_ASM(clear,int,   "andl %1,%0",  "ir", ~
 ATOMIC_ASM(add, int,   "addl %1,%0",  "ir",  v);
 ATOMIC_ASM(subtract, int,   "subl %1,%0",  "ir",  v);
 
-ATOMIC_ASM(set,     long,  "orq %1,%0",   "ir",  v);
-ATOMIC_ASM(clear,long,  "andq %1,%0",  "ir", ~v);
-ATOMIC_ASM(add, long,  "addq %1,%0",  "ir",  v);
-ATOMIC_ASM(subtract, long,  "subq %1,%0",  "ir",  v);
+ATOMIC_ASM(set, long,  "orq %1,%0",   "er",  v);
+ATOMIC_ASM(clear,long,  "andq %1,%0",  "er", ~v);
+ATOMIC_ASM(add, long,  "addq %1,%0",  "er",  v);
+ATOMIC_ASM(subtract, long,  "subq %1,%0",  "er",  v);
 
 #defineATOMIC_LOADSTORE(TYPE)  \
ATOMIC_LOAD(TYPE);  \


-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: head -r335782 (?) broke ci.freebsd.org's FreeBSD-head-amd64-gcc build (lib32 part of build)

2018-06-30 Thread John Baldwin
On 6/30/18 10:19 AM, Mark Millard wrote:
> 
> 
> On 2018-Jun-30, at 10:04 AM, Mark Millard  wrote:
> 
>> On 2018-Jun-30, at 9:29 AM, John Baldwin  wrote:
>>
>>> On 6/30/18 9:17 AM, Mark Millard wrote:
>>>> On 2018-Jun-30, at 7:51 AM, John Baldwin  wrote:
>>>>
>>>>> On 6/29/18 2:37 PM, Mark Millard wrote:
>>>>>> [I expect this is more than just amd64-gcc related but that is all
>>>>>> that ci.freebsd.org normally builds via a devel/*-gcc .]
>>>>>
>>>>> As indicated by my other mail, this is i386 and amd64 specific as it
>>>>> only matters for float.h on i386 due to the disagreement on
>>>>> LDBL_MANT_DIG.
>>>>
>>>> I was correct about the search order for include files being
>>>> different before -r335782 vs. -r335782 and later:
>>>
>>> Yes, but this is kind of a feature, not a bug, and the issue there is that
>>> as much as possible we should allow FreeBSD to work with the standard 
>>> headers
>>> that are supposed to be part of the language (and thus provided by the
>>> toolchain).  Right now we don't ship any of the 'std*.h' headers clang
>>> provides for example in our base system clang, though a few months ago I
>>> fixed the one place that was using  instead of
>>>  in userland that was breaking the use of the toolchain-provided
>>> stdarg.h (both GCC and clang).
>>>
>>>> Might this reversal have other effects even for
>>>> architectures for which the code does compile
>>>> via devel/*-gcc ?
>>>
>>> It depends on the header.  This particular failure is due to a quirk of
>>>  on FreeBSD/i386.  I have built other platforms with external
>>> GCC just fine.  To the extent that we encounter any other issues we
>>> should try to make our source more conformant with C and only fall back to
>>> axeing the toolchain-provided language headers as a last resort.
>>
>> It is too bad that the review https://reviews.freebsd.org/D16055 did not
>> catch the change in what headers are used by buildworld and buildkernel.
>> I'd view such switching of long established header bindings as a
>> fairly big deal, possibly even warranting being explicitly proposed and
>> debated.
>>
>> I'm not claiming my opinion on which search order that I have is
>> actually relevant. I'm just now nervous about my powerpc64-gcc based
>> builds having unexpected differences, for example. [I sometimes explore
>> the status of powerpc family builds via more modern toolchains.]
>>
>> (But lib32 for powerpc64 via modern gcc's is messed up anyway,
>> generating code in crtbeginS.o for the wrong ABI: using R30 incorrectly.
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=206123 has more about
>> that.)
> 
> Looks like my being nervous is justified: there is a conflicting altivec.h
> that has nothing to do with C/C++ language standards:
> 
> # ls /usr/local/lib/gcc/powerpc64-unknown-freebsd12.0/6.4.0/include/
> altivec.h htmxlintrin.h   ppc-asm.h   spe.h   
> stdarg.hstddef.hstdint.h  
>   varargs.h
> float.h   iso646.hppu_intrinsics.h
> spu2vmx.h   stdatomic.h stdfix.h
> stdnoreturn.h   vec_types.h
> htmintrin.h   paired.hsi2vmx.h
> stdalign.h  stdbool.h   stdint-gcc.h
> tgmath.h
> 
> I've not checked for other name conflicts vs. FreeBSD. I just happen
> to recognize altivec.h . There is:
> 
> /usr/obj/powerpc64vtsc_xtoolchain-gcc/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp/usr/include/machine/altivec.h
> 
> /usr/obj/powerpc64vtsc_xtoolchain-gcc/powerpc.powerpc64/usr/src/powerpc.powerpc64/tmp/usr/lib/clang/6.0.0/include/altivec.h
> 
> /usr/obj/powerpc64vtsc_xtoolchain-gcc/powerpc.powerpc64/usr/src/powerpc.powerpc64/obj-lib32/tmp/usr/include/machine/altivec.h

Actually, that is a compiler intrinsincs header similar to the ,
etc. headers used for SSE on x86 that are always provided by the compiler.
However, this header is '' not '' so it won't 
conflict.

(On x86, these headers provide the _mm_* functions documented in Intel's
SDM as the official C bindings for vector extensions, and 
probably plays a similar role in providing the vendor-specified C
bindings for altivec instructions.)

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: head -r335782 (?) broke ci.freebsd.org's FreeBSD-head-amd64-gcc build (lib32 part of build)

2018-06-30 Thread John Baldwin
On 6/30/18 9:17 AM, Mark Millard wrote:
> On 2018-Jun-30, at 7:51 AM, John Baldwin  wrote:
> 
>> On 6/29/18 2:37 PM, Mark Millard wrote:
>>> [I expect this is more than just amd64-gcc related but that is all
>>> that ci.freebsd.org normally builds via a devel/*-gcc .]
>>
>> As indicated by my other mail, this is i386 and amd64 specific as it
>> only matters for float.h on i386 due to the disagreement on
>> LDBL_MANT_DIG.
> 
> I was correct about the search order for include files being
> different before -r335782 vs. -r335782 and later:

Yes, but this is kind of a feature, not a bug, and the issue there is that
as much as possible we should allow FreeBSD to work with the standard headers
that are supposed to be part of the language (and thus provided by the
toolchain).  Right now we don't ship any of the 'std*.h' headers clang
provides for example in our base system clang, though a few months ago I
fixed the one place that was using  instead of
 in userland that was breaking the use of the toolchain-provided
stdarg.h (both GCC and clang).

> Might this reversal have other effects even for
> architectures for which the code does compile
> via devel/*-gcc ?

It depends on the header.  This particular failure is due to a quirk of
 on FreeBSD/i386.  I have built other platforms with external
GCC just fine.  To the extent that we encounter any other issues we
should try to make our source more conformant with C and only fall back to
axeing the toolchain-provided language headers as a last resort.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: head -r335782 (?) broke ci.freebsd.org's FreeBSD-head-amd64-gcc build (lib32 part of build)

2018-06-30 Thread John Baldwin
On 6/29/18 2:37 PM, Mark Millard wrote:
> [I expect this is more than just amd64-gcc related but that is all
> that ci.freebsd.org normally builds via a devel/*-gcc .]

As indicated by my other mail, this is i386 and amd64 specific as it
only matters for float.h on i386 due to the disagreement on
LDBL_MANT_DIG.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: head -r335782 (?) broke ci.freebsd.org's FreeBSD-head-amd64-gcc build (lib32 part of build)

2018-06-29 Thread John Baldwin
On 6/28/18 7:54 PM, Mark Millard wrote:
> On 2018-Jun-28, at 6:04 PM, Mark Millard  wrote:
> 
>> On 2018-Jun-28, at 5:39 PM, Mark Millard  wrote:
>>
>>> [ ci.free.bsd.org jumped from -r335773 (built) to -r335784 (failed)
>>> for FreeBSD-head-amd64-gcc. It looked to me like the most likely
>>> breaking-change was the following but I've not tried personal
>>> builds to confirm.
>>> ]

So this is a bit complicated and I'm not sure what the correct fix is.

What is happening is that the  shipped with GCC is now being used
after this change instead of sys/x86/include/float.h.  A sledgehammer approach
would be to remove float.h from the GCC package (we currently don't install
the float.h for the base system clang either).  However, looking at this
in more detail, it seems that x86/include/float.h is also busted in some
ways.

First, the #error I don't understand how it is happening.  The GCC float.h
defines LDBL_MAX_EXP to the __LDBL_MAX_EXP__ builtin which is 16384 just
like the x86 float.h:

# x86_64-unknown-freebsd12.0-gcc -dM -E empty.c -m32 | grep LDBL_MAX_EXP
#define __LDBL_MAX_EXP__ 16384

I even hacked catrigl.c to add the following lines before the #error
check:

LDBL_MAX_EXP_ = LDBL_MAX_EXP
LDBL_MANT_DIG_ = LDBL_MANT_DIG

#if LDBL_MAX_EXP != 0x4000
#error "Unsupported long double format"
#endif

And the -E output is:

DBL_MAX_EXP_ = 16384
LDBL_MANT_DIG_ = 53

# 51 "/zoo/jhb/zoo/jhb/git/freebsd/lib/msun/src/catrigl.c:93:2: error: #error "U
nsupported long double format"
 #error "Unsupported long double format"
  ^

Yet clearly, 16384 == 0x4000 assuming it is doing a numeric comparison (which
it must be since the x86 float.h uses '16384' not '0x4000' as the value).

However, LDBL_MANT_DIG of 53 is a bit more fun.  We have a comment about the
initial FPU control word in sys/amd64/include/fpu.h that reads thus:

/*
 * The hardware default control word for i387's and later coprocessors is
 * 0x37F, giving:
 *
 *  round to nearest
 *  64-bit precision
 *  all exceptions masked.
 *
 * FreeBSD/i386 uses 53 bit precision for things like fadd/fsub/fsqrt etc
 * because of the difference between memory and fpu register stack arguments.
 * If its using an intermediate fpu register, it has 80/64 bits to work
 * with.  If it uses memory, it has 64/53 bits to work with.  However,
 * gcc is aware of this and goes to a fair bit of trouble to make the
 * best use of it.
 *
 * This is mostly academic for AMD64, because the ABI prefers the use
 * SSE2 based math.  For FreeBSD/amd64, we go with the default settings.
 */
#define __INITIAL_FPUCW__   0x037F
#define __INITIAL_FPUCW_I386__  0x127F
#define __INITIAL_NPXCW__   __INITIAL_FPUCW_I386__
#define __INITIAL_MXCSR__   0x1F80
#define __INITIAL_MXCSR_MASK__  0xFFBF

GCC is indeed aware of this in gcc/config/i386/freebsd.h which results in
__LDBL_MANT_DIG__ being set to 53 instead of 64:

/* FreeBSD sets the rounding precision of the FPU to 53 bits.  Let the
   compiler get the contents of  and std::numeric_limits correct.  */
#undef TARGET_96_ROUND_53_LONG_DOUBLE
#define TARGET_96_ROUND_53_LONG_DOUBLE (!TARGET_64BIT)

clang seems unaware of this as it reports all the same values for
LDBL_MIN/MAX for both amd64 and i386 (values that match GCC for amd64
but not i386):

# cc -dM -E empty.c | egrep 'LDBL_(MIN|MAX)__'
#define __LDBL_MAX__ 1.18973149535723176502e+4932L
#define __LDBL_MIN__ 3.36210314311209350626e-4932L
# cc -dM -E empty.c -m32 | egrep 'LDBL_(MIN|MAX)__'
#define __LDBL_MAX__ 1.18973149535723176502e+4932L
#define __LDBL_MIN__ 3.36210314311209350626e-4932L
# x86_64-unknown-freebsd12.0-gcc -dM -E empty.c | egrep 'LDBL_(MIN|MAX)__'
#define __LDBL_MAX__ 1.18973149535723176502e+4932L
#define __LDBL_MIN__ 3.36210314311209350626e-4932L
# x86_64-unknown-freebsd12.0-gcc -dM -E empty.c -m32 | egrep 'LDBL_(MIN|MAX)__'
#define __LDBL_MAX__ 1.1897314953572316e+4932L
#define __LDBL_MIN__ 3.3621031431120935e-4932L

The x86/include/float.h header though reports the MIN/MAX values somewhere
in between the two ranges for both amd64 and i386 while reporting an
LDBL_MANT_DIG of 64:

#define LDBL_MANT_DIG   64
#define LDBL_MIN3.3621031431120935063E-4932L
#define LDBL_MAX1.1897314953572317650E+4932L

I guess for now I will remove float.h from the amd64-gcc pkg-plist, but we
should really be fixing our tree to work with compiler-provided language
headers when at all possible.  It's not clear to me if amd64 should be
using the compiler provided values of things like LDBL_MIN/MAX either btw.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: head -r335782 (?) broke ci.freebsd.org's FreeBSD-head-amd64-gcc build (lib32 part of build)

2018-06-29 Thread John Baldwin
i.freebsd.org/job/FreeBSD-head-amd64-gcc/6331/consoleText
> there is:
> 
>> Updating FreeBSD repository catalogue...
>> FreeBSD repository is up to date.
>> All repositories are up to date.
>> The following 6 package(s) will be affected (of 0 checked):
>>
>> New packages to be INSTALLED:
>>  amd64-xtoolchain-gcc: 0.4_1
>>  amd64-gcc: 6.4.0
>>  mpfr: 4.0.1
>>  gmp: 6.1.2
>>  mpc: 1.1.0_1
>>  amd64-binutils: 2.30_3,1
> 
> and amd64-gcc being 6.4.0 (via powerpc64-gcc) is from -r466834
> (via looking up in https://svnweb.freebsd.org/ports/head/devel/ ).
> 
> This indicates that -r465416 and -r466701 did not cause:
> 
> --sysroot=/workspace/obj/workspace/src/amd64.amd64/obj-lib32/tmp
> 
> to lead to include files being looked up in:
> 
> /workspace/obj/workspace/src/amd64.amd64/obj-lib32/tmp/usr/include
> 
> Thus there appears to still be a need for:
> 
> -isystem /workspace/obj/workspace/src/amd64.amd64/obj-lib32/tmp/usr/include
> 
> unless more is done to the devel/*-gcc to make them look
> in that additional place automatically (based on --sysroot).

--sysroot does work, and you can verify it by doing the following:

% touch empty.c
% x86_64-unknown-freebsd11.2-gcc -c -v empty.c
Using built-in specs.
COLLECT_GCC=x86_64-unknown-freebsd11.2-gcc
Target: x86_64-unknown-freebsd11.2
...
ignoring nonexistent directory 
"/usr/local/lib/gcc/x86_64-unknown-freebsd11.2/6.4.0/include-fixed"
ignoring nonexistent directory 
"/usr/local/lib/gcc/x86_64-unknown-freebsd11.2/6.4.0/../../../../x86_64-unknown-freebsd11.2/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/lib/gcc/x86_64-unknown-freebsd11.2/6.4.0/include
 /usr/include
End of search list.
...
% x86_64-unknown-freebsd11.2-gcc -c -v empty.c  --sysroot=/foo
Using built-in specs.
COLLECT_GCC=x86_64-unknown-freebsd11.2-gcc
Target: x86_64-unknown-freebsd11.2
...
ignoring nonexistent directory 
"/usr/local/lib/gcc/x86_64-unknown-freebsd11.2/6.4.0/include-fixed"
ignoring nonexistent directory 
"/usr/local/lib/gcc/x86_64-unknown-freebsd11.2/6.4.0/../../../../x86_64-unknown-freebsd11.2/include"
ignoring nonexistent directory "/foo/usr/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/lib/gcc/x86_64-unknown-freebsd11.2/6.4.0/include
End of search list.

I will see if I can reproduce the failure locally.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: TSC calibration in virtual machines

2018-06-27 Thread John Baldwin
On 6/27/18 12:14 AM, Andriy Gapon wrote:
> 
> It seems that TSC calibration in virtual machines sometimes can do more harm
> than good.  Should we default to trusting the information provided by a 
> hypervisor?
> 
> Specifically, I am observing a problem on GCE instances where calibrated TSC
> frequency is about 10% lower than advertised frequency.  And apparently the
> advertised frequency is the right one.
> 
> I found this thread with similar reports and a variety of workarounds from
> administratively disabling the calibration to switching to a different 
> timecounter:
> https://lists.freebsd.org/pipermail/freebsd-cloud/2017-January/80.html

I suspect you are probably right that we should just "trust" TSC frequencies
provided by a hypervisor.  We could perhaps choose to whitelist hypervisors
known to provide accurate values if we wanted to be cautious.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Resume without drm driver results in black screen

2018-06-15 Thread John Baldwin
On 5/17/18 3:01 AM, Johannes Lundberg wrote:
> Hi
> 
> I revived this old bug
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=213501
> 
> Considering this also affects all X users using scfb driver it's worth
> investigating.

It's just not doable.  You need some sort of driver for the GPU that knows
how to turn the display back on.  That isn't a portable thing, it's 
GPU-specific.
You could perhaps write smaller drivers that only support resume and not
graphics acceleration, but that doesn't seem trivial.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: kgdb crashing on a vmcore with dumptid = 0

2018-06-15 Thread John Baldwin
id);
> 449 }
> 450
> 451 /* See common/common-regcache.h.  */
> 452
> gdb$ p inferior_ptid
> $13 = {
>   m_pid = 0x0,
>   m_lwp = 0x0,
>   m_tid = 0x0
> }
> 
> 
> gdb$ up
> #15 0x00713077 in kgdb_trgt_open (arg=0x80410900e "vmcore.2",
> from_tty=0x1) at fbsd-kvm.c:335
> 335 target_fetch_registers (get_current_regcache (), -1);
> gdb$ list
> 330 kt = kgdb_thr_next(kt);
> 331 }
> 332 if (curkthr != 0)
> 333 inferior_ptid = fbsd_vmcore_ptid(curkthr->tid);
> 334
> 335 target_fetch_registers (get_current_regcache (), -1);
> 336
> 337 reinit_frame_cache ();
> 338 print_stack_frame (get_selected_frame (NULL), 0,
> SRC_AND_LOC, 1);
> 339 }
> gdb$ p inferior_ptid
> $17 = {
>   m_pid = 0x0,
>   m_lwp = 0x0,
>   m_tid = 0x0
> }
> gdb$ p curkthr
> $18 = (kthr *) 0x0
> 
> gdb$ frame
> Stack level 15, frame at 0x7fffbd90:
>  rip = 0x713077 in kgdb_trgt_open (fbsd-kvm.c:335); saved rip = 0xbf3980
>  called by frame at 0x7fffbdc0, caller of frame at 0x7fffbc40
>  source language c++.
>  Arglist at 0x7fffbd80, args: arg=0x80410900e "vmcore.2", from_tty=0x1
>  Locals at 0x7fffbd80, Previous frame's sp is 0x7fffbd90
>  Saved registers:
>   rbp at 0x7fffbd80, rip at 0x7fffbd88
> arg = 0x80410900e "vmcore.2"
> from_tty = 0x1
> ops = 0x8043ef840
> inf = 0x80442de80
> old_chain = 0x804431820
> ti = 0x7fffd550
> kt = 0x0
> nkvm = 0x804363800
> temp = 0x8047f33b0 "/home/eax/crashes/aes_gpault_crash/vmcore.2"
> kernel = 0x8043dec80 "/home/eax/crashes/aes_gpault_crash/kernel/kernel"
> filename = 0x8047f33b0 "/home/eax/crashes/aes_gpault_crash/vmcore.2"
> ontop = 0x0
> 
> gdb$ p curkthr
> $19 = (kthr *) 0x0
> 
> which is coming from
> 
> curkthr = kgdb_thr_lookup_tid(dumptid);
> if (curkthr == NULL)
> curkthr = first;

Notice this fall back though.  If dumptid isn't valid this falls back to
just picking the first thread.  Instead, it seems that in your dump kgdb
didn't find any kthreads at all.  Is 'kernel' a stripped kernel without
any debug symbols?  You should try single-stepping kgdb_thr_init to see
if it finds valid offsets of structure members in 'struct thread' and
'struct proc'.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Current panic on boot on H11DSI motherboard with epyc cpu (nexus_add_irq: failed)

2018-04-18 Thread John Baldwin
On Wednesday, April 18, 2018 01:56:49 PM Vitalij Satanivskij wrote:
> JB> > If you need any aditional information please tell me about. 
> JB> 
> JB> Can you perhaps turn off the stack trace on boot to not lose the panic 
> messages
> JB> (remove KDB_TRACE from kernel config) and maybe modify the panic message 
> to
> JB> include the IRQ number passed to nexus_add_irq?
> 
> 
> Hm looks like it's always irq with number 256
> eg hpet - 256 
> igb - 256 
> 
> Chenged made for it was
> 
> Index: sys/x86/x86/nexus.c
> ===
> --- sys/x86/x86/nexus.c (revision 332663)
> +++ sys/x86/x86/nexus.c (working copy)
> @@ -698,7 +698,7 @@
>  {
>  
> if (rman_manage_region(_rman, irq, irq) != 0)
> -   panic("%s: failed", __func__);
> +   panic("%s: failed irq is: %lu", __func__, irq);
>  }

O, this is a different issue.  Sorry.  As a hack, try changing
'FIRST_MSI_INT' to 512 in sys/amd64/include/intr_machdep.h.  The issue
is that some systems now include more than 256 interrupt pins on I/O
APICs, so IRQ 256 is already reserved for use by one of those
interrupt pins.  The real fix is that I need to make FIRST_MSI_INT
dynamic instead of a constant and just define it as the first free IRQ
after the I/O APICs have probed.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Current panic on boot on H11DSI motherboard with epyc cpu (nexus_add_irq: failed)

2018-04-17 Thread John Baldwin
On Tuesday, April 17, 2018 10:15:53 PM Vitalij Satanivskij wrote:
> Dear John
> 
> I'm try patch with no success
> 
> http://hell.ukr.net/panic/recorder_patch165.webm
> 
> Also I'm enable verbose boot and record boot process (hpet was disabled so 
> crash in another driver atach)
> http://hell.ukr.net/panic/recorder_patch_verbose.webm
> 
> root@test:/usr/src # svnlite diff
> Index: sys/x86/x86/msi.c
> ===
> --- sys/x86/x86/msi.c   (revision 332650)
> +++ sys/x86/x86/msi.c   (working copy)
> @@ -404,7 +404,7 @@
> /* Do we need to create some new sources? */
> if (cnt < count) {
> /* If we would exceed the max, give up. */
> -   if (i + (count - cnt) > FIRST_MSI_INT + NUM_MSI_INTS) {
> +   if (i + (count - cnt) >= FIRST_MSI_INT + NUM_MSI_INTS) {
> mtx_unlock(_lock);
> free(mirqs, M_MSI);
> return (ENXIO);
> @@ -645,7 +645,7 @@
> /* Do we need to create a new source? */
> if (msi == NULL) {
> /* If we would exceed the max, give up. */
> -   if (i + 1 > FIRST_MSI_INT + NUM_MSI_INTS) {
> +   if (i + 1 >= FIRST_MSI_INT + NUM_MSI_INTS) {
> mtx_unlock(_lock);
> return (ENXIO);
> }
> root@test:/usr/src
> 
> If you need any aditional information please tell me about. 

Can you perhaps turn off the stack trace on boot to not lose the panic messages
(remove KDB_TRACE from kernel config) and maybe modify the panic message to
include the IRQ number passed to nexus_add_irq?

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


panic: VM object not locked in vm_page_ps_test()

2018-04-17 Thread John Baldwin
My laptop running recent head panicked this morning, apparently from hitting
a key to stop the screensaver (at which point xscreensaver prompts for a
password to unlock).


-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: panic: VM object not locked in vm_page_ps_test()

2018-04-17 Thread John Baldwin
On Tuesday, April 17, 2018 10:01:41 AM John Baldwin wrote:
> My laptop running recent head panicked this morning, apparently from hitting
> a key to stop the screensaver (at which point xscreensaver prompts for a
> password to unlock).

(Sorry, buggy mail client sent this early)

panic: Lock vm object not locked @ /usr/src/sys/vm/vm_page.c:4135

#4  0x805e4893 in panic (fmt=)
at /usr/src/sys/kern/kern_shutdown.c:764
#5  0x805dff22 in __rw_assert (c=, 
what=, file=, line=)
at /usr/src/sys/kern/kern_rwlock.c:1397
#6  0x80882723 in vm_page_ps_test (m=0xf80431c2e980, flags=7, 
skip_m=0xf80431c34890) at /usr/src/sys/vm/vm_page.c:4135
#7  0x80867d84 in vm_fault_soft_fast (vaddr=, 
prot=, fault_type=, 
fault_flags=, wired=0, fs=, 
m_hold=) at /usr/src/sys/vm/vm_fault.c:307
#8  vm_fault_hold (map=0xf8000832a000, vaddr=, 
fault_type=, fault_flags=, m_hold=0x0)
at /usr/src/sys/vm/vm_fault.c:610
#9  0x80866cf5 in vm_fault (map=0xf8000832a000, 
vaddr=, fault_type=2 '\002', fault_flags=0)
at /usr/src/sys/vm/vm_fault.c:514
#10 0x808bc64c in trap_pfault (frame=0xfe008b1dbac0, usermode=1)
at /usr/src/sys/amd64/amd64/trap.c:728
#11 0x808bbe1e in trap (frame=0xfe008b1dbac0)
#12 
#13 0x000805b51556 in ?? ()

(kgdb) frame 6
#6  0x80882723 in vm_page_ps_test (m=0xf80431c2e980, flags=7, 
skip_m=0xf80431c34890) at /usr/src/sys/vm/vm_page.c:4135
(kgdb) l
4130{
4131vm_object_t object;
4132int i, npages;
4133
4134object = m->object;
4135VM_OBJECT_ASSERT_LOCKED(object);
4136npages = atop(pagesizes[m->psind]);
4137
4138/*
4139 * The physically contiguous pages that make up a superpage, 
i.e., a
(kgdb) p m->object
$1 = (vm_object_t) 0xf80190785900
(kgdb) p pagesizes[m->psind]
$3 = 2097152
(kgdb) up
#7  0x80867d84 in vm_fault_soft_fast (vaddr=, 
prot=, fault_type=, 
fault_flags=, wired=0, fs=, 
m_hold=) at /usr/src/sys/vm/vm_fault.c:307
307 if (vm_page_ps_test(m_super, flags, m)) {
(kgdb) p m->object
$4 = (vm_object_t) 0xf80190116a00
(kgdb) p/x m->flags
$5 = 0x0

So 'm' (original page fault page) and 'm_super' are from different VM
objects.  Why are they part of the same reservation?

(kgdb) p m->phys_addr >> (9 + 12)
$7 = 4514
(kgdb) p vm_reserv_array[$7]
$8 = {lock = {lock_object = {lo_name = 0x8099112c "vm reserv", 
  lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 0}, 
  partpopq = {tqe_next = 0x0, tqe_prev = 0xf80423656680}, objq = {
le_next = 0xf8042365b0c0, le_prev = 0xf80190116ab8}, 
  object = 0xf80190116a00, pindex = 1760, pages = 0xf80431c2e980, 
  domain = 0, popcnt = 512, inpartpopq = 0 '\000', popmap = {
18446744073709551615, 18446744073709551615, 18446744073709551615, 
18446744073709551615, 18446744073709551615, 18446744073709551615, 
18446744073709551615, 18446744073709551615}}
(kgdb) set $rv = vm_reserv_array[$7]
(kgdb) p $rv.object
$9 = (vm_object_t) 0xf80190116a00

So rv->object matches m->object ($4) but not m_super->object ($1).
Double-checking:

(kgdb) p m_super->object
$10 = (vm_object_t) 0xf80190785900

Other conditions in vm_reserv_to_superpage() are true:

(kgdb) p $rv.pages
$11 = (vm_page_t) 0xf80431c2e980
(kgdb) p m_super
$12 = (vm_page_t) 0xf80431c2e980
(kgdb) p $rv.popcnt
$13 = 512

Both objects are OBJT_DEFAULT objects:

(kgdb) p *m->object
$14 = {lock = {lock_object = {lo_name = 0x8095e7ce "vm object", 
  lo_flags = 627245056, lo_data = 0, lo_witness = 0x0}, rw_lock = 41}, 
  object_list = {tqe_next = 0xf80190116b00, 
tqe_prev = 0xf80190116920}, shadow_head = {lh_first = 0x0}, 
  shadow_list = {le_next = 0x0, le_prev = 0xf80190785930}, memq = {
tqh_first = 0xf80431ddf878, tqh_last = 0xf80431e2a900}, rtree = {
rt_root = 18446735284333515328}, size = 2829, domain = {dr_policy = 0x0, 
dr_iterator = 0}, generation = 1, ref_count = 3, shadow_count = 0, 
  memattr = 6 '\006', type = 0 '\000', flags = 12352, pg_color = 1824, 
  paging_in_progress = 1, resident_page_count = 1024, 
  backing_object = 0xf80190785900, backing_object_offset = 0, 
  pager_object_list = {tqe_next = 0x0, tqe_prev = 0x0}, rvq = {
lh_first = 0xf80423659540}, handle = 0x0, un_pager = {vnp = {
  vnp_size = 0, writemappings = 0}, devp = {devp_pglist = {
tqh_first = 0x0, tqh_last = 0x0}, ops = 0x0, dev = 0x0}, sgp = {
  sgp_pglist = {tqh_first = 0x0, tqh_last = 0x0}}, swp = {swp_tmpfs = 0x0, 
  swp_blks = {pt_root = 0}}}, cred = 0xf80008d99500, 
  charge = 11587584, umtx_data = 0x0}
(kgdb) p *m_super->object
$15 = {lock = {lock_object = {lo_name = 0x8095e7ce "vm object", 
  lo_flags =

Re: Current panic on boot on H11DSI motherboard with epyc cpu (nexus_add_irq: failed)

2018-04-17 Thread John Baldwin
On Monday, April 16, 2018 10:12:13 PM Vitalij Satanivskij wrote:
> 
> igb0@pci0:1:0:0:class=0x02 card=0x152115d9 chip=0x15218086 
> rev=0x01 hdr=0x00
> vendor = 'Intel Corporation'
> device = 'I350 Gigabit Network Connection'
> class  = network
> subclass   = ethernet
> cap 01[40] = powerspec 3  supports D0 D3  current D0
> cap 05[50] = MSI supports 1 message, 64 bit, vector masks
> cap 11[70] = MSI-X supports 10 messages
>  Table in map 0x1c[0x0], PBA in map 0x1c[0x2000]
> cap 10[a0] = PCI-Express 2 endpoint max data 512(512) FLR RO NS
>  link x4(x4) speed 5.0(5.0) ASPM L1(L0s/L1)
> ecap 0001[100] = AER 2 0 fatal 0 non-fatal 1 corrected
> ecap 0003[140] = Serial 1 ac1f6b620e0c
> ecap 000e[150] = ARI 1
> ecap 0010[160] = SR-IOV 1 IOV disabled, Memory Space disabled, ARI 
> disabled
>  0 VFs configured out of 8 supported
>  First VF RID Offset 0x0180, VF RID Stride 0x0004
>  VF Device ID 0x1520
>  Page Sizes: 4096 (enabled), 8192, 65536, 262144, 
> 1048576, 4194304
> ecap 0017[1a0] = TPH Requester 1
> ecap 0018[1c0] = LTR 1
> ecap 000d[1d0] = ACS 1
> 
> It's info from system booted with HPET disabled and 
> hw.pci.enable_msix: 0
> hw.pci.enable_msi: 0
> 
> If one of this parameters not set as described system not boot ^( 

Please try the patch from here https://reviews.freebsd.org/P165

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Duplicate free in of file caps data

2018-04-09 Thread John Baldwin
I updated my laptop to HEAD as of Friday and got the following panic
after a bhyve process using capabilities exited:

panic: Duplicate free of 0xf8039515eba0 from zone 0xf8000200e540(16) 
slab 0xf8039515ef90(186)
...
(kgdb) where
#0  __curthread () at ./machine/pcpu.h:230
#1  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:361
#2  0x805e42e2 in kern_reboot (howto=260)
at /usr/src/sys/kern/kern_shutdown.c:441
#3  0x805e484d in vpanic (fmt=, ap=0xfe008b2f4700)
at /usr/src/sys/kern/kern_shutdown.c:837
#4  0x805e4893 in panic (fmt=)
at /usr/src/sys/kern/kern_shutdown.c:764
#5  0x80862a37 in uma_dbg_free (zone=0xf8000200e540, 
slab=0xf8039515ef90, item=0xf8039515eba0)
at /usr/src/sys/vm/uma_core.c:3931
#6  0x80862247 in uma_zfree_arg (zone=0xf8000200e540, 
item=, udata=0xf8039515ef90)
at /usr/src/sys/vm/uma_core.c:2876
#7  0x805bf715 in free (addr=0xf8039515eba0, 
mtp=0x80c95ec0 ) at /usr/src/sys/kern/kern_malloc.c:711
#8  0x805923ba in filecaps_free (fcaps=)
at /usr/src/sys/kern/kern_descrip.c:1580
#9  fdefree_last (fde=) at /usr/src/sys/kern/kern_descrip.c:297
#10 fdescfree_fds (td=0xf8039a484000, fdp=0xf8039acfe000, 
needclose=true) at /usr/src/sys/kern/kern_descrip.c:2242
#11 0x80591f00 in fdescfree (td=0xf8039a484000)
at /usr/src/sys/kern/kern_descrip.c:2307
#12 0x805a0940 in exit1 (td=0xf8039a484000, rval=, 
signo=0) at /usr/src/sys/kern/kern_exit.c:378
#13 0x805a044d in sys_sys_exit (td=, uap=)
at /usr/src/sys/kern/kern_exit.c:180
#14 0x808bd2e9 in syscallenter (td=0xf8039a484000)
at /usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:134
#15 amd64_syscall (td=0xf8039a484000, traced=0)
at /usr/src/sys/amd64/amd64/trap.c:936
#16 
#17 0x000800ae3eda in ?? ()
(kgdb) frame 8
#8  0x805923ba in filecaps_free (fcaps=)
at /usr/src/sys/kern/kern_descrip.c:1580
1580free(fcaps->fc_ioctls, M_FILECAPS);

Note that I am using a patched bhyve that uses cap_ioctls_limit() on a listen
socket (so the caps will be copied to the new socket during accept()).

I'll see if I can't come up with a simpler program to reproduce this.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: "Could not allocate I/O space" and "intsmb0 attach returned 6" in a under-Hyper-V context on Ryzen Threadripper: Is this expected?

2018-04-09 Thread John Baldwin
On Sunday, April 01, 2018 02:23:36 PM Mark Millard wrote:
> For:
> 
> # uname -apKU
> FreeBSD FBSDHUGE 12.0-CURRENT FreeBSD 12.0-CURRENT  r331831M  amd64 amd64 
> 1200060 1200060
> 
> I get:
> 
> . . .
> pci0:  at device 7.3 (no driver attached)
> . . .
> intsmb0:  at device 7.3 on pci0
> intsmb0: Could not allocate I/O space
> device_attach: intsmb0 attach returned 6
> 
> on a Ryzen Threadripper 1950X where FreeBSD is being run under
> Hyper-V (on a Windows 10 Pro machine).
> 
> Is this expected? Did I misconfigure something in Hyper-V?

That seems like an odd device to have for an AMD machine.  I suspect that this 
has never
worked and the module started auto-loading due to devmatch.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Can't load linux64.ko module

2018-04-09 Thread John Baldwin
On Wednesday, April 04, 2018 02:34:53 PM Steve Kargl wrote:
> On Wed, Apr 04, 2018 at 02:13:15PM -0700, Steve Kargl wrote:
> > 
> > OK, so where is elf64_linux_vdso_fixup suppose to come from?
> > 
> 
> The answer is compat/linux/linux_vdso.c where we find
> 
> #if defined(__i386__) || (defined(__amd64__) && defined(COMPAT_LINUX32))
> #define __ELF_WORD_SIZE 32
> #else
> #define __ELF_WORD_SIZE 64
> #endif
> 
> having COMPAT_LINUX32 in my kernel config file gives me
> elf32_linux_vdso_fixup.  It seems that one cannot have
> a kernel that supports both 32 and 64-bit linux software.
> 
> linux(4) states
> 
>  for an amd64 kernel use:
> 
>options COMPAT_LINUX32
> 
>  Alternatively, to load the ABI as a module at boot time, place the
>  following line in loader.conf(5):
> 
>linux_load="YES"
> 
> It turns out that I have 'linux_load=YES" in /etc/loader.conf.
> When I boot the kernel built with COMPAT_LINUX32 prevents 
> the kldload of linux64.ko.
> 
> Oh well, learn something new everyday.

The Right Way to fix this is probably to have linux_vdso32.c and
linux_vdso64.c that #include linux_vdso.c after setting
ELF_WORD_SIZE similar to how sys/kern/imgact_elf.c works.
Then the COMPAT_LINUX and linux64.ko modules would include
linux_vdso64.c and COMPAT_LINUX32 and linux32.ko modules
(and linux.ko on i386) would include linux_vdso32.c.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: problem with [intr{swi4: clock (0)}]

2018-03-23 Thread John Baldwin
On Wednesday, March 21, 2018 11:36:48 AM AN wrote:
> Hi:
> 
> I would appreciate any help with this issue, this is a new machine built 
> in the last week and if it is a hardware issue I want to return it.  The 
> problem seems to have started in the last 24 hours or so.  I am seeing a 
> really high cpu utilization for [intr{swi4: clock (0)}].  I have tried a 
> couple things to troubleshoot:

I would try using dtrace to figure out which functions are running in the
callout thread.  I've cc'd a couple of folks in case they already have dtrace
scripts to do this.  You would probably want a script that watched
callout_execute::callout-start and callout_execute::callout-end events.  You
would want to save the start time in callout-start and then report a delta
along with the values of 'c->c_func' (the last argument to these probes is
'c').  You might be able to just store the time delta in an aggregate that is
keyed on the function.  Actually, I've gone ahead and written a little
script:


callout_execute:::callout-start
{
self->start = timestamp;
self->func = args[0]->c_func;
@funcs[self->func] = count();
}

callout_execute:::callout-end
{
@functimes[self->func] = sum(timestamp - self->start);
}

END
{
printf("\n\nCallout function counts:\n");
printa("%@8u %a\n", @funcs);
printf("\nCallout function runtime:\n");
printa("%@d %a\n", @functimes);
}


Store this in a file named 'callout.d' and then run 'dtrace -s callout.d'.
Let it run for a second or two and then use Ctrl-C to stop it.

The first table it will output is a histogram showing how many times
different functions were invoked.   The second table will count how much
total time was spent in each function:

CPU IDFUNCTION:NAME
  4  2 :END 

Callout function counts:
   2 kernel`kbdmux_kbd_intr_timo
   2 kernel`usb_power_wdog
   2 kernel`ipport_tick
   2 kernel`tcp_timer_delack
   2 kernel`nd6_timer
   2 kernel`key_timehandler
   2 dtrace.ko`dtrace_state_deadman
   4 kernel`newnfs_timer
   4 kernel`pfslowtimo
  10 kernel`logtimeout
  10 kernel`pffasttimo
  18 kernel`lim_cb
  32 kernel`iflib_timer
  84 kernel`sleepq_timeout
 224 dtrace.ko`dtrace_state_clean

Callout function runtime:
2080 kernel`logtimeout
2198 kernel`kbdmux_kbd_intr_timo
2890 kernel`ipport_tick
3550 kernel`iflib_timer
3672 kernel`lim_cb
3936 kernel`pffasttimo
4023 dtrace.ko`dtrace_state_clean
4224 kernel`newnfs_timer
4751 kernel`key_timehandler
5286 kernel`nd6_timer
6700 kernel`usb_power_wdog
7341 kernel`pfslowtimo
19607 kernel`tcp_timer_delack
20273 dtrace.ko`dtrace_state_deadman
32262 kernel`sleepq_timeout

You can use this to figure out which timer events are using CPU in the
softclock thread/process.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: pkg does not recognize correct kernel version

2018-03-02 Thread John Baldwin
On Thursday, March 01, 2018 02:02:58 PM Konstantin Belousov wrote:
> On Wed, Feb 28, 2018 at 03:32:43PM -0800, John Baldwin wrote:
> > On Wednesday, February 28, 2018 09:45:47 PM Konstantin Belousov wrote:
> > > On Wed, Feb 28, 2018 at 10:57:53AM -0800, John Baldwin wrote:
> > > > On Tuesday, February 20, 2018 10:19:02 AM Conrad Meyer wrote:
> > > > > On Mon, Feb 19, 2018 at 2:38 PM, Ronald Klop <ronald-li...@klop.ws> 
> > > > > wrote:
> > > > > > On Mon, 19 Feb 2018 22:05:51 +0100, Konstantin Belousov
> > > > > > <kostik...@gmail.com> wrote:
> > > > > >
> > > > > >> Look at the man page.  pkg reads version from the /bin/sh ELF 
> > > > > >> FreeBSD
> > > > > >
> > > > > >
> > > > > > Which man page? I can't find it in pkg help update or pkg help 
> > > > > > upgrade or
> > > > > > man pkg.
> > > > > 
> > > > > I had to dig for quite a while to find a reference (pkg.conf(5)):
> > > > > 
> > > > >  ABI: string  The ABI of the package you want to install.  
> > > > > Default:
> > > > >   derived from the ABI of the /bin/sh binary.
> > > > > 
> > > > > >> version note:
> > > > > >> orion% file /bin/ls
> > > > > >> /bin/ls: ELF 64-bit LSB executable, x86-64, version 1 (FreeBSD),
> > > > > >> dynamically linked, interpreter /libexec/ld-elf.so.1, for FreeBSD 
> > > > > >> 11.1
> > > > > >> (1101506), FreeBSD-style, stripped
> > > > > >>
> > > > > >> Update world past the __FreeBSD_version which is reported for the
> > > > > >> repository.
> > > > > >
> > > > > >
> > > > > > Does this mean I always have to do a *clean* buildworld after every 
> > > > > > version
> > > > > > bump? This takes ages.
> > > > > 
> > > > > You could also do a -DNO_CLEAN buildworld.
> > > > > 
> > > > > Or you can continue to override with "-o OSVERSION=foo", although that
> > > > > may eventually result in broken packages.  In general the OSVERSION is
> > > > > bumped conservatively (more often than will actually result in
> > > > > breakage), so you can get away with the easy workaround for a while
> > > > > between buildworlds.
> > > > 
> > > > NO_CLEAN=yes doesn't work.  A clean buildworld is required.  The reason 
> > > > is that
> > > > the __FreeBSD_version embedded in binaries is stored in 
> > > > /usr/lib/crt*.o, but
> > > > that the dependency rules in lib/csu/Makefile do not rebuild these .o 
> > > > files
> > > > everytime  changes (so a NO_CLEAN=yes buildworld won't 
> > > > rebuild them
> > > > leaving them with a stale version).  Furthermore, when binaries and 
> > > > shared
> > > > libraries are built, our Makefiles do not specify that the relevant
> > > > /usr/lib/crt*.o files are dependencies, so even if we fixed the missing
> > > >  dependency, no binaries would relink to pick up the 
> > > > updated
> > > > __FreeBSD_version file unless some other input to the binary changed.  
> > > > This
> > > > one could perhaps be mostly mitigated by forcing libc to depend on the
> > > > relevant crt*.o files explicitly (or even having it depend on 
> > > > 
> > > > to force relinking of everything when  changes).
> > > libc already depends on sys/param.h.
> > 
> > Hmm, even when I removed /usr/obj/usr/src/lib/csu entirely and then did a 
> > buildworld
> > NO_CLEAN=yes recently /bin/sh was not relinked, though perhaps at that point
> > libc already thought it was up-to-date relative to  from the 
> > previous
> > build.
> > 
> > > I think it would be enough to specify that crt1.o depends on sys/param.h
> > > as well. Although it is also strange, because e.g. for amd64 the dep
> > > thread is csu/amd64/crt1.c->csu/common/crtbrand.c->sys/param.h, which 
> > > should
> > > be detected by the include file calculation.
> > 
> > I think the detour via assembly + sed is what breaks the dependency chain.
> > FWIW, I found that on at least MIPS with clang I did not need the 
> > SED_FIX_NOTE
> > hack.
> 
> Perhaps the FIX_NOTE should be re-evaluated for all the changes happen in
> the toolchains since the hack was needed.

I believe modern GCC still needs the hack unfortunately. :(

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: pkg does not recognize correct kernel version

2018-02-28 Thread John Baldwin
On Wednesday, February 28, 2018 09:45:47 PM Konstantin Belousov wrote:
> On Wed, Feb 28, 2018 at 10:57:53AM -0800, John Baldwin wrote:
> > On Tuesday, February 20, 2018 10:19:02 AM Conrad Meyer wrote:
> > > On Mon, Feb 19, 2018 at 2:38 PM, Ronald Klop <ronald-li...@klop.ws> wrote:
> > > > On Mon, 19 Feb 2018 22:05:51 +0100, Konstantin Belousov
> > > > <kostik...@gmail.com> wrote:
> > > >
> > > >> Look at the man page.  pkg reads version from the /bin/sh ELF FreeBSD
> > > >
> > > >
> > > > Which man page? I can't find it in pkg help update or pkg help upgrade 
> > > > or
> > > > man pkg.
> > > 
> > > I had to dig for quite a while to find a reference (pkg.conf(5)):
> > > 
> > >  ABI: string  The ABI of the package you want to install.  
> > > Default:
> > >   derived from the ABI of the /bin/sh binary.
> > > 
> > > >> version note:
> > > >> orion% file /bin/ls
> > > >> /bin/ls: ELF 64-bit LSB executable, x86-64, version 1 (FreeBSD),
> > > >> dynamically linked, interpreter /libexec/ld-elf.so.1, for FreeBSD 11.1
> > > >> (1101506), FreeBSD-style, stripped
> > > >>
> > > >> Update world past the __FreeBSD_version which is reported for the
> > > >> repository.
> > > >
> > > >
> > > > Does this mean I always have to do a *clean* buildworld after every 
> > > > version
> > > > bump? This takes ages.
> > > 
> > > You could also do a -DNO_CLEAN buildworld.
> > > 
> > > Or you can continue to override with "-o OSVERSION=foo", although that
> > > may eventually result in broken packages.  In general the OSVERSION is
> > > bumped conservatively (more often than will actually result in
> > > breakage), so you can get away with the easy workaround for a while
> > > between buildworlds.
> > 
> > NO_CLEAN=yes doesn't work.  A clean buildworld is required.  The reason is 
> > that
> > the __FreeBSD_version embedded in binaries is stored in /usr/lib/crt*.o, but
> > that the dependency rules in lib/csu/Makefile do not rebuild these .o files
> > everytime  changes (so a NO_CLEAN=yes buildworld won't rebuild 
> > them
> > leaving them with a stale version).  Furthermore, when binaries and shared
> > libraries are built, our Makefiles do not specify that the relevant
> > /usr/lib/crt*.o files are dependencies, so even if we fixed the missing
> >  dependency, no binaries would relink to pick up the updated
> > __FreeBSD_version file unless some other input to the binary changed.  This
> > one could perhaps be mostly mitigated by forcing libc to depend on the
> > relevant crt*.o files explicitly (or even having it depend on 
> > to force relinking of everything when  changes).
> libc already depends on sys/param.h.

Hmm, even when I removed /usr/obj/usr/src/lib/csu entirely and then did a 
buildworld
NO_CLEAN=yes recently /bin/sh was not relinked, though perhaps at that point
libc already thought it was up-to-date relative to  from the 
previous
build.

> I think it would be enough to specify that crt1.o depends on sys/param.h
> as well. Although it is also strange, because e.g. for amd64 the dep
> thread is csu/amd64/crt1.c->csu/common/crtbrand.c->sys/param.h, which should
> be detected by the include file calculation.

I think the detour via assembly + sed is what breaks the dependency chain.
FWIW, I found that on at least MIPS with clang I did not need the SED_FIX_NOTE
hack.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: pkg does not recognize correct kernel version

2018-02-28 Thread John Baldwin
On Tuesday, February 20, 2018 10:19:02 AM Conrad Meyer wrote:
> On Mon, Feb 19, 2018 at 2:38 PM, Ronald Klop <ronald-li...@klop.ws> wrote:
> > On Mon, 19 Feb 2018 22:05:51 +0100, Konstantin Belousov
> > <kostik...@gmail.com> wrote:
> >
> >> Look at the man page.  pkg reads version from the /bin/sh ELF FreeBSD
> >
> >
> > Which man page? I can't find it in pkg help update or pkg help upgrade or
> > man pkg.
> 
> I had to dig for quite a while to find a reference (pkg.conf(5)):
> 
>  ABI: string  The ABI of the package you want to install.  Default:
>   derived from the ABI of the /bin/sh binary.
> 
> >> version note:
> >> orion% file /bin/ls
> >> /bin/ls: ELF 64-bit LSB executable, x86-64, version 1 (FreeBSD),
> >> dynamically linked, interpreter /libexec/ld-elf.so.1, for FreeBSD 11.1
> >> (1101506), FreeBSD-style, stripped
> >>
> >> Update world past the __FreeBSD_version which is reported for the
> >> repository.
> >
> >
> > Does this mean I always have to do a *clean* buildworld after every version
> > bump? This takes ages.
> 
> You could also do a -DNO_CLEAN buildworld.
> 
> Or you can continue to override with "-o OSVERSION=foo", although that
> may eventually result in broken packages.  In general the OSVERSION is
> bumped conservatively (more often than will actually result in
> breakage), so you can get away with the easy workaround for a while
> between buildworlds.

NO_CLEAN=yes doesn't work.  A clean buildworld is required.  The reason is that
the __FreeBSD_version embedded in binaries is stored in /usr/lib/crt*.o, but
that the dependency rules in lib/csu/Makefile do not rebuild these .o files
everytime  changes (so a NO_CLEAN=yes buildworld won't rebuild them
leaving them with a stale version).  Furthermore, when binaries and shared
libraries are built, our Makefiles do not specify that the relevant
/usr/lib/crt*.o files are dependencies, so even if we fixed the missing
 dependency, no binaries would relink to pick up the updated
__FreeBSD_version file unless some other input to the binary changed.  This
one could perhaps be mostly mitigated by forcing libc to depend on the
relevant crt*.o files explicitly (or even having it depend on 
to force relinking of everything when  changes).

This matters for more than just pkg as the kernel also looks at the embedded
__FreeBSD_version in binaries to make decisions about compat shims to enable
(grep for P_OSREL in sys/).

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: LUA boot loader coming very soon

2018-02-12 Thread John Baldwin
On Monday, February 12, 2018 02:31:46 PM Warner Losh wrote:
> On Mon, Feb 12, 2018 at 10:12 AM, John Baldwin <j...@freebsd.org> wrote:
> 
> > On Monday, February 12, 2018 08:27:27 AM Warner Losh wrote:
> > > Greetings,
> > >
> > > As you may know, the Lua (http://www.lua.org) boot loader has been in
> > the
> > > works for some time. It started out life as a GSoC in 2014 by Pedro Souza
> > > mentored by Wojciech A. Koszek. Rui Paulo created a svn project branch to
> > > try to integrate it. I rebased that effort into a github branch which
> > Pedro
> > > Arthur fixed up. Over the past year, I've been cleaning up the boot
> > loader
> > > for other reasons, and found the time was ripe to start integrating this
> > > into the tree. However, those integration efforts have taken a while as
> > my
> > > day-job work on the boot loader took priority. In the mean time, Ed Maste
> > > and the FreeBSD Foundation funded Zakary Nafziger to enhance the original
> > > GSoC Lua scripts to bring it closer to parity with the evolution of the
> > > FORTH menu system since the GSoC project started.
> > >
> > > I'm pleased to announce that all these threads of development have
> > > converged and I'll be pushing the FreeBSD Lua Loader later today. This
> > > loader uses Lua as its scripting language instead of FORTH. While
> > > co-existance is planned, the timeline for it is looking to be a few weeks
> > > and I didn't want to delay pushing this into the tree for that.
> > >
> > > To try the loader, you'll need to build WITHOUT_FORTH=yes and
> > > WITH_LOADER_LUA=yes. Fortunately, you needn't do a full world to do this,
> > > you can do it in src/stand and install the result (be sure to have the
> > > options for both the build and the install). This will replace your
> > current
> > > /boot/loader that is scripted with FORTH to one that's scripted with Lua.
> > > It will install the lua scripts in /boot/lua. The boot is scripted with
> > > /boot/lua/loader.lua instead of /boot/loader.rc. You are strongly advised
> > > to create a backup copy of /boot/loader before testing (eg cp
> > /boot/loader
> > > /boot/loader_forth), since you'll need to boot that from boot2 if
> > something
> > > goes wrong. I've tested it extensively, though, with userboot.so and it's
> > > test program, so all the initial kinks of finding the lua scripts, etc
> > have
> > > been worked out.
> > >
> > > While it's possible to build all the /boot/loader variants with Lua, I've
> > > just tested a BIOS booting /boot/loader both with and without menus
> > > enabled. I've not tested any of the other variants and the instructions
> > for
> > > testing some of them may be rather tedious (especially UEFI, if you want
> > a
> > > simple path to back out). Since there's not been full convergence
> > testing,
> > > you'll almost certainly find bumps in this system. Also, all the
> > > build-system APIs are likely not yet final.
> > >
> > > I put  MFC after a month on the commit. Due to the heroic (dare I say
> > > almost crazy) work of Kyle Evans on merging all the revs from -current to
> > > 11, I'm planning a MFC to 11 after the co-existence issues are hammered
> > > out. In 11, FORTH will be the default, and Lua will  be built by default,
> > > but users will have to do something to use it. 12, both FORTH and Lua
> > will
> > > be built and installed, with Lua as default (barring unforeseen
> > > complications). Once the co-existence stuff goes in, I imagine we'll make
> > > the switch to Lua by default shortly after that. In 13, FORTH will be
> > > removed unless there's a really really compelling case made to keep it.
> > >
> > > So please give it a spin and give me any feedback, documentation updates
> > > and/or bug fixes. I'm especially interested in reviews from people that
> > > have embedded Lua in other projects or experts in Lua that can improve
> > the
> > > robustness of the menu code.
> >
> > Do you have some memory usage numbers for LUA vs forth for the different
> > BIOS loaders (text/data/bss sizes)?  For the EFI case we probably have lots
> > of room, but for the non-EFI case we are constrained to 0xa - 0xa000
> > for the text/data/bss and stack (in some cases like PXE booting the top
> > can be lower than 0xa).  I'm not sure if we have any other platforms
> > with similar memory constraints.
> >
> 
&

Re: posix_fallocate on ZFS

2018-02-12 Thread John Baldwin
On Saturday, February 10, 2018 01:46:33 PM Garrett Wollman wrote:
> In article
> <caotmx2jzr_kvjgozweib-az3-7-uuu+uq3p0nkhgz0enrzw...@mail.gmail.com>,
> asom...@freebsd.org writes:
> 
> >On Sat, Feb 10, 2018 at 10:28 AM, Willem Jan Withagen <w...@digiware.nl>
> >wrote:
> 
> >> Is there any expectation that this is going to fixed in any near future?
> 
> >No.  It's fundamentally impossible to support posix_fallocate on a COW
> >filesystem like ZFS.  Ceph should be taught to ignore an EINVAL result,
> >since the system call is merely advisory.
> 
> I don't think it's true that this is _fundamentally_ impossible.  What
> the standard requires would in essence be a per-object refreservation.
> ZFS supports refreservation, obviously, but not on a per-object basis.
> Furthermore, there are mechanisms to preallocate blocks for things
> like dumps.  So it *could* be done (as in, the concept is there), but
> it may not be practical.  (And ultimately, there are ways in which the
> administrator might manage the system that would defeat the desired
> effect, but that's out of the standard's scope.)  Given the semantic
> mismatch, though, I suspect it's unreasonable to expect anyone to
> prioritize implementation of such a feature.

I don't think posix_fallocate() can be compatible with COW.  Suppose you
do reserve a fixed set of blocks.  That ensures the first write has a
place to write, but not if you overwrite one of those blocks.  You'd have
to reserve another block to maintain the reservation each time you wrote
to a block, or you'd have to have a way to mark a file as not COW.  The
first case isn't really any better than not using posix_fallocate() in the
first place as you are still requiring writes to allocate blocks, and the
second seems a bit fraught with peril as well if the application is
expecting the non-COW'd file to be in sync with other files in the system
since presumably non-COW'd files couldn't be snapshotted, etc.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: LUA boot loader coming very soon

2018-02-12 Thread John Baldwin
On Monday, February 12, 2018 08:27:27 AM Warner Losh wrote:
> Greetings,
> 
> As you may know, the Lua (http://www.lua.org) boot loader has been in the
> works for some time. It started out life as a GSoC in 2014 by Pedro Souza
> mentored by Wojciech A. Koszek. Rui Paulo created a svn project branch to
> try to integrate it. I rebased that effort into a github branch which Pedro
> Arthur fixed up. Over the past year, I've been cleaning up the boot loader
> for other reasons, and found the time was ripe to start integrating this
> into the tree. However, those integration efforts have taken a while as my
> day-job work on the boot loader took priority. In the mean time, Ed Maste
> and the FreeBSD Foundation funded Zakary Nafziger to enhance the original
> GSoC Lua scripts to bring it closer to parity with the evolution of the
> FORTH menu system since the GSoC project started.
> 
> I'm pleased to announce that all these threads of development have
> converged and I'll be pushing the FreeBSD Lua Loader later today. This
> loader uses Lua as its scripting language instead of FORTH. While
> co-existance is planned, the timeline for it is looking to be a few weeks
> and I didn't want to delay pushing this into the tree for that.
> 
> To try the loader, you'll need to build WITHOUT_FORTH=yes and
> WITH_LOADER_LUA=yes. Fortunately, you needn't do a full world to do this,
> you can do it in src/stand and install the result (be sure to have the
> options for both the build and the install). This will replace your current
> /boot/loader that is scripted with FORTH to one that's scripted with Lua.
> It will install the lua scripts in /boot/lua. The boot is scripted with
> /boot/lua/loader.lua instead of /boot/loader.rc. You are strongly advised
> to create a backup copy of /boot/loader before testing (eg cp /boot/loader
> /boot/loader_forth), since you'll need to boot that from boot2 if something
> goes wrong. I've tested it extensively, though, with userboot.so and it's
> test program, so all the initial kinks of finding the lua scripts, etc have
> been worked out.
> 
> While it's possible to build all the /boot/loader variants with Lua, I've
> just tested a BIOS booting /boot/loader both with and without menus
> enabled. I've not tested any of the other variants and the instructions for
> testing some of them may be rather tedious (especially UEFI, if you want a
> simple path to back out). Since there's not been full convergence testing,
> you'll almost certainly find bumps in this system. Also, all the
> build-system APIs are likely not yet final.
> 
> I put  MFC after a month on the commit. Due to the heroic (dare I say
> almost crazy) work of Kyle Evans on merging all the revs from -current to
> 11, I'm planning a MFC to 11 after the co-existence issues are hammered
> out. In 11, FORTH will be the default, and Lua will  be built by default,
> but users will have to do something to use it. 12, both FORTH and Lua will
> be built and installed, with Lua as default (barring unforeseen
> complications). Once the co-existence stuff goes in, I imagine we'll make
> the switch to Lua by default shortly after that. In 13, FORTH will be
> removed unless there's a really really compelling case made to keep it.
> 
> So please give it a spin and give me any feedback, documentation updates
> and/or bug fixes. I'm especially interested in reviews from people that
> have embedded Lua in other projects or experts in Lua that can improve the
> robustness of the menu code.

Do you have some memory usage numbers for LUA vs forth for the different
BIOS loaders (text/data/bss sizes)?  For the EFI case we probably have lots
of room, but for the non-EFI case we are constrained to 0xa - 0xa000
for the text/data/bss and stack (in some cases like PXE booting the top
can be lower than 0xa).  I'm not sure if we have any other platforms
with similar memory constraints.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Kernel build error rev.328485

2018-01-29 Thread John Baldwin
On Saturday, January 27, 2018 06:54:03 PM Per Gunnarsson wrote:
> I am back with new build errors. If I post too frequently, please inform me.

These all look like you have stale sources in your tree somehow (e.g.
sys/compat/freebsd/freebsd32_misc.c doesn't seem to match
sys/compat/freebsd/freebsd32.h)

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: witness_lock_list_get: witness exhausted

2018-01-08 Thread John Baldwin
On Tuesday, November 28, 2017 02:46:03 PM Michael Jung wrote:
> Hi!
> 
> I've recently up'd my processor count on our poudriere box and have 
> started noticing the error
> "witness_lock_list_get: witness exhausted" on the console.  The kernel 
> *DOES NOT* crash but I
> thought the report may be useful to someone.
> 
> $ uname -a
> FreeBSD poudriere 12.0-CURRENT FreeBSD 12.0-CURRENT #1 r325999: Sun Nov 
> 19 18:41:20 EST 2017
> mikej@poudriere:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64
> 
> The machine is pretty busy running four poudriere build instances.
> 
> last pid: 76584;  load averages: 115.07, 115.96, 98.30   
>   
>   up 6+07:32:59  14:44:03
> 763 processes: 117 running, 581 sleeping, 2 zombie, 63 lock
> CPU: 59.0% user,  0.0% nice, 40.7% system,  0.1% interrupt,  0.1% idle
> Mem: 12G Active, 2003M Inact, 44G Wired, 29G Free
> ARC: 28G Total, 11G MFU, 16G MRU, 122M Anon, 359M Header, 1184M Other
>   25G Compressed, 32G Uncompressed, 1.24:1 Ratio
> 
> Let me know what additional information I might supply.

This just means that WITNESS stopped working because it ran out of
pre-allocated objects.  In particular the objects used to track how
many locks are held by how many threads:

/*
 * XXX: This is somewhat bogus, as we assume here that at most 2048 threads
 * will hold LOCK_NCHILDREN locks.  We handle failure ok, and we should
 * probably be safe for the most part, but it's still a SWAG.
 */
#define LOCK_NCHILDREN  5
#define LOCK_CHILDCOUNT 2048

Probably the '2048' (max number of concurrent threads) needs to scale with
MAXCPU.  2048 threads is probably a bit low on big x86 boxes.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Breakage with sys.kern.ptrace_test.{ptrace__parent_sees_exit_after_child_debugger, parent_sees_exit_after_unrelated_debugger} after r325719:325721

2017-11-16 Thread John Baldwin
On Thursday, November 16, 2017 09:07:56 AM Ngie Cooper wrote:
> Hi Mateusz,
>   Per Jenkins, these two tests are broken after r325719:325721: 
> https://ci.freebsd.org/job/FreeBSD-head-amd64-test/4987/ 
> <https://ci.freebsd.org/job/FreeBSD-head-amd64-test/4987/> .
> Thanks,
> -Ngie

It is probably the first commit.  Previously, the kern.proc. sysctl
would fail for zombies, so these tests poll that sysctl waiting for it to
fail to determine when a process has become a zombie.  I think the first
commit broke this as the sysctl now works for zombies so the tests hang
forever.  I could fix the tests to check for the status in the kinfo_proc.
I've no idea if there are other programs aside from tests that depend on
this behavior that are also broken though.  I feel like I copied that
approach from some other bit of code when writing these tests.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Okular or any pdf reader

2017-10-28 Thread John Baldwin
On 10/25/17 10:58 AM, Hans Petter Selasky wrote:
> On 10/25/17 11:55, blubee blubeeme wrote:
>> "os.lock_mtx"
>> Oct 25 17:52:58 blubee kernel: 1st os.lock_mtx @ nvidia_os.c:841
>> Oct 25 17:52:58 blubee kernel: 2nd os.lock_mtx @ nvidia_os.c:841
>> Oct 25 17:52:58 blubee kernel: stack backtrace:
>> Oct 25 17:52:58 blubee kernel: #0 0x80ab6f30 at
>> witness_debugger+0x70
>> Oct 25 17:52:58 blubee kernel: #1 0x80ab6e23 at
>> witness_checkorder+0xe23
>> Oct 25 17:52:58 blubee kernel: #2 0x80a35293 at
>> __mtx_lock_flags+0x93
>> Oct 25 17:52:58 blubee kernel: #3 0x82f4097b at
>> os_acquire_spinlock+0x1b
>> Oct 25 17:52:58 blubee kernel: #4 0x82c45b15 at _nv012002rm+0x185
>> Oct 25 17:52:58 blubee kernel: ACPI Warning: \_SB.PCI0.PEG0.PEGP._DSM:
>> Argument #4 type mismatch - Found [Buffer], ACPI requires [Package]
>> (20170531/nsarguments-205)
>> Oct 25 17:52:59 blubee kernel: nvidia-modeset: Allocated GPU:0
>> (GPU-54a7b304-c99d-efee-0117-0ce119063cd6) @ PCI::01:00.0
>>
>>
> 
> Hi,
> 
> Try: CTRL+ALT+F1
> Or SSH into this machine.
> 
> Then do:
> 
> procstat -ak
> 
> It will reveal any hangs and deadlocks.
> 
> Further I note you're using 12-current with the NVIDIA driver. That 
> might not be a supported configuration :-( Especially nowadays some core 
> kernel structures are changing, which means NVIDIA needs to recompile 
> their binary blob aswell!

The blob does not generally use FreeBSD-specific structures.  Part of
the driver is source and that interfaces with FreeBSD's APIs to implement
shims for the APIs used by the blob.  Thus, the same blob is used for all
OS versions, it is only the wrapper shims (for which we have source) which
are subject to ABI concerns.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: VM images for 12.0-CURRENT showing checksum failed messages

2017-10-18 Thread John Baldwin
On Wednesday, October 18, 2017 04:40:22 PM Glen Barber wrote:
> On Wed, Oct 18, 2017 at 09:28:40AM -0700, John Baldwin wrote:
> > On Wednesday, October 18, 2017 03:01:55 PM Glen Barber wrote:
> > > On Wed, Oct 18, 2017 at 07:49:00AM -0700, John Baldwin wrote:
> > > > On Tuesday, October 17, 2017 11:57:44 AM David Boyd wrote:
> > > > > The FreeBSD-12.0-CURRENT-amd64-20171012-r324542.vmdk image displays
> > > > > many checksum failed messages when booted. (see attachment).
> > > > > 
> > > > > I think this started about 20170925.
> > > > > 
> > > > > I have VirtualBox VM's running 10.4-STABLE, 11.1-STABLE and 12.0-
> > > > > CURRENT.
> > > > > 
> > > > > Only the 12.0-CURRENT image exhibits this behavior.
> > > > > 
> > > > > This is easily fixed by "fsck -y /" in single-user mode during the 
> > > > > boot
> > > > > process.
> > > > > 
> > > > > I can test any updates at almost any time.
> > > > 
> > > > I wonder if the tool creating the snapshot images wasn't updated to 
> > > > generate
> > > > cg checksums when creating the initial filesystem.  Glen, do you know 
> > > > which
> > > > tool (makefs or something else?) is used to generate the UFS filesystem
> > > > in VM images for snapshots?  (In this case it appears to be a .vmdk 
> > > > image)
> > > > 
> > > 
> > > mkimg(1) is used.
> > 
> > Does makefs generate the UFS image fed into mkimg or does mkimg generate the
> > UFS partition itself?
> > 
> 
> Sorry, I may have understated a bit.
> 
> First, mdconfig(8) is used to create a md(4)-backed disk, onto which
> newfs(8) is run, followed by the installworld/installkernel targets.
> 
> Next, mkimg(1) is used to feed the resultant md(4)-based .img
> filesystem (after umount(8)) to create the final output image.

Hmm, so I suspect you are using an older kernel, but I wonder if you are also
using an older newfs or a newer newfs?  If the newfs is the same as as the
running kernel, then this means that upgrading from a pre-cg-sum kernel to a
cg-sum kernel will have similar issues.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: VM images for 12.0-CURRENT showing checksum failed messages

2017-10-18 Thread John Baldwin
On Wednesday, October 18, 2017 03:01:55 PM Glen Barber wrote:
> On Wed, Oct 18, 2017 at 07:49:00AM -0700, John Baldwin wrote:
> > On Tuesday, October 17, 2017 11:57:44 AM David Boyd wrote:
> > > The FreeBSD-12.0-CURRENT-amd64-20171012-r324542.vmdk image displays
> > > many checksum failed messages when booted. (see attachment).
> > > 
> > > I think this started about 20170925.
> > > 
> > > I have VirtualBox VM's running 10.4-STABLE, 11.1-STABLE and 12.0-
> > > CURRENT.
> > > 
> > > Only the 12.0-CURRENT image exhibits this behavior.
> > > 
> > > This is easily fixed by "fsck -y /" in single-user mode during the boot
> > > process.
> > > 
> > > I can test any updates at almost any time.
> > 
> > I wonder if the tool creating the snapshot images wasn't updated to generate
> > cg checksums when creating the initial filesystem.  Glen, do you know which
> > tool (makefs or something else?) is used to generate the UFS filesystem
> > in VM images for snapshots?  (In this case it appears to be a .vmdk image)
> > 
> 
> mkimg(1) is used.

Does makefs generate the UFS image fed into mkimg or does mkimg generate the
UFS partition itself?

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: VM images for 12.0-CURRENT showing checksum failed messages

2017-10-18 Thread John Baldwin
On Tuesday, October 17, 2017 11:57:44 AM David Boyd wrote:
> The FreeBSD-12.0-CURRENT-amd64-20171012-r324542.vmdk image displays
> many checksum failed messages when booted. (see attachment).
> 
> I think this started about 20170925.
> 
> I have VirtualBox VM's running 10.4-STABLE, 11.1-STABLE and 12.0-
> CURRENT.
> 
> Only the 12.0-CURRENT image exhibits this behavior.
> 
> This is easily fixed by "fsck -y /" in single-user mode during the boot
> process.
> 
> I can test any updates at almost any time.

I wonder if the tool creating the snapshot images wasn't updated to generate
cg checksums when creating the initial filesystem.  Glen, do you know which
tool (makefs or something else?) is used to generate the UFS filesystem
in VM images for snapshots?  (In this case it appears to be a .vmdk image)

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: LOR panic on mount -uw

2017-10-12 Thread John Baldwin
On Wednesday, October 11, 2017 05:18:17 PM grarpamp wrote:
> Let 12.0-current r324306 amd64 efi boot from usb to installer screen,
> try to write zeroes to an unallocated part of ada0, mount -uw a
> separate part of ada0 ...
> 
> 1st 0xc5ce5f0 ufs kern/vfs_mount.c:1274
> 2nd 0xc565b78 devfs ufs/ffs/ffs_vfsops.c:1414
> 
> db_trace_self_wrapper
> vpanic
> kassert_panic+0x126
> g_access+0x2b9/frame 0xfe0458a31550
> ffs_mount+0x1092/frame 0xfe0458a31700
> vfs_donmount+0x13b8/frame 0xfe0458a31940
> sys_nmount+0x72/frame 0xfe0458a31980
> amd64_syscall+0x79b/frame 0xfe0458a3a1b0
> Xfast_syscall+0xfb/frame 0xfe0458a31ab0
> syscall (378, FreeBSD ELF64, sys_nmount), rip = 0x800a88d6a, rsp =
> 0x7fffd428, rbp = 0x7fffd990
> kdb_enter+0x3b: movq $0,kdb_why

In this case the panic is separate from the LOR, and for a panic we really
need the panic message in addition to the stack trace.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: C++ in jemalloc

2017-10-05 Thread John Baldwin
ally decided).
> >>> >
> >>> > So for the popular architectures, this arrangement might work. For
> >>> building
> >>> > with external toolchains, it might also work. Some of the less popular
> >>> > architectures may be a problem.
> >>> >
> >>> > Does that help? It isn't completely cut and dried, but it should be
> >>> helpful
> >>> > for you making a decision.
> >>> >
> >>> > Warner
> >>>
> >>> Wait a sec... we've been compiling C++ code with gcc 4.2 since like
> >>> 2006.  What am I missing here that keeps this answer from being a
> >>> simple "go for it"?
> >>>
> >>> Just stay away from C++11 features and gcc 4.2 should work fine.  (DTC
> >>> may require C++11, but that was likely the author's choice given that
> >>> there was no requirement for it to work on pre-clang versions of
> >>> freebsd).
> >>>
> >>
> >> It's the ubiquity of C++11 is why I didn't just say "Go for it".
> >>
> >> Warner
> >>
> >
> >
> ___
> freebsd-current@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: panic: softdep_deallocate_dependencies: dangling deps

2017-09-28 Thread John Baldwin
On Wednesday, September 27, 2017 03:13:21 PM Steve Kargl wrote:
> Just got this panic on 
> 
> FreeBSD troutmask.apl.washington.edu 12.0-CURRENT FreeBSD 12.0-CURRENT
> #0 r321800: Mon Jul 31 13:48:43 PDT 2017
> kargl@:/data/obj/usr/src/sys/SPEW  amd64
> 
> core.txt.0 contains
> 
> panic: softdep_deallocate_dependencies: dangling deps
> cpuid = 7
> time = 1506549566
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe023a281710
> vpanic() at vpanic+0x19c/frame 0xfe023a281790
> panic() at panic+0x43/frame 0xfe023a2817f0
> softdep_deallocate_dependencies() at 
> softdep_deallocate_dependencies+0x76/frame 0xfe023a281810
> brelse() at brelse+0x149/frame 0xfe023a281870
> bufwrite() at bufwrite+0x65/frame 0xfe023a2818b0
> softdep_process_journal() at softdep_process_journal+0x7a8/frame 
> 0xfe023a281950
> softdep_process_worklist() at softdep_process_worklist+0x80/frame 
> 0xfe023a2819b0
> softdep_flush() at softdep_flush+0xff/frame 0xfe023a2819f0
> fork_exit() at fork_exit+0x75/frame 0xfe023a281a30
> fork_trampoline() at fork_trampoline+0xe/frame 0xfe023a281a30
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> 
> __curthread () at ./machine/pcpu.h:232
> 232   __asm("movq %%gs:%1,%0" : "=r" (td)
> (kgdb) #0  __curthread () at ./machine/pcpu.h:232
> #1  doadump (textdump=1) at /usr/src/sys/kern/kern_shutdown.c:318
> #2  0x805879eb in kern_reboot (howto=260)
> at /usr/src/sys/kern/kern_shutdown.c:386
> #3  0x80587e66 in vpanic (fmt=, ap=0xfe023a2817d0)
> at /usr/src/sys/kern/kern_shutdown.c:779
> #4  0x80587c83 in panic (fmt=)
> at /usr/src/sys/kern/kern_shutdown.c:710
> #5  0x80787f56 in softdep_deallocate_dependencies (
> bp=0xfe01f008d8b8) at /usr/src/sys/ufs/ffs/ffs_softdep.c:14304
> #6  0x8061dd69 in buf_deallocate (bp=0xfe01f008d8b8)
> at /usr/src/sys/sys/buf.h:429
> #7  brelse (bp=0xfe01f008d8b8) at /usr/src/sys/kern/vfs_bio.c:2348
> #8  0x8061b9e5 in bufwrite (bp=0xfe01f008d8b8)
> at /usr/src/sys/kern/vfs_bio.c:1914
> #9  0x8079bec8 in softdep_process_journal (mp=, 
> needwk=0x0, flags=)
> at /usr/src/sys/ufs/ffs/ffs_softdep.c:3559
> #10 0x80785dc0 in softdep_process_worklist (mp=0xf80007eef000, 
> full=0) at /usr/src/sys/ufs/ffs/ffs_softdep.c:1592
> #11 0x807894ff in softdep_flush (addr=0xf80007eef000)
> at /usr/src/sys/ufs/ffs/ffs_softdep.c:1397
> #12 0x80555075 in fork_exit (
> callout=0x80789400 , arg=0xf80007eef000, 
> frame=0xfe023a281a40) at /usr/src/sys/kern/kern_fork.c:1038
> #13 
> (kgdb) 
> 
> Hmmm,
> 
> %  kgdb /usr/lib/debug/boot/kernel/kernel.debug vmcore.0
> GNU gdb (GDB) 8.0 [GDB v8.0 for FreeBSD]
> Copyright (C) 2017 Free Software Foundation, Inc.
> Type "apropos word" to search for commands related to "word"...
> Reading symbols from /usr/lib/debug/boot/kernel/kernel.debug...done.
> ABI doesn't support a vmcore target
> 
> OK, so debugging is broken :-/

Run the debugger on the binary, not the debug symbols:

kgdb /boot/kernel/kernel vmcore.0

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: lldb unusable for regular user

2017-09-19 Thread John Baldwin
On Monday, September 18, 2017 02:41:06 PM Vladimir Zakharov wrote:
> Hello!
> 
> lldb coredumps for regular user, but works for root.
> 
> > uname -a
> FreeBSD vzakharov 12.0-CURRENT FreeBSD 12.0-CURRENT #0 r323675: Sun Sep 17 
> 21:14:33 MSK 2017 root@vzakharov:/home/obj/usr/src/sys/GENERIC-NODEBUG  
> amd64
> > cat test.c
> #include 
> #include 
> 
> int main()
> {
>   printf("PID: %d\n", getpid());
>   sleep(10);
>   return 0;
> }
> > cc -O0 -g test.c -o test
> > lldb ./test
> (lldb) target create "./test"
> Current executable set to './test' (x86_64).
> (lldb) run
> Process 37758 launching
> Process 37758 launched: './test' (x86_64)
> Segmentation fault (core dumped)
> Exit 139
> > sudo lldb ./test
> (lldb) target create "./test"
> Current executable set to './test' (x86_64).
> (lldb) run
> Process 37776 launching
> Process 37776 launched: './test' (x86_64)
> PID: 37776
> Process 37776 exited with status = 0 (0x)
> (lldb)
> 
> 
> Postmortem by gdb:
> > gdb ./test test.core
> ...
> [New LWP 101456]
> Core was generated by `./test'.
> Program terminated with signal SIGTRAP, Trace/breakpoint trap.
> #0  _start (ap=0x7fffe858, cleanup=0x800605910 ) at 
> /usr/src/lib/csu/amd64/crt1.c:50
> 50  {
> (gdb) bt
> #0  _start (ap=0x7fffe858, cleanup=0x800605910 ) at 
> /usr/src/lib/csu/amd64/crt1.c:50
> (gdb) f
> #0  _start (ap=0x7fffe858, cleanup=0x800605910 ) at 
> /usr/src/lib/csu/amd64/crt1.c:50
> 50  {
> 
> > gdb `which lldb` lldb.core
> ...
> Reading symbols from /usr/bin/lldb...Reading symbols from 
> /usr/lib/debug//usr/bin/lldb.debug...done.
> done.
> [New LWP 101610]
> [New LWP 100968]
> [New LWP 100126]
> [New LWP 101631]
> [New LWP 101637]
> [New LWP 101662]
> [New LWP 101672]
> [New LWP 100337]
> [New LWP 101593]
> Core was generated by `lldb ./test'.
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  x86_64_freebsd_fallback_frame_state (context=0x7fffddff6e20, 
> context=0x7fffddff6e20, fs=0x7fffddff6b70) at ./md-unwind-support.h:60
> 60  ./md-unwind-support.h: No such file or directory.
> [Current thread is 1 (LWP 101610)]
> (gdb) f
> #0  x86_64_freebsd_fallback_frame_state (context=0x7fffddff6e20, 
> context=0x7fffddff6e20, fs=0x7fffddff6b70) at ./md-unwind-support.h:60
> 60  in ./md-unwind-support.h
> (gdb) bt
> #0  x86_64_freebsd_fallback_frame_state (context=0x7fffddff6e20, 
> context=0x7fffddff6e20, fs=0x7fffddff6b70) at ./md-unwind-support.h:60
> #1  uw_frame_state_for (context=context@entry=0x7fffddff6e20, 
> fs=fs@entry=0x7fffddff6b70) at 
> /wrkdirs/usr/ports/lang/gcc6/work/gcc-6.4.0/libgcc/unwind-dw2.c:1249
> #2  0x000804f6cffb in _Unwind_ForcedUnwind_Phase2 
> (exc=exc@entry=0x806b23230, context=context@entry=0x7fffddff6e20) at 
> /wrkdirs/usr/ports/lang/gcc6/work/gcc-6.4.0/libgcc/unwind.inc:155
> #3  0x000804f6d334 in _Unwind_ForcedUnwind (exc=0x806b23230, 
> stop=0x804631760 , stop_argument=) at 
> /wrkdirs/usr/ports/lang/gcc6/work/gcc-6.4.0/libgcc/unwind.inc:207
> #4  0x0008046315c3 in _Unwind_ForcedUnwind (ex=, 
> stop_func=0xe, stop_arg=0x806b23000) at 
> /usr/src/lib/libthr/thread/thr_exit.c:106
> #5  thread_unwind () at /usr/src/lib/libthr/thread/thr_exit.c:172
> #6  _pthread_exit_mask (status=, mask=) at 
> /usr/src/lib/libthr/thread/thr_exit.c:254
> #7  0x0008046313eb in _pthread_exit (status=0x806b23000) at 
> /usr/src/lib/libthr/thread/thr_exit.c:206
> #8  0x000804623c0d in thread_start (curthread=0x806b23000) at 
> /usr/src/lib/libthr/thread/thr_create.c:289
> #9  0x7fffdddf7000 in ?? ()
> Backtrace stopped: Cannot access memory at address 0x7fffddff7000

Your backtrace shows it crashed during thread exit inside of libthr, not in
lldb itself.  Also, it seems you are using libgcc_s from external gcc rather
than the base system libgcc_s which is built from
contrib/llvm/projects/libunwind.  If lldb dlopen'd some object that depends
on libgcc_s.so from ports gcc then that might explain this crash as it means
you are mixing two different unwind libraries.  What does 'info sharedlibrary'
from gdb show?

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Panic: @r323525: iflib

2017-09-14 Thread John Baldwin
On Thursday, September 14, 2017 03:19:29 PM Stephen Hurd wrote:
> John Baldwin wrote:
> > igb0: <Intel(R) PRO/1000 PCI-Express Network Driver> port 0xe020-0xe03f mem 
> > 0xfb22-0xfb23,0xfb244000-0xfb247fff irq 43 at device 0.0 on pci6
> > igb0: attach_pre capping queues at 8
> > igb0: using 1024 tx descriptors and 1024 rx descriptors
> > igb0: msix_init qsets capped at 8
> > igb0: pxm cpus: 4 queue msgs: 9 admincnt: 1
> > igb0: trying 4 rx queues 4 tx queues
> > igb0: Using MSIX interrupts with 9 vectors
> > igb0: allocated for 4 tx_queues
> > igb0: allocated for 4 rx_queues
> > taskqgroup_attach_cpu: qid not found for cpu=0
> > igb0: taskqgroup_attach_cpu failed 22
> > igb0: Failed to allocate que int 0 err: 22
> > igb0: IFDI_MSIX_INTR_ASSIGN failed 22
> > device_attach: igb0 attach returned 22
> >
> > This is on a quad-core CPU: Intel(R) Xeon(R) CPU E5-1620 v3.  It fails both
> > with SMT enabled or disabled in the BIOS.
> 
> Do you have EARLY_AP_STARTUP enabled?

Yes.

> Do you have em in the kernel, or do you load the module?

In-kernel.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Panic: @r323525: iflib

2017-09-14 Thread John Baldwin
Driver> port 0xe020-0xe03f mem 
0xfb22-0xfb23,0xfb244000-0xfb247fff irq 43 at device 0.0 on pci6

igb0: attach_pre capping queues at 8
igb0: using 1024 tx descriptors and 1024 rx descriptors 
igb0: msix_init qsets capped at 8   
igb0: pxm cpus: 4 queue msgs: 9 admincnt: 1 
igb0: trying 4 rx queues 4 tx queues
igb0: Using MSIX interrupts with 9 vectors  
igb0: allocated for 4 tx_queues         
igb0: allocated for 4 rx_queues 
taskqgroup_attach_cpu: qid not found for cpu=0  
igb0: taskqgroup_attach_cpu failed 22   
igb0: Failed to allocate que int 0 err: 22  
igb0: IFDI_MSIX_INTR_ASSIGN failed 22   
device_attach: igb0 attach returned 22  

This is on a quad-core CPU: Intel(R) Xeon(R) CPU E5-1620 v3.  It fails both
with SMT enabled or disabled in the BIOS.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: r323412: Panic on boot (slab->us_keg == keg)

2017-09-11 Thread John Baldwin
On Monday, September 11, 2017 09:15:51 PM Andrey V. Elsukov wrote:
> On 11.09.2017 15:23, Andrey V. Elsukov wrote:
> > --- trap 0xc, rip = 0x80d84870, rsp = 0x82193970, rbp =
> > 0x821939b0 ---
> > zone_import() at zone_import+0x110/frame 0x821939b0
> > zone_alloc_item() at zone_alloc_item+0x36/frame 0x821939f0
> > uma_startup() at uma_startup+0x1d0/frame 0x82193ae0
> > vm_page_startup() at vm_page_startup+0x34e/frame 0x82193b30
> > vm_mem_init() at vm_mem_init+0x1a/frame 0x82193b50
> > mi_startup() at mi_startup+0x9c/frame 0x82193b70
> > btext() at btext+0x2c
> > Uptime: 1s
> 
> I bisected revisions, and the last working is r322988.
> This machine is E5-2660 v4@ based.

If you just revert r322988 on a newer tree does it work ok?

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: !EARLY_AP_STARTUP and -CURRENT

2017-08-31 Thread John Baldwin
On Wednesday, August 30, 2017 04:54:07 PM Kevin Bowling wrote:
> I'm dealing with a shit sandwich right now where the mps(4) or cam_da
> reorders drives on a few thousand legacy MBR machines I have (and I
> can't easily install glabel ATM), and !EARLY_AP_STARTUP seems to have
> regressed.  I'd like to be able to run w/o EARLY_AP_STARTUP right
> quick so I can take a more leisurely approach to fixing mps(4) boot
> probe correctly (freebsd-scsi@ has that thread).
> 
> With WITNESS and !EARLY_AP_STARTUP I hit an assert in sched_setpreempt
> in kern/sched_ule.c 100% of the time.  Here are a couple invocations,
> with oddness around a different CPU holding the curthread lock but
> somehow a different AP is runnable in the function:

Do you have the panic messages?

> Tracing pid 11 tid 100020 td 0xf80128cd1560
> kdb_enter() at kdb_enter+0x3b/frame 0xfe3e653dcc10
> vpanic() at vpanic+0x1b9/frame 0xfe3e653dcc90
> panic() at panic+0x43/frame 0xfe3e653dccf0
> __mtx_assert() at __mtx_assert+0xb4/frame 0xfe3e653dcd00
> sched_add() at sched_add+0x152/frame 0xfe3e653dcd40
> intr_event_schedule_thread() at intr_event_schedule_thread+0xca/frame
> 0xfe3e653dcd80
> swi_sched() at swi_sched+0x6c/frame 0xfe3e653dcdc0
> softclock_call_cc() at softclock_call_cc+0x155/frame 0xfe3e653dce70
> callout_process() at callout_process+0x1f9/frame 0xfe3e653dcef0
> handleevents() at handleevents+0x1a4/frame 0xfe3e653dcf30
> cpu_initclocks_ap() at cpu_initclocks_ap+0xc8/frame 0xfe3e653dcf60
> init_secondary_tail() at init_secondary_tail+0x1e3/frame 0xfe3e653dcf90
> init_secondary() at init_secondary+0x2b3/frame 0xfe3e653dcff0
> 
> 
> db> show thread 0xf80128cd1560
> Thread 100020 at 0xf80128cd1560:
>  proc (pid 11): 0xf80128cb5000
>  name: idle: cpu17
>  stack: 0xfe3e5cd88000-0xfe3e5cd8bfff
>  flags: 0x40024  pflags: 0x20
>  state: CAN RUN
>  priority: 255
>  container lock: sched lock 0 (0x81c39800)
> db> show lock 0x81c39800
>  class: spin mutex
>  name: sched lock 0
>  flags: {SPIN, RECURSE}
>  state: {OWNED}
>  owner: 0xf80128cca000 (tid 100017, pid 11, "idle: cpu14")
> 
> 
> db> bt
> Tracing pid 11 tid 100021 td 0xf80128cd2000
> kdb_enter() at kdb_enter+0x3b/frame 0xfe3e655e4c10
> vpanic() at vpanic+0x1b9/frame 0xfe3e655e4c90
> panic() at panic+0x43/frame 0xfe3e655e4cf0
> __mtx_assert() at __mtx_assert+0xb4/frame 0xfe3e655e4d00
> sched_add() at sched_add+0x152/frame 0xfe3e655e4d40
> intr_event_schedule_thread() at intr_event_schedule_thread+0xca/frame
> 0xfe3e655e4d80
> swi_sched() at swi_sched+0x6c/frame 0xfe3e655e4dc0
> softclock_call_cc() at softclock_call_cc+0x155/frame 0xfe3e655e4e70
> callout_process() at callout_process+0x1f9/frame 0xfe3e655e4ef0
> handleevents() at handleevents+0x1a4/frame 0xfe3e655e4f30
> cpu_initclocks_ap() at cpu_initclocks_ap+0xc8/frame 0xfe3e655e4f60
> init_secondary_tail() at init_secondary_tail+0x1e3/frame 0xfe3e655e4f90
> init_secondary() at init_secondary+0x2b3/frame 0xfe3e655e4ff0
> db> show thread 0xf80128cd2000
> Thread 100021 at 0xf80128cd2000:
>  proc (pid 11): 0xf80128cb6000
>  name: idle: cpu18
>  stack: 0xfe3e5cf17000-0xfe3e5cf1afff
>  flags: 0x40024  pflags: 0x20
>  state: CAN RUN
>  priority: 255
>  container lock: sched lock 0 (0xffff81c39800)
> db> show lock 0x81c39800
>  class: spin mutex
>  name: sched lock 0
>  flags: {SPIN, RECURSE}
>  state: {OWNED}
>  owner: 0xf80128cdb560 (tid 100028, pid 11, "idle: cpu25")
> 
> Regards,
> Kevin


-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: swapfile query

2017-08-20 Thread John Baldwin
On Saturday, August 19, 2017 06:08:29 PM tech-lists wrote:
> On 19/08/2017 17:54, Cy Schubert wrote:
> > Then it doesn't matter if you use one or many swapfiles and deleting the 4 
> > GB won't make a difference. Just add the desired swap as required.
> > 
> > With 128 GB RAM you shouldn't be swapping anyway. If your system is you 
> > have more serious problems than the lack of swap.
> 
> The system is a bhyve host. There are 9 guests, two of them are
> freebsd-11-stable, the rest are ubuntu-14.04-LTS. Restarting some (but
> not all) of the guests has the effect of decreasing swap usage. The
> system also runs ZFS. The guests live on the ZFS filesystem.
> 
> The OS & swap on the host are SSD and are not part of the ZFS system.
> 
> What I'm seeing is, the host system won't touch swap for days. I guess
> when the guests get busier than an as yet unknown amount, the host
> starts using swap. The issue I'm having isn't so much it using swap,
> it's that the used swap seemingly is not liberated after it has been
> used, and I don't know exactly how to narrow it down.

Note that once memory is placed in swap, it won't be pulled back in until
some thread or process actually needs it.  If nothing needs the memory it
doesn't hurt to just leave it out on swap.  It might also mean that the
memory freed up by your temporary memory pressure from your guests will now
be available the next time you have memory pressure so that you won't have
to swap that next time.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Install FreeBSD from source into VM image

2017-08-16 Thread John Baldwin
On Wednesday, August 16, 2017 12:57:25 PM Panagiotes Mousikides wrote:
> Den 2017-08-14 kl. 18:49, skrev Matt Joras:
> > On 08/14/2017 11:42, Panagiotes Mousikides wrote:
> >> I am working on the FreeBSD test suite, and need to create an image
> >> file from source.  How can I do that?
> >>
> >> I need to run something similar to make installkernel && make
> >> installworld with an image file as the target, such that the end
> >> result is a ready-made FreeBSD system that can be started up with
> >> bhyve.  How can I do that, including creating the correct /etc files,
> >> and the correct boot code and partitioning?
> >>   
> > See release(7), https://www.freebsd.org/cgi/man.cgi?release(7). The
> > relevant section is under virtual machine disk images and the vm-image
> > target. The VMFORMATS for bhyve is "raw". That will generate an image
> > that "just works" with vmrun.sh
> >
> > Matt Joras
> >
> Hi Matt!
> 
> Thank you so much for the tip!  I tried what you recommended, the 
> command I ran specifically was (inside release/)
> 
>  sudo make vm-image WITH_VMIMAGES=1 VMBASE= VMSIZE=2G 
> VMFORMATS=raw VMSIZE=2G vm-image
> 
> followed by
> 
>  sudo sh /usr/share/examples/bhyve/vmrun.sh -d .raw vm-
> 
> but apparently the image generated doesn't work.  The error I'm getting 
> after hitting "1" at the boot screen is
> 
>  Error return from kevent monitor: Not permitted in capability mode
> 
> repeatedly cascading through the screen.  Any ideas?  I would greatly 
> appreciate your help!

This sounds like an issue with the bhyve capsicum work.  I've cc'd Allan
and Peter who might be able to help track that down.  It might be useful
if you can run bhyve under ktrace, e.g.:

  sudo ktrace -i -t p sh /usr/share/examples/bhyve/vmrun.sh -d .raw vm-

And then post the output of 'sudo kdump'

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: order of executing MOD_LOAD and registering module sysctl-s

2017-08-03 Thread John Baldwin
On Thursday, August 03, 2017 09:57:15 AM Andriy Gapon wrote:
> On 02/08/2017 18:49, John Baldwin wrote:
> > sysctl nodes are created explicitly via linker_file_register_sysctls, not 
> > via
> > SYSINITs, so you can't order them with respect to other init functions.
> > 
> > I think Andriy's suggestion of doing sysctls "inside" sysinits (so they are
> > registered last and unregistered first) is probably better than the current
> > state and is a simpler fix than changing all sysctls to use SYSINITs.
> 
> Kostik (kib) suggested a possible valid use-case that depends on the current
> order: adding dynamic sysctl-s under static sysctl-s via the module load 
> handler.
> He also offered an idea for a possible solution: holding the modules lock in 
> the
> shared mode (MOD_SLOCK) around calls to sysctl-s registered from modules.

Yes, that could work.  You'd need a way to "tag" sysctls as being module sysctls
vs non-module sysctls.  Another possiblity would be to make two passes over
sysctls when loading/unloading modules where you have a "disabled" flag or some
such.  During load you would set this flag when doing sysctl_register_oid for 
the
static nodes and then do a second pass after the SYSINITs to clear all the 
disabled
flags.  During unload you would do this in reverse with an early pass before
SYSUNINITs to set "disabled" on all the static nodes for the kld.

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: order of executing MOD_LOAD and registering module sysctl-s

2017-08-02 Thread John Baldwin
On Wednesday, August 02, 2017 06:53:54 PM Hans Petter Selasky wrote:
> On 08/02/17 17:49, John Baldwin wrote:
> > On Wednesday, August 02, 2017 12:39:36 PM Hans Petter Selasky wrote:
> >> On 08/02/17 12:13, Andriy Gapon wrote:
> >>>
> >>> As far as I understand a module initialization routine is executed via the
> >>> sysinit mechanism.  Specifically, module_register_init is set up as the 
> >>> sysinit
> >>> function for every module and it calls MOD_EVENT(mod, MOD_LOAD) to invoke 
> >>> the
> >>> module event handler.
> >>>
> >>> In linker_load_file() I see the following code:
> >>>   linker_file_register_sysctls(lf);
> >>>   linker_file_sysinit(lf);
> >>>
> >>> I think that this means that any statically declared sysctl-s in the 
> >>> module
> >>> would be registered before the module receives the MOD_LOAD event.
> >>> It's possible that some of the sysctl-s could have procedures as handlers 
> >>> and
> >>> they might access data that is supposed to be initialized by the module 
> >>> event
> >>> handler.
> >>>
> >>> So, for example, running sysctl -a at just the right moment during the 
> >>> loading
> >>> of a module might end up in an expected behavior (including a crash).
> >>>
> >>> Is my interpretation of how the code works correct?
> >>> Can the order of linker_file_sysinit and linker_file_register_sysctls be 
> >>> changed
> >>> without a great risk?
> >>>
> >>> Thank you!
> >>>
> >>> P.S.
> >>> The same applies to:
> >>>   linker_file_sysuninit(file);
> >>>   linker_file_unregister_sysctls(file);
> >>>
> >>
> >> Hi,
> >>
> >> Not sure if this answers your question.
> >>
> 
> Hi,
> 
> >> If a SYSCTL() is TUNABLE, it's procedure can be called when the sysctl
> >> is created. Else the SYSCTL() procedure callback might be called right
> >> after it's registered. I think there is an own subsystem in sys/kernel.h
> >> which takes care of the actual SYSCTL() creation/destruction - after the
> >> linker is involved.
> > 
> > sysctl nodes are created explicitly via linker_file_register_sysctls, not 
> > via
> > SYSINITs, so you can't order them with respect to other init functions.
> 
> For GENERIC (non-modules) the SYSCTLS() are registered by 
> sysctl_register_all() at SYSINIT(sysctl, SI_SUB_KMEM, SI_ORDER_FIRST, 
> sysctl_register_all, 0);
> 
> > 
> > I think Andriy's suggestion of doing sysctls "inside" sysinits (so they are
> > registered last and unregistered first) is probably better than the current
> > state and is a simpler fix than changing all sysctls to use SYSINITs.
> > 
> 
> If the module provided SYSCTLS's could use the same SI_SUB_KMEM it would 
> be compatible.
> 
> You have three cases to think about:
> 
> 1) SYSCTLS's in modules loaded before the kernel is booted
> 2) SYSCTLS's in modules after the kernel is booted
> 3) SYSCTLS's in the GENERIC kernel.
> 
> I'm not 100% sure, but I think 1) and 2) are treated differently. 
> Correct me if I'm wrong.

3) sysctls in the kernel are handled at SI_SUB_KMEM.
1) modules loaded by the loader are handled in linker_preload() at
   SI_SUB_KLD.  Their sysctls are all registered when linker_preload()
   executes.  Their SYSINITs are added to the pending sysinit list.
   Any SYSINITs earlier than SI_SUB_KLD will be executed in sorted
   order by mi_startup() after linker_preload() returns.  Any SYSINITs
   after SI_SUB_KLD will be kept in the pending list in order and
   executed as if they were in the kernel.
2) All of the sysctl's are registered first, and afterwards the
   SYSINITs are run in sorted order.

The race Andriy describes has always been present, but it was perhaps
easier to fix by just inverting the order when TUNABLE_* were used 
explicitly as those registered explicit SI_SUB_TUNABLES sysinits.

Note that it has always been the case with old TUNABLE_* that handlers
could be run before locks were initialized (e.g. via MTX_SYSINIT which
runs later at SI_SUB_LOCK), so if you have handlers that need to use
locks, etc. you really shouldn't use RDTUN/RWTUN but instead you should
use a dedicated SYSINIT to read a tunable and apply the right logic.

There are a few different ways to fix Andriy's race while still solving
the issue Ian notes that you may have SYSINITs that depend on the tunab

Re: zfs.ko no longer loads after r320156: unresolved symbol: abd_is_linear

2017-08-02 Thread John Baldwin
On Wednesday, August 02, 2017 10:14:01 AM Andriy Gapon wrote:
> On 02/08/2017 04:00, Ngie Cooper (yaneurabeya) wrote:
> > 
> >> On Aug 1, 2017, at 09:21, John Baldwin <j...@freebsd.org> wrote:
> >>
> >> On Tuesday, August 01, 2017 09:47:41 AM Andriy Gapon wrote:
> >>> On 01/08/2017 02:31, Ngie Cooper wrote:
> >>>> Hi,
> >>>>  I tried upgrading my host from 11.1-STABLE to 12.0-CURRENT, and it 
> >>>> didn’t work because abd_is_linear is an undefined symbol (it exists in 
> >>>> sys/conf/files, but not sys/modules/zfs/Makefile). I tried adding abd.c 
> >>>> to sys/modules/zfs/Makefile and it didn’t immediately fix my compilation 
> >>>> problem (ran into a linker error instead).
> >>>>  If it isn’t fixed in the next few hours I’ll try my hand at fixing the 
> >>>> problem.
> >>>
> >>> I am not sure what exact problem you have...
> >>> abd.c should be added to the list of source files via
> >>> .include "${SUNW}/uts/common/Makefile.files"
> >>>
> >>> Perhaps something to do with "inline"...
> >>
> >> Oh, yes.  If you use -fno-inline-funcs or the like.  I forgot to
> >> send this to Andriy earlier, but here is the fix I'm using:
> >>
> >> https://github.com/freebsd/freebsd/commit/574dc95cf8272e16f6d44aff6cb4e08dede08886
> > 
> > Unfortunately… this is head, verbatim, which means that the bug still 
> > exists.
> > This gives me an idea of where I should look though.
> 
> The URL indeed suggests that the change should be in head, but it's not there 
> as
> far as I can tell.  I never saw it being committed.

Not yet.  I'm trying to decide if 'static inline' is more correct (for me it
results in 3 separate copies of abd_is_linear in zfs.ko) vs using 'extern
inline'.  The latter seems possibly more correct but more of a pain?  I think
for that it needs to be extern in only a single file and 'inline' in the
header?

-- 
John Baldwin
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

  1   2   3   4   5   6   7   8   9   10   >