subject:"panic on boot"

Re: 14.0-CURRENT panic on boot, i386 VirtualBox client

2023-01-04 Thread Paul Floyd





On 04-01-23 11:24, Konstantin Belousov wrote:
> On Tue, Jan 03, 2023 at 11:38:55AM +0100, Floyd, Paul wrote:
> >
> > On 30/12/2022 01:54, Konstantin Belousov wrote:
> > >
> > > The backtrace is needed to make a further analysis.
> >
> > Any suggestions for getting a backtrace? I get the panic booting either the
> > installer ISO or the VM image, both in VirtualBox.
> It should be there right after the panic message.  No idea how to take the
> virtual screen snapshot under VB.
>
> Also it might be easier if you configure serial console and catch its output
> outside VB, but again I have no idea how to do that.

You can connect the VirtualBox VM's serial output to a host pipe.

Here is the start of the panics

smist: found supported isa bridge Intel PIIX4 ISA bridge
panic: td 0x1d94840 stack 0x2424ee8 not in kstack VA 0x242 4
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self_wrapper(0,1d94840,4,2424000,9,...) at 
db_trace_self_wrapper+0x28/frame 0x2424e9c
vpanic(1491c57,2424ed8,2424ed8,2424f9c,1415b35,...) at vpanic+0xf4/frame 
0x2424eb8

panic(1491c57,1d94840,2424ee8,242,4,...) at panic+0x14/frame 0x2424ecc
trap(2424fa8,0,0,0,0,...) at trap+0x975/frame 0x2424f9c
calltrap() at 0xffc0321f/frame 0x2424f9c
--- trap 0x9, eip = 0xa02, esp = 0xffe, ebp = 0 ---
KDB: enter: panic
panic: td 0x1d94840 stack 0x2424d90 not in kstack VA 0x242 4
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self_wrapper(0,1d94840,4,2424000,3,...) at 
db_trace_self_wrapper+0x28/frame 0x2424d44
vpanic(1491c57,2424d80,2424d80,2424e48,1415b35,...) at vpanic+0xf4/frame 
0x2424d60

panic(1491c57,1d94840,2424d90,242,4,...) at panic+0x14/frame 0x2424d74
trap(2424e54,8,28,28,100,...) at trap+0x975/frame 0x2424e48
calltrap() at 0xffc0321f/frame 0x2424e48
--- trap 0x3, eip = 0x1042444, esp = 0x2424e94, ebp = 0x2424e94 ---
kdb_enter(1527748,1527748) at kdb_enter+0x34/frame 0x2424e94
vpanic(1491c57,2424ed8,2424ed8,2424f9c,1415b35,...) at 
vpanic+0x11f/frame 0x2424eb8

panic(1491c57,1d94840,2424ee8,242,4,...) at panic+0x14/frame 0x2424ecc
trap(2424fa8,0,0,0,0,...) at trap+0x975/frame 0x2424f9c
calltrap() at 0xffc0321f/frame 0x2424f9c

VirtualBox: screenshots (was: 14.0-CURRENT panic on boot, i386 VirtualBox client)

2023-01-04 Thread Graham Perrin


On 04/01/2023 10:24, Konstantin Belousov wrote:

> … No idea how to take the virtual screen snapshot under VB. …

For me, it's the right-hand Control key + E.



Re: 14.0-CURRENT panic on boot, i386 VirtualBox client

2023-01-04 Thread Konstantin Belousov

On Tue, Jan 03, 2023 at 11:38:55AM +0100, Floyd, Paul wrote:
> 
> On 30/12/2022 01:54, Konstantin Belousov wrote:
> > 
> > The backtrace is needed to make a further analysis.
> 
> 
> Any suggestions for getting a backtrace? I get the panic booting either the
> installer ISO or the VM image, both in VirtualBox.
It should be there right after the panic message.  No idea how to take the
virtual screen snapshot under VB.

Also it might be easier if you configure serial console and catch its output
outside VB, but again I have no idea how to do that.

Re: 14.0-CURRENT panic on boot, i386 VirtualBox client

2023-01-03 Thread Floyd, Paul




On 30/12/2022 01:54, Konstantin Belousov wrote:
>
> The backtrace is needed to make a further analysis.

Any suggestions for getting a backtrace? I get the panic booting either
the installer ISO or the VM image, both in VirtualBox.



A+

Paul

Re: 14.0-CURRENT panic on boot, i386 VirtualBox client

2022-12-29 Thread Konstantin Belousov

On Thu, Dec 29, 2022 at 09:39:44AM +0100, Paul Floyd wrote:
> 
> 
> On 28-12-22 18:12, Ronald Klop wrote:
> 
> > 
> > I've had success capturing errors by recording the screen with my phone
> > and playing it back at slow speed.
> > Another option might be to enable serial port for the console of the
> > guest and capture the output. But I don't know if the default ISO uses
> > that and how hard it is to configure VirtualBox to do that properly.
> 
> Hi
> 
> I have used my phone before, and I tried that.
> 
> The last message with verbose turned on is
> 
> isa_probe_children: probing PnP devices
> smist: found supported isa bridge Intel PIX4 ISA bridge
> panic: td 0x1d94840 stack 0x2424ee8 not in kstack VA 0x242 4
> 
%esp is indeed outside the KVA for the thread stack, assuming the
numbers are accurate.  It should be in range of 0x242 - 0x2424000.
I just checked random boot in qemu for latest GENERIC/i386, and thread0
stack pointer returned by init386() is inside THREAD0_STACK.
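
As a rough illustration of the check behind this panic message, the sketch
below tests whether a stack pointer lies inside a thread's kernel stack range.
It is not the kernel's code: sp_in_kstack, SKETCH_PAGE_SIZE and the sample
values in the note after it are hypothetical, and the real bounds come from
the thread structure.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical page size; the kernel uses its own PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Return true when sp lies in the half-open range
 * [kstack, kstack + pages * SKETCH_PAGE_SIZE).
 */
static bool
sp_in_kstack(uintptr_t sp, uintptr_t kstack, unsigned pages)
{
	uintptr_t top = kstack + (uintptr_t)pages * SKETCH_PAGE_SIZE;

	return (sp >= kstack && sp < top);
}
```

With a hypothetical base of 0x1000 and 4 pages, 0x4fff is the last address
inside the range and 0x5000 is the first one outside it.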

The backtrace is needed to make a further analysis.

Re: 14.0-CURRENT panic on boot, i386 VirtualBox client

2022-12-29 Thread Paul Floyd





On 28-12-22 18:12, Ronald Klop wrote:
>
> I've had success capturing errors by recording the screen with my phone
> and playing it back at slow speed.
> Another option might be to enable serial port for the console of the
> guest and capture the output. But I don't know if the default ISO uses
> that and how hard it is to configure VirtualBox to do that properly.


Hi

I have used my phone before, and I tried that.

The last message with verbose turned on is

isa_probe_children: probing PnP devices
smist: found supported isa bridge Intel PIX4 ISA bridge
panic: td 0x1d94840 stack 0x2424ee8 not in kstack VA 0x242 4



A+
Paul

Re: 14.0-CURRENT panic on boot, i386 VirtualBox client

2022-12-28 Thread Ronald Klop


On 12/28/22 17:45, Paul Floyd wrote:
> Hi
>
> For quite a few weeks I've been unable to boot 14.0-CURRENT i386 in a
> VirtualBox VM. I've tried booting from both the ISO image and the vmdk image.
> I get a kernel panic.
>
> The host is running 13.1-RELEASE-p3 amd64.
>
> No problems with 14.0-CURRENT amd64 guests.
>
> I haven't been able to see the last message before the panic as it scrolls
> past too quickly.
>
> Any suggestions for either how to get more info or what vbox settings to
> use?
>
> A+
> Paul

I've had success capturing errors by recording the screen with my phone and
playing it back at slow speed.
Another option might be to enable serial port for the console of the guest and
capture the output. But I don't know if the default ISO uses that and how hard
it is to configure VirtualBox to do that properly.

Regards,
Ronald.

Re: 14.0-CURRENT panic on boot, i386 VirtualBox client

2022-12-28 Thread Paul Floyd





On 28-12-22 18:05, Graham Perrin wrote:
>
> If the guest has more than one CPU, try reducing to one.
>
> A step further: try booting the guest in safe mode.

Neither of those changed anything (I was using 2 CPUs).

A+
Paul

Re: 14.0-CURRENT panic on boot, i386 VirtualBox client

2022-12-28 Thread Graham Perrin


On 28/12/2022 16:45, Paul Floyd wrote:
> … I haven't been able to see the last message before the panic as it
> scrolls past too quickly.
>
> Any suggestions for either how to get more info or what vbox
> settings to use?

If the guest has more than one CPU, try reducing to one.

A step further: try booting the guest in safe mode.




14.0-CURRENT panic on boot, i386 VirtualBox client

2022-12-28 Thread Paul Floyd


Hi

For quite a few weeks I've been unable to boot 14.0-CURRENT i386 in a
VirtualBox VM. I've tried booting from both the ISO image and the vmdk
image. I get a kernel panic.


The host is running 13.1-RELEASE-p3 amd64.

No problems with 14.0-CURRENT amd64 guests.

I haven't been able to see the last message before the panic as it
scrolls past too quickly.


Any suggestions for either how to get more info or what vbox
settings to use?


A+
Paul

Re: Today's 13-STABLE panic on boot

2021-02-06 Thread Gleb Popov

On Sat, Feb 6, 2021 at 4:55 PM Oleh Hushchenkov 
wrote:

> Indeed, this fixes the panic, but now I have many messages like:
>
> rtsx0: Controller timeout for CMD8
> rtsx0: Controller timeout for CMD8
> rtsx0: Controller timeout for CMD55
> rtsx0: Controller timeout for CMD55
> rtsx0: Controller timeout for CMD55
> rtsx0: Controller timeout for CMD55
> rtsx0: Controller timeout for CMD1
> rtsx0: Controller timeout for CMD1
> rtsx0: Controller timeout for CMD1
> rtsx0: Controller timeout for CMD1
> mmc0: No compatible cards found on bus
>
> Is it normal? I don't have any SD cards to test. So, I will disable
> the card reader in BIOS.
>
> Thanks for the hint!
>
>
You might want to report this to the driver's original author:
https://github.com/hlh-restart/rtsx/
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: Today's 13-STABLE panic on boot

2021-02-06 Thread Oleh Hushchenkov

Indeed, this fixes the panic, but now I have many messages like:

rtsx0: Controller timeout for CMD8
rtsx0: Controller timeout for CMD8
rtsx0: Controller timeout for CMD55
rtsx0: Controller timeout for CMD55
rtsx0: Controller timeout for CMD55
rtsx0: Controller timeout for CMD55
rtsx0: Controller timeout for CMD1
rtsx0: Controller timeout for CMD1
rtsx0: Controller timeout for CMD1
rtsx0: Controller timeout for CMD1
mmc0: No compatible cards found on bus

Is it normal? I don't have any SD cards to test. So, I will disable
the card reader in BIOS.

Thanks for the hint!


On Sat, Feb 6, 2021 at 3:36 PM Jesper Schmitz Mouridsen  
wrote:
>
>  From man rtsx
>
>•   RTS522A on Lenovo P50s and Lenovo T470p, card detection and read-only
> switch are reversed.  This is solved by adding in loader.conf(5):
>
>  dev.rtsx.0.inversion=1
>
> Perhaps this applies to Thinkpad T440p as well?
>
> On 06.02.2021 13.51, Oleh Hushchenkov wrote:
> > As a workaround I disabled Realtek SD card reader RTS5227 in BIOS. Now
> > system boots fine. However not all laptops have the setting to disable
> > integrated devices...
> >
> > On Sat, Feb 6, 2021, 1:23 PM Oleh Hushchenkov 
> > wrote:
> >
> >> Looks like attached image was removed from the message. I uploaded it
> >> https://imgur.com/a/Kv1l1pB
> >>
> >> On Sat, Feb 6, 2021, 11:42 AM Oleh Hushchenkov 
> >> wrote:
> >>
> >>> On cold boot I got this panic, photo attached. Strange thing. After
> >>> enabling verbose logging system successfully booted and next reboots also
> >>> works without verbose logging. My hardware is ThinkPad T440p.
> >>>

Re: Today's 13-STABLE panic on boot

2021-02-06 Thread Jesper Schmitz Mouridsen


From man rtsx

  •   RTS522A on Lenovo P50s and Lenovo T470p, card detection and read-only
   switch are reversed.  This is solved by adding in loader.conf(5):

    dev.rtsx.0.inversion=1

Perhaps this applies to Thinkpad T440p as well?

On 06.02.2021 13.51, Oleh Hushchenkov wrote:

As a workaround I disabled Realtek SD card reader RTS5227 in BIOS. Now
system boots fine. However not all laptops have the setting to disable
integrated devices...

On Sat, Feb 6, 2021, 1:23 PM Oleh Hushchenkov 
wrote:


Looks like attached image was removed from the message. I uploaded it
https://imgur.com/a/Kv1l1pB

On Sat, Feb 6, 2021, 11:42 AM Oleh Hushchenkov 
wrote:


On cold boot I got this panic, photo attached. Strange thing. After
enabling verbose logging system successfully booted and next reboots also
works without verbose logging. My hardware is ThinkPad T440p.



Re: Today's 13-STABLE panic on boot

2021-02-06 Thread Oleh Hushchenkov

As a workaround I disabled Realtek SD card reader RTS5227 in BIOS. Now
system boots fine. However not all laptops have the setting to disable
integrated devices...

On Sat, Feb 6, 2021, 1:23 PM Oleh Hushchenkov 
wrote:

> Looks like attached image was removed from the message. I uploaded it
> https://imgur.com/a/Kv1l1pB
>
> On Sat, Feb 6, 2021, 11:42 AM Oleh Hushchenkov 
> wrote:
>
>> On cold boot I got this panic, photo attached. Strange thing. After
>> enabling verbose logging system successfully booted and next reboots also
>> works without verbose logging. My hardware is ThinkPad T440p.
>>
>

Re: Today's 13-STABLE panic on boot

2021-02-06 Thread Oleh Hushchenkov

Looks like attached image was removed from the message. I uploaded it
https://imgur.com/a/Kv1l1pB

On Sat, Feb 6, 2021, 11:42 AM Oleh Hushchenkov 
wrote:

> On cold boot I got this panic, photo attached. Strange thing. After
> enabling verbose logging system successfully booted and next reboots also
> works without verbose logging. My hardware is ThinkPad T440p.
>

Today's 13-STABLE panic on boot

2021-02-06 Thread Oleh Hushchenkov

On cold boot I got this panic, photo attached. Strange thing. After
enabling verbose logging system successfully booted and next reboots also
works without verbose logging. My hardware is ThinkPad T440p.

Re: Panic on boot with r353298 (last known working r35295)

2019-10-08 Thread Gary Jennejohn

On Tue, 8 Oct 2019 09:23:30 +
M - Krasznai András wrote:

> Hi
> 
> r353298 running on lenovo T510. Panics after starting X session, the panic is
> 
> panic: vm_fault, fault on nofault entry, addr: 0xfe00821a000
> 
> It is preceded by several error messages referring to 
> 
> drm_modeset_is_locked  failed at 
> /usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm_atomic_helper.c:577. 
> then line 622 and then line 821 of the same 
> drm_atomic_helper.c module.
> 
> 
> best regards
> 
> Andras Krasznai
> 
> -Original message-
> From: owner-freebsd-curr...@freebsd.org 
> [mailto:owner-freebsd-curr...@freebsd.org] On behalf of Evilham
> Sent: 8 October 2019 11:00
> To: FreeBSD Current
> Subject: Panic on boot with r353298 (last known working r35295)
> 
> Hey, just a heads up that on a Lenovo A485 (AMD Ryzen processor), 
> r353298 panics somewhat late in the boot process. r352925 is my 
> last known working build.
> I am building GENERIC-NODEBUG.
> 
> Sadly my pulse is shaky and I can't properly read the picture I 
> took, it appears to say:
> 
> Fatal trap 32: page fault while in kernel mode
> *something, something*
> fault mode = supervisor read data, page not present
> 
> Will try to get more details and a proper dump when I have some 
> time off (hopefully later today), just thought I'd warn before.
>

Works for me with a first-generation Ryzen 5 (but in a tower, not
a laptop) and I use a graphics card from Nvidia.

X11 also starts without errors, but I use nvidia-driver and not
drm-current-kmod.

Maybe recompiling drm-current-kmod would help.

-- 
Gary Jennejohn

RE: Panic on boot with r353298 (last known working r35295)

2019-10-08 Thread M - Krasznai András

Hi

r353298 running on lenovo T510. Panics after starting X session, the panic is

panic: vm_fault, fault on nofault entry, addr: 0xfe00821a000

It is preceded by several error messages referring to 

drm_modeset_is_locked  failed at 
/usr/local/sys/modules/drm-current-kmod/drivers/gpu/drm_atomic_helper.c:577. 
then line 622 and then line 821 of the same 
drm_atomic_helper.c module.


best regards

Andras Krasznai

-Original message-
From: owner-freebsd-curr...@freebsd.org 
[mailto:owner-freebsd-curr...@freebsd.org] On behalf of Evilham
Sent: 8 October 2019 11:00
To: FreeBSD Current
Subject: Panic on boot with r353298 (last known working r35295)

Hey, just a heads up that on a Lenovo A485 (AMD Ryzen processor), 
r353298 panics somewhat late in the boot process. r352925 is my 
last known working build.
I am building GENERIC-NODEBUG.

Sadly my pulse is shaky and I can't properly read the picture I 
took, it appears to say:

Fatal trap 32: page fault while in kernel mode
*something, something*
fault mode = supervisor read data, page not present

Will try to get more details and a proper dump when I have some 
time off (hopefully later today), just thought I'd warn before.
--
Evilham

Panic on boot with r353298 (last known working r35295)

2019-10-08 Thread Evilham

Hey, just a heads up that on a Lenovo A485 (AMD Ryzen processor), 
r353298 panics somewhat late in the boot process. r352925 is my 
last known working build.

I am building GENERIC-NODEBUG.

Sadly my pulse is shaky and I can't properly read the picture I 
took, it appears to say:


Fatal trap 32: page fault while in kernel mode
*something, something*
fault mode = supervisor read data, page not present

Will try to get more details and a proper dump when I have some 
time off (hopefully later today), just thought I'd warn before.

--
Evilham

Re: Panic on boot with r351461 (AMD ThreadRipper 2990WX)

2019-08-25 Thread Rebecca Cran

On 2019-08-25 10:55, Mark Johnston wrote:
>
> Can you please try applying this patch as well?

Thanks, that fixed the panic and allows the system to fully boot again.


-- 
Rebecca Cran


Re: Panic on boot with r351461 (AMD ThreadRipper 2990WX)

2019-08-25 Thread Mark Johnston

On Sun, Aug 25, 2019 at 10:27:48AM -0600, Rebecca Cran wrote:
> On 2019-08-25 08:30, Konstantin Belousov wrote:
> >
> > So what happens, IMO, is that for memory-less domains ds_cnt is zero
> > because ds_mask is zero, which causes the exception on divide.  You
> > can try the following combined patch, but I really dislike the fact
> > that I cannot safely use DOMAINSET_FIXED (if my diagnosis is correct).
> 
> 
> With that patch applied, boot gets a lot further but eventually panics
> after probing pcm devices:
> 
> 
> panic: vm_domainset_iter_first: Unknown policy 0

Can you please try applying this patch as well?

diff --git a/sys/vm/uma.h b/sys/vm/uma.h
index be88c57a5c66..39749ac52e99 100644
--- a/sys/vm/uma.h
+++ b/sys/vm/uma.h
@@ -292,6 +292,8 @@ uma_zone_t uma_zcache_create(char *name, int size, uma_ctor ctor, uma_dtor dtor,
 #define	UMA_ALIGN_CACHE	(0 - 1)	/* Cache line size align */
 #define	UMA_ALIGNOF(type) (_Alignof(type) - 1)	/* Alignment fit for 'type' */
 
+#define	UMA_ANYDOMAIN	-1	/* Special value for domain search. */
+
 /*
  * Destroys an empty uma zone.  If the zone is not empty uma complains loudly.
  *
diff --git a/sys/vm/uma_core.c b/sys/vm/uma_core.c
index 9d8752df7200..78eaa7b49f82 100644
--- a/sys/vm/uma_core.c
+++ b/sys/vm/uma_core.c
@@ -234,8 +234,6 @@ enum zfreeskip {
 	SKIP_FINI = 0x0002,
 };
 
-#define	UMA_ANYDOMAIN	-1	/* Special value for domain search. */
-
 /* Prototypes.. */
 
 int	uma_startup_count(int);
diff --git a/sys/vm/vm_glue.c b/sys/vm/vm_glue.c
index ed26f9607a8f..4a7bbc9770e9 100644
--- a/sys/vm/vm_glue.c
+++ b/sys/vm/vm_glue.c
@@ -454,12 +454,18 @@ vm_thread_dispose(struct thread *td)
 static int
 kstack_import(void *arg, void **store, int cnt, int domain, int flags)
 {
+	struct domainset *ds;
 	vm_object_t ksobj;
 	int i;
 
+	if (domain == UMA_ANYDOMAIN)
+		ds = DOMAINSET_RR();
+	else
+		ds = DOMAINSET_PREF(domain);
+
 	for (i = 0; i < cnt; i++) {
-		store[i] = (void *)vm_thread_stack_create(
-		    DOMAINSET_PREF(domain), , kstack_pages);
+		store[i] = (void *)vm_thread_stack_create(ds, ,
+		    kstack_pages);
 		if (store[i] == NULL)
 			break;
 	}
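
One possible reading of the vm_glue.c hunk, offered as an assumption rather
than something stated in the thread: when the import callback is handed the
"any domain" value (-1), a per-domain policy lookup has no valid entry to
return, so a round-robin policy is chosen instead. The sketch below models
only that selection; every sketch_* and SKETCH_* name is hypothetical and
none of it is the kernel's API.

```c
#include <stddef.h>

#define SKETCH_ANYDOMAIN (-1)	/* stand-in for UMA_ANYDOMAIN */
#define SKETCH_NDOMAINS 4	/* assumed number of NUMA domains */

enum sketch_policy {
	SKETCH_POLICY_INVALID = 0,
	SKETCH_POLICY_RR,
	SKETCH_POLICY_PREFER
};

struct sketch_domainset {
	enum sketch_policy policy;
	int prefer;		/* preferred domain, or -1 for none */
};

static const struct sketch_domainset sketch_rr = { SKETCH_POLICY_RR, -1 };
static const struct sketch_domainset sketch_prefer[SKETCH_NDOMAINS] = {
	{ SKETCH_POLICY_PREFER, 0 }, { SKETCH_POLICY_PREFER, 1 },
	{ SKETCH_POLICY_PREFER, 2 }, { SKETCH_POLICY_PREFER, 3 },
};

/*
 * Choose round-robin for the "any domain" value, a per-domain preference
 * for a valid domain index, and NULL for anything else.
 */
static const struct sketch_domainset *
sketch_pick_policy(int domain)
{
	if (domain == SKETCH_ANYDOMAIN)
		return (&sketch_rr);
	if (domain < 0 || domain >= SKETCH_NDOMAINS)
		return (NULL);
	return (&sketch_prefer[domain]);
}
```

Calling sketch_pick_policy(SKETCH_ANYDOMAIN) yields the round-robin entry,
while sketch_pick_policy(2) yields the entry that prefers domain 2.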

Re: Panic on boot with r351461 (AMD ThreadRipper 2990WX)

2019-08-25 Thread Rebecca Cran

On 2019-08-25 08:30, Konstantin Belousov wrote:
>
> So what happens, IMO, is that for memory-less domains ds_cnt is zero
> because ds_mask is zero, which causes the exception on divide.  You
> can try the following combined patch, but I really dislike the fact
> that I cannot safely use DOMAINSET_FIXED (if my diagnosis is correct).


With that patch applied, boot gets a lot further but eventually panics
after probing pcm devices:


panic: vm_domainset_iter_first: Unknown policy 0

cpuid = 49

time = 1

KDB: stack backtrace:

...

vm_domainset_iter_first()

vm_domainset_iter_page_init()

vm_page_grab_pages()

vm_thread_stack_create()

kstack_import()

uma_zalloc_arg()

vm_thread_new()

thread_alloc()

kthread_add()

vm_pageout()

fork_exit()

fork_trampoline()


-- 
Rebecca Cran


Re: Panic on boot with r351461 (AMD ThreadRipper 2990WX)

2019-08-25 Thread Mark Johnston

On Sun, Aug 25, 2019 at 05:30:34PM +0300, Konstantin Belousov wrote:
> On Sun, Aug 25, 2019 at 07:17:20AM -0600, Rebecca Cran wrote:
> > On 2019-08-25 00:24, Konstantin Belousov wrote:
> > > What are the panic messages ?
> > 
> > Fatal trap 18: integer divide fault while in kernel mode
> > 
> > instruction pointer = 0x20:0x80f1027c
> > 
> > stack pointer = 0x28:0x845809f0
> > 
> > frame pointer = 0x28:0x84580a00
> > 
> > code segment = base 0x0, limit 0xff, type 0x1b
> > 
> >     = DPL 0, pres 1, long 1, def32 0, gran 1
> > 
> > processor eflags = resume, IOPL = 0
> > 
> > current process = 0 ()
> > 
> > trap number = 18
> > 
> > panic: integer divide fault
> > 
> > cpuid = 0
> > 
> > time = 1
> > 
> > 
> > > What is the source line ?
> > 
> > (gdb) info line *0x80f1027c
> > Line 102 of "/usr/src/sys/vm/vm_domainset.c" starts at address
> > 0x80f10267 
> >    and ends at 0x80f1027f .
> 
> There was one more source line I asked about.
> 
> So what happens, IMO, is that for memory-less domains ds_cnt is zero
> because ds_mask is zero, which causes the exception on divide.  You
> can try the following combined patch, but I really dislike the fact
> that I cannot safely use DOMAINSET_FIXED (if my diagnosis is correct).

I think this is simply a bug.  Something like the following hack should
work: we want to leave the _FIXED domainsets unmodified, but they should
be removed from the global list (to ensure that userspace cannot specify
impossible policies).

diff --git a/sys/kern/kern_cpuset.c b/sys/kern/kern_cpuset.c
index 87f9333bf43b..931fe7e157e5 100644
--- a/sys/kern/kern_cpuset.c
+++ b/sys/kern/kern_cpuset.c
@@ -503,9 +503,17 @@ domainset_empty_vm(struct domainset *domain)
 	int i, j, max;
 
 	max = DOMAINSET_FLS(>ds_mask) + 1;
-	for (i = 0; i < max; i++)
-		if (DOMAINSET_ISSET(i, >ds_mask) && VM_DOMAIN_EMPTY(i))
+	for (i = 0; i < max; i++) {
+		if (DOMAINSET_ISSET(i, >ds_mask) &&
+		    VM_DOMAIN_EMPTY(i)) {
+			/*
+			 * Leave the domainset unmodified, in case it is a
+			 * static policy defined for use by the kernel.
+			 */
+			if (domain->ds_cnt == 1)
+				return (true);
 			DOMAINSET_CLR(i, >ds_mask);
+		}
 	domain->ds_cnt = DOMAINSET_COUNT(>ds_mask);
 	max = DOMAINSET_FLS(>ds_mask) + 1;
 	for (i = j = 0; i < max; i++) {
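
The idea in the hack above can be sketched in isolation: drop memory-less
domains from a set, but leave a single-member set untouched and merely report
it as unusable. This is a simplified model and not the kernel function; the
struct, the plain 32-bit mask and both function names are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's struct domainset. */
struct sketch_domainset {
	unsigned mask;	/* bit i set => domain i is a member */
	int cnt;	/* number of set bits in mask */
};

/* Count the set bits in v. */
static int
sketch_popcount(unsigned v)
{
	int n = 0;

	while (v != 0) {
		n += (int)(v & 1u);
		v >>= 1;
	}
	return (n);
}

/*
 * Remove the domains listed in empty_mask from ds.  A single-member set
 * whose only member is empty is left unmodified.  Returns true when the
 * set has no usable domain.
 */
static bool
sketch_prune_empty(struct sketch_domainset *ds, unsigned empty_mask)
{
	if (ds->cnt == 1 && (ds->mask & empty_mask) != 0)
		return (true);
	ds->mask &= ~empty_mask;
	ds->cnt = sketch_popcount(ds->mask);
	return (ds->cnt == 0);
}
```

With domains 1 and 3 empty (empty_mask 0xA), a set containing only domain 1
keeps its mask and is reported as unusable, while a set of all four domains
is reduced to domains 0 and 2.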

> I would prefer for kmem_malloc_domainset(DOMAINSET_FIXED(unpopulated domain))
> to fail with NULL result, and then I would manually fall-back to
> DOMAINSET_PREF().
> 
> OTOH, I think the chunk for mp_realloc_pcpu() is the final fix.

Looks ok to me.

Re: Panic on boot with r351461 (AMD ThreadRipper 2990WX)

2019-08-25 Thread Konstantin Belousov

On Sun, Aug 25, 2019 at 07:17:20AM -0600, Rebecca Cran wrote:
> On 2019-08-25 00:24, Konstantin Belousov wrote:
> > What are the panic messages ?
> 
> Fatal trap 18: integer divide fault while in kernel mode
> 
> instruction pointer = 0x20:0x80f1027c
> 
> stack pointer = 0x28:0x845809f0
> 
> frame pointer = 0x28:0x84580a00
> 
> code segment = base 0x0, limit 0xff, type 0x1b
> 
>     = DPL 0, pres 1, long 1, def32 0, gran 1
> 
> processor eflags = resume, IOPL = 0
> 
> current process = 0 ()
> 
> trap number = 18
> 
> panic: integer divide fault
> 
> cpuid = 0
> 
> time = 1
> 
> 
> > What is the source line ?
> 
> (gdb) info line *0x80f1027c
> Line 102 of "/usr/src/sys/vm/vm_domainset.c" starts at address
> 0x80f10267 
>    and ends at 0x80f1027f .

There was one more source line I asked about.

So what happens, IMO, is that for memory-less domains ds_cnt is zero
because ds_mask is zero, which causes the exception on divide.  You
can try the following combined patch, but I really dislike the fact
that I cannot safely use DOMAINSET_FIXED (if my diagnosis is correct).
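
To make the diagnosis concrete, here is a minimal sketch of a round-robin
pick that takes a cursor modulo the member count, which is where an empty set
would divide by zero. It is an illustration under assumptions, not the code
at vm_domainset.c line 102: the struct, the 32-bit mask and
sketch_next_domain are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's struct domainset. */
struct sketch_domainset {
	unsigned mask;	/* bit i set => domain i is a member */
	int cnt;	/* number of set bits in mask */
};

/*
 * Pick the next member round-robin and store it in *out.  Returns false
 * instead of dividing when the set is empty (cnt == 0).
 */
static bool
sketch_next_domain(const struct sketch_domainset *ds, unsigned *cursor,
    int *out)
{
	int i, nth;

	if (ds->cnt == 0)
		return (false);
	nth = (int)((*cursor)++ % (unsigned)ds->cnt);
	for (i = 0; i < 32; i++) {
		if ((ds->mask & (1u << i)) == 0)
			continue;
		if (nth-- == 0) {
			*out = i;
			return (true);
		}
	}
	return (false);
}
```

For a set holding domains 0 and 2, successive calls return 0, 2, 0 and so on;
for an empty set the function reports failure rather than trapping.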

I would prefer for kmem_malloc_domainset(DOMAINSET_FIXED(unpopulated domain))
to fail with NULL result, and then I would manually fall-back to
DOMAINSET_PREF().
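
The behaviour wished for here (a strict allocation that returns NULL for an
unpopulated domain, followed by a manual fallback) might look roughly like
the sketch below. It models domains with a small table and uses malloc() as a
stand-in allocator; the four-domain layout with domains 1 and 3 empty is an
assumption, and all sketch_* names are hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_NDOMAINS 4	/* assumed number of NUMA domains */

/* Assumed layout: only domains 0 and 2 have local memory. */
static const int sketch_has_memory[SKETCH_NDOMAINS] = { 1, 0, 1, 0 };

/* Strict allocation: fail with NULL unless the domain has memory. */
static void *
sketch_alloc_fixed(int domain, size_t size)
{
	if (domain < 0 || domain >= SKETCH_NDOMAINS ||
	    !sketch_has_memory[domain])
		return (NULL);
	return (malloc(size));
}

/* Preferred allocation: try the given domain, then any populated one. */
static void *
sketch_alloc_pref(int domain, size_t size)
{
	void *p;
	int i;

	if ((p = sketch_alloc_fixed(domain, size)) != NULL)
		return (p);
	for (i = 0; i < SKETCH_NDOMAINS; i++)
		if ((p = sketch_alloc_fixed(i, size)) != NULL)
			return (p);
	return (NULL);
}

/* Caller-side fallback: strict first, preferred if that fails. */
static void *
sketch_alloc_with_fallback(int domain, size_t size)
{
	void *p = sketch_alloc_fixed(domain, size);

	return (p != NULL ? p : sketch_alloc_pref(domain, size));
}
```

A strict request for domain 1 returns NULL in this model, while
sketch_alloc_with_fallback(1, size) succeeds by taking memory from another
domain.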

OTOH, I think the chunk for mp_realloc_pcpu() is the final fix.

diff --git a/sys/amd64/amd64/mp_machdep.c b/sys/amd64/amd64/mp_machdep.c
index b38c688f8b4..2c3dc8744f6 100644
--- a/sys/amd64/amd64/mp_machdep.c
+++ b/sys/amd64/amd64/mp_machdep.c
@@ -402,6 +402,8 @@ mp_realloc_pcpu(int cpuid, int domain)
 		return;
 	m = vm_page_alloc_domain(NULL, 0, domain,
 	    VM_ALLOC_NORMAL | VM_ALLOC_NOOBJ);
+	if (m == NULL)
+		return;
 	na = PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m));
 	pagecopy((void *)oa, (void *)na);
 	pmap_qenter((vm_offset_t)&__pcpu[cpuid], , 1);
@@ -481,10 +483,10 @@ native_start_all_aps(void)
 		    M_ZERO);
 		mce_stack = (char *)kmem_malloc(PAGE_SIZE, M_WAITOK | M_ZERO);
 		nmi_stack = (char *)kmem_malloc_domainset(
-		    DOMAINSET_FIXED(domain), PAGE_SIZE, M_WAITOK | M_ZERO);
+		    DOMAINSET_PREF(domain), PAGE_SIZE, M_WAITOK | M_ZERO);
 		dbg_stack = (char *)kmem_malloc_domainset(
-		    DOMAINSET_FIXED(domain), PAGE_SIZE, M_WAITOK | M_ZERO);
-		dpcpu = (void *)kmem_malloc_domainset(DOMAINSET_FIXED(domain),
+		    DOMAINSET_PREF(domain), PAGE_SIZE, M_WAITOK | M_ZERO);
+		dpcpu = (void *)kmem_malloc_domainset(DOMAINSET_PREF(domain),
 		    DPCPU_SIZE, M_WAITOK | M_ZERO);
 
 		bootSTK = (char *)bootstacks[cpu] +

Re: Panic on boot with r351461 (AMD ThreadRipper 2990WX)

2019-08-25 Thread Rebecca Cran

On 2019-08-25 00:24, Konstantin Belousov wrote:
> What are the panic messages ?

Fatal trap 18: integer divide fault while in kernel mode

instruction pointer = 0x20:0x80f1027c

stack pointer = 0x28:0x845809f0

frame pointer = 0x28:0x84580a00

code segment = base 0x0, limit 0xff, type 0x1b

    = DPL 0, pres 1, long 1, def32 0, gran 1

processor eflags = resume, IOPL = 0

current process = 0 ()

trap number = 18

panic: integer divide fault

cpuid = 0

time = 1


> What is the source line ?

(gdb) info line *0x80f1027c
Line 102 of "/usr/src/sys/vm/vm_domainset.c" starts at address
0x80f10267 
   and ends at 0x80f1027f .


-- 

Rebecca Cran


Re: Panic on boot with r351461 (AMD ThreadRipper 2990WX)

2019-08-25 Thread Konstantin Belousov

On Sat, Aug 24, 2019 at 05:29:23PM -0600, Rebecca Cran wrote:
> On 2019-08-24 17:08, Konstantin Belousov wrote:
> >
> > Use gdb instead.
> 
> Ah, thanks.
> 
> (gdb) info line *0x8117f67c
> Line 405 of "/usr/src/sys/amd64/amd64/mp_machdep.c" starts at address
> 0x8117f674  and ends at
> 0x8117f69a 
> 
> 
> > What was the previous bootable version of the kernel ? 
> 
> I attempted to upgrade from r350575.
> 
> > Do you happen to have NUMA node without any local memory ? (Look at the
> > SRAT table).  If yes, try this patch.
> 
> After applying the patch, I get a crash with the following backtrace:
What are the panic messages ?

> 
> 
> vm_domainset_iter_first()
What is the source line ?

> 
> vm_domainset_iter_policy_init()
> 
> kmem_malloc_domainset()
> 
> native_start_all_aps()
What is the source line ?

> 
> cpu_mp_start()
> 
> mp_start()
> 
> mi_startup()
> 
> 
> (gdb) info line *0x80f1027c
> Line 102 of "/usr/src/sys/vm/vm_domainset.c" starts at address
> 0x80f10267 
>    and ends at 0x80f1027f .
> 
> The SRAT contains:
> 
> 
>     Type=Memory
>     Flags={ENABLED}
>     Base Address=0x
>     Length=0x000a
>     Proximity Domain=0
> 
>     Type=Memory
>     Flags={ENABLED}
>     Base Address=0x0010
>     Length=0x7ff0
>     Proximity Domain=0
> 
>     Type=Memory
>     Flags={ENABLED}
>     Base Address=0x0001
>     Length=0x000f8000
>     Proximity Domain=0
> 
>     Type=Memory
>     Flags={ENABLED}
>     Base Address=0x00108000
>     Length=0x0010
>     Proximity Domain=2
> 
> 
> -- 
> 
> Rebecca Cran

Re: Panic on boot with r351461 (AMD ThreadRipper 2990WX)

2019-08-24 Thread Rebecca Cran

On 2019-08-24 17:08, Konstantin Belousov wrote:
>
> Do you happen to have NUMA node without any local memory ? (Look at the
> SRAT table).  If yes, try this patch.

I've just remembered, that's one of the big differences between
ThreadRipper and EPYC: the EPYC has memory links on all four dies, while
the ThreadRipper 2990WX only has links on half.

From
https://www.anandtech.com/show/13124/the-amd-threadripper-2990wx-and-2950x-review/15

"With the new processors, we have the situation on the right, where only
some cores are directly attached to memory, and others are not. In order
to go from one of these cores to main memory, it requires an extra hop,
which adds latency. When all the cores are requesting access, this
causes congestion."

-- 
Rebecca Cran


Re: Panic on boot with r351461 (AMD ThreadRipper 2990WX)

2019-08-24 Thread Rebecca Cran

On 2019-08-24 17:08, Konstantin Belousov wrote:
>
> Use gdb instead.

Ah, thanks.

(gdb) info line *0x8117f67c
Line 405 of "/usr/src/sys/amd64/amd64/mp_machdep.c" starts at address
0x8117f674  and ends at
0x8117f69a 


> What was the previous bootable version of the kernel ? 

I attempted to upgrade from r350575.

> Do you happen to have NUMA node without any local memory ? (Look at the
> SRAT table).  If yes, try this patch.

After applying the patch, I get a crash with the following backtrace:


vm_domainset_iter_first()

vm_domainset_iter_policy_init()

kmem_malloc_domainset()

native_start_all_aps()

cpu_mp_start()

mp_start()

mi_startup()


(gdb) info line *0x80f1027c
Line 102 of "/usr/src/sys/vm/vm_domainset.c" starts at address
0x80f10267 
   and ends at 0x80f1027f .

The SRAT contains:


    Type=Memory
    Flags={ENABLED}
    Base Address=0x
    Length=0x000a
    Proximity Domain=0

    Type=Memory
    Flags={ENABLED}
    Base Address=0x0010
    Length=0x7ff0
    Proximity Domain=0

    Type=Memory
    Flags={ENABLED}
    Base Address=0x0001
    Length=0x000f8000
    Proximity Domain=0

    Type=Memory
    Flags={ENABLED}
    Base Address=0x00108000
    Length=0x0010
    Proximity Domain=2
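
As an aside, the table above lists memory only in proximity domains 0 and 2. A minimal userland sketch (not kernel code) of checking such a table for domains with no local memory follows. The struct, the function names, and the assumption of four domains (one per die) are all hypothetical and only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One SRAT memory-affinity entry, reduced to the fields used here. */
struct mem_affinity {
    int  domain;   /* "Proximity Domain" from the SRAT dump */
    bool enabled;  /* Flags={ENABLED} */
};

/* Return true if any enabled memory range belongs to the given domain. */
bool
domain_has_memory(const struct mem_affinity *tbl, size_t n, int domain)
{
    assert(tbl != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (tbl[i].enabled && tbl[i].domain == domain)
            return true;
    return false;
}

/* Count the domains in [0, ndomains) that have no local memory. */
int
count_memoryless_domains(const struct mem_affinity *tbl, size_t n, int ndomains)
{
    int count = 0;

    for (int d = 0; d < ndomains; d++)
        if (!domain_has_memory(tbl, n, d))
            count++;
    return count;
}

/* The four memory entries from the SRAT dump: domains 0, 0, 0 and 2. */
const struct mem_affinity srat_example[] = {
    { 0, true }, { 0, true }, { 0, true }, { 2, true },
};
```

Under the four-domain assumption, domains 1 and 3 come out as memory-less, which is the case the question about a NUMA node without local memory is asking about.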


-- 

Rebecca Cran


Re: Panic on boot with r351461 (AMD ThreadRipper 2990WX)

2019-08-24 Thread Konstantin Belousov

On Sat, Aug 24, 2019 at 04:54:26PM -0600, Rebecca Cran wrote:
> On 2019-08-24 14:33, Konstantin Belousov wrote:
> > On Sat, Aug 24, 2019 at 02:22:18PM -0600, Rebecca Cran wrote:
> >> instruction pointer = 0x20: 0x811bc664
> > So what is the source line for this address ?
> 
> 
> I built a new kernel and got a new panic instruction pointer address of
> 0x8117f67c, but running it through addr2line only gave a
> function name, not a line number:
> 
> addr2line -af -e /usr/lib/debug/boot/kernel/kernel.debug 0x8117f67c
Use gdb instead.

> 
> mp_realloc_pcpu
> /usr/src/sys/amd64/amd64/mp_machdep.c:0

What was the previous bootable version of the kernel?

Do you happen to have a NUMA node without any local memory? (Look at the
SRAT table.)  If yes, try this patch.

diff --git a/sys/amd64/amd64/mp_machdep.c b/sys/amd64/amd64/mp_machdep.c
index b38c688f8b4..84ce0b779ab 100644
--- a/sys/amd64/amd64/mp_machdep.c
+++ b/sys/amd64/amd64/mp_machdep.c
@@ -402,6 +402,8 @@ mp_realloc_pcpu(int cpuid, int domain)
return;
m = vm_page_alloc_domain(NULL, 0, domain,
VM_ALLOC_NORMAL | VM_ALLOC_NOOBJ);
+   if (m == NULL)
+   return;
na = PHYS_TO_DMAP(VM_PAGE_TO_PHYS(m));
pagecopy((void *)oa, (void *)na);
pmap_qenter((vm_offset_t)&__pcpu[cpuid], , 1);
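
The idea of the patch can be sketched outside the kernel as follows. This is a hedged illustration only: `struct page`, `alloc_page_in_domain()` and `relocate_to_domain()` are hypothetical stand-ins, and the choice of domains 1 and 3 as memory-less is an assumption. It shows the caller keeping its existing page when a domain-local allocation returns NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a page; not the kernel's struct vm_page. */
struct page { int domain; };

/* Hypothetical per-domain pools: domains 1 and 3 have no memory. */
static struct page pool[4] = { {0}, {1}, {2}, {3} };
static const int domain_has_pages[4] = { 1, 0, 1, 0 };

/* Mimics an allocator that returns NULL when the domain has no pages. */
struct page *
alloc_page_in_domain(int domain)
{
    if (domain < 0 || domain >= 4 || !domain_has_pages[domain])
        return NULL;
    return &pool[domain];
}

/*
 * Try to move per-CPU data to a page in the CPU's own domain.
 * Returns the page now in use: the new one on success, or the
 * original one if the domain-local allocation fails.
 */
struct page *
relocate_to_domain(struct page *current, int domain)
{
    struct page *m;

    assert(current != NULL);
    m = alloc_page_in_domain(domain);
    if (m == NULL)
        return current;  /* keep the existing page, never touch NULL */
    return m;
}
```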

Re: Panic on boot with r351461 (AMD ThreadRipper 2990WX)

2019-08-24 Thread Rebecca Cran

On 2019-08-24 14:33, Konstantin Belousov wrote:
> On Sat, Aug 24, 2019 at 02:22:18PM -0600, Rebecca Cran wrote:
>> instruction pointer = 0x20: 0x811bc664
> So what is the source line for this address ?


I built a new kernel and got a new panic instruction pointer address of
0x8117f67c, but running it through addr2line only gave a
function name, not a line number:

addr2line -af -e /usr/lib/debug/boot/kernel/kernel.debug 0x8117f67c

mp_realloc_pcpu
/usr/src/sys/amd64/amd64/mp_machdep.c:0


-- 
Rebecca Cran


Re: Panic on boot with r351461 (AMD ThreadRipper 2990WX)

2019-08-24 Thread Konstantin Belousov

On Sat, Aug 24, 2019 at 02:22:18PM -0600, Rebecca Cran wrote:
> I updated my kernel to r351461 today and now get a panic on boot.
> 
> 
> CPU: AMD Ryzen Threadripper 2990WX 32-Core Processor (2994.45-MHz
> K8-class CPU)
>   Origin="AuthenticAMD"  Id=0x800f82  Family=0x17  Model=0x8  Stepping=2
>  
> Features=0x178bfbff
>  
> Features2=0x7ed8320b
>   AMD Features=0x2e500800
>   AMD
> Features2=0x35c233ff
>   Structured Extended
> Features=0x209c01a9
>   XSAVE Features=0xf
>   AMD Extended Feature Extensions ID
> EBX=0x1007
>   SVM: NP,NRIP,VClean,AFlush,DAssist,NAsids=32768
>   TSC: P-state invariant, performance statistics
> real memory  = 137438953472 (131072 MB)
> avail memory = 133711564800 (127517 MB)
> Event timer "LAPIC" quality 600
> ACPI APIC Table: 
> 
> Fatal trap 12: page fault while in kernel mode
> 
> cpuid = 0; apic id = 00
> 
> fault virtual address   = 0x30
> 
> fault code  = supervisor read data, page not present
> 
> instruction pointer = 0x20: 0x811bc664
So what is the source line for this address ?

> 
> stack pointer = 0x28:0x8441faa0
> 
> frame pointer = 0x28:0x8441fae0
> 
> code segment = base 0x0, limit 0xf, type 0x1b
> 
>   = DPL 0, pres 1, long 1, def32 0, gran 1
> 
> processor eflags = resume, IOPL = 0
> 
> current process = 0 ()
> 
> trap number = 12
> 
> panic: page fault
> 
> cpuid = 0
> 
> time = 1
> 
> KDB: stack backtrace:
> 
> db_trace_self_wrapper()
> 
> ...
> 
> --- trap 0xc, rip = 0x811bc664, rsp = 0x8441faa0, rbp =
> 0x8441fae0 ---
> 
> native_start_all_aps()
> 
> cpu_mp_start()
> 
> mp_start()
> 
> mi_startup()
> 
> 
> -- 
> Rebecca Cran
> 

Panic on boot with r351461 (AMD ThreadRipper 2990WX)

2019-08-24 Thread Rebecca Cran

I updated my kernel to r351461 today and now get a panic on boot.


CPU: AMD Ryzen Threadripper 2990WX 32-Core Processor (2994.45-MHz
K8-class CPU)
  Origin="AuthenticAMD"  Id=0x800f82  Family=0x17  Model=0x8  Stepping=2
 
Features=0x178bfbff
 
Features2=0x7ed8320b
  AMD Features=0x2e500800
  AMD
Features2=0x35c233ff
  Structured Extended
Features=0x209c01a9
  XSAVE Features=0xf
  AMD Extended Feature Extensions ID
EBX=0x1007
  SVM: NP,NRIP,VClean,AFlush,DAssist,NAsids=32768
  TSC: P-state invariant, performance statistics
real memory  = 137438953472 (131072 MB)
avail memory = 133711564800 (127517 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: 

Fatal trap 12: page fault while in kernel mode

cpuid = 0; apic id = 00

fault virtual address   = 0x30

fault code  = supervisor read data, page not present

instruction pointer = 0x20: 0x811bc664

stack pointer = 0x28:0x8441faa0

frame pointer = 0x28:0x8441fae0

code segment = base 0x0, limit 0xf, type 0x1b

  = DPL 0, pres 1, long 1, def32 0, gran 1

processor eflags = resume, IOPL = 0

current process = 0 ()

trap number = 12

panic: page fault

cpuid = 0

time = 1

KDB: stack backtrace:

db_trace_self_wrapper()

...

--- trap 0xc, rip = 0x811bc664, rsp = 0x8441faa0, rbp =
0x8441fae0 ---

native_start_all_aps()

cpu_mp_start()

mp_start()

mi_startup()


-- 
Rebecca Cran


Re: Strange panic at boot with vmm in loader.conf vs manually loading it

2018-10-15 Thread Mike Tancsa

On 10/14/2018 2:19 PM, Mateusz Guzik wrote:
> On 10/14/18, Mike Tancsa  wrote:
>> On 10/13/2018 12:48 PM, Allan Jude wrote:
>>> Strange that your crash is in ZFS here...
>>>
>>> Can you take a crash dump?
>>>
>>> It looks like something is trying to write to uninitialized memory here.
>> I will need to pop in another drive or can I do a netdump at this point ?
>>
> This should be fixed with https://svnweb.freebsd.org/changeset/base/339355
> i.e. just update.
>
Thanks, just tried and all is good!


    ---Mike


-- 
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   


Re: Strange panic at boot with vmm in loader.conf vs manually loading it

2018-10-14 Thread Mateusz Guzik

On 10/14/18, Mike Tancsa  wrote:
> On 10/13/2018 12:48 PM, Allan Jude wrote:
>>
>> Strange that your crash is in ZFS here...
>>
>> Can you take a crash dump?
>>
>> It looks like something is trying to write to uninitialized memory here.
>
> I will need to pop in another drive or can I do a netdump at this point ?
>

This should be fixed with https://svnweb.freebsd.org/changeset/base/339355
i.e. just update.

-- 
Mateusz Guzik 

Re: Strange panic at boot with vmm in loader.conf vs manually loading it

2018-10-14 Thread Mike Tancsa

On 10/13/2018 12:48 PM, Allan Jude wrote:
>
> Strange that your crash is in ZFS here...
>
> Can you take a crash dump?
>
> It looks like something is trying to write to uninitialized memory here. 

I will need to pop in another drive or can I do a netdump at this point ?

    ---Mike

-- 
---
Mike Tancsa, tel +1 519 651 3400 x203
Sentex Communications, m...@sentex.net
Providing Internet services since 1994 www.sentex.net
Cambridge, Ontario Canada   



Re: Strange panic at boot with vmm in loader.conf vs manually loading it

2018-10-13 Thread Cy Schubert

In message <8f033c7c-af8f-1ebc-d787-548634f10...@freebsd.org>, Allan Jude writes:
> On 10/12/2018 11:52, Mike Tancsa wrote:
> > I am guessing this does not have anything to do with vmm being loaded,
> > but hardware being initialized in a particular order? If I load vmm in
> > loader.conf, the box panics at boot up.  However, manually loading it
> > all seems to work.  Hardware is PRIME X370-PRO, AMD Ryzen 5 1600X 32G
> > RAM.  FreeBSD 12.0-ALPHA9 r339328 GENERIC-NODEBUG
> > 
> > 
> > Leading up to the crash, I see
> > 
> > 
> > ugen0.1: <0x1022 XHCI root HUB> at usbus0
> > ugen1.1: <0x1b21 XHCI root HUB> at usbus1
> > Trying to mount root from zfs:zroot/ROOT/default []...
> > uhub0: ugen2.1: <0x1022 XHCI root HUB> at usbus2
> > Root mount waiting for: usbus2<0x1022 XHCI root HUB, class 9/0, rev
> > 3.00/1.00, addr 1> on usbus0
> >   usbus1 usbus0
> > uhub1: <0x1022 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus2
> > uhub2: <0x1b21 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus1
> > uhub2: 4 ports with 4 removable, self powered
> > uhub1: 8 ports with 8 removable, self powered
> > uhub0: 22 ports with 22 removable, self powered
> > 
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 0; apic id = 00
> > fault virtual address   = 0x398
> > fault code              = supervisor write data, page not present
> > instruction pointer     = 0x20:0x8273d776
> > stack pointer           = 0x28:0xfe0075d55230
> > frame pointer           = 0x28:0xfe0075d55270
> > code segment            = base 0x0, limit 0xf, type 0x1b
> >                         = DPL 0, pres 1, long 1, def32 0, gran 1
> > processor eflags        = interrupt enabled, resume, IOPL = 0
> > current process         = 1 (kernel)
> > [ thread pid 1 tid 12 ]
> > Stopped at      rrw_enter_read_impl+0x36:       lock cmpxchgq
> > %r14,0x18(%rbx)
> > db> bt
> > Tracing pid 1 tid 12 td 0xf8000567d580
> > rrw_enter_read_impl() at rrw_enter_read_impl+0x36/frame 0xfe0075d55270
> > zfs_mount() at zfs_mount+0x7b2/frame 0xfe0075d55400
> > vfs_domount() at vfs_domount+0x5b2/frame 0xfe0075d55630
> > vfs_donmount() at vfs_donmount+0x930/frame 0xfe0075d556d0
> > kernel_mount() at kernel_mount+0x3d/frame 0xfe0075d55720
> > parse_mount() at parse_mount+0x451/frame 0xfe0075d55860
> > vfs_mountroot() at vfs_mountroot+0x7a0/frame 0xfe0075d559f0
> > start_init() at start_init+0x27/frame 0xfe0075d55a70
> > fork_exit() at fork_exit+0x83/frame 0xfe0075d55ab0
> > fork_trampoline() at fork_trampoline+0xe/frame 0xfe0075d55ab0
> > --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> > db>
>
> Strange that your crash is in ZFS here...
>
> Can you take a crash dump?
>
> It looks like something is trying to write to uninitialized memory here.
>

I was digging into this before I left on vacation. You can reproduce
it with:

mount -t zfs tank/nonexistent /mnt

A nonexistent dataset or zpool triggers the panic. I discovered it by
chance through a typo in fstab. The panic occurs with INVARIANTS;
without INVARIANTS the result is a hard hang.

I got as far as discovering that f_mntfromname pointed to a null string 
but ran out of time before I left.


-- 
Cheers,
Cy Schubert 
FreeBSD UNIX: Web:  http://www.FreeBSD.org

The need of the many outweighs the greed of the few.



Re: Strange panic at boot with vmm in loader.conf vs manually loading it

2018-10-13 Thread Allan Jude


On 10/12/2018 11:52, Mike Tancsa wrote:

I am guessing this does not have anything to do with vmm being loaded,
but hardware being initialized in a particular order? If I load vmm in
loader.conf, the box panics at boot up.  However, manually loading it
all seems to work.  Hardware is PRIME X370-PRO, AMD Ryzen 5 1600X 32G
RAM.  FreeBSD 12.0-ALPHA9 r339328 GENERIC-NODEBUG


Leading up to the crash, I see


ugen0.1: <0x1022 XHCI root HUB> at usbus0
ugen1.1: <0x1b21 XHCI root HUB> at usbus1
Trying to mount root from zfs:zroot/ROOT/default []...
uhub0: ugen2.1: <0x1022 XHCI root HUB> at usbus2
Root mount waiting for: usbus2<0x1022 XHCI root HUB, class 9/0, rev
3.00/1.00, addr 1> on usbus0
  usbus1 usbus0
uhub1: <0x1022 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus2
uhub2: <0x1b21 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus1
uhub2: 4 ports with 4 removable, self powered
uhub1: 8 ports with 8 removable, self powered
uhub0: 22 ports with 22 removable, self powered

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x398
fault code  = supervisor write data, page not present
instruction pointer = 0x20:0x8273d776
stack pointer   = 0x28:0xfe0075d55230
frame pointer   = 0x28:0xfe0075d55270
code segment    = base 0x0, limit 0xf, type 0x1b
     = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags    = interrupt enabled, resume, IOPL = 0
current process = 1 (kernel)
[ thread pid 1 tid 12 ]
Stopped at  rrw_enter_read_impl+0x36:   lock cmpxchgq
%r14,0x18(%rbx)
db> bt
Tracing pid 1 tid 12 td 0xf8000567d580
rrw_enter_read_impl() at rrw_enter_read_impl+0x36/frame 0xfe0075d55270
zfs_mount() at zfs_mount+0x7b2/frame 0xfe0075d55400
vfs_domount() at vfs_domount+0x5b2/frame 0xfe0075d55630
vfs_donmount() at vfs_donmount+0x930/frame 0xfe0075d556d0
kernel_mount() at kernel_mount+0x3d/frame 0xfe0075d55720
parse_mount() at parse_mount+0x451/frame 0xfe0075d55860
vfs_mountroot() at vfs_mountroot+0x7a0/frame 0xfe0075d559f0
start_init() at start_init+0x27/frame 0xfe0075d55a70
fork_exit() at fork_exit+0x83/frame 0xfe0075d55ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe0075d55ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
db>


Strange that your crash is in ZFS here...

Can you take a crash dump?

It looks like something is trying to write to uninitialized memory here.



On a normal boot, the next line would be atrtc0

uhub0: Root mount waiting for: usbus2ugen2.1: <0x1022 XHCI root HUB> at usbus2
<0x1022 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0
  usbus1 usbus0uhub1: <0x1b21 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> 
on usbus1

uhub2: <0x1022 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus2
uhub1: 4 ports with 4 removable, self powered
uhub2: 8 ports with 8 removable, self powered
uhub0: 22 ports with 22 removable, self powered
atrtc0: providing initial system time
start_init: trying /sbin/init
Setting hostuuid: c3297ba0-3f01-11e7-8725-6045cba08a84.
Setting hostid: 0x094fa67e.
Starting file system checks:
Mounting local filesystems:.


snip




---Mike






--
Allan Jude

Strange panic at boot with vmm in loader.conf vs manually loading it

2018-10-12 Thread Mike Tancsa

I am guessing this does not have anything to do with vmm being loaded,
but hardware being initialized in a particular order? If I load vmm in
loader.conf, the box panics at boot up.  However, manually loading it
all seems to work.  Hardware is PRIME X370-PRO, AMD Ryzen 5 1600X 32G
RAM.  FreeBSD 12.0-ALPHA9 r339328 GENERIC-NODEBUG


Leading up to the crash, I see


ugen0.1: <0x1022 XHCI root HUB> at usbus0
ugen1.1: <0x1b21 XHCI root HUB> at usbus1
Trying to mount root from zfs:zroot/ROOT/default []...
uhub0: ugen2.1: <0x1022 XHCI root HUB> at usbus2
Root mount waiting for: usbus2<0x1022 XHCI root HUB, class 9/0, rev
3.00/1.00, addr 1> on usbus0
 usbus1 usbus0
uhub1: <0x1022 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus2
uhub2: <0x1b21 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus1
uhub2: 4 ports with 4 removable, self powered
uhub1: 8 ports with 8 removable, self powered
uhub0: 22 ports with 22 removable, self powered

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x398
fault code  = supervisor write data, page not present
instruction pointer = 0x20:0x8273d776
stack pointer   = 0x28:0xfe0075d55230
frame pointer   = 0x28:0xfe0075d55270
code segment    = base 0x0, limit 0xf, type 0x1b
    = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags    = interrupt enabled, resume, IOPL = 0
current process = 1 (kernel)
[ thread pid 1 tid 12 ]
Stopped at  rrw_enter_read_impl+0x36:   lock cmpxchgq  
%r14,0x18(%rbx)
db> bt
Tracing pid 1 tid 12 td 0xf8000567d580
rrw_enter_read_impl() at rrw_enter_read_impl+0x36/frame 0xfe0075d55270
zfs_mount() at zfs_mount+0x7b2/frame 0xfe0075d55400
vfs_domount() at vfs_domount+0x5b2/frame 0xfe0075d55630
vfs_donmount() at vfs_donmount+0x930/frame 0xfe0075d556d0
kernel_mount() at kernel_mount+0x3d/frame 0xfe0075d55720
parse_mount() at parse_mount+0x451/frame 0xfe0075d55860
vfs_mountroot() at vfs_mountroot+0x7a0/frame 0xfe0075d559f0
start_init() at start_init+0x27/frame 0xfe0075d55a70
fork_exit() at fork_exit+0x83/frame 0xfe0075d55ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe0075d55ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
db>

On a normal boot, the next line would be atrtc0

uhub0: Root mount waiting for: usbus2ugen2.1: <0x1022 XHCI root HUB> at usbus2
<0x1022 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus0
 usbus1 usbus0uhub1: <0x1b21 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> 
on usbus1

uhub2: <0x1022 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on usbus2
uhub1: 4 ports with 4 removable, self powered
uhub2: 8 ports with 8 removable, self powered
uhub0: 22 ports with 22 removable, self powered
atrtc0: providing initial system time
start_init: trying /sbin/init
Setting hostuuid: c3297ba0-3f01-11e7-8725-6045cba08a84.
Setting hostid: 0x094fa67e.
Starting file system checks:
Mounting local filesystems:.
ELF ldconfig path: /lib /usr/lib /usr/lib/compat /usr/local/lib 
/usr/local/lib/perl5/5.26/mach/CORE
32-bit compatibility ldconfig path: /usr/lib32
Setting hostname: ryzenbsd12.sentex.ca.

Manually loading it, dmesg shows

AMD-Vi: IVRS Info VAsize = 64 PAsize = 48 GVAsize = 2 flags:0
driver bug: Unable to set devclass (class: ppc devname: (unknown))
ivhd0:  on acpi0
ivhd0: Flag:b0
ivhd0: Features(type:0x11) MsiNumPPR = 0 PNBanks= 2 PNCounters= 0
ivhd0: Extended features[31:0]:22294ada HATS = 0x2 
GATS = 0x0 GLXSup = 0x1 SmiFSup = 0x1 SmiFRC = 0x2 GAMSup = 0x1 DualPortLogSup 
= 0x2 DualEventLogSup = 0x2
ivhd0: Extended features[62:32]:f77ef Max PASID: 0x2f DevTblSegSup = 0x3 
MarcSup = 0x1
ivhd0: supported paging level:7, will use only: 4
ivhd0: device range: 0x0 - 0x
ivhd0: PCI cap 0x190b640f@0x40 feature:19

and loading it manually with boot.verbose set

pci0: driver added
found-> vendor=0x1022, dev=0x1451, revid=0x00
domain=0, bus=0, slot=0, func=2
class=08-06-00, hdrtype=0x00, mfdev=1
cmdreg=0x0004, statreg=0x0010, cachelnsz=0 (dwords)
lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns)
MSI supports 4 messages, 64 bit
pci0:0:0:2: reprobing on driver added
found-> vendor=0x1022, dev=0x790b, revid=0x59
domain=0, bus=0, slot=20, func=0
class=0c-05-00, hdrtype=0x00, mfdev=1
cmdreg=0x0403, statreg=0x0220, cachelnsz=0 (dwords)
lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns)
pci0:0:20:0: reprobing on driver added
pci1: driver added
pci2: driver added
pci3: driver added
pci4: driver added
pci5: driver added
pci6: driver added
pci7: driver added
pci8: driver added
pci9: driver added
found-> vendor=0x1425, dev=0x5501, revid=0x00
domain=0, bus=9, slot=0, func=5
class=01-00-00, hdrtype=0x00, mfdev=1
cmdreg=0x0006, statreg=0x0010, cachelnsz=16 (dwords)
lattimer=0x00 (0 ns), mingnt=0x00 (0 ns), maxlat=0x00 (0 ns)

Re: Current panic on boot on H11DSI motherboard with epyc cpu (nexus_add_irq: failed)

2018-05-18 Thread Kevin Day


> On Apr 18, 2018, at 1:42 PM, John Baldwin  wrote:
>> 
>> The change made for it was:
>> 
>> Index: sys/x86/x86/nexus.c
>> ===
>> --- sys/x86/x86/nexus.c (revision 332663)
>> +++ sys/x86/x86/nexus.c (working copy)
>> @@ -698,7 +698,7 @@
>> {
>> 
>>if (rman_manage_region(_rman, irq, irq) != 0)
>> -   panic("%s: failed", __func__);
>> +   panic("%s: failed irq is: %lu", __func__, irq);
>> }
> 
> O, this is a different issue.  Sorry.  As a hack, try changing
> 'FIRST_MSI_INT' to 512 in sys/amd64/include/intr_machdep.h.  The issue
> is that some systems now include more than 256 interrupt pins on I/O
> APICs, so IRQ 256 is already reserved for use by one of those
> interrupt pins.  The real fix is that I need to make FIRST_MSI_INT
> dynamic instead of a constant and just define it as the first free IRQ
> after the I/O APICs have probed.

I'm testing a very large AMD Epyc system, and I had to change FIRST_MSI_INT to 
768, but that fixed this issue for me.



Re: Current panic on boot on H11DSI motherboard with epyc cpu (nexus_add_irq: failed)

2018-04-18 Thread Vitalij Satanivskij

JB> O, this is a different issue.  Sorry.  As a hack, try changing
JB> 'FIRST_MSI_INT' to 512 in sys/amd64/include/intr_machdep.h.  The issue
JB> is that some systems now include more than 256 interrupt pins on I/O
JB> APICs, so IRQ 256 is already reserved for use by one of those
JB> interrupt pins.  The real fix is that I need to make FIRST_MSI_INT
JB> dynamic instead of a constant and just define it as the first free IRQ
JB> after the I/O APICs have probed.
JB> 

Yep, that's it.

But just one note 

irq585: ccp14:721 @cpu0(domain0): 0
irq586: ccp14:723 @cpu0(domain0): 0
irq587: ccp15:725 @cpu0(domain0): 0

If I understand correctly, the number of IRQs is even higher than 512, so would
it be better to change it to the real number in the system?

Or is this a different case?

Anyway, thank you for the help. Now I can use the system with MSI-X and MSI enabled.


Re: Current panic on boot on H11DSI motherboard with epyc cpu (nexus_add_irq: failed)

2018-04-18 Thread John Baldwin

On Wednesday, April 18, 2018 01:56:49 PM Vitalij Satanivskij wrote:
> JB> > If you need any additional information, please let me know.
> JB> 
> JB> Can you perhaps turn off the stack trace on boot to not lose the panic messages
> JB> (remove KDB_TRACE from kernel config) and maybe modify the panic message to
> JB> include the IRQ number passed to nexus_add_irq?
> 
> 
> Hm, it looks like it is always IRQ number 256,
> e.g. hpet - 256,
> igb - 256.
> 
> The change made for it was:
> 
> Index: sys/x86/x86/nexus.c
> ===
> --- sys/x86/x86/nexus.c (revision 332663)
> +++ sys/x86/x86/nexus.c (working copy)
> @@ -698,7 +698,7 @@
>  {
>  
> if (rman_manage_region(_rman, irq, irq) != 0)
> -   panic("%s: failed", __func__);
> +   panic("%s: failed irq is: %lu", __func__, irq);
>  }

O, this is a different issue.  Sorry.  As a hack, try changing
'FIRST_MSI_INT' to 512 in sys/amd64/include/intr_machdep.h.  The issue
is that some systems now include more than 256 interrupt pins on I/O
APICs, so IRQ 256 is already reserved for use by one of those
interrupt pins.  The real fix is that I need to make FIRST_MSI_INT
dynamic instead of a constant and just define it as the first free IRQ
after the I/O APICs have probed.
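
A rough sketch of what "first free IRQ after the I/O APICs have probed" could mean, written as plain C rather than kernel code. The `ioapic_desc` struct, `first_free_irq()` and the pin layout in the note below are hypothetical and are not the actual FreeBSD implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one I/O APIC: first IRQ and pin count. */
struct ioapic_desc {
    unsigned int irq_base;
    unsigned int npins;
};

/*
 * Return the first IRQ number above every I/O APIC pin, but never
 * lower than min_irq (e.g. 256, the fixed value that collides in
 * this thread).
 */
unsigned int
first_free_irq(const struct ioapic_desc *apics, size_t n, unsigned int min_irq)
{
    unsigned int first = min_irq;

    assert(apics != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        /* One past the last pin of this I/O APIC. */
        unsigned int end = apics[i].irq_base + apics[i].npins;

        if (end > first)
            first = end;
    }
    return first;
}
```

For instance, if the highest I/O APIC covered IRQs 224 through 287, the function would return 288 instead of the fixed 256, so the first MSI IRQ would no longer overlap an I/O APIC pin.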

-- 
John Baldwin

Re: Current panic on boot on H11DSI motherboard with epyc cpu (nexus_add_irq: failed)

2018-04-18 Thread Vitalij Satanivskij

JB> > If you need any additional information, please let me know.
JB> 
JB> Can you perhaps turn off the stack trace on boot to not lose the panic messages
JB> (remove KDB_TRACE from kernel config) and maybe modify the panic message to
JB> include the IRQ number passed to nexus_add_irq?


Hm, it looks like it is always IRQ number 256,
e.g. hpet - 256,
igb - 256.

The change made for it was:

Index: sys/x86/x86/nexus.c
===
--- sys/x86/x86/nexus.c (revision 332663)
+++ sys/x86/x86/nexus.c (working copy)
@@ -698,7 +698,7 @@
 {
 
if (rman_manage_region(_rman, irq, irq) != 0)
-   panic("%s: failed", __func__);
+   panic("%s: failed irq is: %lu", __func__, irq);
 }




Re: Current panic on boot on H11DSI motherboard with epyc cpu (nexus_add_irq: failed)

2018-04-17 Thread John Baldwin

On Tuesday, April 17, 2018 10:15:53 PM Vitalij Satanivskij wrote:
> Dear John
> 
> I tried the patch with no success:
> 
> http://hell.ukr.net/panic/recorder_patch165.webm
> 
> I also enabled verbose boot and recorded the boot process (HPET was disabled,
> so the crash is in another driver's attach):
> http://hell.ukr.net/panic/recorder_patch_verbose.webm
> 
> root@test:/usr/src # svnlite diff
> Index: sys/x86/x86/msi.c
> ===
> --- sys/x86/x86/msi.c   (revision 332650)
> +++ sys/x86/x86/msi.c   (working copy)
> @@ -404,7 +404,7 @@
> /* Do we need to create some new sources? */
> if (cnt < count) {
> /* If we would exceed the max, give up. */
> -   if (i + (count - cnt) > FIRST_MSI_INT + NUM_MSI_INTS) {
> +   if (i + (count - cnt) >= FIRST_MSI_INT + NUM_MSI_INTS) {
> mtx_unlock(_lock);
> free(mirqs, M_MSI);
> return (ENXIO);
> @@ -645,7 +645,7 @@
> /* Do we need to create a new source? */
> if (msi == NULL) {
> /* If we would exceed the max, give up. */
> -   if (i + 1 > FIRST_MSI_INT + NUM_MSI_INTS) {
> +   if (i + 1 >= FIRST_MSI_INT + NUM_MSI_INTS) {
> mtx_unlock(_lock);
> return (ENXIO);
> }
> root@test:/usr/src
> 
> If you need any additional information, please let me know.

Can you perhaps turn off the stack trace on boot to not lose the panic messages
(remove KDB_TRACE from kernel config) and maybe modify the panic message to
include the IRQ number passed to nexus_add_irq?

-- 
John Baldwin

Re: Current panic on boot on H11DSI motherboard with epyc cpu (nexus_add_irq: failed)

2018-04-17 Thread Vitalij Satanivskij

Dear John

I tried the patch with no success:

http://hell.ukr.net/panic/recorder_patch165.webm

I also enabled verbose boot and recorded the boot process (HPET was disabled,
so the crash is in another driver's attach):
http://hell.ukr.net/panic/recorder_patch_verbose.webm

root@test:/usr/src # svnlite diff
Index: sys/x86/x86/msi.c
===
--- sys/x86/x86/msi.c   (revision 332650)
+++ sys/x86/x86/msi.c   (working copy)
@@ -404,7 +404,7 @@
/* Do we need to create some new sources? */
if (cnt < count) {
/* If we would exceed the max, give up. */
-   if (i + (count - cnt) > FIRST_MSI_INT + NUM_MSI_INTS) {
+   if (i + (count - cnt) >= FIRST_MSI_INT + NUM_MSI_INTS) {
mtx_unlock(_lock);
free(mirqs, M_MSI);
return (ENXIO);
@@ -645,7 +645,7 @@
/* Do we need to create a new source? */
if (msi == NULL) {
/* If we would exceed the max, give up. */
-   if (i + 1 > FIRST_MSI_INT + NUM_MSI_INTS) {
+   if (i + 1 >= FIRST_MSI_INT + NUM_MSI_INTS) {
mtx_unlock(_lock);
return (ENXIO);
}
root@test:/usr/src
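
To make the effect of the two-character change concrete, here is a small sketch comparing the old and patched tests. The constants and function names are illustrative assumptions, not the kernel's definitions; `next_irq` plays the role of `i` and `wanted` the role of `count - cnt` in the diff.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; the real constants live in kernel headers. */
#define EX_FIRST_MSI_INT 256
#define EX_NUM_MSI_INTS  512

/* Old test: reject only when the sum goes past the end of the range. */
bool
exceeds_old(int next_irq, int wanted)
{
    assert(wanted > 0);
    return next_irq + wanted > EX_FIRST_MSI_INT + EX_NUM_MSI_INTS;
}

/* Patched test: also reject when the sum lands exactly on the end. */
bool
exceeds_new(int next_irq, int wanted)
{
    assert(wanted > 0);
    return next_irq + wanted >= EX_FIRST_MSI_INT + EX_NUM_MSI_INTS;
}
```

With these example values the range ends at 768: a request for one IRQ starting at 767 passes the old test but is rejected by the patched one.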

If you need any additional information, please let me know.



JB> > If any of these parameters is not set as described, the system does not boot ^(
JB> 
JB> Please try the patch from here https://reviews.freebsd.org/P165
JB> 
JB> -- 
JB> John Baldwin

Re: Current panic on boot on H11DSI motherboard with epyc cpu (nexus_add_irq: failed)

2018-04-17 Thread John Baldwin

On Monday, April 16, 2018 10:12:13 PM Vitalij Satanivskij wrote:
> 
> igb0@pci0:1:0:0:class=0x02 card=0x152115d9 chip=0x15218086 
> rev=0x01 hdr=0x00
> vendor = 'Intel Corporation'
> device = 'I350 Gigabit Network Connection'
> class  = network
> subclass   = ethernet
> cap 01[40] = powerspec 3  supports D0 D3  current D0
> cap 05[50] = MSI supports 1 message, 64 bit, vector masks
> cap 11[70] = MSI-X supports 10 messages
>  Table in map 0x1c[0x0], PBA in map 0x1c[0x2000]
> cap 10[a0] = PCI-Express 2 endpoint max data 512(512) FLR RO NS
>  link x4(x4) speed 5.0(5.0) ASPM L1(L0s/L1)
> ecap 0001[100] = AER 2 0 fatal 0 non-fatal 1 corrected
> ecap 0003[140] = Serial 1 ac1f6b620e0c
> ecap 000e[150] = ARI 1
> ecap 0010[160] = SR-IOV 1 IOV disabled, Memory Space disabled, ARI 
> disabled
>  0 VFs configured out of 8 supported
>  First VF RID Offset 0x0180, VF RID Stride 0x0004
>  VF Device ID 0x1520
>  Page Sizes: 4096 (enabled), 8192, 65536, 262144, 
> 1048576, 4194304
> ecap 0017[1a0] = TPH Requester 1
> ecap 0018[1c0] = LTR 1
> ecap 000d[1d0] = ACS 1
> 
> This is from the system booted with HPET disabled and
> hw.pci.enable_msix: 0
> hw.pci.enable_msi: 0
> 
> If any of these parameters is not set as described, the system does not boot ^(

Please try the patch from here https://reviews.freebsd.org/P165

-- 
John Baldwin

Re: Current panic on boot on H11DSI motherboard with epyc cpu (nexus_add_irq: failed)

2018-04-16 Thread Vitalij Satanivskij

Oh, the BIOS.
It is already the latest BIOS available, with AGESA 1.0.0.5 in it.
It is dated 2/14/2018, so most likely a new version will not appear soon.


Stephen Hurd wrote:
SH> Yeah, this looks like some sort of general MSI issue, not igb specific.
SH> I'm not familiar with that part of the kernel, but maybe check if there's a
SH> BIOS update available?
SH> 
SH> On Mon, Apr 16, 2018 at 3:51 PM, Vitalij Satanivskij  wrote:
SH> 
SH> > Dear Stephen
SH> >
SH> > I disabled MSI-X on igb, both 1 and 0,
SH> > and enabled HPET in the BIOS.
SH> >
SH> > I get an hpet_attach panic: http://hell.ukr.net/panic/recorder_hpet.webm
SH> > So I disabled HPET again and got a panic in msi_alloc and so on:
SH> > http://hell.ukr.net/panic/recorder_msi.webm
SH> >
SH> > So as a test I set hw.pci.enable_msi=0 and got a panic in ccp_hw_attach,
SH> > which is autoloaded later while the system runs the rc scripts.
SH> >
SH> > Panic here: http://hell.ukr.net/panic/recorder_ccp.webm
SH> >
SH> > To me it looks like some kind of resource management problem?
SH> >
SH> >
SH> > Stephen Hurd wrote:
SH> > SH> If you disable msix just for igb0, does it crash somewhere else?
SH> > SH>
SH> > SH> On Mon, Apr 16, 2018 at 3:13 PM, Stephen Hurd  wrote:
SH> > SH>
SH> > SH> > Oh, you may need to disable msix to boot...
SH> > SH> >
SH> > SH> > dev.igb.0.iflib.disable_msix=1
SH> > SH> >
SH> > SH> > On Mon, Apr 16, 2018 at 3:02 PM, Stephen Hurd wrote:
SH> > SH> >
SH> > SH> >> Hrm, it should be trying to allocate three msi-x vectors there, and it
SH> > SH> >> appears that it's reported that 10 are available.  What's the output of
SH> > SH> >> ``pciconf -lcv pci1:0:0''?
SH> > SH> >>
SH> > SH> >> On Mon, Apr 16, 2018 at 1:27 PM, Conrad Meyer wrote:
SH> > SH> >>
SH> > SH> >>> Hi Vitalij,
SH> > SH> >>>
SH> > SH> >>> On Mon, Apr 16, 2018 at 3:27 AM, Vitalij Satanivskij <sa...@ukr.net> wrote:
SH> > SH> >>> > DUMP can be found here http://hell.ukr.net/panic/panic.jpg
SH> > SH> >>> > or even video record from screen http://hell.ukr.net/panic/recorder.webm
SH> > SH> >>>
SH> > SH> >>> Looks like the panic message is printed directly after: "igb0: using 2
SH> > SH> >>> rx queues 2 tx queues" (iflib_msix_init(), called by
SH> > SH> >>> iflib_device_register()).
SH> > SH> >>>
SH> > SH> >>> And stack is indeed coming from iflib in probe (0:17 in linked video):
SH> > SH> >>>
SH> > SH> >>> panic()
SH> > SH> >>> nexus_add_irq()
SH> > SH> >>> msix_alloc()
SH> > SH> >>> pci_alloc_msix_method()
SH> > SH> >>> iflib_device_register()
SH> > SH> >>> iflib_device_attach()
SH> > SH> >>> device_attach()
SH> > SH> >>> ...
SH> > SH> >>>
SH> > SH> >>> Stephen, Matt, or Sean might be able to help diagnose further.
SH> > SH> >>>
SH> > SH> >>> Best,
SH> > SH> >>> Conrad
SH> > SH> >>>

Re: Current panic on boot on H11DSI motherboard with epyc cpu (nexus_add_irq: failed)

2018-04-16 Thread Vitalij Satanivskij

Dear Stephen 

I disabled MSI-X on both igb0 and igb1
and enabled HPET in the BIOS.

I get an hpet_attach panic: http://hell.ukr.net/panic/recorder_hpet.webm
So I disabled HPET again and got a panic in msi_alloc and so on:
http://hell.ukr.net/panic/recorder_msi.webm

So as a test I set hw.pci.enable_msi=0 and got a panic in ccp_hw_attach;
that driver is autoloaded later, while the system runs the rc scripts.

The panic is here: http://hell.ukr.net/panic/recorder_ccp.webm

To me it looks like some kind of resource management problem?


Stephen Hurd wrote:
SH> If you disable msix just for igb0, does it crash somewhere else?
SH> 
SH> On Mon, Apr 16, 2018 at 3:13 PM, Stephen Hurd  wrote:
SH> 
SH> > Oh, you may need to disable msix to boot...
SH> >
SH> > dev.igb.0.iflib.disable_msix=1
SH> >
SH> > On Mon, Apr 16, 2018 at 3:02 PM, Stephen Hurd  wrote:
SH> >
SH> >> Hrm, it should be trying to allocate three msi-x vectors there, and it
SH> >> appears that it's reported that 10 are available.  What's the output of
SH> >> ``pciconf -lcv pci1:0:0''?
SH> >>
SH> >> On Mon, Apr 16, 2018 at 1:27 PM, Conrad Meyer  wrote:
SH> >>
SH> >>> Hi Vitalij,
SH> >>>
SH> >>> On Mon, Apr 16, 2018 at 3:27 AM, Vitalij Satanivskij wrote:
SH> >>> > DUMP can be found here http://hell.ukr.net/panic/panic.jpg
SH> >>> > or even video record from screen http://hell.ukr.net/panic/recorder.webm
SH> >>>
SH> >>> Looks like the panic message is printed directly after: "igb0: using 2
SH> >>> rx queues 2 tx queues" (iflib_msix_init(), called by
SH> >>> iflib_device_register()).
SH> >>>
SH> >>> And stack is indeed coming from iflib in probe (0:17 in linked video):
SH> >>>
SH> >>> panic()
SH> >>> nexus_add_irq()
SH> >>> msix_alloc()
SH> >>> pci_alloc_msix_method()
SH> >>> iflib_device_register()
SH> >>> iflib_device_attach()
SH> >>> device_attach()
SH> >>> ...
SH> >>>
SH> >>> Stephen, Matt, or Sean might be able to help diagnose further.
SH> >>>
SH> >>> Best,
SH> >>> Conrad
SH> >>>

Re: Current panic on boot on H11DSI motherboard with epyc cpu (nexus_add_irq: failed)

2018-04-16 Thread Vitalij Satanivskij


igb0@pci0:1:0:0:class=0x02 card=0x152115d9 chip=0x15218086 rev=0x01 hdr=0x00
vendor = 'Intel Corporation'
device = 'I350 Gigabit Network Connection'
class  = network
subclass   = ethernet
cap 01[40] = powerspec 3  supports D0 D3  current D0
cap 05[50] = MSI supports 1 message, 64 bit, vector masks
cap 11[70] = MSI-X supports 10 messages
 Table in map 0x1c[0x0], PBA in map 0x1c[0x2000]
cap 10[a0] = PCI-Express 2 endpoint max data 512(512) FLR RO NS
 link x4(x4) speed 5.0(5.0) ASPM L1(L0s/L1)
ecap 0001[100] = AER 2 0 fatal 0 non-fatal 1 corrected
ecap 0003[140] = Serial 1 ac1f6b620e0c
ecap 000e[150] = ARI 1
ecap 0010[160] = SR-IOV 1 IOV disabled, Memory Space disabled, ARI disabled
 0 VFs configured out of 8 supported
 First VF RID Offset 0x0180, VF RID Stride 0x0004
 VF Device ID 0x1520
 Page Sizes: 4096 (enabled), 8192, 65536, 262144, 1048576, 4194304
ecap 0017[1a0] = TPH Requester 1
ecap 0018[1c0] = LTR 1
ecap 000d[1d0] = ACS 1

This output is from the system booted with HPET disabled and
hw.pci.enable_msix: 0
hw.pci.enable_msi: 0

If any of these is not set as described, the system does not boot ^(
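
For anyone repeating this check on similar hardware, the MSI-X message count can be pulled out of the pciconf output with a one-line filter. This is only a minimal sketch: `msix_count` is a made-up helper name, not part of pciconf, and it assumes the capability line has the same "MSI-X supports N messages" wording as the output above.

```sh
#!/bin/sh
# Read `pciconf -lcv` output on stdin and print the number of MSI-X
# messages the device advertises. Prints nothing when the device has no
# MSI-X capability line. (Hypothetical helper, for illustration only.)
msix_count() {
    sed -n 's/.*MSI-X supports \([0-9][0-9]*\) message.*/\1/p'
}

# Possible use on the affected machine, with the selector shown above:
#   pciconf -lcv pci0:1:0:0 | msix_count
```

The plain MSI line ("MSI supports 1 message") is deliberately not matched, so the result can be compared directly with the number of vectors the driver asks for.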


Stephen Hurd wrote:
SH> Hrm, it should be trying to allocate three msi-x vectors there, and it
SH> appears that it's reported that 10 are available.  What's the output of
SH> ``pciconf -lcv pci1:0:0''?
SH> 
SH> On Mon, Apr 16, 2018 at 1:27 PM, Conrad Meyer  wrote:
SH> 
SH> > Hi Vitalij,
SH> >
SH> > On Mon, Apr 16, 2018 at 3:27 AM, Vitalij Satanivskij wrote:
SH> > > DUMP can be found here http://hell.ukr.net/panic/panic.jpg
SH> > > or even video record from screen http://hell.ukr.net/panic/recorder.webm
SH> >
SH> > Looks like the panic message is printed directly after: "igb0: using 2
SH> > rx queues 2 tx queues" (iflib_msix_init(), called by
SH> > iflib_device_register()).
SH> >
SH> > And stack is indeed coming from iflib in probe (0:17 in linked video):
SH> >
SH> > panic()
SH> > nexus_add_irq()
SH> > msix_alloc()
SH> > pci_alloc_msix_method()
SH> > iflib_device_register()
SH> > iflib_device_attach()
SH> > device_attach()
SH> > ...
SH> >
SH> > Stephen, Matt, or Sean might be able to help diagnose further.
SH> >
SH> > Best,
SH> > Conrad
SH> >

Re: Current panic on boot on H11DSI motherboard with epyc cpu (nexus_add_irq: failed)

2018-04-16 Thread Conrad Meyer

Hi Vitalij,

On Mon, Apr 16, 2018 at 3:27 AM, Vitalij Satanivskij  wrote:
> DUMP can be found here http://hell.ukr.net/panic/panic.jpg
> or even video record from screen http://hell.ukr.net/panic/recorder.webm

Looks like the panic message is printed directly after: "igb0: using 2
rx queues 2 tx queues" (iflib_msix_init(), called by
iflib_device_register()).

And stack is indeed coming from iflib in probe (0:17 in linked video):

panic()
nexus_add_irq()
msix_alloc()
pci_alloc_msix_method()
iflib_device_register()
iflib_device_attach()
device_attach()
...

Stephen, Matt, or Sean might be able to help diagnose further.

Best,
Conrad

Current panic on boot on H11DSI motherboard with epyc cpu (nexus_add_irq: failed)

2018-04-16 Thread Vitalij Satanivskij

Hello.

We get a kernel panic when booting CURRENT or an 11.1 snapshot.

It happens both when booting from a USB stick and from an HDD/SSD with an installed system.

The kernel is GENERIC.

The dump can be found here: http://hell.ukr.net/panic/panic.jpg
or as a video recording of the screen: http://hell.ukr.net/panic/recorder.webm

The hardware is
2x AMD EPYC 7251 processors on a Supermicro H11DSI motherboard.

The only way to boot the system is to disable HPET in the BIOS and set
hw.pci.enable_msix=0
hw.pci.enable_msi=0

We have already tried different loader.conf settings, such as


machdep.disable_msix_migration=1
hint.hpet.0.clock=0
hint.hpet.0.per_cpu=0

#hw.pci.enable_msix=0
#hw.pci.enable_msi=0
#dev.igb.1.iflib.disable_msix=1
#dev.igb.0.iflib.disable_msix=1
#machdep.disable_msix_migration = 1
#hw.pci.msix_rewrite_table=1
#hw.pci.honor_msi_blacklist=0

In different combinations, with no success.


Any suggestions for what we can try?
Is any additional information needed from our side?

Thank you.
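
The loader.conf part of the workaround described above can be sanity-checked with a small script. This is only a sketch under stated assumptions: `tunable_is` and `workaround_present` are hypothetical helper names, the file is assumed to use plain name=value lines (optionally double-quoted), and the HPET setting cannot be checked this way because it lives in the BIOS.

```sh
#!/bin/sh
# Succeed if FILE sets tunable NAME to VALUE on an uncommented line.
# Usage: tunable_is FILE NAME VALUE   (hypothetical helper)
tunable_is() {
    # Escape the dots in the tunable name so grep treats them literally.
    name=$(printf '%s' "$2" | sed 's/\./\\./g')
    grep -Eq "^${name}=\"?$3\"?[[:space:]]*\$" "$1"
}

# Succeed only if both MSI and MSI-X are disabled in FILE.
workaround_present() {
    tunable_is "$1" hw.pci.enable_msix 0 &&
        tunable_is "$1" hw.pci.enable_msi 0
}

# Possible use:
#   workaround_present /boot/loader.conf && echo "MSI and MSI-X disabled"
```

Commented-out lines such as "#hw.pci.enable_msix=0" do not count, which matches how the settings list above distinguishes active from disabled entries.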





Re: ZFS panic at boot when mounting root on r330386

2018-03-04 Thread Andriy Gapon

On 05/03/2018 02:59, Bryan Drewery wrote:
>> panic: solaris assert: refcount_count(&spa->spa_refcount) > spa->spa_minref 
>> || MUTEX_HELD(&spa_namespace_lock), file: 
>> /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c, line: 952
>> cpuid = 10
>> time = 1520207367
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe23f57a2420
>> vpanic() at vpanic+0x18d/frame 0xfe23f57a2480
>> panic() at panic+0x43/frame 0xfe23f57a24e0
>> assfail() at assfail+0x1a/frame 0xfe23f57a24f0
>> spa_close() at spa_close+0x5d/frame 0xfe23f57a2520
>> spa_get_stats() at spa_get_stats+0x481/frame 0xfe23f57a2700
>> zfs_ioc_pool_stats() at zfs_ioc_pool_stats+0x25/frame 0xfe23f57a2740
>> zfsdev_ioctl() at zfsdev_ioctl+0x76b/frame 0xfe23f57a27e0
>> devfs_ioctl() at devfs_ioctl+0xcb/frame 0xfe23f57a2830
>> VOP_IOCTL_APV() at VOP_IOCTL_APV+0x102/frame 0xfe23f57a2860
>> vn_ioctl() at vn_ioctl+0x124/frame 0xfe23f57a2970
>> devfs_ioctl_f() at devfs_ioctl_f+0x1f/frame 0xfe23f57a2990
>> kern_ioctl() at kern_ioctl+0x2c2/frame 0xfe23f57a29f0
>> sys_ioctl() at sys_ioctl+0x15c/frame 0xfe23f57a2ac0
>> amd64_syscall() at amd64_syscall+0x786/frame 0xfe23f57a2bf0
>> fast_syscall_common() at fast_syscall_common+0x101/frame 0xfe23f57a2bf0
>> --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x80049afda, rsp = 0x7fffbd18, rbp = 0x7fffbd90 ---
>> KDB: enter: panic
>> [ thread pid 56 tid 100606 ]
>> Stopped at  kdb_enter+0x3b: movq    $0,kdb_why
>> db>
> 
> It seems like a race as I can get it to boot sometimes.

Yes, it does.  https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=210409

-- 
Andriy Gapon

ZFS panic at boot when mounting root on r330386

2018-03-04 Thread Bryan Drewery

> panic: solaris assert: refcount_count(&spa->spa_refcount) > spa->spa_minref 
> || MUTEX_HELD(&spa_namespace_lock), file: 
> /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/spa_misc.c, line: 952
> cpuid = 10
> time = 1520207367
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe23f57a2420
> vpanic() at vpanic+0x18d/frame 0xfe23f57a2480
> panic() at panic+0x43/frame 0xfe23f57a24e0
> assfail() at assfail+0x1a/frame 0xfe23f57a24f0
> spa_close() at spa_close+0x5d/frame 0xfe23f57a2520
> spa_get_stats() at spa_get_stats+0x481/frame 0xfe23f57a2700
> zfs_ioc_pool_stats() at zfs_ioc_pool_stats+0x25/frame 0xfe23f57a2740
> zfsdev_ioctl() at zfsdev_ioctl+0x76b/frame 0xfe23f57a27e0
> devfs_ioctl() at devfs_ioctl+0xcb/frame 0xfe23f57a2830
> VOP_IOCTL_APV() at VOP_IOCTL_APV+0x102/frame 0xfe23f57a2860
> vn_ioctl() at vn_ioctl+0x124/frame 0xfe23f57a2970
> devfs_ioctl_f() at devfs_ioctl_f+0x1f/frame 0xfe23f57a2990
> kern_ioctl() at kern_ioctl+0x2c2/frame 0xfe23f57a29f0
> sys_ioctl() at sys_ioctl+0x15c/frame 0xfe23f57a2ac0
> amd64_syscall() at amd64_syscall+0x786/frame 0xfe23f57a2bf0
> fast_syscall_common() at fast_syscall_common+0x101/frame 0xfe23f57a2bf0
> --- syscall (54, FreeBSD ELF64, sys_ioctl), rip = 0x80049afda, rsp = 0x7fffbd18, rbp = 0x7fffbd90 ---
> KDB: enter: panic
> [ thread pid 56 tid 100606 ]
> Stopped at  kdb_enter+0x3b: movq    $0,kdb_why
> db>

It seems like a race as I can get it to boot sometimes.

-- 
Regards,
Bryan Drewery




Re: ACPI panic on boot with new Lua loader and other minor issues

2018-02-21 Thread Kyle Evans

On Wed, Feb 21, 2018 at 8:55 AM, Warner Losh  wrote:
>
>
> On Wed, Feb 21, 2018 at 6:58 AM, Kyle Evans  wrote:
>>
>> On Wed, Feb 21, 2018 at 4:43 AM, Juan Ramón Molina Menor 
>> wrote:
>> > Le 20/02/2018 à 22:45, Kyle Evans a écrit :
>> >>
>> >> On Mon, Feb 19, 2018 at 8:21 AM, Juan Ramón Molina Menor
>> >> 
>> >> wrote:
>> >>>
>> >>> [... snip ...]
>> >>>
>> >>> Moreover, the "boot [kernel]" loader command does not work:
>> >>>
>> >>> OK ls /boot/kernel.old/kernel
>> >>>  /boot/kernel.old/kernel
>> >>> OK boot kernel.old
>> >>> Command failed
>> >>> OK boot /boot/kernel.old/kernel
>> >>> Command failed
>> >>> OK boot kernel
>> >>> Command failed
>> >>>
>> >>> On the other hand, just "boot" works.
>> >>>
>> >>
>> >> This part should work as expected as of r329674, so please give that a
>> >> shot. I'm still trying to see if I can reproduce your box drawing
>> >> problem.
>> >>
>> >> Thanks,
>> >>
>> >> Kyle Evans
>> >>
>> >
>> > Thanks Kyle.
>> >
>> > boot command works now. There is though a somewhat strangely formulated
>> > messages when trying to load an non-existent kernel:
>> >
>> > OK boot fake
>> > Failed to load kernel ’fake’
>> > Failed to load any kernel
>> > can’t load ’kernel’
>> >
>> > The two last lines are odd: Did the loader try to load a fallback kernel
>> > and
>> > failed? That would explain the ’kernel’ name in quotes, but I have such
>> > a
>> > kernel… Also, just nitpicking, but "can’t" should be capitalized.
>>
>> (CC'ing Rod, since he also commented on this)
>>
>> I'm only directly responsible for the first two messages. =) I've
>> removed the second one, though, since it was a carry-over from when it
>> would try to load the selected kernel and then some default kernel
>> that might be in your module_path.
>>
>> We can look at changing "can't load 'kernel'" to capitalize and remove
>> the contraction, but that's common loader stuff and should've also
>> been displayed for the same Forth scenario.
>>
>> > Then, I have just remembered why I was seeing a higher resolution menu
>> > before: I had set 'gop set 0' in /boot/loader.rc.local. It seems the new
>> > loader is not implementing the inclusion of this file, because I can
>> > change
>> > the gop mode in the loader with 'gop set [0-3]'.
>> >
>>
>> Oy. This is actually a Forth file, so we can't really maintain all of
>> the functionality that would have been allowed there. Technically,
>> things like this should probably either appear in your loader.conf(5)
>> in the form of 'exec="gop set 3"' to be applied when loader.conf(5) is
>> read, or we should provide some other means of running pure loader
>> command scripts at different points in the loader sequence. (CC
>> Warner, because he probably has thoughts about this latter idea)
>
>
> While loader.rc is FORTH, it's contrived FORTH designed to look like command
> line interaction. While some crazy people like me have actual forth in
> these, most do not, really (apart from the include /boot/XXX.4th lines, that
> is, which could be filtered).
>
> We have two choices here: Try to provide some level of compatibility shims,
> or provide a new way to say these things in Lua.
>
> In the original SoC code, loader.lua lived in /boot and looked to be
> something that people could modify. We likely need to do something similar.
> loader.lua right now has nothing but what in the Forth world were called
> 'include' lines, and then commands very similar to the Forth loader.rc. Perhaps the
> right answer is to move cli_execute out of /boot/lua/loader.lua, move that
> file up, and add the try-include functionality to try to include a
> loader.local.lua. Then we could tell people to move to the Lua syntax way of
> doing things. We'd have to hunt down all the hacks like this, but that
> wouldn't be terrible. Bonus points if we could convert the common ones
> either to lua code automatically, or to loader.conf variables.

We have something like this that I added yesterday in r329692, but
named a little bit differently ("local.lua", "local module"). Mostly
added because I've been using it to locally add test menu options and
writing some of the documentation for how to hook into our new lua
system to do things, and it was getting tiresome having to manually
revert those bits in loader.lua when I wanted to make changes to the
in-tree loader.lua.

We'd basically rename this to loader.local.lua to match more closely
to current convention, move "cli_execute" out to either core.lua or
maybe it and the boot commands are worthy of their own "cli" module,
then be happy.

> This specific example should have always been a loader.conf variable (and
> not an exec). While the gop command is useful, the loader should have, as
> part of its Forth sequence, had something that would set the GOP mode if
> the uefi_gop_mode loader.conf variable was set (I get why that wasn't done
> that way in the forth loader, insert rant of Fear of FORTH here). So we
>

Re: ACPI panic on boot with new Lua loader and other minor issues

2018-02-21 Thread Warner Losh

On Wed, Feb 21, 2018 at 6:58 AM, Kyle Evans  wrote:

> On Wed, Feb 21, 2018 at 4:43 AM, Juan Ramón Molina Menor 
> wrote:
> > Le 20/02/2018 à 22:45, Kyle Evans a écrit :
> >>
> >> On Mon, Feb 19, 2018 at 8:21 AM, Juan Ramón Molina Menor <
> lis...@club.fr>
> >> wrote:
> >>>
> >>> [... snip ...]
> >>>
> >>> Moreover, the "boot [kernel]" loader command does not work:
> >>>
> >>> OK ls /boot/kernel.old/kernel
> >>>  /boot/kernel.old/kernel
> >>> OK boot kernel.old
> >>> Command failed
> >>> OK boot /boot/kernel.old/kernel
> >>> Command failed
> >>> OK boot kernel
> >>> Command failed
> >>>
> >>> On the other hand, just "boot" works.
> >>>
> >>
> >> This part should work as expected as of r329674, so please give that a
> >> shot. I'm still trying to see if I can reproduce your box drawing
> >> problem.
> >>
> >> Thanks,
> >>
> >> Kyle Evans
> >>
> >
> > Thanks Kyle.
> >
> > boot command works now. There is though a somewhat strangely formulated
> > messages when trying to load an non-existent kernel:
> >
> > OK boot fake
> > Failed to load kernel ’fake’
> > Failed to load any kernel
> > can’t load ’kernel’
> >
> > The two last lines are odd: Did the loader try to load a fallback kernel
> and
> > failed? That would explain the ’kernel’ name in quotes, but I have such a
> > kernel… Also, just nitpicking, but "can’t" should be capitalized.
>
> (CC'ing Rod, since he also commented on this)
>
> I'm only directly responsible for the first two messages. =) I've
> removed the second one, though, since it was a carry-over from when it
> would try to load the selected kernel and then some default kernel
> that might be in your module_path.
>
> We can look at changing "can't load 'kernel'" to capitalize and remove
> the contraction, but that's common loader stuff and should've also
> been displayed for the same Forth scenario.
>
> > Then, I have just remembered why I was seeing a higher resolution menu
> > before: I had set 'gop set 0' in /boot/loader.rc.local. It seems the new
> > loader is not implementing the inclusion of this file, because I can
> change
> > the gop mode in the loader with 'gop set [0-3]'.
> >
>
> Oy. This is actually a Forth file, so we can't really maintain all of
> the functionality that would have been allowed there. Technically,
> things like this should probably either appear in your loader.conf(5)
> in the form of 'exec="gop set 3"' to be applied when loader.conf(5) is
> read, or we should provide some other means of running pure loader
> command scripts at different points in the loader sequence. (CC
> Warner, because he probably has thoughts about this latter idea)
>

While loader.rc is FORTH, it's contrived FORTH designed to look like
command line interaction. While some crazy people like me have actual forth
in these, most do not, really (apart from the include /boot/XXX.4th lines,
that is, which could be filtered).

We have two choices here: Try to provide some level of compatibility shims,
or provide a new way to say these things in Lua.

In the original SoC code, loader.lua lived in /boot and looked to be
something that people could modify. We likely need to do something similar.
loader.lua right now has nothing but what in the Forth world were called
'include' lines, and then commands very similar to the Forth loader.rc. Perhaps
the right answer is to move cli_execute out of /boot/lua/loader.lua, move
that file up, and add the try-include functionality to try to include a
loader.local.lua. Then we could tell people to move to the Lua syntax way
of doing things. We'd have to hunt down all the hacks like this, but that
wouldn't be terrible. Bonus points if we could convert the common ones
either to lua code automatically, or to loader.conf variables.

This specific example should have always been a loader.conf variable (and
not an exec). While the gop command is useful, the loader should have, as
part of its Forth sequence, had something that would set the GOP mode if
the uefi_gop_mode loader.conf variable was set (I get why that wasn't done
that way in the forth loader, insert rant of Fear of FORTH here). So we
should 'unkludge' it, have Lua loader grok uefi_gop_mode and maybe a few
other things that are simple settings to make it easier for users to set
this stuff up and start to move away from the FoF kludges that we've
accumulated. A new loader.conf variable would also allow coexistence of the
two loaders, which will be further helped with some patches I have to the
build system coming soon.

Warner

Re: ACPI panic on boot with new Lua loader and other minor issues

2018-02-21 Thread Rodney W. Grimes

> Le 20/02/2018 à 22:45, Kyle Evans a écrit :
> > On Mon, Feb 19, 2018 at 8:21 AM, Juan Ramón Molina Menor
> > wrote:
> >> [... snip ...]
> >>
> >> Moreover, the "boot [kernel]" loader command does not work:
> >>
> >> OK ls /boot/kernel.old/kernel
> >>  /boot/kernel.old/kernel
> >> OK boot kernel.old
> >> Command failed
> >> OK boot /boot/kernel.old/kernel
> >> Command failed
> >> OK boot kernel
> >> Command failed
> >>
> >> On the other hand, just "boot" works.
> >>
> > 
> > This part should work as expected as of r329674, so please give that a
> > shot. I'm still trying to see if I can reproduce your box drawing
> > problem.
> > 
> > Thanks,
> > 
> > Kyle Evans
> > 
> 
> Thanks Kyle.
> 
> boot command works now. There is though a somewhat strangely formulated 
> messages when trying to load an non-existent kernel:
> 
> OK boot fake
> Failed to load kernel ’fake’
> Failed to load any kernel
> can’t load ’kernel’
> 
> The two last lines are odd: Did the loader try to load a fallback kernel 
> and failed? That would explain the ’kernel’ name in quotes, but I have 
> such a kernel… Also, just nitpicking, but "can’t" should be capitalized.

To be super nitpicky, "can't" should also be spelled "can not".

> Then, I have just remembered why I was seeing a higher resolution menu 
> before: I had set 'gop set 0' in /boot/loader.rc.local. It seems the new 
> loader is not implementing the inclusion of this file, because I can 
> change the gop mode in the loader with 'gop set [0-3]'.
> 
> This has thus nothing to do with the drawing lines, I guess.
> 
> Best regards.
-- 
Rod Grimes rgri...@freebsd.org

Re: ACPI panic on boot with new Lua loader and other minor issues

2018-02-21 Thread Kyle Evans

On Wed, Feb 21, 2018 at 4:43 AM, Juan Ramón Molina Menor  wrote:
> Le 20/02/2018 à 22:45, Kyle Evans a écrit :
>>
>> On Mon, Feb 19, 2018 at 8:21 AM, Juan Ramón Molina Menor 
>> wrote:
>>>
>>> [... snip ...]
>>>
>>> Moreover, the "boot [kernel]" loader command does not work:
>>>
>>> OK ls /boot/kernel.old/kernel
>>>  /boot/kernel.old/kernel
>>> OK boot kernel.old
>>> Command failed
>>> OK boot /boot/kernel.old/kernel
>>> Command failed
>>> OK boot kernel
>>> Command failed
>>>
>>> On the other hand, just "boot" works.
>>>
>>
>> This part should work as expected as of r329674, so please give that a
>> shot. I'm still trying to see if I can reproduce your box drawing
>> problem.
>>
>> Thanks,
>>
>> Kyle Evans
>>
>
> Thanks Kyle.
>
> boot command works now. There is though a somewhat strangely formulated
> messages when trying to load an non-existent kernel:
>
> OK boot fake
> Failed to load kernel ’fake’
> Failed to load any kernel
> can’t load ’kernel’
>
> The two last lines are odd: Did the loader try to load a fallback kernel and
> failed? That would explain the ’kernel’ name in quotes, but I have such a
> kernel… Also, just nitpicking, but "can’t" should be capitalized.

(CC'ing Rod, since he also commented on this)

I'm only directly responsible for the first two messages. =) I've
removed the second one, though, since it was a carry-over from when it
would try to load the selected kernel and then some default kernel
that might be in your module_path.

We can look at changing "can't load 'kernel'" to capitalize and remove
the contraction, but that's common loader stuff and should've also
been displayed for the same Forth scenario.

> Then, I have just remembered why I was seeing a higher resolution menu
> before: I had set 'gop set 0' in /boot/loader.rc.local. It seems the new
> loader is not implementing the inclusion of this file, because I can change
> the gop mode in the loader with 'gop set [0-3]'.
>

Oy. This is actually a Forth file, so we can't really maintain all of
the functionality that would have been allowed there. Technically,
things like this should probably either appear in your loader.conf(5)
in the form of 'exec="gop set 3"' to be applied when loader.conf(5) is
read, or we should provide some other means of running pure loader
command scripts at different points in the loader sequence. (CC
Warner, because he probably has thoughts about this latter idea)
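
The exec form mentioned above could be added to loader.conf along these lines. This is a sketch only: `add_loader_exec` is a hypothetical helper name, and nothing in this thread confirms that the Lua loader at this revision honors exec lines, so treat it as something to test rather than a known fix.

```sh
#!/bin/sh
# Append an exec line for a loader command to a loader.conf-style file,
# unless the identical line is already present.
# Usage: add_loader_exec FILE "LOADER COMMAND"   (hypothetical helper)
add_loader_exec() {
    line="exec=\"$2\""
    # -F: fixed string, -x: whole-line match, -q: quiet
    grep -Fqx -- "$line" "$1" 2>/dev/null || printf '%s\n' "$line" >> "$1"
}

# Possible use, with the mode from the original report:
#   add_loader_exec /boot/loader.conf "gop set 0"
```

Checking for the line first keeps repeated runs from stacking up duplicate exec entries in the file.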

Re: ACPI panic on boot with new Lua loader and other minor issues

2018-02-21 Thread Juan Ramón Molina Menor


Le 20/02/2018 à 22:45, Kyle Evans a écrit :
> On Mon, Feb 19, 2018 at 8:21 AM, Juan Ramón Molina Menor wrote:
>> [... snip ...]
>>
>> Moreover, the "boot [kernel]" loader command does not work:
>>
>> OK ls /boot/kernel.old/kernel
>>  /boot/kernel.old/kernel
>> OK boot kernel.old
>> Command failed
>> OK boot /boot/kernel.old/kernel
>> Command failed
>> OK boot kernel
>> Command failed
>>
>> On the other hand, just "boot" works.
>
> This part should work as expected as of r329674, so please give that a
> shot. I'm still trying to see if I can reproduce your box drawing
> problem.
>
> Thanks,
>
> Kyle Evans



Thanks Kyle.

The boot command works now. There is, though, a somewhat strangely worded
set of messages when trying to load a non-existent kernel:


OK boot fake
Failed to load kernel ’fake’
Failed to load any kernel
can’t load ’kernel’

The two last lines are odd: Did the loader try to load a fallback kernel 
and failed? That would explain the ’kernel’ name in quotes, but I have 
such a kernel… Also, just nitpicking, but "can’t" should be capitalized.


Then, I have just remembered why I was seeing a higher resolution menu 
before: I had set 'gop set 0' in /boot/loader.rc.local. It seems the new 
loader is not implementing the inclusion of this file, because I can 
change the gop mode in the loader with 'gop set [0-3]'.


This has thus nothing to do with the drawing lines, I guess.

Best regards.

Re: ACPI panic on boot with new Lua loader and other minor issues

2018-02-20 Thread Kyle Evans

On Mon, Feb 19, 2018 at 8:21 AM, Juan Ramón Molina Menor  wrote:
> [... snip ...]
>
> Moreover, the "boot [kernel]" loader command does not work:
>
> OK ls /boot/kernel.old/kernel
> /boot/kernel.old/kernel
> OK boot kernel.old
> Command failed
> OK boot /boot/kernel.old/kernel
> Command failed
> OK boot kernel
> Command failed
>
> On the other hand, just "boot" works.
>

This part should work as expected as of r329674, so please give that a
shot. I'm still trying to see if I can reproduce your box drawing
problem.

Thanks,

Kyle Evans

Re: ACPI panic on boot with new Lua loader and other minor issues

2018-02-20 Thread Juan Ramón Molina Menor


Le 19/02/2018 à 21:21, Kyle Evans a écrit :
> Hello!


On Mon, Feb 19, 2018 at 8:21 AM, Juan Ramón Molina Menor  wrote:

I have done a full build of r329555 to test the new Lua boot loader.

Both the new and the old kernels panic after being loaded with:

panic: running without device atpic requires a local APIC

For reasons unknown, ACPI is off, as shown by David Wolfskill in a previous
message:
https://lists.freebsd.org/pipermail/freebsd-current/2018-February/068497.html

OK show hint.acpi.0.disabled
1

Setting ACPI to On resolves the issue.



Hi Kyle.


As David noted, this should actually Just Work (TM) now. Can you break
into a loader prompt with just the forth loader and tell me what "show
hint.acpi.0.rsdp" looks like?

OK show hint.acpi.0.rsdp
Command error

I tested with hint.acpi.0.disabled set to both 1 and 0.





Also, I can not stop boot2 to try to use the copy of the Forth loader: the
keyboard only becomes responsive at the loader stage.


Hmm...

In fact, I don’t think this has ever worked here… I’ve found a very old (July 
2016) FreeBSD 12 memstick, and I cannot stop the boot2 stage with it either.



There is an error during this stage:

Loading /boot/defaults/loader.conf
Failed to open config: ’/boot/loader.conf.local’


David's diagnosis of this is right- this is more of an informational
message that you don't need to worry about.

Thanks.



Moreover, the "boot [kernel]" loader command does not work:

OK ls /boot/kernel.old/kernel
 /boot/kernel.old/kernel
OK boot kernel.old
Command failed
OK boot /boot/kernel.old/kernel
Command failed
OK boot kernel
Command failed

On the other hand, just "boot" works.


It seems that the Forth loader might be doing something sneaky and
replacing the standard common "boot" with a Forth boot that handles
this a lot better. CC'ing dteske@ so they can confirm.


Finally, the double lines drawing a frame around the loader menu do not work
with the new loader and are replaced by ? characters in a box.


Interesting, I'll look into that... anything interesting/unique about
your setup? r329387 should have addressed one potential cause of this,
but I see you're past that.

I’m using a memory stick to boot a Lenovo ThinkPad S440 (i3-4030U processor, 
4GB RAM). The only thing I can think of is that the ACPI of this model is not 
well supported, but the errors I have are related to thermal zones…:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=201678

To build the memstick I’m using an 11.1-RELEASE VM under Hyper-V, with ccache 
and WITH_META_MODE, but this build process has been working nicely for months.

The kernel is based on GENERIC-NODEBUG and has also been working reliably:

juan@Server ~ % cat /root/kernels/MEMSTICK
include GENERIC-NODEBUG

ident   MEMSTICK

nodevice    fdc

nodevice    ch
nodevice    sa
nodevice    ses

nodevice    amr
nodevice    arcmsr
nodevice    ciss
nodevice    dpt
nodevice    hptmv
nodevice    hptnr
nodevice    hptrr
nodevice    hpt27xx
nodevice    iir
nodevice    ips
nodevice    mly
nodevice    twa
nodevice    tws

nodevice    aac
nodevice    aacp
nodevice    aacraid
nodevice    ida
nodevice    mfi
nodevice    mlx
nodevice    mrsas
nodevice    pmspcv
nodevice    twe

nodevice    nvme
nodevice    nvd

nodevice    virtio
nodevice    virtio_pci
nodevice    vtnet
nodevice    virtio_blk
nodevice    virtio_scsi
nodevice    virtio_balloon

nooptions   HYPERV
nodevice    hyperv

nooptions   XENHVM
nodevice    xenpci

nodevice    vmx


There is maybe something fishy in my src.conf, where I disable a lot of things 
to slim down the memstick, but still, it has been stable till now:

juan@Server ~ % cat /etc/src.conf
# For memory sticks

WITH_CCACHE_BUILD=

WITHOUT_ACCT=
WITHOUT_AMD=
WITHOUT_ATM=
WITHOUT_AUTHPF=
WITHOUT_AUTOFS=
WITHOUT_BHYVE=
WITHOUT_BLACKLIST=
# iwm does not support Bluetooth
WITHOUT_BLUETOOTH=
WITHOUT_BOOTPARAMD=
WITHOUT_BOOTPD=
# WITHOUT_BSDINSTALL enforced by WITHOUT_DIALOG
WITHOUT_BSNMP=
WITHOUT_CALENDAR=
# Don't set this when building HEAD from RELENG
# WITHOUT_CROSS_COMPILER=
WITHOUT_CTM=
WITHOUT_DEBUG_FILES=
#WITHOUT_DIALOG=
WITHOUT_DICT=
WITHOUT_EE=
WITHOUT_EXAMPLES=
WITHOUT_FDT=
WITHOUT_FINGER=
WITHOUT_FLOPPY=
# For testing the Lua loader (WITH_LOADER_LUA)
WITHOUT_FORTH=
WITHOUT_FREEBSD_UPDATE=
WITHOUT_GAMES=
WITHOUT_GCOV=
WITHOUT_GPIO=
# You disable Kerberos later, but try to keep GSSAPI for curl > pkg
# But this does not work, base Kerberos is required
#WITH_GSSAPI=
WITHOUT_GSSAPI=
WITHOUT_HAST=
WITHOUT_HESIOD=
WITHOUT_HTML=
WITHOUT_HYPERV=
WITHOUT_IPFILTER=
WITHOUT_IPFW=
WITHOUT_ISCSI=
WITHOUT_JAIL=
WITHOUT_KERBEROS=
WITHOUT_KERNEL_SYMBOLS=
WITHOUT_KVM=
WITHOUT_LDNS=
# This disables moused
#WITHOUT_LEGACY_CONSOLE=
WITHOUT_LLDB=
# This requires WITHOUT_FORTH
WITH_LOADER_LUA=
# This breaks setting locale and thus tmux
#WITHOUT_LOCALES=
WITHOUT_LPR=

Re: ACPI panic on boot with new Lua loader and other minor issues

2018-02-19 Thread Warner Losh

On Feb 19, 2018 3:38 PM, "Kyle Evans"  wrote:

On Mon, Feb 19, 2018 at 4:32 PM, Warner Losh  wrote:
>
>
> On Mon, Feb 19, 2018 at 2:57 PM, Devin Teske  wrote:
>>
>>
>>
>> > On Feb 19, 2018, at 2:21 PM, Kyle Evans  wrote:
>> >
>> > It seems that the Forth loader might be doing something sneaky and
>> > replacing the standard common "boot" with a Forth boot that handles
>> > this a lot better. CC'ing dteske@ so they can confirm.
>>
>> I can indeed confirm this as fact.
>>
>> Not able to help much because I am driving cross-country (San Francisco to
>> Orlando) right now with the spouse and dog.
>>
>> We get back March 3rd, but I will be checking-in from time to time for
>> sporadic responses during downtime.
>
>
> The command in loader.4th is defined as:
>
> : boot
>   0= if ( interpreted ) get_arguments then
>
>   \ Unload only if a path was passed
>   dup if
>     >r over r> swap
>     c@ [char] - <> if
>       0 1 unload drop
>     else
>       s" kernelname" getenv? if ( a kernel has been loaded )
>         try-menu-unset
>         bootmsg 1 boot exit
>       then
>       load_kernel_and_modules
>       ?dup if exit then
>       try-menu-unset
>       bootmsg 0 1 boot exit
>     then
>   else
>     s" kernelname" getenv? if ( a kernel has been loaded )
>       try-menu-unset
>       bootmsg 1 boot exit
>     then
>     load_kernel_and_modules
>     ?dup if exit then
>     try-menu-unset
>     bootmsg 0 1 boot exit
>   then
>   load_kernel_and_modules
>   ?dup 0= if bootmsg 0 1 boot then
> ;
>
> The thing to know here is when you see 'boot' as part of above script, it's
> calling the 'boot' cli command, not itself recursively.
>
> I can help do more interpretation of the details if you need, Kyle. Not sure
> how much to spell out, but the brief pseudo code is:
>
> If there were any arguments that didn't start with '-', unload.
>   otherwise if kernelname is in the environment, run the 'menu-unset'
> forth word if it exists, print the boot message and boot.
>   Otherwise load the kernel and modules, run the 'menu-unset' forth word (if
> it exists), print the boot message and boot with kernelname
> Otherwise load the kernel and modules, run the 'menu-unset' forth word (if
> it exists), print the boot message and boot with kernelname
> if all that fails, load the kernel and modules and if that works boot them.
>

Yeah, we have something like this on the lua side. Unfortunately, it's
going to wreck people's muscle memory- dropping to the loader prompt
and typing "boot [x]" will never work as expected because lua won't
recognize that as a function call due to spaces as delimiters.

We'd need some shim that takes "cmd [x]" and tries it as "cmd([x])"
(for some [x] that could be multiple space-delimited arguments) before
falling back to the originally typed "cmd [x]" if we want Lua to have
any chance to intercept it and add its own salt and pepper like Forth
does.


Forth has a framework for making all commands forth words. It leverages
that to run the intercept. We already have the intercept in place with a
stupidly simple policy. We totally can do something generic that would
solve this and maybe other problems. Let's chat online tomorrow about a
couple of possibilities we can choose from.

Warner

Re: ACPI panic on boot with new Lua loader and other minor issues

2018-02-19 Thread Devin Teske

> On Feb 19, 2018, at 4:32 PM, Warner Losh  wrote:
> 
> 
> 
>> On Mon, Feb 19, 2018 at 2:57 PM, Devin Teske  wrote:
>> 
>> 
>> > On Feb 19, 2018, at 2:21 PM, Kyle Evans  wrote:
>> >
>> > It seems that the Forth loader might be doing something sneaky and
>> > replacing the standard common "boot" with a Forth boot that handles
>> > this a lot better. CC'ing dteske@ so they can confirm.
>> 
>> I can indeed confirm this as fact.
>> 
>> Not able to help much because I am driving cross-country (San Francisco to 
>> Orlando) right now with the spouse and dog.
>> 
>> We get back March 3rd, but I will be checking-in from time to time for 
>> sporadic responses during downtime.
> 
> The command in loader.4th is defined as:
> 
> : boot
>   0= if ( interpreted ) get_arguments then
>
>   \ Unload only if a path was passed
>   dup if
>     >r over r> swap
>     c@ [char] - <> if
>       0 1 unload drop
>     else
>       s" kernelname" getenv? if ( a kernel has been loaded )
>         try-menu-unset
>         bootmsg 1 boot exit
>       then
>       load_kernel_and_modules
>       ?dup if exit then
>       try-menu-unset
>       bootmsg 0 1 boot exit
>     then
>   else
>     s" kernelname" getenv? if ( a kernel has been loaded )
>       try-menu-unset
>       bootmsg 1 boot exit
>     then
>     load_kernel_and_modules
>     ?dup if exit then
>     try-menu-unset
>     bootmsg 0 1 boot exit
>   then
>   load_kernel_and_modules
>   ?dup 0= if bootmsg 0 1 boot then
> ;
> 
> The thing to know here is when you see 'boot' as part of above script, it's 
> calling the 'boot' cli command, not itself recursively.
> 

What is actually going on is that when the “boot” function is compiled, the 
reference to “boot” inside it is to the already-existing word defined 
previously. Forth allows you to have multiply-defined names. The “boot” command 
inside the “boot” function is replaced with the address of the previous boot during 
function compilation, because the function is not defined and given an address 
in the dictionary until it is completed (last line compiled).
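
A rough model of that compile-time binding, sketched in Python rather than Forth. The Dictionary class and the word names are hypothetical and are not taken from loader.4th or FICL; the only point is that the new word is entered into the dictionary after its body has been compiled, so a "boot" inside the body resolves to the older definition instead of recursing.

```python
class Dictionary:
    """Toy Forth-style dictionary: later definitions shadow earlier ones."""

    def __init__(self):
        self.words = {}

    def define(self, name, body_names):
        # Resolve every name in the body *now*, against the dictionary as it
        # exists before the new word is entered.
        compiled = [self.words[n] for n in body_names]

        def word(log):
            for w in compiled:
                w(log)

        # Only after the body is compiled does the new name become visible.
        self.words[name] = word


def builtin_boot(log):
    log.append("builtin boot")


def bootmsg(log):
    log.append("bootmsg")


d = Dictionary()
d.words["boot"] = builtin_boot
d.words["bootmsg"] = bootmsg

# Redefine "boot" in terms of "bootmsg" and the previous "boot".
d.define("boot", ["bootmsg", "boot"])

trace = []
d.words["boot"](trace)
print(trace)  # ['bootmsg', 'builtin boot']
```

Calling the redefined word runs the wrapper once and then the original built-in, with no recursion.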
— 
Devin

Re: ACPI panic on boot with new Lua loader and other minor issues

2018-02-19 Thread Kyle Evans

On Mon, Feb 19, 2018 at 6:11 PM, Peter Lei wrote:
>
>
> On 2/19/18 5:48 PM, Kyle Evans wrote:
>>
>>
>> On Feb 19, 2018 5:44 PM, "Peter Lei" wrote:
>>
>>
>>
>> On 2/19/18 2:21 PM, Kyle Evans wrote:
>> > Hello!
>> >
>> > On Mon, Feb 19, 2018 at 8:21 AM, Juan Ramón Molina Menor
>> > wrote:
>> >> I have done a full build of r329555 to test the new Lua boot loader.
>> >>
>> >> Both the new and the old kernels panic after being loaded with:
>> >>
>> >> panic: running without device atpic requires a local APIC
>> >>
>> >> For reasons unknown, ACPI is off, as shown by David Wolfskill in a previous
>> >> message:
>> >> https://lists.freebsd.org/pipermail/freebsd-current/2018-February/068497.html
>> >>
>> >> OK show hint.acpi.0.disabled
>> >> 1
>> >>
>> >> Setting ACPI to On resolves the issue.
>> >
>> > As David noted, this should actually Just Work (TM) now. Can you break
>> > into a loader prompt with just the forth loader and tell me what "show
>> > hint.acpi.0.rsdp" looks like?
>>
>>
>> This doesn't appear to "just work out-of-the-box" yet when EFI booting
>> amd64, as I still get the 'no local APIC' panic (I just tried @r329609).
>>
>> Under EFI and lua loader, the following is set when breaking to prompt:
>> hint.acpi.0.disabled=1
>> Under forth loader, this is not present/set.
>>
>> In neither case is hint.acpi.0.rsdp present/set as that appears to get
>> set during the exec of the loaded kernel...
>>
>> I've worked around the issue by adding hint.acpi.0.disabled="0" to
>> loader.conf (or patching the amd64 efi loader code to explicitly clear
>> that hint).
>>
>>
>> [Apologies for broken quoting, currently mobile]
>>
>> What happens if you patch this line out?
>> https://svnweb.freebsd.org/base/head/stand/lua/core.lua?view=markup#l233
>
>
> Ah, right - yep, commenting out that line works.
>

This should be fixed as of r329614. In the EFI world, hint.acpi.0.rsdp only
gets set upon exec of the loaded kernel, whereas in the i386 world it is set
before lualoader comes into play. We should probably do as Forth does
and disable the ACPI stuff on !i386 (IIRC the option disappears
completely), but IIRC we haven't yet exposed TARGET/TARGET_ARCH to
lua.

Re: ACPI panic on boot with new Lua loader and other minor issues

2018-02-19 Thread Peter Lei



On 2/19/18 5:48 PM, Kyle Evans wrote:
> 
> 
> On Feb 19, 2018 5:44 PM, "Peter Lei" wrote:
> 
> 
> 
> On 2/19/18 2:21 PM, Kyle Evans wrote:
> > Hello!
> >
> > On Mon, Feb 19, 2018 at 8:21 AM, Juan Ramón Molina Menor
> > wrote:
> >> I have done a full build of r329555 to test the new Lua boot loader.
> >>
> >> Both the new and the old kernels panic after being loaded with:
> >>
> >> panic: running without device atpic requires a local APIC
> >>
> >> For reasons unknown, ACPI is off, as shown by David Wolfskill in a previous
> >> message:
> >> https://lists.freebsd.org/pipermail/freebsd-current/2018-February/068497.html
> >>
> >> OK show hint.acpi.0.disabled
> >> 1
> >>
> >> Setting ACPI to On resolves the issue.
> >
> > As David noted, this should actually Just Work (TM) now. Can you break
> > into a loader prompt with just the forth loader and tell me what "show
> > hint.acpi.0.rsdp" looks like?
> 
> 
> This doesn't appear to "just work out-of-the-box" yet when EFI booting
> amd64, as I still get the 'no local APIC' panic (I just tried @r329609).
> 
> Under EFI and lua loader, the following is set when breaking to prompt:
>     hint.acpi.0.disabled=1
> Under forth loader, this is not present/set.
> 
> In neither case is hint.acpi.0.rsdp present/set as that appears to get
> set during the exec of the loaded kernel...
> 
> I've worked around the issue by adding hint.acpi.0.disabled="0" to
> loader.conf (or patching the amd64 efi loader code to explicitly clear
> that hint).
> 
> 
> [Apologies for broken quoting, currently mobile]
> 
> What happens if you patch this line out?
> https://svnweb.freebsd.org/base/head/stand/lua/core.lua?view=markup#l233


Ah, right - yep, commenting out that line works.


> I'll have to go back and figure out what I was thinking here again. It
> made sense when I wrote it, maybe explicitly disabling ACPI if it's not
> immediately detected was the wrong move. =)





Re: ACPI panic on boot with new Lua loader and other minor issues

2018-02-19 Thread Kyle Evans

On Feb 19, 2018 5:44 PM, "Peter Lei"  wrote:

On 2/19/18 2:21 PM, Kyle Evans wrote:
> Hello!
>
> On Mon, Feb 19, 2018 at 8:21 AM, Juan Ramón Molina Menor wrote:
>> I have done a full build of r329555 to test the new Lua boot loader.
>>
>> Both the new and the old kernels panic after being loaded with:
>>
>> panic: running without device atpic requires a local APIC
>>
>> For reasons unknown, ACPI is off, as shown by David Wolfskill in a previous
>> message:
>> https://lists.freebsd.org/pipermail/freebsd-current/2018-February/068497.html
>>
>> OK show hint.acpi.0.disabled
>> 1
>>
>> Setting ACPI to On resolves the issue.
>
> As David noted, this should actually Just Work (TM) now. Can you break
> into a loader prompt with just the forth loader and tell me what "show
> hint.acpi.0.rsdp" looks like?

This doesn't appear to "just work out-of-the-box" yet when EFI booting
amd64, as I still get the 'no local APIC' panic (I just tried @r329609).

Under EFI and lua loader, the following is set when breaking to prompt:
hint.acpi.0.disabled=1
Under forth loader, this is not present/set.

In neither case is hint.acpi.0.rsdp present/set as that appears to get
set during the exec of the loaded kernel...

I've worked around the issue by adding hint.acpi.0.disabled="0" to
loader.conf (or patching the amd64 efi loader code to explicitly clear
that hint).

[Apologies for broken quoting, currently mobile]

What happens if you patch this line out?
https://svnweb.freebsd.org/base/head/stand/lua/core.lua?view=markup#l233

I'll have to go back and figure out what I was thinking here again. It made
sense when I wrote it, maybe explicitly disabling ACPI if it's not
immediately detected was the wrong move. =)

Re: ACPI panic on boot with new Lua loader and other minor issues

2018-02-19 Thread Peter Lei

On 2/19/18 2:21 PM, Kyle Evans wrote:
> Hello!
> 
> On Mon, Feb 19, 2018 at 8:21 AM, Juan Ramón Molina Menor  
> wrote:
>> I have done a full build of r329555 to test the new Lua boot loader.
>>
>> Both the new and the old kernels panic after being loaded with:
>>
>> panic: running without device atpic requires a local APIC
>>
>> For reasons unknown, ACPI is off, as shown by David Wolfskill in a previous
>> message:
>> https://lists.freebsd.org/pipermail/freebsd-current/2018-February/068497.html
>>
>> OK show hint.acpi.0.disabled
>> 1
>>
>> Setting ACPI to On resolves the issue.
> 
> As David noted, this should actually Just Work (TM) now. Can you break
> into a loader prompt with just the forth loader and tell me what "show
> hint.acpi.0.rsdp" looks like?

This doesn't appear to "just work out-of-the-box" yet when EFI booting
amd64, as I still get the 'no local APIC' panic (I just tried @r329609).

Under EFI and lua loader, the following is set when breaking to prompt:
hint.acpi.0.disabled=1
Under forth loader, this is not present/set.

In neither case is hint.acpi.0.rsdp present/set as that appears to get
set during the exec of the loaded kernel...

I've worked around the issue by adding hint.acpi.0.disabled="0" to
loader.conf (or patching the amd64 efi loader code to explicitly clear
that hint).


Re: ACPI panic on boot with new Lua loader and other minor issues

2018-02-19 Thread Kyle Evans

On Mon, Feb 19, 2018 at 4:32 PM, Warner Losh  wrote:
>
>
> On Mon, Feb 19, 2018 at 2:57 PM, Devin Teske  wrote:
>>
>>
>>
>> > On Feb 19, 2018, at 2:21 PM, Kyle Evans  wrote:
>> >
>> > It seems that the Forth loader might be doing something sneaky and
>> > replacing the standard common "boot" with a Forth boot that handles
>> > this a lot better. CC'ing dteske@ so they can confirm.
>>
>> I can indeed confirm this as fact.
>>
>> Not able to help much because I am driving cross-country (San Francisco to
>> Orlando) right now with the spouse and dog.
>>
>> We get back March 3rd, but I will be checking-in from time to time for
>> sporadic responses during downtime.
>
>
> The command in loader.4th is defined as:
>
> : boot
>   0= if ( interpreted ) get_arguments then
>
>   \ Unload only if a path was passed
>   dup if
>     >r over r> swap
>     c@ [char] - <> if
>       0 1 unload drop
>     else
>       s" kernelname" getenv? if ( a kernel has been loaded )
>         try-menu-unset
>         bootmsg 1 boot exit
>       then
>       load_kernel_and_modules
>       ?dup if exit then
>       try-menu-unset
>       bootmsg 0 1 boot exit
>     then
>   else
>     s" kernelname" getenv? if ( a kernel has been loaded )
>       try-menu-unset
>       bootmsg 1 boot exit
>     then
>     load_kernel_and_modules
>     ?dup if exit then
>     try-menu-unset
>     bootmsg 0 1 boot exit
>   then
>   load_kernel_and_modules
>   ?dup 0= if bootmsg 0 1 boot then
> ;
>
> The thing to know here is when you see 'boot' as part of above script, it's
> calling the 'boot' cli command, not itself recursively.
>
> I can help do more interpretation of the details if you need, Kyle. Not sure
> how much to spell out, but the brief pseudo code is:
>
> If there were any arguments that didn't start with '-', unload.
>   otherwise if kernelname is in the environment, run the 'menu-unset'
> forth word if it exists, print the boot message and boot.
>   Otherwise load the kernel and modules, run the 'menu-unset' forth word (if
> it exists), print the boot message and boot with kernelname
> Otherwise load the kernel and modules, run the 'menu-unset' forth word (if
> it exists), print the boot message and boot with kernelname
> if all that fails, load the kernel and modules and if that works boot them.
>

Yeah, we have something like this on the lua side. Unfortunately, it's
going to wreck people's muscle memory- dropping to the loader prompt
and typing "boot [x]" will never work as expected because lua won't
recognize that as a function call due to spaces as delimiters.

We'd need some shim that takes "cmd [x]" and tries it as "cmd([x])"
(for some [x] that could be multiple space-delimited arguments) before
falling back to the originally typed "cmd [x]" if we want Lua to have
any chance to intercept it and add its own salt and pepper like Forth
does.
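
A minimal sketch of that shim idea, written in Python purely for illustration (the real thing would live in the loader's interpreter glue). The names handlers, builtin_command, run_line and scripted_boot are made up for this example and are not loader API; the sketch only shows the "give the interpreter first shot, then fall back to the line as typed" dispatch.

```python
def builtin_command(argv):
    """Stand-in for the loader's built-in command table (hypothetical)."""
    return "builtin:" + " ".join(argv)


# Commands the scripting side has chosen to intercept (hypothetical registry).
handlers = {}


def run_line(line):
    """Try "cmd [x]" as cmd(x...) first, then fall back to the built-in."""
    argv = line.split()
    if not argv:
        return None
    handler = handlers.get(argv[0])
    if handler is not None:
        # The interpreter gets first shot, with the space-delimited
        # arguments passed as separate parameters.
        return handler(*argv[1:])
    # No interceptor registered: run the command exactly as typed.
    return builtin_command(argv)


def scripted_boot(*args):
    # Add the "salt and pepper" (e.g. load kernel and modules), then hand
    # off to the built-in boot.
    kernel = args[0] if args else "kernel"
    return "loaded " + kernel + "; " + builtin_command(["boot"])


handlers["boot"] = scripted_boot

print(run_line("boot kernel.old"))  # loaded kernel.old; builtin:boot
print(run_line("ls /boot"))         # builtin:ls /boot
```

Commands without a registered handler behave exactly as before, so only the commands the scripting side cares about change behaviour.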

Re: ACPI panic on boot with new Lua loader and other minor issues

2018-02-19 Thread Warner Losh

On Mon, Feb 19, 2018 at 2:57 PM, Devin Teske  wrote:

>
>
> > On Feb 19, 2018, at 2:21 PM, Kyle Evans  wrote:
> >
> > It seems that the Forth loader might be doing something sneaky and
> > replacing the standard common "boot" with a Forth boot that handles
> > this a lot better. CC'ing dteske@ so they can confirm.
>
> I can indeed confirm this as fact.
>
> Not able to help much because I am driving cross-country (San Francisco to
> Orlando) right now with the spouse and dog.
>
> We get back March 3rd, but I will be checking-in from time to time for
> sporadic responses during downtime.
>

The command in loader.4th is defined as:

: boot
  0= if ( interpreted ) get_arguments then

  \ Unload only if a path was passed
  dup if
    >r over r> swap
    c@ [char] - <> if
      0 1 unload drop
    else
      s" kernelname" getenv? if ( a kernel has been loaded )
        try-menu-unset
        bootmsg 1 boot exit
      then
      load_kernel_and_modules
      ?dup if exit then
      try-menu-unset
      bootmsg 0 1 boot exit
    then
  else
    s" kernelname" getenv? if ( a kernel has been loaded )
      try-menu-unset
      bootmsg 1 boot exit
    then
    load_kernel_and_modules
    ?dup if exit then
    try-menu-unset
    bootmsg 0 1 boot exit
  then
  load_kernel_and_modules
  ?dup 0= if bootmsg 0 1 boot then
;

The thing to know here is when you see 'boot' as part of above script, it's
calling the 'boot' cli command, not itself recursively.

I can help do more interpretation of the details if you need, Kyle. Not sure
how much to spell out, but the brief pseudo code is:

If there were any arguments that didn't start with '-', unload.
  otherwise if kernelname is in the environment, run the 'menu-unset'
forth word if it exists, print the boot message and boot.
  Otherwise load the kernel and modules, run the 'menu-unset' forth word
(if it exists), print the boot message and boot with kernelname
Otherwise load the kernel and modules, run the 'menu-unset' forth word (if
it exists), print the boot message and boot with kernelname
if all that fails, load the kernel and modules and if that works boot them.
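
The same control flow, transcribed into Python as a reading aid. This follows the Forth definition quoted above rather than any real loader internals: env, unload, load_kernel_and_modules, menu_unset and builtin_boot are placeholder callables assumed for illustration, bootmsg is left out, and how arguments are handed to the built-in boot is simplified away.

```python
def forth_boot(args, env, unload, load_kernel_and_modules, menu_unset,
               builtin_boot):
    """Mirror of the ': boot' word; returns whatever the last step returns.

    load_kernel_and_modules is assumed to return 0 on success and a
    non-zero error code on failure, like the Forth word it stands in for.
    """
    if args and not args[0].startswith("-"):
        # A kernel path was passed: drop whatever is currently loaded and
        # fall through to the final load-and-boot step.
        unload()
    else:
        # No arguments, or only flags such as "-s".
        if "kernelname" in env:
            # A kernel has already been loaded; boot it as-is.
            menu_unset()
            return builtin_boot()
        err = load_kernel_and_modules(args)
        if err:
            return err
        menu_unset()
        return builtin_boot()
    err = load_kernel_and_modules(args)
    if err:
        return err
    return builtin_boot()


log = []
result = forth_boot(
    ["kernel.old"], {},
    unload=lambda: log.append("unload"),
    load_kernel_and_modules=lambda a: log.append(("load", tuple(a))) or 0,
    menu_unset=lambda: log.append("menu-unset"),
    builtin_boot=lambda: log.append("boot") or "booted",
)
print(log)  # ['unload', ('load', ('kernel.old',)), 'boot']
```

With a kernel path the flow is unload, load, boot; with no arguments and a kernel already loaded, it goes straight to the built-in boot.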

Warner

Re: ACPI panic on boot with new Lua loader and other minor issues

2018-02-19 Thread Juan Ramón Molina Menor


On 19/02/2018 at 21:21, Kyle Evans wrote:

Hello!

On Mon, Feb 19, 2018 at 8:21 AM, Juan Ramón Molina Menor  wrote:

I have done a full build of r329555 to test the new Lua boot loader.

Both the new and the old kernels panic after being loaded with:

panic: running without device atpic requires a local APIC

For reasons unknown, ACPI is off, as shown by David Wolfskill in a previous
message:
https://lists.freebsd.org/pipermail/freebsd-current/2018-February/068497.html

OK show hint.acpi.0.disabled
1

Setting ACPI to On resolves the issue.




Hi Kyle.


As David noted, this should actually Just Work (TM) now. Can you break
into a loader prompt with just the forth loader and tell me what "show
hint.acpi.0.rsdp" looks like?


OK show hint.acpi.0.rsdp
Command error

I tested with hint.acpi.0.disabled set to both 1 and 0.





Also, I can not stop boot2 to try to use the copy of the Forth loader: the
keyboard only becomes responsive at the loader stage.


Hmm...


In fact, I don’t think this has ever worked here… I’ve found a very old 
(July 2016) FreeBSD 12 memstick, and I cannot stop the boot2 stage with it either.




There is an error during this stage:

Loading /boot/defaults/loader.conf
Failed to open config: ’/boot/loader.conf.local’


David's diagnosis of this is right- this is more of an informational
message that you don't need to worry about.


Thanks.



Moreover, the "boot [kernel]" loader command does not work:

OK ls /boot/kernel.old/kernel
 /boot/kernel.old/kernel
OK boot kernel.old
Command failed
OK boot /boot/kernel.old/kernel
Command failed
OK boot kernel
Command failed

On the other hand, just "boot" works.


It seems that the Forth loader might be doing something sneaky and
replacing the standard common "boot" with a Forth boot that handles
this a lot better. CC'ing dteske@ so they can confirm.


Finally, the double lines drawing a frame around the loader menu do not work
with the new loader and are replaced by ? characters in a box.


Interesting, I'll look into that... anything interesting/unique about
your setup? r329387 should have addressed one potential cause of this,
but I see you're past that.


I’m using a memory stick to boot a Lenovo ThinkPad S440 (i3-4030U 
processor, 4GB RAM). The only thing I can think of is that the ACPI of 
this model is not well supported, but the errors I have are related to 
thermal zones…:

https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=201678

To build the memstick I’m using an 11.1-RELEASE VM under Hyper-V, with 
ccache and WITH_META_MODE, but this build process has been working 
nicely for months.


The kernel is based on GENERIC-NODEBUG and has also been working reliably:

juan@Server ~ % cat /root/kernels/MEMSTICK
include GENERIC-NODEBUG

ident   MEMSTICK

nodevice    fdc

nodevice    ch
nodevice    sa
nodevice    ses

nodevice    amr
nodevice    arcmsr
nodevice    ciss
nodevice    dpt
nodevice    hptmv
nodevice    hptnr
nodevice    hptrr
nodevice    hpt27xx
nodevice    iir
nodevice    ips
nodevice    mly
nodevice    twa
nodevice    tws

nodevice    aac
nodevice    aacp
nodevice    aacraid
nodevice    ida
nodevice    mfi
nodevice    mlx
nodevice    mrsas
nodevice    pmspcv
nodevice    twe

nodevice    nvme
nodevice    nvd

nodevice    virtio
nodevice    virtio_pci
nodevice    vtnet
nodevice    virtio_blk
nodevice    virtio_scsi
nodevice    virtio_balloon

nooptions   HYPERV
nodevice    hyperv

nooptions   XENHVM
nodevice    xenpci

nodevice    vmx


There is maybe something fishy in my src.conf, where I disable a lot of 
things to slim down the memstick, but still, it has been stable till now:


juan@Server ~ % cat /etc/src.conf
# For memory sticks

WITH_CCACHE_BUILD=

WITHOUT_ACCT=
WITHOUT_AMD=
WITHOUT_ATM=
WITHOUT_AUTHPF=
WITHOUT_AUTOFS=
WITHOUT_BHYVE=
WITHOUT_BLACKLIST=
# iwm does not support Bluetooth
WITHOUT_BLUETOOTH=
WITHOUT_BOOTPARAMD=
WITHOUT_BOOTPD=
# WITHOUT_BSDINSTALL enforced by WITHOUT_DIALOG
WITHOUT_BSNMP=
WITHOUT_CALENDAR=
# Don't set this when building HEAD from RELENG
# WITHOUT_CROSS_COMPILER=
WITHOUT_CTM=
WITHOUT_DEBUG_FILES=
#WITHOUT_DIALOG=
WITHOUT_DICT=
WITHOUT_EE=
WITHOUT_EXAMPLES=
WITHOUT_FDT=
WITHOUT_FINGER=
WITHOUT_FLOPPY=
# For testing the Lua loader (WITH_LOADER_LUA)
WITHOUT_FORTH=
WITHOUT_FREEBSD_UPDATE=
WITHOUT_GAMES=
WITHOUT_GCOV=
WITHOUT_GPIO=
# You disable Kerberos later, but try to keep GSSAPI for curl > pkg
# But this does not work, base Kerberos is required
#WITH_GSSAPI=
WITHOUT_GSSAPI=
WITHOUT_HAST=
WITHOUT_HESIOD=
WITHOUT_HTML=
WITHOUT_HYPERV=
WITHOUT_IPFILTER=
WITHOUT_IPFW=
WITHOUT_ISCSI=
WITHOUT_JAIL=
WITHOUT_KERBEROS=
WITHOUT_KERNEL_SYMBOLS=
WITHOUT_KVM=
WITHOUT_LDNS=
# This disables moused
#WITHOUT_LEGACY_CONSOLE=
WITHOUT_LLDB=
# This requires WITHOUT_FORTH
WITH_LOADER_LUA=
# This breaks setting locale and thus tmux
#WITHOUT_LOCALES=

Re: ACPI panic on boot with new Lua loader and other minor issues

2018-02-19 Thread Devin Teske

> On Feb 19, 2018, at 2:21 PM, Kyle Evans  wrote:
> 
> It seems that the Forth loader might be doing something sneaky and
> replacing the standard common "boot" with a Forth boot that handles
> this a lot better. CC'ing dteske@ so they can confirm.

I can indeed confirm this as fact.

Not able to help much because I am driving cross-country (San Francisco to 
Orlando) right now with the spouse and dog.

We get back March 3rd, but I will be checking-in from time to time for sporadic 
responses during downtime.
— 
Cheers,
Devin

Re: ACPI panic on boot with new Lua loader and other minor issues

2018-02-19 Thread Kyle Evans

On Mon, Feb 19, 2018 at 3:37 PM, Warner Losh  wrote:
>
>
> On Feb 19, 2018 1:23 PM, "Kyle Evans"  wrote:
>
> Hello!
>
> On Mon, Feb 19, 2018 at 8:21 AM, Juan Ramón Molina Menor 
> wrote:
>> I have done a full build of r329555 to test the new Lua boot loader.
>>
>> Both the new and the old kernels panic after being loaded with:
>>
>> panic: running without device atpic requires a local APIC
>>
>> For reasons unknown, ACPI is off, as shown by David Wolfskill in a
>> previous
>> message:
>>
>> https://lists.freebsd.org/pipermail/freebsd-current/2018-February/068497.html
>>
>> OK show hint.acpi.0.disabled
>> 1
>>
>> Setting ACPI to On resolves the issue.
>
> As David noted, this should actually Just Work (TM) now. Can you break
> into a loader prompt with just the forth loader and tell me what "show
> hint.acpi.0.rsdp" looks like?
>
>> Also, I can not stop boot2 to try to use the copy of the Forth loader: the
>> keyboard only becomes responsive at the loader stage.
>
> Hmm...
>
>> There is an error during this stage:
>>
>> Loading /boot/defaults/loader.conf
>> Failed to open config: ’/boot/loader.conf.local’
>
> David's diagnosis of this is right- this is more of an informational
> message that you don't need to worry about.
>
>> Moreover, the "boot [kernel]" loader command does not work:
>>
>> OK ls /boot/kernel.old/kernel
>> /boot/kernel.old/kernel
>> OK boot kernel.old
>> Command failed
>> OK boot /boot/kernel.old/kernel
>> Command failed
>> OK boot kernel
>> Command failed
>>
>> On the other hand, just "boot" works.
>
> It seems that the Forth loader might be doing something sneaky and
> replacing the standard common "boot" with a Forth boot that handles
> this a lot better. CC'ing dteske@ so they can confirm.
>
>
> Indeed, it does.
>
> Loader.4th defines boot. Search for ': boot' to see it.
>

I've created D14442 [1] to improve this situation a little bit. We
should also either:

1.) Provide a way for lua to register a function to handle a loader command, or
2.) Provide a way for lua/forth to tell the common boot what modules to load.

These both entail a good amount of work and quite a few places to
fail, but one of them needs to happen. =(

[1] https://reviews.freebsd.org/D14442

Re: ACPI panic on boot with new Lua loader and other minor issues

2018-02-19 Thread Juan Ramón Molina Menor


On 19/02/2018 at 15:39, David Wolfskill wrote:

On Mon, Feb 19, 2018 at 03:21:50PM +0100, Juan Ramón Molina Menor wrote:

I have done a full build of r329555 to test the new Lua boot loader.

Both the new and the old kernels panic after being loaded with:

panic: running without device atpic requires a local APIC

For reasons unknown, ACPI is off, as shown by David Wolfskill in a
previous message:
https://lists.freebsd.org/pipermail/freebsd-current/2018-February/068497.html


That has been fixed (for me, at least).  My last two build/smoke-tests
were at r329517 and r329561; I believe that r329366 was the fix for ACPI
detection/setting.


OK show hint.acpi.0.disabled
1

Setting ACPI to On resolves the issue.

Also, I can not stop boot2 to try to use the copy of the Forth loader:
the keyboard only becomes responsive at the loader stage.



There is an error during this stage:

Loading /boot/defaults/loader.conf
Failed to open config: ’/boot/loader.conf.local’


IIUC, that's merely an informational message, not an error.  (None of my
systems have a /boot/loader.conf.local, either.)


Moreover, the "boot [kernel]" loader command does not work:

OK ls /boot/kernel.old/kernel
  /boot/kernel.old/kernel
OK boot kernel.old
Command failed
OK boot /boot/kernel.old/kernel
Command failed
OK boot kernel
Command failed

On the other hand, just "boot" works.


And the Lua loader permits kernel selection as well (as the Forth
loader does).


Finally, the double lines drawing a frame around the loader menu do not
work with the new loader and are replaced by ? characters in a box.


That has also been fixed for me (as of r329517).


Hope it helps,
Juan



Peace,
david


Thanks, David. It's strange that I'm still seeing issues that were resolved
for you in commits older than the one I used here…



Re: ACPI panic on boot with new Lua loader and other minor issues

2018-02-19 Thread Warner Losh

On Feb 19, 2018 1:23 PM, "Kyle Evans"  wrote:

Hello!

On Mon, Feb 19, 2018 at 8:21 AM, Juan Ramón Molina Menor 
wrote:
> I have done a full build of r329555 to test the new Lua boot loader.
>
> Both the new and the old kernels panic after being loaded with:
>
> panic: running without device atpic requires a local APIC
>
> For reasons unknown, ACPI is off, as shown by David Wolfskill in a previous
> message:
> https://lists.freebsd.org/pipermail/freebsd-current/2018-February/068497.html
>
> OK show hint.acpi.0.disabled
> 1
>
> Setting ACPI to On resolves the issue.

As David noted, this should actually Just Work (TM) now. Can you break
into a loader prompt with just the forth loader and tell me what "show
hint.acpi.0.rsdp" looks like?

> Also, I can not stop boot2 to try to use the copy of the Forth loader: the
> keyboard only becomes responsive at the loader stage.

Hmm...

> There is an error during this stage:
>
> Loading /boot/defaults/loader.conf
> Failed to open config: ’/boot/loader.conf.local’

David's diagnosis of this is right- this is more of an informational
message that you don't need to worry about.

> Moreover, the "boot [kernel]" loader command does not work:
>
> OK ls /boot/kernel.old/kernel
> /boot/kernel.old/kernel
> OK boot kernel.old
> Command failed
> OK boot /boot/kernel.old/kernel
> Command failed
> OK boot kernel
> Command failed
>
> On the other hand, just "boot" works.

It seems that the Forth loader might be doing something sneaky and
replacing the standard common "boot" with a Forth boot that handles
this a lot better. CC'ing dteske@ so they can confirm.

Indeed, it does.

Loader.4th defines boot. Search for ': boot' to see it.

Warner

> Finally, the double lines drawing a frame around the loader menu do not work
> with the new loader and are replaced by ? characters in a box.

Interesting, I'll look into that... anything interesting/unique about
your setup? r329387 should have addressed one potential cause of this,
but I see you're past that.

Re: ACPI panic on boot with new Lua loader and other minor issues

2018-02-19 Thread Kyle Evans

Hello!

On Mon, Feb 19, 2018 at 8:21 AM, Juan Ramón Molina Menor  wrote:
> I have done a full build of r329555 to test the new Lua boot loader.
>
> Both the new and the old kernels panic after being loaded with:
>
> panic: running without device atpic requires a local APIC
>
> For reasons unknown, ACPI is off, as shown by David Wolfskill in a previous
> message:
> https://lists.freebsd.org/pipermail/freebsd-current/2018-February/068497.html
>
> OK show hint.acpi.0.disabled
> 1
>
> Setting ACPI to On resolves the issue.

As David noted, this should actually Just Work (TM) now. Can you break
into a loader prompt with just the forth loader and tell me what "show
hint.acpi.0.rsdp" looks like?

> Also, I can not stop boot2 to try to use the copy of the Forth loader: the
> keyboard only becomes responsive at the loader stage.

Hmm...

> There is an error during this stage:
>
> Loading /boot/defaults/loader.conf
> Failed to open config: ’/boot/loader.conf.local’

David's diagnosis of this is right- this is more of an informational
message that you don't need to worry about.

> Moreover, the "boot [kernel]" loader command does not work:
>
> OK ls /boot/kernel.old/kernel
> /boot/kernel.old/kernel
> OK boot kernel.old
> Command failed
> OK boot /boot/kernel.old/kernel
> Command failed
> OK boot kernel
> Command failed
>
> On the other hand, just "boot" works.

It seems that the Forth loader might be doing something sneaky and
replacing the standard common "boot" with a Forth boot that handles
this a lot better. CC'ing dteske@ so they can confirm.

> Finally, the double lines drawing a frame around the loader menu do not work
> with the new loader and are replaced by ? characters in a box.

Interesting, I'll look into that... anything interesting/unique about
your setup? r329387 should have addressed one potential cause of this,
but I see you're past that.

Re: ACPI panic on boot with new Lua loader and other minor issues

2018-02-19 Thread David Wolfskill

On Mon, Feb 19, 2018 at 03:21:50PM +0100, Juan Ramón Molina Menor wrote:
> I have done a full build of r329555 to test the new Lua boot loader.
> 
> Both the new and the old kernels panic after being loaded with:
> 
> panic: running without device atpic requires a local APIC
> 
> For reasons unknown, ACPI is off, as shown by David Wolfskill in a 
> previous message:
> https://lists.freebsd.org/pipermail/freebsd-current/2018-February/068497.html

That has been fixed (for me, at least).  My last two build/smoke-tests
were at r329517 and r329561; I believe that r329366 was the fix for ACPI
detection/setting.

> OK show hint.acpi.0.disabled
> 1
> 
> Setting ACPI to On resolves the issue.
> 
> Also, I can not stop boot2 to try to use the copy of the Forth loader: 
> the keyboard only becomes responsive at the loader stage.

> There is an error during this stage:
> 
> Loading /boot/defaults/loader.conf
> Failed to open config: ’/boot/loader.conf.local’

IIUC, that's merely an informational message, not an error.  (None of my
systems have a /boot/loader.conf.local, either.)

> Moreover, the "boot [kernel]" loader command does not work:
> 
> OK ls /boot/kernel.old/kernel
>  /boot/kernel.old/kernel
> OK boot kernel.old
> Command failed
> OK boot /boot/kernel.old/kernel
> Command failed
> OK boot kernel
> Command failed
> 
> On the other hand, just "boot" works.

And the Lua loader permits kernel selection as well (as the Forth
loader does).

> Finally, the double lines drawing a frame around the loader menu do not 
> work with the new loader and are replaced by ? characters in a box.

That has also been fixed for me (as of r329517).

> Hope it helps,
> Juan
> 

Peace,
david
-- 
David H. Wolfskill  da...@catwhisker.org
The circus around that memo helps confirm that Mr. Trump is unfit for office.

See http://www.catwhisker.org/~david/publickey.gpg for my public key.



ACPI panic on boot with new Lua loader and other minor issues

2018-02-19 Thread Juan Ramón Molina Menor


Moreover, the "boot [kernel]" loader command does not work:

OK ls /boot/kernel.old/kernel
 /boot/kernel.old/kernel
OK boot kernel.old
Command failed
OK boot /boot/kernel.old/kernel
Command failed
OK boot kernel
Command failed



I forgot that I tried starting with "unload", which seems to work, but 
does not correct the issue:


OK unload
OK boot kernel.old
Command failed

ACPI panic on boot with new Lua loader and other minor issues

2018-02-19 Thread Juan Ramón Molina Menor


I have done a full build of r329555 to test the new Lua boot loader.

Both the new and the old kernels panic after being loaded with:

panic: running without device atpic requires a local APIC

For reasons unknown, ACPI is off, as shown by David Wolfskill in a 
previous message:

https://lists.freebsd.org/pipermail/freebsd-current/2018-February/068497.html

OK show hint.acpi.0.disabled
1

Setting ACPI to On resolves the issue.

Also, I can not stop boot2 to try to use the copy of the Forth loader: 
the keyboard only becomes responsive at the loader stage.


There is an error during this stage:

Loading /boot/defaults/loader.conf
Failed to open config: ’/boot/loader.conf.local’

Moreover, the "boot [kernel]" loader command does not work:

OK ls /boot/kernel.old/kernel
/boot/kernel.old/kernel
OK boot kernel.old
Command failed
OK boot /boot/kernel.old/kernel
Command failed
OK boot kernel
Command failed

On the other hand, just "boot" works.

Finally, the double lines drawing a frame around the loader menu do not 
work with the new loader and are replaced by ? characters in a box.


Hope it helps,
Juan

Re: panic on boot after SVN r328988

2018-02-16 Thread Olivier Houchard

Hi Michael,

On Fri, Feb 16, 2018 at 10:13:07AM -0500, Michael Butler wrote:
> On 02/16/18 10:05, Andrey V. Elsukov wrote:
> > On 16.02.2018 17:44, Michael Butler wrote:
> >>> do you have some specific optimization flags in make.conf?
> >>> Can you show the output of `head -40 /var/run/dmesg.boot`?
> >>>
> >>
> >> The only relevant flags in /etc/make.conf are ..
> >>
> >> CPUTYPE?=pentium3
> >> KERNCONF=SARAH
> >> NO_MODULES=YES
> >>
> >> Boot log from last night's failure attached,
> > 
> > Ok, it seems ConcurrencyKit was not tested with Pentium3.
> > Can you show the output from kgdb:
> > 
> >  disassemble *0xc0991ff0
> 
> That'd do it .. :-(

[...]

Sorry about this.
It should be fixed with r329388.
Can you update and let me know how it goes ?

Thanks !

Olivier

Re: panic on boot after SVN r328988

2018-02-16 Thread Michael Butler

On 02/16/18 10:05, Andrey V. Elsukov wrote:
> On 16.02.2018 17:44, Michael Butler wrote:
>>> do you have some specific optimization flags in make.conf?
>>> Can you show the output of `head -40 /var/run/dmesg.boot`?
>>>
>>
>> The only relevant flags in /etc/make.conf are ..
>>
>> CPUTYPE?=pentium3
>> KERNCONF=SARAH
>> NO_MODULES=YES
>>
>> Boot log from last night's failure attached,
> 
> Ok, it seems ConcurrencyKit was not tested with Pentium3.
> Can you show the output from kgdb:
> 
>  disassemble *0xc0991ff0

That'd do it .. :-(

(kgdb) disassemble 0xc0991ff0
Dump of assembler code for function dyn_lookup_ipv4_state:
   0xc0991fa0 <+0>:     push   %ebp
   0xc0991fa1 <+1>:     mov    %esp,%ebp
   0xc0991fa3 <+3>:     push   %ebx
   0xc0991fa4 <+4>:     push   %edi
   0xc0991fa5 <+5>:     push   %esi
   0xc0991fa6 <+6>:     sub    $0x18,%esp
   0xc0991fa9 <+9>:     mov    0x10(%ebp),%edx
   0xc0991fac <+12>:    mov    0xc0d19dd4,%ecx
   0xc0991fb2 <+18>:    dec    %ecx
   0xc0991fb3 <+19>:    and    0x4(%edx),%ecx
   0xc0991fb6 <+22>:    mov    0xc0d19dd8,%eax
   0xc0991fbb <+27>:    mov    (%eax,%ecx,4),%eax
   0xc0991fbe <+30>:    mov    %eax,0x8(%edx)
   0xc0991fc1 <+33>:    mov    0xc0d19ddc,%eax
   0xc0991fc6 <+38>:    mov    (%eax,%ecx,4),%eax
   0xc0991fc9 <+41>:    mov    %eax,-0x14(%ebp)
   0xc0991fcc <+44>:    mov    0xc0d19de0,%eax
   0xc0991fd1 <+49>:    mov    (%eax,%ecx,4),%ebx
   0xc0991fd4 <+52>:    test   %ebx,%ebx
   0xc0991fd6 <+54>:    je     0xc09920ba
   0xc0991fdc <+60>:    mov    0x8(%ebp),%edx
   0xc0991fdf <+63>:    mov    %ecx,-0x10(%ebp)
   0xc0991fe2 <+66>:    data16 data16 data16 data16 nopw %cs:0x0(%eax,%eax,1)
   0xc0991ff0 <+80>:    lfence
   0xc0991ff3 <+83>:    mov    %fs:0x58,%eax
   0xc0991ff9 <+89>:    mov    %eax,-0x18(%ebp)
   0xc0991ffc <+92>:    mov    %ebx,-0x3f409e80(%eax)
   0xc0992002 <+98>:    mov    0xc0d19ddc,%eax
   0xc0992007 <+103>:   mov    (%eax,%ecx,4),%eax
   0xc099200a <+106>:   cmp    %eax,-0x14(%ebp)
   0xc099200d <+109>:   jne    0xc099209d
   0xc0992013 <+115>:   movzbl 0x1(%ebx),%eax
   0xc0992017 <+119>:   cmp    0xd(%edx),%al
   0xc099201a <+122>:   jne    0xc0992090
   0xc099201c <+124>:   mov    0x10(%ebp),%eax
   0xc099201f <+127>:   movzwl 0x2(%eax),%eax
   0xc0992023 <+131>:   test   %ax,%ax
   0xc0992026 <+134>:   je     0xc099202e
   0xc0992028 <+136>:   cmp    %ax,0x2(%ebx)
   0xc099202c <+140>:   jne    0xc0992090
   0xc099202e <+142>:   movzwl 0x4(%ebx),%esi
   0xc0992032 <+146>:   movzwl 0xa(%edx),%edi
   0xc0992036 <+150>:   cmp    %di,%si
   0xc0992039 <+153>:   jne    0xc0992063
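
A quick cross-check of why the lfence at +80 could fault on this machine: lfence was introduced with SSE2, and the boot log posted elsewhere in this thread reports Features=0x387f9ff for the Pentium III. The sketch below decodes that word using the standard CPUID leaf 1 EDX bit positions (FXSR is bit 24, SSE is bit 25, SSE2 is bit 26). The helper name has_feature is made up for illustration.

```python
# CPUID leaf 1 EDX feature bits (standard x86 positions).
FEATURE_BITS = {"FXSR": 24, "SSE": 25, "SSE2": 26}

# Value taken from the "Features=" line of the posted boot log.
features = 0x387F9FF


def has_feature(word: int, name: str) -> bool:
    """Return True if the named feature bit is set in the EDX word."""
    return bool((word >> FEATURE_BITS[name]) & 1)


for name in FEATURE_BITS:
    print(f"{name:5s} {'yes' if has_feature(features, name) else 'no'}")
```

With this value SSE is reported and SSE2 is not, which is consistent with the trap at the lfence instruction.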




Re: panic on boot after SVN r328988

2018-02-16 Thread Andrey V. Elsukov

On 16.02.2018 17:44, Michael Butler wrote:
>> do you have some specific optimization flags in make.conf?
>> Can you show the output of `head -40 /var/run/dmesg.boot`?
>>
> 
> The only relevant flags in /etc/make.conf are ..
> 
> CPUTYPE?=pentium3
> KERNCONF=SARAH
> NO_MODULES=YES
> 
> Boot log from last night's failure attached,

Ok, it seems ConcurrencyKit was not tested with Pentium3.
Can you show the output from kgdb:

 disassemble *0xc0991ff0

-- 
WBR, Andrey V. Elsukov




Re: panic on boot after SVN r328988

2018-02-16 Thread Michael Butler

On 02/16/18 09:31, Andrey V. Elsukov wrote:
> On 16.02.2018 17:13, Michael Butler wrote:
>> ipfw is compiled into the kernel not loaded as a module.
> 
> Hi,
> 
> do you have some specific optimization flags in make.conf?
> Can you show the output of `head -40 /var/run/dmesg.boot`?
> 

The only relevant flags in /etc/make.conf are ..

CPUTYPE?=pentium3
KERNCONF=SARAH
NO_MODULES=YES

Boot log from last night's failure attached,

imb
Copyright (c) 1992-2018 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 12.0-CURRENT #0 r329344: Thu Feb 15 19:07:40 EST 2018
r...@sarah.protected-networks.net:/usr/obj/usr/src/i386.i386/sys/SARAH i386
FreeBSD clang version 6.0.0 (branches/release_60 324090) (based on LLVM 6.0.0)
VT(vga): resolution 640x480
CPU: Intel Pentium III (701.61-MHz 686-class CPU)
  Origin="GenuineIntel"  Id=0x681  Family=0x6  Model=0x8  Stepping=1

Features=0x387f9ff
real memory  = 536870912 (512 MB)
avail memory = 512491520 (488 MB)
random: unblocking device.
Timecounter "TSC" frequency 701607416 Hz quality 800
random: entropy device external interface
kbd1 at kbdmux0
ACPI: Overriding _OS definition with "Linux"
nexus0
vtvga0:  on motherboard
cryptosoft0:  on motherboard
acpi0:  on motherboard
acpi0: Power Button (fixed)
cpu0:  on acpi0
attimer0:  port 0x40-0x43 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
atrtc0:  port 0x70-0x71 irq 8 on acpi0
atrtc0: registered as a time-of-day clock, resolution 1.00s
Event timer "RTC" frequency 32768 Hz quality 0
Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
pcib0:  port 0xcf8-0xcff on acpi0
pcib0: Length mismatch for 3 range: 18000 vs 15000
pci0:  on pcib0
isab0:  at device 7.0 on pci0
isa0:  on isab0
atapci0:  port 
0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xffa0-0xffaf at device 7.1 on pci0
ata0:  at channel 0 on atapci0
ata1:  at channel 1 on atapci0
uhci0:  port 0xef80-0xef9f irq 5 at 
device 7.2 on pci0
usbus0 on uhci0
intsmb0:  port 0x440-0x44f at device 7.3 on pci0
intsmb0: intr IRQ 9 enabled revision 0
smbus0:  on intsmb0
smb0:  on smbus0
fxp0:  port 0xef00-0xef3f mem 
0xfebfa000-0xfebfafff,0xfea0-0xfeaf irq 7 at device 12.0 on pci0
miibus0:  on fxp0
inphy0:  PHY 1 on miibus0
inphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
fxp0: Ethernet address: 00:d0:b7:9a:87:e8
fxp1:  port 0xee80-0xeebf mem 
0xfebf9000-0xfebf9fff,0xfe80-0xfe8f irq 7 at device 13.0 on pci0
miibus1:  on fxp1
inphy1:  PHY 1 on miibus1
inphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto, auto-flow
fxp1: Ethernet address: 00:d0:b7:9a:87:e9
vgapci0:  mem 
0xfc00-0xfcff,0xfebfc000-0xfebf,0xfe00-0xfe7f irq 11 at 
device 14.0 on pci0
vgapci0: Boot video device
ahc0:  port 0xe800-0xe8ff mem 
0xfebfb000-0xfebfbfff irq 10 at device 15.0 on pci0
aic7892: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
acpi_button0:  on acpi0
atkbdc0:  port 0x60,0x64 irq 1 on acpi0
atkbd0:  irq 1 on atkbdc0
kbd0 at atkbd0
atkbd0: [GIANT-LOCKED]
uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0
orm0:  at iomem 
0xc-0xc7fff,0xcd800-0xcefff,0xcf000-0xd07ff pnpid ORM on isa0
Timecounters tick every 1.000 msec
ipfw2 (+ipv6) initialized, divert loadable, nat loadable, default to deny, 
logging usbus0: 12Mbps Full Speed USB v1.0
disabled
ugen0.1:  at usbus0
uhub0:  on usbus0
uhub0: 2 ports with 2 removable, self powered
da0 at ahc0 bus 0 scbus2 target 0 lun 0
da0:  Fixed Direct Access SCSI-3 device
da0: Serial Number 141018850941
da0: 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
da0: Command Queueing enabled
da0: 17510MB (35861388 512 byte sectors)
da1 at ahc0 bus 0 scbus2 target 1 lun 0
da1:  Fixed Direct Access SCSI-3 device
da1: Serial Number 141018850840
da1: 160.000MB/s transfers (80.000MHz DT, offset 63, 16bit)
da1: Command Queueing enabled
da1: 17510MB (35861388 512 byte sectors)
cd0 at ata1 bus 0 scbus1 target 0 lun 0
cd0:  Removable CD-ROM SCSI device
cd0: 16.700MB/s transfers (WDMA2, ATAPI 12bytes, PIO 65534bytes)
cd0: Att
hwpmc: SOFT/16/64/0x67 TSC/1/64/0x20 
P6/2/40/0x1fe
Trying to mount root from ufs:/dev/mirror/gm0s1a [rw]...
GEOM_MIRROR: Cancelling unmapped because of da1s1.
GEOM_MIRROR: Cancelling unmapped because of da0s1.
GEOM_MIRROR: Device mirror/gm0s1 launched (2/2).
warning: total configured swap (287821 pages) exceeds maximum recommended 
amount (251328 pages).
warning: increase kern.maxswzone or reduce amount of swap.
bridge0: Ethernet address:

Re: panic on boot after SVN r328988

2018-02-16 Thread Andrey V. Elsukov

On 16.02.2018 17:13, Michael Butler wrote:
> ipfw is compiled into the kernel not loaded as a module.

Hi,

do you have some specific optimization flags in make.conf?
Can you show the output of `head -40 /var/run/dmesg.boot`?

-- 
WBR, Andrey V. Elsukov




panic on boot after SVN r328988

2018-02-16 Thread Michael Butler

This is on a slow (and remote :-() i386

(kgdb) bt
#0  0xc076bfe8 in doadump ()
#1  0xc076c008 in doadump ()
#2  0xc0d00ee0 in suspend_blocked ()
#3  0xcf607548 in ?? ()
#4  0xc076bd8b in kern_reboot ()
#5  0xc076c141 in vpanic ()
#6  0xc076c03b in panic ()
#7  0xc0ab5065 in trap_fatal ()
#8  0xc0ab4de0 in trap ()
#9  
#10 0xc0991ff0 in dyn_lookup_ipv4_state ()
#11 0xc0992323 in ipfw_dyn_lookup_state ()
#12 0xc098e2b9 in ipfw_chk ()
#13 0xc09983b9 in ipfw_check_packet ()
#14 0xc0879cf8 in pfil_run_hooks ()
#15 0xc08983e4 in ip_input ()
#16 0xc0878e94 in netisr_dispatch_src ()
#17 0xc0879190 in netisr_dispatch ()
#18 0xc0860434 in ether_demux ()
#19 0xc08611c9 in ether_nh_input ()
#20 0xc0878e94 in netisr_dispatch_src ()
#21 0xc0879190 in netisr_dispatch ()
#22 0xc08607aa in ether_input ()
#23 0xc08574e9 in if_input ()
#24 0xc05b5bbe in fxp_intr_body ()
#25 0xc05b4137 in fxp_intr ()
#26 0xc0732da9 in intr_event_execute_handlers ()
#27 0xc073303f in ithread_loop ()
#28 0xc0730351 in fork_exit ()
#29 

ipfw is compiled into the kernel not loaded as a module.

The best I can do to capture where in the boot process it was comes from
searching the vmcore file for strings ..

 [ .. ]

<118>devmatch: Can't read linker hints file.
<118>add host 127.0.0.1: gateway lo0 fib 0: route already in table
<118>add net default: gateway 64.xx.xxx.x
<118>Additional inet routing options: gateway=YES.
<118>add host ::1: gateway lo0 fib 0: route already in table
<118>add net fe80::: gateway ::1
<118>add net ff02::: gateway ::1
<118>add net :::0.0.0.0: gateway ::1
<118>add net ::0.0.0.0: gateway ::1
<118>add net default: gateway 2001:xxx::xxx::x
<118>Additional inet6 routing options: gateway=YES.
<118>00600 allow ip6 from any to any via lo0
<118>00700 deny ip6 from any to ::1
<118>00800 deny ip6 from ::1 to any
<118>00900 allow ip6 from :: to ff02::/16 ipv6-icmp
<118>01000 allow ip6 from fe80::/10 to fe80::/10 ipv6-icmp
<118>01100 allow ip6 from fe80::/10 to ff02::/16 ipv6-icmp
<118>01200 allow ip6 from any to any proto ipv6-icmp ip6 icmp6types
1,2,135,136
<118>01300 allow ip6 from me6 to any
<110>ipfw: 4900 Deny TCP 77.72.82.80:50823 202.xx.xxx.xx:7696 in via fxp0
<118>Firewall rules loaded.
<5>bridge0: link state changed to DOWN
<6>tap0: promiscuous mode enabled
<110>ipfw: 4900 Deny TCP 77.72.85.24:57077 202.xx.xxx.xx:33970 in via fxp0
<118>Creating and/or trimming log files
<110>ipfw: 4900 Deny TCP 5.188.11.111:59849 202.xx.xxx.xxx:7300 in via fxp0
<118>.
<118>Starting syslogd.
<110>ipfw: 4900 Deny TCP 5.188.11.25:51041 202.xx.xxx.xxx:20826 in via fxp0
<118>Setting date via ntp.
Fatal trap 1: privileged instruction fault while in kernel mode
instruction pointer = 0x20:0xc0991ff0
stack pointer   = 0x28:0xcf607704
frame pointer   = 0x28:0xcf607728
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 12 (irq7: fxp0 fxp1)
trap number = 1
panic: privileged instruction fault
time = 1518744036
Uptime: 26s
Physical memory: 499 MB
Dumping 51 MB:
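
The trap record and the backtrace above can be tied together mechanically. This is only a sketch: the two input strings are copied from this message, while func_start is the first address in the kgdb listing posted later in the thread, so it is an assumption as far as this message alone is concerned.

```python
import re

# Lines copied from the trap record and the kgdb backtrace above.
trap_line = "instruction pointer = 0x20:0xc0991ff0"
frame_line = "#10 0xc0991ff0 in dyn_lookup_ipv4_state ()"

# First address of dyn_lookup_ipv4_state in the later kgdb listing (assumed here).
func_start = 0xC0991FA0

# The trap record prints the address as selector:offset.
match = re.search(r"(0x[0-9a-f]+):(0x[0-9a-f]+)", trap_line)
selector, eip = (int(part, 16) for part in match.groups())

# The backtrace frame prints "#N <pc> in <function> ()".
frame_pc = int(re.search(r"#\d+\s+(0x[0-9a-f]+)", frame_line).group(1), 16)

# Byte offset of the faulting instruction inside the function.
offset = eip - func_start
print(hex(selector), hex(eip), eip == frame_pc, offset)
```

The faulting address matches frame #10, and the offset of 80 bytes is the instruction to look for in a disassembly of dyn_lookup_ipv4_state.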



Panic on Boot - Current AMD64

2018-02-08 Thread Juan Ramón Molina Menor


On Wed, Feb 07, 2018 at 12:18:26PM +0100, Juan Ramón Molina Menor wrote:
J> > Same panic here with HEAD from this afternoon in a Lenovo ThinkPad S440 
J> > with 4 GB.
J> > 
J> > Workaround: break into the loader prompt and:
J> > 
J> > set vm.boot_pages=120

J> > boot
J> > 
J> > When booting kernel.old, vm.boot_pages is 64.
J> > 
J> > There is something wrong with r328916.
J> 
J> Recent commits 328955, 328953 and 328952 by glebius@ do not resolve the 
J> issue here.


r328982 should fix the boot without specifying vm.boot_pages.

I'm sorry for the problems.

--
Gleb Smirnoff


Yes, it is fixed, thanks!

Re: Panic on Boot - Current AMD64

2018-02-07 Thread Gleb Smirnoff

On Wed, Feb 07, 2018 at 12:18:26PM +0100, Juan Ramón Molina Menor wrote:
J> > Same panic here with HEAD from this afternoon in a Lenovo ThinkPad S440 
J> > with 4 GB.
J> > 
J> > Workaround: break into the loader prompt and:
J> > 
J> > set vm.boot_pages=120
J> > boot
J> > 
J> > When booting kernel.old, vm.boot_pages is 64.
J> > 
J> > There is something wrong with r328916.
J> 
J> Recent commits 328955, 328953 and 328952 by glebius@ do not resolve the 
J> issue here.

r328982 should fix the boot without specifying vm.boot_pages.

I'm sorry for the problems.

-- 
Gleb Smirnoff

RE: Panic on Boot - Current AMD64

2018-02-07 Thread M - Krasznai András

Hi

I experienced the problem yesterday, but after completely resynchronizing the
src tree (deleting the contents of /usr/src, then a fresh svnlite checkout)
and recompiling the kernel, I can boot my FreeBSD-CURRENT (12.0, amd64,
r328967) with the default setting (64) for vm.boot_pages.

rgds
András Krasznai



-----Original Message-----
From: owner-freebsd-curr...@freebsd.org
[mailto:owner-freebsd-curr...@freebsd.org] On Behalf Of Juan Ramón Molina Menor
Sent: 7 February 2018 12:18
To: freebsd-current@freebsd.org
Cc: manfredan...@gmail.com; gleb...@freebsd.org
Subject: Panic on Boot - Current AMD64

>> I get panic on boot from current kernel.
>> Since last night - changes to vm system ?
>> World is Current as of this morning
>>
>> FreeBSD is a registered trademark of The FreeBSD Foundation.
>> FreeBSD 12.0-CURRENT #0 r328948: Tue Feb  6 11:30:57 PST 2018
>>  root at pozo.com:/usr/src/sys/amd64/compile/pozo amd64
>> FreeBSD clang version 6.0.0 (branches/release_60 324090) (based on LLVM 6.0.0)
>> Table 'FACP' at 0xdfbc57e8
>> Table 'APIC' at 0xdfbc585c
>> Table 'ASF!' at 0xdfbc58e0
>> Table 'MCFG' at 0xdfbc5943
>> Table 'TCPA' at 0xdfbc597f
>> Table 'SLIC' at 0xdfbc59b1
>> Table 'HPET' at 0xdfbc5b27
>> ACPI: No SRAT table found
>> panic: UMA: Increase vm.boot_pages
>> cpuid = 0
>> time = 1
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0x820bc820
>> vpanic() at vpanic+0x18d/frame 0x820bc880
>> panic() at panic+0x43/frame 0x820bc8e0
>> startup_alloc() at startup_alloc+0x19c/frame 0x820bc940
>> keg_alloc_slab() at keg_alloc_slab+0xef/frame 0x820bc9c0
>> keg_fetch_slab() at keg_fetch_slab+0x128/frame 0x820bca20
>> zone_fetch_slab() at zone_fetch_slab+0x69/frame 0x820bca50
>> zone_import() at zone_import+0x5a/frame 0x820bcaa0
>> zone_alloc_item() at zone_alloc_item+0x3b/frame 0x820bcae0
>> uma_startup() at uma_startup+0x3d3/frame 0x820bcbd0
>> vm_page_startup() at vm_page_startup+0x338/frame 0x820bcc20
>> vm_mem_init() at vm_mem_init+0x1d/frame 0x820bcc50
>> mi_startup() at mi_startup+0x118/frame 0x820bcc70
>> btext() at btext+0x2c
>> KDB: enter: panic
>> [ thread pid 0 tid 0 ]
>> Stopped at  kdb_enter+0x3b: movq$0,kdb_why
>> db> bt
>> Tracing pid 0 tid 0 td 0x80ff1240
>> kdb_enter() at kdb_enter+0x3b/frame 0x820bc820
>> vpanic() at vpanic+0x1aa/frame 0x820bc880
>> panic() at panic+0x43/frame 0x820bc8e0
>> startup_alloc() at startup_alloc+0x19c/frame 0x820bc940
>> keg_alloc_slab() at keg_alloc_slab+0xef/frame 0x820bc9c0
>> keg_fetch_slab() at keg_fetch_slab+0x128/frame 0x820bca20
>> zone_fetch_slab() at zone_fetch_slab+0x69/frame 0x820bca50
>> zone_import() at zone_import+0x5a/frame 0x820bcaa0
>> zone_alloc_item() at zone_alloc_item+0x3b/frame 0x820bcae0
>> uma_startup() at uma_startup+0x3d3/frame 0x820bcbd0
>> vm_page_startup() at vm_page_startup+0x338/frame 0x820bcc20
>> vm_mem_init() at vm_mem_init+0x1d/frame 0x820bcc50
>> mi_startup() at mi_startup+0x118/frame 0x820bcc70
>> btext() at btext+0x2c
>> db>
> 
> 
> Same panic here with HEAD from this afternoon in a Lenovo ThinkPad 
> S440 with 4 GB.
> 
> Workaround: break into the loader prompt and:
> 
> set vm.boot_pages=120
> boot
> 
> When booting kernel.old, vm.boot_pages is 64.
> 
> There is something wrong with r328916.
> 
> Hope it helps,
> Juan

Hi!

Recent commits 328955, 328953 and 328952 by glebius@ do not resolve the issue 
here.

Hope it helps,
Juan

Panic on Boot - Current AMD64

2018-02-07 Thread Juan Ramón Molina Menor


I get panic on boot from current kernel.
Since last night - changes to vm system ?
World is Current as of this morning

FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 12.0-CURRENT #0 r328948: Tue Feb  6 11:30:57 PST 2018
 root at pozo.com:/usr/src/sys/amd64/compile/pozo amd64

FreeBSD clang version 6.0.0 (branches/release_60 324090) (based on LLVM 6.0.0)
Table 'FACP' at 0xdfbc57e8
Table 'APIC' at 0xdfbc585c
Table 'ASF!' at 0xdfbc58e0
Table 'MCFG' at 0xdfbc5943
Table 'TCPA' at 0xdfbc597f
Table 'SLIC' at 0xdfbc59b1
Table 'HPET' at 0xdfbc5b27
ACPI: No SRAT table found
panic: UMA: Increase vm.boot_pages
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0x820bc820
vpanic() at vpanic+0x18d/frame 0x820bc880
panic() at panic+0x43/frame 0x820bc8e0
startup_alloc() at startup_alloc+0x19c/frame 0x820bc940
keg_alloc_slab() at keg_alloc_slab+0xef/frame 0x820bc9c0
keg_fetch_slab() at keg_fetch_slab+0x128/frame 0x820bca20
zone_fetch_slab() at zone_fetch_slab+0x69/frame 0x820bca50
zone_import() at zone_import+0x5a/frame 0x820bcaa0
zone_alloc_item() at zone_alloc_item+0x3b/frame 0x820bcae0
uma_startup() at uma_startup+0x3d3/frame 0x820bcbd0
vm_page_startup() at vm_page_startup+0x338/frame 0x820bcc20
vm_mem_init() at vm_mem_init+0x1d/frame 0x820bcc50
mi_startup() at mi_startup+0x118/frame 0x820bcc70
btext() at btext+0x2c
KDB: enter: panic
[ thread pid 0 tid 0 ]
Stopped at  kdb_enter+0x3b: movq$0,kdb_why
db> bt
Tracing pid 0 tid 0 td 0x80ff1240
kdb_enter() at kdb_enter+0x3b/frame 0x820bc820
vpanic() at vpanic+0x1aa/frame 0x820bc880
panic() at panic+0x43/frame 0x820bc8e0
startup_alloc() at startup_alloc+0x19c/frame 0x820bc940
keg_alloc_slab() at keg_alloc_slab+0xef/frame 0x820bc9c0
keg_fetch_slab() at keg_fetch_slab+0x128/frame 0x820bca20
zone_fetch_slab() at zone_fetch_slab+0x69/frame 0x820bca50
zone_import() at zone_import+0x5a/frame 0x820bcaa0
zone_alloc_item() at zone_alloc_item+0x3b/frame 0x820bcae0
uma_startup() at uma_startup+0x3d3/frame 0x820bcbd0
vm_page_startup() at vm_page_startup+0x338/frame 0x820bcc20
vm_mem_init() at vm_mem_init+0x1d/frame 0x820bcc50
mi_startup() at mi_startup+0x118/frame 0x820bcc70
btext() at btext+0x2c
db>



Same panic here with HEAD from this afternoon in a Lenovo ThinkPad S440 
with 4 GB.


Workaround: break into the loader prompt and:

set vm.boot_pages=120
boot

When booting kernel.old, vm.boot_pages is 64.
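
For a sense of scale, the two values differ by less than a quarter of a megabyte. This back-of-the-envelope calculation assumes the usual 4 KiB base page size on amd64 and that vm.boot_pages counts whole pages; neither is stated in this thread, and boot_reserve_kib is just an illustrative helper.

```python
PAGE_SIZE = 4096  # bytes; assumed base page size on amd64


def boot_reserve_kib(pages: int) -> int:
    """Size in KiB of the early-boot reserve for a given vm.boot_pages value."""
    return pages * PAGE_SIZE // 1024


default_kib = boot_reserve_kib(64)      # value seen when booting kernel.old
workaround_kib = boot_reserve_kib(120)  # value used in the workaround
print(default_kib, workaround_kib, workaround_kib - default_kib)
```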

There is something wrong with r328916.

Hope it helps,
Juan


Hi!

Recent commits 328955, 328953 and 328952 by glebius@ do not resolve the 
issue here.


Hope it helps,
Juan

Panic on Boot - Current AMD64

2018-02-06 Thread Juan Ramón Molina Menor


I get panic on boot from current kernel.
Since last night - changes to vm system ?
World is Current as of this morning

FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 12.0-CURRENT #0 r328948: Tue Feb  6 11:30:57 PST 2018
    root at pozo.com:/usr/src/sys/amd64/compile/pozo amd64

FreeBSD clang version 6.0.0 (branches/release_60 324090) (based on LLVM 6.0.0)
Table 'FACP' at 0xdfbc57e8
Table 'APIC' at 0xdfbc585c
Table 'ASF!' at 0xdfbc58e0
Table 'MCFG' at 0xdfbc5943
Table 'TCPA' at 0xdfbc597f
Table 'SLIC' at 0xdfbc59b1
Table 'HPET' at 0xdfbc5b27
ACPI: No SRAT table found
panic: UMA: Increase vm.boot_pages
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0x820bc820
vpanic() at vpanic+0x18d/frame 0x820bc880
panic() at panic+0x43/frame 0x820bc8e0
startup_alloc() at startup_alloc+0x19c/frame 0x820bc940
keg_alloc_slab() at keg_alloc_slab+0xef/frame 0x820bc9c0
keg_fetch_slab() at keg_fetch_slab+0x128/frame 0x820bca20
zone_fetch_slab() at zone_fetch_slab+0x69/frame 0x820bca50
zone_import() at zone_import+0x5a/frame 0x820bcaa0
zone_alloc_item() at zone_alloc_item+0x3b/frame 0x820bcae0
uma_startup() at uma_startup+0x3d3/frame 0x820bcbd0
vm_page_startup() at vm_page_startup+0x338/frame 0x820bcc20
vm_mem_init() at vm_mem_init+0x1d/frame 0x820bcc50
mi_startup() at mi_startup+0x118/frame 0x820bcc70
btext() at btext+0x2c
KDB: enter: panic
[ thread pid 0 tid 0 ]
Stopped at  kdb_enter+0x3b: movq    $0,kdb_why
db> bt
Tracing pid 0 tid 0 td 0x80ff1240
kdb_enter() at kdb_enter+0x3b/frame 0x820bc820
vpanic() at vpanic+0x1aa/frame 0x820bc880
panic() at panic+0x43/frame 0x820bc8e0
startup_alloc() at startup_alloc+0x19c/frame 0x820bc940
keg_alloc_slab() at keg_alloc_slab+0xef/frame 0x820bc9c0
keg_fetch_slab() at keg_fetch_slab+0x128/frame 0x820bca20
zone_fetch_slab() at zone_fetch_slab+0x69/frame 0x820bca50
zone_import() at zone_import+0x5a/frame 0x820bcaa0
zone_alloc_item() at zone_alloc_item+0x3b/frame 0x820bcae0
uma_startup() at uma_startup+0x3d3/frame 0x820bcbd0
vm_page_startup() at vm_page_startup+0x338/frame 0x820bcc20
vm_mem_init() at vm_mem_init+0x1d/frame 0x820bcc50
mi_startup() at mi_startup+0x118/frame 0x820bcc70
btext() at btext+0x2c
db>



Same panic here with HEAD from this afternoon on a Lenovo ThinkPad S440 
with 4 GB of RAM.


Workaround: break into the loader prompt and:

set vm.boot_pages=120
boot

When booting kernel.old, vm.boot_pages is 64.

There is something wrong with r328916.

Hope it helps,
Juan



Panic on Boot - Current AMD64

2018-02-06 Thread Manfred Antar

I get panic on boot from current kernel.
Since last night - changes to vm system ?
World is Current as of this morning

FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 12.0-CURRENT #0 r328948: Tue Feb  6 11:30:57 PST 2018
r...@pozo.com:/usr/src/sys/amd64/compile/pozo amd64
FreeBSD clang version 6.0.0 (branches/release_60 324090) (based on LLVM 6.0.0)
Table 'FACP' at 0xdfbc57e8
Table 'APIC' at 0xdfbc585c
Table 'ASF!' at 0xdfbc58e0
Table 'MCFG' at 0xdfbc5943
Table 'TCPA' at 0xdfbc597f
Table 'SLIC' at 0xdfbc59b1
Table 'HPET' at 0xdfbc5b27
ACPI: No SRAT table found
panic: UMA: Increase vm.boot_pages
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0x820bc820
vpanic() at vpanic+0x18d/frame 0x820bc880
panic() at panic+0x43/frame 0x820bc8e0
startup_alloc() at startup_alloc+0x19c/frame 0x820bc940
keg_alloc_slab() at keg_alloc_slab+0xef/frame 0x820bc9c0
keg_fetch_slab() at keg_fetch_slab+0x128/frame 0x820bca20
zone_fetch_slab() at zone_fetch_slab+0x69/frame 0x820bca50
zone_import() at zone_import+0x5a/frame 0x820bcaa0
zone_alloc_item() at zone_alloc_item+0x3b/frame 0x820bcae0
uma_startup() at uma_startup+0x3d3/frame 0x820bcbd0
vm_page_startup() at vm_page_startup+0x338/frame 0x820bcc20
vm_mem_init() at vm_mem_init+0x1d/frame 0x820bcc50
mi_startup() at mi_startup+0x118/frame 0x820bcc70
btext() at btext+0x2c
KDB: enter: panic
[ thread pid 0 tid 0 ]
Stopped at  kdb_enter+0x3b: movq    $0,kdb_why
db> bt
Tracing pid 0 tid 0 td 0x80ff1240
kdb_enter() at kdb_enter+0x3b/frame 0x820bc820
vpanic() at vpanic+0x1aa/frame 0x820bc880
panic() at panic+0x43/frame 0x820bc8e0
startup_alloc() at startup_alloc+0x19c/frame 0x820bc940
keg_alloc_slab() at keg_alloc_slab+0xef/frame 0x820bc9c0
keg_fetch_slab() at keg_fetch_slab+0x128/frame 0x820bca20
zone_fetch_slab() at zone_fetch_slab+0x69/frame 0x820bca50
zone_import() at zone_import+0x5a/frame 0x820bcaa0
zone_alloc_item() at zone_alloc_item+0x3b/frame 0x820bcae0
uma_startup() at uma_startup+0x3d3/frame 0x820bcbd0
vm_page_startup() at vm_page_startup+0x338/frame 0x820bcc20
vm_mem_init() at vm_mem_init+0x1d/frame 0x820bcc50
mi_startup() at mi_startup+0x118/frame 0x820bcc70
btext() at btext+0x2c
db> 


[SOLVED] Re: Kernel Panic On Boot after r327979

2018-01-15 Thread Pete Wright



On 01/15/2018 09:26, Pete Wright wrote:

Hello,

I updated an amd64 system last night to r327979 and it panics into gdb 
after rc attempts to mount local filesystems.


The panic line is:
Fatal trap 12: page fault while in kernel mode

gdb states that it stopped at:
Stopped at    prison_allow+0x4    movq    0x30(%rdi),%rax


Is this a known issue?  This is my primary workstation - so I'm going 
to revert back to an older kernel, but if more info is needed I can 
put some cycles into debugging today.


closing the loop on this.  it looks like the panic was due to debugfs 
being mounted on my system.  debugfs is part of the drm-next-kmod port 
which enables i915 gfx on recent intel GPU's, and debugfs doesn't *need* 
to be mounted but is quite useful and fun to play with.


anywho - the fix on my end is to:
- remove the drm-next-kmod port/pkg
- build and install latest kernel+world
- reboot and build/install the drm-next-kmod port/pkg
- reboot and enjoy i915 graphics

i'm kinda interested in what prison_allow does now as i haven't run 
across it before.  is this part of the jails infrastructure?


cheers,
-p

--
Pete Wright
p...@nomadlogic.org
@nomadlogicLA


Kernel Panic On Boot after r327979

2018-01-15 Thread Pete Wright


Hello,

I updated an amd64 system last night to r327979 and it panics into gdb 
after rc attempts to mount local filesystems.


The panic line is:
Fatal trap 12: page fault while in kernel mode

gdb states that it stopped at:
Stopped at    prison_allow+0x4    movq    0x30(%rdi),%rax


Is this a known issue?  This is my primary workstation - so I'm going to 
revert back to an older kernel, but if more info is needed I can put 
some cycles into debugging today.


Cheers!
-pete


--
Pete Wright
p...@nomadlogic.org
@nomadlogicLA


Re: r323412: Panic on boot (slab->us_keg == keg)

2017-09-13 Thread Mark Johnston

On Tue, Sep 12, 2017 at 10:34:00AM +0200, Raphael Kubo da Costa wrote:
> Mark Johnston writes:
> 
> > I think the bug is that keg_large_init() doesn't take
> > sizeof(struct uma_slab) into account when setting uk_ppera for the keg.
> > In particular, the bug isn't specific to the bootup process; it only
> > affects internal zones with an item size in the range [4016, 4096].
> >
> > The patch below should fix this - could you give it a try?
> 
> I've tried it and can confirm it fixed the panic here.

Thanks, committed as r323544.

Re: r323412: Panic on boot (slab->us_keg == keg)

2017-09-12 Thread Andrey V. Elsukov

On 12.09.2017 06:35, Mark Johnston wrote:
>> [...]
>> FreeBSD/SMP: 2 package(s) x 14 core(s) x 2 hardware threads
>>
>> Also I determined that it can successfully boot with disabled
>> hyper-threading.
> 
> After the change to CACHE_LINE_SIZE, we have
> sizeof(struct uma_zone) == 448 and sizeof(struct uma_cache) == 64. With
> 56 CPUs, we therefore need 4032 bytes per UMA zone, plus 80 bytes for
> the slab header - "internal" zones always keep the slab header in the
> slab itself. That's slightly larger than one page, but the UMA zone
> zone's keg will have uk_ppera == 1. So, when allocating slabzone,
> keg_alloc_slab() will call startup_alloc(uk_ppera * PAGE_SIZE), which
> will allocate 4096 bytes for a structure that is 4032 + 80 = 4112 bytes
> in size.
> 
> I think the bug is that keg_large_init() doesn't take
> sizeof(struct uma_slab) into account when setting uk_ppera for the keg.
> In particular, the bug isn't specific to the bootup process; it only
> affects internal zones with an item size in the range [4016, 4096].
> 
> The patch below should fix this - could you give it a try?
Hi Mark,

I can confirm that it fixes this panic. Thanks!

-- 
WBR, Andrey V. Elsukov




Re: r323412: Panic on boot (slab->us_keg == keg)

2017-09-12 Thread Raphael Kubo da Costa

Mark Johnston writes:

> I think the bug is that keg_large_init() doesn't take
> sizeof(struct uma_slab) into account when setting uk_ppera for the keg.
> In particular, the bug isn't specific to the bootup process; it only
> affects internal zones with an item size in the range [4016, 4096].
>
> The patch below should fix this - could you give it a try?

I've tried it and can confirm it fixed the panic here.

Re: r323412: Panic on boot (slab->us_keg == keg)

2017-09-11 Thread Mark Johnston

On Mon, Sep 11, 2017 at 09:15:51PM +0300, Andrey V. Elsukov wrote:
> On 11.09.2017 15:23, Andrey V. Elsukov wrote:
> > --- trap 0xc, rip = 0x80d84870, rsp = 0x82193970, rbp =
> > 0x821939b0 ---
> > zone_import() at zone_import+0x110/frame 0x821939b0
> > zone_alloc_item() at zone_alloc_item+0x36/frame 0x821939f0
> > uma_startup() at uma_startup+0x1d0/frame 0x82193ae0
> > vm_page_startup() at vm_page_startup+0x34e/frame 0x82193b30
> > vm_mem_init() at vm_mem_init+0x1a/frame 0x82193b50
> > mi_startup() at mi_startup+0x9c/frame 0x82193b70
> > btext() at btext+0x2c
> > Uptime: 1s
> 
> I bisected revisions, and the last working is r322988.
> This machine is E5-2660 v4@ based.
> 
> [...]
> FreeBSD/SMP: 2 package(s) x 14 core(s) x 2 hardware threads
> 
> Also I determined that it can successfully boot with disabled
> hyper-threading.

After the change to CACHE_LINE_SIZE, we have
sizeof(struct uma_zone) == 448 and sizeof(struct uma_cache) == 64. With
56 CPUs, we therefore need 4032 bytes per UMA zone, plus 80 bytes for
the slab header - "internal" zones always keep the slab header in the
slab itself. That's slightly larger than one page, but the UMA zone
zone's keg will have uk_ppera == 1. So, when allocating slabzone,
keg_alloc_slab() will call startup_alloc(uk_ppera * PAGE_SIZE), which
will allocate 4096 bytes for a structure that is 4032 + 80 = 4112 bytes
in size.

I think the bug is that keg_large_init() doesn't take
sizeof(struct uma_slab) into account when setting uk_ppera for the keg.
In particular, the bug isn't specific to the bootup process; it only
affects internal zones with an item size in the range [4016, 4096].

The patch below should fix this - could you give it a try?

diff --git a/sys/vm/uma_core.c b/sys/vm/uma_core.c
index 44c91e66769a..48daeb18f9c3 100644
--- a/sys/vm/uma_core.c
+++ b/sys/vm/uma_core.c
@@ -1306,10 +1306,6 @@ keg_large_init(uma_keg_t keg)
 	keg->uk_ipers = 1;
 	keg->uk_rsize = keg->uk_size;
 
-	/* We can't do OFFPAGE if we're internal, bail out here. */
-	if (keg->uk_flags & UMA_ZFLAG_INTERNAL)
-		return;
-
 	/* Check whether we have enough space to not do OFFPAGE. */
 	if ((keg->uk_flags & UMA_ZONE_OFFPAGE) == 0) {
 		shsize = sizeof(struct uma_slab);
@@ -1317,8 +1313,17 @@ keg_large_init(uma_keg_t keg)
 			shsize = (shsize & ~UMA_ALIGN_PTR) +
 			    (UMA_ALIGN_PTR + 1);
 
-		if ((PAGE_SIZE * keg->uk_ppera) - keg->uk_rsize < shsize)
-			keg->uk_flags |= UMA_ZONE_OFFPAGE;
+		if ((PAGE_SIZE * keg->uk_ppera) - keg->uk_rsize < shsize) {
+			/*
+			 * We can't do offpage if we're internal, in which case
+			 * we need an extra page per allocation to contain the
+			 * slab header.
+			 */
+			if ((keg->uk_flags & UMA_ZFLAG_INTERNAL) == 0)
+				keg->uk_flags |= UMA_ZONE_OFFPAGE;
+			else
+				keg->uk_ppera++;
+		}
 	}
 
 	if ((keg->uk_flags & UMA_ZONE_OFFPAGE) &&
diff --git a/sys/vm/vm_page.c b/sys/vm/vm_page.c
index ee7b93bbd719..477a816b0bd2 100644
--- a/sys/vm/vm_page.c
+++ b/sys/vm/vm_page.c
@@ -475,7 +475,8 @@ vm_page_startup(vm_offset_t vaddr)
 	 * in proportion to the zone structure size.
 	 */
 	pages_per_zone = howmany(sizeof(struct uma_zone) +
-	    sizeof(struct uma_cache) * (mp_maxid + 1), UMA_SLAB_SIZE);
+	    sizeof(struct uma_slab) + sizeof(struct uma_cache) * (mp_maxid + 1),
+	    UMA_SLAB_SIZE);
 	if (pages_per_zone > 1) {
 		/* Reserve more pages so that we don't run out. */
 		boot_pages = UMA_BOOT_PAGES_ZONES * pages_per_zone;

Re: r323412: Panic on boot (slab->us_keg == keg)

2017-09-11 Thread Raphael Kubo da Costa

Raphael Kubo da Costa writes:

> "Andrey V. Elsukov" writes:
>
>> On 11.09.2017 15:23, Andrey V. Elsukov wrote:
>>
>>> --- trap 0xc, rip = 0x80d84870, rsp = 0x82193970, rbp =
>>> 0x821939b0 ---
>>> zone_import() at zone_import+0x110/frame 0x821939b0
>>> zone_alloc_item() at zone_alloc_item+0x36/frame 0x821939f0
>>> uma_startup() at uma_startup+0x1d0/frame 0x82193ae0
>>> vm_page_startup() at vm_page_startup+0x34e/frame 0x82193b30
>>> vm_mem_init() at vm_mem_init+0x1a/frame 0x82193b50
>>> mi_startup() at mi_startup+0x9c/frame 0x82193b70
>>> btext() at btext+0x2c
>>> Uptime: 1s
>>
>> I bisected revisions, and the last working is r322988.
>
> [...]
>
>> Also I determined that it can successfully boot with disabled
>> hyper-threading.
>
> Did you mistype the revision number? r322988 is "rtwn(4): some initial
> preparations for (basic) VHT support" by avos@.

Sorry for the brain fart. I can confirm that reverting r322989 ("Drop
CACHE_LINE_SIZE to 64 bytes on x86") here on top of r323412 allows the
boot to proceed here.

Re: r323412: Panic on boot (slab->us_keg == keg)

2017-09-11 Thread Andrey V. Elsukov

On 11.09.2017 21:38, John Baldwin wrote:
> On Monday, September 11, 2017 09:15:51 PM Andrey V. Elsukov wrote:
>> On 11.09.2017 15:23, Andrey V. Elsukov wrote:
>>> --- trap 0xc, rip = 0x80d84870, rsp = 0x82193970, rbp =
>>> 0x821939b0 ---
>>> zone_import() at zone_import+0x110/frame 0x821939b0
>>> zone_alloc_item() at zone_alloc_item+0x36/frame 0x821939f0
>>> uma_startup() at uma_startup+0x1d0/frame 0x82193ae0
>>> vm_page_startup() at vm_page_startup+0x34e/frame 0x82193b30
>>> vm_mem_init() at vm_mem_init+0x1a/frame 0x82193b50
>>> mi_startup() at mi_startup+0x9c/frame 0x82193b70
>>> btext() at btext+0x2c
>>> Uptime: 1s
>>
>> I bisected revisions, and the last working is r322988.
>> This machine is E5-2660 v4@ based.
> 
> If you just revert r322988 on a newer tree does it work ok?

r322988 works, reverting r322989 (commit about CACHELINE) does help.

-- 
WBR, Andrey V. Elsukov




Re: r323412: Panic on boot (slab->us_keg == keg)

2017-09-11 Thread Raphael Kubo da Costa

"Andrey V. Elsukov" writes:

> On 11.09.2017 15:23, Andrey V. Elsukov wrote:
>
>> --- trap 0xc, rip = 0x80d84870, rsp = 0x82193970, rbp =
>> 0x821939b0 ---
>> zone_import() at zone_import+0x110/frame 0x821939b0
>> zone_alloc_item() at zone_alloc_item+0x36/frame 0x821939f0
>> uma_startup() at uma_startup+0x1d0/frame 0x82193ae0
>> vm_page_startup() at vm_page_startup+0x34e/frame 0x82193b30
>> vm_mem_init() at vm_mem_init+0x1a/frame 0x82193b50
>> mi_startup() at mi_startup+0x9c/frame 0x82193b70
>> btext() at btext+0x2c
>> Uptime: 1s
>
> I bisected revisions, and the last working is r322988.

[...]

> Also I determined that it can successfully boot with disabled
> hyper-threading.

Did you mistype the revision number? r322988 is "rtwn(4): some initial
preparations for (basic) VHT support" by avos@.

Re: r323412: Panic on boot (slab->us_keg == keg)

2017-09-11 Thread John Baldwin

On Monday, September 11, 2017 09:15:51 PM Andrey V. Elsukov wrote:
> On 11.09.2017 15:23, Andrey V. Elsukov wrote:
> > --- trap 0xc, rip = 0x80d84870, rsp = 0x82193970, rbp =
> > 0x821939b0 ---
> > zone_import() at zone_import+0x110/frame 0x821939b0
> > zone_alloc_item() at zone_alloc_item+0x36/frame 0x821939f0
> > uma_startup() at uma_startup+0x1d0/frame 0x82193ae0
> > vm_page_startup() at vm_page_startup+0x34e/frame 0x82193b30
> > vm_mem_init() at vm_mem_init+0x1a/frame 0x82193b50
> > mi_startup() at mi_startup+0x9c/frame 0x82193b70
> > btext() at btext+0x2c
> > Uptime: 1s
> 
> I bisected revisions, and the last working is r322988.
> This machine is E5-2660 v4@ based.

If you just revert r322988 on a newer tree does it work ok?

-- 
John Baldwin

Re: r323412: Panic on boot (slab->us_keg == keg)

2017-09-11 Thread Andrey V. Elsukov

On 11.09.2017 15:23, Andrey V. Elsukov wrote:
> --- trap 0xc, rip = 0x80d84870, rsp = 0x82193970, rbp =
> 0x821939b0 ---
> zone_import() at zone_import+0x110/frame 0x821939b0
> zone_alloc_item() at zone_alloc_item+0x36/frame 0x821939f0
> uma_startup() at uma_startup+0x1d0/frame 0x82193ae0
> vm_page_startup() at vm_page_startup+0x34e/frame 0x82193b30
> vm_mem_init() at vm_mem_init+0x1a/frame 0x82193b50
> mi_startup() at mi_startup+0x9c/frame 0x82193b70
> btext() at btext+0x2c
> Uptime: 1s

I bisected revisions, and the last working is r322988.
This machine is E5-2660 v4@ based.

CPU: Intel(R) Xeon(R) CPU E5-2660 v4@ 2.00GHz (2000.04-MHz K8-class CPU)
  Origin="GenuineIntel"  Id=0x406f1  Family=0x6  Model=0x4f  Stepping=1

Features=0xbfebfbff

Features2=0x7ffefbff
  AMD Features=0x2c100800
  AMD Features2=0x121
  Structured Extended
Features=0x21cbfbb
  XSAVE Features=0x1
  VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID,VID,PostIntr
  TSC: P-state invariant, performance statistics
real memory  = 68719476736 (65536 MB)
avail memory = 66562076672 (63478 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: 
FreeBSD/SMP: Multiprocessor System Detected: 56 CPUs
FreeBSD/SMP: 2 package(s) x 14 core(s) x 2 hardware threads


Also I determined that it can successfully boot with disabled
hyper-threading.

-- 
WBR, Andrey V. Elsukov




Re: r323412: Panic on boot (slab->us_keg == keg)

2017-09-11 Thread Andrey V. Elsukov

On 11.09.2017 11:31, Raphael Kubo da Costa wrote:
> I've recently tried to upgrade a HEAD VM (running on a Linux host with
> QEMU) from r321082 to r323412.
> 
> The new kernel panics right after I try to boot into it with:
> 
> panic: Assertion slab->us_keg == keg failed at /usr/src/sys/vm/uma_core.c:2285
> cpuid = 0
> time = 1
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0x81c4d780
> vpanic() at vpanic+0x19c/frame 0x81c4d800
> kassert_panic() at kassert_panic+0x126/frame 0x81c4d870
> keg_fetch_slab() at keg_fetch_slab+0x2a9/frame 0x81c4d8c0
> zone_fetch_slab() at zone_fetch_slab+0x51/frame 0x81c4d8f0
> zone_import() at zone_import+0x4f/frame 0x81c4d960
> zone_alloc_item() at zone_alloc_item+0x36/frame 0x81c4d9a0
> uma_zcreate() at uma_zcreate+0x3d3/frame 0x81c4da40
> uma_startup() at uma_startup+0x147/frame 0x81c4dae0
> vm_page_startup() at vm_page_startup+0x34e/frame 0x81c4db30
> vm_mem_init() at vm_mem_init+0x1a/frame 0x81c4db50
> mi_startup() at mi_startup+0x9c/frame 0x81c4db70
> btext() at btext+0x2c
> KDB: enter: panic
> [ thread 0 pid 0 tid 0 ]

I have r323177 based system without INVARIANTS that panics at
netboot with similar trace:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x84
fault code  = supervisor read data, page not present
instruction pointer = 0x20:0x80d84870
stack pointer   = 0x28:0x82193970
frame pointer   = 0x28:0x821939b0
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= resume, IOPL = 0
current process = 0 ()
trap number = 12
panic: page fault
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0x82193550
vpanic() at vpanic+0x19c/frame 0x821935d0
panic() at panic+0x43/frame 0x82193630
trap_fatal() at trap_fatal+0x34d/frame 0x82193680
trap_pfault() at trap_pfault+0x49/frame 0x821936e0
trap() at trap+0x2a9/frame 0x821938a0
calltrap() at calltrap+0x8/frame 0x821938a0
--- trap 0xc, rip = 0x80d84870, rsp = 0x82193970, rbp = 0x821939b0 ---
zone_import() at zone_import+0x110/frame 0x821939b0
zone_alloc_item() at zone_alloc_item+0x36/frame 0x821939f0
uma_startup() at uma_startup+0x1d0/frame 0x82193ae0
vm_page_startup() at vm_page_startup+0x34e/frame 0x82193b30
vm_mem_init() at vm_mem_init+0x1a/frame 0x82193b50
mi_startup() at mi_startup+0x9c/frame 0x82193b70
btext() at btext+0x2c
Uptime: 1s

-- 
WBR, Andrey V. Elsukov



