Re: Booting anything after r352057 kills console - additional info

2019-09-26 Thread Thomas Laus
On 2019-09-25 17:51, Thomas Laus wrote:
> I was able to mount my zfs filesystem r+w and added the rc_debug="YES"
> to my rc.conf.  The additional debug messages were written to the beadm
> boot environment /var/log/messages.  Since the computer locked up, I was
> unable to read this log and email the results.  That log is inaccessible
> to a running beadm snapshot, so I gave up on this quest.  I blew away my
> source and object files and did a fresh checkout of HEAD today.
> Everything built fine and I was able to boot r352710 today.
> 
> Thank you and the rest of this list for the help along the way.  I guess
> that this is just one of those computer mysteries that won't get solved
> at this time.
>
There is still something happening with syscons after r352057 and before
r352064.  I updated 2 more computers today from r352057 to r352710 to
have everything match.  I updated the package for
drm-current-kmod-4.16.g20190918 but commented it out of rc.conf because
of potential problems with drm.  I wanted to boot into a working system
and then activate drm later.  I built and installed my kernel and on
first boot into the new kernel, the console changed to a very dim
display about halfway through the boot process and then disappeared
completely. Switching to another console gave me a green on black
display and a login screen.  I installed world and rebooted.  Everything
worked as expected and I was able to kldload drm without any issues.  It
looks like the UPDATING file should have a 'heads up' note about this
issue, because the standing instruction to always boot the new kernel
before installing world may not always hold.  Without some
random key pokes, I would have never seen that something put syscons in
a non-default state until world could be installed.  I was thinking that
my original black console screen was back again.

This happened on 2 computers today.  I wonder if my first problem was
related?  Both computers have Intel CPUs: one is a Core2-Duo and the
other is an Atom D510.

Tom
 --
Public Keys:
PGP KeyID = 0x5F22FDC1
GnuPG KeyID = 0x620836CF
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Booting anything after r352057 kills console

2019-09-25 Thread Thomas Laus
On 2019-09-24 15:09, Ian Lepore wrote:
> 
> On my system, a whole lotta stuff happens between ntpd and syscons (the
> thing that configures blanktime).  Try setting rc_debug=YES in rc.conf,
> that should write more info to syslog about what's happening between
> ntpd and the lockup point.
>
Ian:

I was able to mount my zfs filesystem r+w and added the rc_debug="YES"
to my rc.conf.  The additional debug messages were written to the beadm
boot environment /var/log/messages.  Since the computer locked up, I was
unable to read this log and email the results.  That log is inaccessible
to a running beadm snapshot, so I gave up on this quest.  I blew away my
source and object files and did a fresh checkout of HEAD today.
Everything built fine and I was able to boot r352710 today.
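
If the failing boot environment can be mounted from the working one (for
example with `bectl mount`, or the equivalent `beadm mount`), its log may
be readable without booting it.  A minimal sketch, assuming /var/log is
part of the boot environment rather than a separate dataset; the helper
name `log_since_last` is made up for this illustration:

```sh
#!/bin/sh
# Print a log file from the last line matching a pattern to the end.
# Useful for seeing what was logged after the final "ntpd" entry.
log_since_last() {
    pattern="$1"
    file="$2"
    awk -v pat="$pattern" '
        { line[NR] = $0 }          # remember every line
        $0 ~ pat { last = NR }     # track the most recent match
        END {
            if (last)
                for (i = last; i <= NR; i++) print line[i]
        }
    ' "$file"
}
```

With the bad environment mounted on /mnt (a placeholder mount point), this
would be called as `log_since_last ntpd /mnt/var/log/messages`.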

Thank you and the rest of this list for the help along the way.  I guess
that this is just one of those computer mysteries that won't get solved
at this time.

Tom


-- 
Public Keys:
PGP KeyID = 0x5F22FDC1
GnuPG KeyID = 0x620836CF


Re: Booting anything after r352057 kills console

2019-09-24 Thread Thomas Laus
> Ian Lepore [i...@freebsd.org] wrote:
> > 
> > On my system, a whole lotta stuff happens between ntpd and syscons (the
> > thing that configures blanktime).  Try setting rc_debug=YES in rc.conf,
> > that should write more info to syslog about what's happening between
> > ntpd and the lockup point.
> >
> The results were not very informative.  The 'bad' release did not show any
> additional log entries with the rc_debug turned on.  The working release
> was very chatty.  Maybe this will get closer to finding the root cause?
>
I just realized that the rc.conf file is part of the BEADM boot environment
and when I switch releases, rc.conf comes with it.  I will need to
grab my ZFS admin book and find out how to mount a zfs filesystem from
the loader prompt and remove the 'read only' attribute so that I can edit
the rc.conf that will be used to load the 'bad' release.  Maybe then I
will have something useful to report.
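
One way to avoid the loader prompt altogether may be to edit the other
environment's rc.conf from the working one.  A sketch, assuming the 'bad'
environment is named `r352304` (a placeholder) and that /mnt is free; `run`
and `enable_rc_debug` are helper names invented here, and by default the
script only prints the commands it would run:

```sh
#!/bin/sh
# Dry-run by default: set DRY_RUN=0 to actually execute the commands.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Mount a boot environment, enable rc_debug in its rc.conf, unmount it.
enable_rc_debug() {
    be="$1"
    mnt="${2:-/mnt}"
    run bectl mount "$be" "$mnt" &&
    run sysrc -f "$mnt/etc/rc.conf" rc_debug=YES &&
    run bectl umount "$be"
}
```

Calling `enable_rc_debug r352304` prints the three commands; beadm has
equivalent `mount` and `umount` subcommands if bectl is not in use.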

Tom

-- 
Public Keys:
PGP KeyID = 0x5F22FDC1
GnuPG KeyID = 0x620836CF


Re: Booting anything after r352057 kills console

2019-09-24 Thread Thomas Laus
Ian Lepore [i...@freebsd.org] wrote:
> 
> On my system, a whole lotta stuff happens between ntpd and syscons (the
> thing that configures blanktime).  Try setting rc_debug=YES in rc.conf,
> that should write more info to syslog about what's happening between
> ntpd and the lockup point.
>
The results were not very informative.  The 'bad' release did not show any
additional log entries with the rc_debug turned on.  The working release
was very chatty.  Maybe this will get closer to finding the root cause?

Tom

-- 
Public Keys:
PGP KeyID = 0x5F22FDC1
GnuPG KeyID = 0x620836CF


Re: Booting anything after r352057 kills console

2019-09-24 Thread Ian Lepore
On Tue, 2019-09-24 at 14:32 -0400, Thomas Laus wrote:
> On 2019-09-24 11:58, Pete Wright wrote:
> > 
> > darn, and they didn't give you any additional information in the
> > messages buffer, or generate a core file?
> > 
> 
> There were no messages in the syslog and no other log that showed
> anything after killing the console.  On my working BEADM boot
> environment, the next message after starting ntp is:
> 
> configuring vt: blanktime
> 
> This message never showed in the syslog on any of the 'black screens'.
> I looked over the changes made between r352057 and r352064 and nothing
> 'vt' related popped up.  I have not built anything between these 2
> releases and probably should in order to help narrow the problem.  I
> originally updated to r352304 and killed my system and was reverting
> back toward r352057 by splitting the difference by half.  The problem
> was still there in r352064, so I stopped and asked for help.
> 
> This problem did not generate a core and building GENERIC with the DEBUG
> symbols did not add any more information.
> 
> Tom

On my system, a whole lotta stuff happens between ntpd and syscons (the
thing that configures blanktime).  Try setting rc_debug=YES in rc.conf,
that should write more info to syslog about what's happening between
ntpd and the lockup point.
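
To see roughly which rc scripts sit between ntpd and syscons on a given
machine, the output of rcorder(8) can be filtered.  This is only a sketch:
the helper name `between` is invented here, and the real /etc/rc also
applies skip keywords, so the list is approximate:

```sh
#!/bin/sh
# Print the input lines that fall strictly between the first line matching
# regex $1 and the next line matching regex $2.
between() {
    awk -v first="$1" -v last="$2" '
        seen && $0 ~ last { exit }      # stop at the closing script
        seen              { print }     # inside the window
        $0 ~ first        { seen = 1 }  # window opens after this line
    '
}
```

Usage:
`rcorder /etc/rc.d/* /usr/local/etc/rc.d/* 2>/dev/null | between '/ntpd$' '/syscons$'`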

-- Ian




Re: Booting anything after r352057 kills console

2019-09-24 Thread Thomas Laus
On 2019-09-24 11:58, Pete Wright wrote:
> 
> darn, and they didn't give you any additional information in the
> messages buffer, or generate a core file?
>
There were no messages in the syslog and no other log that showed
anything after killing the console.  On my working BEADM boot
environment, the next message after starting ntp is:

configuring vt: blanktime

This message never showed in the syslog on any of the 'black screens'.
I looked over the changes made between r352057 and r352064 and nothing
'vt' related popped up.  I have not built anything between these 2
releases and probably should in order to help narrow the problem.  I
originally updated to r352304 and killed my system and was reverting
back toward r352057 by splitting the difference by half.  The problem
was still there in r352064, so I stopped and asked for help.
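
"Splitting the difference by half" is a plain bisection over revision
numbers.  A small sketch of the arithmetic, with `bisect_mid` as an
invented helper name:

```sh
#!/bin/sh
# Integer midpoint between a known-good and a known-bad revision.
bisect_mid() {
    good="$1"
    bad="$2"
    echo $(( (good + bad) / 2 ))
}

bisect_mid 352057 352304   # prints 352180
```

The next step would be something like `svn update -r 352180 /usr/src`,
followed by a rebuild and a test boot; the tested revision then replaces
either the good or the bad endpoint, depending on the result.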

This problem did not generate a core and building GENERIC with the DEBUG
symbols did not add any more information.

Tom



-- 
Public Keys:
PGP KeyID = 0x5F22FDC1
GnuPG KeyID = 0x620836CF


Re: Booting anything after r352057 kills console

2019-09-23 Thread Thomas Laus
Pete Wright [p...@nomadlogic.org] wrote:
>
> I remember having similar issues a while ago when we were first hacking on
> drm, one thing to try is updating /boot/loader.conf with the following:
> debug.debugger_on_panic=0
> dev.drm.skip_ddb="1"
> dev.drm.drm_debug_persist="1"
> 
> these are semi-documented in the wiki here:
> https://wiki.freebsd.org/Graphics#Issues_.2F_Bugs
> 
> while they may not solve the issue, they will hopefully give us better info
> as to why the system is hanging.  Also, are you able to boot the previously
> working kernel (iirc you can do this via the boot loader menu) successfully? 
> and lastly, can you boot single user then manually attempt to load the kernel
> module via kldload i915kms.ko?
>
I am not 100 percent sure that this is a DRM problem.  I have de-installed
everything related to DRM and commented out the rc.conf statement that
loads the DRM modules and still can't get past the last few steps of the
startup.  I did see that ntpd does read its configuration file because
my /var/log/messages has an entry for reading the leap seconds file.  That
is the last entry in the /var/log/messages file.

I am able to successfully use the beadm choice 7 in the boot chooser to load
a previously good boot environment working kernel.

I'll try your other suggestions tomorrow morning and post the result to
this group.

Tom


-- 
Public Keys:
PGP KeyID = 0x5F22FDC1
GnuPG KeyID = 0x620836CF


Re: Booting anything after r352057 kills console

2019-09-23 Thread Pete Wright



On 9/23/19 2:32 PM, Thomas Laus wrote:
> Poul-Henning Kamp [p...@phk.freebsd.dk] wrote:
> >
> > In message <11db909b-57ee-b452-6a17-90ec2765c...@acm.org>, Thomas Laus writes:
> >
> > >Where do I go from here?  The computer is an Intel i5 Skylake with
> > >onboard graphics.
> >
> > Based on personal experience:
> >
> > 1. Deinstall drm ports
> >
> > 2. Remove all remaining drm related files under /boot
> >
> > 3. Reinstall drm port
> >
> That did not work.
>
> On a successful boot after using beadm to rollback to r352057, I see the
> following items startup after setting the ntpd security policy:
>
> starting ntpd
> configuring vt: blanktime
> sanity check of sshd configuration
> start sshd
> start sendmail & sendmail submit as well as cron
> start background checks
> login
>
> On all svn updates after r352057, the last item logged is the ntpd security
> policy and then the console goes black.  The computer is dead and I can't
> login through ssh nor change to another console.  I have to hit the reset
> switch to reboot.  Even ctrl-alt-delete is not functioning.

I remember having similar issues a while ago when we were first hacking 
on drm, one thing to try is updating /boot/loader.conf with the following:

debug.debugger_on_panic=0
dev.drm.skip_ddb="1"
dev.drm.drm_debug_persist="1"

these are semi-documented in the wiki here: 
https://wiki.freebsd.org/Graphics#Issues_.2F_Bugs


while they may not solve the issue, they will hopefully give us better 
info as to why the system is hanging.  Also, are you able to boot the 
previously working kernel (iirc you can do this via the boot loader 
menu) successfully?  and lastly, can you boot single user then manually 
attempt to load the kernel module via kldload i915kms.ko?


cheers,
-pete

--
Pete Wright
p...@nomadlogic.org
@nomadlogicLA



Re: Booting anything after r352057 kills console

2019-09-23 Thread Thomas Laus
Poul-Henning Kamp [p...@phk.freebsd.dk] wrote:
> 
> In message <11db909b-57ee-b452-6a17-90ec2765c...@acm.org>, Thomas Laus writes:
> 
> >Where do I go from here?  The computer is an Intel i5 Skylake with
> >onboard graphics.
> 
> Based on personal experience:
> 
> 1. Deinstall drm ports
> 
> 2. Remove all remaining drm related files under /boot
> 
> 3. Reinstall drm port
>
That did not work.

On a successful boot after using beadm to rollback to r352057, I see the
following items startup after setting the ntpd security policy:

starting ntpd
configuring vt: blanktime
sanity check of sshd configuration
start sshd
start sendmail & sendmail submit as well as cron
start background checks
login

On all svn updates after r352057, the last item logged is the ntpd security
policy and then the console goes black.  The computer is dead and I can't
login through ssh nor change to another console.  I have to hit the reset
switch to reboot.  Even ctrl-alt-delete is not functioning.

Tom

-- 
Public Keys:
PGP KeyID = 0x5F22FDC1
GnuPG KeyID = 0x620836CF


Re: Booting anything after r352057 kills console

2019-09-23 Thread Andrey Fesenko
On Mon, Sep 23, 2019 at 11:51 PM Poul-Henning Kamp  wrote:
>
> 
> In message 
> 
> , Warner Losh writes:
>
> >We are working on making drm ports less problematic on upgrade...
>
> Yes, I know.
>
> But when you track current, it seems that it takes a port-reinstall
> to get on that wagon...
>

beadm/bectl rules

1) make world/kernel
2) install in new BE
3) make new pkg drm
4) install it in new BE too
5) activate
6) reboot
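
The six steps above could look roughly like the following with bectl.
This is a sketch under assumptions, not a tested procedure: the BE name,
the package file path and the mount point are placeholders, `run` and
`be_upgrade` are invented helper names, merging of configuration files is
left out, and by default the script only prints what it would do:

```sh
#!/bin/sh
# Dry-run by default: set DRY_RUN=0 to actually execute the commands.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = new BE name, $2 = drm package file built against the new sources,
# $3 = temporary mount point (all placeholders).
be_upgrade() {
    be="$1"
    drm_pkg="$2"
    mnt="${3:-/mnt}"
    # 1) make world/kernel
    run make -C /usr/src buildworld buildkernel &&
    # 3) make new pkg drm (port origin assumed to be graphics/drm-current-kmod)
    run make -C /usr/ports/graphics/drm-current-kmod package &&
    # 2) install in new BE
    run bectl create "$be" &&
    run bectl mount "$be" "$mnt" &&
    run make -C /usr/src installkernel DESTDIR="$mnt" &&
    run make -C /usr/src installworld DESTDIR="$mnt" &&
    # 4) install the drm package in the new BE too
    run pkg -r "$mnt" add "$drm_pkg" &&
    run bectl umount "$be" &&
    # 5) activate, 6) reboot
    run bectl activate "$be" &&
    run shutdown -r now
}
```

For example, `be_upgrade r352710 /path/to/drm-current-kmod.pkg` prints the
full command sequence for review before anything is executed.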


Re: Booting anything after r352057 kills console

2019-09-23 Thread Poul-Henning Kamp

In message 

, Warner Losh writes:

>We are working on making drm ports less problematic on upgrade...

Yes, I know.

But when you track current, it seems that it takes a port-reinstall
to get on that wagon...

-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Re: Booting anything after r352057 kills console

2019-09-23 Thread Warner Losh
On Mon, Sep 23, 2019, 10:02 PM Poul-Henning Kamp  wrote:

> 
> In message <11db909b-57ee-b452-6a17-90ec2765c...@acm.org>, Thomas Laus
> writes:
>
> >Where do I go from here?  The computer is an Intel i5 Skylake with
> >onboard graphics.
>
> Based on personal experience:
>
> 1. Deinstall drm ports
>
> 2. Remove all remaining drm related files under /boot
>
> 3. Reinstall drm port
>

We are working on making drm ports less problematic on upgrade...

Warner

>
>
> --
> Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
> p...@freebsd.org | TCP/IP since RFC 956
> FreeBSD committer   | BSD since 4.3-tahoe
> Never attribute to malice what can adequately be explained by incompetence.


Re: Booting anything after r352057 kills console

2019-09-23 Thread Poul-Henning Kamp

In message <11db909b-57ee-b452-6a17-90ec2765c...@acm.org>, Thomas Laus writes:

>Where do I go from here?  The computer is an Intel i5 Skylake with
>onboard graphics.

Based on personal experience:

1. Deinstall drm ports

2. Remove all remaining drm related files under /boot

3. Reinstall drm port


-- 
Poul-Henning Kamp   | UNIX since Zilog Zeus 3.20
p...@freebsd.org | TCP/IP since RFC 956
FreeBSD committer   | BSD since 4.3-tahoe
Never attribute to malice what can adequately be explained by incompetence.


Booting anything after r352057 kills console

2019-09-23 Thread Thomas Laus
I updated my source today and when the computer was booted, the screen
turned black at the point that the drm related kernel modules would
normally load.  Suspecting a drm issue, I commented out the rc.conf line
that loads those kernel modules.  This did not fix my problem.  My last
good kernel was r352057.  I started to bi-sect the svn updates between
today and r352057.  Going backward, all the way to r352064 is not
working.  There were very few changes between r352057 and r352064.  None
of them seem to be console graphics related.  The last entry in my boot
log shows a successful entry of the security policy for ntp.  No logged
messages after this point.

Where do I go from here?  The computer is an Intel i5 Skylake with
onboard graphics.

Tom


-- 
Public Keys:
PGP KeyID = 0x5F22FDC1
GnuPG KeyID = 0x620836CF