Re: Devd / devmatch(8) -- netif race 12-RC1

2018-11-20 Thread dan_partelly
No, that's not what's happening. wlan0 isn't racing anything, because 
it's no longer listed in ifconfig



But when is lagg0 created? According to the rc output on screen, the
cloned interface lagg0 is created before wlan0 is. That means
SIOCLAGPORT will fail with Invalid argument. Also, lagg0 is started at
netif time, as far as I know.
Firmware for the wireless card is loaded later, and wlan0 is created
later still. So the way I see it, lagg0 cannot have a wlan0 port until
the firmware for the card is loaded and wlan0 is created, which happens
well after the system attempts to configure lagg0. Am I missing
something?


Also, can you please tell me why devmatch tries to load uhidd multiple
times?


Dan

On 2018-11-20 06:38, Warner Losh wrote:

On Mon, Nov 19, 2018 at 7:48 PM Dan Partelly 
wrote:


Hello,

Today I tried a simple wireless failover on a machine running
FreeBSD. After reboot the system cannot complete the initialization
sequence correctly with devmatch.
The devd(8)/devmatch(8) combo correctly identified the wireless card
and loaded the required drivers and firmware. rcorder(8) reports that
devd(8) runs after netif. As far as I gather, devd(8) runs
devmatch(8) on nomatch class events. The result is that the
interfaces are created before “plug and play” initialization of the
wireless device is complete (no driver, no firmware yet), so wlan0
creation is impossible, and so on.


No, that's not what's happening. wlan0 isn't racing anything, because
it's no longer listed in ifconfig.


Moreover, I believe the runs of devmatch(8) are asynchronous in this
scenario, so even if you moved devd(8) before the netif service, this
would not solve the issue; there would still be race conditions. I know
this can be solved by loading the drivers manually, but raising a few
questions is still in order:


Network configuration happens asynchronously. devmatch gets run in
response to NOMATCH events which then causes the driver to load which
then causes the pccard_ether script to run which causes the device to
be configured. At least that's how it's supposed to work.


1) Why does the devd(8) service run after netif? I believe it should
run before the netif service, probably after the kld service. Is there
anything which prevents changing this order?


Because it doesn't matter? And because if devd is run too early, too
few services are available. Getting the ordering right was... a
somewhat tricky and frustrating experience when I first committed
devd.


2) In the scenario in which devd(8) is started before netif, what
can be done to ensure that a barrier exists such that an arbitrary
devmatch(8) run is guaranteed to finish loading required drivers
before netif? Ignore this if I'm wrong about the async nature of the
devmatch(8) run.


Nothing. No such barrier is necessary. It should all happen
asynchronously. Maybe there's a config problem?


3) In what state is devmatch now? A lot of modules seem to be
loaded OK, but some are not yet: coretemp(4), hwpmc(4), and the Intel
Series 9 SMBus driver do not seem to be.


All of USB is done, part of PCI is done, all of the really old PC Card
(since it was easy), parts of FDT for embedded and parts of ACPI are
done.

The drivers you've called out I think are PCI drivers that haven't
been updated. They should all be in GENERIC, but none are in MINIMAL
or perhaps a custom kernel.

coretemp is a CPU device, and so I'm not sure we have the right PNP
information for the CPU bus for it to even load.

Warner

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: Panic on kern_event.c

2018-11-20 Thread Sylvain GALLIANO
No issue using patched kernel on 2 servers (under stress test since +2
hours), Thanks !


On Tue, 20 Nov 2018 at 07:45, Mark Johnston wrote:

> On Mon, Nov 19, 2018 at 10:26:51AM +0100, Sylvain GALLIANO wrote:
> > With this latest patch, after stressing syslog-ng for a few minutes, it
> > does not log anymore and a simple kill does not work (I have to do kill -9)
>
> Thanks for your patience.  I finally managed to reproduce the problem
> and can see the bug now.  Please try this patch instead.
>
> diff --git a/sys/kern/kern_event.c b/sys/kern/kern_event.c
> index d9c670e29d60..0be765a040ed 100644
> --- a/sys/kern/kern_event.c
> +++ b/sys/kern/kern_event.c
> @@ -1538,6 +1538,10 @@ kqueue_register(struct kqueue *kq, struct kevent
> *kev, struct thread *td, int wa
> kn_enter_flux(kn);
>
> error = knote_attach(kn, kq);
> +   if ((kev->flags & EV_ENABLE) != 0)
> +   kn->kn_status &= ~KN_DISABLED;
> +   else if ((kev->flags & EV_DISABLE) != 0)
> +   kn->kn_status |= KN_DISABLED;
> KQ_UNLOCK(kq);
> if (error != 0) {
> tkn = kn;
> @@ -1570,6 +1574,11 @@ kqueue_register(struct kqueue *kq, struct kevent
> *kev, struct thread *td, int wa
> KNOTE_ACTIVATE(kn, 1);
> }
>
> +   if ((kev->flags & EV_ENABLE) != 0)
> +   kn->kn_status &= ~KN_DISABLED;
> +   else if ((kev->flags & EV_DISABLE) != 0)
> +   kn->kn_status |= KN_DISABLED;
> +
> /*
>  * The user may change some filter values after the initial EV_ADD,
>  * but doing so will not reset any filter which has already been
> @@ -1595,11 +1604,9 @@ kqueue_register(struct kqueue *kq, struct kevent
> *kev, struct thread *td, int wa
>  * kn_knlist.
>  */
>  done_ev_add:
> -   if ((kev->flags & EV_ENABLE) != 0)
> -   kn->kn_status &= ~KN_DISABLED;
> -   else if ((kev->flags & EV_DISABLE) != 0)
> -   kn->kn_status |= KN_DISABLED;
> -
> +   /*
> +* KN_DISABLED will be stable while the knote is in flux.
> +*/
> if ((kn->kn_status & KN_DISABLED) == 0)
> event = kn->kn_fop->f_event(kn, 0);
> else
> @@ -1861,6 +1868,8 @@ kqueue_scan(struct kqueue *kq, int maxevents, struct
> kevent_copyops *k_ops,
> }
>
> TAILQ_REMOVE(&kq->kq_head, kn, kn_tqe);
> +   KASSERT(kn == marker || (kn->kn_status & KN_QUEUED) != 0,
> +   ("knote %p not queued", kn));
> if ((kn->kn_status & KN_DISABLED) == KN_DISABLED) {
> kn->kn_status &= ~KN_QUEUED;
> kq->kq_count--;
>


Re: Devd / devmatch(8) -- netif race 12-RC1

2018-11-20 Thread Bjoern A. Zeeb

On 20 Nov 2018, at 8:17, dan_parte...@rdsor.ro wrote:

No, that's not what's happening. wlan0 isn't racing anything, 
because it's no longer listed in ifconfig



But when is lagg0 created? According to the rc output on screen, the
cloned interface lagg0 is created before wlan0 is. That means
SIOCLAGPORT will fail with Invalid argument. Also, lagg0 is started at
netif time, as far as I know.
Firmware for the wireless card is loaded later, and wlan0 is created
later still. So the way I see it, lagg0 cannot have a wlan0 port until
the firmware for the card is loaded and wlan0 is created, which happens
well after the system attempts to configure lagg0. Am I missing
something?


lagg might be a problem.


While we are on the topic: I also noticed on a fixed 10G card that,
strangely, the network startup it went through at boot wasn’t the same
as when the driver was loaded and service netif start was called again.
I have not had time to debug that any further.



Also, can you please tell me why devmatch tries to load uhidd multiple
times?


That’s probably similar to 
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=232782 ?



ACPI Error: No handler for Region [ECOR]

2018-11-20 Thread Charlie Li
Somewhere between r340491 and r340650, probably starting from r340595,
my ThinkPad W550s started spewing these messages repeatedly in the
system log since boot:

Nov 20 09:35:19 ardmore kernel: ACPI Error: No handler for Region [ECOR]
(0xf80003662300) [EmbeddedControl] (20181031/evregion-288)
Nov 20 09:35:19 ardmore kernel: ACPI Error: Region EmbeddedControl
(ID=3) has no handler (20181031/exfldio-428)
Nov 20 09:35:19 ardmore kernel: ACPI Error: Method parse/execution
failed \_SB.PCI0.LPC.EC.BAT1._BST, AE_NOT_EXIST (20181031/psparse-677)

As a result, I am now unable to query battery information at the very
least. r340490 is my last built revision with this working.

-- 
Charlie Li
Can't think of a witty .sigline today…

(This email address is for mailing list use only; replace local-part
with vishwin for off-list communication)





Re: [regression] drm-stable-kmod doesn't work in i386 jail on amd64 host

2018-11-20 Thread Jan Beich
Jan Beich  writes:

> Jan Beich  writes:
>
>> I often test Firefox on 10.4 i386 and sometimes play games via Wine.
>> Both require working OpenGL for COMPAT_FREEBSD32. My GPU is Skylake
>> which worked fine a few weeks ago i.e., before r338990.
>>
>> Any clue?
>
> I've opened https://github.com/FreeBSDDesktop/kms-drm/issues/99 but so
> far no response. Would a sample help?
>
> $ pciconf -l | fgrep 0:0:2:0
> vgapci1@pci0:0:2:0: class=0x03 card=0x79681462 chip=0x19128086 
> rev=0x06 hdr=0x00
> $ /poudriere/jails/112i386/usr/sbin/pciconf -l
> pciconf: ioctl(PCIOCGETCONF): Operation not permitted
>
> $ ./test 0 0 2 0
> vendor=0x8086, device=0x1912, subvendor=0x1462, subdevice=0x7968, revid=0x6
> $ ./test32 0 0 2 0
> test32: PCIOCGETCONF failed: Operation not permitted
> $ sudo ./test32 0 0 2 0
> test32: PCIOCGETCONF failed: Operation not permitted

For posterity, 12.0-RC2 should have the fix:
https://svnweb.freebsd.org/changeset/base/340657

If someone is on Reddit maybe they can inform the following user
https://www.reddit.com/r/freebsd/comments/9yllxo/not_getting_hardware_acceleration_with/


Re: ACPI Error: No handler for Region [ECOR]

2018-11-20 Thread Jung-uk Kim
On 18. 11. 20., Charlie Li wrote:
> Somewhere between r340491 and r340650, probably starting from r340595,
> my ThinkPad W550s started spewing these messages repeatedly in the
> system log since boot:
> 
> Nov 20 09:35:19 ardmore kernel: ACPI Error: No handler for Region [ECOR]
> (0xf80003662300) [EmbeddedControl] (20181031/evregion-288)
> Nov 20 09:35:19 ardmore kernel: ACPI Error: Region EmbeddedControl
> (ID=3) has no handler (20181031/exfldio-428)
> Nov 20 09:35:19 ardmore kernel: ACPI Error: Method parse/execution
> failed \_SB.PCI0.LPC.EC.BAT1._BST, AE_NOT_EXIST (20181031/psparse-677)
> 
> As a result, I am now unable to query battery information at the very
> least. r340490 is my last built revision with this working.

I am pretty sure r340644 caused the regression.

https://svnweb.freebsd.org/changeset/base/340644

Jung-uk Kim


Re: ACPI Error: No handler for Region [ECOR]

2018-11-20 Thread Ben Widawsky
On 18-11-20 14:09:08, Jung-uk Kim wrote:
> On 18. 11. 20., Charlie Li wrote:
> > Somewhere between r340491 and r340650, probably starting from r340595,
> > my ThinkPad W550s started spewing these messages repeatedly in the
> > system log since boot:
> > 
> > Nov 20 09:35:19 ardmore kernel: ACPI Error: No handler for Region [ECOR]
> > (0xf80003662300) [EmbeddedControl] (20181031/evregion-288)
> > Nov 20 09:35:19 ardmore kernel: ACPI Error: Region EmbeddedControl
> > (ID=3) has no handler (20181031/exfldio-428)
> > Nov 20 09:35:19 ardmore kernel: ACPI Error: Method parse/execution
> > failed \_SB.PCI0.LPC.EC.BAT1._BST, AE_NOT_EXIST (20181031/psparse-677)
> > 
> > As a result, I am now unable to query battery information at the very
> > least. r340490 is my last built revision with this working.
> 
> I am pretty sure r340644 caused the regression.
> 
> https://svnweb.freebsd.org/changeset/base/340644
> 
> Jung-uk Kim

Seems like a good bet. Could you please add the full dmesg as well as an ACPI
dump and the output of `sysctl dev.acpi_ec.` Thanks.


Re: ACPI Error: No handler for Region [ECOR]

2018-11-20 Thread Ben Widawsky
On 18-11-20 11:28:56, Ben Widawsky wrote:
> On 18-11-20 14:09:08, Jung-uk Kim wrote:
> > On 18. 11. 20., Charlie Li wrote:
> > > Somewhere between r340491 and r340650, probably starting from r340595,
> > > my ThinkPad W550s started spewing these messages repeatedly in the
> > > system log since boot:
> > > 
> > > Nov 20 09:35:19 ardmore kernel: ACPI Error: No handler for Region [ECOR]
> > > (0xf80003662300) [EmbeddedControl] (20181031/evregion-288)
> > > Nov 20 09:35:19 ardmore kernel: ACPI Error: Region EmbeddedControl
> > > (ID=3) has no handler (20181031/exfldio-428)
> > > Nov 20 09:35:19 ardmore kernel: ACPI Error: Method parse/execution
> > > failed \_SB.PCI0.LPC.EC.BAT1._BST, AE_NOT_EXIST (20181031/psparse-677)
> > > 
> > > As a result, I am now unable to query battery information at the very
> > > least. r340490 is my last built revision with this working.
> > 
> > I am pretty sure r340644 caused the regression.
> > 
> > https://svnweb.freebsd.org/changeset/base/340644
> > 
> > Jung-uk Kim
> 
> Seems like a good bet. Could you please add the full dmesg as well as an ACPI
> dump and the output of `sysctl dev.acpi_ec.` Thanks.

Just for a quick eyeball, this looks suspicious. You could also try this:
diff --git a/sys/dev/acpica/acpi_ec.c b/sys/dev/acpica/acpi_ec.c
index a21dbc963af..5d6dba6a887 100644
--- a/sys/dev/acpica/acpi_ec.c
+++ b/sys/dev/acpica/acpi_ec.c
@@ -422,6 +422,7 @@ acpi_ec_probe(device_t dev)
 /* Store the values we got from the namespace for attach. */
 acpi_set_private(dev, params);

+#if 0
 /*
  * Check for a duplicate probe. This can happen when a probe via ECDT
  * succeeded already. If this is a duplicate, disable this device.
@@ -431,6 +432,7 @@ acpi_ec_probe(device_t dev)
ret = 0;
 else
device_disable(dev);
+#endif

 if (buf.Pointer)
AcpiOsFree(buf.Pointer);

-- 
Ben Widawsky, Intel Open Source Technology Center