I2C iic_exec() and clock stretch
Sorry to intrude...

So... I have an I2C device I am writing a driver for that has a read
cycle that needs one of the following:

1) Start a read with a STOP, except that before the data is sent down
the bus, the device clock stretches for a bit while it is working on
what it is supposed to be working on. When it is done, the data will be
sent.

2) Start the read with no STOP cycle, but the device returns NACK part
way through until it is done doing what it is supposed to be doing.
When it stops sending a NACK, or after you wait a bit, it seems that
you can do a read with a STOP and get the data.

Neither of these two situations appears to be supported by iic_exec()
as it exists today. What I was wondering is if anyone has worked on
anything to enhance iic_exec() to deal with this sort of thing.

The device in question is the SI7021 humidity and temperature sensor.
It is a fairly inexpensive device that works pretty well and has tons
of example code out there for all kinds of OSs and devices. The data
sheet is available at, among other places:

https://cdn-learn.adafruit.com/assets/assets/000/035/931/original/Support_Documents_TechnicalDocs_Si7021-A20.pdf

I am using a Raspberry Pi 3 as the computer.

To do read type #1, I hacked up bcm2835_bsc.c to provide a simple
device property that specifies the amount of clock stretch desired from
the BCM2835 iic, and added a new flag in i2cvar.h that can be passed to
iic_exec() to tell the controller driver to use the stretch, but
getting that property set from attached children proved to be ugly. I
am thinking that something like an iic_execv() should maybe exist that
works, mechanically, a little like sysctl_createv(), so that a si70xx
driver can do an I2C transaction but tell the controller driver to do
this transaction with additional arguments that are more complicated
than a simple flag.

To do read type #2 with the SI7021, iic_exec() needs to be able to do a
0 length read, I think, and this is disallowed in bcm2835_bsc.c by a
KASSERT(dlen >= 1). I didn't really look at any other drivers to see if
this sort of thing is allowed. This is basically a cycle that looks
like:

START+ADDRESS+COMMAND+START+ADDRESS
NACK will be returned until the measurement is done
START+ADDRESS
You get up to three bytes back.

It was simpler to hack in support for clock stretching than to figure
out if it was possible to do 0 length reads.

In any case, I was curious if anyone had any thoughts or was working on
something like this??

--
Brad Spencer - b...@anduin.eldar.org - KC8VKS
http://anduin.eldar.org - & - http://anduin.ipv6.eldar.org [IPv6 only]
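A minimal sketch of read type #2 from the message above, written as plain C against a simulated device rather than the real iic_exec() interface. Every name here (read_fn, fake_sensor, fake_read, read_when_ready) is hypothetical and exists only for illustration; the point is just the "retry the addressed read until the device stops NACKing, with an upper bound" loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bus callback: returns 0 on ACK (data copied), -1 on NACK. */
typedef int (*read_fn)(void *ctx, uint8_t *buf, size_t len);

/* Simulated sensor: NACKs a fixed number of reads while "measuring". */
struct fake_sensor {
	int busy_reads;		/* reads that will still be NACKed */
	uint8_t result[3];	/* up to three bytes come back */
};

static int
fake_read(void *ctx, uint8_t *buf, size_t len)
{
	struct fake_sensor *s = ctx;

	if (s->busy_reads > 0) {
		s->busy_reads--;
		return -1;	/* address NACKed, still measuring */
	}
	if (len > sizeof(s->result))
		len = sizeof(s->result);
	memcpy(buf, s->result, len);
	return 0;
}

/*
 * Retry the read until the device ACKs or max_tries is used up.
 * Returns the number of attempts made, or -1 if it never ACKed.
 * A real driver would sleep between attempts instead of spinning.
 */
static int
read_when_ready(read_fn rd, void *ctx, uint8_t *buf, size_t len,
    int max_tries)
{
	assert(rd != NULL && max_tries > 0);
	for (int i = 1; i <= max_tries; i++) {
		if (rd(ctx, buf, len) == 0)
			return i;
	}
	return -1;
}
```

The bound on attempts matters: a device that has fallen off the bus also NACKs, so an unbounded loop could not tell "still measuring" from "not there".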
Re: i2c and indirect vs. direct config
Mouse writes:

>> Your points about explicit config make a lot of sense; reminds me of
>> qbus and isa bus where you have to know.
>
> Well, except that on Qbus and ISA, most devices can be probed
> relatively harmlessly. That is, asking a driver "there might be a
> device you handle at this address, check it out?" is a reasonable thing
> to do. There certainly are exceptions, a few cases of devices X and Y
> such that having device Y's driver probe for a Y at A when there's
> really an X at A will do something unfortunate to the X. But they are
> few and rare.
>
> As I read thorpej's mail, that's far less true of i2c.
>
> /~\ The ASCII Mouse
> \ / Ribbon Campaign
>  X  Against HTML mo...@rodents-montreal.org
> / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B

Can "far less true" be quantified somehow?? I seem to remember one
known device that didn't like to be probed, something on a Lenovo
laptop I think, but I don't remember if it was a read cycle or a write
cycle that made the device upset. Or something else..

Do the techniques mentioned in these offer any hope of determining the
presence of an actual device at a particular address on the bus in a
harmless manner:

http://forum.arduino.cc/index.php?topic=61520.0
https://electronics.stackexchange.com/questions/76617/determining-i2c-address-without-datasheet

The second, in particular, simply does a start+address+stop.. if the
response to the address was ACK there was a device; if it was a NAK
there wasn't anything there. I don't know how well this works in
practice. But it would seem to be something that would not be able to
upset a device.

--
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
Re: i2c and indirect vs. direct config
Jason Thorpe writes:

>> On May 30, 2018, at 11:54 AM, Brad Spencer wrote:
>
>> Does the techniques mentioned in these offer any hope of determining the
>> presence of an actual device at a particular address on the bus in a
>> harmless manor:
>>
>> http://forum.arduino.cc/index.php?topic=61520.0
>> https://electronics.stackexchange.com/questions/76617/determining-i2c-address-without-datasheet
>>
>> The second, in particular, simply does a start+address+stop.. if the
>> response to the address was ACK there was a device, otherwise, there if
>> it was a NAK there wasn't anything there. I don't know how well this
>> works in in practice. But it would seem to be something that would not
>> be able to upset a device.
>
> That’s a great idea, actually. It would certainly help the ghosting problem.
> Perhaps we can centralize the logic into a helper function that does the
> START-address-STOP handshake and also filters the address against a list of
> valid addresses passed by the driver (in general, i2c devices can’t be at
> arbitrary addresses).
>
> -- thorpej

A zero length write would probably also work and should be just as
safe, although I am not sure that every i2c controller supports that
sort of thing. The RPI had a KASSERT() that wasn't needed and that
would have panicked upon trying such a thing. It was removed when the
si70xx driver was added to the tree because that driver needed to be
able to do a zero length write, then wait before doing a read. If
clock stretching support existed in the i2c infrastructure, this would
not have been required.

I submitted the PR that became the am2315 and si70xx drivers that are
in the tree. I didn't like the ghosting problem when I was doing the
development, so those drivers have some additional stuff added to
their match function to do something like the start+address+stop
thing, just not in any sort of general way.

--
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
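The helper suggested in the quoted text (filter the address against the driver's list of valid addresses, then do the START-address-STOP handshake) could look roughly like the sketch below. This is only an illustration in plain C with a simulated bus: probe_fn, addr_match, fake_bus and fake_probe are all made-up names, not anything in the NetBSD i2c code, and the probe callback stands in for whatever the controller really does on the wire.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical quick-probe callback: true if the address was ACKed. */
typedef bool (*probe_fn)(void *bus, uint8_t addr);

/*
 * Match helper sketch: reject addresses the device can never use,
 * and only then touch the bus with the START-address-STOP probe.
 */
static bool
addr_match(probe_fn probe, void *bus, uint8_t addr,
    const uint8_t *valid, size_t nvalid)
{
	assert(probe != NULL);
	for (size_t i = 0; i < nvalid; i++) {
		if (valid[i] == addr)
			return probe(bus, addr);
	}
	return false;	/* impossible address; bus left untouched */
}

/* Simulated bus for illustration: one device answering at one address. */
struct fake_bus {
	uint8_t present;	/* the only address that ACKs */
	int probes;		/* number of bus cycles issued */
};

static bool
fake_probe(void *bus, uint8_t addr)
{
	struct fake_bus *b = bus;

	b->probes++;
	return addr == b->present;
}
```

With a one-entry list such as { 0x40 } (the address the Si7021 data sheet gives, used here only as an example), a match attempt at any other address returns false without generating any bus traffic, which is what would cut down on ghost attachments.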
Re: i2c and indirect vs. direct config
Jason Thorpe writes:

[snip]

> NOW. i2c_scan defaults to doing a “quick write”, HOWEVER, that code has the
> comment about the “quick write” method corrupting the EEPROM on Thinkpads.
> But it also has a comment about the “receive byte” method possibly causing
> problems with write-only devices?
>
> So, I’m a little perplexed about what to do… perhaps the “receive byte”
> method is the best for now?
>

[snip]

Unfortunate behavior. Looking back over the sensor driver I worked on,
it appears that I always read something to determine if the device was
actually there.

I note that gpioow, a onewire bus, may have ghost issues similar to
i2c:

[7.496848] gpioow1 at gpio0 pins 25: can't map pins
[7.516852] gpioow2 at gpio0 pins 25: can't map pins
[7.536858] gpioow3 at gpio0 pins 25: can't map pins

There is only one device present, and it is wired down in the config
file to a particular gpio parent and pin, but it manages to ghost 3
times anyway.

[Unrelated topic: gpioow is a module, onewire is a module, but owtemp
isn't, so you really end up not being able to modload them into place
with the "generic" RPI kernel, hence a config file]

I wonder if the i2c bus attachments should have the option of being
treated like gpio attachments with a new command... probably a lot of
work:

iicctl iic2 attach dsrtc 0x68 3231

> -- thorpej

--
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
Re: i2c and indirect vs. direct config
Jason Thorpe writes:

>> On May 31, 2018, at 9:34 PM, Jason Thorpe wrote:
>>
>> I spent some time reviewing NXP’s i2c spec this evening (well, during
>> timeouts, etc. — GO DUBS), and I’m becoming convinced that there is a subtle
>> error in our i2c_bitbang code…
>
> *smacks forehead* … RPI, of course, has a “smart” i2c controller. There may
> be a bug in that driver. What I should do is try gpioiic to see if the
> “quick read” method works there.
>
> Same caveat as below applies, however.
>
>> (Not sure I'll get to it this weekend, though… despite my kid’s spring
>> soccer season being finished, I seem to have a full calendar nonetheless…)
>
> -- thorpej

gpioiic may have bugs too.. I could not get it to work with the am2315.
Reads would not error, but no data was returned. Then again, the am2315
is a strange thing, and the si70xxtemp(4) driver did work with gpioiic.

--
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
Re: Missing compat_43 stuff for netbsd32?
Eduardo Horvath writes:

> On Tue, 11 Sep 2018, Paul Goyette wrote:
>
>> While working on the compat code, I noticed that there are a few old
>> syscalls which are defined in syc/compat/netbsd323/syscalls.master
>> with a type of COMPAT_43, yet there does not exist any compat_netbsd32
>> implementation as far as I can see...
>>
>> #64   ogetpagesize
>> #84   owait
>> #89   ogetdtablesize
>> #108  osigvec
>> #142  ogethostid (interestingly, there _is_ an implementation
>>       for osethostid!)
>> #149  oquota
>>
>> Does any of this really matter? Should we attempt to implement them?
>
> I believe COMPAT_43 is not NetBSD 4.3 it's BSD 4.3. Anybody have any old
> BSD 4.3 80386 binaries they still run? Did BSD 4.3 run on an 80386? Did
> the 80386 even exist when Berkeley published BSD 4.3?
>
> It's probably only useful for running ancient SunOS 4.x binaries, maybe
> Ultrix, Irix or OSF-1 depending on how closely they followed BSD 4.3.
>
> Eduardo

It has been a very long time since I did this, and I may not remember
correctly, but I believe that COMPAT_43 is needed on NetBSD/i386 to run
BSDI binaries. I remember using the BSDI Netscape 3.x binary back in
the day and I think it was required.

--
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
Re: Things not referenced in kernel configs, but mentioned in files.*
co...@sdf.org writes:

> This is an automatically generated list with some hand touchups, feel
> free to do whatever with it. I only generated the output.
>

Random ramblings below...

[snip]

> am2315temp

The am2315 will, in theory, work on any reasonable i2c bus, but has
only been tested on a RPIish sort of thing.

[snip]

> gpioirq
> gpiopps

The gpioirq and gpiopps drivers should function with any system with
gpio(4) support. But again, only tested on the RPI.

[snip]

> si70xxtemp

The si70xx will also, in theory, work on any fairly well behaved i2c
bus. In particular, you need to be able to do a zero length read
cycle... or we need to implement the ability to perform clock
stretching on the i2c bus. Only tested and messed with on a RPI.

Maybe support will appear some day for one of the USB to i2c chips that
exist in the world... [hint: there is one on Adafruit that is FTDI
based and fully documented], and it will be more obvious how the above
would be useful on more than the RPI.

I don't know where that leaves these with respect to an ALL sort of
config file. Nothing should really be harmed by including any of them
in such a file.

--
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
Re: /dev/random is hot garbage
Kamil Rytarowski writes:

> On 22.07.2019 13:12, Greg Troxel wrote:
>> Taylor R Campbell writes:
>>
>>>> It would also be reasonable to have a sysctl to allow /dev/random to
>>>> return bytes anyway, like urandom would, and to turn this on for our xen
>>>> builders, as a different workaround. That's easy, and it doesn't break
>>>> the way things are supposed to be for people that don't ask for it.
>>>
>>> What's the advantage of this over using replacing /dev/random by a
>>> symlink to /dev/urandom in the build system?
>>>
>>> A symlink can be restricted to a chroot, while a sysctl knob would
>>> affect the host outside the chroot. The two would presumably require
>>> essentially the same privileges to enact.
>>
>> None, now that I think of it.
>>
>> So let's change that on the xen build host.
>>
>> And, the other issue is that systems need randomness, and we need a way
>> to inject some into xen guests. Enabling some with rndctl works, or at
>> least used to, even if it is theoretically dangerous. But we aren't
>> trying to defend against the dom0.
>>
>
> It looks like we need a paravirt random driver for xen that could solve
> the rust / random(6) problem.
>
> There is already viornd(4) for virtio(4).

Getting randomness into Xen guests is a vexing problem. Lots of things
seem to consume bits from the pool until you are often left with none.
For me it was Kerberos authentication against a PostgreSQL DB, but ssh
seems to use them and it appears that some are consumed when you use
ntpd keys with peers.

I built this -> https://anduin.eldar.org/true-rng/ and feed randomness
into the Xen guests I have and other systems that I suspect do not
produce randomness on their own very well. It is not at all a perfect
answer, but appears to work well enough for what I need. For Xen
guests, a paravirt driver would seem to be a better answer.

--
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
Re: Cisco USB serial console compatiblity?
Brian Buhrow writes:

> hello. Does anyone know what USB serial chip the Cisco USB serial
> console is closest to
> in our USB serial drivers list? the vendor code for the device I'm talking
> about is: 0x05a6
> and the product code is: 0x0009. It doesn't look like we have a driver that
> recognizes these
> codes, but I'm thinking it should be easy enough to add them to an existing
> driver. The
> question is, which driver most closely supports these devices? My simple
> Google search didn't
> turn up much accept that the linux cdc_acm driver seems to support it. I
> think this driver is
> for serial class USB devices. Do we have a generic serial clas driver, like
> we do for class
> ethernet USB devices?
> Any thoughts on this would be greatly appreciated.
> -thanks
> -Brian

This may not answer your question, but I have several 2960 Cisco
switches and the USB port on them works just fine with umodem. On the
2960 they identify as:

[ 609123.259177] umodem0 at uhub1 port 5 configuration 1 interface 0
[ 609123.259177] umodem0: Cisco (0x5a6) Cisco USB Console (0x09), rev 2.00/0.00, addr 6, iclass 2/2
[ 609123.259177] umodem0: data interface 1, has no CM over data, has no break
[ 609123.259177] umodem0: status change notification available
[ 609123.259177] ucom1 at umodem0
[ 609163.321067] ucom1: detached
[ 609163.321067] umodem0: detached
[ 609163.321067] umodem0: at uhub1 port 5 (addr 6) disconnected

The vendor and product IDs match the ones you quoted, so that may well
be what you have.

--
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
Re: Memory corruption after fork, only on AMD CPUs
Michael Pratt writes:

> On Tue, Dec 14, 2021 at 1:06 PM Michael Pratt wrote:
>>
>> [This is a reply to
>> https://mail-index.netbsd.org/tech-kern/2021/12/01/msg027830.html. I
>> just joined the mailing list and can't seem to find the metadata
>> required for a proper reply. Apologies.]
>>
>> I filed https://gnats.netbsd.org/56535 for this a while ago, which has
>> an even simpler reproducer: a direct fork() call with a child that
>> immediately exits sometimes causes memory corruption in the parent
>> process.
>>
>> We've kept looking since filing https://gnats.netbsd.org/56535 but
>> haven't had luck on further simplification. No C reproducer yet,
>> unfortunately. (No crashes if the Go parent process is single-threaded
>> either.)
>
> I spoke too soon here, we managed to get a reproducer in C today,
> which I've posted at
> https://github.com/golang/go/issues/34988#issuecomment-994115345.

I don't have a big collection of AMD systems, but I do have a couple.
Everything here is Xen, however, and nothing is really very recent
either from the hardware POV or the OS in a lot of cases...

Ryzen 3 2200G - 2 vcpu DOMU running 9.0_STABLE and a 1 processor DOM0
running 8.99.25 could not reproduce this running the code from the DOMU
or DOM0.

Athlon 64 X2 5600+ - 1 vcpu DOMU running 9.99.74 and a 1 processor DOM0
running 8.0_STABLE could not reproduce this running the code from the
DOMU or DOM0.

As a control test, a 2 vcpu 9.0_STABLE DOMU running on an Intel system
could also not reproduce this.

Since this test is a bit brutal, I didn't let it run too long as the
systems are doing other stuff, but it was several minutes and no fails
were reported. All of the systems are NetBSD/amd64.

>> This feels like a bug in memory management somewhere (TLB invalidation
>> issue, bug in copy-on-write?). Fundamentally, we have the parent
>> process getting corrupt memory after calling fork with an
>> (effectively) no-op child. That just shouldn't happen.
>>
>> I think we need someone familiar with NetBSD memory management
>> internals to help take a look. Otherwise I'm afraid we won't figure it
>> out and will have to declare that Go doesn't work on NetBSD on AMD
>> CPUs.
>>
>> gdt: that does sound like a different issue to me. It may be worth
>> filing a bug at https://github.com/golang/go/issues with the crash
>> details.
>>
>> Thanks,
>> Michael

--
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
Re: Can't mount root partition after rebuilding kernel with DKWEDGE_METHOD_MBR
Pham Ngoc-Dung writes:

> Hi. On my hard drive (wd0, MBR), I have 3 partitions originally for Linux,
> including a boot partition (ext2 formatted), and another partition with
> NetBSD installed.
> For a reason I wanted to mount the Linux boot partition, which should have
> been easy since ext2 is supported by NBSD. But except fdisk where it showed
> up, I couldn't find it anywhere, nor a way to mount it.
> A little bit more of research, then I found out about dk(4). I tried to
> rebuild the kernel with DKWEDGE_METHOD_MBR uncommented. It did boot from that
> kernel, but it couldn't mount my root device, However the 3 Linux partitions
> were now detected:
>
> [4.8364408] wd0 at atabus0 drive 0
> [4.8364408] wd0:
> [4.9467467] dk0 at wd0: "wd0e"
> [4.9467467] dk1 at wd0: "wd0f"
> [4.9467467] dk2 at wd0: "wd0h"
> [5.0676107] boot device: wd0
> [5.0676107] root on wd0a dumps on wd0b
> [5.0676107] vfs_mountroot: can't open root device
> [5.0676107] cannot mount root, error = 16
> [5.0676107] root device (default wd0a):
>
> Is there a workaround or a fix to this behavior?

You also want this -> DKWEDGE_METHOD_BSDLABEL as well as (or even
instead of) DKWEDGE_METHOD_MBR. That causes a wedge to be added for
each disklabel entry, which should allow the system to find your root
filesystem. It appears that with just DKWEDGE_METHOD_MBR the system
didn't notice where its root fs was. In fact, you may need to leave
out DKWEDGE_METHOD_MBR and just use DKWEDGE_METHOD_BSDLABEL if your
disklabel contains information about all of the native NetBSD
filesystems and the Linux ones.

You will have to change everything in your /etc/fstab to use /dev/dkN
notation rather than /dev/wd0M notation, but that isn't too hard to do.

--
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
Re: Dell PERC H330: no disks, no volumes
Edgar Fuß writes:

>> There is a PERC H330 and a PERC HBA330 and the Dell PERC9 user manual
>> (includes the H330) says you can boot it in HBA mode. Not sure if
>> that means that you can chose the firmware.
> When I set the H330 to HBA mode, it still attaches as mfii0, the only
> difference to RAID mode being that the attachment in HBA mode says
> scsibus0 at mfii0: 0 targets, 8 luns per target
> instead of
> scsibus0 at mfii0: 32 targets, 8 luns per target
> in RAID mode.
>
> I tried to force it to use mpii (by adding the PCI Id in mpii.c and
> disabling mfii in the kernel config, but that didn't work either
> (I had the faint hope the controller would use the MPT-2 protocol in
> HBA mode despite showing the RAID PCI Ids).
>
> What /does/ work is setting the controller to RAID mode and create two
> volumes with a one-element RAID-0. But that feels like crazy.

In the foggy recesses of my memory this is Just How It Is Done. At my
final $DAYJOB we had a set of systems that had some PERC controller in
them. The desire was to present the raw disks to Hadoop and the only
way that could be done was to create a virtual disk for each physical
device. There was no other option available to us.

--
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
Re: Dell PERC H330: no disks, no volumes
David Brownlee writes:

> On Thu, 15 Sept 2022 at 19:27, Brad Spencer wrote:
>>
>> In the foggy recesses of my memory this is Just How It Is Done. At my
>> final $DAYJOB we had a set of systems that had some PERC controller in
>> them. The desire was to present the raw disks to Hadoop and the only
>> way that could be done was to create a virtual disk for each physical
>> device. There was no other option available to us.
>
> I was annoyed enough by this behaviour to swap out the PERC on my old
> T320 for another model, specifically one for which I could find
> generic LSI firmware, so it would expose the 8 disks directly to
> NetBSD (for ZFS use)
>
> mpii0: SAS9217-8i, firmware 20.0.7.0, MPI 2.0
>
> David

I have totally forgotten the model numbers of any of this, but in
another area at the last $DAYJOB place we did that too for SmartOS and
for the same reason. Until somewhat recently, however, that wasn't a
supported configuration by Dell... it is now. There was also a BIOS
annoyance in that the simple HBA wasn't seen and would not be reported
in the DRAC (at least the versions we had).

--
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
Re: Can't mount root partition after rebuilding kernel with DKWEDGE_METHOD_MBR
Pham Ngoc-Dung writes:

> I rebuilt the kernel as you suggested. It booted up, but the Linux
> partitions are not detected. The root and swap partitions of NetBSD,
> however, is now on /dev/dk0 and dk1.
>
> On 9/29/22 12:50 AM, Brad Spencer wrote:
>> You also want this -> DKWEDGE_METHOD_BSDLABEL as well as (or even
>> instead of) DKWEDGE_METHOD_MBR. That cause a wedge to be added for each
>> disklabel entry which should the system to find your root filesystem.
>> It appears that with just DKWEDGE_METHOD_MBR the system didn't notice
>> where its root fs was at. In fact, you may need to leave out
>> DKWEDGE_METHOD_MBR and just use DKWEDGE_METHOD_BSDLABEL if your
>> disklabel contains information about all of the native NetBSD
>> filesystems and the linux ones.
>>
>> You will have to change everything in your /etc/fstab to use /dev/dkN
>> notation rather than /dev/wd0M notation, but that isn't too hard to do.

That probably means that the disklabel does not include the Linux
filesystems... probably... look closely at 'disklabel wd0' or whatever
your disk is called and see if anything Linux is there.
DKWEDGE_METHOD_BSDLABEL should create a wedge for every filesystem
described in the disklabel.

But... as someone else mentioned, you don't have to use wedges at all
if the disklabel contains a description of the Linux filesystems. You
can just mount the filesystems directly as wd0N.

--
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
Re: Likely lock botch in NPF firewall
Taylor R Campbell writes:

>> Date: Wed, 11 Jan 2023 13:05:04 -0500
>> From: Brad Spencer
>>
>> I think I know what the root problem is for kern/57136 and
>> kern/57181... a couple of PRs I wrote about problems I have been having
>> with NPF, but I am not at all sure that my solution is correct.

Thanks for the response

> These are the same issue. Your analysis is correct that this happens
> because the spin lock t_lock is held across copyout, which is
> forbidden, because copyout might sleep indefinitely and sleeping at
> all (let alone indefinitely) under a spin lock is forbidden.
>
> The kernel with LOCKDEBUG just goes out of its way to detect the
> problem early, whereas the non-LOCKDEBUG kernel mostly does the
> copyout without sleeping so it gets by without triggering the panic.
>
> However, simply changing the lock to be IPL_SOFTNET isn't enough.
> Holding an IPL_SOFTNET lock across copyout is also forbidden (but
> LOCKDEBUG might not detect it). It is forbidden for anything in hard
> or soft interrupt context to block for copyout (which might be
> happening in another thread).

I mostly picked IPL_SOFTNET from the statements made in mutex(9). It
seems to state that IPL_SOFTNET is one of the adaptive locks, as are
all IPL_SOFT* locks, that is, it isn't a spin lock.

[sorry for what are likely to be dumb questions]

By "forbidden for anything in hard or soft interrupt context to block
for copyout", do you mean holding any mutex across copyout?? (This is
an area I am pretty dumb about and I am not entirely sure what I am
doing). I could not find anything in the man pages about holding
mutexes around copyout. (That is, if it IS NOT an interrupt context,
can you hold a mutex during a copyout?)

The case that messes with me the most is the table list ioctl. If that
ioctl manages to get the IPL_SOFTNET lock, nothing else, even if it is
interrupt context of some sort, should have it, and it won't get the
lock until it is available?? (or am I missing something)... or is an
ioctl itself some sort of interrupt context, or is it the case that you
are just not allowed to hold any sort of mutex across a copyout at
all?? (like I said, I don't really understand this that well).

> This code needs to coordinate the table iteration and copyout with
> other access (insert/remove/lookup) in another way. Probably the
> easiest way will be to create an rwlock, and use t_gc for lpm-type
> tables in addition to thmap-type tables:
>
> - npf_table_remove moves lpm-type entries to the t_gc list instead of
>   freeing them immediately
> - npf_table_gc takes a write lock and frees the t_gc entries
> - npf_table_list takes a read lock (and pserialize if needed to grab
>   pointers to entries) around copyout -- never takes t_lock
>
> This could still potentially result in deadlock scenarios, if
> npf_table_gc is ever reached from the pagedaemon; if so, npf_table_gc
> could do rw_tryenter instead of rw_enter to avoid this deadlock, at
> the cost of sometimes delaying the freeing of table entries.

I understand the code well enough to read it some, but probably not
well enough to implement what you suggest.

As it is right now, this is a bad bug. The abuse test I came up with
that doesn't use LOCKDEBUG trips a panic by hitting two xbd drives
pretty hard while doing a npf table list, but I think that other
activity will trigger it. I get 2 to 6 days between panics right now.
With the simple patch applied, I could no longer cause the abuse panic
in my test DOMU, where I could without it. For me right now that may be
an improvement if it helps the firewall / router not panic.

--
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
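To make the rule in the quoted reply concrete, here is a small userland model of "do not call copyout while the table lock is held". It is only a sketch under loud assumptions: fake_lock, fake_unlock, fake_copyout, table and table_list are stand-ins invented for illustration, not the NPF code, and the approach shown (snapshot the entries under the lock, drop the lock, then copy out) is a different and simpler technique than the rwlock plus t_gc scheme proposed above. A real NPF table can be large, so a full snapshot may not be practical there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NENTRIES 4

/* Stand-ins for the kernel primitives; all hypothetical. */
static bool lock_held;

static void
fake_lock(void)
{
	assert(!lock_held);
	lock_held = true;
}

static void
fake_unlock(void)
{
	assert(lock_held);
	lock_held = false;
}

/* Models copyout: it may sleep, so it must never run under the lock. */
static int
fake_copyout(const void *kaddr, void *uaddr, size_t len)
{
	assert(!lock_held);	/* the rule under discussion */
	memcpy(uaddr, kaddr, len);
	return 0;
}

static int table[NENTRIES] = { 10, 20, 30, 40 };

/* List the table: snapshot under the lock, copy out after dropping it. */
static int
table_list(int *ubuf, size_t n)
{
	int snap[NENTRIES];

	if (n > NENTRIES)
		n = NENTRIES;
	fake_lock();
	memcpy(snap, table, n * sizeof(snap[0]));
	fake_unlock();
	return fake_copyout(snap, ubuf, n * sizeof(snap[0]));
}
```

Moving the fake_copyout() call between fake_lock() and fake_unlock() makes the assertion fire, which is roughly the kind of check LOCKDEBUG performs for the real thing.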
Likely lock botch in NPF firewall
Hello...

I think I know what the root problem is for kern/57136 and
kern/57181... a couple of PRs I wrote about problems I have been having
with NPF, but I am not at all sure that my solution is correct.

The problem seems to be that a spin lock is being held and then an
action is performed that is illegal while holding a spin lock. In
kern/57136 a panic happens when a CPU switch occurs (I think), and in
kern/57181 (a much simpler panic to reproduce) the panic happens when
LOCKDEBUG is defined and certain NPF operations are performed.

What I think is happening is that a mutex is created per NPF table at
ipl IPL_NET (see the use of t->t_lock in
src/sys/net/npf/npf_tableset.c). This would be a spin lock (I think).
This lock is held during NPF operations such as "npfctl table list",
and that ioctl can perform a copyout, which I believe is against the
rules when holding a spin lock (there may be other copyout instances
too, but the table list is the one that is messing with me). This
seems to be strongly hinted at in the LOCKDEBUG PR, but I don't fully
understand everything I see there.

My simple patch to address this is this:

--- npf_tableset.c.DIST 2022-04-09 19:38:33.0 -0400
+++ npf_tableset.c 2023-01-11 09:26:30.627895981 -0500
@@ -410,7 +410,8 @@
 	default:
 		KASSERT(false);
 	}
-	mutex_init(&t->t_lock, MUTEX_DEFAULT, IPL_NET);
+	/* mutex_init(&t->t_lock, MUTEX_DEFAULT, IPL_NET); */
+	mutex_init(&t->t_lock, MUTEX_DEFAULT, IPL_SOFTNET);
 	t->t_type = type;
 	t->t_id = tid;
 	return t;

This patch seems to work alright: the LOCKDEBUG panic is gone,
addressing kern/57181, and the abusive test I found which can trigger
kern/57136 no longer panics either (i.e. I don't have to wait 6 days to
see the panic, I have managed to come up with a way to produce it in a
few minutes).

What I do not know is why this mutex was IPL_NET in the first place
(the comments mention that it is so, but not why it is so). I compiled
a kernel with this patch for the firewall that was panicking and it
still appears to function as a firewall / router just fine, which
suggests that the lock need not have been a spin lock in the first
place (at least with Xen DOMUs).

While I am running Xen DOMUs everywhere, at least kern/57181 (the
LOCKDEBUG PR) affects everything. I managed to panic a Virtualbox "bare
metal" system using the GENERIC kernel, for example. This suggests that
right now the use of NPF with tables is probably a bit dangerous. This
problem is also probably racy in some way, as I have a couple of other
DOMUs that use NPF as a firewall and use NPF tables that do not panic
at all. They do exactly the same operations, more or less (with a
different /etc/npf.conf), as the one that does panic regularly.

Any advice or hints about this would be appreciated. I am honestly a
bit in the dark as to some of what may be going on here.

--
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
Re: Likely lock botch in NPF firewall
Taylor R Campbell writes:

> Try the attached patch?

Thanks... I will test it as soon as I am able to.

--
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
Re: Finding the slot in the ioconf table a module attaches to?
Brian Buhrow writes:

> hello Brad. I thought the idea behind modules was that you didn't need
> to rebuild a
> kernel to add devices to the ioconf table? And, in fact, under the old
> module framework, that
> is, NetBSD-5 and earlier, you could add devices and major numbers to the
> table without having
> to rebuild the kernel. If, in fact, I need to rebuild the kernel to add
> device drivers to the
> kernel, then I submit our module framework is fatally broken. So, I'll hope
> that isn't the
> case and proceed. If I figure out how to do it, I'll post here so others
> won't have to climb
> that learning curve using the same path.
> -thanks
> -Brian

My general experience seems to be that a recompile of the kernel is
needed, at least when used in the manner I used it (adding a new module
that used a static major number in one of the major files). I was
bundling a module that would be added to src anyway, so the recompile
wasn't a notable problem. The devsw_attach(9) man page implies that
the major and minor number can be selected with that call, but I have
never used it that way.

--
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
Re: Finding the slot in the ioconf table a module attaches to?
RVP writes:

> On Wed, 1 Feb 2023, Brian Buhrow wrote:
>
>> hello. Following up on my own post, I found the mechanism by which the
>> cdevsw structure
>> gets tied to the ioconf table in NetBSD-5. It's done with:
>>
>>
>> MOD_DEV("zaptel", "zaptel", NULL, -1, _cdevsw, ZT_MAJOR)
>>
>> This macro has been removed from the new module framework. Can someone
>> point me in the correct
>> direction as to where to look for the replacement function for this macro
>> with the new module
>> framework?
>>
>
> These should help you:
>
> /usr/src/sys/modules/examples/readhappy/readhappy.c
> /usr/src/sys/conf/majors*
>
> -RVP

To add a bit... generally I have just added an entry to one of the
"major" files in sys/conf. However, I have noticed that in order for
the module to be able to use it, after the major file edit, I had to
rebuild the kernel as well. I have never been 100% sure that was
proper behavior, but it seems to be the case. That is, just editing
the major file and building or rebuilding the module has not been
enough.

--
Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
Re: Polymorphic devices
m...@goathill.org (MLH) writes: > Brad Spencer wrote : > >> It has a built in UART which is a separate USB device and then >> a USB device that can be programmed to provide I2C, Hello... > What are you using to flash these devices? No flashing of anything is needed to use the I2C mode. All you need to do is send the proper control commands down the USB bus and have the I2C bus glue in place to make a bus. The MPSSE mode uses all of the same USB endpoints as the UART mode, just with a different protocol. You can do a lot of this today from userland with pyftdi (a pure Python solution), libmpsse and libftdi, all of which are open source based upon reverse engineering the FTDI USB protocol. Or.. you can download the closed source FTDI userland library and work with the chip with that, but that is only available for some systems, of course. What I want is a proper NetBSD i2c bus, which should be possible. > I have been using several I2C devices such as servomotor controllers, > i/o extenders, relay controllers, displays, etc. with arduino > controllers (waiting on some esp32 boards now) and most of the > non-standard arduino boards require flashing (via usb) of some sort > to set what you want to use, as I will have to do with the esp32s. > > I am having to use a 10-yr-old windows laptop to flash and program > these things. I wish we had the Arduino IDE ported to NetBSD as it > is a very nice tool for this but waiting for 10 minutes for the > thing to come up under windows is tiring. Once they are up, many > of these controllers have bluetooth, ethernet and wifi support > where I can control them directly from NetBSD but getting them > there and programming them is archaic on windows. Ya, I thought that the Arduino compiler was in pkgsrc, which I think is just gcc.
> Yes, there are the basic programmers in pkgsrc but the arduino ide > can easily program a complete package linked to a git repository > into these devices very nicely (once you have it up and running) > and runs on linux. Ya, I do some Arduino stuff from time to time and use the graphical IDE for that. > Having I2C devices directly accessible sounds interesting though. I think so as well, which is mostly why I am trying to get this working. -- Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
Re: Polymorphic devices
Iain Hibbert writes: > On Fri, 5 Jan 2024, Brad Spencer wrote: > >> I see a few options for doing this, such as simply matching all of the >> possible children and using sysctl to enable the one you want. Probably >> followed by a rescan call. That seems ugly, however. The use of >> 'drvctl -r -a ' seemed to hold promise. It seems like a >> better idea to require the detachment of whatever followed by the rescan >> with the attribute indicating which sort of thing you wanted to attach >> (detach ucom in favor of i2c, for example). But I do not completely >> understand if this sort of thing is possible. > > I don't see why it shouldn't be.. you attach your main driver as a bus, > then send it a message as to what configuration you need, then it > essentially rescans and finds and attaches or detaches sub-drivers as > appropriate. > > I did this sort of thing in software with bthub(4) > > iain Thanks for that. bthub(4) appears to be close to the idea I need. I will look at it in detail. -- Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
Polymorphic devices
Hello, I am in need of some advice and guidance. I am working slowly on device support for a couple of USB chips that have multiple, but somewhat mutually exclusive, behaviors. The first is enhancements to uftdi to support the MPSSE engine that some of the FTDI chip variants have. This engine allows the chip to be an I2C bus or an SPI bus, and provides some simple GPIO and a bunch of other stuff, as well as the normal USB UART. It is not possible to use all of the modes at the same time. That is, these are not separate devices, but modes within one device. Or, put another way, depending on the mode of the chip you get different child devices attached to it. I am curious what the thoughts are on how this might be modeled. I see a few options for doing this, such as simply matching all of the possible children and using sysctl to enable the one you want, probably followed by a rescan call. That seems ugly, however. The use of 'drvctl -r -a ' seemed to hold promise. It seems like a better idea to require the detachment of whatever is attached, followed by the rescan with the attribute indicating which sort of thing you wanted to attach (detach ucom in favor of i2c, for example). But I do not completely understand if this sort of thing is possible. The second chip is the MCP2221, which provides some of the same features as the FTDI MPSSE engine. It has a built in UART which is a separate USB device and then a USB device that can be programmed to provide I2C, GPIO, some DAC, and a bit of ADC (low number of bits in both of those cases, but still interesting enough). You can probably always just provide the I2C bus and then flip the behaviors of the GPIO pins by setting their alt settings. This chip may be a bit simpler to deal with by using GPIO alt settings and such, but I also have not looked at this one as much.
If I can get these to work, the end result will be that any system with USB, including virtual systems with USB devices presented to it in some way, can have I2C, simple GPIO and maybe SPI ... and this is the place I want to end up for my own needs. I have looked around the tree for other devices that do this sort of thing and didn't really find any that tried to deal with this sort of situation, but I will admit I looked mostly at the MI devices. Any thoughts would be appreciated, -- Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
Re: Polymorphic devices
Greg Troxel writes: > Brad Spencer writes: > >> The first is enhancements to uftdi to support the MPSSE engine that some >> of the FTDI chip variants have. This engine allows the chip to be a I2C >> bus, SPI bus and provides some simple GPIO, and a bunch of other stuff, >> as well as the normal USB UART. It is not possible to use all of the >> modes at the same time. That is, these are not separate devices, but >> modes within one device. Or another way, depending on the mode of the >> chip you get different child devices attached to it. I am curious on >> what the thoughts are on how this might be modeled. > > My reaction without much thought is to attach them all and to have the > non-selected one return ENXIO or similar. And to have another device on > which you call the ioctl to choose which device to enable. > > Or perhaps, to let you open any of them, flipping the mode, and to fail > the 2nd simultaneous open. Those techniques and models have occurred to me too, except I probably would still elect to use a sysctl instead of another device with an ioctl. I don't know just yet, but there might be unwanted device resets with the "use the one you open" technique. That is, you might have to reset the chip to change mode, and if you support, say, I2C and GPIO at the same time (which is possible), but then change to just GPIO, the chip has to be reset and that will disrupt any settings you might have made (I think; I am still working out what needs to happen with the mode switches). This may not matter in the bigger picture, and it wouldn't matter as much if the mode switch was a sysctl, which one can say will reset the chip anyway. In any case, thanks for the comments... -- Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
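The "open any of them, flipping the mode, and fail the 2nd simultaneous open" idea can be modelled in a few lines of userland C. This is only a sketch of the bookkeeping, not driver code: the poly_dev struct, the poly_open() and poly_close() helpers and the mode names are all made up for illustration, the resets counter merely stands in for the chip reset that a mode change might need, and allowing a second open of the already-active mode is an assumption.

```c
#include <errno.h>

/* Hypothetical personalities of one polymorphic chip. */
enum poly_mode { MODE_NONE, MODE_UART, MODE_I2C, MODE_GPIO };

/* Hypothetical per-chip state shared by all child devices. */
struct poly_dev {
	enum poly_mode active;	/* personality currently selected */
	int open_count;		/* opens held on the active personality */
	int resets;		/* number of simulated chip resets */
};

/*
 * Open the child for 'want'.  If a different personality is already
 * open, refuse with EBUSY.  Otherwise switch the chip to that mode,
 * counting a reset only when the mode actually changes.
 */
static int
poly_open(struct poly_dev *d, enum poly_mode want)
{
	if (d->open_count > 0 && d->active != want)
		return EBUSY;
	if (d->active != want) {
		d->resets++;	/* a mode switch may require a chip reset */
		d->active = want;
	}
	d->open_count++;
	return 0;
}

/* Drop one open; the selected mode is left as it was. */
static void
poly_close(struct poly_dev *d)
{
	if (d->open_count > 0)
		d->open_count--;
}
```

With this model, opening the I2C child and then the UART child fails the second open with EBUSY, and every successful switch bumps the reset counter, which is exactly the side effect that makes an explicit sysctl look attractive.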
usbip??
Has anyone contemplated, at the very least, a USBIP[1] client for NetBSD?? From looking at the tree, it would seem that the Not-Well-Documented vhci driver that exists in 10.x and -current could be used towards that end. Similar concepts are used in other OSs. I could not find a clean (i.e. license compatible) USBIP library anywhere, but there are a number of examples all over the place. [1] - USBIP appears to be, more or less, Linux URB (the method that Linux uses to describe a USB transfer, etc..) structs over IP. There are implementations of this for MS Windows, among others, as well as some of the more powerful small boards like the ESP series. It is a method of presenting a USB device to a system that does not have physical access to USB ports, and/or would like to use a USB device that is "over there". Many of the implementations seem to make use of a vhci driver, which is a virtual USB host controller for the client side of things. -- Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
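To give a feel for how little framing the protocol has: going by the Linux kernel's USB/IP protocol description, a client asks a server for its exported devices by sending an 8-byte OP_REQ_DEVLIST header (version 0x0111, command code 0x8005, status 0, all big-endian). The sketch below only packs that header into a buffer and does no networking; the helper names (put_be16, put_be32, usbip_pack_devlist_req) are invented for this example and do not come from any existing library.

```c
#include <stddef.h>
#include <stdint.h>

#define USBIP_VERSION    0x0111u	/* BCD protocol version 1.1.1 */
#define OP_REQ_DEVLIST   0x8005u	/* "list exported devices" request */
#define USBIP_OP_HDR_LEN 8u

/* Store a 16-bit value in network (big-endian) byte order. */
static void
put_be16(uint8_t *p, uint16_t v)
{
	p[0] = (uint8_t)(v >> 8);
	p[1] = (uint8_t)(v & 0xff);
}

/* Store a 32-bit value in network (big-endian) byte order. */
static void
put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)(v & 0xff);
}

/*
 * Pack an OP_REQ_DEVLIST header into buf.
 * Returns the number of bytes written, or 0 if buf is too small.
 */
static size_t
usbip_pack_devlist_req(uint8_t *buf, size_t len)
{
	if (len < USBIP_OP_HDR_LEN)
		return 0;
	put_be16(buf, USBIP_VERSION);
	put_be16(buf + 2, OP_REQ_DEVLIST);
	put_be32(buf + 4, 0);	/* status: unused in a request */
	return USBIP_OP_HDR_LEN;
}
```

The actual URB traffic that follows an import is where the real work is, and that is the part a vhci-based client would have to translate.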
Re: Forcing a USB device to "ugen"
Brian Buhrow writes: > Isn't it possible to do most of what Jason proposes by using the drvctl > interface to > detach a driver from a specific USB device? Then, some glue could be added > to the ugen driver > to allow it to be attached to arbitrary devices using the same drvctl > interface? That seems a > lot easier to me than building a registry of devices and device IDs, which > will be woefully > out of date before it gets published. It also has the advantage of allowing > the user to do > creative stuff that the developers didn't think of. Am I missing something > obvious? > > -thanks > -Brian I don't think that the detach part is a problem, but the second part is murky. The only thing I know of that can do the second part is a rescan call against the USB bus. Unless the ugen driver is allowed to have a higher priority than the specific device driver, the specific wins during the rescan of the USB bus. The use of ugenif was mentioned, and this is a way to do the "make ugen have a higher priority" game, but ugenif requires a custom kernel. See the ugen man page for the hint on how to use ugenif. There is the concept of tags that can be passed in a rescan, but to use those effectively for this problem is messy. You may end up having to put the "detect that ugen has a high priority" in every driver or at the very least the [uoevx]hci drivers and even then, it isn't tied to a specific device really, just the concept of "on this rescan, have ugen take priority" and ANY device found would get ugen. Jason's notion of using ugen as a bus instead of a leaf has merit and may be the better approach. The devil will be in the details, however. -- Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
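The priority problem can be shown with a toy model of a bus rescan, in which every candidate driver reports a match score for a device and the highest score attaches. The scores and the rescan_winner() helper below are illustrative only (they are not the kernel's match values or any real autoconf interface); the point is that raising the generic driver's score makes it win for whatever device is being probed, not just for the one you care about.

```c
#include <stddef.h>

/* One candidate driver and the (made up) score it reports for a device. */
struct drv_score {
	const char *name;
	int score;		/* 0 means "does not match" */
};

/*
 * Return the name of the driver that would attach: the one with the
 * highest non-zero score, or NULL if nothing matches.
 */
static const char *
rescan_winner(const struct drv_score *d, size_t n)
{
	const char *best = NULL;
	int best_score = 0;

	for (size_t i = 0; i < n; i++) {
		if (d[i].score > best_score) {
			best_score = d[i].score;
			best = d[i].name;
		}
	}
	return best;
}
```

Given { "uftdi", 10 } and { "ugen", 1 } the specific driver wins; bump ugen to 15 and it takes the device, and it would do the same for any other device probed during that rescan.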
Re: Forcing a USB device to "ugen"
Jason Thorpe writes: [snip] >> What would be wrong with attaching an ugen to interface 1 instead of >> an ucom in the ftdi driver itself? > > ugen can’t currently attach to things other than uhub. I think attaching > ugen as the leaf is the wrong model, though; ugen should be what the kernel > drivers themselves attach to, IMO. This would probably make it very possible to have a USBIP server. For that, you need to more or less be able to do ugen against any physically present USB device, without the specific device drivers getting in the way. Even if you had to detach the specific driver, to get ugen exposed that would be ok, even better if you didn't have to do that, of course... According to the man page for ugen there is a way to compile a kernel such that ugen takes priority over the specific driver, but that is more than a little clunky in the USBIP server case. This is the ugenif case that was mentioned before. > -- thorpej -- Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
Re: Forcing a USB device to "ugen"
Jason Thorpe writes: > I have a device based on the FTDI FT2232C: > > [ 3285.311079] uftdi1 at uhub1 port 2 configuration 1 interface 0 > [ 3285.311079] uftdi1: SecuringHardware.com (0x0403) Tigard V1.1 > (0x6010), rev 2.00/7.00, addr 3 > [ 3285.311079] ucom1 at uftdi1 portno 1 > [ 3285.311079] uftdi2 at uhub1 port 2 configuration 1 interface 1 > [ 3285.311079] uftdi2: SecuringHardware.com (0x0403) Tigard V1.1 > (0x6010), rev 2.00/7.00, addr 3 > [ 3285.311079] ucom2 at uftdi2 portno 2 > > It's a combo device that, in addition to a standard TTL-level UART, has a > bunch of break-out headers for doing things like SPI, SWD, and JTAG (in my > case, I need to use JTAG for programming some Atmel CPLDs). I should be able > to do this with OpenOCD (pkgsrc/devel/openocd), but libftdi1 fails to find > the device because libusb1 only deals in "ugen". > > "ugenif" might have been a possible solution here, except for the fact that > 0x0403,0x6010 is the standard VID,PID for the FTDI FT2232C, and I don't want > "interface 1" of ALL FT2232C devices to get the "ugen" treatment. The desire > to use "ugen" on "interface 1" is not a property of 0x0403,0x6010, it's > really a property of "SecuringHardware.com","Tigard V1.1". Unfortunately, > there isn't a way to express that in the kernel config syntax. > > I think my only short-term option here is to, in uftdi_match(), specifically > reject based on this criteria: > > - VID == 0x0403 > - PID == 0x6010 > - interface number == 1 > - vendor string == "SecuringHardware.com" > - product string == "Tigard V1.1" > > (It's never useful, on this particular board, to use the second port as a > UART.) > > -- thorpej [going a little off topic, but maybe this fills in some blanks about that chip] The FT2232 and its buddies the FT4232 and FT232H are nice chips.
They are also polymorphic in that there are one or more engines in the chip, called the MPSSE, that can program the pins to support any number of different sorts of things such as I2C, SPI, and simple GPIO (and others), in addition to a mode that is the usual FTDI UART (at the very least; some of the chips support other UART modes). The chip always identifies itself as an FTwhatever, however, so you can't use that as the basis for what you want the pins to do, and in some cases you can have a usual FTDI UART plus an I2C bus (for example) at the same time because there is more than one MPSSE engine present. I have samples of all of the chips, and some interest in supporting the polymorphic behavior, but no particular time to do the work. My personal main goal would be to allow any system with a USB port the ability to have I2C, probably SPI and simple GPIO. -- Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
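The five rejection criteria quoted above amount to a single predicate. Here is a minimal userland sketch of it, assuming a made-up usb_if_info struct that carries the fields in question; the real uftdi_match() receives different arguments, and how the vendor and product strings would be fetched there is not shown.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical summary of what a match routine knows about an interface. */
struct usb_if_info {
	uint16_t vid;		/* USB vendor ID */
	uint16_t pid;		/* USB product ID */
	int ifaceno;		/* interface number */
	const char *vendor;	/* vendor string, may be NULL */
	const char *product;	/* product string, may be NULL */
};

/*
 * True only for interface 1 of a Tigard V1.1 board, i.e. the port that
 * should be left alone so it does not attach as a UART.  Any other
 * FT2232C (same VID/PID, different strings) is unaffected.
 */
static bool
is_tigard_second_port(const struct usb_if_info *u)
{
	return u->vid == 0x0403 &&
	    u->pid == 0x6010 &&
	    u->ifaceno == 1 &&
	    u->vendor != NULL &&
	    strcmp(u->vendor, "SecuringHardware.com") == 0 &&
	    u->product != NULL &&
	    strcmp(u->product, "Tigard V1.1") == 0;
}
```

A match routine would return "no match" when this predicate is true and carry on as usual otherwise, which keeps the special case tied to the board rather than to the VID,PID pair.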
Re: Forcing a USB device to "ugen"
Robert Swindells writes: > Jason Thorpe wrote: >> I have a device based on the FTDI FT2232C: >> >> [ 3285.311079] uftdi1 at uhub1 port 2 configuration 1 interface 0 >> [ 3285.311079] uftdi1: SecuringHardware.com (0x0403) Tigard V1.1 >> (0x6010), rev 2.00/7.00, addr 3 >> [ 3285.311079] ucom1 at uftdi1 portno 1 >> [ 3285.311079] uftdi2 at uhub1 port 2 configuration 1 interface 1 >> [ 3285.311079] uftdi2: SecuringHardware.com (0x0403) Tigard V1.1 >> (0x6010), rev 2.00/7.00, addr 3 >> [ 3285.311079] ucom2 at uftdi2 portno 2 >> >> It's a combo device that, in addition to a standard TTL-level UART, >> has a bunch of break-out headers for doing things like SPI, SWD, and >> JTAG (in my case, I need to use JTAG for programming some Atmel >> CPLDs). I should be able to do this with OpenOCD >> (pkgsrc/devel/openocd), but libftdi1 fails to find the device because >> libusb1 only deals in "ugen". > > I thought that all FTDI devices provided JTAG etc. functionality, just > that the pins are not connected to anything in some devices. I believe that only the ones with the more advanced engine design can do JTAG (I think that uses the BITBANG mode, which is yet another mode alongside UART and MPSSE, but I don't exactly remember). FTDI makes a lot of chip types, and some of them are only UART. > What would be wrong with attaching an ugen to interface 1 instead of > an ucom in the ftdi driver itself? In the FT4232 chip, for example, there are two MPSSE engines attached to two of the four ports and 4 distinct ports total (the FT2232 has two MPSSE engines and two ports total, and the FT232H has one MPSSE engine and two ports, but they are 16 bits and not the usual 8 bits that the others have). On the FT4232, the two MPSSE engines can program their respective ports to be I2C, SPI, GPIO, etc., and support a BITBANG mode that can also be used for GPIO, etc. They also support the usual UART mode on a port.
The two ports that don't have the MPSSE engine can do the UART and BITBANG mode. In general, the chip is more complicated than simply setting a specific interface to something or other. There probably should be a userland utility or a plist file somewhere that tells uftdi what you want a particular interface / port to be. It is entirely reasonable to want to change the personality without a reboot from userland. A detach / reattach is acceptable. -- Brad Spencer - b...@anduin.eldar.org - KC8VKS - http://anduin.eldar.org
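The FT4232 description above (four ports, MPSSE on two of them, UART and BITBANG everywhere) is the kind of rule a mode-selection utility would have to enforce before touching the chip. A minimal sketch of that check follows; the names are invented for the example, and putting the MPSSE engines on ports 0 and 1 is an assumption, since the text only says "two of the four ports".

```c
#include <errno.h>
#include <stdbool.h>

/* Personalities a single port can be asked to take on. */
enum port_mode { PM_UART, PM_BITBANG, PM_MPSSE };

#define FT4232_NPORTS 4

/* Assumption: the two MPSSE engines sit on the first two ports. */
static const bool ft4232_has_mpsse[FT4232_NPORTS] = {
	true, true, false, false
};

/*
 * Validate a requested personality for one FT4232 port.
 * Returns 0 if the request is possible, ENXIO for a port that does not
 * exist, or EINVAL when MPSSE is requested on a port without an engine.
 */
static int
ft4232_check_mode(int port, enum port_mode mode)
{
	if (port < 0 || port >= FT4232_NPORTS)
		return ENXIO;
	if (mode == PM_MPSSE && !ft4232_has_mpsse[port])
		return EINVAL;
	return 0;	/* UART and BITBANG are available on every port */
}
```

A plist or sysctl front end could run each requested (port, mode) pair through a check like this and only then detach and reattach the affected children.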