Re: RFC: PERCPU_LIST to fix PR kern/52515
On Fri, Dec 8, 2017 at 6:51 PM, Kengo NAKAHARAwrote: > Hi, > > On 2017/12/07 14:21, Taylor R Campbell wrote: >> I dropped this thread on the floor a while ago and I forget what the >> status was. I've had a patch sitting in my tree for a while which I >> brushed off to put the list update logic in separate functions, as I >> think chuq requested a while ago, but still keep it all isolated to >> subr_psref.c and avoid defining any new macros. >> >> If you've measured that SLIST works better -- which would make sense >> because the typical bracketed psref_acquire/release nesting makes a >> nice LIFO stack of the things so that SLIST_REMOVE should usually be >> done in the first iteration -- then I'm happy to replace it by SLIST. >> We should just make sure to fix this bug before netbsd-8 goes out! >> >> Thoughts? > > I measure IPv4 forwarding performance your patch(PSREF_LIST) version > and SLIST version. At first, the result is quite affected by > optimization option "-falign-functions"... Based on this, it seems > there is almost no difference between PSREF_LIST and SLIST. > > However, it seems your patch has large diff... From the point of code > stability, smaller diff SLIST version would be better for netbsd-8 branch > to fix the bug. Because your patch causes some new ATF failures such as > ldp_regen and route_change_ifp (reported by ozaki-r@n.o). We can probably > fix them at once but guaranteeing its stability would take more time. Hmm, the same failures also happen with the SLIST version. IIRC there was no regression with it two months ago and so I didn't test it enough days ago. Anyway I guess the failures are not because of your patch. I'm sorry for that. I have no idea why the failures happen. The patches may dig out a latent bug in -current. ozaki-r
I2C iic_exec() and clock stretch
Sorry to intrude... So... I have a I2C device I am writing a driver for that has a read cycle that needs one of the following: 1) Start a read with a STOP, except that before the data is sent down the bus, clock stretch for a bit while the device is working on what is is suppose to be working on. When it is done, the data will be sent. 2) Start the read with no STOP cycle, but return NACK part way though until the device is done doing what it is suppose to be doing. When it stops sending a NACK, or after you wait a bit, it seems that you can do a read with a STOP and get the data. Neither of these two situations appear to be supported with iic_exec() as it exists today. What I was wondering is if anyone has worked on anything to enhance iic_exec() to deal with this sort of thing. The device in question is the SI7021 humidity and temperature sensor. It is a fairly inexpensive device that works pretty well and has tons of example code out there for all kinds of OSs and devices. The data sheet is available at, among other places: https://cdn-learn.adafruit.com/assets/assets/000/035/931/original/Support_Documents_TechnicalDocs_Si7021-A20.pdf I am using a Raspberry PI 3 as the computer. To do read type #1, I hacked up bcm2835_bsc.c to provide a simple device property to specify the amount of clock stretch to the BCM2835 iic that is desired and a new flag in i2cvar.h that can be used by iic_exec() that will tell the controller driver to use the stretch, but getting that property set from attached children provided to be ugly. I am thinking that something like a iic_execv() should maybe exist that works, mechanically, a little like sysctl_createv() so that a si70xx driver can do a I2C transaction, but tell the controller driver, do this transaction with these additional arguments that are more complicated then a simple flag. To do read type #2 with the SI7021 iic_exec() needs to be able to do a 0 length read, I think and this is disallowed by bcm2835_bsc.c by a KASSERT(dlen >= 1). I didn't really look at any other drivers to see if this sort of thing is allowed. This is basically a cycle that looks like: START+ADDRESS+COMMAND+START+ADDRESS NACK will be returned until the measurement is done START+ADDRESS You get up to three bytes back. It was simpler to hack in support for clock stretching then to figure out if it was possible to do 0 length reads. In any case, I was curious if anyone had any thoughts or was working on something like this?? -- Brad Spencer - b...@anduin.eldar.org - KC8VKS http://anduin.eldar.org - & - http://anduin.ipv6.eldar.org [IPv6 only]
Re: getsockopt(2)
chris...@astron.com (Christos Zoulas) wrote: >In article, >Robert Swindells wrote: >> >>chris...@astron.com (Christos Zoulas) wrote: >>>In article , >>> >>>So depending on the contents of the sockval we do something different? >> >>FreeBSD does. The calls to copy in or out the data are scattered >>throughout the network stack. > >Is sockval always an int? Do you mean optname or optval ? >>Linux seems to cheat and make use of the fact that their errors are >>negative to return a positive integer word value from >>setsockopt(). Their implementation of getsockopt() copies in and out. > >Hmm, that is not what the man page claims :-) Linux doesn't really do man pages though, I looked at the source. The FreeBSD man page is wrong too. There is a thread on lkml describing what Linux did. >>I would still prefer to change both getsockopt(2) and setsockopt(2). >> >>I would make getsockopt always copy in the supplied optval argument. > >Well, it has to copyout, so presumably that won't break existing code. I have run a kernel for a while with this change to getsockopt() and it seemed fine. >>For setsockopt, the protocol code can update sopt_size to indicate the >>amount of data to be copied back. I would add a line to the PRCO_SETOPT >>handler in most protocols to set sopt_size to 0 so that nothing was >>copied back. > >That could break things; grepping for sopt_size finds many checks for it. >There is also sockopt_setmbuf that can potentially allocate larger than >MCLBYTES socket option? I meant that I would set sopt_size to zero after all the processing of the input data has been finished. The only place that I could see mbufs being used was in ipsec, which doesn't work because it needs these changes too. >>The syscall definition for setsockopt() would need to be changed to >>remove the 'const' from the optval argument. > >That is arguably the worst part of the change. The API for it is part of >the standard. Well, we can go the linux way and make getsockopt read and >write, but that makes me cringe. Perhaps adding an updatesockopt() is better? It is the FreeBSD way as well. The operation required for getsockopt() isn't really *updating* the socket options, it is more "get with an extra parameter". The operation required for setsockopt() is to mostly set things but in certain cases to return an int value that results from setting stuff.
Re: getsockopt(2)
In article, Robert Swindells wrote: > >chris...@astron.com (Christos Zoulas) wrote: >>In article , >> >>So depending on the contents of the sockval we do something different? > >FreeBSD does. The calls to copy in or out the data are scattered >throughout the network stack. Is sockval always an int? >Linux seems to cheat and make use of the fact that their errors are >negative to return a positive integer word value from >setsockopt(). Their implementation of getsockopt() copies in and out. Hmm, that is not what the man page claims :-) >I would still prefer to change both getsockopt(2) and setsockopt(2). > >I would make getsockopt always copy in the supplied optval argument. Well, it has to copyout, so presumably that won't break existing code. >For setsockopt, the protocol code can update sopt_size to indicate the >amount of data to be copied back. I would add a line to the PRCO_SETOPT >handler in most protocols to set sopt_size to 0 so that nothing was >copied back. That could break things; grepping for sopt_size finds many checks for it. There is also sockopt_setmbuf that can potentially allocate larger than MCLBYTES socket option? >The syscall definition for setsockopt() would need to be changed to >remove the 'const' from the optval argument. That is arguably the worst part of the change. The API for it is part of the standard. Well, we can go the linux way and make getsockopt read and write, but that makes me cringe. Perhaps adding an updatesockopt() is better? christos
Re: getsockopt(2)
chris...@astron.com (Christos Zoulas) wrote: >In article, >Robert Swindells wrote: >> >>I wrote: >>>Does anyone know why we don't use the input 'optlen' parameter to the >>>getsockopt(2) syscall, we do write back to it on return. >>> >>>In ip_output() there is this, which suggests that someone had come >>>across this before. >>> >>>#if 0 /* defined(IPSEC) */ >>>case IP_IPSEC_POLICY: >>>{ >>>struct mbuf *m = NULL; >>> >>>/* XXX this will return EINVAL as sopt is empty */ >>>error = ipsec4_get_policy(inp, sopt->sopt_data, >>>sopt->sopt_size, ); >>>if (error == 0) >>>error = sockopt_setmbuf(sopt, m); >>>break; >>>} >>>#endif /*IPSEC*/ >>> >>>There are also lots of places in sctp_usrreq that want to use it. >>> >>>I can set it with the following patch (line numbers will be slightly >>>out), but wondered if there was a reason for the current behaviour. >>> >>>Index: uipc_syscalls.c >>>=== >>>RCS file: /cvsroot/src/sys/kern/uipc_syscalls.c,v >>>retrieving revision 1.187 >>>diff -u -r1.187 uipc_syscalls.c >>>--- uipc_syscalls.c 20 Jun 2017 20:34:49 - 1.187 >>>+++ uipc_syscalls.c 14 Oct 2017 21:33:09 - >>>@@ -1235,7 +1240,7 @@ >>> if ((error = fd_getsock1(SCARG(uap, s), , )) != 0) >>> return (error); >>> >>>-sockopt_init(, SCARG(uap, level), SCARG(uap, name), 0); >>>+sockopt_init(, SCARG(uap, level), SCARG(uap, name), valsize); >>> >>> if (fp->f_flag & FNOSIGPIPE) >>> so->so_options |= SO_NOSIGPIPE; >> >>Has anyone had any other thoughts on this ? > >Yes, if you don't protect valsize against MCLBYTES like setsockopt does, >it can be used to DoS kernel. Otherwise is can be done. Is 2K enough? Yes, 2K would be enough. Have added the check in my tree. >>I'm thinking of adding an extra socket option, maybe SO_PARAM, that you >>would use with setsockopt(2) to set a selector to be used by a following >>getsockopt(2) call. > >This would require some state keeping in the kernel and might not be >reliable. I know, it is the only alternative that I can think of to changing setsockopt(2). >>This wouldn't fix the IPSEC usage without changing userland though. >> >>The alternative is to make both setsockopt(2) and getsockopt(2) >>bidirectional. > >So depending on the contents of the sockval we do something different? FreeBSD does. The calls to copy in or out the data are scattered throughout the network stack. Linux seems to cheat and make use of the fact that their errors are negative to return a positive integer word value from setsockopt(). Their implementation of getsockopt() copies in and out. I would still prefer to change both getsockopt(2) and setsockopt(2). I would make getsockopt always copy in the supplied optval argument. For setsockopt, the protocol code can update sopt_size to indicate the amount of data to be copied back. I would add a line to the PRCO_SETOPT handler in most protocols to set sopt_size to 0 so that nothing was copied back. The syscall definition for setsockopt() would need to be changed to remove the 'const' from the optval argument. >>I haven't checked in the userland implementation of sctp_opt_info(3) >>yet. I took the one from FreeBSD but can modify it to work with a >>different API. > >Ok. This still requires an alternative API to use. I haven't found an example of an OS that hides one in its version of sctp_opt_info().
Re: kernel condvars: how to use?
> Date: Mon, 11 Dec 2017 11:13:16 -0500 (EST) > From: Mouse> > > If read does > > > while (sc->sc_foo < sc->sc_bar) > > cv_wait(>sc_cv, >sc_lock); > > > then whoever changes sc_foo or sc_bar might test whether they changed > > from sc->sc_foo < sc->sc_bar to !(sc->sc_foo < sc->sc_bar) before > > they cv_broadcast. > > ...then you have the same test in two places, leaving them vulnerable > to one but not the other being changed. (Of course, only some changes > will break functionality, but that's a separate issue.) > > Furthermore, they and LPT_RF_WAITING are testing/describing different > things: "is anyone actually waiting?" versus "if anyone does happen to > be waiting, could this make a difference?". Correct. Your responsibility as a user of condvar(9) is to make sure that if you ever make a difference for a waiter, then you wake. That's why I advise either issuing wakeups for any changes to sc_foo/bar, or if you're going to conditionalize them to avoid unnecessary wakeups, then conditionalizing them only on changes to sc_foo/bar that transition from a state that blocks waiters to a state that doesn't block waiters. On the other side, condvar(9) handles skipping some potentially expensive logic if there isn't actually anyone waiting. You shouldn't worry about testing for the presence of waiters unless you actually observe a performance impact. That's why I advise _against_ using LPT_RF_WAITING. (One exception: It's OK to KASSERT(!cv_has_waiters(cv)) in cases when you need to be sure there are no waiters. Aside: Maybe cv_destroy should do this itself anyway.) > > In any case, neither __insn_barrier nor volatile is sufficient in the > > multiprocessor model, > > Not sufficient, true, but necessary (though, depending on the > implementations of things like C and mutexes, possibly implicitly > provided). mutex_enter/exit guarantees the appropriate memory barriers to make the mutual exclusion sections for a single lock object appear globally ordered on all CPUs, in thread context and in interrupt context.
Re: kernel condvars: how to use?
>>> If anything, you should just test rather whether you changed from a >>> state that would block a read to a state in which a read can make >>> progress. >> That's kind-of what LPT_RF_WAITING records. > What I meant is the condition that read actually loops on. [...] Yes, that's what I thought you meant. But... > If read does > while (sc->sc_foo < sc->sc_bar) > cv_wait(>sc_cv, >sc_lock); > then whoever changes sc_foo or sc_bar might test whether they changed > from sc->sc_foo < sc->sc_bar to !(sc->sc_foo < sc->sc_bar) before > they cv_broadcast. ...then you have the same test in two places, leaving them vulnerable to one but not the other being changed. (Of course, only some changes will break functionality, but that's a separate issue.) Furthermore, they and LPT_RF_WAITING are testing/describing different things: "is anyone actually waiting?" versus "if anyone does happen to be waiting, could this make a difference?". (Each one is a useful test under some circumstances; it just seems to me that, while each can be a useful test to apply there, confusing the tested-for conditions is likely to lead to muddy thinking and thence muddy code.) >> Looking at the implementation of __insn_barrier(), it looks like >> overkill: is forces _everything_ out of registers, [etc]. Using >> volatile does so only for the things for which it actually matters > Yes, that's right. On the other hand, volatile also applies to every > access to fields in question, whereas __insn_barrier applies a > barrier only in one specific place. That's why I write struct lpt_softc *sc; volatile struct lpt_softc *vsc; vsc = sc; and use vsc-> when volatile is appropriate and sc-> when not. It would be better if C provided a way to mark a sequence point as volatile-synchronizing, relaxed (as compared to the current rules) so that object accesses are synchronized not as of the next sequence point but rather as of the next _marked_ sequence point. Possibly, instead, a way to name objects and force their values to be synchronized. But we don't have either, and, with no namespace for sequence points (many of which have no explicit textual representation in the source), I'm not sure how doable the former would be. > In any case, neither __insn_barrier nor volatile is sufficient in the > multiprocessor model, Not sufficient, true, but necessary (though, depending on the implementations of things like C and mutexes, possibly implicitly provided). /~\ The ASCII Mouse \ / Ribbon Campaign X Against HTMLmo...@rodents-montreal.org / \ Email! 7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B
Re: Merging ugen into the usb stack
Martin Husemannwrites: > On Mon, Dec 11, 2017 at 08:24:00AM -0500, Greg Troxel wrote: >> I wonder if we should be attaching drivers to endpoints, rather than >> devices. > > This is the drivers decision (we have drivers that do it). > > However, ugen is not able to attach to a single interface itself (currently). Well, I guess I think it's better to allow drivers to attach to single interfaces in general, than to make ugen special and try to integrate it. But I haven't looked at things enough to justify that opinion. signature.asc Description: PGP signature
Re: Merging ugen into the usb stack
On Mon, Dec 11, 2017 at 08:24:00AM -0500, Greg Troxel wrote: > I wonder if we should be attaching drivers to endpoints, rather than > devices. This is the drivers decision (we have drivers that do it). However, ugen is not able to attach to a single interface itself (currently). Martin
Re: Merging ugen into the usb stack
Martin Husemannwrites: > However, it can not work with the way NetBSD uses ugen devices: > > uftdi0 at uhub3 port 2 > uftdi0: FTDI (0x9e88) SheevaPlug JTAGKey FT2232D B (0x9e8f), rev 2.00/5.00, > addr 3 > ucom0 at uftdi0 portno 1 > ucom1 at uftdi0 portno 2 > > I can disable the ucom at uftdi0 portno 1, but there is no way to get a ugen > device to attach there. > > The uftdi chip itself offers a separate interface for each of the ports, > at that layer there should not be a problem. I wonder if we should be attaching drivers to endpoints, rather than devices. It seems fairly common to have multiple endpoints doing different things (among more-than-one-thing devices), rather than multiple devices behind a hub. Letting ugen attach to endpoints that drivers don't deal with seems like an entirely reasonable solution, and it seems to have lower risk of problems from things we can't predict. I also wonder what other operating systems do here (beyond point solutions that they've had to do). signature.asc Description: PGP signature
Re: Merging ugen into the usb stack
> Date: Mon, 11 Dec 2017 12:44:32 +0100 > From: Martin Husemann> > I could hack uftdi to allow attaching a ugen child, but that sounds > like a very special hack. Jared suggested to instead make ugen not a > separate device, but globaly fold it into the usb stack. That would > also solve similar issues we have seen with usb scanner devices and > ulpts. Sounds reasonable as long as we clearly define the semantics of opening the ugen device as it affects any other usage of the device, and as long as Someone^TM takes the time to carefully implement that semantics.
Re: RFC: PERCPU_LIST to fix PR kern/52515
> Date: Mon, 11 Dec 2017 11:55:27 +0100 > From: Edgar Fuß> > > struct psref { > > - LIST_ENTRY(psref) psref_entry; > > + SLIST_ENTRY(psref) psref_entry; > > + /* To keep ABI with LIST_ENTRY(psref) version. */ > > + void*psref_unused0; > Isn't it somewhat fishy to manally pad this, knowing how big a [S]LIST_ENTRY > is? Wouldn't it be cleaner to use a union? Also fine by me, but a little more work to implement. Feel free to add some ctasserts to raise your confidence in this. I don't expect LIST_ENTRY or SLIST_ENTRY to change, but I don't mind assertions to add additional confidence.
Merging ugen into the usb stack
Hi folks, I have a Guruplug JTAG device (and right now need to use it to unbrick one of my Guruplugs after some ... stupid u-boot experiment). The device is supported by openocd, which we have in pkgsrc. The driver used is: http://openocd.org/doc/html/Debug-Adapter-Configuration.html#Debug-Adapter-Configuration (search for ftdi): Interface Driver: ftdi This driver is for adapters using the MPSSE (Multi-Protocol Synchronous Serial Engine) mode built into many FTDI chips, such as the FT2232, FT4232 and FT232H. The driver is using libusb-1.0 in asynchronous mode to talk to the FTDI device, bypassing intermediate libraries like libftdi or D2XX. For the recovery procedure I need access to the serial console at the same time. The uftdi jtag luckily offers that as well. However, it can not work with the way NetBSD uses ugen devices: uftdi0 at uhub3 port 2 uftdi0: FTDI (0x9e88) SheevaPlug JTAGKey FT2232D B (0x9e8f), rev 2.00/5.00, addr 3 ucom0 at uftdi0 portno 1 ucom1 at uftdi0 portno 2 I can disable the ucom at uftdi0 portno 1, but there is no way to get a ugen device to attach there. The uftdi chip itself offers a separate interface for each of the ports, at that layer there should not be a problem. I could hack uftdi to allow attaching a ugen child, but that sounds like a very special hack. Jared suggested to instead make ugen not a separate device, but globaly fold it into the usb stack. That would also solve similar issues we have seen with usb scanner devices and ulpts. So, how should we proceed here? Martin
Re: RFC: PERCPU_LIST to fix PR kern/52515
> struct psref { > - LIST_ENTRY(psref) psref_entry; > + SLIST_ENTRY(psref) psref_entry; > + /* To keep ABI with LIST_ENTRY(psref) version. */ > + void*psref_unused0; Isn't it somewhat fishy to manally pad this, knowing how big a [S]LIST_ENTRY is? Wouldn't it be cleaner to use a union?