>>> Where do you set LPT_RF_WAITING?

>> Hm, now that you mention it, I think I don't.

> Note that a flag like this is not normally necessary,
Of course.

> or even particularly helpful -- cv_broadcast already has essentially
> the same optimization internally.

That doesn't surprise me, but it wasn't clear from the manpage - at
least not the 5.2 manpage I had on hand - so I didn't want to depend on
it.  (I don't like to depend on accidents of the implementation, though
of course in a lot of cases I end up doing so anyway.)

> If anything, you should just test rather whether you changed from a
> state that would block a read to a state in which a read can make
> progress.

That's kind-of what LPT_RF_WAITING records.

> In this case, you might test whether sc->rbptr is newly less than
> sc->rbfill.

Yes, that's the test for whether a read can make progress...now.  But
wiring that into the wakeup code is a case of keeping the same
information ("when can a read progress?") in two different places and
thus vulnerable to getting out of sync.  LPT_RF_WAITING is basically a
way for the read code to communicate that status to its paired wakeup.

> But I'd just skip the test altogether and always broadcast whenever
> sc->rbfill changes.

I may change it to do that....

>>> [...], and definitely avoid volatile voodoo.

>> volatile is hardly voodoo.  It is required by C for data accessed by
>> both the main line and the interrupt line and mutated by at least
>> one of them.

(I didn't explicitly say, but this is a case of "necessary but likely
not sufficient".  volatile is needed from a C perspective, but there
may be other things, like interprocessor cache protocols, that need
attention as well.)

> However, we still normally use __insn_barrier to enforce ordering
> where it is important in those cases where we aren't using mutexes or
> splfoo for some reason.

What are the semantics of __insn_barrier (ie, as an API definition - I
can certainly look up the semantics of the implementation)?  "man -k
barrier" shows me the bus_space stuff, mn, membar_*, and a handful of
pthread_* pages, nothing more.
> In the multiprocessor model, volatile is basically always red
> herring.

Looking at the implementation of __insn_barrier(), it looks like
overkill: it forces _everything_ out of registers, stack temporaries,
and the like, into wherever it nominally lives.  Using volatile does so
only for the things for which it actually matters (well, when used
correctly, at least; like anything else, volatile can be misused).

>> (If the mutex implementation uses hooks into the C implementation
>> such that volatile is guaranteed unnecessary, fine, but that really
>> needs to be clearly spelled out in the documentation - and probably
>> would be very fragile, as a new compiler, or a new compiler version,
>> could break it.)

It also has the problem sketched above, in that it sledgehammers
_everything_ out of temporaries, rather than confining the attention to
just the things that actually need it.

>>> Can you write down all the states you expect this thing to be in,
>>> and the state transitions that each action [...]

>> I think so, yes: [...]

> That's helpful, but I should have phrased my question a little more
> clearly: Can you write down the states that the lpt_softc data
> structure can be in?  Given a diagram of something like this: [...]

Ah!  sc_state is easy: if RAW is not set, the only piece of my new code
that should ever run is the tests in lptopen() to see if we're opening
the raw device, and my other new stuff in the softc is all don't-care -
except for the condvar and mutex, which are set up at attach time and
thus aren't really don't-care, though they should never be touched when
RAW is clear.  RAW set without OPEN also set, or with any of the other
bits set, is a can't-happen.

When open raw (sc_state == OPEN|RAW), the semantics of the pieces are:

rawbuf buffers values read from the hardware but not yet passed back to
userland.  Elements subscripted < rbptr or >= rbfill are don't-care.
rbfill being <0 or >=LPT_RAWBUFSIZE is a can't-happen.  So is rbptr
being <0 or >rbfill.
laststat is used to detect changed status bits.  Except for its
initialization, it is never touched except by the callout ticker.
LPT_NO_STAT is (by intent, at least) a value that cannot match anything
read from lpt_status.  (The code is, admittedly, broken if unsigned int
and unsigned char are the same size; while permitted by C, I think
that's unlikely enough to be ignored on any hardware where this driver
is useful.  I really should comment it and/or switch to a separate
flag.)

> What are all the possible classes of tuples of values that sc_state,
> rbptr, rbfill, &c., can have?  Each class is a discrete state.

Sounds as though the states you're looking for are

(1) Buffer empty: rbptr == rbfill
(2) Something buffered: rbptr < rbfill

with transitions (elaborated below)

(1)->(1) or (2)->(2): lpt_raw_read_tick finds status bits unchanged.
(2)->(2): lpt_raw_read consumes some but not all buffered samples.
(1)->(2): lpt_raw_read_tick finds the status bits changed.
(2)->(1): lpt_raw_read drains the last buffered sample.

> In each state, what can read do, and what state does it transition
> to?

read blocks in state (1).  In state (2), it either drains it all,
returning with what it found and leaving state (1) behind it, or it
doesn't drain it all, in which case it leaves state (2) behind it (with
rbptr advanced, but not all the way to rbfill).

> In which states must read block?

(1).

> In each state, what can read_tick do, and what state does it
> transition to?

About the only notable thing not mentioned above is that it can
transition from (2) to (2) when overflowing the buffer; it ends up
still in state (2), but with the formerly-buffered data thrown away and
only the one new sample buffered afterwards.  (Speaking of which, that
code needs a little further attention, which I've now given it in my
dev version, to keep it from overflowing gratuitously.)

> Which state transitions does read_tick cause that may unblock read?

When it actually moves from (1) to (2).
>> [...], I would expect the use of a too-high IPL when creating a
>> mutex to be a relatively benign mistake.

> Just from the perspective of the program's function, you're mostly
> right.

Good, then I guess I am getting something approaching an understanding
of this stuff.

> From the perspective of performance, [...]

Oh, certainly.  I'm still at the "first make it work" stage.  "Then
make it better" is off in the future. :-)

> But from the perspective of the next human to read your code, [using
> splhigh() is] expressing a confusing message to them: `This code MUST
> exclude ALL interrupts on the current CPU!'

There's actually another possible semantic behind it, which
unfortunately there is no way to express short of a comment (which
usually isn't used): "this code must lock against itself, and we don't
know what the highest spl level it may be called at is".  For example,
my current development version includes

#define DLRINGSIZE 65536
static volatile int dlring[DLRINGSIZE];
static volatile int dlh = 0;
static volatile int dlt = DLRINGSIZE - 1;

static void record_line(int lno)
{
	int s;
	int h;

	s = splhigh();
	h = dlh;
	dlt = h;
	dlring[h] = lno;
	h++;
	if (h >= DLRINGSIZE)
		h = 0;
	dlring[h] = -1;
	dlh = h;
	splx(s);
}

#define DLINE() record_line(__LINE__)

which is _exactly_ that use of splhigh(); I _think_ it will never run
at anything above splsoftclock(), but I'm not sure enough to change it,
especially since that's debugging code.  (I suspect it arguably should
use an SPL_HIGH spin mutex, though you've been saying that in its
current state this driver runs giantlocked, so there is little
immediate practical difference.)

> ddb's lock == kernel's mutex_t

Good.  Thanks.

/~\ The ASCII				  Mouse
\ / Ribbon Campaign
 X  Against HTML		mo...@rodents-montreal.org
/ \ Email!	     7D C8 61 52 5D E7 2D 39 4E F1 31 3E E8 B3 27 4B