Kristian Høgsberg wrote:
> On Fri, Jul 4, 2008 at 4:01 AM, Thomas Hellström
> <[EMAIL PROTECTED]> wrote:
>
>> Kristian,
>>
>> As we're starting to incorporate dri2 in our code, I've run into some
>> major concerns about the sarea and the event mechanisms:
>>
>
> Hi Thomas. Sorry for the late reply, I had a long and busy weekend :)
>

Hi, Kristian!
Hope it was a nice one :)

>> 1) The SAREA is likely to be used by mixed 32-bit / 64-bit clients and
>> servers. (If I understand things correctly, the kernel never accesses
>> the SAREA). However, I'm not sure the sarea structs are safe against
>> this. The layout of the sarea structures must be identical, and one
>> should probably use the fixed-size types (uint64_t, uint32_t etc.) and
>> be very careful with alignment.
>>
>
> Yup, and I believe I've done that, but please let me know if you find
> any struct field in a non-aligned place. I didn't use uint32_t, but
> (unsigned) int is 32 bit everywhere now. I don't have a strong
> opinion about this, we can switch to uint32_t if it's a concern.
>

Hmm, yes, I jumped to premature conclusions here. Perhaps it's better to
move over to the explicitly sized types, but since it seems to be correct
anyway at this point, I think that's a low-priority item.

>> 2) The event buffer mechanism doesn't appear to be safe against buffer
>> wraps. Consider the following code from dri_util.c:
>>
>>    /* Check for wraparound. */
>>    if (pcp && psp->dri2.buffer->prealloc - pdp->dri2.tail >
>>        psp->dri2.buffer->size) {
>>        /* If prealloc overlaps into what we just parsed, the
>>         * server overwrote it and we have to reset our tail
>>         * pointer. */
>>        DRM_UNLOCK(psp->fd, psp->lock, pcp->hHWContext);
>>        (*psp->dri2.loader->reemitDrawableInfo)(pdp, &pdp->dri2.tail,
>>                                                pdp->loaderPrivate);
>>        DRM_LIGHT_LOCK(psp->fd, psp->lock, pcp->hHWContext);
>>    }
>>
>> a) First, the if statement should really be a while statement, since
>> we have no guarantee that the X server won't wrap the event buffer
>> after it emits our drawable info, but before we get the lock in
>> DRM_LIGHT_LOCK().
>>
>
> The SAREA is designed as a lockless data structure, since once we stop
> using the lock, there's never any guarantee that the X server won't
> overwrite the data that we're trying to read.
> The assumption here is that once the X server reemits the drawable
> info and hands us back a pointer to the head of the buffer, there's
> enough time for us to access the info before a new wrap occurs. I
> don't think that's a bad assumption, considering that clip rects
> typically change when windows appear, disappear or are moved around,
> which does limit the frequency of the changes.
>

OK. I was assuming that we'd always need a hardware lock to make sure
that the latest drawable information was consistent with the X server's
view of things.

> To be safe, we can add an overflow check *after* parsing the data out
> of the sarea buffer, but before acting on the new data. That was
> always part of the design, but it's not implemented at this point.
> This check will guarantee consistent behaviour even in the
> post-drm-lock world.
>

Yes, I think we need that, but to be 100% sure, we probably need to copy
the relevant sarea data, validate it, and only then parse it, to make
sure we don't confuse the parser should the data have been overwritten.

>> b) This is the most serious concern. If the client using pdp sleeps
>> for a while, and the X server rushes ahead and wraps the event buffer
>> more than once, pdp->dri2.tail will be completely stale and cannot
>> even be used for the wraparound test, since the outcome will be
>> unpredictable. I think what's needed is a single wrap stamp in the
>> sarea, which is bumped by the X server each time it wraps the event
>> buffer. The client can then compare a saved value against this stamp
>> and get a new tail pointer if there's a mismatch. If the stamp is
>> made 64-bit, there's only a microscopic chance that we get a false
>> stamp match due to stamp wraparound.
>>
>
> The buffer indexes are 32-bit integers but the buffer itself is only
> 32kb. This means that the upper 17 bits count the number of
> wraparounds, so the check stays correct even if the X server wraps
> multiple times before the client detects it.
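To convince myself of the index arithmetic, I sketched it out (the names
are mine, not the real dri_util.c code):

```c
#include <stdint.h>

#define BUF_SIZE (32 * 1024)

/* Toy version of the wraparound check.  prealloc and tail are
 * free-running 32-bit counters into a conceptually infinite event
 * stream; only the low bits index the 32kb buffer.  Unsigned
 * subtraction is modulo 2^32, so the comparison stays correct even
 * when prealloc has wrapped past 4G while tail has not. */
static int overwritten(uint32_t prealloc, uint32_t tail)
{
    return prealloc - tail > BUF_SIZE;
}
```

For example, overwritten(100, 0xFFFFF000u) is 0: the server index has
wrapped past 4G, but the client is only 4196 bytes behind, so its data
is still valid.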
> It can wrap around 128 * 1024 times before we get into the situation
> you describe. Alternatively, think of the DRI2 event buffer as a 4GB
> buffer where we only keep the last 32kb of event data.
>
> As such, the wraparound check is correct, and it even handles the case
> where the X server index wraps around the 4G limit while the client is
> still not wrapped.
>

Still, if this is the case, there are situations where clients are put
to sleep for a long time, and if we don't have a completely foolproof
solution, it doesn't suffice to say that it will be OK in most cases. We
would probably need a mean time between failures on the order of a
system lifetime. Would it be possible to extend the indices to 64 bits
to avoid this ever happening?

>> 3) The code in dri_util.c:
>>
>>    if (psp->dri2.enabled) {
>>        __driParseEvents(pcp, pdp);
>>        __driParseEvents(pcp, prp);
>>
>> On the second call to __driParseEvents(pcp, prp), we may release the
>> hardware lock, which may invalidate the pdp info, and have it use
>> stale buffers / cliprects. One remedy to this is to have a function
>> __driParseEventsTwoDrawables(pcp, pdp, prp) that makes sure both
>> drawables are up-to-date after the last DRM_LIGHT_LOCK. And yes, the
>> DRI1 code is incorrect here too.
>>
>
> Hmm, true. What we can do there is to put a while loop around the two
> calls to __driParseEvents that loops for as long as the pdp and prp
> tail pointers are different from the sarea head pointer. If both
> drawable tails are wrapped, this will cause two round trips to reemit
> the data, but that's unavoidable. Second time around in the loop,
> we'll just parse the newly emitted data and be good to go.
>
> That's just the locked case though. To get the unlocked case working
> there are two ways to go: either 1) add a dri2 protocol request to
> make the X server implement swapbuffers, or 2) add a conditional batch
> buffer ioctl to drm. I like 1) most and I'm planning to implement it
> to see how it works.
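For reference, here's a toy model of the re-parse loop you describe
above (all names are invented: parse_events() stands in for
__driParseEvents(), and the budget counter simulates the server
emitting new events while the lock is dropped):

```c
#include <stdint.h>

struct state { uint32_t head; };  /* shared sarea head pointer */

static void parse_events(struct state *s, uint32_t *tail, int *budget)
{
    /* Consume everything currently in the buffer. */
    *tail = s->head;
    /* Simulate the server sneaking in new events while the lock was
     * dropped, a bounded number of times. */
    if (*budget > 0) {
        (*budget)--;
        s->head += 64;
    }
}

static void parse_both(struct state *s, uint32_t *t1, uint32_t *t2,
                       int *budget)
{
    /* Keep re-parsing until neither drawable is behind the shared
     * head, so both are consistent at the same point in time. */
    do {
        parse_events(s, t1, budget);
        parse_events(s, t2, budget);
    } while (*t1 != s->head || *t2 != s->head);
}
```

The loop terminates because the server can only get ahead a finite
number of times between our lock acquisitions, and each pass catches
both tails up to the head it observed.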
> There are a lot of benefits to doing this in the X server, and since
> we have to post damage events anyway, it's essentially free. 2) is a
> little more tricky and probably not worth it - I know Eric frowns
> whenever I bring this up :) The idea is to have a conditional version
> of the batch buffer ioctls, that on top of all the batch buffer args
> takes a uint32_t value, v, and a uint32_t userspace pointer, p. The
> drm takes the lock, reads the value *p from userspace and, if it
> matches v, submits the batch buffer as usual. If the values don't
> match, it returns EAGAIN. The way we would use this from mesa is to
> pass the drawable tail as v and a pointer to the sarea head as p.
> There needs to be another similar ioctl that the X server can use when
> it updates the sarea and submits blits to move window contents around,
> that asks the drm to, while holding the lock, write a value, v, to a
> userspace location, p, and submit a batch buffer.
>
> So, while a lockless, client-side swapbuffers can be made to work, I
> don't think it's worth it.
>

OK. Is the desire to get rid of the hardware lock as a drawable-info
mutex based on it being too coarse-grained?

> cheers,
> Kristian

Thanks,
Thomas

_______________________________________________
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel