Kristian Høgsberg wrote: > On Tue, Aug 18, 2009 at 6:12 PM, Thomas Hellström<tho...@shipmail.org> wrote: > >> Kristian Høgsberg wrote: >> >>> On Tue, Aug 18, 2009 at 5:03 PM, Dave Airlie<airl...@gmail.com> wrote: >>> >>> >>>> On Wed, Aug 19, 2009 at 2:12 AM, Keith Whitwell<kei...@vmware.com> wrote: >>>> >>>> >>>>> I think the bug in question was because somebody (Jon Smirl??) removed >>>>> the empty & apparently unused poll implementation from the drm fd, only to >>>>> discover that the X server was actually polling the fd. >>>>> >>>>> If this code adds to, extends or at least doesn't remove the ability to >>>>> poll the drm fd, it should be fine. >>>>> >>>>> >>>> You have a better memory than me, but thats exactly what happened >>>> alright. >>>> >>>> >>> And in the meantime I did dig up the story of the poll implementation >>> in drm. I'm guessing the trouble started when Jon Smirl removed the >>> drm_poll implementation that returned 0 in >>> 3aef3841d0c8099a97a56a285f0a21d9147405bd. The default behaviour in >>> the case of an fd with missing poll implementation is to assume always >>> readable and writable (DEFAULT_POLLMASK). So by removing the drm_poll >>> function the drm fd went from never selecting readable or writable to >>> always selecting readable and writable. To "fix" this >>> 5e8838fd115879174567c4c2db8ad25331619994 in drm.git and looks like this: >>> >>> +/** No-op. */ >>> +/* This is to deal with older X servers that believe 0 means data is >>> + * available which is not the correct return for a poll function. >>> + * By alternating returns both interfaces are happy. This is fixed >>> + * in newer X servers. >>> + */ >>> +unsigned int drm_poll(struct file *filp, struct poll_table_struct *wait) >>> +{ >>> + static int flip; >>> + if ((flip = !flip)) >>> + return (POLLIN | POLLOUT | POLLRDNORM | POLLWRNORM); >>> + return 0; >>> +} >>> +EXPORT_SYMBOL(drm_poll); >>> >>> which wins the WTF of the day. It was the fixed in >>> b974e2cd683fa798970cd1bdc5e20acfb7a34a9c where it was changed to just >>> return 0 like before. >>> >>> Again, the problem was not that select returns zero, the problem was >>> that select would error with EINVAL. I don't see any error path in >>> do_select() in fs/select.c that could cause EINVAL based on the return >>> value from one of the poll fops implementations. The only two cases I >>> can think of is that either the core select implementation was >>> different back then, for example return EINVAL if a poll fops returns >>> 0 without registering a poll waitqueue or not having a read fops or >>> something. However, that should still be valid and if it errors it >>> shouldn't throw EINVAL since it's not a user error. The other theory >>> is that since the default behaviour in case of no poll fops >>> implementation is to return immediately (or in case of the wtf poll, >>> to return immediately on every other select() call) always with the >>> drm fd readable and writable, it could have triggered an overflow edge >>> case in the poll timeout value computation. Aside from negative nfds, >>> an invalid timeout value is the only condition that can trigger EINVAL >>> according to the man page. >>> >>> That's it, I rest my case ;) >>> Kristian >>> >>> >> Actually, looking at the dri1 code, it installs a SIGIO handler to do >> user-space context switches, and it's waiting for data on the drm file >> descriptor. >> >> Now, when the X server receives a SIGIO no matter from what client, it calls >> select to determine from which sigio'd file descriptor. If the drm file >> descriptor indicates reading would not block, it goes ahead reading data to >> process a context switch, which will of course be bogus data. >> > > Yes, so if we use DRI1 and DRI2 page flipping at the same time we > might have a problem. If we just used DRI1, the drm fd will never > select readable, if we just use DRI2, we never install the SIGIO > handler. > > Looking way back in the drm.git history, I see that there actually > used to be a drm_read() and a drm_poll(). They weren't used for > reading out the contents of the buffer maps, they were for sending > events to userspace. Yes, I've already given up on that usage :) so never mind that comment. I'll resort to using IOCTLS or a ttm char device.
Still I must say I'm quite doubtful of the event mechanism and how it's used in this context. 1) It will not work with dri1. If somebody tries to have dri1 and dri2 enabled at the same time (XvMC, OpenGL) it will break. 2) It puts a requirement on user-space for correct functionality of a drm IOCTL. This means that any future pageflip ioctl user, be it EGL or whatever needs to implement that event parsing mechanism to be sure to work with any driver. 3) Drivers with a decent command submission mechanism, triple buffering or capable hardware will probably do their best to work around it. Given this, shouldn't we try to avoid this? Now? /Thomas ------------------------------------------------------------------------------ Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day trial. Simplify your report design, integration and deployment - and focus on what you do best, core application coding. Discover what's new with Crystal Reports now. http://p.sf.net/sfu/bobj-july -- _______________________________________________ Dri-devel mailing list Dri-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dri-devel