2010/1/31 Jerome Glisse <gli...@freedesktop.org>:
>> <snip>
>> Eventually, strace log is flooded with
>> ioctl(4, 0xc0106451, 0x60000fffffd530f8) = 0
>> roughly at the time the CPU load increases. This is consistent with
>> what is recorded in syslog:
>> Jan 29 21:16:03 longspeak kernel: [  318.611783] [drm:drm_ioctl],
>> pid=2426, cmd=0xc0106451, nr=0x51, dev 0xe200, auth=1
>> Jan 29 21:16:03 longspeak kernel: [  318.611789]
>> [drm:radeon_cp_getparam], pid=2426
>> repeated several tens of thousands of times, where 2426 is the glxgears PID.
>> <snip>
> You are hitting a GPU lockup, which results in userspace
> retrying the same ioctl over and over again, which completely
> eats all the CPU.

Thank you for clarifying. Does a GPU lockup mean that this problem is
specific to my current hardware configuration? If I try another
graphics adapter (choices are scarce on ia64), is it possible that I
would see no GPU lockup at all, or a different one?
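
For reference, the ioctl number that floods the strace log can be split
into its fields by hand. The sketch below assumes the generic Linux _IOC
layout (8-bit nr, 8-bit type, 14-bit size, 2-bit direction); the function
name and dictionary keys are only illustrative. It gives nr=0x51, which
matches the nr=0x51 shown in syslog, i.e. index 0x11 past the start of the
driver-specific DRM range (0x40).

```python
# Split a Linux ioctl request number into its fields, assuming the generic
# _IOC layout from asm-generic/ioctl.h (8-bit nr, 8-bit type, 14-bit size,
# 2-bit direction). The function and key names are illustrative only.

DIRECTIONS = {0: "none", 1: "write", 2: "read", 3: "read/write"}
DRM_COMMAND_BASE = 0x40  # first driver-specific DRM ioctl number


def decode_ioctl(cmd):
    """Return the nr, type, size and direction fields of an ioctl number."""
    return {
        "nr": cmd & 0xFF,
        "type": chr((cmd >> 8) & 0xFF),
        "size": (cmd >> 16) & 0x3FFF,
        "dir": DIRECTIONS[(cmd >> 30) & 0x3],
    }


fields = decode_ioctl(0xC0106451)
print(fields)
# Offset of this ioctl inside the driver-specific range.
print(hex(fields["nr"] - DRM_COMMAND_BASE))
```

So the repeated call is a read/write ioctl of type 'd' carrying a 16-byte
argument, which is consistent with the radeon_cp_getparam line in syslog.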

> There is no easy way to debug a GPU lockup, and no way at
> all other than staring at the GPU command stream or making
> wild guesses and testing things randomly.

Just to clarify: I imagine that a GPU command stream is specific to a
given GPU/driver. Does it mean that the commands sent to the GPU are
not the same on different Linux platforms (e.g. ia64/r300 vs.
x86/r300)?

About GPU commands: is this something I can read in the various
logfiles? Is there some kind of command generator that can send a
specific command or command stream to the GPU, in order to help
determine which one is faulty?

I don't know if these are the commands sent to the GPU but, looking
again at the strace output of glxgears that I recorded, I get:
futex(0x60000fffffd53420,
FUTEX_WAIT_BITSET_PRIVATE|FUTEX_CLOCK_REALTIME, 1, NULL,
200000000004d1e8) = -1 EAGAIN (Resource temporarily unavailable)
and numerous
read(3, 0x60000000000093e4, 4096)       = -1 EAGAIN (Resource
temporarily unavailable)
Should the return value of read() be equal to the number of blocks (I
imagine) passed as the third argument? In that case, before getting
the EAGAIN error when trying to read blocks, I get the following
sequence, which seems to shift something:
writev(3, [{"b\0\5\0\f\0\0\0BIG-REQUESTS", 20}], 1) = 20
poll([{fd=3, events=POLLIN}], 1, -1)    = 1 ([{fd=3, revents=POLLIN}])
read(3, "\1\0\1\0\0\0\0\0\1\216\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0",
4096) = 32
poll([{fd=3, events=POLLIN|POLLOUT}], 1, -1) = 1 ([{fd=3, revents=POLLOUT}])
writev(3, [{"\216\0\1\0", 4}], 1)       = 4
poll([{fd=3, events=POLLIN}], 1, -1)    = 1 ([{fd=3, revents=POLLIN}])
read(3, "\1\0\2\0\0\0\0\0\377\377?\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0",
4096) = 32
read(3, 0x60000000000093e4, 4096)       = -1 EAGAIN (Resource
temporarily unavailable)
poll([{fd=3, events=POLLIN|POLLOUT}], 1, -1) = 1 ([{fd=3, revents=POLLOUT}])
From there, all subsequent pairs of read() calls fail.
By contrast, in the (old) strace glxgears excerpt posted here
(http://ubuntuforums.org/showthread.php?t=75007), the read calls seem
to always succeed.
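
As a cross-check, the bytes in the writev()/read() sequence above can be
unpacked as X11 core protocol messages: a QueryExtension request
(opcode 98) for "BIG-REQUESTS", its reply, and the BigReqEnable reply.
This is only a sketch. It assumes the connection is little-endian, the
variable names are mine, and the replies are rebuilt from the strace
output (the trailing zero bytes are written as bytes(20)).

```python
import struct

# Bytes copied from the strace excerpt; little-endian byte order is assumed.
query_ext = b"b\0\5\0\f\0\0\0BIG-REQUESTS"
query_reply = b"\1\0\1\0\0\0\0\0\1\216\0\0" + bytes(20)
enable_req = b"\216\0\1\0"
enable_reply = b"\1\0\2\0\0\0\0\0\377\377?\0" + bytes(20)

# QueryExtension request: opcode, pad, length in 4-byte units, name length.
opcode, _, req_len, name_len = struct.unpack_from("<BBHH", query_ext, 0)
name = query_ext[8:8 + name_len].decode("ascii")

# QueryExtension reply: type, pad, sequence, reply length,
# present flag, major opcode, first event, first error.
(kind, _, seq, reply_len,
 present, major_opcode, first_event, first_error) = struct.unpack_from(
    "<BBHIBBBB", query_reply, 0)

# BigReqEnable request: major opcode, minor opcode, length.
en_major, en_minor, en_len = struct.unpack_from("<BBH", enable_req, 0)

# BigReqEnable reply: type, pad, sequence, reply length, max request length.
_, _, en_seq, _, max_request_len = struct.unpack_from("<BBHII", enable_reply, 0)

print(opcode, req_len * 4, name)
print(present, major_opcode)
print(en_major, en_minor, en_seq, max_request_len)
```

If this reading is right, fd 3 is the connection to the X server, and this
exchange is ordinary Xlib start-up traffic rather than GPU commands.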

Could this be a starting point or not at all?
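
On the read() question, a small experiment may help. The sketch below is
not taken from glxgears or Xlib; it uses a local socket pair as a
stand-in for fd 3 and a hypothetical helper name. It shows that read()
returns the number of bytes actually available (up to the count passed as
the third argument), and that on a non-blocking descriptor with nothing
left to read it fails with EAGAIN.

```python
import errno
import os
import socket


def demo_nonblocking_read():
    """Read once with data pending, then once with nothing left to read."""
    reader, writer = socket.socketpair()
    reader.setblocking(False)  # same effect as O_NONBLOCK on the descriptor
    try:
        writer.sendall(b"\0" * 32)  # a 32-byte message, like an X11 reply
        first = os.read(reader.fileno(), 4096)  # asks for up to 4096 bytes
        try:
            os.read(reader.fileno(), 4096)  # nothing pending now
            second_errno = None
        except BlockingIOError as exc:
            second_errno = exc.errno
        return len(first), second_errno
    finally:
        reader.close()
        writer.close()


nbytes, err = demo_nonblocking_read()
print(nbytes, errno.errorcode.get(err))
```

Seen this way, a read() returning 32 followed by one failing with EAGAIN
would simply mean that the whole reply was consumed and no further data
had arrived yet.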

Émeric

--
_______________________________________________
Dri-devel mailing list
Dri-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dri-devel
