Hello everyone, it's been a while. I have been thinking about something lately and thought I would share some very rough proof-of-concept code for a lock-free single-reader / single-writer ring buffer. It uses only atomic operations while the ring is neither empty nor full, and uses the futex system call to block in those two conditions. I was thinking that something like this could be useful for building a way to communicate with a kernel-based DMA dispatch thread.
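To make the idea concrete, here is a minimal sketch of a ring of this shape. It is not the attached fastpipe code, and every name in it (struct ring, ring_write, ring_read, ring_demo, the two *_waiting flags) is hypothetical. It assumes C11 atomics rather than hand-written x86 operations, carries single 32-bit values instead of whole buffers, and uses an anonymous shared mapping plus fork() as a stand-in for pages shared with the kernel. The waiting flags are one possible way to skip the wake-up system call when the other side is not asleep.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define RING_SLOTS 8u /* hypothetical size; must be a power of two */

struct ring {
    _Atomic uint32_t head;           /* items written so far; stored only by the writer */
    _Atomic uint32_t tail;           /* items read so far; stored only by the reader */
    _Atomic uint32_t writer_waiting; /* writer announced it may sleep on tail */
    _Atomic uint32_t reader_waiting; /* reader announced it may sleep on head */
    uint32_t slot[RING_SLOTS];
};

static_assert(sizeof(_Atomic uint32_t) == 4, "futex words must be 32 bits");

static long futex(_Atomic uint32_t *addr, int op, uint32_t val)
{
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

/* Sleep until *word stops being `seen`. The kernel rechecks *word == seen
 * before sleeping, and callers loop, so early returns are harmless. */
static void wait_change(_Atomic uint32_t *word, uint32_t seen,
                        _Atomic uint32_t *waiting)
{
    atomic_store(waiting, 1);
    futex(word, FUTEX_WAIT, seen);
}

/* Store a new counter value; enter the kernel only if the peer may sleep. */
static void publish(_Atomic uint32_t *word, uint32_t val,
                    _Atomic uint32_t *waiting)
{
    atomic_store(word, val);
    if (atomic_exchange(waiting, 0))
        futex(word, FUTEX_WAKE, 1);
}

static void ring_write(struct ring *r, uint32_t v)
{
    uint32_t h = atomic_load(&r->head), t;
    while (h - (t = atomic_load(&r->tail)) == RING_SLOTS) /* ring is full */
        wait_change(&r->tail, t, &r->writer_waiting);
    r->slot[h % RING_SLOTS] = v;
    publish(&r->head, h + 1, &r->reader_waiting);
}

static uint32_t ring_read(struct ring *r)
{
    uint32_t t = atomic_load(&r->tail), h;
    while ((h = atomic_load(&r->head)) == t) /* ring is empty */
        wait_change(&r->head, h, &r->reader_waiting);
    uint32_t v = r->slot[t % RING_SLOTS];
    publish(&r->tail, t + 1, &r->writer_waiting);
    return v;
}

/* Demo: a forked child writes 1..n through a shared page and the parent
 * returns the sum of what it read (0 if setup fails). */
uint64_t ring_demo(uint32_t n)
{
    /* Anonymous mappings start zeroed, which is a valid empty ring. */
    struct ring *r = mmap(NULL, sizeof *r, PROT_READ | PROT_WRITE,
                          MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (r == MAP_FAILED)
        return 0;
    pid_t pid = fork();
    if (pid < 0) {
        munmap(r, sizeof *r);
        return 0;
    }
    if (pid == 0) {
        for (uint32_t i = 1; i <= n; i++)
            ring_write(r, i);
        _exit(0);
    }
    uint64_t sum = 0;
    for (uint32_t i = 0; i < n; i++)
        sum += ring_read(r);
    waitpid(pid, NULL, 0);
    munmap(r, sizeof *r);
    return sum;
}
```

Calling ring_demo(100000) from a main() should return 5000050000, the sum of 1 through 100000. All atomics use the default sequentially consistent ordering for simplicity; a tuned version would likely relax some of them.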
The basic idea is this: each process calls into the kernel when it initializes the DRI, and the kernel sets up a set of shared pages to use for command dispatch. The producer end is used by the user-space driver, and the consumer end terminates in a kernel thread. This single thread would service all DMA requests for the DRI processes in the system, and could be made more complex if necessary or desired (saving and restoring hardware state between the different processes, handling memory management requests, etc.). The obvious advantages of a design like this are that very few system calls are involved and that some bookkeeping can be offloaded to another CPU core.

Was anything like this explored when you guys did the redesign for DRI2? Has anyone measured what kind of raw performance is possible with some of the current DMA dispatch mechanisms used by the DRI2 drivers?

This code is quite rough and is meant only as a demonstration; no attempt was made to optimize it. It might have bugs and races, but it seems pretty solid in my limited testing. It will probably only run on 32-bit x86, and no attempt was made to make anything else work. I would be interested if some people could compile and test the code I've attached and see what kind of output it produces. I only have access to an older Core 2 laptop running Linux right now, so I wonder how well or badly it would do on a recent, good desktop. I guess this sort of approach might also be useful for any other bulk data transfer between processes (an Xlib transport, maybe) or with the kernel.

It runs three basic tests (write only, read/write, and write/validate, where a Tiger hash is calculated on the buffers as they are read) against each of three buffer configurations (one big 64k buffer, four 16k buffers, and eight 4k buffers). In the numbers that follow, each operation transfers 1024 bytes, and MB/sec means 2^20 bytes per second. Here is the output on my system:

Beginning test : write only, read is a noop (one 64k buffer).
Starting write test
Starting read test
Time 1: 0, Time 2: 5380000
Number of seconds: 5.3800
Number of bytes: 10240000000
Rate (MB/sec): 1815.1719
Number of read operations: 10000000
Number of write operations: 10000000
Number of blocking writes: 92673
Number of blocking reads: 713363
Write block ratio: 0.0093
Read block ratio: 0.0713

Beginning test : read and write. (one 64k buffer)
Starting write test
Starting read test
Time 1: 5380000, Time 2: 9530000
Number of seconds: 4.1500
Number of bytes: 10240000000
Rate (MB/sec): 2353.1627
Number of read operations: 10000000
Number of write operations: 10000000
Number of blocking writes: 91006
Number of blocking reads: 96142
Write block ratio: 0.0091
Read block ratio: 0.0096

Beginning test : data that is written is validated as it is read. (one 64k buffer)
Starting write validate test
Starting read validate test
Matches: 2000000, failures: 0, success ratio: 1.0000
Time 1: 9530000, Time 2: 15830000
Number of seconds: 6.3000
Number of bytes: 2048000000
Rate (MB/sec): 310.0198
Number of read operations: 2000000
Number of write operations: 2000000
Number of blocking writes: 1987287
Number of blocking reads: 93
Write block ratio: 0.9936
Read block ratio: 0.0000

Beginning test : write only, read is a noop (four 16k buffers).
Starting write test
Starting read test
Time 1: 15830000, Time 2: 21200000
Number of seconds: 5.3700
Number of bytes: 10240000000
Rate (MB/sec): 1818.5521
Number of read operations: 10000000
Number of write operations: 10000000
Number of blocking writes: 200612
Number of blocking reads: 541775
Write block ratio: 0.0201
Read block ratio: 0.0542

Beginning test : read and write. (four 16k buffers)
Starting write test
Starting read test
Time 1: 21200000, Time 2: 27460000
Number of seconds: 6.2600
Number of bytes: 10240000000
Rate (MB/sec): 1560.0040
Number of read operations: 10000000
Number of write operations: 10000000
Number of blocking writes: 636012
Number of blocking reads: 349444
Write block ratio: 0.0636
Read block ratio: 0.0349

Beginning test : data that is written is validated as it is read. (four 16k buffers)
Starting write validate test
Starting read validate test
Matches: 2000000, failures: 0, success ratio: 1.0000
Time 1: 27460000, Time 2: 33830000
Number of seconds: 6.3700
Number of bytes: 2048000000
Rate (MB/sec): 306.6130
Number of read operations: 2000000
Number of write operations: 2000000
Number of blocking writes: 1998643
Number of blocking reads: 79
Write block ratio: 0.9993
Read block ratio: 0.0000

Beginning test : write only, read is a noop (eight 4k buffers).
Starting write test
Starting read test
Time 1: 33830000, Time 2: 53320000
Number of seconds: 19.4900
Number of bytes: 10240000000
Rate (MB/sec): 501.0582
Number of read operations: 10000000
Number of write operations: 10000000
Number of blocking writes: 1514672
Number of blocking reads: 5898047
Write block ratio: 0.1515
Read block ratio: 0.5898

Beginning test : read and write. (eight 4k buffers)
Starting write test
Starting read test
Time 1: 53320000, Time 2: 69760000
Number of seconds: 16.4400
Number of bytes: 10240000000
Rate (MB/sec): 594.0161
Number of read operations: 10000000
Number of write operations: 10000000
Number of blocking writes: 3061476
Number of blocking reads: 3021581
Write block ratio: 0.3061
Read block ratio: 0.3022

Beginning test : data that is written is validated as it is read. (eight 4k buffers)
Starting write validate test
Starting read validate test
Matches: 2000000, failures: 0, success ratio: 1.0000
Time 1: 69760000, Time 2: 76320000
Number of seconds: 6.5600
Number of bytes: 2048000000
Rate (MB/sec): 297.7325
Number of read operations: 2000000
Number of write operations: 2000000
Number of blocking writes: 1999559
Number of blocking reads: 203
Write block ratio: 0.9998
Read block ratio: 0.0001

Thanks,
-Jeff
fastpipe.tar.gz
Description: GNU Zip compressed data
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev