> I thought to the SMP case - because memory sharing between processes
> may already have given rise to mechanisms we could reuse.
You mean something similar to the shm IPC stuff. This was discussed on the MM
list. The problem is that shm doesn't arbitrate access to these memory
regions. If two processes share a memory region, both can write to the same
area, and what ends up in that region is indeterminate. Usually with shm you
use userland semaphores to serialize access to the shared regions.
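The usual pattern looks something like this (just a sketch; the keys and
sizes are made up for illustration):

/*
 * SysV shm segment shared between processes, with a SysV semaphore
 * used as a userland lock around writes to it.
 */
#include <sys/ipc.h>
#include <sys/shm.h>
#include <sys/sem.h>
#include <string.h>

union semun { int val; struct semid_ds *buf; unsigned short *array; };

static void sem_op(int semid, int op)
{
	struct sembuf sb = { .sem_num = 0, .sem_op = op, .sem_flg = 0 };
	semop(semid, &sb, 1);
}

int main(void)
{
	union semun arg = { .val = 1 };
	int shmid = shmget(0x1234, 4096, IPC_CREAT | 0600);
	int semid = semget(0x1235, 1, IPC_CREAT | 0600);
	char *mem = shmat(shmid, NULL, 0);

	semctl(semid, 0, SETVAL, arg);	/* binary semaphore, initially free */

	sem_op(semid, -1);		/* P: take the lock */
	strcpy(mem, "only one writer at a time");
	sem_op(semid, 1);		/* V: release it */

	shmdt(mem);
	return 0;
}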
> What happens in the SMP case then? If two processes share a memory area?
> Are shared memory mechanisms totally different from those of memory mapping?
Yes. See linux/ipc/shm.c
> I ask this question because, with multiple CPUs and thus multiple L1 caches,
> the MMU should then _also_ handle cache coherence issues... But I guess it
> is the case.
Yes it does. In struct mm_struct there is a mmap_sem semaphore that
serializes the CPUs' access to the mm_struct (memory management) data. It
prevents things like one process doing an mprotect on a region while another
process writes to it and takes a page fault. You don't want things like
this happening. This semaphore works at the VM level. Then there is the
spinlock on the pagetables. It prevents things like testing pte_present(pte)
(this function tests if a page is in memory), then a context switch happening
and that page getting swapped out. Holding the spinlock keeps the page from
being swapped out after it was tested to see if it was present.
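In code the pattern those two locks give you is roughly this (a sketch only;
the field names and the exact semaphore/spinlock primitives vary between
kernel versions):

#include <linux/sched.h>
#include <linux/mm.h>
#include <asm/pgtable.h>

static void check_one_pte(struct mm_struct *mm, unsigned long addr)
{
	pgd_t *pgd;
	pmd_t *pmd;
	pte_t *pte;

	down(&mm->mmap_sem);			/* keep the VM layout stable  */
	spin_lock(&mm->page_table_lock);	/* keep the ptes stable below */

	pgd = pgd_offset(mm, addr);
	if (!pgd_none(*pgd)) {
		pmd = pmd_offset(pgd, addr);
		if (!pmd_none(*pmd)) {
			pte = pte_offset(pmd, addr);
			if (pte_present(*pte)) {
				/* the page cannot be swapped out while
				 * page_table_lock is held */
			}
		}
	}

	spin_unlock(&mm->page_table_lock);
	up(&mm->mmap_sem);
}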
> I do to. There are TWO hardware devices that can access the fb
> memory: the MMU and the graphic accelerator. If letting both
> enabled can lock up the machine, you need to stop one before
> using the other...
No. The MMU only touches the framebuffer if a page has never been
accessed. Once all pages in the framebuffer have been touched, the MMU
has nothing more to do with it. Userland can happily touch the framebuffer and
the accel engine without kernel help. Same for MMIO regions. What you really
have after all the page faults happen is two processes accessing two memory
regions. This is what has to be prevented. The unmap-the-framebuffer
approach clears the framebuffer pages out of the TLB. This way it's back in the
kernel's hands, if you have a custom no_page method that puts a
process to sleep while the accel engine is going. But once that page is
faulted in, the kernel loses control over that page of memory. That's why you
have to unmap it on every access to the framebuffer: to regain control.
Sort of what you are doing is making the kernel go from handling memory
only on page faults to handling it on every access to a memory region. Now you
see how it can get expensive.
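Roughly what I mean by a custom no_page method (only a sketch; the no_page
prototype differs between kernel versions, and accel_busy, accel_wait and
fb_to_page() are hypothetical driver helpers):

#include <linux/mm.h>
#include <linux/sched.h>
#include <linux/wait.h>

static int accel_busy;				/* set while the engine runs */
static DECLARE_WAIT_QUEUE_HEAD(accel_wait);

/* hypothetical helper returning the fb page backing this address */
static struct page *fb_to_page(struct vm_area_struct *vma, unsigned long addr);

static struct page *fbdev_nopage(struct vm_area_struct *vma,
				 unsigned long address, int write)
{
	/* put the faulting process to sleep until the accel engine is idle */
	wait_event(accel_wait, !accel_busy);

	return fb_to_page(vma, address);
}

static struct vm_operations_struct fbdev_vm_ops = {
	.nopage = fbdev_nopage,
};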
> BTW, is it the unmap-ping process that is expensive, or is it the
> subsequent (potential) page fault handling?
It's the removing of all the pages from the TLBs on each CPU that's
expensive. Then you have to flush the cache.
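That part is basically this (sketch only; these helpers and their header
locations have changed prototypes across kernel versions, so treat it as
pseudocode):

#include <linux/mm.h>
#include <asm/pgtable.h>	/* flush helpers live here on older kernels */

static void fbdev_revoke_mapping(struct mm_struct *mm,
				 unsigned long start, unsigned long len)
{
	flush_cache_range(mm, start, start + len);	/* write back / invalidate */
	zap_page_range(mm, start, len);			/* drop the ptes           */
	flush_tlb_range(mm, start, start + len);	/* on SMP this typically
							   means an IPI to every
							   other CPU */
}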
> There is a similar issue that _may_ arise between the 2d and 3d
> accelerated engines. But for them, in a KGI controlled design,
> we can do the arbitration inside the KGI driver. (BTW, this is
> again a statement that supports the idea of kernel-controlled
> access to the card processing units - as opposed to memory
> mapping the whole accel registers set to a userspace program. ;-)
Well, there is nothing wrong with mmapping the accels as long as it is safe:
accessing the MMIO registers will not lock your machine, and the mmapped
accel region doesn't also contain mode-setting registers.
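From the userland side it would look something like this (sketch; the device
node, the offset and the register layout are made up, and a real driver would
only export a window that can't hang the machine or change the mode):

#include <fcntl.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

#define ACCEL_MMIO_OFFSET	0x01000000UL	/* hypothetical mmap offset */
#define ACCEL_MMIO_SIZE		0x1000		/* hypothetical window size */

int main(void)
{
	int fd = open("/dev/fb0", O_RDWR);
	volatile uint32_t *regs;

	regs = mmap(NULL, ACCEL_MMIO_SIZE, PROT_READ | PROT_WRITE,
		    MAP_SHARED, fd, ACCEL_MMIO_OFFSET);
	if (regs != MAP_FAILED) {
		regs[0] = 0x1;		/* poke a (hypothetical) accel register */
		munmap((void *)regs, ACCEL_MMIO_SIZE);
	}

	close(fd);
	return 0;
}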