On 29.6.2012 5:20, Adam Hraska wrote:
> I was trying to understand the rationale behind accessing
> page tables without a lock from TLB miss handlers without
> much luck. Could anyone point me in the right direction?

There are several reasons that motivated me to drop the explicit
synchronization in TLB-miss handlers:

- unification between various platforms (hw-walked page tables vs.
  TLB-only)

  There is no way to lock the page tables before the hardware walks
  them while resolving a TLB miss on arm32, ia32 and amd64. Therefore,
  for consistency, it doesn't make much sense to require
  synchronization in the software TLB-miss handler on the rest of the
  architectures. The only option left is to arrange for the lookup and
  insert operations to cooperate safely without locking.

- need to be able to look up a non-identity kernel mapping when in
  interrupt context

  This is predominantly necessitated by the fact that it is a very bad
  idea to block in an interrupt handler or while holding a spinlock.
  In HelenOS, an interrupt may arrive when THREAD is NULL or when the
  kernel holds a spinlock (but not an IRQ spinlock). Our interrupt
  handlers often touch kernel non-identity mappings in order to access
  I/O registers. On some platforms, resolving such an access goes
  through a TLB-miss handler. It would therefore be disastrous if the
  kernel decided to block in such an interrupt handler.

> The only reason I could come up with is that
> page_mapping_insert() may itself block. Examining
> pt_mapping_insert(), this really seems to be the case - it
> calls frame_alloc() without FRAME_ATOMIC. As a result, if
> there is not enough memory, frame_alloc() in
> page_mapping_insert() will sleep with the page table lock
> held until more memory is available. That would in turn
> block TLB miss handlers attempting to access the table with
> the same lock (in case of the tree-based page table it is
> protected by locking the address space). In other words,
> until more memory is freed, threads of the affected AS would
> only be able to access TLB cached pages without blocking,
> which is definitely a problem.

This is not such a big problem though.
If there is not enough memory for one of the threads in this address
space to proceed (despite all the efforts to reserve the memory
beforehand), it makes little difference if the other threads in the
same address space are held up as well.

> The problem is exacerbated in case of a global page hash
> table. The hash table uses a single mutex for the entire
> system. Therefore, sleeping with the lock held would
> effectively deadlock the system (but it uses FRAME_ATOMIC).

This is indeed a more deadly case than the previous one (FRAME_ATOMIC
should be changed to a blocking allocation here anyway). I agree that
it was possible for the system to lock up this way before the changes
which made page_mapping_find() lock-free.

Note that the code may, and will, contain other similar issues as
well. In general, we are trying to fight the out-of-memory deadlocks
by using memory reservations, but there are still cases where we are
not 100% thorough or efficient. For example, various syscalls use
blocking allocations for some kernel objects (for which there is no
reservation) instead of returning a failure when memory is not
available.

> Are there any other more fundamental reasons for allowing
> tlb miss handlers to access the table without a lock?

Yes, already described above :-)

> In earlier communications, Jakub mentioned:
> "This has several reasons, such as consistency with the
> behavior on platforms with hardware-walked page tables
> and the requirements of our non-identity kernel memory."
>
> I am not sure how non-identity kernel memory (is that
> the same as high mem?) pertains to page table locking.

Already explained above. The I/O registers are often accessed using a
kernel non-identity mapping from an interrupt context.

> Bugs
> ----
> I am afraid traversing the page tables without proper
> locking exposes us to several nasty time-dependent bugs
> if page_mapping_find() is not coded with extreme care.
> A glance at the hash-based as well as tree-based page
> table reveals that we would at a minimum have to insert
> a number of memory barriers into the code (check out
> ht_mapping_insert() and pt_mapping_insert()). Otherwise,
> a TLB miss handler might e.g. see a new page table
> level/node before it is fully initialized (the compiler or
> cpu may reorder writes and loads). Or have I forgotten to
> consider something?

In case of ht_mapping_insert(), the need for memory barriers is
questionable because the TLB-miss exception will have a serializing
effect, IMO. On the other hand, I can definitely see your point in
case of pt_mapping_insert(). It looks like we will need to add some
barriers between the memsetb() calls and the respective
SET_PTLn_ADDRESS() calls.

Jakub
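
The ordering concern in pt_mapping_insert() can be illustrated with a
minimal user-space sketch. It is not HelenOS code: the two-level
table, the names mapping_insert() and mapping_find(), and the use of
C11 atomics in place of the kernel's own barrier primitives are all
assumptions made for illustration. The writer initializes a new
lower-level table and only then publishes its address with a release
store; the lock-free reader pairs that with an acquire load, so under
the C11 memory model it does not observe the new level before it is
initialized.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

#define PT_ENTRIES 512  /* hypothetical number of entries per level */

/* Hypothetical leaf level: frame number per page, 0 means "not mapped". */
typedef struct {
    atomic_ulong frame[PT_ENTRIES];
} leaf_t;

/* Hypothetical upper level: each slot points to a leaf or is NULL. */
typedef struct {
    _Atomic(leaf_t *) slot[PT_ENTRIES];
} upper_t;

/*
 * Writer side. Assumed to run with the page table lock held
 * (the lock itself is not shown). Returns 0 on success, -1 on
 * allocation failure.
 */
int mapping_insert(upper_t *up, size_t hi, size_t lo, unsigned long frame)
{
    assert(hi < PT_ENTRIES && lo < PT_ENTRIES);

    leaf_t *leaf = atomic_load_explicit(&up->slot[hi], memory_order_relaxed);
    if (leaf == NULL) {
        leaf = malloc(sizeof(*leaf));
        if (leaf == NULL)
            return -1;

        /* Stands in for memsetb(): clear the not-yet-visible level. */
        for (size_t i = 0; i < PT_ENTRIES; i++)
            atomic_init(&leaf->frame[i], 0);
        atomic_store_explicit(&leaf->frame[lo], frame, memory_order_relaxed);

        /*
         * Stands in for the barrier plus SET_PTLn_ADDRESS(): the
         * release store orders all writes above before the pointer
         * becomes visible to readers.
         */
        atomic_store_explicit(&up->slot[hi], leaf, memory_order_release);
        return 0;
    }

    /* The level is already published; publish just the one entry. */
    atomic_store_explicit(&leaf->frame[lo], frame, memory_order_release);
    return 0;
}

/* Reader side: no lock taken, as in a software TLB-miss handler. */
unsigned long mapping_find(upper_t *up, size_t hi, size_t lo)
{
    assert(hi < PT_ENTRIES && lo < PT_ENTRIES);

    /* The acquire load pairs with the writer's release store. */
    leaf_t *leaf = atomic_load_explicit(&up->slot[hi], memory_order_acquire);
    if (leaf == NULL)
        return 0;

    return atomic_load_explicit(&leaf->frame[lo], memory_order_acquire);
}
```

On platforms where the hardware walks the tables, the walker is not
bound by the C memory model, so real kernel code would need the
architecture's own write barrier at the publish step. The sketch only
shows where that barrier belongs relative to the initialization.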
