Benjamin Herrenschmidt wrote:
>> * objects have rwsem to protect migration.
>> * no_page() does:
>>   - takes that object read sem
>>   - if object is in vram or other non-memory location then do
>>     io_remap_pfn_range() and get a dummy page struct pointer
>>   - else get the struct page of the object page in memory
>>   - release the object read sem and return whatever struct page we got
>> * migration does:
>>   - take that object write sem
>>   - copy the data to the new location
>>   - call unmap_mapping_ranges() for that object
>>   - release the object write sem
>
> Ok, there is one fault in my reasoning: io_remap_pfn_range() isn't
> designed to be used in that context (it's really made to be called with
> the mmap_sem held for writing) and thus doesn't check if the PTE is
> still empty after locking it which is necessary if you are only holding
> the read semaphore.
>
> That means that it's still all possible, but not using
> io_remap_pfn_range(). Best is to provide a specific new function, called
> something like map_one_io_page() or something like that, which does
> something along the lines of
>
>         pgd = pgd_offset(mm, address);
>         pud = pud_alloc(mm, pgd, address);
>         if (!pud)
>                 return VM_FAULT_OOM;
>         pmd = pmd_alloc(mm, pud, address);
>         if (!pmd)
>                 return VM_FAULT_OOM;
>         pte = pte_alloc_map(mm, pmd, address);
>         if (!pte)
>                 return VM_FAULT_OOM;
>         pte = pte_offset_map_lock(mm, pmd, address, &ptl);
>         if (pte_none(*page_table)) {
>                 flush_icache_page(vma, new_page);
>                 entry = mk_pte(new_page, vma->vm_page_prot);
>                 if (write_access)
>                         entry = maybe_mkwrite(pte_mkdirty(entry), vma);
>                 set_pte_at(mm, address, page_table, entry);
>         } else {
>                 page_cache_release(new_page);
>                 goto unlock;
>         }
>         update_mmu_cache(vma, address, entry);
>         lazy_mmu_prot_update(entry);
> unlock:
>         pte_unmap_unlock(pte, ptl);
>
> Note that it's clear that this is to be used exclusively for mapping on
> non real pages and it doesn't handle racing with truncate (concurrent
> unmap_mapping_ranges(), which is fine in our case as we have the object
> semaphore).
>
> We're looking into doing something like that for Cell to not require
> sparsemem anymore and thus not create struct page's for SPE local stores
> and registers which is a real pain...
>
> We should probably move that discussion to linux-mm and/or lkml tho :)
>
I'm finding this an interesting discussion. If it shifts to lkml, for
instance, is there a way to follow *and post* on the thread without
either subscribing to lkml or requiring myself to be on the CC list?

Keith