On 16.01.21 04:40, John Hubbard wrote:
> On 1/15/21 11:46 AM, David Hildenbrand wrote:
>> 7) There is no easy way to detect if a page really was pinned: we might
>> have false positives. Further, there is no way to distinguish if it was
>> pinned with FOLL_WRITE or not (R vs R/W). To perform
On 1/15/21 11:46 AM, David Hildenbrand wrote:
7) There is no easy way to detect if a page really was pinned: we might
have false positives. Further, there is no way to distinguish if it was
pinned with FOLL_WRITE or not (R vs R/W). To perform reliable tracking
we most probably would need more
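To make the false-positive concern in point 7 concrete, here is a rough
userspace model, assuming the scheme where a pin adds GUP_PIN_COUNTING_BIAS
to the page refcount and "maybe pinned" is a threshold test on that same
counter. The value 1U << 10 is what I recall from include/linux/mm.h for
kernels of this period; struct model_page and the model_* helpers are
hypothetical names for illustration, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only. The bias value is assumed to match the kernel's. */
#define GUP_PIN_COUNTING_BIAS (1U << 10)

/* Hypothetical stand-in for the refcount field of struct page. */
struct model_page {
	unsigned int refcount;
};

/* An ordinary reference (FOLL_GET style) adds 1. */
static inline void model_get_page(struct model_page *p)
{
	p->refcount += 1;
}

/* A pin (FOLL_PIN style) adds the bias to the same counter. */
static inline void model_pin_page(struct model_page *p)
{
	p->refcount += GUP_PIN_COUNTING_BIAS;
}

/* Dropping a pin subtracts the bias; guard against underflow. */
static inline void model_unpin_page(struct model_page *p)
{
	assert(p->refcount >= GUP_PIN_COUNTING_BIAS);
	p->refcount -= GUP_PIN_COUNTING_BIAS;
}

/*
 * Threshold test: true for any pinned page, but also true for a page
 * that merely has many ordinary references (the false positive).
 * Nothing in the counter records whether the pin was R or R/W.
 */
static inline bool model_maybe_pinned(const struct model_page *p)
{
	return p->refcount >= GUP_PIN_COUNTING_BIAS;
}
```

In this model, a page that was never pinned but has accumulated 1024
ordinary references is indistinguishable from a page with one pin, which
is why the check can only ever mean "maybe pinned".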
On Fri, Jan 15, 2021 at 08:46:48PM +0100, David Hildenbrand wrote:
> Just wild ideas. Most probably that has already been discussed, and most
> probably people figured that it's impossible :)
No, I think it is all fair topics.
There is no API reason for any of this to be limited, but in
>> 7) There is no easy way to detect if a page really was pinned: we might
>> have false positives. Further, there is no way to distinguish if it was
>> pinned with FOLL_WRITE or not (R vs R/W). To perform reliable tracking
>> we most probably would need more counters, which we cannot fit into
>>
On Fri, Jan 15, 2021 at 09:59:23AM +0100, David Hildenbrand wrote:
> AFAIU, a more extreme case is probably VFIO: A VM with VFIO (e.g.,
> passthrough of a PCI device) can essentially be corrupted by "echo 4 >
> /proc/[pid]/clear_refs".
I've been told when doing migration with RDMA the VM's
On 10.01.21 01:44, Andrea Arcangeli wrote:
> Hello Andrew and everyone,
>
> Once we agree that COW page reuse requires full accuracy, the next
> step is to re-apply 17839856fd588f4ab6b789f482ed3ffd7c403e1f and to
> return going in that direction.
After stumbling over the heated discussion
On Sun, Jan 10, 2021 at 11:30:57AM -0800, Linus Torvalds wrote:
> So if you start off with the rule that "I will always COW unless I can
> trivially see I'm the only owner", then I think we have really made
> for a really clear and unambiguous rule.
I must confess that's the major reason that
On Wed, Jan 13, 2021 at 4:56 AM Matthew Wilcox wrote:
>
> Yes, Linus mis-stated it:
Yeah, I got the order wrong.
> ... but as David pointed out, I fixed this in e320d3012d25
.. and I must have seen it, but not really internalized it.
And now that I look at it more closely, I'm actually
On Wed, Jan 13, 2021 at 03:32:32PM +0300, Kirill A. Shutemov wrote:
> On Tue, Jan 12, 2021 at 07:31:07PM -0800, Linus Torvalds wrote:
> > On Tue, Jan 12, 2021 at 6:16 PM Matthew Wilcox wrote:
> > >
> > > The thing about the speculative page cache references is that they can
> > > temporarily bump
On Tue, Jan 12, 2021 at 07:31:07PM -0800, Linus Torvalds wrote:
> On Tue, Jan 12, 2021 at 6:16 PM Matthew Wilcox wrote:
> >
> > The thing about the speculative page cache references is that they can
> > temporarily bump a refcount on a page which _used_ to be in the page
> > cache and has now
On 13.01.21 09:52, David Hildenbrand wrote:
> On 13.01.21 04:31, Linus Torvalds wrote:
>> On Tue, Jan 12, 2021 at 6:16 PM Matthew Wilcox wrote:
>>>
>>> The thing about the speculative page cache references is that they can
>>> temporarily bump a refcount on a page which _used_ to be in the page
On 13.01.21 04:31, Linus Torvalds wrote:
> On Tue, Jan 12, 2021 at 6:16 PM Matthew Wilcox wrote:
>>
>> The thing about the speculative page cache references is that they can
>> temporarily bump a refcount on a page which _used_ to be in the page
>> cache and has now been reallocated as some other
On Tue, Jan 12, 2021 at 6:16 PM Matthew Wilcox wrote:
>
> The thing about the speculative page cache references is that they can
> temporarily bump a refcount on a page which _used_ to be in the page
> cache and has now been reallocated as some other kind of page.
Oh, and thinking about this
On Tue, Jan 12, 2021 at 6:16 PM Matthew Wilcox wrote:
>
> The thing about the speculative page cache references is that they can
> temporarily bump a refcount on a page which _used_ to be in the page
> cache and has now been reallocated as some other kind of page.
Right you are. Yeah, scratch
On Mon, Jan 11, 2021 at 02:18:13PM -0800, Linus Torvalds wrote:
> The whole "optimistic page references through page cache" etc are
> complete non-issues, because the whole point is that we already know
> it's not a page cache page. There is simply no other way to reach that
> page than through
On Mon, Jan 11, 2021 at 02:18:13PM -0800, Linus Torvalds wrote:
> On Mon, Jan 11, 2021 at 11:19 AM Linus Torvalds wrote:
> >
> > On Sun, Jan 10, 2021 at 11:27 PM John Hubbard wrote:
> > > IMHO, a lot of the bits in page _refcount are still being wasted (even
> > > after GUP_PIN_COUNTING_BIAS
On Mon, Jan 11, 2021 at 2:18 PM Linus Torvalds wrote:
>
> On Mon, Jan 11, 2021 at 11:19 AM Linus Torvalds wrote:
> Actually, what I think might be a better model is to actually
> strengthen the rules even more, and get rid of GUP_PIN_COUNTING_BIAS
> entirely.
>
> What we could do is just make
On Mon, Jan 11, 2021 at 11:19 AM Linus Torvalds wrote:
>
> On Sun, Jan 10, 2021 at 11:27 PM John Hubbard wrote:
> > IMHO, a lot of the bits in page _refcount are still being wasted (even
> > after GUP_PIN_COUNTING_BIAS overloading), because it's unlikely that
> > there are many callers of
On Sun, Jan 10, 2021 at 11:27 PM John Hubbard wrote:
>
> There is at least one way to improve this part of it--maybe.
It's problematic..
> IMHO, a lot of the bits in page _refcount are still being wasted (even
> after GUP_PIN_COUNTING_BIAS overloading), because it's unlikely that
> there are
On Mon 11-01-21 12:05:49, Jason Gunthorpe wrote:
> On Sun, Jan 10, 2021 at 11:26:57PM -0800, John Hubbard wrote:
>
> > So:
> >
> > FOLL_PIN: would use DMA_PIN_COUNTING_BIAS to increment page refcount.
> > These are long term pins for dma.
> >
> > FOLL_GET: would use GUP_PIN_COUNTING_BIAS to
On Sun, Jan 10, 2021 at 11:26:57PM -0800, John Hubbard wrote:
> So:
>
> FOLL_PIN: would use DMA_PIN_COUNTING_BIAS to increment page refcount.
> These are long term pins for dma.
>
> FOLL_GET: would use GUP_PIN_COUNTING_BIAS to increment page refcount.
> These are not long term pins.
Do we have
On Sat, Jan 09, 2021 at 09:51:14PM -0500, Andrea Arcangeli wrote:
> Are we spending 32bit in mm_struct atomic_t just to call atomic_set(1)
> on it? Why isn't it a MMF_HAS_PINNED that already can be set
> atomically under mmap_read_lock too? There's bit left free there, we
> didn't run out yet to
On Sun, Jan 10, 2021 at 11:26:57PM -0800, John Hubbard wrote:
> IMHO, a lot of the bits in page _refcount are still being wasted (even
> after GUP_PIN_COUNTING_BIAS overloading), because it's unlikely that
> there are many callers of gup/pup per page. If anyone points out that
> that is wrong,
On 1/10/21 11:30 AM, Linus Torvalds wrote:
On Sat, Jan 9, 2021 at 7:51 PM Linus Torvalds wrote:
Just having a bit in the page flags for "I already made this
exclusive, and fork() will keep it so" is I feel the best option. In a
way, "page is writable" right now _is_ that bit. By definition, if
On Sun, Jan 10, 2021 at 11:30:57AM -0800, Linus Torvalds wrote:
> Notice how this is all both conceptually fairly simple (ie I can
> explain the rules in plain English without really making any complex
> argument) and it is arguably internally fairly self-consistent (ie the
> whole notion of "oh,
On Sat, Jan 9, 2021 at 7:51 PM Linus Torvalds wrote:
>
> COW is about "I'm about to write to this page, and that means I need
> an _exclusive_ page so that I don't write to a page that somebody else
> is using".
So this kind of fundamentally explains why I hate the games we used to
play wrt
On Sat, Jan 9, 2021 at 6:51 PM Andrea Arcangeli wrote:
>
> I just don't see the simplification coming from
> 09854ba94c6aad7886996bfbee2530b3d8a7f4f4. Instead of checking
> page_mapcount above as an optimization, to me it looks much simpler to
> check it in a single place, in do_wp_page, that in
On Sat, Jan 09, 2021 at 05:37:09PM -0800, Linus Torvalds wrote:
> On Sat, Jan 9, 2021 at 5:19 PM Linus Torvalds wrote:
> >
> > And no, I didn't make the UFFDIO_WRITEPROTECT code take the mmap_sem
> > for writing. For whoever wants to look at that, it's
> > mwriteprotect_range() in
Hello Linus,
On Sat, Jan 09, 2021 at 05:19:51PM -0800, Linus Torvalds wrote:
> +#define is_cow_mapping(flags) (((flags) & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE)
> +
> +static inline bool pte_is_pinned(struct vm_area_struct *vma, unsigned long addr, pte_t pte)
> +{
> + struct page
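For readers unfamiliar with the is_cow_mapping() test quoted above, a
small standalone sketch of what the mask comparison accepts and rejects.
The VM_* values are the ones I recall from include/linux/mm.h and should
be treated as assumptions; check_is_cow_mapping_examples() is a
hypothetical helper added only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed flag values, mirroring include/linux/mm.h as I remember it. */
#define VM_WRITE    0x00000002UL
#define VM_SHARED   0x00000008UL
#define VM_MAYWRITE 0x00000020UL

/* Same expression as in the quoted patch. */
#define is_cow_mapping(flags) (((flags) & (VM_SHARED | VM_MAYWRITE)) == VM_MAYWRITE)

/* Hypothetical helper: walks through the interesting flag combinations. */
static inline void check_is_cow_mapping_examples(void)
{
	/* Private mapping that is writable now: COW applies. */
	assert(is_cow_mapping(VM_WRITE | VM_MAYWRITE));

	/* Private mapping that is read-only now but may be made writable. */
	assert(is_cow_mapping(VM_MAYWRITE));

	/* Shared writable mapping: writes go to the shared page, no COW. */
	assert(!is_cow_mapping(VM_SHARED | VM_WRITE | VM_MAYWRITE));

	/* Mapping that can never become writable: nothing to copy. */
	assert(!is_cow_mapping(0UL));
}
```

The point of comparing against VM_MAYWRITE rather than VM_WRITE is that
a private mapping counts as a COW mapping whenever it could become
writable, not only while it currently is.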
On Sat, Jan 9, 2021 at 5:19 PM Linus Torvalds wrote:
>
> And no, I didn't make the UFFDIO_WRITEPROTECT code take the mmap_sem
> for writing. For whoever wants to look at that, it's
> mwriteprotect_range() in mm/userfaultfd.c and the fix is literally to
> turn the read-lock (and unlock) into a
On Sat, Jan 9, 2021 at 4:55 PM Linus Torvalds wrote:
>
> What part of "clear_refs is the _least_ important of the three cases"
> are you not willing to understand?
In fact, I couldn't even turn on that code with my normal config,
because it depends on CONFIG_CHECKPOINT_RESTORE that I didn't even
On Sat, Jan 9, 2021 at 4:44 PM Andrea Arcangeli wrote:
>
> Once we agree that COW page reuse requires full accuracy, [...]
You have completely and utterly ignored every single argument against that.
Instead, you just continue to push your agenda.
The thing is, GUP works fine.
COW works fine.
Hello Andrew and everyone,
Once we agree that COW page reuse requires full accuracy, the next
step is to re-apply 17839856fd588f4ab6b789f482ed3ffd7c403e1f and to
return going in that direction.
Who is going to orthogonally secure vmsplice, Andy, Jason, Jens? Once
vmsplice is secured from taking