Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-10 Thread Michal Hocko
On Tue 08-08-17 19:15:36, Jerome Glisse wrote:
> On Tue, Aug 08, 2017 at 03:59:36PM +0300, Igor Stoppa wrote:
> > On 07/08/17 22:12, Jerome Glisse wrote:
> > > On Mon, Aug 07, 2017 at 05:13:00PM +0300, Igor Stoppa wrote:
> > 
> > [...]
> > 
> > >> I have an updated version of the old proposal:
> > >>
> > >> * put a magic number in the private field, during initialization of
> > >> pmalloc pages
> > >>
> > >> * during hardened usercopy verification, when I have to assess if a page
> > >> is of pmalloc type, compare the private field against the magic number
> > >>
> > >> * if and only if the private field matches the magic number, then invoke
> > >> find_vm_area(), so that the slowness affects only a possibly limited
> > >> amount of false positives.
> > > 
> > > This all sounds good to me.
> > 
> > ok, I still have one doubt wrt defining the flag.
> > Where should I do it?
> > 
> > vmalloc.h has the following:
> > 
> > /* bits in flags of vmalloc's vm_struct below */
> > #define VM_IOREMAP  0x0001  /* ioremap() and friends
> > */
> > #define VM_ALLOC0x0002  /* vmalloc() */
> > #define VM_MAP  0x0004  /* vmap()ed pages */
> > #define VM_USERMAP  0x0008  /* suitable for
> >remap_vmalloc_range
> > */
> > #define VM_UNINITIALIZED0x0020  /* vm_struct is not
> >fully initialized */
> > #define VM_NO_GUARD 0x0040  /* don't add guard page
> > */
> > #define VM_KASAN0x0080  /* has allocated kasan
> > shadow memory */
> > /* bits [20..32] reserved for arch specific ioremap internals */
> > 
> > 
> > 
> > I am tempted to add
> > 
> > #define VM_PMALLOC  0x0100
> > 
> > But would it be acceptable, to mention pmalloc into vmalloc?
> > 
> > Should I name it VM_PRIVATE bit, instead?
> > 
> > Using VM_PRIVATE would avoid contaminating vmalloc with something that
> > depends on it (like VM_PMALLOC would do).
> > 
> > But using VM_PRIVATE will likely add tracking issues, if someone else
> > wants to use the same bit and it's not clear who is the user, if any.
> 
> VM_PMALLOC sounds fine to me also adding a comment there pointing to
> pmalloc documentation would be a good thing to do. The above are flags
> that are use only inside vmalloc context and so there is no issue
> here of conflicting with other potential user.

Yes I agree. VM_PRIVATE just calls for the issues you are dealing with
at struct page level where you simply do not know who might be (ab)using
mapping and what not because the naming is just too generic...
-- 
Michal Hocko
SUSE Labs


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-10 Thread Michal Hocko
On Tue 08-08-17 19:15:36, Jerome Glisse wrote:
> On Tue, Aug 08, 2017 at 03:59:36PM +0300, Igor Stoppa wrote:
> > On 07/08/17 22:12, Jerome Glisse wrote:
> > > On Mon, Aug 07, 2017 at 05:13:00PM +0300, Igor Stoppa wrote:
> > 
> > [...]
> > 
> > >> I have an updated version of the old proposal:
> > >>
> > >> * put a magic number in the private field, during initialization of
> > >> pmalloc pages
> > >>
> > >> * during hardened usercopy verification, when I have to assess if a page
> > >> is of pmalloc type, compare the private field against the magic number
> > >>
> > >> * if and only if the private field matches the magic number, then invoke
> > >> find_vm_area(), so that the slowness affects only a possibly limited
> > >> amount of false positives.
> > > 
> > > This all sounds good to me.
> > 
> > ok, I still have one doubt wrt defining the flag.
> > Where should I do it?
> > 
> > vmalloc.h has the following:
> > 
> > /* bits in flags of vmalloc's vm_struct below */
> > #define VM_IOREMAP  0x0001  /* ioremap() and friends
> > */
> > #define VM_ALLOC0x0002  /* vmalloc() */
> > #define VM_MAP  0x0004  /* vmap()ed pages */
> > #define VM_USERMAP  0x0008  /* suitable for
> >remap_vmalloc_range
> > */
> > #define VM_UNINITIALIZED0x0020  /* vm_struct is not
> >fully initialized */
> > #define VM_NO_GUARD 0x0040  /* don't add guard page
> > */
> > #define VM_KASAN0x0080  /* has allocated kasan
> > shadow memory */
> > /* bits [20..32] reserved for arch specific ioremap internals */
> > 
> > 
> > 
> > I am tempted to add
> > 
> > #define VM_PMALLOC  0x0100
> > 
> > But would it be acceptable, to mention pmalloc into vmalloc?
> > 
> > Should I name it VM_PRIVATE bit, instead?
> > 
> > Using VM_PRIVATE would avoid contaminating vmalloc with something that
> > depends on it (like VM_PMALLOC would do).
> > 
> > But using VM_PRIVATE will likely add tracking issues, if someone else
> > wants to use the same bit and it's not clear who is the user, if any.
> 
> VM_PMALLOC sounds fine to me also adding a comment there pointing to
> pmalloc documentation would be a good thing to do. The above are flags
> that are use only inside vmalloc context and so there is no issue
> here of conflicting with other potential user.

Yes I agree. VM_PRIVATE just calls for the issues you are dealing with
at struct page level where you simply do not know who might be (ab)using
mapping and what not because the naming is just too generic...
-- 
Michal Hocko
SUSE Labs


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-09 Thread Igor Stoppa


On 09/08/17 02:15, Jerome Glisse wrote:
> On Tue, Aug 08, 2017 at 03:59:36PM +0300, Igor Stoppa wrote:

[...]

>> I am tempted to add
>>
>> #define VM_PMALLOC   0x0100

[...]

> VM_PMALLOC sounds fine to me also adding a comment there pointing to
> pmalloc documentation would be a good thing to do. The above are flags
> that are use only inside vmalloc context and so there is no issue
> here of conflicting with other potential user.

ok, will do

>>
>> Unless it's acceptable to check the private field in the page struct.
>> It would bear the pmalloc magic number.
> 
> I thought you wanted to do:
>   check struct page mapping field
>   check vmap->flags for VM_PMALLOC
> 
> bool is_pmalloc(unsigned long addr)
> {
> struct page *page;
> struct vm_struct *vm_struct;
> 
> if (!is_vmalloc_addr(addr))
> return false;
> page = vmalloc_to_page(addr);
> if (!page)
> return false;
> if (page->mapping != pmalloc_magic_key)

page->private  ?
I thought mapping would not work in the cases you mentioned?

> return false;
> 
> vm_struct = find_vm_area(addr);
> if (!vm_struct)
> return false;
> 
> return vm_struct->flags & VM_PMALLOC;
> }
> 
> Did you change your plan ?

No, the code I have is almost 1:1 what you wrote.
Apart from mapping <=> private

In my previous mail I referred to page->private.

Maybe I was not very clear in what I wrote, but I'm almost 100% aligned
with your snippet.

>> I'm thinking to use a pointer to one of pmalloc data items, as signature.
> 
> What ever is easier for you. Note that dereferencing such pointer before
> asserting this is really a pmalloc page would be hazardous.

Yes, it's not even needed in this scenario.
It was just a way to ensure that it would be a value that cannot be come
out accidentally as pointer value, since it is already taken.

--
igor


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-09 Thread Igor Stoppa


On 09/08/17 02:15, Jerome Glisse wrote:
> On Tue, Aug 08, 2017 at 03:59:36PM +0300, Igor Stoppa wrote:

[...]

>> I am tempted to add
>>
>> #define VM_PMALLOC   0x0100

[...]

> VM_PMALLOC sounds fine to me also adding a comment there pointing to
> pmalloc documentation would be a good thing to do. The above are flags
> that are use only inside vmalloc context and so there is no issue
> here of conflicting with other potential user.

ok, will do

>>
>> Unless it's acceptable to check the private field in the page struct.
>> It would bear the pmalloc magic number.
> 
> I thought you wanted to do:
>   check struct page mapping field
>   check vmap->flags for VM_PMALLOC
> 
> bool is_pmalloc(unsigned long addr)
> {
> struct page *page;
> struct vm_struct *vm_struct;
> 
> if (!is_vmalloc_addr(addr))
> return false;
> page = vmalloc_to_page(addr);
> if (!page)
> return false;
> if (page->mapping != pmalloc_magic_key)

page->private  ?
I thought mapping would not work in the cases you mentioned?

> return false;
> 
> vm_struct = find_vm_area(addr);
> if (!vm_struct)
> return false;
> 
> return vm_struct->flags & VM_PMALLOC;
> }
> 
> Did you change your plan ?

No, the code I have is almost 1:1 what you wrote.
Apart from mapping <=> private

In my previous mail I referred to page->private.

Maybe I was not very clear in what I wrote, but I'm almost 100% aligned
with your snippet.

>> I'm thinking to use a pointer to one of pmalloc data items, as signature.
> 
> What ever is easier for you. Note that dereferencing such pointer before
> asserting this is really a pmalloc page would be hazardous.

Yes, it's not even needed in this scenario.
It was just a way to ensure that it would be a value that cannot be come
out accidentally as pointer value, since it is already taken.

--
igor


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-08 Thread Jerome Glisse
On Tue, Aug 08, 2017 at 03:59:36PM +0300, Igor Stoppa wrote:
> On 07/08/17 22:12, Jerome Glisse wrote:
> > On Mon, Aug 07, 2017 at 05:13:00PM +0300, Igor Stoppa wrote:
> 
> [...]
> 
> >> I have an updated version of the old proposal:
> >>
> >> * put a magic number in the private field, during initialization of
> >> pmalloc pages
> >>
> >> * during hardened usercopy verification, when I have to assess if a page
> >> is of pmalloc type, compare the private field against the magic number
> >>
> >> * if and only if the private field matches the magic number, then invoke
> >> find_vm_area(), so that the slowness affects only a possibly limited
> >> amount of false positives.
> > 
> > This all sounds good to me.
> 
> ok, I still have one doubt wrt defining the flag.
> Where should I do it?
> 
> vmalloc.h has the following:
> 
> /* bits in flags of vmalloc's vm_struct below */
> #define VM_IOREMAP0x0001  /* ioremap() and friends
>   */
> #define VM_ALLOC  0x0002  /* vmalloc() */
> #define VM_MAP0x0004  /* vmap()ed pages */
> #define VM_USERMAP0x0008  /* suitable for
>  remap_vmalloc_range
>   */
> #define VM_UNINITIALIZED  0x0020  /* vm_struct is not
>  fully initialized */
> #define VM_NO_GUARD   0x0040  /* don't add guard page
>   */
> #define VM_KASAN  0x0080  /* has allocated kasan
>   shadow memory */
> /* bits [20..32] reserved for arch specific ioremap internals */
> 
> 
> 
> I am tempted to add
> 
> #define VM_PMALLOC0x0100
> 
> But would it be acceptable, to mention pmalloc into vmalloc?
> 
> Should I name it VM_PRIVATE bit, instead?
> 
> Using VM_PRIVATE would avoid contaminating vmalloc with something that
> depends on it (like VM_PMALLOC would do).
> 
> But using VM_PRIVATE will likely add tracking issues, if someone else
> wants to use the same bit and it's not clear who is the user, if any.

VM_PMALLOC sounds fine to me also adding a comment there pointing to
pmalloc documentation would be a good thing to do. The above are flags
that are use only inside vmalloc context and so there is no issue
here of conflicting with other potential user.

> 
> Unless it's acceptable to check the private field in the page struct.
> It would bear the pmalloc magic number.

I thought you wanted to do:
  check struct page mapping field
  check vmap->flags for VM_PMALLOC

bool is_pmalloc(unsigned long addr)
{
struct page *page;
struct vm_struct *vm_struct;

if (!is_vmalloc_addr(addr))
return false;
page = vmalloc_to_page(addr);
if (!page)
return false;
if (page->mapping != pmalloc_magic_key)
return false;

vm_struct = find_vm_area(addr);
if (!vm_struct)
return false;

return vm_struct->flags & VM_PMALLOC;
}

Did you change your plan ?

> 
> I'm thinking to use a pointer to one of pmalloc data items, as signature.

What ever is easier for you. Note that dereferencing such pointer before
asserting this is really a pmalloc page would be hazardous.

Jérôme


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-08 Thread Jerome Glisse
On Tue, Aug 08, 2017 at 03:59:36PM +0300, Igor Stoppa wrote:
> On 07/08/17 22:12, Jerome Glisse wrote:
> > On Mon, Aug 07, 2017 at 05:13:00PM +0300, Igor Stoppa wrote:
> 
> [...]
> 
> >> I have an updated version of the old proposal:
> >>
> >> * put a magic number in the private field, during initialization of
> >> pmalloc pages
> >>
> >> * during hardened usercopy verification, when I have to assess if a page
> >> is of pmalloc type, compare the private field against the magic number
> >>
> >> * if and only if the private field matches the magic number, then invoke
> >> find_vm_area(), so that the slowness affects only a possibly limited
> >> amount of false positives.
> > 
> > This all sounds good to me.
> 
> ok, I still have one doubt wrt defining the flag.
> Where should I do it?
> 
> vmalloc.h has the following:
> 
> /* bits in flags of vmalloc's vm_struct below */
> #define VM_IOREMAP0x0001  /* ioremap() and friends
>   */
> #define VM_ALLOC  0x0002  /* vmalloc() */
> #define VM_MAP0x0004  /* vmap()ed pages */
> #define VM_USERMAP0x0008  /* suitable for
>  remap_vmalloc_range
>   */
> #define VM_UNINITIALIZED  0x0020  /* vm_struct is not
>  fully initialized */
> #define VM_NO_GUARD   0x0040  /* don't add guard page
>   */
> #define VM_KASAN  0x0080  /* has allocated kasan
>   shadow memory */
> /* bits [20..32] reserved for arch specific ioremap internals */
> 
> 
> 
> I am tempted to add
> 
> #define VM_PMALLOC0x0100
> 
> But would it be acceptable, to mention pmalloc into vmalloc?
> 
> Should I name it VM_PRIVATE bit, instead?
> 
> Using VM_PRIVATE would avoid contaminating vmalloc with something that
> depends on it (like VM_PMALLOC would do).
> 
> But using VM_PRIVATE will likely add tracking issues, if someone else
> wants to use the same bit and it's not clear who is the user, if any.

VM_PMALLOC sounds fine to me also adding a comment there pointing to
pmalloc documentation would be a good thing to do. The above are flags
that are use only inside vmalloc context and so there is no issue
here of conflicting with other potential user.

> 
> Unless it's acceptable to check the private field in the page struct.
> It would bear the pmalloc magic number.

I thought you wanted to do:
  check struct page mapping field
  check vmap->flags for VM_PMALLOC

bool is_pmalloc(unsigned long addr)
{
struct page *page;
struct vm_struct *vm_struct;

if (!is_vmalloc_addr(addr))
return false;
page = vmalloc_to_page(addr);
if (!page)
return false;
if (page->mapping != pmalloc_magic_key)
return false;

vm_struct = find_vm_area(addr);
if (!vm_struct)
return false;

return vm_struct->flags & VM_PMALLOC;
}

Did you change your plan ?

> 
> I'm thinking to use a pointer to one of pmalloc data items, as signature.

What ever is easier for you. Note that dereferencing such pointer before
asserting this is really a pmalloc page would be hazardous.

Jérôme


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-08 Thread Igor Stoppa
On 07/08/17 22:12, Jerome Glisse wrote:
> On Mon, Aug 07, 2017 at 05:13:00PM +0300, Igor Stoppa wrote:

[...]

>> I have an updated version of the old proposal:
>>
>> * put a magic number in the private field, during initialization of
>> pmalloc pages
>>
>> * during hardened usercopy verification, when I have to assess if a page
>> is of pmalloc type, compare the private field against the magic number
>>
>> * if and only if the private field matches the magic number, then invoke
>> find_vm_area(), so that the slowness affects only a possibly limited
>> amount of false positives.
> 
> This all sounds good to me.

ok, I still have one doubt wrt defining the flag.
Where should I do it?

vmalloc.h has the following:

/* bits in flags of vmalloc's vm_struct below */
#define VM_IOREMAP  0x0001  /* ioremap() and friends
*/
#define VM_ALLOC0x0002  /* vmalloc() */
#define VM_MAP  0x0004  /* vmap()ed pages */
#define VM_USERMAP  0x0008  /* suitable for
   remap_vmalloc_range
*/
#define VM_UNINITIALIZED0x0020  /* vm_struct is not
   fully initialized */
#define VM_NO_GUARD 0x0040  /* don't add guard page
*/
#define VM_KASAN0x0080  /* has allocated kasan
shadow memory */
/* bits [20..32] reserved for arch specific ioremap internals */



I am tempted to add

#define VM_PMALLOC  0x0100

But would it be acceptable, to mention pmalloc into vmalloc?

Should I name it VM_PRIVATE bit, instead?

Using VM_PRIVATE would avoid contaminating vmalloc with something that
depends on it (like VM_PMALLOC would do).

But using VM_PRIVATE will likely add tracking issues, if someone else
wants to use the same bit and it's not clear who is the user, if any.

Unless it's acceptable to check the private field in the page struct.
It would bear the pmalloc magic number.

I'm thinking to use a pointer to one of pmalloc data items, as signature.


--
igor


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-08 Thread Igor Stoppa
On 07/08/17 22:12, Jerome Glisse wrote:
> On Mon, Aug 07, 2017 at 05:13:00PM +0300, Igor Stoppa wrote:

[...]

>> I have an updated version of the old proposal:
>>
>> * put a magic number in the private field, during initialization of
>> pmalloc pages
>>
>> * during hardened usercopy verification, when I have to assess if a page
>> is of pmalloc type, compare the private field against the magic number
>>
>> * if and only if the private field matches the magic number, then invoke
>> find_vm_area(), so that the slowness affects only a possibly limited
>> amount of false positives.
> 
> This all sounds good to me.

ok, I still have one doubt wrt defining the flag.
Where should I do it?

vmalloc.h has the following:

/* bits in flags of vmalloc's vm_struct below */
#define VM_IOREMAP  0x0001  /* ioremap() and friends
*/
#define VM_ALLOC0x0002  /* vmalloc() */
#define VM_MAP  0x0004  /* vmap()ed pages */
#define VM_USERMAP  0x0008  /* suitable for
   remap_vmalloc_range
*/
#define VM_UNINITIALIZED0x0020  /* vm_struct is not
   fully initialized */
#define VM_NO_GUARD 0x0040  /* don't add guard page
*/
#define VM_KASAN0x0080  /* has allocated kasan
shadow memory */
/* bits [20..32] reserved for arch specific ioremap internals */



I am tempted to add

#define VM_PMALLOC  0x0100

But would it be acceptable, to mention pmalloc into vmalloc?

Should I name it VM_PRIVATE bit, instead?

Using VM_PRIVATE would avoid contaminating vmalloc with something that
depends on it (like VM_PMALLOC would do).

But using VM_PRIVATE will likely add tracking issues, if someone else
wants to use the same bit and it's not clear who is the user, if any.

Unless it's acceptable to check the private field in the page struct.
It would bear the pmalloc magic number.

I'm thinking to use a pointer to one of pmalloc data items, as signature.


--
igor


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-07 Thread Jerome Glisse
On Mon, Aug 07, 2017 at 05:13:00PM +0300, Igor Stoppa wrote:
> 
> 
> On 07/08/17 16:31, Jerome Glisse wrote:
> > On Mon, Aug 07, 2017 at 02:26:21PM +0300, Igor Stoppa wrote:
> 
> [...]
> 
> >> I'll add a vm_area field as you advised.
> >>
> >> Is this something I could send as standalone patch?
> > 
> > Note that vmalloc() is not the only thing that use vmalloc address
> > space. There is also vmap() and i know one set of drivers that use
> > vmap() and also use the mapping field of struct page namely GPU
> > drivers.
> 
> Ah, yes, you mentioned this.
> 
> > So like i said previously i would store a flag inside vm_struct to
> > know if page you are looking at are pmalloc or not.
> 
> And I was planning to follow your advice, using one of the flags.
> But ...
> 
> > Again do you
> > need to store something per page ? Would storing it per vm_struct
> > not be enough ?
> 
> ... there was this further comment, about speeding up the access to
> vm_area, which seemed good from performance perspective.
> 
> ---8<--8<--8<--8<--8<---
> On 03/08/17 14:48, Michal Hocko wrote:
> > On Thu 03-08-17 13:11:45, Igor Stoppa wrote:
> 
> [...]
> 
> >> But, to reply more specifically to your advice, yes, I think I could
> >> add a flag to vm_struct and then retrieve its value, for the address
> >> being processed, by passing through find_vm_area().
> >
> > ... and you can store vm_struct pointer to the struct page there and
> > you won't need to do the slow find_vm_area. I haven't checked very
> > closely but this should be possible in principle. I guess other
> > callers might benefit from this as well.
> ---8<--8<--8<--8<--8<---
> 
> I do not strictly need to modify the page struct, but it seems it might
> harm performance, if it is added on the path of hardened usercopy.
> 
> I have an updated version of the old proposal:
> 
> * put a magic number in the private field, during initialization of
> pmalloc pages
> 
> * during hardened usercopy verification, when I have to assess if a page
> is of pmalloc type, compare the private field against the magic number
> 
> * if and only if the private field matches the magic number, then invoke
> find_vm_area(), so that the slowness affects only a possibly limited
> amount of false positives.

This all sounds good to me.

Jérôme


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-07 Thread Jerome Glisse
On Mon, Aug 07, 2017 at 05:13:00PM +0300, Igor Stoppa wrote:
> 
> 
> On 07/08/17 16:31, Jerome Glisse wrote:
> > On Mon, Aug 07, 2017 at 02:26:21PM +0300, Igor Stoppa wrote:
> 
> [...]
> 
> >> I'll add a vm_area field as you advised.
> >>
> >> Is this something I could send as standalone patch?
> > 
> > Note that vmalloc() is not the only thing that use vmalloc address
> > space. There is also vmap() and i know one set of drivers that use
> > vmap() and also use the mapping field of struct page namely GPU
> > drivers.
> 
> Ah, yes, you mentioned this.
> 
> > So like i said previously i would store a flag inside vm_struct to
> > know if page you are looking at are pmalloc or not.
> 
> And I was planning to follow your advice, using one of the flags.
> But ...
> 
> > Again do you
> > need to store something per page ? Would storing it per vm_struct
> > not be enough ?
> 
> ... there was this further comment, about speeding up the access to
> vm_area, which seemed good from performance perspective.
> 
> ---8<--8<--8<--8<--8<---
> On 03/08/17 14:48, Michal Hocko wrote:
> > On Thu 03-08-17 13:11:45, Igor Stoppa wrote:
> 
> [...]
> 
> >> But, to reply more specifically to your advice, yes, I think I could
> >> add a flag to vm_struct and then retrieve its value, for the address
> >> being processed, by passing through find_vm_area().
> >
> > ... and you can store vm_struct pointer to the struct page there and
> > you won't need to do the slow find_vm_area. I haven't checked very
> > closely but this should be possible in principle. I guess other
> > callers might benefit from this as well.
> ---8<--8<--8<--8<--8<---
> 
> I do not strictly need to modify the page struct, but it seems it might
> harm performance, if it is added on the path of hardened usercopy.
> 
> I have an updated version of the old proposal:
> 
> * put a magic number in the private field, during initialization of
> pmalloc pages
> 
> * during hardened usercopy verification, when I have to assess if a page
> is of pmalloc type, compare the private field against the magic number
> 
> * if and only if the private field matches the magic number, then invoke
> find_vm_area(), so that the slowness affects only a possibly limited
> amount of false positives.

This all sounds good to me.

Jérôme


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-07 Thread Igor Stoppa


On 07/08/17 16:31, Jerome Glisse wrote:
> On Mon, Aug 07, 2017 at 02:26:21PM +0300, Igor Stoppa wrote:

[...]

>> I'll add a vm_area field as you advised.
>>
>> Is this something I could send as standalone patch?
> 
> Note that vmalloc() is not the only thing that use vmalloc address
> space. There is also vmap() and i know one set of drivers that use
> vmap() and also use the mapping field of struct page namely GPU
> drivers.

Ah, yes, you mentioned this.

> So like i said previously i would store a flag inside vm_struct to
> know if page you are looking at are pmalloc or not.

And I was planning to follow your advice, using one of the flags.
But ...

> Again do you
> need to store something per page ? Would storing it per vm_struct
> not be enough ?

... there was this further comment, about speeding up the access to
vm_area, which seemed good from performance perspective.

---8<--8<--8<--8<--8<---
On 03/08/17 14:48, Michal Hocko wrote:
> On Thu 03-08-17 13:11:45, Igor Stoppa wrote:

[...]

>> But, to reply more specifically to your advice, yes, I think I could
>> add a flag to vm_struct and then retrieve its value, for the address
>> being processed, by passing through find_vm_area().
>
> ... and you can store vm_struct pointer to the struct page there and
> you won't need to do the slow find_vm_area. I haven't checked very
> closely but this should be possible in principle. I guess other
> callers might benefit from this as well.
---8<--8<--8<--8<--8<---

I do not strictly need to modify the page struct, but it seems it might
harm performance, if it is added on the path of hardened usercopy.

I have an updated version of the old proposal:

* put a magic number in the private field, during initialization of
pmalloc pages

* during hardened usercopy verification, when I have to assess if a page
is of pmalloc type, compare the private field against the magic number

* if and only if the private field matches the magic number, then invoke
find_vm_area(), so that the slowness affects only a possibly limited
amount of false positives.


--
igor


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-07 Thread Igor Stoppa


On 07/08/17 16:31, Jerome Glisse wrote:
> On Mon, Aug 07, 2017 at 02:26:21PM +0300, Igor Stoppa wrote:

[...]

>> I'll add a vm_area field as you advised.
>>
>> Is this something I could send as standalone patch?
> 
> Note that vmalloc() is not the only thing that use vmalloc address
> space. There is also vmap() and i know one set of drivers that use
> vmap() and also use the mapping field of struct page namely GPU
> drivers.

Ah, yes, you mentioned this.

> So like i said previously i would store a flag inside vm_struct to
> know if page you are looking at are pmalloc or not.

And I was planning to follow your advice, using one of the flags.
But ...

> Again do you
> need to store something per page ? Would storing it per vm_struct
> not be enough ?

... there was this further comment, about speeding up the access to
vm_area, which seemed good from performance perspective.

---8<--8<--8<--8<--8<---
On 03/08/17 14:48, Michal Hocko wrote:
> On Thu 03-08-17 13:11:45, Igor Stoppa wrote:

[...]

>> But, to reply more specifically to your advice, yes, I think I could
>> add a flag to vm_struct and then retrieve its value, for the address
>> being processed, by passing through find_vm_area().
>
> ... and you can store vm_struct pointer to the struct page there and
> you won't need to do the slow find_vm_area. I haven't checked very
> closely but this should be possible in principle. I guess other
> callers might benefit from this as well.
---8<--8<--8<--8<--8<---

I do not strictly need to modify the page struct, but it seems it might
harm performance, if it is added on the path of hardened usercopy.

I have an updated version of the old proposal:

* put a magic number in the private field, during initialization of
pmalloc pages

* during hardened usercopy verification, when I have to assess if a page
is of pmalloc type, compare the private field against the magic number

* if and only if the private field matches the magic number, then invoke
find_vm_area(), so that the slowness affects only a possibly limited
amount of false positives.


--
igor


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-07 Thread Jerome Glisse
On Mon, Aug 07, 2017 at 02:26:21PM +0300, Igor Stoppa wrote:
> On 04/08/17 11:12, Michal Hocko wrote:
> > On Fri 04-08-17 11:02:46, Igor Stoppa wrote:
> 
> [...]
> 
> >> struct page {
> >>   /* First double word block */
> >>   unsigned long flags; /* Atomic flags, some possibly
> >> * updated asynchronously */
> >> union {
> >>struct address_space *mapping;  /* If low bit clear, points to
> >> * inode address_space, or NULL.
> >> * If page mapped as anonymous
> >> * memory, low bit is set, and
> >> * it points to anon_vma object:
> >> * see PAGE_MAPPING_ANON below.
> >> */
> >> ...
> >> }
> >>
> >> mapping seems to be used exclusively in 2 ways, based on the value of
> >> its lower bit.
> > 
> > Not really. The above applies to LRU pages. Please note that Slab pages
> > use s_mem and huge pages use compound_mapcount. If vmalloc pages are
> > using none of those already you can add a new field there.
> 
> Yes, both from reading the code and some experimentation, it seems that
> vmalloc is not using either field.
> 
> I'll add a vm_area field as you advised.
> 
> Is this something I could send as standalone patch?

Note that vmalloc() is not the only thing that use vmalloc address
space. There is also vmap() and i know one set of drivers that use
vmap() and also use the mapping field of struct page namely GPU
drivers.

So like i said previously i would store a flag inside vm_struct to
know if page you are looking at are pmalloc or not. Again do you
need to store something per page ? Would storing it per vm_struct
not be enough ?

Cheers,
Jérôme


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-07 Thread Jerome Glisse
On Mon, Aug 07, 2017 at 02:26:21PM +0300, Igor Stoppa wrote:
> On 04/08/17 11:12, Michal Hocko wrote:
> > On Fri 04-08-17 11:02:46, Igor Stoppa wrote:
> 
> [...]
> 
> >> struct page {
> >>   /* First double word block */
> >>   unsigned long flags; /* Atomic flags, some possibly
> >> * updated asynchronously */
> >> union {
> >>struct address_space *mapping;  /* If low bit clear, points to
> >> * inode address_space, or NULL.
> >> * If page mapped as anonymous
> >> * memory, low bit is set, and
> >> * it points to anon_vma object:
> >> * see PAGE_MAPPING_ANON below.
> >> */
> >> ...
> >> }
> >>
> >> mapping seems to be used exclusively in 2 ways, based on the value of
> >> its lower bit.
> > 
> > Not really. The above applies to LRU pages. Please note that Slab pages
> > use s_mem and huge pages use compound_mapcount. If vmalloc pages are
> > using none of those already you can add a new field there.
> 
> Yes, both from reading the code and some experimentation, it seems that
> vmalloc is not using either field.
> 
> I'll add a vm_area field as you advised.
> 
> Is this something I could send as standalone patch?

Note that vmalloc() is not the only thing that use vmalloc address
space. There is also vmap() and i know one set of drivers that use
vmap() and also use the mapping field of struct page namely GPU
drivers.

So like i said previously i would store a flag inside vm_struct to
know if page you are looking at are pmalloc or not. Again do you
need to store something per page ? Would storing it per vm_struct
not be enough ?

Cheers,
Jérôme


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-07 Thread Michal Hocko
On Mon 07-08-17 14:26:21, Igor Stoppa wrote:
> On 04/08/17 11:12, Michal Hocko wrote:
> > On Fri 04-08-17 11:02:46, Igor Stoppa wrote:
> 
> [...]
> 
> >> struct page {
> >>   /* First double word block */
> >>   unsigned long flags; /* Atomic flags, some possibly
> >> * updated asynchronously */
> >> union {
> >>struct address_space *mapping;  /* If low bit clear, points to
> >> * inode address_space, or NULL.
> >> * If page mapped as anonymous
> >> * memory, low bit is set, and
> >> * it points to anon_vma object:
> >> * see PAGE_MAPPING_ANON below.
> >> */
> >> ...
> >> }
> >>
> >> mapping seems to be used exclusively in 2 ways, based on the value of
> >> its lower bit.
> > 
> > Not really. The above applies to LRU pages. Please note that Slab pages
> > use s_mem and huge pages use compound_mapcount. If vmalloc pages are
> > using none of those already you can add a new field there.
> 
> Yes, both from reading the code and some experimentation, it seems that
> vmalloc is not using either field.
> 
> I'll add a vm_area field as you advised.
> 
> Is this something I could send as standalone patch?

Yes I would start that way and also look at some find_vm_area callers
and maybe they can be simplified. The most obvious one being
task_struct::stack_vm_area but I have to confess I haven't checked that
too deeply.
-- 
Michal Hocko
SUSE Labs


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-07 Thread Michal Hocko
On Mon 07-08-17 14:26:21, Igor Stoppa wrote:
> On 04/08/17 11:12, Michal Hocko wrote:
> > On Fri 04-08-17 11:02:46, Igor Stoppa wrote:
> 
> [...]
> 
> >> struct page {
> >>   /* First double word block */
> >>   unsigned long flags; /* Atomic flags, some possibly
> >> * updated asynchronously */
> >> union {
> >>struct address_space *mapping;  /* If low bit clear, points to
> >> * inode address_space, or NULL.
> >> * If page mapped as anonymous
> >> * memory, low bit is set, and
> >> * it points to anon_vma object:
> >> * see PAGE_MAPPING_ANON below.
> >> */
> >> ...
> >> }
> >>
> >> mapping seems to be used exclusively in 2 ways, based on the value of
> >> its lower bit.
> > 
> > Not really. The above applies to LRU pages. Please note that Slab pages
> > use s_mem and huge pages use compound_mapcount. If vmalloc pages are
> > using none of those already you can add a new field there.
> 
> Yes, both from reading the code and some experimentation, it seems that
> vmalloc is not using either field.
> 
> I'll add a vm_area field as you advised.
> 
> Is this something I could send as standalone patch?

Yes I would start that way and also look at some find_vm_area callers
and maybe they can be simplified. The most obvious one being
task_struct::stack_vm_area but I have to confess I haven't checked that
too deeply.
-- 
Michal Hocko
SUSE Labs


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-07 Thread Igor Stoppa
On 04/08/17 11:12, Michal Hocko wrote:
> On Fri 04-08-17 11:02:46, Igor Stoppa wrote:

[...]

>> struct page {
>>   /* First double word block */
>>   unsigned long flags;   /* Atomic flags, some possibly
>>   * updated asynchronously */
>> union {
>>  struct address_space *mapping;  /* If low bit clear, points to
>>   * inode address_space, or NULL.
>>   * If page mapped as anonymous
>>   * memory, low bit is set, and
>>   * it points to anon_vma object:
>>   * see PAGE_MAPPING_ANON below.
>>   */
>> ...
>> }
>>
>> mapping seems to be used exclusively in 2 ways, based on the value of
>> its lower bit.
> 
> Not really. The above applies to LRU pages. Please note that Slab pages
> use s_mem and huge pages use compound_mapcount. If vmalloc pages are
> using none of those already you can add a new field there.

Yes, both from reading the code and some experimentation, it seems that
vmalloc is not using either field.

I'll add a vm_area field as you advised.

Is this something I could send as standalone patch?

--
thank you, igor



Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-07 Thread Igor Stoppa
On 04/08/17 11:12, Michal Hocko wrote:
> On Fri 04-08-17 11:02:46, Igor Stoppa wrote:

[...]

>> struct page {
>>   /* First double word block */
>>   unsigned long flags;   /* Atomic flags, some possibly
>>   * updated asynchronously */
>> union {
>>  struct address_space *mapping;  /* If low bit clear, points to
>>   * inode address_space, or NULL.
>>   * If page mapped as anonymous
>>   * memory, low bit is set, and
>>   * it points to anon_vma object:
>>   * see PAGE_MAPPING_ANON below.
>>   */
>> ...
>> }
>>
>> mapping seems to be used exclusively in 2 ways, based on the value of
>> its lower bit.
> 
> Not really. The above applies to LRU pages. Please note that Slab pages
> use s_mem and huge pages use compound_mapcount. If vmalloc pages are
> using none of those already you can add a new field there.

Yes, both from reading the code and some experimentation, it seems that
vmalloc is not using either field.

I'll add a vm_area field as you advised.

Is this something I could send as standalone patch?

--
thank you, igor



Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-04 Thread Michal Hocko
On Fri 04-08-17 11:02:46, Igor Stoppa wrote:
> 
> 
> On 03/08/17 18:15, Michal Hocko wrote:
> 
> > I would check the one where we have mapping. It is rather unlikely
> > vmalloc users would touch this one.
> 
> That was also the initial recommendation from Jerome Glisse, but it
> seemed unusable, because of the related comment.
> 
> I should have asked for clarifications back then :-(
> 
> But it's never too late ...
> 
> 
> struct page {
>   /* First double word block */
>   unsigned long flags;/* Atomic flags, some possibly
>* updated asynchronously */
> union {
>   struct address_space *mapping;  /* If low bit clear, points to
>* inode address_space, or NULL.
>* If page mapped as anonymous
>* memory, low bit is set, and
>* it points to anon_vma object:
>* see PAGE_MAPPING_ANON below.
>*/
> ...
> }
> 
> mapping seems to be used exclusively in 2 ways, based on the value of
> its lower bit.

Not really. The above applies to LRU pages. Please note that Slab pages
use s_mem and huge pages use compound_mapcount. If vmalloc pages are
using none of those already you can add a new field there.

-- 
Michal Hocko
SUSE Labs


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-04 Thread Michal Hocko
On Fri 04-08-17 11:02:46, Igor Stoppa wrote:
> 
> 
> On 03/08/17 18:15, Michal Hocko wrote:
> 
> > I would check the one where we have mapping. It is rather unlikely
> > vmalloc users would touch this one.
> 
> That was also the initial recommendation from Jerome Glisse, but it
> seemed unusable, because of the related comment.
> 
> I should have asked for clarifications back then :-(
> 
> But it's never too late ...
> 
> 
> struct page {
>   /* First double word block */
>   unsigned long flags;/* Atomic flags, some possibly
>* updated asynchronously */
> union {
>   struct address_space *mapping;  /* If low bit clear, points to
>* inode address_space, or NULL.
>* If page mapped as anonymous
>* memory, low bit is set, and
>* it points to anon_vma object:
>* see PAGE_MAPPING_ANON below.
>*/
> ...
> }
> 
> mapping seems to be used exclusively in 2 ways, based on the value of
> its lower bit.

Not really. The above applies to LRU pages. Please note that Slab pages
use s_mem and huge pages use compound_mapcount. If vmalloc pages are
using none of those already you can add a new field there.

-- 
Michal Hocko
SUSE Labs


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-04 Thread Igor Stoppa


On 03/08/17 18:15, Michal Hocko wrote:

> I would check the one where we have mapping. It is rather unlikely
> vmalloc users would touch this one.

That was also the initial recommendation from Jerome Glisse, but it
seemed unusable, because of the related comment.

I should have asked for clarifications back then :-(

But it's never too late ...


struct page {
  /* First double word block */
  unsigned long flags;  /* Atomic flags, some possibly
 * updated asynchronously */
union {
struct address_space *mapping;  /* If low bit clear, points to
 * inode address_space, or NULL.
 * If page mapped as anonymous
 * memory, low bit is set, and
 * it points to anon_vma object:
 * see PAGE_MAPPING_ANON below.
 */
...
}

mapping seems to be used exclusively in 2 ways, based on the value of
its lower bit.

Therefore I discarded it as valid option ("private", otoh was far more
alluring), but maybe I could wrap it inside a union, together with vm_area?

---
thanks, igor


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-04 Thread Igor Stoppa


On 03/08/17 18:15, Michal Hocko wrote:

> I would check the one where we have mapping. It is rather unlikely
> vmalloc users would touch this one.

That was also the initial recommendation from Jerome Glisse, but it
seemed unusable, because of the related comment.

I should have asked for clarifications back then :-(

But it's never too late ...


struct page {
  /* First double word block */
  unsigned long flags;  /* Atomic flags, some possibly
 * updated asynchronously */
union {
struct address_space *mapping;  /* If low bit clear, points to
 * inode address_space, or NULL.
 * If page mapped as anonymous
 * memory, low bit is set, and
 * it points to anon_vma object:
 * see PAGE_MAPPING_ANON below.
 */
...
}

mapping seems to be used exclusively in 2 ways, based on the value of
its lower bit.

Therefore I discarded it as valid option ("private", otoh was far more
alluring), but maybe I could wrap it inside a union, together with vm_area?

---
thanks, igor


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-03 Thread Michal Hocko
On Thu 03-08-17 18:06:11, Igor Stoppa wrote:
> 
> 
> On 03/08/17 17:47, Jerome Glisse wrote:
> > On Thu, Aug 03, 2017 at 03:55:50PM +0200, Michal Hocko wrote:
> >> On Thu 03-08-17 15:20:31, Igor Stoppa wrote:
> 
> [...]
> 
> >>> I am confused about this: if "private2" is a pointer, but when I get an
> >>> address, I do not even know if the address represents a valid pmalloc
> >>> page, how can i know when it's ok to dereference "private2"?
> >>
> >> because you can make all pages which back vmalloc mappings have vm_area
> >> pointer set.
> > 
> > Note that i think this might break some device driver that use vmap()
> > i think some of them use private field to store device driver specific
> > informations. But there likely is an unuse field in struct page that
> > can be use for that.
> 
> This increases the unease from my side ... it looks like there is no way
> to fully understand if a field is really used or not, without having
> deep intimate knowledge of lots of code that is only marginally involved :-/

welcome to the struct page heaven...
 
> Similarly, how would I be able to specify what would be the correct way
> to decide the member of the union to use for handling the field?

I would check the one where we have mapping. It is rather unlikely
vmalloc users would touch this one.
-- 
Michal Hocko
SUSE Labs


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-03 Thread Michal Hocko
On Thu 03-08-17 18:06:11, Igor Stoppa wrote:
> 
> 
> On 03/08/17 17:47, Jerome Glisse wrote:
> > On Thu, Aug 03, 2017 at 03:55:50PM +0200, Michal Hocko wrote:
> >> On Thu 03-08-17 15:20:31, Igor Stoppa wrote:
> 
> [...]
> 
> >>> I am confused about this: if "private2" is a pointer, but when I get an
> >>> address, I do not even know if the address represents a valid pmalloc
> >>> page, how can i know when it's ok to dereference "private2"?
> >>
> >> because you can make all pages which back vmalloc mappings have vm_area
> >> pointer set.
> > 
> > Note that i think this might break some device driver that use vmap()
> > i think some of them use private field to store device driver specific
> > informations. But there likely is an unuse field in struct page that
> > can be use for that.
> 
> This increases the unease from my side ... it looks like there is no way
> to fully understand if a field is really used or not, without having
> deep intimate knowledge of lots of code that is only marginally involved :-/

welcome to the struct page heaven...
 
> Similarly, how would I be able to specify what would be the correct way
> to decide the member of the union to use for handling the field?

I would check the one where we have mapping. It is rather unlikely
vmalloc users would touch this one.
-- 
Michal Hocko
SUSE Labs


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-03 Thread Igor Stoppa


On 03/08/17 17:47, Jerome Glisse wrote:
> On Thu, Aug 03, 2017 at 03:55:50PM +0200, Michal Hocko wrote:
>> On Thu 03-08-17 15:20:31, Igor Stoppa wrote:

[...]

>>> I am confused about this: if "private2" is a pointer, but when I get an
>>> address, I do not even know if the address represents a valid pmalloc
>>> page, how can i know when it's ok to dereference "private2"?
>>
>> because you can make all pages which back vmalloc mappings have vm_area
>> pointer set.
> 
> Note that i think this might break some device driver that use vmap()
> i think some of them use private field to store device driver specific
> informations. But there likely is an unuse field in struct page that
> can be use for that.

This increases the unease from my side ... it looks like there is no way
to fully understand if a field is really used or not, without having
deep intimate knowledge of lots of code that is only marginally involved :-/

Similarly, how would I be able to specify what would be the correct way
to decide the member of the union to use for handling the field?

If there were either some sort of non-multiplexed tag/cookie field or a
function, that would specify how to treat the various unions, then it
would be easier to multiplex the remaining data, according to how the
page is used.

--
igor


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-03 Thread Igor Stoppa


On 03/08/17 17:47, Jerome Glisse wrote:
> On Thu, Aug 03, 2017 at 03:55:50PM +0200, Michal Hocko wrote:
>> On Thu 03-08-17 15:20:31, Igor Stoppa wrote:

[...]

>>> I am confused about this: if "private2" is a pointer, but when I get an
>>> address, I do not even know if the address represents a valid pmalloc
>>> page, how can i know when it's ok to dereference "private2"?
>>
>> because you can make all pages which back vmalloc mappings have vm_area
>> pointer set.
> 
> Note that i think this might break some device driver that use vmap()
> i think some of them use private field to store device driver specific
> informations. But there likely is an unuse field in struct page that
> can be use for that.

This increases the unease from my side ... it looks like there is no way
to fully understand if a field is really used or not, without having
deep intimate knowledge of lots of code that is only marginally involved :-/

Similarly, how would I be able to specify what would be the correct way
to decide the member of the union to use for handling the field?

If there were either some sort of non-multiplexed tag/cookie field or a
function, that would specify how to treat the various unions, then it
would be easier to multiplex the remaining data, according to how the
page is used.

--
igor


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-03 Thread Jerome Glisse
On Thu, Aug 03, 2017 at 03:55:50PM +0200, Michal Hocko wrote:
> On Thu 03-08-17 15:20:31, Igor Stoppa wrote:
> > On 03/08/17 14:48, Michal Hocko wrote:
> > > On Thu 03-08-17 13:11:45, Igor Stoppa wrote:
> > >> On 02/08/17 20:08, Jerome Glisse wrote:
> > >>> On Wed, Aug 02, 2017 at 06:14:28PM +0300, Igor Stoppa wrote:
> > 
> > [...]
> > 
> >  from include/linux/mm_types.h:
> > 
> >  struct page {
> >  ...
> >    union {
> >  unsigned long private; /* Mapping-private opaque data:
> >  * usually used for buffer_heads
> >  * if PagePrivate set; used for
> >  * swp_entry_t if PageSwapCache;
> >  * indicates order in the buddy
> >  * system if PG_buddy is set.
> >  */
> > 
> > [...]
> > 
> > >> If the "Mapping-private" was dropped or somehow connected exclusively to
> > >> the cases listed in the comment, then I think it would be more clear
> > >> that the comment needs to be intended as related to mapping in certain
> > >> cases only.
> > >> But it is otherwise ok to use the "private" field for whatever purpose
> > >> it might be suitable, as long as it is not already in use.
> > > 
> > > I would recommend adding a new field into the enum...
> > 
> > s/enum/union/ ?
> > 
> > If not, I am not sure what is the enum that you are talking about.
> 
> yeah, fat fingers on my side
> 
> > 
> > [...]
> > 
> > >> But, to reply more specifically to your advice, yes, I think I could add
> > >> a flag to vm_struct and then retrieve its value, for the address being
> > >> processed, by passing through find_vm_area().
> > > 
> > > ... and you can store vm_struct pointer to the struct page there 
> > 
> > "there" as in the new field of the union?
> > btw, what would be a meaningful name, since "private" is already taken?
> > 
> > For simplicity, I'll use, for now, "private2"
> 
> why not explicit vm_area?
> 
> > > and you> won't need to do the slow find_vm_area. I haven't checked
> > very closely
> > > but this should be possible in principle. I guess other callers might
> > > benefit from this as well.
> > 
> > I am confused about this: if "private2" is a pointer, but when I get an
> > address, I do not even know if the address represents a valid pmalloc
> > page, how can i know when it's ok to dereference "private2"?
> 
> because you can make all pages which back vmalloc mappings have vm_area
> pointer set.

Note that i think this might break some device driver that use vmap()
i think some of them use private field to store device driver specific
informations. But there likely is an unuse field in struct page that
can be use for that.

Jérôme


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-03 Thread Jerome Glisse
On Thu, Aug 03, 2017 at 03:55:50PM +0200, Michal Hocko wrote:
> On Thu 03-08-17 15:20:31, Igor Stoppa wrote:
> > On 03/08/17 14:48, Michal Hocko wrote:
> > > On Thu 03-08-17 13:11:45, Igor Stoppa wrote:
> > >> On 02/08/17 20:08, Jerome Glisse wrote:
> > >>> On Wed, Aug 02, 2017 at 06:14:28PM +0300, Igor Stoppa wrote:
> > 
> > [...]
> > 
> >  from include/linux/mm_types.h:
> > 
> >  struct page {
> >  ...
> >    union {
> >  unsigned long private; /* Mapping-private opaque data:
> >  * usually used for buffer_heads
> >  * if PagePrivate set; used for
> >  * swp_entry_t if PageSwapCache;
> >  * indicates order in the buddy
> >  * system if PG_buddy is set.
> >  */
> > 
> > [...]
> > 
> > >> If the "Mapping-private" was dropped or somehow connected exclusively to
> > >> the cases listed in the comment, then I think it would be more clear
> > >> that the comment needs to be intended as related to mapping in certain
> > >> cases only.
> > >> But it is otherwise ok to use the "private" field for whatever purpose
> > >> it might be suitable, as long as it is not already in use.
> > > 
> > > I would recommend adding a new field into the enum...
> > 
> > s/enum/union/ ?
> > 
> > If not, I am not sure what is the enum that you are talking about.
> 
> yeah, fat fingers on my side
> 
> > 
> > [...]
> > 
> > >> But, to reply more specifically to your advice, yes, I think I could add
> > >> a flag to vm_struct and then retrieve its value, for the address being
> > >> processed, by passing through find_vm_area().
> > > 
> > > ... and you can store vm_struct pointer to the struct page there 
> > 
> > "there" as in the new field of the union?
> > btw, what would be a meaningful name, since "private" is already taken?
> > 
> > For simplicity, I'll use, for now, "private2"
> 
> why not explicit vm_area?
> 
> > > and you> won't need to do the slow find_vm_area. I haven't checked
> > very closely
> > > but this should be possible in principle. I guess other callers might
> > > benefit from this as well.
> > 
> > I am confused about this: if "private2" is a pointer, but when I get an
> > address, I do not even know if the address represents a valid pmalloc
> > page, how can i know when it's ok to dereference "private2"?
> 
> because you can make all pages which back vmalloc mappings have vm_area
> pointer set.

Note that i think this might break some device driver that use vmap()
i think some of them use private field to store device driver specific
informations. But there likely is an unuse field in struct page that
can be use for that.

Jérôme


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-03 Thread Igor Stoppa


On 03/08/17 16:55, Michal Hocko wrote:
> On Thu 03-08-17 15:20:31, Igor Stoppa wrote:
>> On 03/08/17 14:48, Michal Hocko wrote:
>>> On Thu 03-08-17 13:11:45, Igor Stoppa wrote:

[...]

 But, to reply more specifically to your advice, yes, I think I could add
 a flag to vm_struct and then retrieve its value, for the address being
 processed, by passing through find_vm_area().
>>>
>>> ... and you can store vm_struct pointer to the struct page there 
>>
>> "there" as in the new field of the union?
>> btw, what would be a meaningful name, since "private" is already taken?
>>
>> For simplicity, I'll use, for now, "private2"
> 
> why not explicit vm_area?

ok :-)

>>> and you won't need to do the slow find_vm_area. I haven't checked
>> very closely
>>> but this should be possible in principle. I guess other callers might
>>> benefit from this as well.
>>
>> I am confused about this: if "private2" is a pointer, but when I get an
>> address, I do not even know if the address represents a valid pmalloc
>> page, how can i know when it's ok to dereference "private2"?
> 
> because you can make all pages which back vmalloc mappings have vm_area
> pointer set.

Ah, now I see, I had missed that the field would be set for *all* pages
backed by vmalloc.

So, given a pointer, I still have to figure out if it refers to a
vmalloc area or not.

However, that is something I need to do anyway, to get the reference to
the corresponding page struct, in case it is indeed a vmalloc address.

--
thanks, igor



Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-03 Thread Igor Stoppa


On 03/08/17 16:55, Michal Hocko wrote:
> On Thu 03-08-17 15:20:31, Igor Stoppa wrote:
>> On 03/08/17 14:48, Michal Hocko wrote:
>>> On Thu 03-08-17 13:11:45, Igor Stoppa wrote:

[...]

 But, to reply more specifically to your advice, yes, I think I could add
 a flag to vm_struct and then retrieve its value, for the address being
 processed, by passing through find_vm_area().
>>>
>>> ... and you can store vm_struct pointer to the struct page there 
>>
>> "there" as in the new field of the union?
>> btw, what would be a meaningful name, since "private" is already taken?
>>
>> For simplicity, I'll use, for now, "private2"
> 
> why not explicit vm_area?

ok :-)

>>> and you won't need to do the slow find_vm_area. I haven't checked
>> very closely
>>> but this should be possible in principle. I guess other callers might
>>> benefit from this as well.
>>
>> I am confused about this: if "private2" is a pointer, but when I get an
>> address, I do not even know if the address represents a valid pmalloc
>> page, how can i know when it's ok to dereference "private2"?
> 
> because you can make all pages which back vmalloc mappings have vm_area
> pointer set.

Ah, now I see, I had missed that the field would be set for *all* pages
backed by vmalloc.

So, given a pointer, I still have to figure out if it refers to a
vmalloc area or not.

However, that is something I need to do anyway, to get the reference to
the corresponding page struct, in case it is indeed a vmalloc address.

--
thanks, igor



Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-03 Thread Michal Hocko
On Thu 03-08-17 15:20:31, Igor Stoppa wrote:
> On 03/08/17 14:48, Michal Hocko wrote:
> > On Thu 03-08-17 13:11:45, Igor Stoppa wrote:
> >> On 02/08/17 20:08, Jerome Glisse wrote:
> >>> On Wed, Aug 02, 2017 at 06:14:28PM +0300, Igor Stoppa wrote:
> 
> [...]
> 
>  from include/linux/mm_types.h:
> 
>  struct page {
>  ...
>    union {
>  unsigned long private;   /* Mapping-private opaque data:
>    * usually used for buffer_heads
>    * if PagePrivate set; used for
>    * swp_entry_t if PageSwapCache;
>    * indicates order in the buddy
>    * system if PG_buddy is set.
>    */
> 
> [...]
> 
> >> If the "Mapping-private" was dropped or somehow connected exclusively to
> >> the cases listed in the comment, then I think it would be more clear
> >> that the comment needs to be intended as related to mapping in certain
> >> cases only.
> >> But it is otherwise ok to use the "private" field for whatever purpose
> >> it might be suitable, as long as it is not already in use.
> > 
> > I would recommend adding a new field into the enum...
> 
> s/enum/union/ ?
> 
> If not, I am not sure what is the enum that you are talking about.

yeah, fat fingers on my side

> 
> [...]
> 
> >> But, to reply more specifically to your advice, yes, I think I could add
> >> a flag to vm_struct and then retrieve its value, for the address being
> >> processed, by passing through find_vm_area().
> > 
> > ... and you can store vm_struct pointer to the struct page there 
> 
> "there" as in the new field of the union?
> btw, what would be a meaningful name, since "private" is already taken?
> 
> For simplicity, I'll use, for now, "private2"

why not explicit vm_area?

> > and you> won't need to do the slow find_vm_area. I haven't checked
> very closely
> > but this should be possible in principle. I guess other callers might
> > benefit from this as well.
> 
> I am confused about this: if "private2" is a pointer, but when I get an
> address, I do not even know if the address represents a valid pmalloc
> page, how can i know when it's ok to dereference "private2"?

because you can make all pages which back vmalloc mappings have vm_area
pointer set.

-- 
Michal Hocko
SUSE Labs


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-03 Thread Michal Hocko
On Thu 03-08-17 15:20:31, Igor Stoppa wrote:
> On 03/08/17 14:48, Michal Hocko wrote:
> > On Thu 03-08-17 13:11:45, Igor Stoppa wrote:
> >> On 02/08/17 20:08, Jerome Glisse wrote:
> >>> On Wed, Aug 02, 2017 at 06:14:28PM +0300, Igor Stoppa wrote:
> 
> [...]
> 
>  from include/linux/mm_types.h:
> 
>  struct page {
>  ...
>    union {
>  unsigned long private;   /* Mapping-private opaque data:
>    * usually used for buffer_heads
>    * if PagePrivate set; used for
>    * swp_entry_t if PageSwapCache;
>    * indicates order in the buddy
>    * system if PG_buddy is set.
>    */
> 
> [...]
> 
> >> If the "Mapping-private" was dropped or somehow connected exclusively to
> >> the cases listed in the comment, then I think it would be more clear
> >> that the comment needs to be intended as related to mapping in certain
> >> cases only.
> >> But it is otherwise ok to use the "private" field for whatever purpose
> >> it might be suitable, as long as it is not already in use.
> > 
> > I would recommend adding a new field into the enum...
> 
> s/enum/union/ ?
> 
> If not, I am not sure what is the enum that you are talking about.

yeah, fat fingers on my side

> 
> [...]
> 
> >> But, to reply more specifically to your advice, yes, I think I could add
> >> a flag to vm_struct and then retrieve its value, for the address being
> >> processed, by passing through find_vm_area().
> > 
> > ... and you can store vm_struct pointer to the struct page there 
> 
> "there" as in the new field of the union?
> btw, what would be a meaningful name, since "private" is already taken?
> 
> For simplicity, I'll use, for now, "private2"

why not explicit vm_area?

> > and you> won't need to do the slow find_vm_area. I haven't checked
> very closely
> > but this should be possible in principle. I guess other callers might
> > benefit from this as well.
> 
> I am confused about this: if "private2" is a pointer, but when I get an
> address, I do not even know if the address represents a valid pmalloc
> page, how can i know when it's ok to dereference "private2"?

because you can make all pages which back vmalloc mappings have vm_area
pointer set.

-- 
Michal Hocko
SUSE Labs


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-03 Thread Igor Stoppa
On 03/08/17 14:48, Michal Hocko wrote:
> On Thu 03-08-17 13:11:45, Igor Stoppa wrote:
>> On 02/08/17 20:08, Jerome Glisse wrote:
>>> On Wed, Aug 02, 2017 at 06:14:28PM +0300, Igor Stoppa wrote:

[...]

 from include/linux/mm_types.h:

 struct page {
 ...
   union {
 unsigned long private; /* Mapping-private opaque data:
 * usually used for buffer_heads
 * if PagePrivate set; used for
 * swp_entry_t if PageSwapCache;
 * indicates order in the buddy
 * system if PG_buddy is set.
 */

[...]

>> If the "Mapping-private" was dropped or somehow connected exclusively to
>> the cases listed in the comment, then I think it would be more clear
>> that the comment needs to be intended as related to mapping in certain
>> cases only.
>> But it is otherwise ok to use the "private" field for whatever purpose
>> it might be suitable, as long as it is not already in use.
> 
> I would recommend adding a new field into the enum...

s/enum/union/ ?

If not, I am not sure what is the enum that you are talking about.

[...]

>> But, to reply more specifically to your advice, yes, I think I could add
>> a flag to vm_struct and then retrieve its value, for the address being
>> processed, by passing through find_vm_area().
> 
> ... and you can store vm_struct pointer to the struct page there 

"there" as in the new field of the union?
btw, what would be a meaningful name, since "private" is already taken?

For simplicity, I'll use, for now, "private2"

> and you> won't need to do the slow find_vm_area. I haven't checked
very closely
> but this should be possible in principle. I guess other callers might
> benefit from this as well.

I am confused about this: if "private2" is a pointer, but when I get an
address, I do not even know if the address represents a valid pmalloc
page, how can i know when it's ok to dereference "private2"?

Since it's just another field in a union, it can actually contain a
value that should be interpreted as some other field, right?

--
thanks, igor


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-03 Thread Igor Stoppa
On 03/08/17 14:48, Michal Hocko wrote:
> On Thu 03-08-17 13:11:45, Igor Stoppa wrote:
>> On 02/08/17 20:08, Jerome Glisse wrote:
>>> On Wed, Aug 02, 2017 at 06:14:28PM +0300, Igor Stoppa wrote:

[...]

 from include/linux/mm_types.h:

 struct page {
 ...
   union {
 unsigned long private; /* Mapping-private opaque data:
 * usually used for buffer_heads
 * if PagePrivate set; used for
 * swp_entry_t if PageSwapCache;
 * indicates order in the buddy
 * system if PG_buddy is set.
 */

[...]

>> If the "Mapping-private" was dropped or somehow connected exclusively to
>> the cases listed in the comment, then I think it would be more clear
>> that the comment needs to be intended as related to mapping in certain
>> cases only.
>> But it is otherwise ok to use the "private" field for whatever purpose
>> it might be suitable, as long as it is not already in use.
> 
> I would recommend adding a new field into the enum...

s/enum/union/ ?

If not, I am not sure what is the enum that you are talking about.

[...]

>> But, to reply more specifically to your advice, yes, I think I could add
>> a flag to vm_struct and then retrieve its value, for the address being
>> processed, by passing through find_vm_area().
> 
> ... and you can store vm_struct pointer to the struct page there 

"there" as in the new field of the union?
btw, what would be a meaningful name, since "private" is already taken?

For simplicity, I'll use, for now, "private2"

> and you> won't need to do the slow find_vm_area. I haven't checked
very closely
> but this should be possible in principle. I guess other callers might
> benefit from this as well.

I am confused about this: if "private2" is a pointer, but when I get an
address, I do not even know if the address represents a valid pmalloc
page, how can i know when it's ok to dereference "private2"?

Since it's just another field in a union, it can actually contain a
value that should be interpreted as some other field, right?

--
thanks, igor


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-03 Thread Michal Hocko
On Thu 03-08-17 13:11:45, Igor Stoppa wrote:
> On 02/08/17 20:08, Jerome Glisse wrote:
> > On Wed, Aug 02, 2017 at 06:14:28PM +0300, Igor Stoppa wrote:
[...]
> >> A way to ensure that the address really belongs to pmalloc would be to
> >> pre-screen it, against either the signature or some magic number and,
> >> if such test is passed, then compare the address against those really
> >> available in the pmalloc pools.
> >>
> >> This would be slower, but it would be limited only to those cases where
> >> the signature/magic number matches and the answer is likely to be true.
> >>
> >> 2) However, both the current (incorrect) implementation and the one I am
> >> considering, are abusing something that should be used otherwise (see
> >> the following snippet):
> >>
> >> from include/linux/mm_types.h:
> >>
> >> struct page {
> >> ...
> >>   union {
> >> unsigned long private; /* Mapping-private opaque data:
> >> * usually used for buffer_heads
> >> * if PagePrivate set; used for
> >> * swp_entry_t if PageSwapCache;
> >> * indicates order in the buddy
> >> * system if PG_buddy is set.
> >> */
> >> #if USE_SPLIT_PTE_PTLOCKS
> >> #if ALLOC_SPLIT_PTLOCKS
> >>spinlock_t *ptl;
> >> #else
> >>spinlock_t ptl;
> >> #endif
> >> #endif
> >>struct kmem_cache *slab_cache;  /* SL[AU]B: Pointer to slab */
> >>};
> >> ...
> >> }
> >>
> >>
> >> The "private" field is meant for mapping-private opaque data, which is
> >> not how I am using it.
> > 
> > As you can see this is an union and thus the meaning of that field depends
> > on how the page is use. The private comment you see is only meaningfull for
> > page that are in the page cache and are coming from a file system ie when
> > a process does an mmap of a file. When page is use by sl[au]b the slab_cache
> > field is how it is interpreted ... Context in which a page is use do matter.
> 
> I am not native English speaker, but the comment seems to imply that, no
> matter what, it's Mapping-private.
> 
> If the "Mapping-private" was dropped or somehow connected exclusively to
> the cases listed in the comment, then I think it would be more clear
> that the comment needs to be intended as related to mapping in certain
> cases only.
> But it is otherwise ok to use the "private" field for whatever purpose
> it might be suitable, as long as it is not already in use.

I would recommend adding a new field into the enum...
[...]
> But, to reply more specifically to your advice, yes, I think I could add
> a flag to vm_struct and then retrieve its value, for the address being
> processed, by passing through find_vm_area().

... and you can store vm_struct pointer to the struct page there and you
won't need to do the slow find_vm_area. I haven't checked very closely
but this should be possible in principle. I guess other callers might
benefit from this as well.
-- 
Michal Hocko
SUSE Labs


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-03 Thread Michal Hocko
On Thu 03-08-17 13:11:45, Igor Stoppa wrote:
> On 02/08/17 20:08, Jerome Glisse wrote:
> > On Wed, Aug 02, 2017 at 06:14:28PM +0300, Igor Stoppa wrote:
[...]
> >> A way to ensure that the address really belongs to pmalloc would be to
> >> pre-screen it, against either the signature or some magic number and,
> >> if such test is passed, then compare the address against those really
> >> available in the pmalloc pools.
> >>
> >> This would be slower, but it would be limited only to those cases where
> >> the signature/magic number matches and the answer is likely to be true.
> >>
> >> 2) However, both the current (incorrect) implementation and the one I am
> >> considering, are abusing something that should be used otherwise (see
> >> the following snippet):
> >>
> >> from include/linux/mm_types.h:
> >>
> >> struct page {
> >> ...
> >>   union {
> >> unsigned long private; /* Mapping-private opaque data:
> >> * usually used for buffer_heads
> >> * if PagePrivate set; used for
> >> * swp_entry_t if PageSwapCache;
> >> * indicates order in the buddy
> >> * system if PG_buddy is set.
> >> */
> >> #if USE_SPLIT_PTE_PTLOCKS
> >> #if ALLOC_SPLIT_PTLOCKS
> >>spinlock_t *ptl;
> >> #else
> >>spinlock_t ptl;
> >> #endif
> >> #endif
> >>struct kmem_cache *slab_cache;  /* SL[AU]B: Pointer to slab */
> >>};
> >> ...
> >> }
> >>
> >>
> >> The "private" field is meant for mapping-private opaque data, which is
> >> not how I am using it.
> > 
> > As you can see this is an union and thus the meaning of that field depends
> > on how the page is use. The private comment you see is only meaningfull for
> > page that are in the page cache and are coming from a file system ie when
> > a process does an mmap of a file. When page is use by sl[au]b the slab_cache
> > field is how it is interpreted ... Context in which a page is use do matter.
> 
> I am not native English speaker, but the comment seems to imply that, no
> matter what, it's Mapping-private.
> 
> If the "Mapping-private" was dropped or somehow connected exclusively to
> the cases listed in the comment, then I think it would be more clear
> that the comment needs to be intended as related to mapping in certain
> cases only.
> But it is otherwise ok to use the "private" field for whatever purpose
> it might be suitable, as long as it is not already in use.

I would recommend adding a new field into the enum...
[...]
> But, to reply more specifically to your advice, yes, I think I could add
> a flag to vm_struct and then retrieve its value, for the address being
> processed, by passing through find_vm_area().

... and you can store vm_struct pointer to the struct page there and you
won't need to do the slow find_vm_area. I haven't checked very closely
but this should be possible in principle. I guess other callers might
benefit from this as well.
-- 
Michal Hocko
SUSE Labs


Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-03 Thread Igor Stoppa
On 02/08/17 20:08, Jerome Glisse wrote:
> On Wed, Aug 02, 2017 at 06:14:28PM +0300, Igor Stoppa wrote:

[...]

>> +set_page_private(page, 1);
> 
> Above line is pointless you overwrite value right below

yes ...
> 
>> +page->private = pmalloc_signature;
>> +} else {
>> +BUG_ON(!(page_private(page) &&
>> + page->private == pmalloc_signature));
>> +set_page_private(page, 0);
> 
> Same as above

... and yes

>> +page->private = 0;
>> +}
>> +base += PAGE_SIZE;
>> +} while ((PAGE_MASK & (unsigned long)base) <=
>> + (PAGE_MASK & (unsigned long)end));
>> +return 0;
>> +}
>>
>> ...
>>
>> +static const char msg[] = "Not a valid Pmalloc object.";
>> +const char *pmalloc_check_range(const void *ptr, unsigned long n)
>> +{
>> +unsigned long p;
>> +
>> +p = (unsigned long)ptr;
>> +n = p + n - 1;
>> +for (; (PAGE_MASK & p) <= (PAGE_MASK & n); p += PAGE_SIZE) {
>> +struct page *page;
>> +
>> +if (!is_vmalloc_addr((void *)p))
>> +return msg;
>> +page = vmalloc_to_page((void *)p);
>> +if (!(page && page_private(page) &&
>> +  page->private == pmalloc_signature))
>> +return msg;
>> +}
>> +return NULL;
>> +}
>>
>>
>> The problem here comes from the way I am using page->private:
>> the fact that the page is marked as private means only that someone is
>> using it, and the way it is used could create (spoiler: it happens) a
>> collision with pmalloc_signature, which can generate false positives.
> 
> Is page->private use for vmalloc memory ? If so then pick another field.

No, it is not in use by vmalloc, as far as I can tell, by both reading
the code and empirically printing out its value in few cases.

> Thought i doubt it is use i would need to check. What was the exact
> objection made ?

The objection made is what I tried to explain below, that the comment
besides the declaration of the private field says:
"Mapping-private opaque data: ..."

I'll reply to your answer there.

>> A way to ensure that the address really belongs to pmalloc would be to
>> pre-screen it, against either the signature or some magic number and,
>> if such test is passed, then compare the address against those really
>> available in the pmalloc pools.
>>
>> This would be slower, but it would be limited only to those cases where
>> the signature/magic number matches and the answer is likely to be true.
>>
>> 2) However, both the current (incorrect) implementation and the one I am
>> considering, are abusing something that should be used otherwise (see
>> the following snippet):
>>
>> from include/linux/mm_types.h:
>>
>> struct page {
>> ...
>>   union {
>> unsigned long private;   /* Mapping-private opaque data:
>>   * usually used for buffer_heads
>>   * if PagePrivate set; used for
>>   * swp_entry_t if PageSwapCache;
>>   * indicates order in the buddy
>>   * system if PG_buddy is set.
>>   */
>> #if USE_SPLIT_PTE_PTLOCKS
>> #if ALLOC_SPLIT_PTLOCKS
>>  spinlock_t *ptl;
>> #else
>>  spinlock_t ptl;
>> #endif
>> #endif
>>  struct kmem_cache *slab_cache;  /* SL[AU]B: Pointer to slab */
>>  };
>> ...
>> }
>>
>>
>> The "private" field is meant for mapping-private opaque data, which is
>> not how I am using it.
> 
> As you can see this is an union and thus the meaning of that field depends
> on how the page is use. The private comment you see is only meaningfull for
> page that are in the page cache and are coming from a file system ie when
> a process does an mmap of a file. When page is use by sl[au]b the slab_cache
> field is how it is interpreted ... Context in which a page is use do matter.

I am not native English speaker, but the comment seems to imply that, no
matter what, it's Mapping-private.

If the "Mapping-private" was dropped or somehow connected exclusively to
the cases listed in the comment, then I think it would be more clear
that the comment needs to be intended as related to mapping in certain
cases only.
But it is otherwise ok to use the "private" field for whatever purpose
it might be suitable, as long as it is not already in use.

> Here we are talking about memory that is allocated to back vmalloc area so
> the private field is unuse AFAICR and it is safe to use it while the page
> is use for vmalloc.

Yes, my experience seems to confirm that.

> Note that i don't think anyone is doing vmap() of pages that are in the page
> cache that would seem wrong from my point of view but maybe some one is.
> Thought someone might be doing vmap() of pages in which the 

Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-03 Thread Igor Stoppa
On 02/08/17 20:08, Jerome Glisse wrote:
> On Wed, Aug 02, 2017 at 06:14:28PM +0300, Igor Stoppa wrote:

[...]

>> +set_page_private(page, 1);
> 
> Above line is pointless you overwrite value right below

yes ...
> 
>> +page->private = pmalloc_signature;
>> +} else {
>> +BUG_ON(!(page_private(page) &&
>> + page->private == pmalloc_signature));
>> +set_page_private(page, 0);
> 
> Same as above

... and yes

>> +page->private = 0;
>> +}
>> +base += PAGE_SIZE;
>> +} while ((PAGE_MASK & (unsigned long)base) <=
>> + (PAGE_MASK & (unsigned long)end));
>> +return 0;
>> +}
>>
>> ...
>>
>> +static const char msg[] = "Not a valid Pmalloc object.";
>> +const char *pmalloc_check_range(const void *ptr, unsigned long n)
>> +{
>> +unsigned long p;
>> +
>> +p = (unsigned long)ptr;
>> +n = p + n - 1;
>> +for (; (PAGE_MASK & p) <= (PAGE_MASK & n); p += PAGE_SIZE) {
>> +struct page *page;
>> +
>> +if (!is_vmalloc_addr((void *)p))
>> +return msg;
>> +page = vmalloc_to_page((void *)p);
>> +if (!(page && page_private(page) &&
>> +  page->private == pmalloc_signature))
>> +return msg;
>> +}
>> +return NULL;
>> +}
>>
>>
>> The problem here comes from the way I am using page->private:
>> the fact that the page is marked as private means only that someone is
>> using it, and the way it is used could create (spoiler: it happens) a
>> collision with pmalloc_signature, which can generate false positives.
> 
> Is page->private use for vmalloc memory ? If so then pick another field.

No, it is not in use by vmalloc, as far as I can tell, by both reading
the code and empirically printing out its value in few cases.

> Thought i doubt it is use i would need to check. What was the exact
> objection made ?

The objection made is what I tried to explain below, that the comment
besides the declaration of the private field says:
"Mapping-private opaque data: ..."

I'll reply to your answer there.

>> A way to ensure that the address really belongs to pmalloc would be to
>> pre-screen it, against either the signature or some magic number and,
>> if such test is passed, then compare the address against those really
>> available in the pmalloc pools.
>>
>> This would be slower, but it would be limited only to those cases where
>> the signature/magic number matches and the answer is likely to be true.
>>
>> 2) However, both the current (incorrect) implementation and the one I am
>> considering, are abusing something that should be used otherwise (see
>> the following snippet):
>>
>> from include/linux/mm_types.h:
>>
>> struct page {
>> ...
>>   union {
>> unsigned long private;   /* Mapping-private opaque data:
>>   * usually used for buffer_heads
>>   * if PagePrivate set; used for
>>   * swp_entry_t if PageSwapCache;
>>   * indicates order in the buddy
>>   * system if PG_buddy is set.
>>   */
>> #if USE_SPLIT_PTE_PTLOCKS
>> #if ALLOC_SPLIT_PTLOCKS
>>  spinlock_t *ptl;
>> #else
>>  spinlock_t ptl;
>> #endif
>> #endif
>>  struct kmem_cache *slab_cache;  /* SL[AU]B: Pointer to slab */
>>  };
>> ...
>> }
>>
>>
>> The "private" field is meant for mapping-private opaque data, which is
>> not how I am using it.
> 
> As you can see this is an union and thus the meaning of that field depends
> on how the page is use. The private comment you see is only meaningfull for
> page that are in the page cache and are coming from a file system ie when
> a process does an mmap of a file. When page is use by sl[au]b the slab_cache
> field is how it is interpreted ... Context in which a page is use do matter.

I am not native English speaker, but the comment seems to imply that, no
matter what, it's Mapping-private.

If the "Mapping-private" was dropped or somehow connected exclusively to
the cases listed in the comment, then I think it would be more clear
that the comment needs to be intended as related to mapping in certain
cases only.
But it is otherwise ok to use the "private" field for whatever purpose
it might be suitable, as long as it is not already in use.

> Here we are talking about memory that is allocated to back vmalloc area so
> the private field is unuse AFAICR and it is safe to use it while the page
> is use for vmalloc.

Yes, my experience seems to confirm that.

> Note that i don't think anyone is doing vmap() of pages that are in the page
> cache that would seem wrong from my point of view but maybe some one is.
> Thought someone might be doing vmap() of pages in which the 

Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-02 Thread Jerome Glisse
On Wed, Aug 02, 2017 at 06:14:28PM +0300, Igor Stoppa wrote:
> Hi,
> while I am working to another example of using pmalloc [1],
> it was pointed out to me that:
> 
> 1) I had introduced a bug when I switched to using a field of the page
> structure [2]
> 
> 2) I was also committing a layer violation in the way I was tagging the
> pages.
> 
> I am seeking help to understand what would be the correct way to do the
> tagging.
> 
> Here are snippets describing the problems:
> 
> 
> 1) from pmalloc.c:
> 
> ...
> 
> +static const unsigned long pmalloc_signature = (unsigned
> long)_mutex;
> 
> ...
> 
> +int __pmalloc_tag_pages(void *base, const size_t size, const bool set_tag)
> +{
> + void *end = base + size - 1;
> +
> + do {
> + struct page *page;
> +
> + if (!is_vmalloc_addr(base))
> + return -EINVAL;
> + page = vmalloc_to_page(base);
> + if (set_tag) {
> + BUG_ON(page_private(page) || page->private);
> + set_page_private(page, 1);

Above line is pointless you overwrite value right below

> + page->private = pmalloc_signature;
> + } else {
> + BUG_ON(!(page_private(page) &&
> +  page->private == pmalloc_signature));
> + set_page_private(page, 0);

Same as above

> + page->private = 0;
> + }
> + base += PAGE_SIZE;
> + } while ((PAGE_MASK & (unsigned long)base) <=
> +  (PAGE_MASK & (unsigned long)end));
> + return 0;
> +}
> 
> ...
> 
> +static const char msg[] = "Not a valid Pmalloc object.";
> +const char *pmalloc_check_range(const void *ptr, unsigned long n)
> +{
> + unsigned long p;
> +
> + p = (unsigned long)ptr;
> + n = p + n - 1;
> + for (; (PAGE_MASK & p) <= (PAGE_MASK & n); p += PAGE_SIZE) {
> + struct page *page;
> +
> + if (!is_vmalloc_addr((void *)p))
> + return msg;
> + page = vmalloc_to_page((void *)p);
> + if (!(page && page_private(page) &&
> +   page->private == pmalloc_signature))
> + return msg;
> + }
> + return NULL;
> +}
> 
> 
> The problem here comes from the way I am using page->private:
> the fact that the page is marked as private means only that someone is
> using it, and the way it is used could create (spoiler: it happens) a
> collision with pmalloc_signature, which can generate false positives.

Is page->private use for vmalloc memory ? If so then pick another field.
Thought i doubt it is use i would need to check. What was the exact
objection made ?

> 
> A way to ensure that the address really belongs to pmalloc would be to
> pre-screen it, against either the signature or some magic number and,
> if such test is passed, then compare the address against those really
> available in the pmalloc pools.
> 
> This would be slower, but it would be limited only to those cases where
> the signature/magic number matches and the answer is likely to be true.
> 
> 2) However, both the current (incorrect) implementation and the one I am
> considering, are abusing something that should be used otherwise (see
> the following snippet):
> 
> from include/linux/mm_types.h:
> 
> struct page {
> ...
>   union {
> unsigned long private;/* Mapping-private opaque data:
>* usually used for buffer_heads
>* if PagePrivate set; used for
>* swp_entry_t if PageSwapCache;
>* indicates order in the buddy
>* system if PG_buddy is set.
>*/
> #if USE_SPLIT_PTE_PTLOCKS
> #if ALLOC_SPLIT_PTLOCKS
>   spinlock_t *ptl;
> #else
>   spinlock_t ptl;
> #endif
> #endif
>   struct kmem_cache *slab_cache;  /* SL[AU]B: Pointer to slab */
>   };
> ...
> }
> 
> 
> The "private" field is meant for mapping-private opaque data, which is
> not how I am using it.

As you can see this is an union and thus the meaning of that field depends
on how the page is use. The private comment you see is only meaningfull for
page that are in the page cache and are coming from a file system ie when
a process does an mmap of a file. When page is use by sl[au]b the slab_cache
field is how it is interpreted ... Context in which a page is use do matter.

Here we are talking about memory that is allocated to back vmalloc area so
the private field is unuse AFAICR and it is safe to use it while the page
is use for vmalloc.

Note that i don't think anyone is doing vmap() of pages that are in the page
cache that would seem wrong from my point of view but maybe some one is.
Thought someone might be doing vmap() of pages in which the private field is
use for something 

Re: [RFC] Tagging of vmalloc pages for supporting the pmalloc allocator

2017-08-02 Thread Jerome Glisse
On Wed, Aug 02, 2017 at 06:14:28PM +0300, Igor Stoppa wrote:
> Hi,
> while I am working to another example of using pmalloc [1],
> it was pointed out to me that:
> 
> 1) I had introduced a bug when I switched to using a field of the page
> structure [2]
> 
> 2) I was also committing a layer violation in the way I was tagging the
> pages.
> 
> I am seeking help to understand what would be the correct way to do the
> tagging.
> 
> Here are snippets describing the problems:
> 
> 
> 1) from pmalloc.c:
> 
> ...
> 
> +static const unsigned long pmalloc_signature = (unsigned
> long)_mutex;
> 
> ...
> 
> +int __pmalloc_tag_pages(void *base, const size_t size, const bool set_tag)
> +{
> + void *end = base + size - 1;
> +
> + do {
> + struct page *page;
> +
> + if (!is_vmalloc_addr(base))
> + return -EINVAL;
> + page = vmalloc_to_page(base);
> + if (set_tag) {
> + BUG_ON(page_private(page) || page->private);
> + set_page_private(page, 1);

Above line is pointless you overwrite value right below

> + page->private = pmalloc_signature;
> + } else {
> + BUG_ON(!(page_private(page) &&
> +  page->private == pmalloc_signature));
> + set_page_private(page, 0);

Same as above

> + page->private = 0;
> + }
> + base += PAGE_SIZE;
> + } while ((PAGE_MASK & (unsigned long)base) <=
> +  (PAGE_MASK & (unsigned long)end));
> + return 0;
> +}
> 
> ...
> 
> +static const char msg[] = "Not a valid Pmalloc object.";
> +const char *pmalloc_check_range(const void *ptr, unsigned long n)
> +{
> + unsigned long p;
> +
> + p = (unsigned long)ptr;
> + n = p + n - 1;
> + for (; (PAGE_MASK & p) <= (PAGE_MASK & n); p += PAGE_SIZE) {
> + struct page *page;
> +
> + if (!is_vmalloc_addr((void *)p))
> + return msg;
> + page = vmalloc_to_page((void *)p);
> + if (!(page && page_private(page) &&
> +   page->private == pmalloc_signature))
> + return msg;
> + }
> + return NULL;
> +}
> 
> 
> The problem here comes from the way I am using page->private:
> the fact that the page is marked as private means only that someone is
> using it, and the way it is used could create (spoiler: it happens) a
> collision with pmalloc_signature, which can generate false positives.

Is page->private use for vmalloc memory ? If so then pick another field.
Thought i doubt it is use i would need to check. What was the exact
objection made ?

> 
> A way to ensure that the address really belongs to pmalloc would be to
> pre-screen it, against either the signature or some magic number and,
> if such test is passed, then compare the address against those really
> available in the pmalloc pools.
> 
> This would be slower, but it would be limited only to those cases where
> the signature/magic number matches and the answer is likely to be true.
> 
> 2) However, both the current (incorrect) implementation and the one I am
> considering, are abusing something that should be used otherwise (see
> the following snippet):
> 
> from include/linux/mm_types.h:
> 
> struct page {
> ...
>   union {
> unsigned long private;/* Mapping-private opaque data:
>* usually used for buffer_heads
>* if PagePrivate set; used for
>* swp_entry_t if PageSwapCache;
>* indicates order in the buddy
>* system if PG_buddy is set.
>*/
> #if USE_SPLIT_PTE_PTLOCKS
> #if ALLOC_SPLIT_PTLOCKS
>   spinlock_t *ptl;
> #else
>   spinlock_t ptl;
> #endif
> #endif
>   struct kmem_cache *slab_cache;  /* SL[AU]B: Pointer to slab */
>   };
> ...
> }
> 
> 
> The "private" field is meant for mapping-private opaque data, which is
> not how I am using it.

As you can see this is an union and thus the meaning of that field depends
on how the page is use. The private comment you see is only meaningfull for
page that are in the page cache and are coming from a file system ie when
a process does an mmap of a file. When page is use by sl[au]b the slab_cache
field is how it is interpreted ... Context in which a page is use do matter.

Here we are talking about memory that is allocated to back vmalloc area so
the private field is unuse AFAICR and it is safe to use it while the page
is use for vmalloc.

Note that i don't think anyone is doing vmap() of pages that are in the page
cache that would seem wrong from my point of view but maybe some one is.
Thought someone might be doing vmap() of pages in which the private field is
use for something