Re: [PATCH 3/4] mm/hugeltb: fix potential wrong gbl_reserve value for hugetlb_acct_memory()

2021-04-08 Thread Miaohe Lin
On 2021/4/9 12:37, Mike Kravetz wrote:
> On 4/8/21 8:01 PM, Miaohe Lin wrote:
>> On 2021/4/9 6:53, Mike Kravetz wrote:
>>>
>>> Yes, add a comment to hugetlb_unreserve_pages saying that !resv_map
>>> implies freed == 0.
>>>
>>
>> Sounds good!
>>
>>> It would also be helpful to check for (chg - freed) == 0 and skip the
>>> calls to hugepage_subpool_put_pages() and hugetlb_acct_memory().  Both
>>> of those routines may perform an unnecessary lock/unlock cycle in this
>>> case.
>>>
>>> A simple
>>> if (chg == free)
>>> return 0;
>>> before the call to hugepage_subpool_put_pages would work.
>>
>> This may not be really helpful because hugepage_subpool_put_pages() and 
>> hugetlb_acct_memory()
>> both would handle delta == 0 case without unnecessary lock/unlock cycle.
>> Does this make sense for you? If so, I will prepare v2 with the changes to 
>> add a comment
>> to hugetlb_unreserve_pages() __without__ the check for (chg - freed) == 0.
> 
> Sorry, I forgot about the recent changes to check for delta == 0.
> No need for the check here, just the comment.
> 

That's all right. Will add the comment in V2.
Thanks.


Re: [PATCH 3/4] mm/hugeltb: fix potential wrong gbl_reserve value for hugetlb_acct_memory()

2021-04-08 Thread Mike Kravetz
On 4/8/21 8:01 PM, Miaohe Lin wrote:
> On 2021/4/9 6:53, Mike Kravetz wrote:
>>
>> Yes, add a comment to hugetlb_unreserve_pages saying that !resv_map
>> implies freed == 0.
>>
> 
> Sounds good!
> 
>> It would also be helpful to check for (chg - freed) == 0 and skip the
>> calls to hugepage_subpool_put_pages() and hugetlb_acct_memory().  Both
>> of those routines may perform an unnecessary lock/unlock cycle in this
>> case.
>>
>> A simple
>>  if (chg == free)
>>  return 0;
>> before the call to hugepage_subpool_put_pages would work.
> 
> This may not be really helpful because hugepage_subpool_put_pages() and 
> hugetlb_acct_memory()
> both would handle delta == 0 case without unnecessary lock/unlock cycle.
> Does this make sense for you? If so, I will prepare v2 with the changes to 
> add a comment
> to hugetlb_unreserve_pages() __without__ the check for (chg - freed) == 0.

Sorry, I forgot about the recent changes to check for delta == 0.
No need for the check here, just the comment.
-- 
Mike Kravetz


Re: [PATCH 3/4] mm/hugeltb: fix potential wrong gbl_reserve value for hugetlb_acct_memory()

2021-04-08 Thread Miaohe Lin
On 2021/4/9 6:53, Mike Kravetz wrote:
> On 4/7/21 8:26 PM, Miaohe Lin wrote:
>> On 2021/4/8 11:24, Miaohe Lin wrote:
>>> On 2021/4/8 4:53, Mike Kravetz wrote:
 On 4/7/21 12:24 AM, Miaohe Lin wrote:
> Hi:
> On 2021/4/7 10:49, Mike Kravetz wrote:
>> On 4/2/21 2:32 AM, Miaohe Lin wrote:
>>> The resv_map could be NULL since this routine can be called in the evict
>>> inode path for all hugetlbfs inodes. So we could have chg = 0 and this
>>> would result in a negative value when chg - freed. This is unexpected 
>>> for
>>> hugepage_subpool_put_pages() and hugetlb_acct_memory().
>>
>> I am not sure if this is possible.
>>
>> It is true that resv_map could be NULL.  However, I believe resv map
>> can only be NULL for inodes that are not regular or link inodes.  This
>> is the inode creation code in hugetlbfs_get_inode().
>>
>>/*
>>  * Reserve maps are only needed for inodes that can have 
>> associated
>>  * page allocations.
>>  */
>> if (S_ISREG(mode) || S_ISLNK(mode)) {
>> resv_map = resv_map_alloc();
>> if (!resv_map)
>> return NULL;
>> }
>>
>
> Agree.
>
>> If resv_map is NULL, then no hugetlb pages can be allocated/associated
>> with the file.  As a result, remove_inode_hugepages will never find any
>> huge pages associated with the inode and the passed value 'freed' will
>> always be zero.
>>
>
> But I am confused now. AFAICS, remove_inode_hugepages() searches the 
> address_space of
> the inode to remove the hugepages while does not care if inode has 
> associated resv_map.
> How does it prevent hugetlb pages from being allocated/associated with 
> the file if
> resv_map is NULL? Could you please explain this more?
>

 Recall that there are only two ways to get huge pages associated with
 a hugetlbfs file: fallocate and mmap/write fault.  Directly writing to
 hugetlbfs files is not supported.

 If you take a closer look at hugetlbfs_get_inode, it has that code to
 allocate the resv map mentioned above as well as the following:

switch (mode & S_IFMT) {
default:
init_special_inode(inode, mode, dev);
break;
case S_IFREG:
inode->i_op = &hugetlbfs_inode_operations;
inode->i_fop = &hugetlbfs_file_operations;
break;
case S_IFDIR:
inode->i_op = &hugetlbfs_dir_inode_operations;
inode->i_fop = &simple_dir_operations;

/* directory inodes start off with i_nlink == 2 (for 
 "." entry) */
inc_nlink(inode);
break;
case S_IFLNK:
inode->i_op = &page_symlink_inode_operations;
inode_nohighmem(inode);
break;
}

 Notice that only S_IFREG inodes will have i_fop == 
 &hugetlbfs_file_operations.
 hugetlbfs_file_operations contain the hugetlbfs specific mmap and fallocate
 routines.  Hence, only files with S_IFREG inodes can potentially have
 associated huge pages.  S_IFLNK inodes can as well via file linking.

 If an inode is not S_ISREG(mode) || S_ISLNK(mode), then it will not have
 a resv_map.  In addition, it will not have hugetlbfs_file_operations and
 can not have associated huge pages.

>>>
>>> Many many thanks for detailed and patient explanation! :) I think I have 
>>> got the idea!
>>>
 I looked at this closely when adding commits
 58b6e5e8f1ad hugetlbfs: fix memory leak for resv_map
 f27a5136f70a hugetlbfs: always use address space in inode for resv_map 
 pointer

 I may not be remembering all of the details correctly.  Commit f27a5136f70a
 added the comment that resv_map could be NULL to hugetlb_unreserve_pages.

>>>
>>> Since we must have freed == 0 while chg == 0. Should we make this 
>>> assumption explict
>>> by something like below?
>>>
>>> WARN_ON(chg < freed);
>>>
>>
>> Or just a comment to avoid confusion ?
>>
> 
> Yes, add a comment to hugetlb_unreserve_pages saying that !resv_map
> implies freed == 0.
> 

Sounds good!

> It would also be helpful to check for (chg - freed) == 0 and skip the
> calls to hugepage_subpool_put_pages() and hugetlb_acct_memory().  Both
> of those routines may perform an unnecessary lock/unlock cycle in this
> case.
> 
> A simple
>   if (chg == free)
>   return 0;
> before the call to hugepage_subpool_put_pages would work.

This may not be really helpful because hugepage_subpool_put_pages() and 
hugetlb_acct_memory()
both would handle delta == 0 case without unnecessary lock/unl

Re: [PATCH 3/4] mm/hugeltb: fix potential wrong gbl_reserve value for hugetlb_acct_memory()

2021-04-08 Thread Mike Kravetz
On 4/7/21 8:26 PM, Miaohe Lin wrote:
> On 2021/4/8 11:24, Miaohe Lin wrote:
>> On 2021/4/8 4:53, Mike Kravetz wrote:
>>> On 4/7/21 12:24 AM, Miaohe Lin wrote:
 Hi:
 On 2021/4/7 10:49, Mike Kravetz wrote:
> On 4/2/21 2:32 AM, Miaohe Lin wrote:
>> The resv_map could be NULL since this routine can be called in the evict
>> inode path for all hugetlbfs inodes. So we could have chg = 0 and this
>> would result in a negative value when chg - freed. This is unexpected for
>> hugepage_subpool_put_pages() and hugetlb_acct_memory().
>
> I am not sure if this is possible.
>
> It is true that resv_map could be NULL.  However, I believe resv map
> can only be NULL for inodes that are not regular or link inodes.  This
> is the inode creation code in hugetlbfs_get_inode().
>
>/*
>  * Reserve maps are only needed for inodes that can have 
> associated
>  * page allocations.
>  */
> if (S_ISREG(mode) || S_ISLNK(mode)) {
> resv_map = resv_map_alloc();
> if (!resv_map)
> return NULL;
> }
>

 Agree.

> If resv_map is NULL, then no hugetlb pages can be allocated/associated
> with the file.  As a result, remove_inode_hugepages will never find any
> huge pages associated with the inode and the passed value 'freed' will
> always be zero.
>

 But I am confused now. AFAICS, remove_inode_hugepages() searches the 
 address_space of
 the inode to remove the hugepages while does not care if inode has 
 associated resv_map.
 How does it prevent hugetlb pages from being allocated/associated with the 
 file if
 resv_map is NULL? Could you please explain this more?

>>>
>>> Recall that there are only two ways to get huge pages associated with
>>> a hugetlbfs file: fallocate and mmap/write fault.  Directly writing to
>>> hugetlbfs files is not supported.
>>>
>>> If you take a closer look at hugetlbfs_get_inode, it has that code to
>>> allocate the resv map mentioned above as well as the following:
>>>
>>> switch (mode & S_IFMT) {
>>> default:
>>> init_special_inode(inode, mode, dev);
>>> break;
>>> case S_IFREG:
>>> inode->i_op = &hugetlbfs_inode_operations;
>>> inode->i_fop = &hugetlbfs_file_operations;
>>> break;
>>> case S_IFDIR:
>>> inode->i_op = &hugetlbfs_dir_inode_operations;
>>> inode->i_fop = &simple_dir_operations;
>>>
>>> /* directory inodes start off with i_nlink == 2 (for 
>>> "." entry) */
>>> inc_nlink(inode);
>>> break;
>>> case S_IFLNK:
>>> inode->i_op = &page_symlink_inode_operations;
>>> inode_nohighmem(inode);
>>> break;
>>> }
>>>
>>> Notice that only S_IFREG inodes will have i_fop == 
>>> &hugetlbfs_file_operations.
>>> hugetlbfs_file_operations contain the hugetlbfs specific mmap and fallocate
>>> routines.  Hence, only files with S_IFREG inodes can potentially have
>>> associated huge pages.  S_IFLNK inodes can as well via file linking.
>>>
>>> If an inode is not S_ISREG(mode) || S_ISLNK(mode), then it will not have
>>> a resv_map.  In addition, it will not have hugetlbfs_file_operations and
>>> can not have associated huge pages.
>>>
>>
>> Many many thanks for detailed and patient explanation! :) I think I have got 
>> the idea!
>>
>>> I looked at this closely when adding commits
>>> 58b6e5e8f1ad hugetlbfs: fix memory leak for resv_map
>>> f27a5136f70a hugetlbfs: always use address space in inode for resv_map 
>>> pointer
>>>
>>> I may not be remembering all of the details correctly.  Commit f27a5136f70a
>>> added the comment that resv_map could be NULL to hugetlb_unreserve_pages.
>>>
>>
>> Since we must have freed == 0 while chg == 0. Should we make this assumption 
>> explict
>> by something like below?
>>
>> WARN_ON(chg < freed);
>>
> 
> Or just a comment to avoid confusion ?
> 

Yes, add a comment to hugetlb_unreserve_pages saying that !resv_map
implies freed == 0.

It would also be helpful to check for (chg - freed) == 0 and skip the
calls to hugepage_subpool_put_pages() and hugetlb_acct_memory().  Both
of those routines may perform an unnecessary lock/unlock cycle in this
case.

A simple
if (chg == free)
return 0;
before the call to hugepage_subpool_put_pages would work.
-- 
Mike Kravetz


Re: [PATCH 3/4] mm/hugeltb: fix potential wrong gbl_reserve value for hugetlb_acct_memory()

2021-04-07 Thread Miaohe Lin
On 2021/4/8 11:24, Miaohe Lin wrote:
> On 2021/4/8 4:53, Mike Kravetz wrote:
>> On 4/7/21 12:24 AM, Miaohe Lin wrote:
>>> Hi:
>>> On 2021/4/7 10:49, Mike Kravetz wrote:
 On 4/2/21 2:32 AM, Miaohe Lin wrote:
> The resv_map could be NULL since this routine can be called in the evict
> inode path for all hugetlbfs inodes. So we could have chg = 0 and this
> would result in a negative value when chg - freed. This is unexpected for
> hugepage_subpool_put_pages() and hugetlb_acct_memory().

 I am not sure if this is possible.

 It is true that resv_map could be NULL.  However, I believe resv map
 can only be NULL for inodes that are not regular or link inodes.  This
 is the inode creation code in hugetlbfs_get_inode().

/*
  * Reserve maps are only needed for inodes that can have associated
  * page allocations.
  */
 if (S_ISREG(mode) || S_ISLNK(mode)) {
 resv_map = resv_map_alloc();
 if (!resv_map)
 return NULL;
 }

>>>
>>> Agree.
>>>
 If resv_map is NULL, then no hugetlb pages can be allocated/associated
 with the file.  As a result, remove_inode_hugepages will never find any
 huge pages associated with the inode and the passed value 'freed' will
 always be zero.

>>>
>>> But I am confused now. AFAICS, remove_inode_hugepages() searches the 
>>> address_space of
>>> the inode to remove the hugepages while does not care if inode has 
>>> associated resv_map.
>>> How does it prevent hugetlb pages from being allocated/associated with the 
>>> file if
>>> resv_map is NULL? Could you please explain this more?
>>>
>>
>> Recall that there are only two ways to get huge pages associated with
>> a hugetlbfs file: fallocate and mmap/write fault.  Directly writing to
>> hugetlbfs files is not supported.
>>
>> If you take a closer look at hugetlbfs_get_inode, it has that code to
>> allocate the resv map mentioned above as well as the following:
>>
>>  switch (mode & S_IFMT) {
>>  default:
>>  init_special_inode(inode, mode, dev);
>>  break;
>>  case S_IFREG:
>>  inode->i_op = &hugetlbfs_inode_operations;
>>  inode->i_fop = &hugetlbfs_file_operations;
>>  break;
>>  case S_IFDIR:
>>  inode->i_op = &hugetlbfs_dir_inode_operations;
>>  inode->i_fop = &simple_dir_operations;
>>
>>  /* directory inodes start off with i_nlink == 2 (for 
>> "." entry) */
>>  inc_nlink(inode);
>>  break;
>>  case S_IFLNK:
>>  inode->i_op = &page_symlink_inode_operations;
>>  inode_nohighmem(inode);
>>  break;
>>  }
>>
>> Notice that only S_IFREG inodes will have i_fop == 
>> &hugetlbfs_file_operations.
>> hugetlbfs_file_operations contain the hugetlbfs specific mmap and fallocate
>> routines.  Hence, only files with S_IFREG inodes can potentially have
>> associated huge pages.  S_IFLNK inodes can as well via file linking.
>>
>> If an inode is not S_ISREG(mode) || S_ISLNK(mode), then it will not have
>> a resv_map.  In addition, it will not have hugetlbfs_file_operations and
>> can not have associated huge pages.
>>
> 
> Many many thanks for detailed and patient explanation! :) I think I have got 
> the idea!
> 
>> I looked at this closely when adding commits
>> 58b6e5e8f1ad hugetlbfs: fix memory leak for resv_map
>> f27a5136f70a hugetlbfs: always use address space in inode for resv_map 
>> pointer
>>
>> I may not be remembering all of the details correctly.  Commit f27a5136f70a
>> added the comment that resv_map could be NULL to hugetlb_unreserve_pages.
>>
> 
> Since we must have freed == 0 while chg == 0. Should we make this assumption 
> explict
> by something like below?
> 
> WARN_ON(chg < freed);
> 

Or just a comment to avoid confusion ?

> Thanks again!
> 


Re: [PATCH 3/4] mm/hugeltb: fix potential wrong gbl_reserve value for hugetlb_acct_memory()

2021-04-07 Thread Miaohe Lin
On 2021/4/8 4:53, Mike Kravetz wrote:
> On 4/7/21 12:24 AM, Miaohe Lin wrote:
>> Hi:
>> On 2021/4/7 10:49, Mike Kravetz wrote:
>>> On 4/2/21 2:32 AM, Miaohe Lin wrote:
 The resv_map could be NULL since this routine can be called in the evict
 inode path for all hugetlbfs inodes. So we could have chg = 0 and this
 would result in a negative value when chg - freed. This is unexpected for
 hugepage_subpool_put_pages() and hugetlb_acct_memory().
>>>
>>> I am not sure if this is possible.
>>>
>>> It is true that resv_map could be NULL.  However, I believe resv map
>>> can only be NULL for inodes that are not regular or link inodes.  This
>>> is the inode creation code in hugetlbfs_get_inode().
>>>
>>>/*
>>>  * Reserve maps are only needed for inodes that can have associated
>>>  * page allocations.
>>>  */
>>> if (S_ISREG(mode) || S_ISLNK(mode)) {
>>> resv_map = resv_map_alloc();
>>> if (!resv_map)
>>> return NULL;
>>> }
>>>
>>
>> Agree.
>>
>>> If resv_map is NULL, then no hugetlb pages can be allocated/associated
>>> with the file.  As a result, remove_inode_hugepages will never find any
>>> huge pages associated with the inode and the passed value 'freed' will
>>> always be zero.
>>>
>>
>> But I am confused now. AFAICS, remove_inode_hugepages() searches the 
>> address_space of
>> the inode to remove the hugepages while does not care if inode has 
>> associated resv_map.
>> How does it prevent hugetlb pages from being allocated/associated with the 
>> file if
>> resv_map is NULL? Could you please explain this more?
>>
> 
> Recall that there are only two ways to get huge pages associated with
> a hugetlbfs file: fallocate and mmap/write fault.  Directly writing to
> hugetlbfs files is not supported.
> 
> If you take a closer look at hugetlbfs_get_inode, it has that code to
> allocate the resv map mentioned above as well as the following:
> 
>   switch (mode & S_IFMT) {
>   default:
>   init_special_inode(inode, mode, dev);
>   break;
>   case S_IFREG:
>   inode->i_op = &hugetlbfs_inode_operations;
>   inode->i_fop = &hugetlbfs_file_operations;
>   break;
>   case S_IFDIR:
>   inode->i_op = &hugetlbfs_dir_inode_operations;
>   inode->i_fop = &simple_dir_operations;
> 
>   /* directory inodes start off with i_nlink == 2 (for 
> "." entry) */
>   inc_nlink(inode);
>   break;
>   case S_IFLNK:
>   inode->i_op = &page_symlink_inode_operations;
>   inode_nohighmem(inode);
>   break;
>   }
> 
> Notice that only S_IFREG inodes will have i_fop == &hugetlbfs_file_operations.
> hugetlbfs_file_operations contain the hugetlbfs specific mmap and fallocate
> routines.  Hence, only files with S_IFREG inodes can potentially have
> associated huge pages.  S_IFLNK inodes can as well via file linking.
> 
> If an inode is not S_ISREG(mode) || S_ISLNK(mode), then it will not have
> a resv_map.  In addition, it will not have hugetlbfs_file_operations and
> can not have associated huge pages.
> 

Many many thanks for detailed and patient explanation! :) I think I have got 
the idea!

> I looked at this closely when adding commits
> 58b6e5e8f1ad hugetlbfs: fix memory leak for resv_map
> f27a5136f70a hugetlbfs: always use address space in inode for resv_map pointer
> 
> I may not be remembering all of the details correctly.  Commit f27a5136f70a
> added the comment that resv_map could be NULL to hugetlb_unreserve_pages.
> 

Since we must have freed == 0 while chg == 0. Should we make this assumption 
explict
by something like below?

WARN_ON(chg < freed);

Thanks again!


Re: [PATCH 3/4] mm/hugeltb: fix potential wrong gbl_reserve value for hugetlb_acct_memory()

2021-04-07 Thread Mike Kravetz
On 4/7/21 12:24 AM, Miaohe Lin wrote:
> Hi:
> On 2021/4/7 10:49, Mike Kravetz wrote:
>> On 4/2/21 2:32 AM, Miaohe Lin wrote:
>>> The resv_map could be NULL since this routine can be called in the evict
>>> inode path for all hugetlbfs inodes. So we could have chg = 0 and this
>>> would result in a negative value when chg - freed. This is unexpected for
>>> hugepage_subpool_put_pages() and hugetlb_acct_memory().
>>
>> I am not sure if this is possible.
>>
>> It is true that resv_map could be NULL.  However, I believe resv map
>> can only be NULL for inodes that are not regular or link inodes.  This
>> is the inode creation code in hugetlbfs_get_inode().
>>
>>/*
>>  * Reserve maps are only needed for inodes that can have associated
>>  * page allocations.
>>  */
>> if (S_ISREG(mode) || S_ISLNK(mode)) {
>> resv_map = resv_map_alloc();
>> if (!resv_map)
>> return NULL;
>> }
>>
> 
> Agree.
> 
>> If resv_map is NULL, then no hugetlb pages can be allocated/associated
>> with the file.  As a result, remove_inode_hugepages will never find any
>> huge pages associated with the inode and the passed value 'freed' will
>> always be zero.
>>
> 
> But I am confused now. AFAICS, remove_inode_hugepages() searches the 
> address_space of
> the inode to remove the hugepages while does not care if inode has associated 
> resv_map.
> How does it prevent hugetlb pages from being allocated/associated with the 
> file if
> resv_map is NULL? Could you please explain this more?
> 

Recall that there are only two ways to get huge pages associated with
a hugetlbfs file: fallocate and mmap/write fault.  Directly writing to
hugetlbfs files is not supported.

If you take a closer look at hugetlbfs_get_inode, it has that code to
allocate the resv map mentioned above as well as the following:

switch (mode & S_IFMT) {
default:
init_special_inode(inode, mode, dev);
break;
case S_IFREG:
inode->i_op = &hugetlbfs_inode_operations;
inode->i_fop = &hugetlbfs_file_operations;
break;
case S_IFDIR:
inode->i_op = &hugetlbfs_dir_inode_operations;
inode->i_fop = &simple_dir_operations;

/* directory inodes start off with i_nlink == 2 (for 
"." entry) */
inc_nlink(inode);
break;
case S_IFLNK:
inode->i_op = &page_symlink_inode_operations;
inode_nohighmem(inode);
break;
}

Notice that only S_IFREG inodes will have i_fop == &hugetlbfs_file_operations.
hugetlbfs_file_operations contain the hugetlbfs specific mmap and fallocate
routines.  Hence, only files with S_IFREG inodes can potentially have
associated huge pages.  S_IFLNK inodes can as well via file linking.

If an inode is not S_ISREG(mode) || S_ISLNK(mode), then it will not have
a resv_map.  In addition, it will not have hugetlbfs_file_operations and
can not have associated huge pages.

I looked at this closely when adding commits
58b6e5e8f1ad hugetlbfs: fix memory leak for resv_map
f27a5136f70a hugetlbfs: always use address space in inode for resv_map pointer

I may not be remembering all of the details correctly.  Commit f27a5136f70a
added the comment that resv_map could be NULL to hugetlb_unreserve_pages.
-- 
Mike Kravetz


Re: [PATCH 3/4] mm/hugeltb: fix potential wrong gbl_reserve value for hugetlb_acct_memory()

2021-04-07 Thread Miaohe Lin
Hi:
On 2021/4/7 10:49, Mike Kravetz wrote:
> On 4/2/21 2:32 AM, Miaohe Lin wrote:
>> The resv_map could be NULL since this routine can be called in the evict
>> inode path for all hugetlbfs inodes. So we could have chg = 0 and this
>> would result in a negative value when chg - freed. This is unexpected for
>> hugepage_subpool_put_pages() and hugetlb_acct_memory().
> 
> I am not sure if this is possible.
> 
> It is true that resv_map could be NULL.  However, I believe resv map
> can only be NULL for inodes that are not regular or link inodes.  This
> is the inode creation code in hugetlbfs_get_inode().
> 
>/*
>  * Reserve maps are only needed for inodes that can have associated
>  * page allocations.
>  */
> if (S_ISREG(mode) || S_ISLNK(mode)) {
> resv_map = resv_map_alloc();
> if (!resv_map)
> return NULL;
> }
> 

Agree.

> If resv_map is NULL, then no hugetlb pages can be allocated/associated
> with the file.  As a result, remove_inode_hugepages will never find any
> huge pages associated with the inode and the passed value 'freed' will
> always be zero.
> 

But I am confused now. AFAICS, remove_inode_hugepages() searches the 
address_space of
the inode to remove the hugepages while does not care if inode has associated 
resv_map.
How does it prevent hugetlb pages from being allocated/associated with the file 
if
resv_map is NULL? Could you please explain this more?

Many thanks.

> Does that sound correct?
>


Re: [PATCH 3/4] mm/hugeltb: fix potential wrong gbl_reserve value for hugetlb_acct_memory()

2021-04-06 Thread Mike Kravetz
On 4/2/21 2:32 AM, Miaohe Lin wrote:
> The resv_map could be NULL since this routine can be called in the evict
> inode path for all hugetlbfs inodes. So we could have chg = 0 and this
> would result in a negative value when chg - freed. This is unexpected for
> hugepage_subpool_put_pages() and hugetlb_acct_memory().

I am not sure if this is possible.

It is true that resv_map could be NULL.  However, I believe resv map
can only be NULL for inodes that are not regular or link inodes.  This
is the inode creation code in hugetlbfs_get_inode().

   /*
 * Reserve maps are only needed for inodes that can have associated
 * page allocations.
 */
if (S_ISREG(mode) || S_ISLNK(mode)) {
resv_map = resv_map_alloc();
if (!resv_map)
return NULL;
}

If resv_map is NULL, then no hugetlb pages can be allocated/associated
with the file.  As a result, remove_inode_hugepages will never find any
huge pages associated with the inode and the passed value 'freed' will
always be zero.

Does that sound correct?

-- 
Mike Kravetz

> 
> Fixes: b5cec28d36f5 ("hugetlbfs: truncate_hugepages() takes a range of pages")
> Signed-off-by: Miaohe Lin 
> ---
>  mm/hugetlb.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index b7864abded3d..bdff8d23803f 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -5413,6 +5413,7 @@ long hugetlb_unreserve_pages(struct inode *inode, long 
> start, long end,
>   long chg = 0;
>   struct hugepage_subpool *spool = subpool_inode(inode);
>   long gbl_reserve;
> + long delta;
>  
>   /*
>* Since this routine can be called in the evict inode path for all
> @@ -5437,7 +5438,8 @@ long hugetlb_unreserve_pages(struct inode *inode, long 
> start, long end,
>* If the subpool has a minimum size, the number of global
>* reservations to be released may be adjusted.
>*/
> - gbl_reserve = hugepage_subpool_put_pages(spool, (chg - freed));
> + delta = chg > 0 ? chg - freed : freed;
> + gbl_reserve = hugepage_subpool_put_pages(spool, delta);
>   hugetlb_acct_memory(h, -gbl_reserve);
>  
>   return 0;
> 


[PATCH 3/4] mm/hugeltb: fix potential wrong gbl_reserve value for hugetlb_acct_memory()

2021-04-02 Thread Miaohe Lin
The resv_map could be NULL since this routine can be called in the evict
inode path for all hugetlbfs inodes. So we could have chg = 0 and this
would result in a negative value when chg - freed. This is unexpected for
hugepage_subpool_put_pages() and hugetlb_acct_memory().

Fixes: b5cec28d36f5 ("hugetlbfs: truncate_hugepages() takes a range of pages")
Signed-off-by: Miaohe Lin 
---
 mm/hugetlb.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index b7864abded3d..bdff8d23803f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5413,6 +5413,7 @@ long hugetlb_unreserve_pages(struct inode *inode, long 
start, long end,
long chg = 0;
struct hugepage_subpool *spool = subpool_inode(inode);
long gbl_reserve;
+   long delta;
 
/*
 * Since this routine can be called in the evict inode path for all
@@ -5437,7 +5438,8 @@ long hugetlb_unreserve_pages(struct inode *inode, long 
start, long end,
 * If the subpool has a minimum size, the number of global
 * reservations to be released may be adjusted.
 */
-   gbl_reserve = hugepage_subpool_put_pages(spool, (chg - freed));
+   delta = chg > 0 ? chg - freed : freed;
+   gbl_reserve = hugepage_subpool_put_pages(spool, delta);
hugetlb_acct_memory(h, -gbl_reserve);
 
return 0;
-- 
2.19.1