Re: [PATCH v5 2/6] mm/cma: introduce new zone, ZONE_CMA
On 08/29/2016 07:07 AM, js1...@gmail.com wrote:
From: Joonsoo Kim

Attached cover-letter:

This series tries to solve problems of the current CMA implementation.

CMA was introduced to provide physically contiguous pages at runtime without an exclusively reserved memory area. But the current implementation works like the previous reserved-memory approach, because freepages in the CMA region are used only if there is no movable freepage. In other words, freepages in the CMA region are only used as a fallback. When freepages in the CMA region are used as a fallback, kswapd is woken up easily since there is no unmovable or reclaimable freepage left either. Once kswapd starts to reclaim memory, fallback allocation to MIGRATE_CMA doesn't occur any more, since movable freepages are already refilled by kswapd, and most of the freepages in the CMA region are left free. This situation looks like the exclusively reserved memory case.

In my experiment, I found that if the system has 1024 MB of memory and 512 MB is reserved for CMA, kswapd is mostly woken up when roughly 512 MB of free memory is left. The detailed reason is that, to keep enough free memory for unmovable and reclaimable allocations, kswapd uses the equation below when calculating free memory, and the result easily goes under the watermark.

Free memory for unmovable and reclaimable = Free total - Free CMA pages

This is derived from the property that a CMA freepage can't be used for unmovable or reclaimable allocations. Anyway, in this case, kswapd is woken up when (FreeTotal - FreeCMA) is lower than the low watermark and tries to free memory until (FreeTotal - FreeCMA) is higher than the high watermark. As a result, FreeTotal consistently hovers around the 512 MB boundary, which means we can't utilize the full memory capacity.

To fix this problem, I submitted some patches [1] about 10 months ago, but found some more problems to be fixed before solving this problem.
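As an illustration of the equation above, here is a minimal sketch in plain C. It is not kernel code: the struct, the function names and the watermark value are assumptions made for this example only, and all counts are in pages.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the free-page counters; not a kernel structure. */
struct mem_state {
	unsigned long free_total;	/* all free pages, CMA included */
	unsigned long free_cma;		/* free pages inside the CMA region */
};

/* Free memory for unmovable and reclaimable = Free total - Free CMA pages */
static unsigned long usable_free(const struct mem_state *m)
{
	assert(m->free_cma <= m->free_total);
	return m->free_total - m->free_cma;
}

/* In this model, kswapd is woken once the usable count is below the low watermark. */
static bool kswapd_should_wake(const struct mem_state *m, unsigned long low_wmark)
{
	return usable_free(m) < low_wmark;
}
```

With 4 KiB pages, 512 MB of free CMA memory is 131072 pages. If free_total is 132072 and free_cma is 131072, usable_free() returns 1000, so with an assumed low watermark of 2000 pages the model wakes kswapd even though about 512 MB is still free in total.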
That approach requires many hooks in the allocator hotpath, so some developers don't like it. Instead, some of them suggested a different approach [2] to fix all the problems related to CMA, that is, introducing a new zone to deal with free CMA pages. I agree that it is the best way to go, so I implement it here.

Although the properties of ZONE_MOVABLE and ZONE_CMA are similar, I decided to add a new zone rather than piggyback on ZONE_MOVABLE since they have some differences. First, reserved CMA pages should not be offlined. If freepages for CMA were managed by ZONE_MOVABLE, we would need to keep the MIGRATE_CMA migratetype and insert many hooks into the memory hotplug code to distinguish hotpluggable memory from memory reserved for CMA in the same zone. That would make the already complicated memory hotplug code even more complicated. Second, cma_alloc() can be called more frequently than memory hotplug operations, and we may need to control the allocation rate of ZONE_CMA to optimize latency in the future. In that case, the separate-zone approach is easier to modify. Third, I'd like to see statistics for CMA separately. Sometimes we need to debug why cma_alloc() failed, and separate statistics would be more helpful in that situation.

Anyway, this patchset solves four problems related to the CMA implementation.

1) Utilization problem

As mentioned above, we can't utilize the full memory capacity due to the limitation of CMA freepages and the fallback policy. This patchset implements a new zone for CMA and uses it for GFP_HIGHUSER_MOVABLE requests. This type of allocation is used for page cache and anonymous pages, which account for most memory usage in the normal case, so we can utilize the full memory capacity. Below is the experiment result for this problem.

8 CPUs, 1024 MB, VIRTUAL MACHINE
make -j16

<Before>
CMA reserve:    0 MB    512 MB
Elapsed-time:   92.4    186.5
pswpin:         82      18647
pswpout:        160     69839

<After>
CMA reserve:    0 MB    512 MB
Elapsed-time:   93.1    93.4
pswpin:         84      46
pswpout:        183     92

FYI, there is another attempt [3] trying to solve this problem on lkml.
And, as far as I know, Qualcomm also has an out-of-tree solution for this problem.

2) Reclaim problem

Currently, there is no logic to distinguish CMA pages in the reclaim path. If reclaim is initiated for an unmovable or reclaimable allocation, reclaiming CMA pages doesn't help to satisfy the request and is just a waste. By managing CMA pages in the new zone, we can skip reclaiming ZONE_CMA completely when it is unnecessary.

3) Atomic allocation failure problem

Kswapd isn't started to reclaim pages when the allocation request is of movable type and there are enough free pages in the CMA region. After a bunch of consecutive movable allocation requests, free pages in the ordinary region (not the CMA region) would be exhausted without waking up kswapd. At that time, if an atomic unmovable allocation comes, it can't succeed since there is not enough page in
Re: [PATCH v5 2/6] mm/cma: introduce new zone, ZONE_CMA
On Tue, Aug 30, 2016 at 06:10:46PM +0530, Aneesh Kumar K.V wrote:
> "Aneesh Kumar K.V" writes:
>
> >> static inline void check_highest_zone(enum zone_type k)
> >> {
> >> -	if (k > policy_zone && k != ZONE_MOVABLE)
> >> +	if (k > policy_zone && k != ZONE_MOVABLE && !is_zone_cma_idx(k))
> >> 		policy_zone = k;
> >> }
> >>
> >
> > Should we apply policy to allocation from ZONE CMA ?. CMA reserve
> > happens early and may mostly come from one node. Do we want the
> > CMA allocation to fail if we use mbind(MPOL_BIND) with a node mask not
> > including that node on which CMA is reserved, considering CMA memory is
> > going to be used for special purpose.
>
> Looking at this again, I guess CMA alloc is not going to depend on
> memory policy, but this is for other movable allocation ?

This is for usual file cache or anonymous page allocation. IIUC, policy_zone is used to determine if mempolicy should be applied or not and setting policy_zone to ZONE_CMA makes mempolicy less useful.

Thanks.
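To make the effect of the quoted hunk concrete, here is a small userspace sketch. It is only a model: the reduced zone enum (with ZONE_CMA placed after ZONE_MOVABLE), the body of is_zone_cma_idx() and the scan_zones() helper are assumptions for illustration, not the kernel definitions.

```c
#include <stdbool.h>

/* Simplified zone list; the real enum depends on the kernel configuration. */
enum zone_type { ZONE_DMA, ZONE_NORMAL, ZONE_MOVABLE, ZONE_CMA, MAX_NR_ZONES };

static enum zone_type policy_zone = ZONE_DMA;

/* Stand-in for the helper used in the quoted diff (assumed behavior). */
static bool is_zone_cma_idx(enum zone_type k)
{
	return k == ZONE_CMA;
}

/* Same condition as the patched hunk: skip ZONE_MOVABLE and ZONE_CMA. */
static void check_highest_zone(enum zone_type k)
{
	if (k > policy_zone && k != ZONE_MOVABLE && !is_zone_cma_idx(k))
		policy_zone = k;
}

/* Hypothetical helper: feed every zone index through the check. */
static enum zone_type scan_zones(void)
{
	for (int k = 0; k < MAX_NR_ZONES; k++)
		check_highest_zone((enum zone_type)k);
	return policy_zone;
}
```

In this model scan_zones() leaves policy_zone at ZONE_NORMAL. Without the added !is_zone_cma_idx(k) test, the same loop would raise it to ZONE_CMA, which is the case the reply above wants to avoid.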
Re: [PATCH v5 2/6] mm/cma: introduce new zone, ZONE_CMA
"Aneesh Kumar K.V" writes:

>> static inline void check_highest_zone(enum zone_type k)
>> {
>> -	if (k > policy_zone && k != ZONE_MOVABLE)
>> +	if (k > policy_zone && k != ZONE_MOVABLE && !is_zone_cma_idx(k))
>> 		policy_zone = k;
>> }
>>
>
> Should we apply policy to allocation from ZONE CMA ?. CMA reserve
> happens early and may mostly come from one node. Do we want the
> CMA allocation to fail if we use mbind(MPOL_BIND) with a node mask not
> including that node on which CMA is reserved, considering CMA memory is
> going to be used for special purpose.

Looking at this again, I guess CMA alloc is not going to depend on memory policy, but this is for other movable allocation ?

-aneesh
Re: [PATCH v5 2/6] mm/cma: introduce new zone, ZONE_CMA
> static inline void check_highest_zone(enum zone_type k)
> {
> -	if (k > policy_zone && k != ZONE_MOVABLE)
> +	if (k > policy_zone && k != ZONE_MOVABLE && !is_zone_cma_idx(k))
> 		policy_zone = k;
> }
>

Should we apply policy to allocation from ZONE CMA ?. CMA reserve happens early and may mostly come from one node. Do we want the CMA allocation to fail if we use mbind(MPOL_BIND) with a node mask not including that node on which CMA is reserved, considering CMA memory is going to be used for special purpose.

-aneesh