On 07/08/2019 15:56, Matthew Wilcox wrote:
> On Wed, Aug 07, 2019 at 03:30:38PM +0100, Steven Price wrote:
>> On 07/08/2019 15:15, Matthew Wilcox wrote:
>>> On Tue, Aug 06, 2019 at 11:40:00PM -0700, Christoph Hellwig wrote:
>>>> On Tue, Aug 06, 2019 at 12:09:38PM -0700, Matthew Wilcox wrote:
>>>>> Has anyone looked at turning the interface inside-out?  ie something like:
>>>>>
>>>>>   struct mm_walk_state state = { .mm = mm, .start = start, .end = end, };
>>>>>
>>>>>   for_each_page_range(&state, page) {
>>>>>           ... do something with page ...
>>>>>   }
>>>>>
>>>>> with appropriate macrology along the lines of:
>>>>>
>>>>> #define for_each_page_range(state, page)                          \
>>>>>   while ((page = page_range_walk_next(state)))
>>>>>
>>>>> Then you don't need to package anything up into structs that are shared
>>>>> between the caller and the iterated function.
>>>>
>>>> I'm not an all that huge fan of super magic macro loops.  But in this
>>>> case I don't see how it could even work, as we get special callbacks
>>>> for huge pages and holes, and people are trying to add a few more ops
>>>> as well.
>>>
>>> We could have bits in the mm_walk_state which indicate what things to return
>>> and what things to skip.  We could (and probably should) also use different
>>> iterator names if people actually want to iterate different things.  eg
>>> for_each_pte_range(&state, pte) as well as for_each_page_range().
>>>
>>
>> The iterator approach could be awkward for the likes of my generic
>> ptdump implementation[1]. It would require an iterator which returns all
>> levels and allows skipping levels when required (to prevent KASAN
>> slowing things down too much). So something like:
>>
>> start_walk_range(&state);
>> for_each_page_range(&state, page) {
>>      switch(page->level) {
>>      case PTE:
>>              ...
>>      case PMD:
>>              if (...)
>>                      skip_pmd(&state);
>>              ...
>>      case HOLE:
>>              ....
>>      ...
>>      }
>> }
>> end_walk_range(&state);
>>
>> It seems a little fragile - e.g. we wouldn't (easily) get type checking
>> that you are actually treating a PTE as a pte_t. The state mutators like
>> skip_pmd() also seem a bit clumsy.
> 
> Once you're on-board with using a state structure, you can use it in all
> kinds of fun ways.  For example:
> 
> struct mm_walk_state {
>       struct mm_struct *mm;
>       unsigned long start;
>       unsigned long end;
>       unsigned long curr;
>       p4d_t p4d;
>       pud_t pud;
>       pmd_t pmd;
>       pte_t pte;
>       enum page_entry_size size;
>       int flags;
> };
> 
> For this user, I'd expect something like ...
> 
>       DECLARE_MM_WALK_FLAGS(state, mm, start, end,
>                               MM_WALK_HOLES | MM_WALK_ALL_SIZES);
> 
>       walk_each_pte(state) {
>               switch (state->size) {
>               case PE_SIZE_PTE:
>                       ... 
>               case PE_SIZE_PMD:
>                       if (...(state->pmd))
>                               continue;

You need to be able to signal whether you want to descend into the PMD
or skip the entire part of the tree. This was my skip_pmd() function above.
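
As a rough illustration of that skip-versus-descend signal, here is a userspace toy model, not kernel code. Every toy_* name is hypothetical, and the "page table" is just a counter with a fixed number of PTEs under each PMD. The iterator returns the PMD first; the caller either does nothing (so the next step descends) or calls a mutator to step over the subtree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: each "PMD" slot covers PTES_PER_PMD "PTE" slots. */
#define PTES_PER_PMD 4

enum toy_level { TOY_PMD, TOY_PTE };

struct toy_walk_state {
	size_t nr_pmds;		/* number of PMD slots to walk */
	size_t pmd_idx;		/* PMD slot currently being visited */
	size_t pte_idx;		/* PTE slot within the current PMD */
	enum toy_level level;	/* level of the entry last returned */
	bool started;
	bool skip_subtree;	/* set by toy_skip_pmd() */
};

/* Ask the iterator not to descend below the PMD it just returned. */
static void toy_skip_pmd(struct toy_walk_state *st)
{
	assert(st->level == TOY_PMD);
	st->skip_subtree = true;
}

/* Advance to the next entry; returns false once the walk is finished. */
static bool toy_walk_next(struct toy_walk_state *st)
{
	if (!st->started) {
		st->started = true;
		st->pmd_idx = 0;
	} else if (st->level == TOY_PMD && !st->skip_subtree) {
		/* Default behaviour: descend into the PMD just returned. */
		st->level = TOY_PTE;
		st->pte_idx = 0;
		return true;
	} else if (st->level == TOY_PTE && st->pte_idx + 1 < PTES_PER_PMD) {
		st->pte_idx++;
		return true;
	} else {
		/* Subtree skipped or exhausted: move to the next PMD. */
		st->pmd_idx++;
	}

	st->skip_subtree = false;
	if (st->pmd_idx >= st->nr_pmds)
		return false;
	st->level = TOY_PMD;
	return true;
}

/* Count the PTEs visited when every odd-numbered PMD is skipped. */
static size_t toy_count_ptes(size_t nr_pmds)
{
	struct toy_walk_state st = { .nr_pmds = nr_pmds };
	size_t ptes = 0;

	while (toy_walk_next(&st)) {
		switch (st.level) {
		case TOY_PMD:
			if (st.pmd_idx % 2)
				toy_skip_pmd(&st);
			break;
		case TOY_PTE:
			ptes++;
			break;
		}
	}
	return ptes;
}
```

Note that a plain continue in the loop body cannot express the skip on its own here: the iterator only learns about the caller's decision through the state structure.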

>               ...
>               }
>       }
> 
> There's no need to have start / end walk function calls.
> 

You've still got a start walk function: it's your DECLARE_MM_WALK_FLAGS
above. I agree that the end walk probably isn't needed, since
struct mm_walk_state contains all the state.
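
To make that concrete, here is a small userspace sketch with hypothetical toy_* names, where the walk only steps over fixed-size pages in [start, end). It is not the proposed kernel API. The declaring macro does all the setup that a start call would do, and because the state lives on the caller's stack there is nothing left for an end call to release.

```c
#include <stdbool.h>

#define TOY_PAGE_SIZE	4096UL
#define TOY_WALK_HOLES	0x1	/* hypothetical flag, unused by this toy */

struct toy_range_state {
	unsigned long start;
	unsigned long end;
	unsigned long curr;	/* address of the page last returned */
	int flags;
	bool started;
};

/* Declaring and initialising the state is the whole "start walk" step. */
#define TOY_DECLARE_WALK(name, _start, _end, _flags)		\
	struct toy_range_state name = {				\
		.start = (_start), .end = (_end),		\
		.curr = (_start), .flags = (_flags),		\
	}

/* Step to the next page; returns false once curr reaches end. */
static bool toy_range_next(struct toy_range_state *st)
{
	if (st->started)
		st->curr += TOY_PAGE_SIZE;	/* assumes no address wrap-around */
	st->started = true;
	return st->curr < st->end;
}

#define toy_walk_each_page(st)	while (toy_range_next(st))

/* Count the pages in [start, end) using only the on-stack state. */
static unsigned long toy_count_pages(unsigned long start, unsigned long end)
{
	TOY_DECLARE_WALK(state, start, end, TOY_WALK_HOLES);
	unsigned long n = 0;

	toy_walk_each_page(&state)
		n++;

	return n;	/* state just goes out of scope: no end call */
}
```

If the iterator ever had to hold a lock or a mapping across iterations, an explicit end step (or a way to handle early break) would be needed again; this sketch assumes it does not.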

Steve
