On Tue, Aug 12, 2025 at 11:55:10AM -0400, Zi Yan wrote:
[...]
>+/*
>+ * gather_folio_orders - scan through [vaddr_start, len) and record folio orders
>+ * @vaddr_start: start vaddr
>+ * @len: range length
>+ * @pagemap_fd: file descriptor to /proc/<pid>/pagemap
>+ * @kpageflags_fd: file descriptor to /proc/kpageflags
>+ * @orders: output folio order array
>+ * @nr_orders: folio order array size
>+ *
>+ * gather_folio_orders() scan through [vaddr_start, len) and check all folios
>+ * within the range and record their orders. All order-0 pages will be recorded.

I'm a little confused by the description here, especially about the
behavior when the range is not aligned to a folio boundary.

See my comments at 1) and 2) below.

>+ * Non-present vaddr is skipped.
>+ *
>+ *
>+ * Return: 0 - no error, -1 - unhandled cases
>+ */
>+static int gather_folio_orders(char *vaddr_start, size_t len,
>+                             int pagemap_fd, int kpageflags_fd,
>+                             int orders[], int nr_orders)
>+{
>+      uint64_t page_flags = 0;
>+      int cur_order = -1;
>+      char *vaddr;
>+
>+      if (!pagemap_fd || !kpageflags_fd)
>+              return -1;

If my understanding is correct, these file descriptors come from open().

On error open() returns -1, and 0 is a valid file descriptor, although it is
usually taken by stdin. The check works in most cases, but it does not look
right.
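
Something along these lines might be closer to what we want. This is only a
minimal sketch: the helper name fds_are_valid() is made up for illustration
and is not part of the patch.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not part of the patch: open() returns -1 on
 * failure and a non-negative descriptor (possibly 0) on success, so
 * only negative values are treated as invalid.
 */
static bool fds_are_valid(int pagemap_fd, int kpageflags_fd)
{
	return pagemap_fd >= 0 && kpageflags_fd >= 0;
}
```

With a check like this, a descriptor that happens to be 0 is still accepted.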

>+      if (nr_orders <= 0)
>+              return -1;
>+

Maybe we want to check orders[] here too?

>+      for (vaddr = vaddr_start; vaddr < vaddr_start + len;) {
>+              char *next_folio_vaddr;
>+              int status;
>+
>+              status = get_page_flags(vaddr, pagemap_fd, kpageflags_fd,
>+                                      &page_flags);
>+              if (status < 0)
>+                      return -1;
>+
>+              /* skip non present vaddr */
>+              if (status == 1) {
>+                      vaddr += psize();
>+                      continue;
>+              }
>+
>+              /* all order-0 pages with possible false postive (non folio) */

Do we still have a false positive case here? A non-present page returns 1,
which is already handled above.

>+              if (!(page_flags & (KPF_COMPOUND_HEAD | KPF_COMPOUND_TAIL))) {
>+                      orders[0]++;
>+                      vaddr += psize();
>+                      continue;
>+              }
>+
>+              /* skip non thp compound pages */
>+              if (!(page_flags & KPF_THP)) {
>+                      vaddr += psize();
>+                      continue;
>+              }
>+
>+              /* vpn points to part of a THP at this point */
>+              if (page_flags & KPF_COMPOUND_HEAD)
>+                      cur_order = 1;
>+              else {
>+                      /* not a head nor a tail in a THP? */
>+                      if (!(page_flags & KPF_COMPOUND_TAIL))
>+                              return -1;

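When we reach here, we know (page_flags & (KPF_COMPOUND_HEAD |
KPF_COMPOUND_TAIL)) is non-zero, so at least one of the two flags is set.
Since KPF_COMPOUND_HEAD is not set in this branch, KPF_COMPOUND_TAIL must be.

It looks impossible to hit this return?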
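
To illustrate the reasoning, here is a minimal sketch of the same sequence of
checks written as a classifier. The enum and classify_page() are hypothetical
names used only for this example. The KPF_* values are defined locally from
the bit numbers in include/uapi/linux/kernel-page-flags.h.

```c
#include <stdint.h>

/* Bit numbers as in include/uapi/linux/kernel-page-flags.h. */
#define KPF_COMPOUND_HEAD	(UINT64_C(1) << 15)
#define KPF_COMPOUND_TAIL	(UINT64_C(1) << 16)
#define KPF_THP			(UINT64_C(1) << 22)

/* Hypothetical classification, for illustration only. */
enum page_kind {
	PAGE_ORDER0,		/* neither head nor tail */
	PAGE_NON_THP_COMPOUND,	/* compound page, but not a THP */
	PAGE_THP_HEAD,
	PAGE_THP_TAIL,
};

static enum page_kind classify_page(uint64_t page_flags)
{
	if (!(page_flags & (KPF_COMPOUND_HEAD | KPF_COMPOUND_TAIL)))
		return PAGE_ORDER0;
	if (!(page_flags & KPF_THP))
		return PAGE_NON_THP_COMPOUND;
	if (page_flags & KPF_COMPOUND_HEAD)
		return PAGE_THP_HEAD;
	/* At least one of HEAD/TAIL is set and HEAD is not, so TAIL is. */
	return PAGE_THP_TAIL;
}
```

After the first two checks only the head and tail cases remain, so there is no
path left for a "neither head nor tail" error.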

>+
>+                      vaddr += psize();
>+                      continue;

1)

If vaddr points to the middle of a large folio, this skips that folio and
starts counting from the next one.

>+              }
>+
>+              next_folio_vaddr = vaddr + (1UL << (cur_order + pshift()));
>+
>+              if (next_folio_vaddr >= vaddr_start + len)
>+                      break;
>+
>+              while ((status = get_page_flags(next_folio_vaddr, pagemap_fd,
>+                                               kpageflags_fd,
>+                                               &page_flags)) >= 0) {
>+                      /*
>+                       * non present vaddr, next compound head page, or
>+                       * order-0 page
>+                       */
>+                      if (status == 1 ||
>+                          (page_flags & KPF_COMPOUND_HEAD) ||
>+                          !(page_flags & (KPF_COMPOUND_HEAD | KPF_COMPOUND_TAIL))) {
>+                              if (cur_order < nr_orders) {
>+                                      orders[cur_order]++;
>+                                      cur_order = -1;
>+                                      vaddr = next_folio_vaddr;
>+                              }
>+                              break;
>+                      }
>+
>+                      /* not a head nor a tail in a THP? */
>+                      if (!(page_flags & KPF_COMPOUND_TAIL))
>+                              return -1;
>+
>+                      cur_order++;
>+                      next_folio_vaddr = vaddr + (1UL << (cur_order + pshift()));

2)

If (vaddr_start + len) points to the middle of a large folio and that folio is
larger than order 1, we may continue the loop and still count this last folio,
because next_folio_vaddr is never checked against (vaddr_start + len) here.

A simple chart of these cases:

          vaddr_start                  vaddr_start + len
               |                               |
               v                               v
     +---------------------+              +-----------------+
     |folio 1              |              |folio 2          |
     +---------------------+              +-----------------+

folio 1 is not counted, but folio 2 is counted.

So 1) and 2) handle the range boundary differently. I am not sure whether
this is the intended behavior. If it is, I think it would be better to document
it; otherwise the behavior is not obvious to the user.
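
If we wanted both ends to behave the same, one possible rule (just an option,
not necessarily what the test needs) is to count a folio only when it lies
entirely inside the range. A minimal sketch, where folio_within_range() and
its parameters are hypothetical and not part of the patch:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: return true only if
 * [folio_start, folio_start + folio_size) is fully contained in
 * [range_start, range_start + range_len).
 */
static bool folio_within_range(uintptr_t folio_start, size_t folio_size,
			       uintptr_t range_start, size_t range_len)
{
	uintptr_t off;

	/* Folio begins before the range: the case at 1). */
	if (folio_start < range_start)
		return false;

	off = folio_start - range_start;
	if (off > range_len)
		return false;

	/* Folio must not extend past the end: the case at 2). */
	return folio_size <= range_len - off;
}
```

With such a rule, neither folio 1 nor folio 2 in the chart above would be
counted.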

>+              }
>+
>+              if (status < 0)
>+                      return status;
>+      }
>+      if (cur_order > 0 && cur_order < nr_orders)
>+              orders[cur_order]++;

Another boundary case here.

If we get here because (next_folio_vaddr >= vaddr_start + len) in the for
loop rather than from the while loop, it means we found a folio head at vaddr
but the remaining range (vaddr_start + len - vaddr) is no larger than an
order-1 folio.

However, we haven't detected the real end of this folio. If the folio is
larger than order 1, we still count it as an order-1 folio.

>+      return 0;
>+}
>+
>+int check_folio_orders(char *vaddr_start, size_t len, int pagemap_fd,
>+                      int kpageflags_fd, int orders[], int nr_orders)
>+{
>+      int *vaddr_orders;
>+      int status;
>+      int i;
>+
>+      vaddr_orders = (int *)malloc(sizeof(int) * nr_orders);
>+
>+      if (!vaddr_orders)
>+              ksft_exit_fail_msg("Cannot allocate memory for vaddr_orders");
>+
>+      memset(vaddr_orders, 0, sizeof(int) * nr_orders);
>+      status = gather_folio_orders(vaddr_start, len, pagemap_fd,
>+                                   kpageflags_fd, vaddr_orders, nr_orders);
>+      if (status)
>+              goto out;
>+
>+      status = 0;
>+      for (i = 0; i < nr_orders; i++)
>+              if (vaddr_orders[i] != orders[i]) {
>+                      ksft_print_msg("order %d: expected: %d got %d\n", i,
>+                                     orders[i], vaddr_orders[i]);
>+                      status = -1;
>+              }
>+
>+out:
>+      free(vaddr_orders);
>+      return status;
>+}

-- 
Wei Yang
Help you, Help me
