On Tue, Aug 12, 2025 at 11:55:10AM -0400, Zi Yan wrote:
[...]
>+/*
>+ * gather_folio_orders - scan through [vaddr_start, len) and record folio orders
>+ * @vaddr_start: start vaddr
>+ * @len: range length
>+ * @pagemap_fd: file descriptor to /proc/<pid>/pagemap
>+ * @kpageflags_fd: file descriptor to /proc/kpageflags
>+ * @orders: output folio order array
>+ * @nr_orders: folio order array size
>+ *
>+ * gather_folio_orders() scan through [vaddr_start, len) and check all folios
>+ * within the range and record their orders. All order-0 pages will be recorded.

I am a little confused by the description here, especially about the behavior
when the range is not aligned to a folio boundary. See the code at 1) and 2)
below.

>+ * Non-present vaddr is skipped.
>+ *
>+ *
>+ * Return: 0 - no error, -1 - unhandled cases
>+ */
>+static int gather_folio_orders(char *vaddr_start, size_t len,
>+			       int pagemap_fd, int kpageflags_fd,
>+			       int orders[], int nr_orders)
>+{
>+	uint64_t page_flags = 0;
>+	int cur_order = -1;
>+	char *vaddr;
>+
>+	if (!pagemap_fd || !kpageflags_fd)
>+		return -1;

If my understanding is correct, these file descriptors come from open(), which
returns -1 on error. 0 is a valid descriptor, although it usually refers to
stdin. The check works in most cases, but it does not look right.

>+	if (nr_orders <= 0)
>+		return -1;
>+

Maybe we want to check orders[] here too?

>+	for (vaddr = vaddr_start; vaddr < vaddr_start + len;) {
>+		char *next_folio_vaddr;
>+		int status;
>+
>+		status = get_page_flags(vaddr, pagemap_fd, kpageflags_fd,
>+					&page_flags);
>+		if (status < 0)
>+			return -1;
>+
>+		/* skip non present vaddr */
>+		if (status == 1) {
>+			vaddr += psize();
>+			continue;
>+		}
>+
>+		/* all order-0 pages with possible false postive (non folio) */

Do we still have a false positive case? A non-present page returns 1, which is
handled above.

>+		if (!(page_flags & (KPF_COMPOUND_HEAD | KPF_COMPOUND_TAIL))) {
>+			orders[0]++;
>+			vaddr += psize();
>+			continue;
>+		}
>+
>+		/* skip non thp compound pages */
>+		if (!(page_flags & KPF_THP)) {
>+			vaddr += psize();
>+			continue;
>+		}
>+
>+		/* vpn points to part of a THP at this point */
>+		if (page_flags & KPF_COMPOUND_HEAD)
>+			cur_order = 1;
>+		else {
>+			/* not a head nor a tail in a THP? */
>+			if (!(page_flags & KPF_COMPOUND_TAIL))
>+				return -1;

When we reach here, we already know that
(page_flags & (KPF_COMPOUND_HEAD | KPF_COMPOUND_TAIL)) is non-zero, so at least
one of the two flags is set. Since KPF_COMPOUND_HEAD is not set in this else
branch, KPF_COMPOUND_TAIL must be. It does not look possible to hit this
return?

>+
>+			vaddr += psize();
>+			continue;

1) If vaddr points to the middle of a large folio, this skips that folio and
counting starts from the next one.

>+		}
>+
>+		next_folio_vaddr = vaddr + (1UL << (cur_order + pshift()));
>+
>+		if (next_folio_vaddr >= vaddr_start + len)
>+			break;
>+
>+		while ((status = get_page_flags(next_folio_vaddr, pagemap_fd,
>+						 kpageflags_fd,
>+						 &page_flags)) >= 0) {
>+			/*
>+			 * non present vaddr, next compound head page, or
>+			 * order-0 page
>+			 */
>+			if (status == 1 ||
>+			    (page_flags & KPF_COMPOUND_HEAD) ||
>+			    !(page_flags & (KPF_COMPOUND_HEAD | KPF_COMPOUND_TAIL))) {
>+				if (cur_order < nr_orders) {
>+					orders[cur_order]++;
>+					cur_order = -1;
>+					vaddr = next_folio_vaddr;
>+				}
>+				break;
>+			}
>+
>+			/* not a head nor a tail in a THP? */
>+			if (!(page_flags & KPF_COMPOUND_TAIL))
>+				return -1;
>+
>+			cur_order++;
>+			next_folio_vaddr = vaddr + (1UL << (cur_order + pshift()));

2) If (vaddr_start + len) points to the middle of a large folio and the folio
is larger than order 1, we may continue the loop and still count this last
folio, because next_folio_vaddr is not checked against (vaddr_start + len).

A simple chart of these cases:

       vaddr_start                  + len
            |                          |
            v                          v
     +---------------------+      +-----------------+
     |folio 1              |      |folio 2          |
     +---------------------+      +-----------------+

folio 1 is not counted, but folio 2 is counted.

So 1) and 2) handle the boundary differently. I am not sure this is the
intended behavior. If it is, I think it would be better to document it;
otherwise the behavior is not obvious to the user.

>+		}
>+
>+		if (status < 0)
>+			return status;
>+	}
>+	if (cur_order > 0 && cur_order < nr_orders)
>+		orders[cur_order]++;

Another boundary case here. If we get here because (next_folio_vaddr >=
vaddr_start + len) in the for loop rather than from the while loop, it means
we found a folio head at vaddr, but the remaining range (vaddr_start + len -
vaddr) is less than or equal to the order-1 folio size. We have not yet
detected the real end of this folio.
If this folio is larger than order 1, we still count it as an order-1 folio.

>+	return 0;
>+}
>+
>+int check_folio_orders(char *vaddr_start, size_t len, int pagemap_fd,
>+		       int kpageflags_fd, int orders[], int nr_orders)
>+{
>+	int *vaddr_orders;
>+	int status;
>+	int i;
>+
>+	vaddr_orders = (int *)malloc(sizeof(int) * nr_orders);
>+
>+	if (!vaddr_orders)
>+		ksft_exit_fail_msg("Cannot allocate memory for vaddr_orders");
>+
>+	memset(vaddr_orders, 0, sizeof(int) * nr_orders);
>+	status = gather_folio_orders(vaddr_start, len, pagemap_fd,
>+				     kpageflags_fd, vaddr_orders, nr_orders);
>+	if (status)
>+		goto out;
>+
>+	status = 0;
>+	for (i = 0; i < nr_orders; i++)
>+		if (vaddr_orders[i] != orders[i]) {
>+			ksft_print_msg("order %d: expected: %d got %d\n", i,
>+				       orders[i], vaddr_orders[i]);
>+			status = -1;
>+		}
>+
>+out:
>+	free(vaddr_orders);
>+	return status;
>+}

-- 
Wei Yang
Help you, Help me
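
For reference, a minimal sketch of the argument checks discussed above for the
top of gather_folio_orders(). It is only an illustration, not the patch's code:
fd_is_valid() and check_gather_args() are hypothetical names, and the sketch
assumes both descriptors were obtained from open(), which returns -1 on failure
and may legitimately return 0.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of the patch: open() returns -1 on
 * failure and any non-negative value, including 0, is a valid
 * descriptor, so validity is tested with "fd >= 0" rather than "!fd".
 */
static bool fd_is_valid(int fd)
{
	return fd >= 0;
}

/*
 * Hypothetical sketch of the argument checks: reject invalid
 * descriptors, a missing output array, and a non-positive array size.
 */
static int check_gather_args(int pagemap_fd, int kpageflags_fd,
			     const int orders[], int nr_orders)
{
	if (!fd_is_valid(pagemap_fd) || !fd_is_valid(kpageflags_fd))
		return -1;
	if (orders == NULL || nr_orders <= 0)
		return -1;
	return 0;
}
```

With this form, a descriptor of 0 is accepted, while -1 from a failed open()
and a NULL orders[] are rejected.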