On 28/05/2025 13:40, David Hildenbrand wrote:
> On 28.05.25 12:53, Ryan Roberts wrote:
>> On 28/05/2025 11:48, David Hildenbrand wrote:
>>> On 28.05.25 12:44, David Hildenbrand wrote:
>>>> On 28.05.25 12:34, Ryan Roberts wrote:
>>>>> Hi David,
>>>>>
>>>>>
>>>>> On 09/05/2025 16:30, David Hildenbrand wrote:
>>>>>> Let's test some basic functionality using /dev/mem. These tests will
>>>>>> implicitly cover some PAT (Page Attribute Handling) handling on x86.
>>>>>>
>>>>>> These tests will only run when /dev/mem access to the first two pages
>>>>>> in physical address space is possible and allowed; otherwise, the tests
>>>>>> are skipped.
>>>>>
>>>>> We are seeing really horrible RAS errors with this test when run on arm64
>>>>> tx2 machine. Based solely on reviewing the code, I think the problem is
>>>>> that tx2 doesn't have anything at phys address 0, so test_read_access()
>>>>> is trying to put transactions out to a bad address on the bus.
>>>>>
>>>>> tx2 /proc/iomem:
>>>>>
>>>>> $ sudo cat /proc/iomem
>>>>> 30000000-37ffffff : PCI ECAM
>>>>> 38000000-3fffffff : PCI ECAM
>>>>> 40000000-5fffffff : PCI Bus 0000:00
>>>>> ...
>>>>>
>>>>> Whereas my x86 box has some reserved memory:
>>>>>
>>>>> $ sudo cat /proc/iomem
>>>>> 00000000-00000fff : Reserved
>>>>> 00001000-0003dfff : System RAM
>>>>> ...
>>>>>
>>>>
>>>> A quick fix would be to make this test specific to x86 (the only one I
>>>> tested on). We should always have the lower two pages IIRC (BIOS stuff
>>>> etc).
>>
>> I'm not sure how far along this patch is? I'm guessing mm-stable? Perhaps you
>> can do the quick fix, then I'd be happy to make this more robust for arm64
>> later?
>
> Can you give the following a quick test on that machine? Then, I can send it
> as a proper patch later.

The machine in question is part of our CI infra, so it's not easy for me to run
an ad-hoc test. I've asked Aishwarya if it's possible to queue up a CI job with
the patch, but I think that will involve running the whole test run, so it will
probably take a couple of days to turn around.

FWIW, the change looks good to me:

Reviewed-by: Ryan Roberts <ryan.robe...@arm.com>

>
>
> From 40fea063f2fcf1474fb47cb9aebdb04fd825032b Mon Sep 17 00:00:00 2001
> From: David Hildenbrand <da...@redhat.com>
> Date: Wed, 28 May 2025 14:35:23 +0200
> Subject: [PATCH] selftests/mm: two fixes for the pfnmap test
>
> When unregistering the signal handler, we have to pass SIG_DFL, and
> blindly reading from PFN 0 and PFN 1 seems to be problematic on !x86
> systems. In particular, on arm64 tx2 machines where nothing resides
> at these physical memory locations, we can generate RAS errors.
>
> Let's fix it by scanning /proc/iomem for actual "System RAM".
>
> Reported-by: Ryan Roberts <ryan.robe...@arm.com>
> Closes: https://lore.kernel.org/all/232960c2-81db-47ca-a337-38c4bce5f...@arm.com/T/#u
> Fixes: 2616b370323a ("selftests/mm: add simple VM_PFNMAP tests based on mmap'ing /dev/mem")
> Signed-off-by: David Hildenbrand <da...@redhat.com>
> ---
>  tools/testing/selftests/mm/pfnmap.c | 61 +++++++++++++++++++++++++++--
>  1 file changed, 57 insertions(+), 4 deletions(-)
>
> diff --git a/tools/testing/selftests/mm/pfnmap.c b/tools/testing/selftests/mm/pfnmap.c
> index 8a9d19b6020c7..4943927a7d1ea 100644
> --- a/tools/testing/selftests/mm/pfnmap.c
> +++ b/tools/testing/selftests/mm/pfnmap.c
> @@ -12,6 +12,8 @@
>  #include <stdint.h>
>  #include <unistd.h>
>  #include <errno.h>
> +#include <stdio.h>
> +#include <ctype.h>
>  #include <fcntl.h>
>  #include <signal.h>
>  #include <setjmp.h>
> @@ -43,14 +45,62 @@ static int test_read_access(char *addr, size_t size, size_t pagesize)
>  		/* Force a read that the compiler cannot optimize out. */
>  		*((volatile char *)(addr + offs));
>  	}
> -	if (signal(SIGSEGV, signal_handler) == SIG_ERR)
> +	if (signal(SIGSEGV, SIG_DFL) == SIG_ERR)
>  		return -EINVAL;
>
>  	return ret;
>  }
>
> +static int find_ram_target(off_t *phys_addr,
> +		unsigned long pagesize)
> +{
> +	unsigned long long start, end;
> +	char line[80], *end_ptr;
> +	FILE *file;
> +
> +	/* Search /proc/iomem for the first suitable "System RAM" range. */
> +	file = fopen("/proc/iomem", "r");
> +	if (!file)
> +		return -errno;
> +
> +	while (fgets(line, sizeof(line), file)) {
> +		/* Ignore any child nodes. */
> +		if (!isalnum(line[0]))
> +			continue;
> +
> +		if (!strstr(line, "System RAM\n"))
> +			continue;
> +
> +		start = strtoull(line, &end_ptr, 16);
> +		/* Skip over the "-" */
> +		end_ptr++;
> +		/* Make end "exclusive". */
> +		end = strtoull(end_ptr, NULL, 16) + 1;
> +
> +		/* Actual addresses are not exported */
> +		if (!start && !end)
> +			break;
> +
> +		/* We need full pages. */
> +		start = (start + pagesize - 1) & ~(pagesize - 1);
> +		end &= ~(pagesize - 1);
> +
> +		if (start != (off_t)start)
> +			break;
> +
> +		/* We need two pages. */
> +		if (end > start + 2 * pagesize) {
> +			fclose(file);
> +			*phys_addr = start;
> +			return 0;
> +		}
> +	}
> +	return -ENOENT;
> +}
> +
>  FIXTURE(pfnmap)
>  {
> +	off_t phys_addr;
>  	size_t pagesize;
>  	int dev_mem_fd;
>  	char *addr1;
> @@ -63,14 +113,17 @@ FIXTURE_SETUP(pfnmap)
>  {
>  	self->pagesize = getpagesize();
>
> +	/* We'll require two physical pages throughout our tests ... */
> +	if (find_ram_target(&self->phys_addr, self->pagesize))
> +		SKIP(return, "Cannot find ram target in '/dev/iomem'\n");
> +
>  	self->dev_mem_fd = open("/dev/mem", O_RDONLY);
>  	if (self->dev_mem_fd < 0)
>  		SKIP(return, "Cannot open '/dev/mem'\n");
>
> -	/* We'll require the first two pages throughout our tests ... */
>  	self->size1 = self->pagesize * 2;
>  	self->addr1 = mmap(NULL, self->size1, PROT_READ, MAP_SHARED,
> -			   self->dev_mem_fd, 0);
> +			   self->dev_mem_fd, self->phys_addr);
>  	if (self->addr1 == MAP_FAILED)
>  		SKIP(return, "Cannot mmap '/dev/mem'\n");
>
> @@ -129,7 +182,7 @@ TEST_F(pfnmap, munmap_split)
>  	 */
>  	self->size2 = self->pagesize;
>  	self->addr2 = mmap(NULL, self->pagesize, PROT_READ, MAP_SHARED,
> -			   self->dev_mem_fd, 0);
> +			   self->dev_mem_fd, self->phys_addr);
>  	ASSERT_NE(self->addr2, MAP_FAILED);
>  }
>
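
As a rough illustration of the range selection in find_ram_target(), the
per-line logic could be pulled into a standalone helper and exercised against
the /proc/iomem excerpts quoted earlier in the thread, without needing the tx2
machine. This is only a sketch under assumptions: parse_iomem_line() is a
hypothetical name that does not exist in the patch, the '-' separator check is
an extra, and the zeroed-address check is done before the end is made
exclusive, which differs slightly from the ordering in the patch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: examine one /proc/iomem line.
 * Returns 0 and stores the page-aligned start in *start_out if the line is a
 * top-level "System RAM" range that passes the patch's size check, else -1.
 */
static int parse_iomem_line(const char *line, unsigned long pagesize,
			    unsigned long long *start_out)
{
	unsigned long long start, last, end, mask;
	char *end_ptr;

	/* The alignment arithmetic below assumes a power-of-two page size. */
	assert(pagesize && !(pagesize & (pagesize - 1)));
	mask = ~((unsigned long long)pagesize - 1);

	/* Child nodes are indented, so they do not start with a hex digit. */
	if (!isalnum((unsigned char)line[0]))
		return -1;
	if (!strstr(line, "System RAM\n"))
		return -1;

	start = strtoull(line, &end_ptr, 16);
	if (*end_ptr != '-')
		return -1;
	last = strtoull(end_ptr + 1, NULL, 16);

	/* Zeroed addresses mean the real values were not exported. */
	if (!start && !last)
		return -1;

	/* Make the end exclusive, then shrink the range to full pages. */
	end = last + 1;
	start = (start + pagesize - 1) & mask;
	end &= mask;

	/* Strict comparison as in the patch: exactly two pages is rejected. */
	if (end > start + 2ULL * pagesize) {
		*start_out = start;
		return 0;
	}
	return -1;
}
```

Assuming 4 KiB pages, the helper rejects the "Reserved" line from the x86
excerpt and accepts 00001000-0003dfff with a start of 0x1000. None of the tx2
lines shown in the excerpt match.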