On Fri, Jan 16, 2026 at 10:29 PM Tomas Vondra <[email protected]> wrote:
>
> Hi,
>
> Here's WIP fix for the root cause, i.e. handling status -2 in the two
> views querying NUMA node for memory pages:
>
> * pg_shmem_allocations_numa
> * pg_buffercache_numa
>
> We can't prevent -2 from happening - the kernel can move arbitrary pages
> to swap, we have no control over it. So I think we need to handle -2 as
> "unknown" node, instead of failing. The patch simply returns NULL
> instead of the node, but in principle we might return some other value
> (but IMHO we should not return the raw status, the -2 makes no sense in
> our context, it's some internal kernel errno).
>
> The pg_buffercache_numa was not failing, it just returned the -2 status
> verbatim. But I modified it to return NULL, for consistency.
>
> AFAIK this will fix the regression tests too - they only check COUNT(*),
> not the actual values.
>
> I'm not sure if we need to mention this in the docs. It probably should
> mention the column can be NULL, which means "unknown node".

Right, OK. I've reproduced this without the patch. As you said, it is enough
to force shared_buffers to be swapped out; in my case a simple
stress-ng -m 16 --vm-bytes SOME_HIGH_VALUE did it.

The ERROR shows up pretty fast. At first the query still works:
    select numa_node, sum(size) from
    pg_shmem_allocations_numa group by numa_node;
    numa_node |     sum
    -----------+-------------
            0 | 24062603264
    (1 row)
and then, pretty soon after:
    ERROR:  invalid NUMA node id outside of allowed range [0, 0]: -2

With the patch (which, by the way, looks good to me) the error no longer
occurs; instead I get:

 numa_node |     sum
-----------+-------------
           | 10821046272
         0 | 13241556992
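
To spell out the behaviour I'm seeing, here is a rough sketch of the mapping
as I understand it from your description. It is Python for brevity, not the
actual C code from the patch. The function name and the max_node parameter
are made up for illustration, and treating every other negative status as an
error is my assumption, not something I checked in the patch. The only
outside fact used is that -2 is -ENOENT, which move_pages(2) documents as
"the page is not present".

```python
import errno


def numa_node_from_status(status, max_node):
    """Map one per-page status value to a NUMA node id, or None if unknown.

    Hypothetical helper for illustration only, not the patch's code.
    """
    if status >= 0:
        # Non-negative status is a node id; it must be within the known range.
        if status > max_node:
            raise ValueError(
                "invalid NUMA node id outside of allowed range "
                "[0, %d]: %d" % (max_node, status)
            )
        return status
    if status == -errno.ENOENT:
        # Page is not present (e.g. swapped out): report "unknown node",
        # which the views would surface as SQL NULL.
        return None
    # Assumption: any other negative status is still treated as an error.
    raise ValueError(
        "invalid NUMA node id outside of allowed range "
        "[0, %d]: %d" % (max_node, status)
    )
```

With max_node = 0, as on my single-node box, a status of 0 maps to node 0 and
a status of -2 maps to None, which matches the NULL row in the output above.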

-J.

