Thank you so much, Dave.

On 10 July 2014 22:47, Dave Anderson <[email protected]> wrote:
>
>
> ----- Original Message -----
> > Hello Everyone,
> >
> > I am analysing a kernel crash dump (vmcore) captured from RHEL-5
> > kernel version (2.6.18-371.4.1.el5) and found that the value of
> > "NR_WRITEBACK" counter is negative (-126).
> >
> > $ rpm -q crash
> > crash-7.0.6-2.el6.x86_64
> >
> > crash> sys | grep -e RELEASE -e MACHINE -e MEMORY
> >      RELEASE: 2.6.18-371.4.1.el5
> >      MACHINE: x86_64 (3000 Mhz)
> >       MEMORY: 31.5 GB
> >
> > crash> kmem -z | grep -e ZONE -e NR_WRITEBACK
> > NODE: 0 ZONE: 0 ADDR: ffff810000032000 NAME: "DMA"
> >     NR_WRITEBACK: 0
> > NODE: 0 ZONE: 1 ADDR: ffff810000032b00 NAME: "DMA32"
> >     NR_WRITEBACK: 0
> > NODE: 0 ZONE: 2 ADDR: ffff810000033600 NAME: "Normal"
> >     NR_WRITEBACK: -126 <<<<
> > NODE: 0 ZONE: 3 ADDR: ffff810000034100 NAME: "HighMem"
> >
> > crash> kmem -V | grep -e NR_WRITEBACK
> >     NR_WRITEBACK: -126 <<<<
> >
> > crash> vm_stat
> > vm_stat = $1 =
> >  {{
> >     counter = 1106459
> >   }, {
> >     counter = 2940354
> >   }, {
> >     counter = 6341366
> >   }, {
> >     counter = 301750
> >   }, {
> >     counter = 245858
> >   }, {
> >     counter = 438
> >   }, {
> >     counter = -126 // NR_WRITEBACK <<<<
> >   }, {
> >     counter = 0
> >   }, {
> >     counter = 0
> >   }, {
> >     counter = 19687071384
> >   }, {
> >     counter = 0
> >   }, {
> >     counter = 0
> >   }, {
> >     counter = 29247123
> >   }, {
> >     counter = 19687071384
> >   }, {
> >     counter = 0
> >   }}
> >
> > As we're running a 64 bit kernel and the counters are signed long,
> > so this is very unlikely to be a counter overflow. I need pointers
> > and suggestions to determine the *cause* of negative counter from
> > vmcore.
>
> Since the vmcore is just a snapshot in time, I would think that it would
> be difficult/unlikely that you can determine *the cause* just from the
> system's state at the time of the dump.
>
> FWIW, I ran a quick test of ~200 sample vmcores, and find that negative
> values in the vm_stat[] array and per-zone statistics are not all that
> unusual, even on recent kernels. (about 5% of the vmcores had one or
> more negative items).
>
> And if the counter value is required for some kind of VM-related decision
> or whatever, it would use one of the following functions, which return 0 in
> the case of negative values:
>
>   static inline unsigned long global_page_state(enum zone_stat_item item)
>   {
>           long x = atomic_long_read(&vm_stat[item]);
>   #ifdef CONFIG_SMP
>           if (x < 0)
>                   x = 0;
>   #endif
>           return x;
>   }
>
>   static inline unsigned long zone_page_state(struct zone *zone,
>                                           enum zone_stat_item item)
>   {
>           long x = atomic_long_read(&zone->vm_stat[item]);
>   #ifdef CONFIG_SMP
>           if (x < 0)
>                   x = 0;
>   #endif
>           return x;
>   }
>
> Dave
>
>
> >
> > Additional Information:
> >
> > $ git show ce866b34ae1b7f1ce60234cf65855886ac7e7d30
> > [..]
> > diff --git a/drivers/base/node.c b/drivers/base/node.c
> > index 6fed520..a7b3dcb 100644
> > --- a/drivers/base/node.c
> > +++ b/drivers/base/node.c
> > @@ -49,9 +49,6 @@ static ssize_t node_read_meminfo(struct sys_device * dev, char * buf)
> >          get_page_state_node(&ps, nid);
> >          __get_zone_counts(&active, &inactive, &free, NODE_DATA(nid));
> >
> > -        /* Check for negative values in these approximate counters */
> > -        if ((long)ps.nr_writeback < 0)
> > -                ps.nr_writeback = 0;
> >
> >          n = sprintf(buf, "\n"
> >                         "Node %d MemTotal: %8lu kB\n"
> > [..]
> >
> > Thank you !
> >
> > --
> > BKS
> >
> > --
> > Crash-utility mailing list
> > [email protected]
> > https://www.redhat.com/mailman/listinfo/crash-utility
>
> --
> Crash-utility mailing list
> [email protected]
> https://www.redhat.com/mailman/listinfo/crash-utility
>

--
BKS

--
Crash-utility mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/crash-utility
