Hi, Has anybody faced the issue below before? I would appreciate a response.
Thanks,
K M

---------- Forwarded message ----------
From: sri <[email protected]>
Date: Tue, Jan 31, 2012 at 7:03 PM
Subject: chatter: in arp querier: cannot make packet! -- resulting in kernel oops
To: [email protected]

Hi Experts,

I am hitting an "oops, kernel could not allocate memory for skbuff" error when the Click router module generates an ARP query. As a result, the machine becomes unresponsive for some time and has to be rebooted to bring it back up.

It was working fine on the CentOS 5.3 kernel (2.6.18-128.el5); we recently upgraded the OS to CentOS 5.5. The machine has 4 GB of RAM, and I do not understand how all of the memory could have been consumed.

I observed two things in the core dumps created at the time of the crash:

1) failsafe_re_fo_ invoked oom-killer: gfp_mask=0xd0, order=0, oomkilladj=0
<THIS IS REPEATED FROM SOME OTHER PROCESSES>
 [<c044abd8>] out_of_memory+0x72/0x17a
 [<c044beab>] __alloc_pages+0x237/0x2b8
 [<c045f145>] cache_alloc_refill+0x217/0x3e4
 [<c045ef26>] kmem_cache_alloc+0x22/0x2a
 [<c046e9f2>] getname+0x1a/0xb0
 [<c0460d62>] do_sys_open+0x12/0xae
 [<c0460e2b>] sys_open+0x16/0x18
 [<c0403c9b>] syscall_call+0x7/0xb

2) oops, kernel could not allocate memory for skbuff
chatter: in arp querier: cannot make packet!
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000060

Tracing the message showed that control flows through the following code:

------------------------------------ code snippet start ------------------------------------
src/click/click-1.6.0/lib/packet.cc

WritablePacket *
Packet::make(uint32_t headroom, const unsigned char *data, uint32_t len, uint32_t tailroom)
{
    int want = 1;
    if (struct sk_buff *skb = skbmgr_allocate_skbs(headroom, len + tailroom, &want)) {
----
src/click/click-1.6.0/linuxmodule/skbmgr.cc

struct sk_buff *skbmgr_allocate_skbs(unsigned headroom, unsigned size, int *want)

RecycledSkbPool::allocate(unsigned headroom, unsigned size, int want, int *store_got)

    while (got < want) {
        struct sk_buff *skb = alloc_skb(size, GFP_ATOMIC);
#if DEBUG_SKBMGR
        _allocated++;
#endif
        if (!skb) {
            printk("<1>oops, kernel could not allocate memory for skbuff\n");
            break;
        }
------------------------------------ code snippet end ------------------------------------

At the time of the crash, the following was logged:

----------------------------------- LOG start ---------------------
Mem-info:
DMA per-cpu:
cpu 0 hot: high 0, batch 1 used:0
cpu 0 cold: high 0, batch 1 used:0
DMA32 per-cpu: empty
Normal per-cpu:
cpu 0 hot: high 186, batch 31 used:77
cpu 0 cold: high 62, batch 15 used:51
HighMem per-cpu:
cpu 0 hot: high 186, batch 31 used:6
cpu 0 cold: high 62, batch 15 used:7
Free pages: 902160kB (895592kB HighMem)
Active:61161 inactive:3887 dirty:64 writeback:0 unstable:0 free:225540 slab:183727 mapped-file:4522 mapped-anon:44493 pagetables:605
DMA free:3576kB min:144kB low:180kB high:216kB active:24kB inactive:0kB present:16384kB pages_scanned:312286092 all_unreclaimable? yes
lowmem_reserve[]: 0 0 880 4080
DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 880 4080
Normal free:2992kB min:8044kB low:10052kB high:12064kB active:0kB inactive:116kB present:901120kB pages_scanned:1056265289 all_unreclaimable? yes
lowmem_reserve[]: 0 0 0 25600
HighMem free:895592kB min:512kB low:7824kB high:15140kB active:244620kB inactive:15432kB present:3276800kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0 0
DMA: 0*4kB 1*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3576kB
DMA32: empty
Normal: 0*4kB 0*8kB 1*16kB 1*32kB 0*64kB 1*128kB 1*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 2992kB
HighMem: 7126*4kB 8504*8kB 7649*16kB 6080*32kB 3817*64kB 754*128kB 136*256kB 90*512kB 33*1024kB 11*2048kB 1*4096kB = 895592kB
20555 pagecache pages
Swap cache: add 27, delete 25, find 0/0, race 0+0
Free swap = 4192848kB
Total swap = 4192956kB
Free swap: 4192848kB
1048576 pages of RAM
819200 pages of HIGHMEM
569028 reserved pages
32658 pages shared
2 pages swap cached
64 pages dirty
0 pages writeback
4522 pages mapped
183727 pages slab
605 pages pagetables
oops, kernel could not allocate memory for skbuff
chatter: in arp querier: cannot make packet!
BUG: unable to handle kernel NULL pointer dereference at virtual address 00000060
printing eip:
fb6e3df9
*pde = 71b6d067
Oops: 0000 [#1]
last sysfs file: /devices/pci0000:00/0000:00:00.0/class
Modules linked in: xfrm4_mode_transport(U) krng(U) ansi_cprng(U) chainiv(U) rng(U) authenc(U) aes_generic(U) testmgr_cipher(U) aes_i586(U) cbc(U) hmac(U) crypto_hash(U) testmgr(U) crypto_blkcipher(U) cryptomgr(U) esp4(U) xfrm4_esp(U) aead(U) crypto_algapi(U) xfrm_nalgo(U) crypto_api(U) softdog(U) click(U) proclikefs(U) deflate(U) zlib_deflate(U) af_key(U) pkp_drv(PU) autofs4(U) kick(U) dm_mirror(U) dm_log(U) dm_multipath(U) scsi_dh(U) dm_mod(U) video(U) backlight(U) sbs(U) power_meter(U) hwmon(U) i2c_ec(U) dell_wmi(U) wmi(U) button(U) battery(U) asus_acpi(U) ac(U) parport_pc(U) lp(U) parport(U) joydev(U) sg(U) i2c_i801(U) i2c_core(U) ide_cd(U) cdrom(U) cdc_ether(U) usbnet(U) bnx2(U) pcspkr(U) mptctl(U) mptsas(U) mptscsih(U) mptbase(U) scsi_transport_sas(U) megaraid_sas(U) ata_piix(U) libata(U) sd_mod(U) scsi_mod(U) ext3(U) jbd(U) uhci_hcd(U) ohci_hcd(U) ehci_hcd(U)
CPU: 0
EIP: 0060:[<fb6e3df9>] Tainted: P VLI
EFLAGS: 00010292 (2.6.18-cisco.nac.3 #1)
EIP is at _ZN11ARPQuerier14send_query_forERK9IPAddressitit+0x299/0x4c0 [click]
eax: cb0fedc0 ebx: 00000001 ecx: f7fff0c0 edx: c957f300
esi: 00000000 edi: 00000060 ebp: 00000001 esp: f6bb0f14
ds: 007b es: 007b ss: 0068
Process kclick (pid: 2227, ti=f6bb0000 task=f745b000 task.ti=f6bb0000)
Stack: fb76e178 1e31a61d ed3e10a8 0000e602 00000000 f40b5600 ed3e1000 ed3e14c4
       f4aa2300 f4aa2300 ed3e1000 f40b5600 f38bf440 ed3e1000 000000f5 fb6e569c
       00000000 00000001 0000e602 2e02c13c 00ed4f4c f746e470 f746e470 ed3e14d8
Call Trace:
 [<fb6e569c>] _ZN11ARPQuerier11expire_hookEP5TimerPv+0x14c/0x230 [click]
 [<fb6a540a>] _ZN6Master10run_timersEv+0xca/0x100 [click]
 [<fb69c150>] _ZN12RouterThread6driverEv+0x190/0x290 [click]
 [<fb7452d2>] _Z11click_schedPv+0x82/0x130 [click]
 [<fb745250>] _Z11click_schedPv+0x0/0x130 [click]
 [<c040496b>] kernel_thread_helper+0x7/0x10
=======================
Code: 44 85 ed 0f 8e fa 00 00 00 31 d2 b9 2e 00 00 00 b8 1c 00 00 00 c7 04 24 00 00 00 00 e8 31 dd f9 ff 85 c0 89 c6 0f 84 34 01 00 00 <8b> 56 60 31 c0 8b be a0 00 00 00 89 d1 c1 e9 02 f3 ab f6 c2 02
EIP: [<fb6e3df9>] _ZN11ARPQuerier14send_query_forERK9IPAddressitit+0x299/0x4c0 [click] SS:ESP 0068:f6bb0f14
---------------------------------- LOG end ----------------------

My environment is the CentOS 5.5 kernel (2.6.18-194.el5) with click-1.6 loaded as a kernel module. Because we have customized the Click module, upgrading it would be difficult, and it was working very well with the CentOS 5.3 kernel.

Any suggestions or pointers to resolve this, or to find the root cause, would be appreciated.

Thanks in advance.

--
Krishna Mohan B
_______________________________________________
click mailing list
[email protected]
https://amsterdam.lcs.mit.edu/mailman/listinfo/click
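
One way to read the Mem-info dump in the message: on a 32-bit kernel with HighMem, slab allocations (which include sk_buffs) are served from the low zones (DMA and Normal), so the 895592kB of free HighMem does not help them. The sketch below only redoes that arithmetic from the figures in the log. The 4 kB page size is taken from the buddy lists in the dump; the constant and function names are invented for illustration. It is a back-of-the-envelope check, not a diagnosis of what is filling the slab.

```cpp
#include <cassert>
#include <cstdint>

// Figures copied from the Mem-info dump in the message.
constexpr std::uint64_t kPageKb          = 4;       // 4 kB pages, per the "N*4kB" buddy lists
constexpr std::uint64_t kSlabPages       = 183727;  // "183727 pages slab"
constexpr std::uint64_t kDmaPresentKb    = 16384;   // DMA "present:16384kB"
constexpr std::uint64_t kNormalPresentKb = 901120;  // Normal "present:901120kB"
constexpr std::uint64_t kNormalFreeKb    = 2992;    // Normal "free:2992kB"
constexpr std::uint64_t kNormalMinKb     = 8044;    // Normal "min:8044kB"

// Convert a page count to kilobytes.
constexpr std::uint64_t pages_to_kb(std::uint64_t pages) { return pages * kPageKb; }

// Integer percentage of `part` within `whole`, rounded down.
std::uint64_t percent_of(std::uint64_t part, std::uint64_t whole) {
    assert(whole > 0);
    return part * 100 / whole;
}

// Low memory = DMA + Normal zones (DMA32 is empty in this dump).
std::uint64_t lowmem_present_kb() { return kDmaPresentKb + kNormalPresentKb; }

// Total memory held by the slab allocator, in kB.
std::uint64_t slab_kb() { return pages_to_kb(kSlabPages); }

// True when the Normal zone's free memory is under its "min" watermark.
bool normal_zone_below_min() { return kNormalFreeKb < kNormalMinKb; }
```

With these figures the slab comes to 734908 kB, roughly 80% of the 917504 kB of low memory, while the Normal zone's 2992 kB free is below its min watermark of 8044 kB. Looking at /proc/slabinfo on the running machine would be one way to see which cache is growing.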

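
The log shows the chatter message immediately followed by a NULL dereference at offset 0x60 inside ARPQuerier::send_query_for. That ordering suggests, though the quoted snippets do not show it, that the caller went on to use the packet pointer after Packet::make failed. A minimal user-space sketch of the defensive pattern follows. FakePacket, make_packet and send_query are hypothetical stand-ins, not Click's API, and the 42-byte length is simply an Ethernet header (14 bytes) plus an IPv4 ARP payload (28 bytes).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <new>

// Hypothetical stand-in for a packet buffer; this is not Click's Packet class.
struct FakePacket {
    unsigned char data[64];
    std::size_t length;
};

// Hypothetical allocator: returns nullptr when `fail` is set, mimicking
// alloc_skb(size, GFP_ATOMIC) coming back empty under memory pressure.
FakePacket *make_packet(std::size_t len, bool fail) {
    if (fail || len > sizeof(FakePacket::data))
        return nullptr;
    FakePacket *p = new (std::nothrow) FakePacket;
    if (p)
        p->length = len;
    return p;
}

// Returns true if a query buffer was built, false if the query was dropped
// because no packet could be allocated.
bool send_query(bool simulate_oom) {
    FakePacket *q = make_packet(42, simulate_oom);
    if (!q)
        return false;  // bail out before touching q; a later retry can try again
    assert(q->length <= sizeof(q->data));
    std::memset(q->data, 0, q->length);  // safe: q is known to be non-null here
    delete q;
    return true;
}
```

Calling send_query(true) returns false without dereferencing anything, while send_query(false) builds and releases the buffer. Dropping a single ARP query this way would only avoid the oops; it would not address whatever is exhausting low memory in the first place.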