Re: [OmniOS-discuss] ILB memory leak?
> On Nov 10, 2015, at 2:50 AM, Al Slater wrote:
>
> On 10/11/2015 07:40, Al Slater wrote:
>> It seems to me that ilbd_run_probe just needs to call
>> posix_spawn_file_actions_destroy appropriately.
>
> And probably posix_spawnattr_destroy as well?

Wow!  Great catch.  I'll bet a small sum you nailed this to the wall.

Want me to build you a replacement ilbd?

Dan

___
OmniOS-discuss mailing list
OmniOS-discuss@lists.omniti.com
http://lists.omniti.com/mailman/listinfo/omnios-discuss
Re: [OmniOS-discuss] ILB memory leak?
On 10/11/15 15:26, Dan McDonald wrote:
>
>> On Nov 10, 2015, at 2:50 AM, Al Slater wrote:
>>
>> On 10/11/2015 07:40, Al Slater wrote:
>>> It seems to me that ilbd_run_probe just needs to call
>>> posix_spawn_file_actions_destroy appropriately.
>>
>> And probably posix_spawnattr_destroy as well?
>
> Wow!  Great catch.  I'll bet a small sum you nailed this to the wall.
>
> Want me to build you a replacement ilbd?

Yes please :)

Thanks for your, and Bob's, help with this.

--
Al Slater
Re: [OmniOS-discuss] ILB memory leak?
Hi Dan,

On 06/11/2015 18:31, Dan McDonald wrote:
> You said you had a test box, right?

Yes.

> Can you:
>
> - Disable UMEM_DEBUG
> - RESTART the service.
> - IMMEDIATELY after restart do pmap, and do pmap once per (sec, 10 sec,
>   something) to see how it grows?

Attached is a compressed file with 5 hours or so of pmaps taken every 10 seconds.  Hopefully not too big for the list.

> After that, maybe we can dtrace and see what's going on.

--
Al Slater
Technical Director
SCL
Phone : +44 (0)1273 07
Fax : +44 (0)1273 01
email : al.sla...@scluk.com
Stanton Consultancy Ltd
Park Gate, 161 Preston Road, Brighton, East Sussex, BN1 6AU
Registered in England Company number: 1957652
VAT number: GB 760 2433 55

Attachment: pmap.6589.gz (application/gzip)
Re: [OmniOS-discuss] ILB memory leak?
On 09/11/15 15:43, Dan McDonald wrote:
>
>> On Nov 9, 2015, at 8:39 AM, Al Slater wrote:
>>
>> Attached is a compressed file with 5hrs or so of 10s pmaps.
>> Hopefully not too big for the list.
>
> It compressed nicely.  I'm noticing a pattern:
>
> Mon Nov 9 08:21:45 UTC 2015   total Kb  134008  133504  131416 -
> Mon Nov 9 08:50:21 UTC 2015   total Kb  265080  264576  262488 -
> Mon Nov 9 09:37:42 UTC 2015   total Kb  265088  264580  262492 -
> Mon Nov 9 09:47:40 UTC 2015   total Kb  527232  526724  524636 -
> Mon Nov 9 11:42:19 UTC 2015   total Kb 1051520 1050960 1048872 -
> Mon Nov 9 11:42:29 UTC 2015   total Kb 1051520 1051012 1048924 -
>
> It's mostly linear growth.  Notice the time intervals also double
> whenever the footprint essentially doubles?
>
> So I need to back up and ask some things, especially given libumem
> doesn't appear to show leaks or even usage:
>
> 1.) Is the eating of memory affecting your system performance?  (If
> you've only 8GB, yeah, I can see that.)

Hmmm... I started investigating after the servers hung a couple of times.  I have not conclusively proved that this was the cause, but the machines have been running for months with no issue since I added a cronjob to restart ilb twice a day.

I can see a gradual increase in kernel memory use as well, but I have not investigated that.

> 2.) Is ilb failing after it gets sufficiently large?

Again, no link conclusively proved, but I did see log messages like the following when the memory use had grown to 4GB...

Nov 5 11:17:01 l1-lb2 ilbd[3041]: [ID 410242 daemon.error] ilbd_hc_probe_timer: cannot restart timer: rule ggp server _ggp.11, disabling it

I looked at the source for ilbd and I think this could be caused by a memory allocation failure in iu_schedule_timer.

After these messages were generated, it looks like the disabled servers were never re-enabled, so eventually this could end up with no enabled servers, and therefore no service, without manual intervention.

--
Al Slater
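
The doubling Dan points out can be checked from the six samples he quotes.  What follows is a minimal sketch in Python, not something posted to the list: it assumes the first figure on each "total Kb" line is the total mapped size in KB, and the names `samples`, `big_jumps` and `is_power_of_two` are made up for illustration.

```python
# Total mapped size in KB, copied from the six pmap samples quoted above.
samples = [
    ("08:21:45", 134008),
    ("08:50:21", 265080),
    ("09:37:42", 265088),
    ("09:47:40", 527232),
    ("11:42:19", 1051520),
    ("11:42:29", 1051520),
]


def is_power_of_two(n):
    """True when n is a positive exact power of two."""
    return n > 0 and n & (n - 1) == 0


def big_jumps(rows, threshold_kb=1024):
    """Return (time, delta_kb) for each step that grows by more than threshold_kb."""
    jumps = []
    for (_, prev), (when, cur) in zip(rows, rows[1:]):
        delta = cur - prev
        if delta > threshold_kb:
            jumps.append((when, delta))
    return jumps


jumps = big_jumps(samples)
for when, delta in jumps:
    note = "exact power of two" if is_power_of_two(delta) else ""
    print(when, delta // 1024, "MiB", note)
```

On these figures the three large steps come out at 128, 256 and 512 MiB, each an exact power of two, which fits the power-of-two anonymous mappings seen in the pmap listings elsewhere in the thread.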
Re: [OmniOS-discuss] ILB memory leak?
On 05/11/2015 14:57, Dan McDonald wrote:
>
>> On Nov 5, 2015, at 6:38 AM, Al Slater wrote:
>>
>> I have the 4GB core file.  Is there anything useful I can extract
>> from it to try and spot where the problem is?
>
> Your one ::findleaks showed nothing.  Did your 4GB corefile have
> ::findleaks show nothing as well?

::findleaks against the 4GB corefile showed nothing.

--
Al Slater
Re: [OmniOS-discuss] ILB memory leak?
> On Nov 6, 2015, at 3:11 AM, Al Slater wrote:
>
> On 05/11/2015 14:57, Dan McDonald wrote:
>>
>>> On Nov 5, 2015, at 6:38 AM, Al Slater wrote:
>>>
>>> I have the 4GB core file.  Is there anything useful I can extract from
>>> it to try and spot where the problem is?
>>
>> Your one ::findleaks showed nothing.  Did your 4GB corefile have ::findleaks
>> show nothing as well?
>
> ::findleaks against the 4GB corefile showed nothing.

None of the libumem stats show anything resembling leaks or even excessive allocation.

pmap(1) of the corefile is semi-interesting:

r151014(~/corefiles/Slater)[0]% pmap core.3041
core 'core.3041' of 3041:       /usr/lib/inet/ilbd
0802E000     104K rw---    [ stack ]
0805          76K r-x--  /usr/lib/inet/ilbd
08073000       4K rw---  /usr/lib/inet/ilbd
08074000    1252K rw---    [ heap ]
081B         256K rwx--    [ anon ]
0820        2048K rwx--    [ anon ]
0841         256K rwx--    [ anon ]
0846         512K rwx--    [ anon ]
084F        1024K rwx--    [ anon ]
0860        8192K rwx--    [ anon ]
08E1         256K rwx--    [ anon ]
08E6         512K rwx--    [ anon ]
08EF        1024K rwx--    [ anon ]
0900       65536K rwx--    [ anon ]
0D01         256K rwx--    [ anon ]
0D06         512K rwx--    [ anon ]
0D0F        1024K rwx--    [ anon ]
0D20      262144K rwx--    [ anon ]
1D21         256K rwx--    [ anon ]
1D26         512K rwx--    [ anon ]
1D2F        1024K rwx--    [ anon ]
1D40      524288K rwx--    [ anon ]
3D41         256K rwx--    [ anon ]
3D46         512K rwx--    [ anon ]
3D4F        1024K rwx--    [ anon ]
3D60     1048576K rwx--    [ anon ]
7D61         256K rwx--    [ anon ]
7D66         512K rwx--    [ anon ]
7D6F        1024K rwx--    [ anon ]
7D80     1048576K rwx--    [ anon ]
BD81         256K rwx--    [ anon ]
BD86         512K rwx--    [ anon ]
BD8F        1024K rwx--    [ anon ]
BDA0      524288K rwx--    [ anon ]
DDA1         256K rwx--    [ anon ]
DDA6         512K rwx--    [ anon ]
DDAF        1024K rwx--    [ anon ]
DDC0      262144K rwx--    [ anon ]
EDC1         256K rwx--    [ anon ]
EDC6         512K rwx--    [ anon ]
EDCF        1024K rwx--    [ anon ]
EDE0      131072K rwx--    [ anon ]
F5E1         256K rwx--    [ anon ]
F5E6         512K rwx--    [ anon ]
F5EF        1024K rwx--    [ anon ]
F600       65536K rwx--    [ anon ]
FA01         256K rwx--    [ anon ]
FA06         512K rwx--    [ anon ]
FA0F        1024K rwx--    [ anon ]
FA20       32768K rwx--    [ anon ]
FC21         256K rwx--    [ anon ]
FC26         512K rwx--    [ anon ]
FC2F        1024K rwx--    [ anon ]
FC40       16384K rwx--    [ anon ]
FD41         256K rwx--    [ anon ]
FD46         512K rwx--    [ anon ]
FD4F        1024K rwx--    [ anon ]
FD60        8192K rwx--    [ anon ]
FDE1         256K rwx--    [ anon ]
FDE6         512K rwx--    [ anon ]
FDEF        1024K rwx--    [ anon ]
FE00        4096K rwx--    [ anon ]
FE41         256K rwx--    [ anon ]
FE46         512K rwx--    [ anon ]
FE4F        1024K rwx--    [ anon ]
FE60        2048K rwx--    [ anon ]
FE82        1024K rwx--    [ anon ]
FE93        1024K rwx--    [ anon ]
FEA4         512K rwx--    [ anon ]
FEAD         256K rwx--    [ anon ]
FEB2         128K rwx--    [ anon ]
FEB5          64K rwx--    [ anon ]
FEB7          64K rwx--    [ anon ]
FEB9           4K rwx--    [ anon ]
FEBA          20K r-x--  /usr/lib/libilb.so.1
FEBB5000       4K rw---  /usr/lib/libilb.so.1
FEBC          32K r-x--  /lib/libuutil.so.1
FEBD8000       4K rw---  /lib/libuutil.so.1
FEBE           4K rwx--    [ anon ]
FEBF         172K r-x--  /lib/libscf.so.1
FEC2B000       4K rw---  /lib/libscf.so.1
FEC3          20K r-x--  /lib/libinetutil.so.1
FEC45000       4K rw---  /lib/libinetutil.so.1
FEC5           4K rwx--    [ anon ]
FEC6          20K r-x--  /lib/libcmdutils.so.1
FEC75000       4K rw---  /lib/libcmdutils.so.1
FEC8           4K r*       [ anon ]
FEC9          64K rwx--    [ anon ]
FECB          64K rwx--    [ anon ]
FECD         416K r-x--  /lib/libnsl.so.1
FED48000       8K rw---  /lib/libnsl.so.1
FED4A000      20K rw---  /lib/libnsl.so.1
FED5           4K rwx--    [ anon ]
FED6          52K r-x--  /lib/libsocket.so.1
FED7D000       4K rw---  /lib/libsocket.so.1
FED8          24K rwx--    [ anon ]
FED9        1252K r-x--  /lib/libc.so.1
FEED9000      36K rwx--  /lib/libc.so.1
FEEE2000       8K rwx--  /lib/libc.so.1
FEEF           4K rwx--    [ anon ]
FEF0         196K r-x--  /lib/libumem.so.1
FEF4           8K rwx--  /lib/libumem.so.1
FEF52000      76K rw---  /lib/libumem.so.1
FEF65000      24K rw---  /lib/libumem.so.1
FEF7           4K r*       [ anon ]
FEF8           4K rwx--    [ anon ]
FEF9           4K rw---    [ anon ]
FEFA           4K rw---    [ anon ]
FEFB           4K rwx--    [ anon ]
FEFB5000     216K r-x--  /lib/ld.so.1
FEFFB000       8K rwx--  /lib/ld.so.1
FEFFD000       4K rwx--  /lib/ld.so.1
 total   4040340K
r151014(~/corefiles/Slater)[0]%

Lots of LARGE anonymous mappings.  I wonder why that happened?  I'll dig into that a bit more.
Re: [OmniOS-discuss] ILB memory leak?
> On Nov 6, 2015, at 9:39 AM, Dan McDonald wrote:
>
> Lots of LARGE anonymous mappings.  I wonder why that happened?  I'll dig into
> that a bit more.

pmap(1) works even better on running processes.  Could you run, say, "pmap -xa `pgrep ilbd`" on your running machine?

Dan
Re: [OmniOS-discuss] ILB memory leak?
On 06/11/15 14:51, Dan McDonald wrote:
>
>> On Nov 6, 2015, at 9:39 AM, Dan McDonald wrote:
>>
>> Lots of LARGE anonymous mappings.  I wonder why that happened?  I'll dig into
>> that a bit more.
>
> pmap(1) works even better on running processes.  Could you run, say "pmap -xa
> `pgrep ilbd`" on your running machine?

Here you go...

root@loki:/export/home/BRIGHTON/aslate# pmap -xa `pgrep ilbd`
12346:  /usr/lib/inet/ilbd
 Address   Kbytes      RSS     Anon  Locked Mode   Mapped File
08027000      132      132      132       - rw---    [ stack ]
0805           76       76        -       - r-x--  ilbd
08073000        4        4        4       - rw---  ilbd
08074000       96        -        -       - rw---  ilbd
0808C000     1156     1140     1112       - rw---    [ heap ]
0D20       262144   262144   262144       - rwx--    [ anon ]
1D40       524288   524288   524288       - rwx--    [ anon ]
3D60      1048576  1048576  1048576       - rwx--    [ anon ]
7D80      1048576  1048576  1048576       - rwx--    [ anon ]
BDA0       524288   524288   524288       - rwx--    [ anon ]
DDC0       262144   262144   262144       - rwx--    [ anon ]
EDE0       131072   131072   131072       - rwx--    [ anon ]
F600        65536    65536    65536       - rwx--    [ anon ]
FA20        32768    32768    32768       - rwx--    [ anon ]
FC40        16384    16384    16384       - rwx--    [ anon ]
FD60         8192     8192     8192       - rwx--    [ anon ]
FE00         4096     4096     4096       - rwx--    [ anon ]
FE60         2048     2048     2048       - rwx--    [ anon ]
FE8A           36       16        -       - r-x--  libtsol.so.2
FE8B9000        4        4        4       - rw---  libtsol.so.2
FE8C            4        4        4       - rwx--    [ anon ]
FE8D          140      112        -       - r-x--  libbsm.so.1
FE903000       28       28       28       - rw---  libbsm.so.1
FE90A000        4        -        -       - rw---  libbsm.so.1
FE91           16       16        -       - r-x--  libsecdb.so.1
FE924000        4        4        4       - rw---  libsecdb.so.1
FE93         1024     1024     1024       - rwx--    [ anon ]
FEA4          512      512      512       - rwx--    [ anon ]
FEAD          256      256      256       - rwx--    [ anon ]
FEB2          128      128      128       - rwx--    [ anon ]
FEB5           64       64       64       - rwx--    [ anon ]
FEB7           64       16       16       - rwx--    [ anon ]
FEB9            4        4        4       - rwx--    [ anon ]
FEBA           20       20        -       - r-x--  libilb.so.1
FEBB5000        4        4        4       - rw---  libilb.so.1
FEBC           32       32        -       - r-x--  libuutil.so.1
FEBD8000        4        4        4       - rw---  libuutil.so.1
FEBE            4        4        4       - rwx--    [ anon ]
FEBF          172      148        -       - r-x--  libscf.so.1
FEC2B000        4        4        4       - rw---  libscf.so.1
FEC3           20       20        -       - r-x--  libinetutil.so.1
FEC45000        4        4        4       - rw---  libinetutil.so.1
FEC5            4        4        4       - rwx--    [ anon ]
FEC6           20       12        -       - r-x--  libcmdutils.so.1
FEC75000        4        4        4       - rw---  libcmdutils.so.1
FEC8            4        4        -       - r--s-  dev:528,24 ino:2821218250
FEC9           64       64        4       - rwx--    [ anon ]
FECB           64       64        4       - rwx--    [ anon ]
FECD          416      368        -       - r-x--  libnsl.so.1
FED48000        8        8        8       - rw---  libnsl.so.1
FED4A000       20       16        4       - rw---  libnsl.so.1
FED5            4        4        4       - rwx--    [ anon ]
FED6           52       48        -       - r-x--  libsocket.so.1
FED7D000        4        4        4       - rw---  libsocket.so.1
FED8           24       12       12       - rwx--    [ anon ]
FED9         1252      936        -       - r-x--  libc_hwcap1.so.1
FEED9000       36       36       32       - rwx--  libc_hwcap1.so.1
FEEE2000        8        8        8       - rwx--  libc_hwcap1.so.1
FEEF            4        4        4       - rwx--    [ anon ]
FEF0          196      112        -       - r-x--  libumem.so.1
FEF4            8        4        4       - rwx--  libumem.so.1
FEF52000       76       72       16       - rw---  libumem.so.1
FEF65000       24       24       24       - rw---  libumem.so.1
FEF7            4        4        -       - r--s-  ld.config
FEF8            4        4        4       - rwx--    [ anon ]
FEF9            4        4        4       - rw---    [ anon ]
FEFA            4        4        4       - rw---    [ anon ]
FEFB            4        4        4       - rwx--    [ anon ]
FEFB5000      216      216        -       - r-x--  ld.so.1
FEFFB000        8        8        8       - rwx--  ld.so.1
FEFFD000        4        4        4       - rwx--  ld.so.1
--------  -------  -------  -------  ------
total Kb  3936668  3935948  3933588       -

--
Al Slater
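
To put a number on how much of the process those anonymous mappings account for, here is a rough sketch in Python (not part of the original exchange).  It assumes the run-together digits in the archived listing split into Kbytes/RSS/Anon columns as the column header implies, and it arbitrarily counts only [ anon ] mappings of 2048 KB or more; the variable names are illustrative.

```python
# Kbytes column of every [ anon ] mapping of 2048 KB or more in the
# pmap -xa output above, in address order.
large_anon_kb = [
    262144, 524288, 1048576, 1048576, 524288, 262144, 131072,
    65536, 32768, 16384, 8192, 4096, 2048,
]
total_kb = 3936668  # Kbytes figure on the "total Kb" line

large_sum_kb = sum(large_anon_kb)
share = large_sum_kb / total_kb
all_powers_of_two = all(kb & (kb - 1) == 0 for kb in large_anon_kb)

print(f"{large_sum_kb} KB in {len(large_anon_kb)} mappings, {share:.1%} of the total")
print("every size is a power of two:", all_powers_of_two)
```

On these figures thirteen mappings hold 3930112 KB, about 99.8% of the process, and every one of them is a power-of-two size.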
Re: [OmniOS-discuss] ILB memory leak?
> On Nov 6, 2015, at 10:57 AM, Al Slater wrote:
>
> 7D80      1048576  1048576  1048576       - rwx--    [ anon ]
> BDA0       524288   524288   524288       - rwx--    [ anon ]
> DDC0       262144   262144   262144       - rwx--    [ anon ]
> EDE0       131072   131072   131072       - rwx--    [ anon ]

More huge anonymous mappings (1G, 512MB, 256MB, 128MB).

I don't know pmap as well as I should.  I don't see anything in the man page to give me further insight into why these chunks of memory are being eaten.

Dan
Re: [OmniOS-discuss] ILB memory leak?
> On Nov 6, 2015, at 11:25 AM, Dan McDonald wrote:
>
>> On Nov 6, 2015, at 10:57 AM, Al Slater wrote:
>>
>> 7D80      1048576  1048576  1048576       - rwx--    [ anon ]
>> BDA0       524288   524288   524288       - rwx--    [ anon ]
>> DDC0       262144   262144   262144       - rwx--    [ anon ]
>> EDE0       131072   131072   131072       - rwx--    [ anon ]
>
> More huge anonymous mappings (1G, 512MB, 256MB, 128MB).

You said you had a test box, right?  Can you:

- Disable UMEM_DEBUG
- RESTART the service.
- IMMEDIATELY after restart do pmap, and do pmap once per (sec, 10 sec, something) to see how it grows?

After that, maybe we can dtrace and see what's going on.

Dan
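
One way to collect the periodic pmap samples Dan asks for is sketched below in Python.  This is not the script Al actually used (the thread does not show it).  It assumes `pmap -x` ends with a line starting `total Kb`, as in the output quoted in this thread, and the helper names `parse_total_kb` and `sample` are made up for illustration.

```python
import subprocess
import time


def parse_total_kb(pmap_output):
    """Return the Kbytes figure from the 'total Kb' line of pmap -x output, or None."""
    for line in pmap_output.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "total" and fields[1] == "Kb":
            return int(fields[2])
    return None


def sample(pid, interval_s=10, count=6):
    """Run `pmap -x <pid>` count times, interval_s apart; yield (time, total_kb)."""
    for _ in range(count):
        result = subprocess.run(
            ["pmap", "-x", str(pid)], capture_output=True, text=True
        )
        yield time.strftime("%H:%M:%S"), parse_total_kb(result.stdout)
        time.sleep(interval_s)
```

A run might look like `for when, kb in sample(12346): print(when, kb)`, with the PID taken from `pgrep ilbd`.  Logging only the total keeps a multi-hour capture small; the full listing is still needed to see which mappings grow.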
Re: [OmniOS-discuss] ILB memory leak?
On Fri, 6 Nov 2015, Dan McDonald wrote:
> More huge anonymous mappings (1G, 512MB, 256MB, 128MB).
>
> I don't know pmap as well as I should.  I don't see anything in the
> man page to give me further insight into why these chunks of memory
> are being eaten.

It is pretty common for memory allocators to use anonymous mappings for large memory allocations.  This allows releasing memory back to the system.

Some applications use algorithms where they double the memory size request from the previous request when a little more memory is required, in order to lessen the hit from many realloc() calls.  This might explain the power-of-two sizes.  If this is being done, the smaller power-of-two allocations may be a bug.

Tracing mmap() calls on the program while it is running might reveal something.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
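
As a minimal illustration of why a program might double its requests, the Python sketch below counts how many resizes it takes to grow a buffer from 2048 KB to 1048576 KB (the smallest and largest of the big mappings in the pmap output) under two growth policies.  It models only the general strategy Bob describes, not what ilbd or its allocator actually does; `resizes_needed` is a hypothetical helper.

```python
def resizes_needed(target, start, grow):
    """Count how many times a buffer of size start must be resized to reach target."""
    capacity, steps = start, 0
    while capacity < target:
        capacity = grow(capacity)
        steps += 1
    return steps


target_kb = 1048576  # 1 GiB expressed in KB
start_kb = 2048      # 2 MiB expressed in KB

# Policy 1: double the previous size each time more room is needed.
doubling_steps = resizes_needed(target_kb, start_kb, lambda cap: cap * 2)
# Policy 2: add a fixed 2048 KB each time.
fixed_steps = resizes_needed(target_kb, start_kb, lambda cap: cap + 2048)

print("doubling:", doubling_steps, "resizes")
print("fixed increment:", fixed_steps, "resizes")
```

Doubling gets there in 9 resizes against 511 for the fixed increment.  If each earlier region were kept rather than released, doubling would also leave behind a chain of power-of-two sized regions.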
Re: [OmniOS-discuss] ILB memory leak?
To the mailing list as well...

On 22/10/2015 09:43, Al Slater wrote:
> On 21/10/2015 17:35, Dan McDonald wrote:
>>
>>> On Oct 21, 2015, at 6:08 AM, Al Slater wrote:
>>>
>>> Hi,
>>>
>>> I am running omnios r151014 on a couple of machines with a couple
>>> of zones each.  1 zone runs apache as an SSL reverse proxy, the
>>> other runs ILB for load balancing web to app tier connections.
>>>
>>> I noticed that in the ILB zone, the ilbd process memory grows to
>>> about 2GB.  Restarting ILB releases the memory, and then the
>>> memory usage gradually increases again, with each memory increase
>>> approximately 2 * the size of the previous one.  I run a cronjob
>>> twice a day (8am and 8pm) which restarts the ilb service and
>>> releases the memory.
>>>
>>> A graph of memory usage is available at
>>> https://www.dropbox.com/s/zaz51apxslnivlq/ILB_Memory_2_days.png?dl=0
>>>
>>> There are currently 62 rules in the load balancer, with a total
>>> of 664 server/port pairs.
>>>
>>> Is there anything I can provide that would help track this down?
>>
>> You can use svccfg(1M) to enable user-level memory debugging on ilb.
>> It may cause the ilb daemon to dump core.  (And you're just noticing
>> this in the process, not kernel memory consumption, correct?)
>
> I am seeing kernel memory consumption increasing as well, but that may
> be a different issue.  The ilbd process memory is definitely growing.
>
>> As root:
>>
>> svcadm disable -t ilb
>> svccfg -s ilb setenv LD_PRELOAD libumem.so
>> svccfg -s ilb setenv UMEM_DEBUG default
>> svccfg -s ilb refresh
>> svcadm enable ilb
>>
>> That should enable user-level memory debugging.  If you get a
>> coredump, save it and share it.  If you don't and the ilb daemon
>> keeps running, eventually please:
>>
>> gcore `pgrep ilbd`
>>
>> and share THAT corefile.  You can also do this by yourself:
>>
>> mdb <corefile>
>> > ::findleaks
>>
>> and share ::findleaks.
>>
>> Once you're done generating corefiles, repeat the steps above, but
>> use "unsetenv LD_PRELOAD" and "unsetenv UMEM_DEBUG" instead of the
>> setenv lines.
>
> Thanks Dan.  As we are talking about production boxes here, I will have
> to try and reproduce on another box and then I will give the process
> above a go and see what we come up with.

I have reproduced the problem on a test box.

prstat shows:

  3041 daemon   3946M 3946M sleep   59    0   0:48:03 0.1% ilbd/1

memstat:

root@loki:/export/home/BRIGHTON/aslate# echo ::memstat | mdb -k
Page Summary                Pages                MB  %Tot
Kernel                     238420               931   12%
ZFS File Data              630861              2464   31%
Anon                      1054835              4120   51%
Exec and libs                2204                 8    0%
Page cache                  10624                41    1%
Free (cachelist)             9236                36    0%
Free (freelist)            105626               412    5%

Total                     2051806              8014
Physical                  2051805              8014

mdb findleaks:

root@loki:/export/home/BRIGHTON/aslate# mdb core.3041
Loading modules: [ libumem.so.1 libc.so.1 libcmdutils.so.1 libuutil.so.1 ld.so.1 ]
> ::findleaks
findleaks: no memory leaks detected
>

Now, I am seeing lots of log messages like the following in /var/adm/messages:

Nov 5 11:17:01 l1-lb2 ilbd[3041]: [ID 410242 daemon.error] ilbd_hc_probe_timer: cannot restart timer: rule ggp server _ggp.11, disabling it

So, I was wrong about growing to 2GB; the truth is nearer 4GB.

I am guessing that ilbd_hc_restart_timer is failing because no more memory can be allocated.

I have the 4GB core file.  Is there anything useful I can extract from it to try and spot where the problem is?

--
Al Slater
Re: [OmniOS-discuss] ILB memory leak?
Hi Dan,

On 05/11/2015 14:57, Dan McDonald wrote:
>
>> On Nov 5, 2015, at 6:38 AM, Al Slater wrote:
>>
>> I have the 4GB core file.  Is there anything useful I can extract
>> from it to try and spot where the problem is?
>
> Your one ::findleaks showed nothing.  Did your 4GB corefile have
> ::findleaks show nothing as well?
>
> ::umausers may be helpful.

root@loki:/export/home/BRIGHTON/aslate# mdb core.3041
Loading modules: [ libumem.so.1 libc.so.1 libcmdutils.so.1 libuutil.so.1 ld.so.1 ]
> ::umausers
71424 bytes for 62 allocations with data size 1152:
         libumem.so.1`umem_cache_alloc_debug+0x1fe
         libumem.so.1`umem_cache_alloc+0x18f
         libumem.so.1`umem_alloc+0x50
         libumem.so.1`umem_malloc+0x36
         libumem.so.1`calloc+0x50
         i_ilbd_alloc_sg+0x13
         ilbd_create_sg+0x9a
         ilbd_scf_instance_walk_pg+0x2a6
         ilbd_walk_sg_pgs+0x37
         i_ilbd_read_config+0x28
         main_loop+0x7f
         main+0x1d3
         _start+0x83
53120 bytes for 664 allocations with data size 80:
         libumem.so.1`umem_cache_alloc_debug+0x1fe
         libumem.so.1`umem_cache_alloc+0x18f
         libumem.so.1`umem_alloc+0x50
         libumem.so.1`umem_malloc+0x36
         libumem.so.1`calloc+0x50
         ilbd_hc_srv_add+0x18
         ilbd_hc_associate_rule+0xd8
         ilbd_create_rule+0x1a3
         ilbd_scf_instance_walk_pg+0x1c4
         ilbd_walk_rule_pgs+0x37
         i_ilbd_read_config+0x4e
         main_loop+0x7f
         main+0x1d3
         _start+0x83
53120 bytes for 664 allocations with data size 80:
         libumem.so.1`umem_cache_alloc_debug+0x1fe
         libumem.so.1`umem_cache_alloc+0x18f
         libumem.so.1`umem_alloc+0x50
         libumem.so.1`umem_malloc+0x36
         libumem.so.1`calloc+0x50
         i_add_srv2sg+0x15
         ilbd_add_server_to_group+0x310
         ilbd_scf_instance_walk_pg+0x2dd
         ilbd_walk_sg_pgs+0x37
         i_ilbd_read_config+0x28
         main_loop+0x7f
         main+0x1d3
         _start+0x83
31584 bytes for 658 allocations with data size 48:
         libumem.so.1`umem_cache_alloc_debug+0x1fe
         libumem.so.1`umem_cache_alloc+0x99
         libumem.so.1`umem_alloc+0x50
         libumem.so.1`umem_malloc+0x36
         libumem.so.1`calloc+0x50
         libinetutil.so.1`iu_schedule_timer_ms+0x2d
         libinetutil.so.1`iu_schedule_timer+0x37
         ilbd_hc_restart_timer+0xbc
         ilbd_hc_probe_timer+0x23
         libinetutil.so.1`iu_expire_timers+0xbe
         ilbd_hc_timeout+0x11
         main_loop+0xe6
         main+0x1d3
         _start+0x83
12288 bytes for 1 allocations with data size 12288:
         libumem.so.1`umem_cache_alloc_debug+0x1fe
         libumem.so.1`umem_cache_alloc+0x18f
         libumem.so.1`umem_alloc+0x50
         libumem.so.1`umem_malloc+0x36
         libc.so.1`ltzset_u+0xa2
         libc.so.1`localtime_r+0x35
         libc.so.1`ctime_r+0x2c
         libc.so.1`vsyslog+0x1e4
         ilbd_log+0x48
         main+0x15e
         _start+0x83
10368 bytes for 54 allocations with data size 192:
         libumem.so.1`umem_cache_alloc_debug+0x1fe
         libumem.so.1`umem_cache_alloc+0x99
         libumem.so.1`umem_alloc+0x50
         libumem.so.1`umem_malloc+0x36
         libumem.so.1`calloc+0x50
         i_alloc_ilbd_rule+0x17
         ilbd_create_rule+0xfa
         ilbd_scf_instance_walk_pg+0x1c4
         ilbd_walk_rule_pgs+0x37
         i_ilbd_read_config+0x4e
         main_loop+0x7f
         main+0x1d3
         _start+0x83

> Sharing the corefile would also be helpful.

I have put it on dropbox:
https://www.dropbox.com/s/y6cv78d1xk5j5u7/core.3041.gz?dl=0

> I'm assuming, given you see problems at 4GB that ilbd is a 32-bit
> process, right?

Yes,

# file /usr/lib/inet/ilbd
/usr/lib/inet/ilbd: ELF 32-bit LSB executable 80386 Version 1, dynamically linked, not stripped, no debugging information available

cheers

--
Al Slater
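
To see how little of the footprint libumem is tracking, the six ::umausers entries above can be totalled and compared with the size of the core.  The Python sketch below (not part of the original message) uses only the entries shown here, and it assumes the 4040340K total that pmap reported for core.3041 elsewhere in this thread; the variable names are illustrative.

```python
# (allocation count, data size in bytes) for the six ::umausers entries above.
umausers = [(62, 1152), (664, 80), (664, 80), (658, 48), (1, 12288), (54, 192)]
# Byte totals as printed by ::umausers, in the same order.
reported = [71424, 53120, 53120, 31584, 12288, 10368]

computed = [count * size for count, size in umausers]
tracked_bytes = sum(computed)

core_total_kb = 4040340  # "total" line of pmap on core.3041
fraction = tracked_bytes / (core_total_kb * 1024)

print("count x size matches reported totals:", computed == reported)
print(f"{tracked_bytes} bytes tracked, {fraction:.4%} of the mapped total")
```

The six call sites add up to 231904 bytes, well under 0.01% of the mapped total, which is consistent with Dan's remark that the libumem statistics show nothing resembling excessive allocation.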
Re: [OmniOS-discuss] ILB memory leak?
> On Nov 5, 2015, at 6:38 AM, Al Slater wrote:
>
> I have the 4GB core file.  Is there anything useful I can extract from
> it to try and spot where the problem is?

Your one ::findleaks showed nothing.  Did your 4GB corefile have ::findleaks show nothing as well?

::umausers may be helpful.

Sharing the corefile would also be helpful.

I'm assuming, given you see problems at 4GB, that ilbd is a 32-bit process, right?

Thanks,
Dan
Re: [OmniOS-discuss] ILB memory leak?
On 22/10/15 10:43, Al Slater wrote:
> I am seeing kernel memory consumption increasing as well, but that may
> be a different issue.  The ilbd process memory is definitely growing.

This is indeed probably a different issue, but it would be useful to start a thread on illumos discuss, as I'm seeing it as well (not using ILB).

For example, running a number of rather intensive builds, I see kernel memory steadily going up to ~40%:

richard@omnis:/home/richard$ swap -hs ; echo ::memstat |pfexec mdb -k
total: 1,8G allocated + 311M reserved = 2,1G used, 40G available
Page Summary                Pages                MB  %Tot
Kernel                    3231113             12621   39%
ZFS File Data             2944763             11502   35%
Anon                       452803              1768    5%
Exec and libs                5088                19    0%
Page cache                  65892               257    1%
Free (cachelist)            70820               276    1%
Free (freelist)           1614595              6307   19%

Total                     8385074             32754
Physical                  8385072             32754

--
Richard PALO
Re: [OmniOS-discuss] ILB memory leak?
On 21/10/2015 17:35, Dan McDonald wrote:
>
>> On Oct 21, 2015, at 6:08 AM, Al Slater wrote:
>>
>> Hi,
>>
>> I am running omnios r151014 on a couple of machines with a couple
>> of zones each.  1 zone runs apache as an SSL reverse proxy, the
>> other runs ILB for load balancing web to app tier connections.
>>
>> I noticed that in the ILB zone, the ilbd process memory grows to
>> about 2GB.  Restarting ILB releases the memory, and then the
>> memory usage gradually increases again, with each memory increase
>> approximately 2 * the size of the previous one.  I run a cronjob
>> twice a day (8am and 8pm) which restarts the ilb service and
>> releases the memory.
>>
>> A graph of memory usage is available at
>> https://www.dropbox.com/s/zaz51apxslnivlq/ILB_Memory_2_days.png?dl=0
>>
>> There are currently 62 rules in the load balancer, with a
>> total of 664 server/port pairs.
>>
>> Is there anything I can provide that would help track this down?
>
> You can use svccfg(1M) to enable user-level memory debugging on ilb.
> It may cause the ilb daemon to dump core.  (And you're just noticing
> this in the process, not kernel memory consumption, correct?)

I am seeing kernel memory consumption increasing as well, but that may be a different issue.  The ilbd process memory is definitely growing.

> As root:
>
> svcadm disable -t ilb
> svccfg -s ilb setenv LD_PRELOAD libumem.so
> svccfg -s ilb setenv UMEM_DEBUG default
> svccfg -s ilb refresh
> svcadm enable ilb
>
> That should enable user-level memory debugging.  If you get a
> coredump, save it and share it.  If you don't and the ilb daemon
> keeps running, eventually please:
>
> gcore `pgrep ilbd`
>
> and share THAT corefile.  You can also do this by yourself:
>
> mdb <corefile>
> > ::findleaks
>
> and share ::findleaks.
>
> Once you're done generating corefiles, repeat the steps above, but
> use "unsetenv LD_PRELOAD" and "unsetenv UMEM_DEBUG" instead of the
> setenv lines.

Thanks Dan.  As we are talking about production boxes here, I will have to try and reproduce on another box and then I will give the process above a go and see what we come up with.

--
Al Slater
Re: [OmniOS-discuss] ILB memory leak?
Al Slater writes:
> On 21/10/2015 17:35, Dan McDonald wrote:
>>
>> That should enable user-level memory debugging.  If you get a
>> coredump, save it and share it.  If you don't and the ilb daemon
>> keeps running, eventually please:
>>
>> gcore `pgrep ilbd`
>>
>> and share THAT corefile.  You can also do this by yourself:
>>
>> mdb <corefile>
>> > ::findleaks
>>
>> and share ::findleaks.
>>
>> Once you're done generating corefiles, repeat the steps above, but
>> use "unsetenv LD_PRELOAD" and "unsetenv UMEM_DEBUG" instead of the
>> setenv lines.
>
> Thanks Dan.  As we are talking about production boxes here, I will have
> to try and reproduce on another box and then I will give the process
> above a go and see what we come up with.

You can also use the DTrace pid provider to grab the user stack on every malloc(3C) call, and the syscall provider to track mmap(2) calls.  That poses no harm to production and might make the cause of memory usage obvious.  Something like:

dtrace -qn '
  pid$target::malloc:entry { @[ustack()] = count(); }
  syscall::mmap*:entry /pid == $target/ { @[ustack()] = count(); }
' -p <PID>

Let that run for a while as the memory grows, then Ctrl-C.

-Z
Re: [OmniOS-discuss] ILB memory leak?
> On Oct 21, 2015, at 6:08 AM, Al Slater wrote:
>
> Hi,
>
> I am running omnios r151014 on a couple of machines with a couple of zones
> each.  1 zone runs apache as an SSL reverse proxy, the other runs ILB for
> load balancing web to app tier connections.
>
> I noticed that in the ILB zone, the ilbd process memory grows to about 2GB.
> Restarting ILB releases the memory, and then the memory usage gradually
> increases again, with each memory increase approximately 2 * the size of the
> previous one.  I run a cronjob twice a day (8am and 8pm) which restarts the
> ilb service and releases the memory.
>
> A graph of memory usage is available at
> https://www.dropbox.com/s/zaz51apxslnivlq/ILB_Memory_2_days.png?dl=0
>
> There are currently 62 rules in the load balancer, with a total of 664
> server/port pairs.
>
> Is there anything I can provide that would help track this down?

You can use svccfg(1M) to enable user-level memory debugging on ilb.  It may cause the ilb daemon to dump core.  (And you're just noticing this in the process, not kernel memory consumption, correct?)

As root:

svcadm disable -t ilb
svccfg -s ilb setenv LD_PRELOAD libumem.so
svccfg -s ilb setenv UMEM_DEBUG default
svccfg -s ilb refresh
svcadm enable ilb

That should enable user-level memory debugging.  If you get a coredump, save it and share it.  If you don't and the ilb daemon keeps running, eventually please:

gcore `pgrep ilbd`

and share THAT corefile.  You can also do this by yourself:

mdb <corefile>
> ::findleaks

and share ::findleaks.

Once you're done generating corefiles, repeat the steps above, but use "unsetenv LD_PRELOAD" and "unsetenv UMEM_DEBUG" instead of the setenv lines.

Hope this helps,
Dan
Re: [OmniOS-discuss] ILB memory leak?
On Wed, 21 Oct 2015, Dan McDonald wrote:
> You can use svccfg(1M) to enable user-level memory debugging on ilb.
> It may cause the ilb daemon to dump core.  (And you're just noticing
> this in the process, not kernel memory consumption, correct?)
>
> As root:
>
> svcadm disable -t ilb
> svccfg -s ilb setenv LD_PRELOAD libumem.so
> svccfg -s ilb setenv UMEM_DEBUG default
> svccfg -s ilb refresh
> svcadm enable ilb

Is there a way to use ulimit to limit the data segment size (ulimit -d)?  If this is possible, then a dumped core (due to hitting the limit) may point directly to the guilty party.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
[OmniOS-discuss] ILB memory leak?
Hi,

I am running omnios r151014 on a couple of machines with a couple of zones each.  1 zone runs apache as an SSL reverse proxy, the other runs ILB for load balancing web to app tier connections.

I noticed that in the ILB zone, the ilbd process memory grows to about 2GB.  Restarting ILB releases the memory, and then the memory usage gradually increases again, with each memory increase approximately 2 * the size of the previous one.  I run a cronjob twice a day (8am and 8pm) which restarts the ilb service and releases the memory.

A graph of memory usage is available at
https://www.dropbox.com/s/zaz51apxslnivlq/ILB_Memory_2_days.png?dl=0

There are currently 62 rules in the load balancer, with a total of 664 server/port pairs.

Is there anything I can provide that would help track this down?

--
Al Slater