Hi Luis, check the Apache configuration carefully, above all the settings
that control how much RAM it allocates, or something along those lines.
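
While you look at that, it may help to read the three buddy-allocator lines
near the end of the log you quoted. The first log line says the failed
request was order=2, that is, four contiguous 4 kB pages (16 kB). Below is a
minimal sketch that totals, per zone, the free memory held in blocks of at
least that size. The function names are my own, the log lines are unwrapped
copies of yours, and treating order=2 as the relevant size is my reading of
the log, not a confirmed diagnosis.

```python
import re

# Free-list summary lines copied (unwrapped) from the quoted kernel log.
LOG = """\
DMA: 43*4kB 33*8kB 37*16kB 56*32kB 12*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 3588kB
Normal: 39027*4kB 13466*8kB 14*16kB 7*32kB 2*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 264412kB
HighMem: 63*4kB 105428*8kB 250783*16kB 162467*32kB 59014*64kB 32018*128kB 9763*256kB 1585*512kB 117*1024kB 6*2048kB 0*4096kB = 21373292kB
"""

PAGE_KB = 4  # page size on this 32-bit x86 machine


def parse_zones(text):
    """Return {zone: {block_size_kB: free_block_count}} for each summary line."""
    zones = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\w+):\s+(.*?)\s*=\s*\d+kB", line)
        if not m:
            continue
        pairs = re.findall(r"(\d+)\*(\d+)kB", m.group(2))
        zones[m.group(1)] = {int(size): int(count) for count, size in pairs}
    return zones


def free_kb_at_or_above(blocks, order):
    """Sum the free kB held in blocks of at least 2**order contiguous pages."""
    min_kb = PAGE_KB << order
    return sum(size * count for size, count in blocks.items() if size >= min_kb)


zones = parse_zones(LOG)
for name, blocks in zones.items():
    total = sum(size * count for size, count in blocks.items())
    usable = free_kb_at_or_above(blocks, order=2)
    print(f"{name}: {total} kB free, {usable} kB in blocks of 16 kB or larger")
```

For the Normal zone this comes to 576 kB in 16 kB-or-larger blocks out of
264412 kB free, so nearly all of that zone's free memory sits in 4 kB and
8 kB fragments.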

I hope this pointer helps.

Regards

On 2 Apr 2017 13:12, "luis gil" <[email protected]> wrote:

> Good afternoon. I hope you can help me. I have a server running Debian 6,
> with mysql 5.1 and apache 2 serving a PHP application. The machine has
> 31 G of RAM and 24 processors with 6 cores each. The OS is 32-bit.
>
> From time to time, several times a day and usually during heavy database
> queries, mysql restarts, with consequences ranging from unnoticeable to
> disastrous. Looking at the syslog I see the following:
>
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058476] apache2 invoked
> oom-killer: gfp_mask=0x44d0, order=2, oom_adj=0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058481] apache2 cpuset=/
> mems_allowed=0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058487] Pid: 10826, comm:
> apache2 Tainted: G        W  2.6.32-5-686-bigmem #1
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058491] Call Trace:
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058501]  [<c1090380>] ?
> oom_kill_process+0x60/0x201
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058508]  [<c10908fd>] ?
> __out_of_memory+0xf4/0x107
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058514]  [<c109096a>] ?
> out_of_memory+0x5a/0x7c
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058519]  [<c109322c>] ?
> __alloc_pages_nodemask+0x3ef/0x4d9
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058526]  [<c1093322>] ?
> __get_free_pages+0xc/0x17
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058532]  [<c10b6fb9>] ?
> __kmalloc_track_caller+0x34/0x124
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058540]  [<c11e1289>] ?
> sock_alloc_send_pskb+0xaa/0x25f
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058544]  [<c11e5052>] ?
> __alloc_skb+0x4a/0x115
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058550]  [<c11e1289>] ?
> sock_alloc_send_pskb+0xaa/0x25f
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058558]  [<c102ac40>] ?
> __wake_up_sync_key+0x33/0x49
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058565]  [<c11449e3>] ?
> copy_from_user+0x27/0x10e
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058571]  [<c11e144a>] ?
> sock_alloc_send_skb+0xc/0xf
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058578]  [<c12421f7>] ?
> unix_stream_sendmsg+0x134/0x2c4
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058585]  [<c11de589>] ?
> __sock_sendmsg+0x43/0x4a
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058591]  [<c11de633>] ?
> sock_aio_write+0xa3/0xb0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058598]  [<c10bb522>] ?
> do_sync_write+0xc0/0x107
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058605]  [<c104c062>] ?
> run_posix_cpu_timers+0x65f/0x67a
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058612]  [<c104abbe>] ?
> autoremove_wake_function+0x0/0x2d
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058618]  [<c10bb20d>] ?
> fsnotify_access+0x5a/0x61
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058625]  [<c110b1d0>] ?
> security_file_permission+0xc/0xd
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058630]  [<c10bbe3d>] ?
> vfs_write+0x8f/0xd6
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058635]  [<c10bbf1c>] ?
> sys_write+0x3c/0x63
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058641]  [<c100813b>] ?
> sysenter_do_call+0x12/0x28
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058644] Mem-Info:
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058646] DMA per-cpu:
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058649] CPU    0: hi:    0,
> btch:   1 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058652] CPU    1: hi:    0,
> btch:   1 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058656] CPU    2: hi:    0,
> btch:   1 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058659] CPU    3: hi:    0,
> btch:   1 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058663] CPU    4: hi:    0,
> btch:   1 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058666] CPU    5: hi:    0,
> btch:   1 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058668] CPU    6: hi:    0,
> btch:   1 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058671] CPU    7: hi:    0,
> btch:   1 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058674] CPU    8: hi:    0,
> btch:   1 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058677] CPU    9: hi:    0,
> btch:   1 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058679] CPU   10: hi:    0,
> btch:   1 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058682] CPU   11: hi:    0,
> btch:   1 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058685] CPU   12: hi:    0,
> btch:   1 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058689] CPU   13: hi:    0,
> btch:   1 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058692] CPU   14: hi:    0,
> btch:   1 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058695] CPU   15: hi:    0,
> btch:   1 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058698] CPU   16: hi:    0,
> btch:   1 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058701] CPU   17: hi:    0,
> btch:   1 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058704] CPU   18: hi:    0,
> btch:   1 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058708] CPU   19: hi:    0,
> btch:   1 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058711] CPU   20: hi:    0,
> btch:   1 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058714] CPU   21: hi:    0,
> btch:   1 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058717] CPU   22: hi:    0,
> btch:   1 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058721] CPU   23: hi:    0,
> btch:   1 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058723] Normal per-cpu:
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058726] CPU    0: hi:  186,
> btch:  31 usd: 178
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058730] CPU    1: hi:  186,
> btch:  31 usd: 118
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058733] CPU    2: hi:  186,
> btch:  31 usd:  54
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058736] CPU    3: hi:  186,
> btch:  31 usd: 107
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058739] CPU    4: hi:  186,
> btch:  31 usd:  81
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058743] CPU    5: hi:  186,
> btch:  31 usd: 113
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058746] CPU    6: hi:  186,
> btch:  31 usd:  73
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058750] CPU    7: hi:  186,
> btch:  31 usd:   2
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058753] CPU    8: hi:  186,
> btch:  31 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058757] CPU    9: hi:  186,
> btch:  31 usd:  59
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058760] CPU   10: hi:  186,
> btch:  31 usd:  30
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058764] CPU   11: hi:  186,
> btch:  31 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058767] CPU   12: hi:  186,
> btch:  31 usd: 174
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058771] CPU   13: hi:  186,
> btch:  31 usd: 168
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058774] CPU   14: hi:  186,
> btch:  31 usd: 132
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058778] CPU   15: hi:  186,
> btch:  31 usd: 106
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058781] CPU   16: hi:  186,
> btch:  31 usd:  99
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058785] CPU   17: hi:  186,
> btch:  31 usd:  91
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058788] CPU   18: hi:  186,
> btch:  31 usd:  50
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058792] CPU   19: hi:  186,
> btch:  31 usd:  38
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058795] CPU   20: hi:  186,
> btch:  31 usd:  95
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058799] CPU   21: hi:  186,
> btch:  31 usd:  70
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058802] CPU   22: hi:  186,
> btch:  31 usd:  61
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058806] CPU   23: hi:  186,
> btch:  31 usd:  61
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058809] HighMem per-cpu:
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058812] CPU    0: hi:  186,
> btch:  31 usd:  69
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058815] CPU    1: hi:  186,
> btch:  31 usd:  45
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058819] CPU    2: hi:  186,
> btch:  31 usd:  26
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058823] CPU    3: hi:  186,
> btch:  31 usd: 160
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058827] CPU    4: hi:  186,
> btch:  31 usd: 108
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058830] CPU    5: hi:  186,
> btch:  31 usd: 162
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058834] CPU    6: hi:  186,
> btch:  31 usd: 128
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058837] CPU    7: hi:  186,
> btch:  31 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058841] CPU    8: hi:  186,
> btch:  31 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058844] CPU    9: hi:  186,
> btch:  31 usd:   4
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058848] CPU   10: hi:  186,
> btch:  31 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058851] CPU   11: hi:  186,
> btch:  31 usd:   0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058855] CPU   12: hi:  186,
> btch:  31 usd:  69
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058858] CPU   13: hi:  186,
> btch:  31 usd: 154
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058861] CPU   14: hi:  186,
> btch:  31 usd:  23
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058864] CPU   15: hi:  186,
> btch:  31 usd:  36
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058867] CPU   16: hi:  186,
> btch:  31 usd:  96
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058870] CPU   17: hi:  186,
> btch:  31 usd:  27
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058873] CPU   18: hi:  186,
> btch:  31 usd:  61
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058877] CPU   19: hi:  186,
> btch:  31 usd:  88
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058880] CPU   20: hi:  186,
> btch:  31 usd: 105
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058883] CPU   21: hi:  186,
> btch:  31 usd:  41
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058886] CPU   22: hi:  186,
> btch:  31 usd:  53
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058890] CPU   23: hi:  186,
> btch:  31 usd:  73
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058898] active_anon:136720
> inactive_anon:8095 isolated_anon:0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058899]  active_file:2176234
> inactive_file:488398 isolated_file:33
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058901]  unevictable:0
> dirty:674 writeback:4993 unstable:0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058903]  free:5410488
> slab_reclaimable:71137 slab_unreclaimable:12584
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058905]  mapped:5868 shmem:140
> pagetables:2280 bounce:0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058914] DMA free:3524kB
> min:64kB low:80kB high:96kB active_anon:0kB inactive_anon:0kB
> active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB
> isolated(file):0kB present:15868kB mlocked:0kB dirty:0kB writeback:0kB
> mapped:0kB shmem:0kB slab_reclaimable:5308kB slab_unreclaimable:6880kB
> kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB
> pages_scanned:0 all_unreclaimable? yes
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058921] lowmem_reserve[]: 0 865
> 32482 32482
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058930] Normal free:264444kB
> min:3728kB low:4660kB high:5592kB active_anon:0kB inactive_anon:0kB
> active_file:400kB inactive_file:872kB unevictable:0kB isolated(anon):0kB
> isolated(file):132kB present:885944kB mlocked:0kB dirty:0kB writeback:0kB
> mapped:2060kB shmem:0kB slab_reclaimable:279240kB
> slab_unreclaimable:43456kB kernel_stack:4824kB pagetables:0kB unstable:0kB
> bounce:0kB writeback_tmp:0kB pages_scanned:2494 all_unreclaimable? yes
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058938] lowmem_reserve[]: 0 0
> 252937 252937
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058947] HighMem free:21373984kB
> min:512kB low:34580kB high:68652kB active_anon:546880kB
> inactive_anon:32380kB active_file:8704536kB inactive_file:1952720kB
> unevictable:0kB isolated(anon):0kB isolated(file):0kB present:32375936kB
> mlocked:0kB dirty:2696kB writeback:19972kB mapped:21412kB shmem:560kB
> slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB
> pagetables:9120kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0
> all_unreclaimable? no
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058955] lowmem_reserve[]: 0 0 0
> 0
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058959] DMA: 43*4kB 33*8kB
> 37*16kB 56*32kB 12*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB
> = 3588kB
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058969] Normal: 39027*4kB
> 13466*8kB 14*16kB 7*32kB 2*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB
> 0*4096kB = 264412kB
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058979] HighMem: 63*4kB
> 105428*8kB 250783*16kB 162467*32kB 59014*64kB 32018*128kB 9763*256kB
> 1585*512kB 117*1024kB 6*2048kB 0*4096kB = 21373292kB
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058990] 2664895 total pagecache
> pages
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058992] 967 pages in swap cache
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058994] Swap cache stats: add
> 26965, delete 25998, find 586333133/586335409
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058997] Free swap  = 7070072kB
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.058999] Total swap = 7079928kB
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.195357] 8519679 pages RAM
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.195358] 8292353 pages HighMem
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.195359] 202153 pages reserved
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.195360] 2686894 pages shared
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.195360] 367159 pages non-shared
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.195362] Out of memory: kill
> process 22973 (mysqld_safe) score 110038 or a child
> Mar 31 13:34:05 LaCuqui3 kernel: [30300729.195408] Killed process 10926
> (mysqld)
>
> I cannot decipher the whole log, but as far as I can tell the problem
> starts when apache invokes the oom-killer (in other cases php invokes it).
> In principle this happens when memory runs short, but I have checked (or
> so I believe) that this is not the case. During one of these crashes I was
> running "vmstat 5", checking memory every 5 s, and free memory never
> dropped below 22G. A few weeks ago I also installed a monitoring tool
> (pandora), which has not detected any drop in free memory either, although
> it only checks every 5 min.
>
> I have been reading about how the oom-killer works, and this should not be
> happening. The oom-killer is invoked when a process needs memory and the
> machine has none available. It then runs a fairly complex algorithm and
> selects a process to kill. I suppose it always selects mysql because that
> is the process using the most memory. I have found out how to make it never
> select mysql, but that would not solve the underlying problem: it would
> select another process, and the cure might be worse than the disease.
>
> This has been happening for about 3 months. In that time I have changed
> nothing on the machine except some PHP code (the bare minimum). What also
> changes every day is the amount of data in mysql, which keeps growing. It
> holds a lot of data: the compressed backup is about 8G. As a result some
> queries are really heavy, but from what I have seen they come nowhere near
> exhausting memory. I do not think this is related, but I mention it just
> in case.
>
> In short, why would the oom-killer be invoked when there is memory to
> spare? Is the free memory that vmstat reports real, or could some memory
> not be shown that I should subtract from the free figure? Is there a
> better command for monitoring memory usage?
>
> Kind regards, and thanks in advance.
>
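
On the closing question about whether the free figure from vmstat tells the
whole story: on a 32-bit kernel built with highmem support, /proc/meminfo
splits MemFree into LowFree and HighFree, and the single "free" column in
vmstat is their sum. A minimal sketch, assuming Python is available on the
box, that reports the two separately. The function names are my own, and the
sample text is constructed from the zone totals in the quoted log (DMA plus
Normal as low memory), not captured from /proc/meminfo on that machine.

```python
# Sample built from the quoted log's zone totals; NOT a real capture.
SAMPLE = """\
MemFree:        21641952 kB
HighFree:       21373984 kB
LowFree:          267968 kB
"""


def parse_meminfo(text):
    """Parse 'Name:   value kB' lines into {name: value_in_kB}."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def low_high_report(fields):
    """Pick out the free-memory fields; LowFree/HighFree exist only on highmem kernels."""
    return {k: fields[k] for k in ("MemFree", "LowFree", "HighFree") if k in fields}


def read_live_meminfo(path="/proc/meminfo"):
    """Read the live file, or return an empty dict if it is not available."""
    try:
        with open(path) as fh:
            return parse_meminfo(fh.read())
    except OSError:
        return {}


print("sample:", low_high_report(parse_meminfo(SAMPLE)))
print("live:  ", low_high_report(read_live_meminfo()))
```

Run from cron or a loop, this would show whether LowFree dips around the
time of a kill even while the combined free figure stays above 22G.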
