We have ~1000 clients. In the evening, Spacewalk runs a lot of commands on them (checking file revisions, for example).
I'm receiving ~1000 tracebacks, and clients can't connect to Spacewalk.

1. Usually the Spacewalk server has ~300 processes; during those tasks it has ~1000.
2. I didn't change any tomcat/httpd settings.
3. I only changed the postgres settings, to be optimized for 64 GB of RAM.
4. There are no errors on the backends, but this is what top shows:

top - 06:48:54 up 1 day, 6 min, 2 users, load average: 155.93, 133.08, 117.28
Tasks: 965 total, 119 running, 846 sleeping, 0 stopped, 0 zombie
%Cpu(s): 95.3 us, 1.2 sy, 0.0 ni, 3.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem : 65767568 total, 50071348 free, 7842848 used, 7853372 buff/cache
KiB Swap: 33008636 total, 33002300 free, 6336 used. 56303848 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
44415 postgres  20   0 15.577g 188576 185440 R  11.6  0.3   0:22.19 postgres
45154 postgres  20   0 15.586g  21764  15564 S  10.9  0.0   0:15.56 postgres
45271 postgres  20   0 15.586g  19588  13680 S  10.9  0.0   0:14.95 postgres
45136 postgres  20   0 15.586g  19944  14064 R  10.6  0.0   0:16.09 postgres
45161 postgres  20   0 15.586g  22348  16044 R  10.6  0.0   0:16.13 postgres
45172 postgres  20   0 15.586g  19512  13680 S  10.6  0.0   0:15.86 postgres
44792 postgres  20   0 15.586g  22292  16044 R  10.3  0.0   0:17.78 postgres
44885 postgres  20   0 15.584g  18824  13932 R  10.3  0.0   0:16.73 postgres
44998 postgres  20   0 15.586g  21296  15100 R  10.3  0.0   0:16.36 postgres
45011 postgres  20   0 15.586g  21200  15048 R  10.3  0.0   0:16.45 postgres
45034 postgres  20   0 15.586g  19348  13540 S  10.3  0.0   0:16.59 postgres
45120 postgres  20   0 15.586g  22060  15608 S  10.3  0.0   0:15.85 postgres
45131 postgres  20   0 15.586g  19352  13560 R  10.3  0.0   0:15.76 postgres
45167 postgres  20   0 15.586g  19416  13580 S  10.3  0.0   0:15.88 postgres
45254 postgres  20   0 15.586g  21096  15020 S  10.3  0.0   0:11.00 postgres
45261 postgres  20   0 15.586g  19328  13516 R  10.3  0.0   0:15.47 postgres
45267 postgres  20   0 15.586g  19372  13560 R  10.3  0.0   0:15.14 postgres
44492 postgres  20   0 15.586g  24872  18508 R  10.0  0.0   0:21.62 postgres
44791 postgres  20   0 15.586g  24396  17840 S  10.0  0.0   0:17.04 postgres
44944 postgres  20   0 15.586g  19324  13512 S  10.0  0.0   0:17.23 postgres
44946 postgres  20   0 15.586g  19388  13556 S  10.0  0.0   0:16.82 postgres
44957 postgres  20   0 15.586g  19356  13520 R  10.0  0.0   0:16.76 postgres
45045 postgres  20   0 15.586g  19372  13564 S  10.0  0.0   0:16.89 postgres
45099 postgres  20   0 15.586g  19448  13624 R  10.0  0.0   0:16.24 postgres
45116 postgres  20   0 15.586g  19444  13628 S  10.0  0.0   0:15.95 postgres
45142 postgres  20   0 15.586g  19412  13612 R  10.0  0.0   0:15.75 postgres
45153 postgres  20   0 15.586g  20932  14924 S  10.0  0.0   0:15.63 postgres
45169 postgres  20   0 15.586g  19900  14064 S  10.0  0.0   0:15.76 postgres
45197 postgres  20   0 15.586g  19368  13532 R  10.0  0.0   0:15.79 postgres
45218 postgres  20   0 15.586g  19824  13964 R  10.0  0.0   0:15.04 postgres
45259 postgres  20   0 15.586g  19364  13548 S  10.0  0.0   0:15.56 postgres
44447 postgres  20   0 15.586g  26928  20336 R   9.6  0.0   0:21.75 postgres
44763 postgres  20   0 15.586g  22256  16024 R   9.6  0.0   0:16.38 postgres
44799 postgres  20   0 15.586g  24700  18116 S   9.6  0.0   0:17.20 postgres
44836 postgres  20   0 15.586g  21084  14928 S   9.6  0.0   0:16.58 postgres
44895 postgres  20   0 15.586g  20784  14464 R   9.6  0.0   0:17.45 postgres
44950 postgres  20   0 15.586g  19272  13464 S   9.6  0.0   0:16.52 postgres
44954 postgres  20   0 15.586g  18128  12736 R   9.6  0.0   0:16.56 postgres
44955 postgres  20   0 15.586g  19412  13584 R   9.6  0.0   0:16.68 postgres

The postgres settings I changed:

#------------------------------------------------------------------------------
# pgtune run on 2017-03-22
# Based on 65767568 KB RAM, platform Linux
#------------------------------------------------------------------------------
maintenance_work_mem = 2GB
checkpoint_completion_target = 0.9
effective_cache_size = 44GB
work_mem = 52MB
wal_buffers = 16MB
shared_buffers = 15GB
max_connections = 600

Any thoughts on how to optimize this and bring Spacewalk back to life?

Thanks
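
For anyone checking the memory side of these settings, here is a minimal Python sketch of the worst-case budget they imply. It assumes each connection uses exactly one work_mem allocation, which is a simplification since a single query can use several. The helper names are made up for illustration; the numbers are taken from the top output and the pgtune block above.

```python
# Rough memory budget for the pgtune settings quoted above.
# Assumption: one work_mem allocation per connection (a simplification;
# a single query may allocate work_mem more than once).

MB_IN_KB = 1024
GB_IN_KB = 1024 * MB_IN_KB

TOTAL_RAM_KB = 65767568            # "KiB Mem : 65767568 total" from top
SHARED_BUFFERS_KB = 15 * GB_IN_KB  # shared_buffers = 15GB
WORK_MEM_KB = 52 * MB_IN_KB        # work_mem = 52MB
MAX_CONNECTIONS = 600              # max_connections = 600


def worst_case_kb(connections: int) -> int:
    """Return shared_buffers plus one work_mem per connection, in KB."""
    return SHARED_BUFFERS_KB + connections * WORK_MEM_KB


def as_gb(kb: int) -> float:
    """Convert KB to GB (binary units, 1 GB = 1024 * 1024 KB)."""
    return kb / GB_IN_KB


if __name__ == "__main__":
    budget = worst_case_kb(MAX_CONNECTIONS)
    print(f"total RAM:        {as_gb(TOTAL_RAM_KB):.1f} GB")
    print(f"worst-case usage: {as_gb(budget):.1f} GB")
    print(f"fits in RAM:      {budget <= TOTAL_RAM_KB}")
```

Under that assumption the estimate stays below total RAM, which is consistent with top reporting about 50 GB free while the CPU sits at 95.3% user. So the problem looks CPU-bound rather than memory-bound, though this is only a rough check.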
_______________________________________________
Spacewalk-list mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/spacewalk-list
