We use the TH-1A Slurm cluster.
When I submit jobs with sbatch, they always stay in the PD (pending) state for about an hour before they start running. What could be the reason?

Li Linji
BGI | IT manager (NSCC-TJ)
BGI Tianjin Corporation, BGI Tianjin Medical Examination Laboratory
http://www.genomics.cn

From: Daniel Letai
Date: 2015-03-15 21:46
To: slurm-dev
Subject: [slurm-dev] submitting 100k job array causes slurmctld to socket timeout

Hi,

Testing a new Slurm cluster (14.11.4) on a 1k-node cluster.

Several things we've tried:
- Increase slurmctld threads (8 ports range)
- Increase munge threads (threads=10)
- Increase messageTimeout to 30

We are using accounting (db on different server)

Thanks for any help
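
For reference, the three settings listed in the quoted message might look roughly like the fragment below. This is only a sketch under assumptions: the quoted message gives no exact values beyond "8 ports", "threads=10" and "30", so the port numbers shown here (6817-6824) are hypothetical, and the munge line is given as a comment because it is a daemon option, not a slurm.conf parameter.

```
# slurm.conf (fragment, illustrative values only)

# A range of 8 ports for slurmctld to listen on.
# The actual port numbers are an assumption; the message only says "8 ports range".
SlurmctldPort=6817-6824

# Seconds to wait for a round-trip message before timing out (default is 10).
MessageTimeout=30

# Not part of slurm.conf: the munge thread count is set when starting the daemon,
# for example:
#   munged --num-threads=10
```

Changes to slurm.conf normally need to be identical on all nodes, and a port change requires restarting slurmctld.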
