Well, that’s not what I want. Running sge_execd via systemd is relatively easy, but it does not solve my problem, because that way all jobs end up in the same cgroup as sge_execd. My aim is for sge_execd to spawn each job via systemd, creating a separate slice for every job, much like when you log in to your Debian box and a user slice is created for you via pam_systemd (we can’t use PAM modules here for obvious reasons, which is why I suggest using “systemd-run”). This way we get much cleaner job integration with cgroups, as systemd would be managing them.
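To make the per-job idea a bit more concrete, here is a rough sketch of the kind of command line a shepherd-like wrapper could build. Nothing like this exists in SGE 8.1.9; the function name sge_job_systemd_cmd, the slice name sge.slice, the unit naming sge-job-<id> and the limits are made up for illustration. MemoryMax= assumes a cgroup v2 host. The function only prints the command, so it can be inspected without root; switching to the job owner is left out.

```shell
#!/bin/sh
# Sketch only: print the systemd-run command line that would start one job
# in its own transient scope unit, with per-job cgroup limits.
# Hypothetical names: sge_job_systemd_cmd, sge.slice, sge-job-<id>.
sge_job_systemd_cmd() {
    job_id=$1      # SGE job id, used to name the transient unit
    mem_max=$2     # memory limit, e.g. 2G (MemoryMax= assumes cgroup v2)
    cpu_quota=$3   # CPU limit, e.g. 200% for two full cores
    shift 3        # the remaining arguments are the job command itself

    # Rebuild the positional parameters as the full systemd-run argv.
    set -- systemd-run --scope \
        "--unit=sge-job-${job_id}" \
        --slice=sge.slice \
        "--property=MemoryMax=${mem_max}" \
        "--property=CPUQuota=${cpu_quota}" \
        -- "$@"

    # Print for inspection only; argument quoting is lost in this display form.
    printf '%s\n' "$*"
}

# Example call with a placeholder job script path.
sge_job_systemd_cmd 4711 2G 200% /bin/sh /path/to/job_script.sh
```

With one unit per job, something like "systemd-cgtop" or "systemctl status sge-job-4711.scope" could then show the processes and resource usage of that job alone.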
Also, it would be much easier to track the processes belonging to a single job and the hardware resources consumed by the job (e.g. via systemd-cgtop), as systemd would track them for us.

Ondrej

From: Laurent TOMAS <laurent.to...@idiap.ch>
Sent: Thursday, August 8, 2019 10:03 AM
To: Ondrej Valousek <ondrej.valou...@adestotech.com>
Subject: Re: [SGE-discuss] SGE & systemd integration

On 07.08.19 17:59, Ondrej Valousek wrote:

Hi all,

I am thinking of making SGE (or sge_execd) more systemd friendly. Right now, there is some support for cgroups (as of 8.1.9) via:

USE_CGROUPS=y/n

My proposal is to make it:

USE_CGROUPS=y/n/systemd

When set to systemd, we would not need to detect any cgroups (or set the cpuset controller) manually. Instead, the shepherd daemon would run the job via the "systemd-run" binary.

https://www.freedesktop.org/software/systemd/man/systemd-run.html

systemd-run can set various cgroup controllers via its "--property" flag, achieving the same as we do now manually.

Initially, I was thinking about implementing the same via the "starter_method" flag, but systemd-run needs to be run as root, so it has to be hardcoded into shepherd.c, and the sge_execd daemon also needs to be running with root privileges; I am not sure whether capabilities would help here.

Does this initiative make any sense? I can try to implement it myself, but I am not familiar with SGE internals. I can try...
Ondrej
_______________________________________________
SGE-discuss mailing list
SGE-discuss@liv.ac.uk
https://arc.liv.ac.uk/mailman/listinfo/sge-discuss

Hi,

execd works well with systemd. SGE_ND=true will start the SGE daemon in the foreground. systemd will then manage the daemon if it dies or is stopped properly (kill -15?), and handle the log via stdout. If you write a wrapper script, why not pipe through tee to keep your own log and still feed systemd (stdout)?

Note: a kill -15 of an SGE daemon should stop it properly.

Example for execd: /lib/systemd/system/sge-execd-idiap.service
--------------------------------------------------------------------------
[Unit]
Description=SGE Execution Daemon (sge_execd)
After=sge-mount-idiap.service

[Service]
Type=simple
# SGE path
Environment=SGE_ROOT=The_path_of_SGE_ROOT SGE_CELL=My_cell_name
# TCP ports (change to your ports)
Environment=SGE_QMASTER_PORT=536 SGE_EXECD_PORT=537
# Stay in foreground
Environment=SGE_ND=true
# Let's Rock-'n-Roll!
ExecStartPre=/idiap/resource/software/sge/scripts/sge_execd-prestart
# $SGE_ROOT will not work, add full path
ExecStart=THE_SGE_ROOT_PATH/bin/lx-amd64/sge_execd

[Install]
WantedBy=multi-user.target
--------------------------------------------------------------------------

Note: if the SGE binaries are on NFS with automount, execd might not start. ExecStartPre may help you, or use a pre-start check in another systemd unit file like this:
-------------------------------------
[Unit]
Description=SGE Mount Dependencies
After=remote-fs.target network.target nfs-client.target autofs.service
# Restart limit (also see 'Restart...' in [Service] configuration below)
StartLimitBurst=60
StartLimitIntervalSec=305

[Service]
Type=simple
RemainAfterExit=true
ExecStart=/usr/bin/test -e whereis_SGE_ROOT_mount_point
# Restart on failure (also see 'StartLimit...' in [Unit] configuration above)
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
------------------------------------------------------------------

If you have only a master, like us, it is as easy as for execd. With shadow + master, the best approach is to find a way to send the logs to stdout and keep the daemons in the foreground.

Note: We work on Debian Stretch and I am testing on Buster (multiple CPU architectures: amd64, ARM, Power, Octeon).

Best Regards
-TOMAS Laurent-
C.E.H. - system engineer
Idiap Research Institute
Centre du Parc
Rue Marconi 19
CH-1920 Martigny
Tel: +41 27 721 77 11
Fax: +41 27 721 77 12
Web: http://www.idiap.ch/
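
For the ExecStartPre= hook in the execd unit above, a pre-start check could look roughly like the sketch below. This is not the Idiap sge_execd-prestart script, whose contents are not shown in this thread; wait_for_path, its retry count and its delay are hypothetical. It simply polls until a path on the automounted share exists, or gives up.

```shell
#!/bin/sh
# Sketch only: wait for a file or directory to appear, e.g. the sge_execd
# binary on an automounted NFS share, before systemd runs ExecStart=.
# wait_for_path is a hypothetical helper, not part of SGE.
wait_for_path() {
    path=$1           # file or directory that must exist
    tries=${2:-30}    # number of attempts before giving up
    delay=${3:-2}     # seconds to sleep between attempts

    while [ "$tries" -gt 0 ]; do
        # Touching the path is what triggers the automounter.
        if [ -e "$path" ]; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "$delay"
    done

    echo "timed out waiting for $path" >&2
    return 1
}

# Example use in a pre-start script (the path is a placeholder):
#   wait_for_path THE_SGE_ROOT_PATH/bin/lx-amd64/sge_execd || exit 1
```

A non-zero exit from the pre-start script makes systemd treat the start as failed, so a Restart= policy on the unit can retry later instead of launching execd from a missing path.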