Hi,

Just a guess, but could this be the same issue as the one described here?
https://serverfault.com/questions/1105733/virsh-command-hangs-when-script-runs-in-the-background

On Thu, Jan 12, 2023 at 15:36 Madison Kelly <mke...@alteeve.com> wrote:
>
> On 2023-01-12 01:26, Reid Wahl wrote:
> > On Wed, Jan 11, 2023 at 10:21 PM Madison Kelly <mke...@alteeve.com> wrote:
> >>
> >> On 2023-01-12 01:12, Reid Wahl wrote:
> >>> On Wed, Jan 11, 2023 at 8:11 PM Madison Kelly <mke...@alteeve.com> wrote:
> >>>>
> >>>> Hi all,
> >>>>
> >>>> There was a lot of sub-threads, so I figured it's helpful to start a
> >>>> new thread with a summary so far. For context; I have a super simple
> >>>> perl script that pretends to be an RA for the sake of debugging.
> >>>>
> >>>> https://pastebin.com/9z314TaB
> >>>>
> >>>> I've had variations log environment variables and confirmed that all
> >>>> the variables in the direct call that work are in the crm_resource
> >>>> triggered call. There are no selinux issues logged in audit.log and
> >>>> selinux is permissive. The script logs the real and effective UID and
> >>>> GID and it's the same in both instances. Calling other shell programs
> >>>> (tested with 'hostname') run fine, this is specifically crm_resource ->
> >>>> test RA -> virsh call.
> >>>>
> >>>> I ran strace on the virsh call from inside my test script (changing
> >>>> 'virsh.good' to 'virsh.bad' between running directly and via
> >>>> crm_resource. The strace runs made six files each time. Below are
> >>>> pastebin links with the outputs of the six runs in one paste, but each
> >>>> file's output is in it's own block (search for file: to see the
> >>>> different file outputs)
> >>>>
> >>>> Good/direct run of the test RA:
> >>>> - https://pastebin.com/xtqe9NSG
> >>>>
> >>>> Bad/crm_resource triggered run of the test RA:
> >>>> - https://pastebin.com/vBiLVejW
> >>>>
> >>>> Still absolutely stumped.
> >>>
> >>> The strace outputs show that your bad runs are all getting stopped
> >>> with SIGTTOU. If you've never heard of that, me either.
> >>
> >> The hell?! This is new to me also.
> >>
> >>> https://www.gnu.org/software/libc/manual/html_node/Job-Control-Signals.html
> >>>
> >>> Macro: int SIGTTOU
> >>>
> >>> This is similar to SIGTTIN, but is generated when a process in a
> >>> background job attempts to write to the terminal or set its modes.
> >>> Again, the default action is to stop the process. SIGTTOU is only
> >>> generated for an attempt to write to the terminal if the TOSTOP output
> >>> mode is set; see Output Modes.
> >>>
> >>>
> >>> Maybe this has something to do with the buffer settings in the perl
> >>> script(?). It might be worth trying a version that doesn't fiddle with
> >>> the outputs and buffer settings.
> >>
> >> I tried removing the $|, and then I changed the script to be entirely a
> >> bash script, still hanging. I tried 'virsh --connect <method> list
> >> --all' where method was qemu:///system, qemu:///session, and
> >> ssh+qemu:///root@localhost/system, all hang. In bash or perl.
> >>
> >>> I don't know which difference between your environment and mine is
> >>> relevant here, such that I can't reproduce the issue using your test
> >>> script. It works perfectly fine for me.
> >>>
> >>> Can you run `stty -a | grep tostop`? If there's a minus sign
> >>> ("-tostop"), it's disabled; if it's present without a minus sign
> >>> ("tostop"), it's enabled, as best I can tell.
> >>
> >> -tostop is there
> >>
> >> ====
> >> [root@mk-a07n02 ~]# stty -a | grep tostop
> >> isig icanon iexten echo echoe echok -echonl -noflsh -xcase -tostop -echoprt
> >> [root@mk-a07n02 ~]#
> >> ====
> >>
> >>> I'm just spitballing here. It's disabled by default on my machine...
> >>> but even when I enable it, crm_resource --validate works fine. It may
> >>> be set differently when running under crm_resource.
> >>
> >> How do you enable it?
> >
> > With `stty tostop`
> >
> > It's 100% possible that this whole thing is a red herring by the way.
> > I'm looking for anything that might explain the discrepancy. SIGTTOU
> > may not be directly tied to the root cause.
>
> Appreciate the stab, didn't stop the hang though :(
>
> --
> Madison Kelly
> Alteeve's Niche!
> Chief Technical Officer
> c: +1-647-471-0951
> https://alteeve.com/
>
> _______________________________________________
> Manage your subscription:
> https://lists.clusterlabs.org/mailman/listinfo/users
>
> ClusterLabs home: https://www.clusterlabs.org/

--
Keisuke MORI
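
P.S. One detail from the glibc text quoted above: the TOSTOP condition only applies to writes. A background process that tries to set the terminal modes can be stopped with SIGTTOU even when `stty` shows "-tostop". So it may be worth logging, in both the direct and the crm_resource runs, whether the script is actually in the terminal's foreground process group. Below is a minimal Python sketch of such a check. The function name `tty_job_state` is made up for illustration, it assumes Linux, and I have not confirmed that this is the root cause here.

```python
import os
import termios


def tty_job_state(fd=0):
    """Report how this process relates to the terminal open on fd.

    'foreground' and 'tostop' stay None when fd is not a tty, or when
    the tty cannot be queried (e.g. it is not our controlling terminal).
    """
    state = {"isatty": os.isatty(fd), "foreground": None, "tostop": None}
    if not state["isatty"]:
        return state
    try:
        # We are a foreground job if the terminal's foreground process
        # group is our own process group.
        state["foreground"] = os.tcgetpgrp(fd) == os.getpgrp()
        # tcgetattr() returns [iflag, oflag, cflag, lflag, ...];
        # TOSTOP lives in the local-mode flags at index 3.
        lflag = termios.tcgetattr(fd)[3]
        state["tostop"] = bool(lflag & termios.TOSTOP)
    except (OSError, termios.error):
        # Leave the remaining fields as None if the terminal refuses the query.
        pass
    return state


if __name__ == "__main__":
    # Checks stdin by default; pass 1 or 2 to inspect stdout/stderr instead.
    print(tty_job_state())
```

If the crm_resource run reports `'isatty': True` together with `'foreground': False`, then anything the child does to change terminal settings would be expected to trigger SIGTTOU, regardless of the tostop flag.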