[ansible-project] File operations failing on shared file system #67410

Stephen Gevers Sat, 29 Feb 2020 13:13:41 -0800

I opened an issue <https://github.com/ansible/ansible/issues/67410> that 
was closed because the developer believes the problem is a race condition 
that can't be dealt with in code.  I created a playbook where two hosts 
that both mount a shared file system test for the existence of a file.  The 
playbook starts with the file present in the shared file system.  The 
playbook then executes the following steps:

1. stat the file on both hosts (output shows the file is there)
2. remove the file from host1 using a when condition to limit the action to
the desired host (output shows "skipping" on host2 and "changed" on host1)
3. stat the file on both hosts (output shows the file does not exist)
4. create the file on host2 using a when condition to limit the action to
the desired host (output again shows "skipping" and "changed", but on the
opposite hosts from step 2)
5. stat the file on both hosts (output shows the file exists on host2
but not on host1)
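
The steps above can be sketched roughly as the playbook below. This is an
illustrative reconstruction, not the playbook attached to the issue: the
inventory names host1 and host2, the path /mnt/shared/testfile, and the
registered variable names are all assumptions chosen for the example.

```yaml
# Illustrative sketch only; host names and the shared path are assumed.
- hosts: host1,host2
  gather_facts: false
  vars:
    shared_file: /mnt/shared/testfile   # assumed location on the shared mount
  tasks:
    - name: Step 1 - stat the file on both hosts
      stat:
        path: "{{ shared_file }}"
      register: before_remove

    - name: Step 2 - remove the file from host1 only
      file:
        path: "{{ shared_file }}"
        state: absent
      when: inventory_hostname == 'host1'

    - name: Step 3 - stat the file on both hosts
      stat:
        path: "{{ shared_file }}"
      register: after_remove

    - name: Step 4 - create the file from host2 only
      file:
        path: "{{ shared_file }}"
        state: touch
      when: inventory_hostname == 'host2'

    - name: Step 5 - stat the file on both hosts
      stat:
        path: "{{ shared_file }}"
      register: after_create

    - name: Show what each host saw after the file was created
      debug:
        msg: "exists={{ after_create.stat.exists }}"
```

With the default linear strategy, each task is expected to finish on every
host before the next task starts, which is why the result in step 5
surprises me.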

I don't understand how this is a race condition. This isn't a case where
something outside ansible is creating the file. The task that creates the
file clearly completes before the stat task that checks for the file's
existence is started. Further, the check for the file's existence is run
concurrently on both machines and the task run on the host that created the
file sees the file whereas the other does not. A race condition would
imply that the machine that doesn't see that the file exists would have had
to have checked before the task that creates the file finished.

While it's possible that I'm being fooled by the order in which the
"failing" stat output is printed in step 5, past experience tells me that no
host starts the tasks in step 5 until every host has completed step 4. For
example, I had a set of WebSphere patches that I needed to apply to both
Linux and Windows hosts. Though the patches were installed in exactly the
same manner, the tasks differed between the two types of hosts. The
Linux-based task had a when option for the Linux OS type and it was followed
by the Windows task with a when option for the Windows OS type. Though all
of the Linux machines processed the task in parallel, the Windows machines
didn't start until the Linux machines had completed. In order to get both to
operate in parallel, I had to add an async option to both tasks and then add
more tasks to wait for the results.
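
The async arrangement described above looked roughly like the following
sketch. The installer paths, timeouts, and registered variable names are
placeholders made up for illustration, not the real WebSphere patch tasks;
only the async, poll, and async_status usage is the point.

```yaml
# Illustrative sketch only; commands, paths, and timings are assumed.
- hosts: all
  tasks:
    - name: Start the patch on Linux hosts without waiting
      command: /opt/patches/install.sh          # hypothetical installer
      async: 3600      # allow up to an hour
      poll: 0          # fire and forget, so the play moves on immediately
      register: linux_patch
      when: ansible_facts['system'] == 'Linux'

    - name: Start the patch on Windows hosts without waiting
      win_command: C:\patches\install.bat       # hypothetical installer
      async: 3600
      poll: 0
      register: windows_patch
      when: ansible_facts['os_family'] == 'Windows'

    - name: Wait for the Linux patch to finish
      async_status:
        jid: "{{ linux_patch.ansible_job_id }}"
      register: linux_result
      until: linux_result.finished
      retries: 120
      delay: 30
      when: ansible_facts['system'] == 'Linux'

    - name: Wait for the Windows patch to finish
      async_status:
        jid: "{{ windows_patch.ansible_job_id }}"
      register: windows_result
      until: windows_result.finished
      retries: 120
      delay: 30
      when: ansible_facts['os_family'] == 'Windows'
```

Without async and poll: 0, the Windows task is not started until the Linux
task has finished on every Linux host, which is the per-task barrier I am
relying on in my reasoning about step 5.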

If I'm confused, please set me straight so I understand how I'm creating a
race condition. Otherwise, I'd like to reopen the issue.
