On Wed, 2024-09-04 at 12:21 +0000, Konrad Weihmann via lists.openembedded.org 
wrote:
> in case of running two or more runqemu instances in parallel
> with no previously setup tap devices, the following happens:
> 
> instance A probes for tap devices, but doesn't find
> any, proceeds to generating tap devices with the sudo call,
> resulting in tap0.
> instance B starts to probes, finds tap0.
> Both will lock tap0.
> 
> tap0 will be then forwarded to qemu.
> Instance A reporting
> 
> "Using preconfigured tap device tap0"
> 
> but then failing with
> 
> qemu-system... could not configure /dev/net/tun (tap0): Device or resource 
> busy
> 
> To fix that, lock the entire tap creation process with
> an exclusive file (blocking) lock, so only a single instance can
> perform the non-atomic changes.
> 
> Signed-off-by: Konrad Weihmann <[email protected]>
> ---
>  scripts/runqemu | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/scripts/runqemu b/scripts/runqemu
> index 2817acb19f..e5db06be37 100755
> --- a/scripts/runqemu
> +++ b/scripts/runqemu
> @@ -1163,6 +1163,9 @@ to your build configuration.
>  
>          self.make_lock_dir(lockdir)
>  
> +        tap_setup_lock = open(os.path.join(lockdir, '_tap_creation_lock'), 
> 'w')
> +        fcntl.flock(tap_setup_lock, fcntl.LOCK_EX)
> +
>          cmd = (ip, 'link')
>          logger.debug('Running %s...' % str(cmd))
>          ip_link = subprocess.check_output(cmd).decode('utf-8')
> @@ -1187,6 +1190,8 @@ to your build configuration.
>  
>          if not tap:
>              if os.path.exists(nosudo_flag):
> +                fcntl.flock(tap_setup_lock, fcntl.LOCK_UN)
> +                tap_setup_lock.close()
>                  logger.error("Error: There are no available tap devices to 
> use for networking,")
>                  logger.error("and I see %s exists, so I am not going to try 
> creating" % nosudo_flag)
>                  raise RunQemuError("a new one with sudo.")

This doesn't make sense to me. Surely you should be taking this lock
*within* the "if not tap" code block since that is where the ifup is
called out to with sudo? You appear to be dropping the lock there
instead?

It appears you're putting the locking around the search for pre-
configured tap devices which is already atomic and working well from a
race perspective?

> @@ -1198,6 +1203,8 @@ to your build configuration.
>              try:
>                  tap = subprocess.check_output(cmd).decode('utf-8').strip()
>              except subprocess.CalledProcessError as e:
> +                fcntl.flock(tap_setup_lock, fcntl.LOCK_UN)
> +                tap_setup_lock.close()
>                  logger.error('Setting up tap device failed:\n%s\nRun 
> runqemu-gen-tapdevs to manually create one.' % str(e))
>                  sys.exit(1)
>              lockfile = os.path.join(lockdir, tap)
> @@ -1206,6 +1213,9 @@ to your build configuration.
>              self.cleantap = True
>              logger.debug('Created tap: %s' % tap)
>  
> +        fcntl.flock(tap_setup_lock, fcntl.LOCK_UN)
> +        tap_setup_lock.close()

For locks like this, they should really be in a try/finally or with clause...

Cheers,

Richard
-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.
View/Reply Online (#204229): 
https://lists.openembedded.org/g/openembedded-core/message/204229
Mute This Topic: https://lists.openembedded.org/mt/108263503/21656
Group Owner: [email protected]
Unsubscribe: https://lists.openembedded.org/g/openembedded-core/unsub 
[[email protected]]
-=-=-=-=-=-=-=-=-=-=-=-

Reply via email to