[Sts-sponsors] [Bug 1810372] Re: Infinite busy-loop trying to cull when cache space is short

Dan Streetman Fri, 11 Jan 2019 07:16:08 -0800

** Description changed:

  [Impact]
  
  A user reports that cachefilesd will spin at 100% of a cpu when started
  on a filesystem where the free space is less than the bcull threshold
  and culling the cache is insufficient to free up space.
  
  Investigation shows that this is because cachefilesd detects that
  culling is required, tries to cull, and does not realise that culling
  cannot free up enough space, so just keeps retrying.
  
  [Test Case]
  
- I created a VM with a spare disk, mounted it to /raid (or whatever).
- I changed the /etc/default/cachefilesd to start at boot, filled the /raid
- filesystem to over the bcull threshold, and started cachefilesd. When running
- top, the cachefilesd process is at the top using approx 100% of 1 CPU. It
- appears to be trying to free up space in the cache (that is not even used)
- because the filesystem is over the threshold for culling.
+ Create a trusty or xenial VM, and install cachefilesd.  Using either a real
+ disk or loopback image, create an ext4 filesystem, and edit fstab to mount it
+ at /var/cache/fscache, e.g.:
+ $ sudo dd if=/dev/zero of=/cache.img bs=1024k count=1024
+ $ sudo losetup -f /cache.img
+ $ sudo losetup -a
+ $ sudo mkfs.ext4 /dev/loop0   (note, adjust loop0 if needed)
+ 
+ edit fstab e.g.:
+ $ grep fscache /etc/fstab 
+ /cache.img /var/cache/fscache ext4 defaults,loop,user_xattr 0 0
+ 
+ It's important to include the 'user_xattr' option as cachefilesd
+ requires that.
+ 
+ stop the cachefilesd service and move the fscache contents:
+ $ sudo service cachefilesd stop
+ $ cd /var/cache
+ $ sudo mkdir fscache2
+ $ sudo mv -vf fscache/* fscache2/
+ $ sudo mount fscache
+ $ sudo mv -vf fscache2/* fscache/
+ $ sudo rmdir fscache2
+ 
+ create a file to fill up the fscache space, e.g.:
+ $ sudo dd if=/dev/zero of=/var/cache/fscache/largefile.txt bs=1024k count=850
+ $ df /var/cache/fscache
+ Filesystem     1K-blocks   Used Available Use% Mounted on
+ /dev/loop0        999320 922896      7612 100% /var/cache/fscache
+ 
+ edit /etc/default/cachefilesd to uncomment 'RUN=yes', e.g.:
+ $ grep RUN /etc/default/cachefilesd 
+ RUN=yes
+ 
+ reboot, or just restart cachefilesd service
+ $ sudo service cachefilesd start
+ 
+ check top
+ $ top
+ 
+ cachefilesd should be spinning, using 100% (or as much as it can) cpu
+ time.
  
  [Regression Potential]
  
  The patch makes changes to how cachefilesd detects if it should sleep
  or cull, so regressions would be in the area of cachefilesd spinning
  instead of sleeping (which is what it does now) or sleeping instead
  of culling.
  
  However the patch is small and easily understood and backports with
  minimal effort.
  
  [Other Info]
  
  This is fixed upstream in 0.10.6:
  
  * Wed Feb 3 2016 David Howells <dhowe...@redhat.com> 0.10.6-1
  ...
  - Suspend culling when cache space is short and cache objects are pinned.
  
  The particular patch is ce353f5b6b5b ("cachefilesd can spin when disk
  space is short.")
  
  Since bionic has version 0.10.10-0.1, this fix is needed only for xenial
  and trusty.


-- 
You received this bug notification because you are a member of STS
Sponsors, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1810372

Title:
  Infinite busy-loop trying to cull when cache space is short

Status in cachefilesd package in Ubuntu:
  Fix Released
Status in cachefilesd source package in Trusty:
  In Progress
Status in cachefilesd source package in Xenial:
  In Progress

Bug description:
  [Impact]

  A user reports that cachefilesd will spin at 100% of a cpu when
  started on a filesystem where the free space is less than the bcull
  threshold and culling the cache is insufficient to free up space.

  Investigation shows that this is because cachefilesd detects that
  culling is required, tries to cull, and does not realise that culling
  cannot free up enough space, so just keeps retrying.
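
  The decision described above can be modelled roughly as follows. This is
  an illustrative shell sketch only, not the cachefilesd source (which is
  written in C); the function name decide_action and its three arguments
  are made up for this example.

```shell
#!/bin/sh
# Illustrative model only; not taken from the cachefilesd source.
# decide_action FREE_PCT BCULL_PCT CULLABLE
#   FREE_PCT  - free space on the cache filesystem, in percent
#   BCULL_PCT - the bcull threshold, in percent
#   CULLABLE  - number of cache objects that could be culled
decide_action() {
    free_pct=$1
    bcull_pct=$2
    cullable=$3

    if [ "$free_pct" -ge "$bcull_pct" ]; then
        echo sleep      # enough free space, nothing to do
    elif [ "$cullable" -gt 0 ]; then
        echo cull       # space is short and culling can help
    else
        echo suspend    # space is short but nothing can be culled:
                        # wait instead of retrying immediately
    fi
}

# Space is short (0% free, threshold 7%) and nothing is cullable
decide_action 0 7 0
```

  Without the last branch, this model would answer "cull" on every pass for
  as long as free space stays below the threshold, which corresponds to the
  busy-loop reported here.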

  [Test Case]

  Create a trusty or xenial VM, and install cachefilesd.  Using either a real
  disk or loopback image, create an ext4 filesystem, and edit fstab to mount it
  at /var/cache/fscache, e.g.:
  $ sudo dd if=/dev/zero of=/cache.img bs=1024k count=1024
  $ sudo losetup -f /cache.img
  $ sudo losetup -a
  $ sudo mkfs.ext4 /dev/loop0   (note, adjust loop0 if needed)

  edit fstab e.g.:
  $ grep fscache /etc/fstab 
  /cache.img /var/cache/fscache ext4 defaults,loop,user_xattr 0 0

  It's important to include the 'user_xattr' option as cachefilesd
  requires that.
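
  A quick way to confirm the option made it into the fstab entry is to test
  the options field directly. A minimal sketch; has_opt is a hypothetical
  helper written for this example, not part of any package.

```shell
#!/bin/sh
# has_opt OPTIONS NAME: succeed if NAME is one of the comma-separated OPTIONS
has_opt() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Options field of the fstab line shown above.  On a live system it could
# be read with something like:
#   opts=$(awk '$2 == "/var/cache/fscache" { print $4 }' /etc/fstab)
opts="defaults,loop,user_xattr"

if has_opt "$opts" user_xattr; then
    echo "user_xattr present"
else
    echo "user_xattr missing"
fi
```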

  stop the cachefilesd service and move the fscache contents:
  $ sudo service cachefilesd stop
  $ cd /var/cache
  $ sudo mkdir fscache2
  $ sudo mv -vf fscache/* fscache2/
  $ sudo mount fscache
  $ sudo mv -vf fscache2/* fscache/
  $ sudo rmdir fscache2

  create a file to fill up the fscache space, e.g.:
  $ sudo dd if=/dev/zero of=/var/cache/fscache/largefile.txt bs=1024k count=850
  $ df /var/cache/fscache
  Filesystem     1K-blocks   Used Available Use% Mounted on
  /dev/loop0        999320 922896      7612 100% /var/cache/fscache
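
  To see how far below the culling threshold this is, the df figures can be
  turned into a percentage. A small sketch; free_pct is a hypothetical
  helper, and the 7% figure in the comment is assumed to be the stock bcull
  setting, so check the bcull line in /etc/cachefilesd.conf on the test
  system. The daemon's own accounting may differ slightly from df.

```shell
#!/bin/sh
# free_pct TOTAL_KB AVAIL_KB: integer percentage of the filesystem
# that is still available
free_pct() {
    echo $(( $2 * 100 / $1 ))
}

# Figures from the df output above; the result rounds down to 0,
# well under an assumed bcull of 7%
free_pct 999320 7612

# On a live system the two figures could be taken from df, e.g.:
#   df -Pk /var/cache/fscache | awk 'NR == 2 { print $2, $4 }'
```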

  edit /etc/default/cachefilesd to uncomment 'RUN=yes', e.g.:
  $ grep RUN /etc/default/cachefilesd 
  RUN=yes

  reboot, or just restart cachefilesd service
  $ sudo service cachefilesd start

  check top
  $ top

  cachefilesd should be spinning, using 100% (or as much as it can) cpu
  time.
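
  For a non-interactive check, ps can report the same thing. A sketch
  assuming the procps version of ps found on Ubuntu; cpu_of is a
  hypothetical helper name. Note that the pcpu column is CPU time divided
  by elapsed time over the life of the process, so it approaches 100 only
  if the daemon has been spinning since it started.

```shell
#!/bin/sh
# cpu_of NAME: print "PID %CPU" for each process named NAME.
# Prints nothing if no such process exists.
cpu_of() {
    ps -C "$1" -o pid= -o pcpu= || true
}

cpu_of cachefilesd
```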

  [Regression Potential]

  The patch makes changes to how cachefilesd detects if it should sleep
  or cull, so regressions would be in the area of cachefilesd spinning
  instead of sleeping (which is what it does now) or sleeping instead
  of culling.

  However the patch is small and easily understood and backports with
  minimal effort.

  [Other Info]

  This is fixed upstream in 0.10.6:

  * Wed Feb 3 2016 David Howells <dhowe...@redhat.com> 0.10.6-1
  ...
  - Suspend culling when cache space is short and cache objects are pinned.

  The particular patch is ce353f5b6b5b ("cachefilesd can spin when disk
  space is short.")

  Since bionic has version 0.10.10-0.1, this fix is needed only for
  xenial and trusty.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/cachefilesd/+bug/1810372/+subscriptions

-- 
Mailing list: https://launchpad.net/~sts-sponsors
Post to     : sts-sponsors@lists.launchpad.net
Unsubscribe : https://launchpad.net/~sts-sponsors
More help   : https://help.launchpad.net/ListHelp