Hi Mauricio,

The tests in [5][6] and [7] have been done with a 44GB memory VM.
This VM usually has 64GB of allocated memory.
The goal was to verify that the load of the whole btree in RAM was not 
prevented (like in mca-reap -> down_write_trylock).

All this kernel stuff is rather new to me, so I may be following a wrong
lead.

** Description changed:

  Hello,
  
  In short, we faced an issue with a huge IO wait on a bionic Ubuntu 
4.15.0-118.119-generic kernel.
  This is the full list of process and the kernel function they were stuck in 
[0].
  
  The main issue can probably be summarized by this perf reports
  * first identify that the cpu are stuck in idle because of something[1]
  * second, see what kernel function seems to stuck the process kswapd0 and 
kswapd1 [2].
  
  We could see that this seems to be the mutex_lock in the bch_mca_scan
  function [3].
  
  After running the command:
  
   | sudo bash -c "echo 1 > /sys/fs/bcache/f1a1e8cb-3e6b-40ea-852e-
  583c48d0c2b8/internal/btree_shrinker_disabled"
  
  The server started to respond normally and the IO wait dropped
  significantly
  
  Here is a trace of the bcache event related lock in the kernel obtained
  with some bpfcc-tools [4].
  
  klockstat-bpfcc -c bch_ -i 5 -s 3
  
  The trace has been run in parallel with the following command line
  
  echo "Shrinker disabled: $(date)"; sleep 60; echo "Enabling shrinker:
  $(date)"; echo 0 | sudo tee
  /sys/block/bcache0/bcache/cache/internal/btree_shrinker_disabled ; sleep
  60; echo "Disabling shrinker: $(date)"; echo 1 | sudo tee
  /sys/block/bcache0/bcache/cache/internal/btree_shrinker_disabled; sleep
  60; echo "End of test: $(date)"
  
  Trying to dig more, we reduced by 20 GB the memory allocated to a VM on the 
server.
  * The bcache btree size fluctuation seems "normal" [5]
- * I noticed that, when the shrinker was enabled,a lot of time was spent in 
the locks during "bch_btree_insert_node".
+ * I noticed that, when the shrinker was enabled,a lot of time was spent in 
the locks during "bch_btree_insert_node". [6]
  
  I decided to check if one of the function called during
  bch_btree_insert_node was taking longer than usual when the shrinker was
  enabled.
  
  I finally found the "funclatency" tool and tried do have the same approach I 
had with the klockstat [7]. However, that was inconclusive. I could see there 
that the bch_btree_insert_node was barely called during the whole duration of 
the test.
  Which made me think it's amount of time spent in lock is more due to another 
process acquiring the lock.
  
  I'm going to try to have another go with some perf/klockstat/funclatency
  focused on bch_mca_scan and the function called there.
  
  Also, here are some memory related metrics [8].
  
- 
  Now another perf stacktrace with the command used [9].
  Strangely this one doesn't show any bch_mca_scan at all.
  
  I enabled the shrinker again, hoping to get more traces, but apparently the 
timeframe was not right. Not enough load to trigger the cliff resulting in a 
1sec IOwait plateau.
  Which is interesting because that means that without the maximal workload, 
the kernel can cope with the shrinker.
- 
  
  [0]: https://pastebin.ubuntu.com/p/QYXPdsMCWC/
  [1]: https://pastebin.ubuntu.com/p/BFsvF7H54r/
  [2]: https://pastebin.ubuntu.com/p/35qdsHYHf5/
  [3]: 
https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/bionic/tree/drivers/md/bcache/btree.c?h=Ubuntu-4.15.0-118.119#n674
  [4]: https://pastebin.ubuntu.com/p/qhyqP35fCw/
  [5]: https://pastebin.ubuntu.com/p/McjxxqTVjn/
  [6]: https://pastebin.ubuntu.com/p/KmrnW4Ng8F/
  [7]: https://pastebin.ubuntu.com/p/fSX4c7tTFV/
  [8]: https://pastebin.ubuntu.com/p/CZgXkgKhmJ/
  [9]: https://pastebin.ubuntu.com/p/DzKCP8NGdf/
  
  ====================
  $ cat /proc/version_signature
  Ubuntu 4.15.0-118.119-generic 4.15.18
  
  ProblemType: Bug
  DistroRelease: Ubuntu 18.04
  Package: linux-image-4.15.0-118-generic 4.15.0-118.119
  ProcVersionSignature: User Name 4.15.0-118.119-generic 4.15.18
  Uname: Linux 4.15.0-118-generic x86_64
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 Sep 29 10:04 seq
   crw-rw---- 1 root audio 116, 33 Sep 29 10:04 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.9-0ubuntu7.16
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 
'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', 
'/dev/snd/timer'] failed with exit code 1:
  Date: Tue Oct  6 20:36:18 2020
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
  MachineType: HP ProLiant DL380 G7
  PciMultimedia:
  
  ProcFB: 0 radeondrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-118-generic 
root=UUID=c6ad1629-a506-4043-a339-6d57f0708d12 ro console=ttyS1,115200 nosplash
  RelatedPackageVersions:
   linux-restricted-modules-4.15.0-118-generic N/A
   linux-backports-modules-4.15.0-118-generic  N/A
   linux-firmware                              1.173.18
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
  SourcePackage: linux
  UpgradeStatus: Upgraded to bionic on 2019-09-27 (375 days ago)
  dmi.bios.date: 05/05/2011
  dmi.bios.vendor: HP
  dmi.bios.version: P67
  dmi.chassis.type: 23
  dmi.chassis.vendor: HP
  dmi.modalias: 
dmi:bvnHP:bvrP67:bd05/05/2011:svnHP:pnProLiantDL380G7:pvr:cvnHP:ct23:cvr:
  dmi.product.family: ProLiant
  dmi.product.name: ProLiant DL380 G7
  dmi.sys.vendor: HP

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1898786

Title:
  Issue with bcache bch_mca_scan causing huge IO wait

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1898786/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs

Reply via email to