Bug#983331: zfs-linux: zfs_zrele_async can cause txg sync deadlocks

2021-02-22 Thread Heitor Alves de Siqueira
Source: zfs-linux
Severity: important

Dear Maintainer,

For certain ZFS workloads, we can see hung task timeouts in the kernel logs due
to a transaction group deadlock. Userspace processes will hang and display stack
traces similar to the one below:
[49181.619711] clnt_server D0 21699  28868 0x0320
[49181.619715] Call Trace:
[49181.619725]  __schedule+0x24e/0x880
[49181.619730]  schedule+0x2c/0x80
[49181.619750]  cv_wait_common+0x11e/0x140 [spl]
[49181.619763]  ? wait_woken+0x80/0x80
[49181.619775]  __cv_wait+0x15/0x20 [spl]
[49181.619872]  zil_commit.part.14+0x80/0x8c0 [zfs]
[49181.619884]  ? _cond_resched+0x19/0x40
[49181.619887]  ? mutex_lock+0x12/0x40
[49181.619959]  zil_commit+0x17/0x20 [zfs]
[49181.620026]  zfs_fsync+0x77/0xe0 [zfs]
[49181.620093]  zpl_fsync+0x68/0xa0 [zfs]
[49181.620100]  vfs_fsync_range+0x51/0xb0
[49181.620105]  do_fsync+0x3d/0x70
[49181.620109]  SyS_fsync+0x10/0x20
[49181.620114]  do_syscall_64+0x73/0x130
[49181.620119]  entry_SYSCALL_64_after_hwframe+0x41/0xa6

We might also see a kworker thread blocked in the ZFS writeback/evict path:
[49181.881570] kworker/u17:3   D0  4915  2 0x8000
[49181.881576] Workqueue: writeback wb_workfn (flush-zfs-10)
[49181.881577] Call Trace:
[49181.881580]  __schedule+0x24e/0x880
[49181.881582]  ? atomic_t_wait+0x60/0x60
[49181.881584]  schedule+0x2c/0x80
[49181.881588]  bit_wait+0x11/0x60
[49181.881592]  __wait_on_bit+0x4c/0x90
[49181.881596]  ? atomic_t_wait+0x60/0x60
[49181.881599]  __inode_wait_for_writeback+0xb9/0xf0
[49181.881601]  ? bit_waitqueue+0x40/0x40
[49181.881605]  inode_wait_for_writeback+0x26/0x40
[49181.881609]  evict+0xb5/0x1a0
[49181.881611]  iput+0x19c/0x230
[49181.881648]  zfs_iput_async+0x1d/0x80 [zfs]
[49181.881682]  zfs_get_data+0x1d4/0x2a0 [zfs]
[49181.881718]  zil_commit.part.14+0x640/0x8c0 [zfs]
[49181.881752]  zil_commit+0x17/0x20 [zfs]
[49181.881784]  zpl_writepages+0xd5/0x160 [zfs]
[49181.881787]  do_writepages+0x4b/0xe0
[49181.881790]  __writeback_single_inode+0x45/0x350
[49181.881792]  ? __writeback_single_inode+0x45/0x350
[49181.881794]  writeback_sb_inodes+0x1d7/0x530
[49181.881796]  wb_writeback+0xfb/0x300
[49181.881799]  wb_workfn+0xad/0x400
[49181.881800]  ? wb_workfn+0xad/0x400
[49181.881803]  ? __switch_to_asm+0x35/0x70
[49181.881809]  process_one_work+0x1de/0x420
[49181.881811]  worker_thread+0x32/0x410
[49181.881813]  kthread+0x121/0x140
[49181.881815]  ? process_one_work+0x420/0x420
[49181.881817]  ? kthread_create_worker_on_cpu+0x70/0x70
[49181.881819]  ret_from_fork+0x35/0x40

This is caused by a race between ZFS writeback and evict threads, usually during
a transaction group sync operation. It's possible to have two iput() threads
racing for the same inode: one of them scheduled async and the other executed
synchronously as part of the writeback path. If the writeback thread tries to
evict the inode while the async thread is running, it might re-enter the block
layer for the same inode due to ZFS counters being in an inconsistent
state. This then causes the kworker thread to stall the writeback, which in turn
prevents the transaction group sync from completing and blocks other ZFS threads.

This is fixed by the upstream commit:
- Fix zrele race in zrele_async that can cause hang (2921ad6cba54) [0]

[0] https://github.com/openzfs/zfs/pull/11530



Bug#946610: zfs-dkms: livelock between ZFS evict and writeback threads

2019-12-11 Thread Heitor Alves de Siqueira
Source: zfs-linux
Severity: normal

Dear Maintainer,

For certain ZFS workloads, we start seeing hung task timeouts in the kernel 
logs due to zil_commit() stalling. This is due to zfs_zget() not detecting 
whether a znode has been marked for deletion before attempting to access it, 
causing a constant "retry loop" in zfs_get_data() if that znode has been 
unlinked already. An example of the stack traces follows:

[72742.051703] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[72742.070429] mysqld  D0  5713   2881 0x0320
[72742.073220] Call Trace:
[72742.075305]  __schedule+0x24e/0x880
[72742.090436]  schedule+0x2c/0x80
[72742.090438]  schedule_preempt_disabled+0xe/0x10
[72742.090441]  __mutex_lock.isra.5+0x276/0x4e0
[72742.090547]  ? dmu_tx_destroy+0x105/0x130 [zfs]
[72742.090555]  __mutex_lock_slowpath+0x13/0x20
[72742.115374]  ? __mutex_lock_slowpath+0x13/0x20
[72742.132266]  mutex_lock+0x2f/0x40
[72742.134207]  zil_commit_impl+0x1b0/0x1b30 [zfs]
[72742.150428]  ? spl_kmem_alloc+0x115/0x180 [spl]
[72742.152622]  ? mutex_lock+0x12/0x40
[72742.154819]  ? zfs_refcount_add_many+0x9a/0x100 [zfs]
[72742.171450]  zil_commit+0xde/0x150 [zfs]
[72742.173687]  zfs_fsync+0x77/0xe0 [zfs]
[72742.175044]  zpl_fsync+0x80/0x110 [zfs]
[72742.191690]  vfs_fsync_range+0x51/0xb0
[72742.193876]  do_fsync+0x3d/0x70
[72742.195126]  SyS_fsync+0x10/0x20
[72742.211059]  do_syscall_64+0x73/0x130
[72742.214078]  entry_SYSCALL_64_after_hwframe+0x3d/0xa2

It's possible to hit this issue due to a race between the ZFS evict and 
writeback threads. If the z_iput task is trying to evict a znode that's 
currently sitting in the writeback thread, both will "livelock" each other and 
stall the ZIO pipeline, causing other ZFS operations (such as zil_commit) to 
hang indefinitely.
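
The retry loop described above can be illustrated with a small toy model. This
is only a sketch and not the ZFS code: the Znode class, igrab() stand-in, zget()
helper and max_retries cap are all hypothetical names invented for the
illustration. It assumes the fix amounts to reporting ENOENT for an unlinked
znode instead of asking the caller to try again.

```python
import errno


class Znode:
    """Toy stand-in for a znode; not the real ZFS structure."""

    def __init__(self, unlinked=False, being_evicted=False):
        self.unlinked = unlinked
        self.being_evicted = being_evicted


def igrab(znode):
    """Pretend igrab(): fails while the inode is being evicted."""
    return not znode.being_evicted


def zget(znode, check_unlinked, max_retries=1000):
    """Return (error, retries).

    max_retries exists only so that the toy terminates; the behaviour
    described in the report is a loop that never gives up.
    """
    retries = 0
    while True:
        if igrab(znode):
            return 0, retries
        if check_unlinked and znode.unlinked:
            # Assumed fix: an unlinked znode will never come back.
            return errno.ENOENT, retries
        retries += 1
        if retries >= max_retries:
            return errno.EAGAIN, retries
```

With check_unlinked=False, an unlinked znode that is being evicted keeps the
caller spinning until the artificial cap is hit; with check_unlinked=True the
call returns ENOENT on the first failed grab.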

This has been documented and fixed upstream in PR#9583 [0]. We need to pull two 
fixes from upstream: the first one fixes the zfs_zget() issue in the writeback 
thread, while the second fixes a regression on O_TMPFILE descriptors caused by 
the first one.

Upstream patches:
 - Break out of zfs_zget early if unlinked znode (41e1aa2a06f8) [1]
 - Check for unlinked znodes after igrab() (0c46813805f4) [2]

[0] https://github.com/zfsonlinux/zfs/pull/9583
[1] https://github.com/zfsonlinux/zfs/commit/41e1aa2a06f8
[2] https://github.com/zfsonlinux/zfs/commit/0c46813805f4



Bug#926657: [Pkg-openldap-devel] Bug#926657: openldap: slapd process failure is not detected by systemd

2019-05-06 Thread Heitor Alves de Siqueira
Hi Ryan,

I'm attaching a patch for this which includes the override file.
Do you think this patch can be adopted in Debian until we have the
native service file? Please let me know if you think it needs any
changes!
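
For anyone testing the override, a rough way to check the effective setting is
to parse the Key=Value output of "systemctl show". The sketch below is only an
illustration: the helper names are made up, and slapd.service as the unit name
is an assumption.

```python
import subprocess


def parse_show_output(text):
    """Turn the Key=Value lines printed by `systemctl show` into a dict."""
    props = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            props[key] = value
    return props


def remains_after_exit(text):
    """True if the unit stays 'active' after its process has exited."""
    return parse_show_output(text).get("RemainAfterExit") == "yes"


def query_unit(unit="slapd.service"):
    """Ask systemd for the two properties touched by the override.

    The unit name is an assumption; adjust it for the system under test.
    """
    result = subprocess.run(
        ["systemctl", "show", unit, "--property=Type,RemainAfterExit"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

On a system with the override installed, remains_after_exit(query_unit()) would
be expected to return False.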

Thanks,
Heitor
From 84316f458fb14592085f7e67d6aab349aaa312ff Mon Sep 17 00:00:00 2001
From: Heitor Alves de Siqueira 
Date: Mon, 6 May 2019 10:17:31 -0300
Subject: [PATCH] slapd: Fix systemd not detecting service failures

If the slapd daemon process exits due to some failure, the systemd
service is still reported as active even though the child (daemon)
process has exited with a signal. This is due to the sysv-generator unit
file for slapd telling the service to remain active after its process
exits.

This patch adds a systemd override file with 'RemainAfterExit=no', to
make the service behave in the expected way.

Closes: #926657

Signed-off-by: Heitor Alves de Siqueira 
---
 debian/slapd-remain-after-exit.conf | 3 +++
 debian/slapd.install                | 1 +
 2 files changed, 4 insertions(+)
 create mode 100644 debian/slapd-remain-after-exit.conf

diff --git a/debian/slapd-remain-after-exit.conf b/debian/slapd-remain-after-exit.conf
new file mode 100644
index ..b031203eaa9f
--- /dev/null
+++ b/debian/slapd-remain-after-exit.conf
@@ -0,0 +1,3 @@
+[Service]
+Type=forking
+RemainAfterExit=no
diff --git a/debian/slapd.install b/debian/slapd.install
index 2e7c9990a53d..ea197a99bf14 100644
--- a/debian/slapd.install
+++ b/debian/slapd.install
@@ -5,6 +5,7 @@ debian/ldiftopasswd usr/share/slapd
 debian/DB_CONFIG usr/share/slapd
 debian/slapd.conf usr/share/slapd
 debian/slapd.init.ldif usr/share/slapd
+debian/slapd-remain-after-exit.conf lib/systemd/system/slapd.service.d
 
 usr/lib/ldap/back_*.so*
 usr/lib/ldap/back_*.la
-- 
2.21.0



Bug#928182: debconf: readline frontend does not show options before prompting user

2019-05-03 Thread Heitor Alves de Siqueira
Some minor changes to the patch, to fix indentation and avoid calling
Term::ReadLine->new() twice.
From ce8eb9118a35992f1fd7f1cc21be3142b72821a4 Mon Sep 17 00:00:00 2001
From: Heitor Alves de Siqueira 
Date: Fri, 3 May 2019 10:34:09 -0300
Subject: [PATCH] debconf: fix readline prompt for run-parts

When run-parts is executed with the --report flag, it buffers stdout and
stderr from debconf into pipes for formatting. This causes the readline
prompt to be displayed before any of the available options show up, as
readline goes directly to /dev/tty by default and completely sidesteps
the run-parts buffers. The prompt will then block for input, preventing
the options from showing up until the user makes a selection.

This patch forces readline to print to stdout instead of /dev/tty if
both stdout and stderr are being buffered through pipes.

Closes: #928182

Signed-off-by: Heitor Alves de Siqueira 
---
 Debconf/FrontEnd/Readline.pm | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/Debconf/FrontEnd/Readline.pm b/Debconf/FrontEnd/Readline.pm
index 546a0be32512..ef982491cbea 100644
--- a/Debconf/FrontEnd/Readline.pm
+++ b/Debconf/FrontEnd/Readline.pm
@@ -51,6 +51,10 @@ sub init {
 	$this->readline(Term::ReadLine->new('debconf'));
 	$this->readline->ornaments(1);
 
+	if (-p STDOUT && -p STDERR) { # make readline play nice with buffered stdout
+		$this->readline->newTTY(*STDIN, *STDOUT);
+	}
+
 	if (Term::ReadLine->ReadLine =~ /::Gnu$/) {
 		# Well, emacs shell buffer has some annoying interactions
 		# with Term::ReadLine::GNU. It's not worth the pain.
-- 
2.21.0



Bug#928182: debconf: readline frontend does not show options before prompting user

2019-04-29 Thread Heitor Alves de Siqueira
Hi Colin,

Thank you for taking a look at this one. It has been reported in
Launchpad as well [0].
A summary of how we can trigger this bug would be like this:

- dpkg/apt triggers a package upgrade
- some maintainer scripts are executed with the 'run-parts' tool using
the '--report' flag (e.g. the /etc/kernel/postinst.d scripts at the
end of [1])
- run-parts sets up pipes and captures stdout/stderr from debconf
- readline frontend prints prompt before run-parts can process the
options from stdout pipe

I believe we could fix this by forcing readline to always use stdout if
both stdout and stderr are pipes (since that most likely means debconf
is running under run-parts).
From what I've tested, the attached patch works correctly for this
case. Do you think that patch would be an appropriate solution?
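
For readers less familiar with Perl's -p file test, the check proposed above
can be sketched in Python. This is only an illustration of the idea, not part
of the patch; the function names are invented for the example.

```python
import os
import stat


def is_pipe(fd):
    """True if the file descriptor refers to a pipe or FIFO."""
    return stat.S_ISFIFO(os.fstat(fd).st_mode)


def both_outputs_piped(stdout_fd=1, stderr_fd=2):
    """True when stdout and stderr are both pipes.

    This is the condition under which the prompt would be sent to
    stdout instead of /dev/tty.
    """
    return is_pipe(stdout_fd) and is_pipe(stderr_fd)
```

Running a script that prints both_outputs_piped() under a wrapper that captures
both streams should report True, while running it directly in a terminal should
report False.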

Thanks,
Heitor

[0] https://bugs.launchpad.net/bugs/1822270
[1] https://termbin.com/c92p
From bbfe0bb54ccf8e6f6cacfd642687e550a2ec5493 Mon Sep 17 00:00:00 2001
From: Heitor Alves de Siqueira 
Date: Mon, 29 Apr 2019 12:32:01 -0300
Subject: [PATCH] [RFC] debconf: fix readline prompt for run-parts

Signed-off-by: Heitor Alves de Siqueira 
---
 Debconf/FrontEnd/Readline.pm | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/Debconf/FrontEnd/Readline.pm b/Debconf/FrontEnd/Readline.pm
index 546a0be32512..e4868d9be278 100644
--- a/Debconf/FrontEnd/Readline.pm
+++ b/Debconf/FrontEnd/Readline.pm
@@ -48,7 +48,12 @@ sub init {
 	close TESTTY;
 
 	$Term::ReadLine::termcap_nowarn = 1; # Turn off stupid termcap warning.
-	$this->readline(Term::ReadLine->new('debconf'));
+if (-p STDOUT && -p STDERR) { # make readline play nice with run-parts --report
+$this->readline(Term::ReadLine->new('debconf', *STDIN, *STDOUT));
+}
+else {
+$this->readline(Term::ReadLine->new('debconf'));
+}
 	$this->readline->ornaments(1);
 
 	if (Term::ReadLine->ReadLine =~ /::Gnu$/) {
-- 
2.21.0



Bug#928182: debconf: readline frontend does not show options before prompting user

2019-04-29 Thread Heitor Alves de Siqueira
Package: debconf
Version: 1.5.71
Severity: normal

Dear Maintainer,

When upgrading packages with apt or dpkg, debconf scripts are run
through 'run-parts' with the '--report' flag. This causes script output
to be handled through pipes set up by run-parts, which buffers output
from maintainer scripts for formatting.

If debconf makes use of the readline frontend, any prompts will bypass
the run-parts buffers and be written directly to /dev/tty. This
generally causes the prompt to be displayed before the user sees any of
the available options for it, and the options are not printed until the
user inputs a valid option.



Bug#927311: resource-agents: ethmonitor does not list interfaces without assigned IP address

2019-04-17 Thread Heitor Alves de Siqueira
Package: resource-agents
Severity: normal

Dear Maintainer,

The is_interface() function in heartbeat/ethmonitor tries to match an
interface to a list obtained from the 'ip' tool. It lists interfaces
using the 'inet' family, which omits interfaces that don't have an IP
address assigned.

If the interface that we're looking for is e.g. a VLAN bridge that does
not have an IP address, it won't show up in the listing and
is_interface() will return false. ethmonitor will miss that interface,
and it won't be available for monitoring.
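
The difference between the two listings can be sketched as follows. This is an
illustration only and not the ethmonitor code: the sample strings are
abbreviated, made-up records in the one-line format printed by "ip -o", and the
helper names and the br0.100 interface are hypothetical.

```python
import re

# Abbreviated, made-up samples in the one-line format printed by `ip -o`.
# Address listing restricted to the inet family: only lo and eth0 have an IP.
INET_ADDR_SAMPLE = (
    "1: lo    inet 127.0.0.1/8 scope host lo\n"
    "2: eth0    inet 192.0.2.10/24 brd 192.0.2.255 scope global eth0\n"
)
# Link listing: every interface shows up, with or without an address.
LINK_SAMPLE = (
    "1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN\n"
    "2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP\n"
    "3: br0.100@br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP\n"
)


def interface_names(ip_output):
    """Extract interface names from one-line `ip -o` records."""
    names = set()
    for line in ip_output.splitlines():
        # Records start with "<index>: <name>"; a trailing ":" or an
        # "@parent" suffix is not part of the name.
        match = re.match(r"\d+:\s+([^\s:@]+)", line)
        if match:
            names.add(match.group(1))
    return names


def is_interface(name, ip_output):
    """True if `name` appears in the given listing."""
    return name in interface_names(ip_output)
```

With these samples, is_interface("br0.100", INET_ADDR_SAMPLE) is False while
is_interface("br0.100", LINK_SAMPLE) is True, which mirrors the behaviour
described in this report.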

Upstream commits:
- https://github.com/ClusterLabs/resource-agents/commit/40d05029ce0b
- https://github.com/ClusterLabs/resource-agents/commit/c0ac191c73f1