Bug#971895: lintian: hangs indefinitely on stable using lintian 2.97.0~bpo10+1
On 10/9/20 5:40 PM, Felix Lechner wrote:
> Perhaps it was an error to mix sysread [1] with print (as noted in the
> documentation, even though the handles are different). Can you try
> syswrite [2] in the three lines here [3] instead of print, please?

Already tried it in one of my tests, doesn't solve the issue. I've also tried to flush with fh->flush() without success.

--
Baptiste BEAUPLAT - lyknode
Bug#971895: lintian: hangs indefinitely on stable using lintian 2.97.0~bpo10+1
Hi Baptiste,

On Fri, Oct 9, 2020 at 6:41 AM Baptiste Beauplat wrote:
>
> The problem seems to be an interaction between the pipes, the process,
> perl and the kernel.

Perhaps it was an error to mix sysread [1] with print (as noted in the documentation, even though the handles are different). Can you try syswrite [2] in the three lines here [3] instead of print, please?

Kind regards
Felix Lechner

[1] https://perldoc.perl.org/functions/sysread
[2] https://perldoc.perl.org/functions/syswrite
[3] https://salsa.debian.org/lintian/lintian/-/blob/master/lib/Lintian/IO/Select.pm#L192-194
Bug#971895: lintian: hangs indefinitely on stable using lintian 2.97.0~bpo10+1
Hi Felix,

I did some more testing and here are the results:

- Using the bpo kernel solves the issue
- Using a smaller write on the read buffer solves the issue (tested with 4k)

What I've tried that does not solve the issue:

- Writing only when the pipes are available for writing (using select on stdin).

See the attached patch for the modifications/tests I've made.

The problem seems to be an interaction between the pipes, the process, perl and the kernel.

Since bumping the kernel isn't an option for a lot of users, I would suggest decreasing the read buffer (or writing smaller chunks?).

--
Baptiste Beauplat - lyknode

--- /usr/share/perl5/Lintian/IO/Select.pm 2020-10-09 15:36:34.817261016 +0200
+++ /usr/share/lintian/lib/Lintian/IO/Select.pm 2020-10-09 15:38:11.944481974 +0200
@@ -77,7 +77,9 @@
 my @pids;
-my $select = IO::Select->new;
+my $select_r = IO::Select->new;
+my $select_w = IO::Select->new;
+my $select_e = IO::Select->new;
 my $produce_stdin;
 my $produce_stdout;
@@ -98,7 +100,8 @@
 push(@pids, $produce_pid);
-$select->add($produce_stdout, $produce_stderr);
+$select_r->add($produce_stdout, $produce_stderr);
+$select_e->add($produce_stdin, $produce_stdout, $produce_stderr);
 my $extract_stdin;
 my $extract_stdout;
@@ -120,7 +123,9 @@
 push(@pids, $extract_pid);
-$select->add($extract_stdout, $extract_stderr);
+$select_r->add($extract_stdout, $extract_stderr);
+$select_w->add($extract_stdin);
+$select_e->add($extract_stdin, $extract_stdout, $extract_stderr);
 my @index_options = qw(--list --verbose --utc --full-time --quoting-style=c --file -);
@@ -140,7 +145,9 @@
 push(@pids, $named_pid);
-$select->add($named_stdout, $named_stderr);
+$select_r->add($named_stdout, $named_stderr);
+$select_w->add($named_stdin);
+$select_e->add($named_stdin, $named_stdout, $named_stderr);
 my $numeric_stdin;
 my $numeric_stdout;
@@ -159,7 +166,9 @@
 push(@pids, $numeric_pid);
-$select->add($numeric_stdout, $numeric_stderr);
+$select_r->add($numeric_stdout, $numeric_stderr);
+$select_w->add($numeric_stdin);
+$select_e->add($numeric_stdin, $numeric_stdout, $numeric_stderr);
 my $named = EMPTY;
 my $numeric = EMPTY;
@@ -168,12 +177,27 @@
 my $extract_errors = EMPTY;
 my $named_errors = EMPTY;
-while (my @ready = $select->can_read) {
+use Data::Dumper;
+while (my @state = IO::Select->select($select_r, $select_w, $select_e)) {
+(my $read, my $write, my $error) = @state;
+
+for my $handle (@{$error}) {
+die "PROCESS ERROR"
+}
+
+my $count = scalar @{$write};
+if ($count != 3) {
+STDERR->printflush("Not ready to write: $count: \n");
+next;
+} else {
+STDERR->printflush("OK to write\n");
+}
-for my $handle (@ready) {
+for my $handle (@{$read}) {
 my $buffer;
-my $length = sysread($handle, $buffer, 4096 * TAR_RECORD_SIZE);
+# my $length = sysread($handle, $buffer, 4096 * TAR_RECORD_SIZE);
+my $length = sysread($handle, $buffer, 4096);
 die "Error from child: $!\n" unless defined $length;
@@ -184,7 +208,7 @@
 close $named_stdin;
 close $numeric_stdin;
 }
-$select->remove($handle);
+$select_r->remove($handle);
 next;
 }
Bug#971895: lintian: hangs indefinitely on stable using lintian 2.97.0~bpo10+1
Hi Baptiste,

On Fri, Oct 9, 2020 at 2:03 AM Felix Lechner wrote:
>
> Untarring is an expensive operation, and the two indices would
> otherwise require two such operations in addition to the actual
> unpacking.

Upon re-reading, my wording was perhaps a bit unclear. Here, 'index' refers to 'tar t', or the index output by tar. We collect both named and numerical owners, which require two separate runs.

> In your case, the index of 'installed' files seems to be the issue.

This index is the full in-memory replication of the file list (including checksums and, sometimes, cached content). It is based on Lintian::Index, which is how checks examine files installed in *.deb packages.

Lintian provides four indices total: installed, control, patched and orig. Their features are identical.

Kind regards
Felix Lechner
Bug#971895: lintian: hangs indefinitely on stable using lintian 2.97.0~bpo10+1
On 10/9/20 11:03 AM, Felix Lechner wrote:
> Hi Baptiste,
>
> On Fri, Oct 9, 2020 at 12:30 AM Baptiste BEAUPLAT wrote:
>>
>> the issue is intermittent
>
> In which percentage of cases does this issue occur, please?

With xargs I'd say 90% reproducibility. A single run is closer to 1 out of 3.

> I am unable to reproduce it in twenty runs locally on stable-bpo, in
> which I develop ("bare metal"), without the 'time' command which is,
> for some reason, not available in that position in my version of bash.
> I made twenty runs:
>
> $ seq 1 2 | xargs -I {} -P 0 bin/lintian
> ../bugs/lyknode/gnome-user-docs-3.38.1-1/gnome-user-docs_3.38.1-1_amd64.changes
> E: gnome-user-docs: description-is-pkg-name GNOME user docs
> W: gnome-user-docs source: incomplete-creative-commons-license
> cc-by-3.0 (line 7)
> I: gnome-user-docs: unusual-documentation-package-name
> P: gnome-user-docs source: silent-on-rules-requiring-root
> E: gnome-user-docs: description-is-pkg-name GNOME user docs
> W: gnome-user-docs source: incomplete-creative-commons-license
> cc-by-3.0 (line 7)
> I: gnome-user-docs: unusual-documentation-package-name
> P: gnome-user-docs source: silent-on-rules-requiring-root
>
> Right now, my best guess is a race condition or other problem in this routine:
>
> https://salsa.debian.org/lintian/lintian/-/blob/master/lib/Lintian/IO/Select.pm#L75
>
> The issue should be easy to track down, if it is caused by Lintian,
> because Lintian no longer does anything in parallel. That routine is
> the sole exception.
>
> Untarring is an expensive operation, and the two indices would
> otherwise require two such operations in addition to the actual
> unpacking.
>
> In your case, the index of 'installed' files seems to be the issue.
> Indices are now unpacked on demand. Do you see the issue with just a
> single check that accesses Processable::installed, for example with
> '-C files/special'?

Yes. The issue does occur with the following command:

lintian -d -C files/special gnome-user-docs_3.38.1-1_all.deb

I did 10 runs; the stuck rate was 50%. The list of zombie processes varies between runs, but it's always among the same processes (the 3 tars and 2 dpkg-deb).

>
> Ideally, you would run it only on the offending *.deb. Thanks!

--
Baptiste Beauplat - lyknode
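Hang rates like the ones quoted above can be measured mechanically instead of by watching each run. The following is a minimal sketch, assuming `timeout` from GNU coreutils is installed; the function name `count_hangs` and the choice of time limit are illustrative assumptions and not part of lintian.

```shell
# count_hangs RUNS LIMIT COMMAND...
# Runs COMMAND RUNS times, each under `timeout LIMIT` (seconds), and prints
# how many runs had to be killed for exceeding the limit.
# GNU timeout exits with status 124 when the limit is reached.
count_hangs() {
    runs=$1
    limit=$2
    shift 2
    hangs=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        status=0
        timeout "$limit" "$@" >/dev/null 2>&1 || status=$?
        if [ "$status" -eq 124 ]; then
            hangs=$((hangs + 1))
        fi
        i=$((i + 1))
    done
    echo "$hangs"
}
```

For example, `count_hangs 10 300 lintian -d -C files/special gnome-user-docs_3.38.1-1_all.deb` would print the number of runs out of ten that were still running after 300 seconds. The limit is an assumption and has to be comfortably longer than a normal run, otherwise slow runs are counted as hangs.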
Bug#971895: lintian: hangs indefinitely on stable using lintian 2.97.0~bpo10+1
Hi Baptiste,

On Fri, Oct 9, 2020 at 12:30 AM Baptiste BEAUPLAT wrote:
>
> the issue is intermittent

In which percentage of cases does this issue occur, please?

I am unable to reproduce it in twenty runs locally on stable-bpo, in which I develop ("bare metal"), without the 'time' command which is, for some reason, not available in that position in my version of bash. I made twenty runs:

$ seq 1 2 | xargs -I {} -P 0 bin/lintian ../bugs/lyknode/gnome-user-docs-3.38.1-1/gnome-user-docs_3.38.1-1_amd64.changes
E: gnome-user-docs: description-is-pkg-name GNOME user docs
W: gnome-user-docs source: incomplete-creative-commons-license cc-by-3.0 (line 7)
I: gnome-user-docs: unusual-documentation-package-name
P: gnome-user-docs source: silent-on-rules-requiring-root
E: gnome-user-docs: description-is-pkg-name GNOME user docs
W: gnome-user-docs source: incomplete-creative-commons-license cc-by-3.0 (line 7)
I: gnome-user-docs: unusual-documentation-package-name
P: gnome-user-docs source: silent-on-rules-requiring-root

Right now, my best guess is a race condition or other problem in this routine:

https://salsa.debian.org/lintian/lintian/-/blob/master/lib/Lintian/IO/Select.pm#L75

The issue should be easy to track down, if it is caused by Lintian, because Lintian no longer does anything in parallel. That routine is the sole exception.

Untarring is an expensive operation, and the two indices would otherwise require two such operations in addition to the actual unpacking.

In your case, the index of 'installed' files seems to be the issue. Indices are now unpacked on demand. Do you see the issue with just a single check that accesses Processable::installed, for example with '-C files/special'?

Ideally, you would run it only on the offending *.deb. Thanks!

Kind regards
Felix Lechner
Bug#971895: lintian: hangs indefinitely on stable using lintian 2.97.0~bpo10+1
On 10/9/20 9:50 AM, Felix Lechner wrote:
> Hi Baptiste,
>
> On Fri, Oct 9, 2020 at 12:30 AM Baptiste BEAUPLAT wrote:
>>
>> seq 1 2 | time xargs -I {} -P 0 lintian -d
>> gnome-user-docs_3.38.1-1_amd64.changes
>
> What is the purpose of the 'seq' in this command, please?

To produce **two** arbitrary lines that will be used by xargs to run **two** instances of the command in parallel.

>> The issue is only reproducible when the following criteria are met:
>>
>> - Running buster
>> - Using lintian bpo
>> - Running on bare metal or in a VM
>
> That seems to cover most scenarios. Are you saying it reproduces
> everywhere *except* in a chroot?

And except on unstable, and with the unstable or stable lintian, yes.

>>
>> https://framadrop.org/lufi/r/T_Vd4FcPB6#iTe5/s293qeEKTSWA1f4KR/iwptzGcCyx9FS9f6I+yE=
>
> Did you mean to upload a tarball called
> 'gnome-user-docs-3.38.1-1.tar.gz' ? Thanks!

Yes, that's an archive containing the upload.

$ sha256sum gnome-user-docs-3.38.1-1.tar.gz
e4a829a2e29c4778ef8c1a11b30cf6d61e8f2c72e726ce2f92e7a2623f4c49fa  gnome-user-docs-3.38.1-1.tar.gz

Content is:

$ sha256sum gnome-user-docs-3.38.1-1/*
7d9fcb25cd82bf9855e2d4821dc60e08ed58255560a96511a50c22c8f7c01b8a  gnome-user-docs-3.38.1-1/gnome-user-docs_3.38.1-1_all.deb
bb47c5e4f834504e4faffc118027db30471a3356bc5b0aeef66ea33a6e50464a  gnome-user-docs-3.38.1-1/gnome-user-docs_3.38.1-1_amd64.buildinfo
d78182dea442d7710a9c4122ae6bfd3e7ef9c21c8b790d981571e7c1080f1192  gnome-user-docs-3.38.1-1/gnome-user-docs_3.38.1-1_amd64.changes
db17908b10ffe4c7540d521ca5bdbc21677721d5a2fc67af8f171cc34eb850cf  gnome-user-docs-3.38.1-1/gnome-user-docs_3.38.1-1.debian.tar.xz
8d231141654f8aab2e151b7887ffd9658daad2fc0684295b9e5212d2db9f1915  gnome-user-docs-3.38.1-1/gnome-user-docs_3.38.1-1.dsc
4e8af49f5d23571abafc787c37e8ed81e267aeb20b083fd5d8bc85b1cc769e48  gnome-user-docs-3.38.1-1/gnome-user-docs_3.38.1.orig.tar.xz
af9da43a85f7d5809cfd6957dafc49fef73418dceaabc53b2ade07cd6785c11c  gnome-user-docs-3.38.1-1/gnome-user-guide_3.38.1-1_all.deb

--
Baptiste Beauplat - lyknode
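The role of `seq` in the reproduction command can be seen in isolation with a harmless stand-in. This is a small sketch, assuming an xargs that accepts `-P 0` (GNU findutils does); the wrapper name `run_twice` and the use of `echo` in place of lintian are illustrative assumptions.

```shell
# seq emits two lines ("1" and "2"). With -I {}, xargs starts one command
# per input line, even though {} is never used in the command itself.
# -P 0 lets xargs run as many of those commands in parallel as it can.
run_twice() {
    seq 1 2 | xargs -I {} -P 0 "$@"
}

# Stand-in for lintian: each of the two invocations prints one line.
run_twice echo started
```

Replacing `echo started` with `lintian -d gnome-user-docs_3.38.1-1_amd64.changes` gives the two concurrent lintian runs from the report, minus the `time` prefix.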
Bug#971895: lintian: hangs indefinitely on stable using lintian 2.97.0~bpo10+1
Hi Baptiste,

On Fri, Oct 9, 2020 at 12:30 AM Baptiste BEAUPLAT wrote:
>
> seq 1 2 | time xargs -I {} -P 0 lintian -d
> gnome-user-docs_3.38.1-1_amd64.changes

What is the purpose of the 'seq' in this command, please?

> The issue is only reproducible when the following criteria are met:
>
> - Running buster
> - Using lintian bpo
> - Running on bare metal or in a VM

That seems to cover most scenarios. Are you saying it reproduces everywhere *except* in a chroot?

>
> https://framadrop.org/lufi/r/T_Vd4FcPB6#iTe5/s293qeEKTSWA1f4KR/iwptzGcCyx9FS9f6I+yE=

Did you mean to upload a tarball called 'gnome-user-docs-3.38.1-1.tar.gz' ? Thanks!

Kind regards
Felix Lechner
Bug#971895: lintian: hangs indefinitely on stable using lintian 2.97.0~bpo10+1
Package: lintian
Version: 2.97.0~bpo10+1
Severity: important

Dear maintainer,

On mentors.debian.net, our worker got stuck twice while running lintian on two separate packages. I haven't been able to reproduce the issue with the first package, but the second one triggers it.

# The issue

Lintian hangs indefinitely while extracting the source. When run with -d, lintian stops on:

N: Running check: debian/control on source:gnome-user-docs/3.38.1-1 ...

The process state is:

`-/usr/share/lint
  |-dpkg-deb --fsys-tarfile /home/lyknode/tmp/gnome-user-docs_3.38.1-1_all.deb
  | |-(dpkg-deb)
  | `-dpkg-deb --fsys-tarfile /home/lyknode/tmp/gnome-user-docs_3.38.1-1_all.deb
  |-tar --no-same-owner --no-same-permissions --touch --extract --file - -C /tmp/lintian-pool-tvEmMa5WKi/gnome-user-docs/gnome-user-docs_3.38.1-1_all_binary/unpacked
  |-tar --list --verbose --utc --full-time --quoting-style=c --file -
  `-tar --numeric-owner --list --verbose --utc --full-time --quoting-style=c --file -

The process `(dpkg-deb)` is in Zombie state, everything else is in Sleep state.

# How to reproduce

First of all, the issue is intermittent. I found that it is triggered most reliably when multiple lintian instances run at once (it also occurs on a single run, but less often).

I use the following command to reproduce the issue:

seq 1 2 | time xargs -I {} -P 0 lintian -d gnome-user-docs_3.38.1-1_amd64.changes

The issue is only reproducible when the following criteria are met:

- Running buster
- Using lintian bpo
- Running on bare metal or in a VM

I don't have any information on whether a specific package triggers it. I'm uploading the one I've been using to test it. It will be available here for 60 days:

https://framadrop.org/lufi/r/T_Vd4FcPB6#iTe5/s293qeEKTSWA1f4KR/iwptzGcCyx9FS9f6I+yE=

I'm keeping the archive, don't hesitate to ping me for re-upload.

--
Baptiste BEAUPLAT - lyknode