Bug#971895: lintian: hangs indefinitely on stable using lintian 2.97.0~bpo10+1

2020-10-09 Thread Baptiste Beauplat
On 10/9/20 5:40 PM, Felix Lechner wrote:
> Perhaps it was an error to mix sysread [1] with print (as noted in the
> documentation, even though the handles are different). Can you try
> syswrite [2] in the three lines here [3] instead of print, please?

Already tried it in one of my tests, doesn't solve the issue. I've also
tried to flush with fh->flush() without success.

-- 
Baptiste BEAUPLAT - lyknode





Bug#971895: lintian: hangs indefinitely on stable using lintian 2.97.0~bpo10+1

2020-10-09 Thread Felix Lechner
Hi Baptiste,

On Fri, Oct 9, 2020 at 6:41 AM Baptiste Beauplat  wrote:
>
> The problem seems to be an interaction between the pipes, the process,
> perl and the kernel.

Perhaps it was an error to mix sysread [1] with print (as noted in the
documentation, even though the handles are different). Can you try
syswrite [2] in the three lines here [3] instead of print, please?

Kind regards
Felix Lechner

[1] https://perldoc.perl.org/functions/sysread
[2] https://perldoc.perl.org/functions/syswrite
[3] https://salsa.debian.org/lintian/lintian/-/blob/master/lib/Lintian/IO/Select.pm#L192-194



Bug#971895: lintian: hangs indefinitely on stable using lintian 2.97.0~bpo10+1

2020-10-09 Thread Baptiste Beauplat
Hi Felix,

I did some more testing and here are the results:

- Using the bpo kernel solves the issue
- Using a smaller read buffer, and hence smaller writes, solves the issue (tested with 4k)

What I have tried that does not solve the issue:

- Writing only when the pipes are available for writing (using select on
stdin). See the attached patch for the modification/tests I've made.

The problem seems to be an interaction between the pipes, the process,
perl and the kernel. Since bumping the kernel isn't an option for a lot
of users, I would suggest decreasing the read buffer (or writing smaller
chunks?).

-- 
Baptiste Beauplat - lyknode
--- /usr/share/perl5/Lintian/IO/Select.pm	2020-10-09 15:36:34.817261016 +0200
+++ /usr/share/lintian/lib/Lintian/IO/Select.pm	2020-10-09 15:38:11.944481974 +0200
@@ -77,7 +77,9 @@
 
 my @pids;
 
-my $select = IO::Select->new;
+my $select_r = IO::Select->new;
+my $select_w = IO::Select->new;
+my $select_e = IO::Select->new;
 
 my $produce_stdin;
 my $produce_stdout;
@@ -98,7 +100,8 @@
 
 push(@pids, $produce_pid);
 
-$select->add($produce_stdout, $produce_stderr);
+$select_r->add($produce_stdout, $produce_stderr);
+$select_e->add($produce_stdin, $produce_stdout, $produce_stderr);
 
 my $extract_stdin;
 my $extract_stdout;
@@ -120,7 +123,9 @@
 
 push(@pids, $extract_pid);
 
-$select->add($extract_stdout, $extract_stderr);
+$select_r->add($extract_stdout, $extract_stderr);
+$select_w->add($extract_stdin);
+$select_e->add($extract_stdin, $extract_stdout, $extract_stderr);
 
 my @index_options
   = qw(--list --verbose --utc --full-time --quoting-style=c --file -);
@@ -140,7 +145,9 @@
 
 push(@pids, $named_pid);
 
-$select->add($named_stdout, $named_stderr);
+$select_r->add($named_stdout, $named_stderr);
+$select_w->add($named_stdin);
+$select_e->add($named_stdin, $named_stdout, $named_stderr);
 
 my $numeric_stdin;
 my $numeric_stdout;
@@ -159,7 +166,9 @@
 
 push(@pids, $numeric_pid);
 
-$select->add($numeric_stdout, $numeric_stderr);
+$select_r->add($numeric_stdout, $numeric_stderr);
+$select_w->add($numeric_stdin);
+$select_e->add($numeric_stdin, $numeric_stdout, $numeric_stderr);
 
 my $named = EMPTY;
 my $numeric = EMPTY;
@@ -168,12 +177,27 @@
 my $extract_errors = EMPTY;
 my $named_errors = EMPTY;
 
-while (my @ready = $select->can_read) {
+use Data::Dumper;
+while (my @state = IO::Select->select($select_r, $select_w, $select_e)) {
+(my $read, my $write, my $error) = @state;
+
+for my $handle (@{$error}) {
+die "PROCESS ERROR"
+}
+
+my $count = scalar @{$write};
+if ($count != 3) {
+STDERR->printflush("Not ready to write: $count: \n");
+next;
+} else {
+STDERR->printflush("OK to write\n");
+}
 
-for my $handle (@ready) {
+for my $handle (@{$read}) {
 
 my $buffer;
-my $length = sysread($handle, $buffer, 4096 * TAR_RECORD_SIZE);
+# my $length = sysread($handle, $buffer, 4096 * TAR_RECORD_SIZE);
+my $length = sysread($handle, $buffer, 4096);
 
 die "Error from child: $!\n"
   unless defined $length;
@@ -184,7 +208,7 @@
 close $named_stdin;
 close $numeric_stdin;
 }
-$select->remove($handle);
+$select_r->remove($handle);
 next;
 }
 




Bug#971895: lintian: hangs indefinitely on stable using lintian 2.97.0~bpo10+1

2020-10-09 Thread Felix Lechner
Hi Baptiste,

On Fri, Oct 9, 2020 at 2:03 AM Felix Lechner  wrote:
>
> Untarring is an expensive operation, and the two indices would
> otherwise require two such operations in addition to the actual
> unpacking.

Upon re-reading, my wording was perhaps a bit unclear. Here, 'index'
refers to 'tar t', or the index output by tar. We collect both named
and numerical owners, which require two separate runs.

> In your case, the index of 'installed' files seems to be the issue.

This index is the full in-memory replication of the file list
(including checksums and, sometimes, cached content). It is based on
Lintian::Index, which is how checks examine files installed in *.deb
packages.

Lintian provides four indices total: installed, control, patched and
orig. Their features are identical.

Kind regards
Felix Lechner



Bug#971895: lintian: hangs indefinitely on stable using lintian 2.97.0~bpo10+1

2020-10-09 Thread Baptiste Beauplat
On 10/9/20 11:03 AM, Felix Lechner wrote:
> Hi Baptiste,
> 
> On Fri, Oct 9, 2020 at 12:30 AM Baptiste BEAUPLAT  wrote:
>>
>> the issue is intermittent
> 
> In which percentage of cases does this issue occur, please?

With xargs I'd say 90% reproducibility.
Single run is closer to 1 out of 3.

> I am unable to reproduce it in twenty runs locally on stable-bpo, in
> which I develop ("bare metal"), without the 'time' command which is,
> for some reason, not available in that position in my version of bash.
> I made twenty runs:
> 
> $ seq 1 2 | xargs -I {} -P 0 bin/lintian
> ../bugs/lyknode/gnome-user-docs-3.38.1-1/gnome-user-docs_3.38.1-1_amd64.changes
> E: gnome-user-docs: description-is-pkg-name GNOME user docs
> W: gnome-user-docs source: incomplete-creative-commons-license
> cc-by-3.0 (line 7)
> I: gnome-user-docs: unusual-documentation-package-name
> P: gnome-user-docs source: silent-on-rules-requiring-root
> E: gnome-user-docs: description-is-pkg-name GNOME user docs
> W: gnome-user-docs source: incomplete-creative-commons-license
> cc-by-3.0 (line 7)
> I: gnome-user-docs: unusual-documentation-package-name
> P: gnome-user-docs source: silent-on-rules-requiring-root
> 
> Right now, my best guess is a race condition or other problem in this routine:
> 
> https://salsa.debian.org/lintian/lintian/-/blob/master/lib/Lintian/IO/Select.pm#L75
> 
> The issue should be easy to track down, if it is caused by Lintian,
> because Lintian no longer does anything in parallel. That routine is
> the sole exception.
> 
> Untarring is an expensive operation, and the two indices would
> otherwise require two such operations in addition to the actual
> unpacking.
> 
> In your case, the index of 'installed' files seems to be the issue.
> Indices are now unpacked on demand. Do you see the issue with just a
> single check that accesses Processable::installed, for example with
> '-C files/special'?

Yes. The issue does occur with the following command:

lintian -d -C files/special gnome-user-docs_3.38.1-1_all.deb

I did 10 runs, the stuck rate was 50%.
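
A loop along the following lines can measure such a stuck rate without watching each run by hand. It is only a sketch: `count_hangs` is a hypothetical helper, the time limit is an assumed threshold for "stuck", and it relies on `timeout` from GNU coreutils exiting with status 124 when the limit is reached.

```shell
#!/bin/sh
# count_hangs RUNS LIMIT COMMAND...
# Hypothetical helper: runs COMMAND RUNS times, each under 'timeout LIMIT',
# and prints how many runs exceeded the limit (GNU timeout exits with 124).
count_hangs() {
    runs=$1
    limit=$2
    shift 2
    hung=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        timeout "$limit" "$@" >/dev/null 2>&1
        status=$?
        if [ "$status" -eq 124 ]; then
            hung=$((hung + 1))
        fi
        i=$((i + 1))
    done
    echo "$hung"
}
```

For example, `count_hangs 10 300 lintian -d -C files/special gnome-user-docs_3.38.1-1_all.deb` would report how many of ten runs took longer than five minutes; the 300-second limit is an arbitrary choice.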

The list of zombie processes varies between runs, but it is always among
the same processes (the 3 tars and 2 dpkg-deb).

> 
> Ideally, you would run it only on the offending *.deb. Thanks!


-- 
Baptiste Beauplat - lyknode





Bug#971895: lintian: hangs indefinitely on stable using lintian 2.97.0~bpo10+1

2020-10-09 Thread Felix Lechner
Hi Baptiste,

On Fri, Oct 9, 2020 at 12:30 AM Baptiste BEAUPLAT  wrote:
>
> the issue is intermittent

In which percentage of cases does this issue occur, please?

I am unable to reproduce it in twenty runs locally on stable-bpo, in
which I develop ("bare metal"), without the 'time' command which is,
for some reason, not available in that position in my version of bash.
I made twenty runs:

$ seq 1 2 | xargs -I {} -P 0 bin/lintian ../bugs/lyknode/gnome-user-docs-3.38.1-1/gnome-user-docs_3.38.1-1_amd64.changes
E: gnome-user-docs: description-is-pkg-name GNOME user docs
W: gnome-user-docs source: incomplete-creative-commons-license cc-by-3.0 (line 7)
I: gnome-user-docs: unusual-documentation-package-name
P: gnome-user-docs source: silent-on-rules-requiring-root
E: gnome-user-docs: description-is-pkg-name GNOME user docs
W: gnome-user-docs source: incomplete-creative-commons-license cc-by-3.0 (line 7)
I: gnome-user-docs: unusual-documentation-package-name
P: gnome-user-docs source: silent-on-rules-requiring-root

Right now, my best guess is a race condition or other problem in this routine:

https://salsa.debian.org/lintian/lintian/-/blob/master/lib/Lintian/IO/Select.pm#L75

The issue should be easy to track down, if it is caused by Lintian,
because Lintian no longer does anything in parallel. That routine is
the sole exception.

Untarring is an expensive operation, and the two indices would
otherwise require two such operations in addition to the actual
unpacking.

In your case, the index of 'installed' files seems to be the issue.
Indices are now unpacked on demand. Do you see the issue with just a
single check that accesses Processable::installed, for example with
'-C files/special'?

Ideally, you would run it only on the offending *.deb. Thanks!

Kind regards
Felix Lechner



Bug#971895: lintian: hangs indefinitely on stable using lintian 2.97.0~bpo10+1

2020-10-09 Thread Baptiste Beauplat
On 10/9/20 9:50 AM, Felix Lechner wrote:
> Hi Baptiste,
> 
> On Fri, Oct 9, 2020 at 12:30 AM Baptiste BEAUPLAT  wrote:
>>
>> seq 1 2 | time xargs -I {} -P 0 lintian -d 
>> gnome-user-docs_3.38.1-1_amd64.changes
> 
> What is the purpose of the 'seq' in this command, please?

To produce **two** arbitrary lines that will be used by xargs to run
**two** instances of the command in parallel.
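
The effect can be checked with a harmless stand-in for lintian. In this sketch, `run_twice` is a hypothetical wrapper and `echo` replaces the real command; it assumes GNU xargs, where `-I {}` runs the command once per input line and `-P 0` starts as many processes in parallel as possible.

```shell
#!/bin/sh
# run_twice COMMAND...
# Hypothetical wrapper: 'seq 1 2' emits two lines, so xargs starts COMMAND
# twice. The '{}' placeholder is not used in COMMAND, so the numbers
# themselves are discarded; they only determine how many instances run.
run_twice() {
    seq 1 2 | xargs -I {} -P 0 "$@"
}

# Stand-in for 'lintian -d gnome-user-docs_3.38.1-1_amd64.changes'.
run_twice echo "instance started"
```

Changing `seq 1 2` to `seq 1 4` would start four instances instead.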

>> The issue is only reproducible when the following criteria are met:
>>
>> - Running buster
>> - Using lintian bpo
>> - Running on bare metal or in a VM
> 
> That seems to cover most scenarios. Are you saying it reproduces
> everywhere *except* in a chroot?

And except on unstable, and except with the lintian from unstable or stable, yes.

> 
>> 
>> https://framadrop.org/lufi/r/T_Vd4FcPB6#iTe5/s293qeEKTSWA1f4KR/iwptzGcCyx9FS9f6I+yE=
> 
> Did you mean to upload a tarball called
> 'gnome-user-docs-3.38.1-1.tar.gz' ? Thanks!

Yes, that's an archive containing the upload.

$ sha256sum gnome-user-docs-3.38.1-1.tar.gz
e4a829a2e29c4778ef8c1a11b30cf6d61e8f2c72e726ce2f92e7a2623f4c49fa  gnome-user-docs-3.38.1-1.tar.gz

Content is:

$ sha256sum gnome-user-docs-3.38.1-1/*
7d9fcb25cd82bf9855e2d4821dc60e08ed58255560a96511a50c22c8f7c01b8a  gnome-user-docs-3.38.1-1/gnome-user-docs_3.38.1-1_all.deb
bb47c5e4f834504e4faffc118027db30471a3356bc5b0aeef66ea33a6e50464a  gnome-user-docs-3.38.1-1/gnome-user-docs_3.38.1-1_amd64.buildinfo
d78182dea442d7710a9c4122ae6bfd3e7ef9c21c8b790d981571e7c1080f1192  gnome-user-docs-3.38.1-1/gnome-user-docs_3.38.1-1_amd64.changes
db17908b10ffe4c7540d521ca5bdbc21677721d5a2fc67af8f171cc34eb850cf  gnome-user-docs-3.38.1-1/gnome-user-docs_3.38.1-1.debian.tar.xz
8d231141654f8aab2e151b7887ffd9658daad2fc0684295b9e5212d2db9f1915  gnome-user-docs-3.38.1-1/gnome-user-docs_3.38.1-1.dsc
4e8af49f5d23571abafc787c37e8ed81e267aeb20b083fd5d8bc85b1cc769e48  gnome-user-docs-3.38.1-1/gnome-user-docs_3.38.1.orig.tar.xz
af9da43a85f7d5809cfd6957dafc49fef73418dceaabc53b2ade07cd6785c11c  gnome-user-docs-3.38.1-1/gnome-user-guide_3.38.1-1_all.deb
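
A downloaded copy can be checked against these sums with `sha256sum -c`. A minimal sketch, where `verify_sha256` is a hypothetical helper that takes the expected hash and the file name:

```shell
#!/bin/sh
# verify_sha256 EXPECTED FILE
# Hypothetical helper: succeeds only if FILE has the SHA-256 sum EXPECTED.
# It feeds a one-line checksum list ("HASH  FILE") to 'sha256sum -c' on stdin.
verify_sha256() {
    printf '%s  %s\n' "$1" "$2" | sha256sum -c -
}
```

For the archive itself that would be `verify_sha256 e4a829a2e29c4778ef8c1a11b30cf6d61e8f2c72e726ce2f92e7a2623f4c49fa gnome-user-docs-3.38.1-1.tar.gz`, which returns a non-zero status if the file differs.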

-- 
Baptiste Beauplat - lyknode





Bug#971895: lintian: hangs indefinitely on stable using lintian 2.97.0~bpo10+1

2020-10-09 Thread Felix Lechner
Hi Baptiste,

On Fri, Oct 9, 2020 at 12:30 AM Baptiste BEAUPLAT  wrote:
>
> seq 1 2 | time xargs -I {} -P 0 lintian -d 
> gnome-user-docs_3.38.1-1_amd64.changes

What is the purpose of the 'seq' in this command, please?

> The issue is only reproducible when the following criteria are met:
>
> - Running buster
> - Using lintian bpo
> - Running on bare metal or in a VM

That seems to cover most scenarios. Are you saying it reproduces
everywhere *except* in a chroot?

> 
> https://framadrop.org/lufi/r/T_Vd4FcPB6#iTe5/s293qeEKTSWA1f4KR/iwptzGcCyx9FS9f6I+yE=

Did you mean to upload a tarball called
'gnome-user-docs-3.38.1-1.tar.gz' ? Thanks!

Kind regards
Felix Lechner



Bug#971895: lintian: hangs indefinitely on stable using lintian 2.97.0~bpo10+1

2020-10-09 Thread Baptiste BEAUPLAT
Package: lintian
Version: 2.97.0~bpo10+1
Severity: important

Dear maintainer,

On mentors.debian.net, our worker got stuck twice while running lintian
on two separate packages. While I haven't been able to reproduce the
issue with the first package, I could with the second.

# The issue

Lintian hangs indefinitely on extracting the source.

When run with -d, lintian stops on:

N: Running check: debian/control on source:gnome-user-docs/3.38.1-1  ...

The process state is:

`-/usr/share/lint
|-dpkg-deb --fsys-tarfile /home/lyknode/tmp/gnome-user-docs_3.38.1-1_all.deb
|   |-(dpkg-deb)
|   `-dpkg-deb --fsys-tarfile /home/lyknode/tmp/gnome-user-docs_3.38.1-1_all.deb
|-tar --no-same-owner --no-same-permissions --touch --extract --file - -C /tmp/lintian-pool-tvEmMa5WKi/gnome-user-docs/gnome-user-docs_3.38.1-1_all_binary/unpacked
|-tar --list --verbose --utc --full-time --quoting-style=c --file -
`-tar --numeric-owner --list --verbose --utc --full-time --quoting-style=c --file -

The process `(dpkg-deb)` is in Zombie state, everything else is in Sleep state.
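
On Linux, these states can be read straight from /proc. A minimal sketch, where `proc_states` is a hypothetical helper that takes the PIDs to inspect as arguments; in its output `Z` marks a zombie and `S` an interruptible sleep.

```shell
#!/bin/sh
# proc_states PID...
# Hypothetical helper (Linux only): prints "PID STATE NAME" for each PID,
# using the Name: and State: fields of /proc/PID/status.
proc_states() {
    for pid in "$@"; do
        awk -v pid="$pid" '
            /^Name:/  { name = $2 }
            /^State:/ { state = $2 }
            END       { print pid, state, name }
        ' "/proc/$pid/status"
    done
}
```

The PIDs of the lintian children have to be supplied by the caller, for instance from the process tree above.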

# How to reproduce

First of all, the issue is intermittent. I found that it triggers most
readily when multiple lintian instances run at once (it also occurs on a
single run, but less often). I use the following command to reproduce the
issue:

seq 1 2 | time xargs -I {} -P 0 lintian -d gnome-user-docs_3.38.1-1_amd64.changes

The issue is only reproducible when the following criteria are met:

- Running buster
- Using lintian bpo
- Running on bare metal or in a VM

I don't know whether a specific package triggers it. I'm uploading the
one I've been using to test it.

It will be available here for 60 days:


https://framadrop.org/lufi/r/T_Vd4FcPB6#iTe5/s293qeEKTSWA1f4KR/iwptzGcCyx9FS9f6I+yE=

I'm keeping the archive, don't hesitate to ping me for re-upload.

-- 
Baptiste BEAUPLAT - lyknode


