Bug#1039710: debian-installer: Grub installation fails and /var/log/syslog is empty

2023-08-08 Thread Michael Tokarev

Hi everyone!

Somehow I missed this whole issue, - I didn't see it until now.
Will adjust my mail filters.

08.08.2023 00:49, Philip Hands wrote:

Steve McIntyre  writes:


On Wed, Jul 12, 2023 at 11:15:57AM +0200, Cyril Brulebois wrote:

Hi Michael,

Cyril Brulebois  (2023-06-28):

Control: reassign -1 busybox-udeb 1:1.36.1-3


[…]


With a local build, confirmed -3 is buggy, and that reverting only
busybox-udeb to -1 is sufficient to restore syslog support in the
installer.

Reassigning there; the GRUB thing can be filed separately once we have
actual information via syslog.


A fix would be appreciated, we've got reports piling up about things we
have no logs for.


After weeks with this breakage, I've just uploaded a minimal NMU to
fix it, reverting the syslog changes since -1. I've buit and tested
successfully locally.


It turned out whole syslog thing in busybox is quite broken, - this is
obvious when you see my initial patch which started whole this issue.
Later on upstream did it in a different way which broke whole thing
entirely, which I tried to fix and it seemed to be working locally, I
notified upstream about the breakage and moved on, thinking it's all
set. But obviously it is not.


Thanks -- and I agree, it works :-)

   https://openqa.debian.net/tests/178534/logfile?filename=DI_syslog.txt

As it happens, over the weekend it occurred to me that I might be able
to pave the way to a fix for this bug by coming up with a test for the
failure.

Awkwardly, syslogd wants to open /dev/log and bails out if it's already
in use, so I resorted to (the somewhat disgusting hack of) using podman:


https://salsa.debian.org/philh/busybox/-/commit/2697f7cce81d1a70de202a30e7062dc9f64a94b1

At least it allows syslogd to run well enough to succeed or fail
similarly to the behaviour seen in the bug.


Gosh..


Here it is going wrong with the -3 code:

   https://salsa.debian.org/philh/busybox/-/jobs/4523822#L3963
   (lines 3969-3975, with the last line showing the entire syslog)

and here is an example of it going right:

   https://salsa.debian.org/philh/busybox/-/jobs/4523808#L3668

   Line 3668 here, saying "PASS: syslogd-works" indicates that we
   succeeded in grepping the test string in /var/log/syslog

The difference between these two is simply disabling
CONFIG_FEATURE_REMOTE_LOG, as seen here:

   
https://salsa.debian.org/philh/busybox/-/commit/89c253f75690dd41487b6fd6f9356a1e890a6ac2

I'm not proposing that as a fix, but it does seem to indicate where the
problem may be located. I'm afraid I've failed to work out what's
actually going wrong here (my C's pretty rusty).

BTW At one point I thought I'd narrowed it down to the while loop here:

   
https://salsa.debian.org/philh/busybox/-/commit/328fdfbe43cd8d9e4425c3ee1c68aadfa44ee434

but if that did work, it does no longer. Either I was mistaken about it
having worked earlier (I'm at least 80% sure that's not the case) or
something non-deterministic is going on ... which makes me wonder if the
underlying cause might be something to do with using uninitialised data
somewhere.

Hopefully this will be of some use to those more familiar with the code.


Oh well. So much work for so minor thing.. :(  I'm sorry for missing whole
thing, I'd act right away and fix whole thing in a minutes.

The whole thing is.. well, quite bad.  We identified a few issues, upstream
syslogd is entirely broken now, remote logging isn't that important and it
has just been enabled, - the fix for now is to just disable remote logging
and to revert to the previous-to-breakage situation as is done in the NMU
(remote logging is a niche thing in this context, while it might be useful
for sure - provided it actually works.)  And ping upstream.

The thing is that upstream will most likely fix it in a different way anyway,
as Denys likes to keep it small even if the code becomes barely readable,
and he has a few common practices which he uses when changing anything.

Thank you all for all this huge work.  Adding podman to the tests is.. oh 
well...

/mjt



Bug#1039710: debian-installer: Grub installation fails and /var/log/syslog is empty

2023-08-07 Thread Philip Hands
Steve McIntyre  writes:

> On Wed, Jul 12, 2023 at 11:15:57AM +0200, Cyril Brulebois wrote:
>>Hi Michael,
>>
>>Cyril Brulebois  (2023-06-28):
>>> Control: reassign -1 busybox-udeb 1:1.36.1-3
>>
>>[…]
>>
>>> With a local build, confirmed -3 is buggy, and that reverting only
>>> busybox-udeb to -1 is sufficient to restore syslog support in the
>>> installer.
>>> 
>>> Reassigning there; the GRUB thing can be filed separately once we have
>>> actual information via syslog.
>>
>>A fix would be appreciated, we've got reports piling up about things we
>>have no logs for.
>
> After weeks with this breakage, I've just uploaded a minimal NMU to
> fix it, reverting the syslog changes since -1. I've buit and tested
> successfully locally.

Thanks -- and I agree, it works :-)

  https://openqa.debian.net/tests/178534/logfile?filename=DI_syslog.txt

As it happens, over the weekend it occurred to me that I might be able
to pave the way to a fix for this bug by coming up with a test for the
failure.

Awkwardly, syslogd wants to open /dev/log and bails out if it's already
in use, so I resorted to (the somewhat disgusting hack of) using podman:

   
https://salsa.debian.org/philh/busybox/-/commit/2697f7cce81d1a70de202a30e7062dc9f64a94b1

At least it allows syslogd to run well enough to succeed or fail
similarly to the behaviour seen in the bug.

Here it is going wrong with the -3 code:

  https://salsa.debian.org/philh/busybox/-/jobs/4523822#L3963
  (lines 3969-3975, with the last line showing the entire syslog)

and here is an example of it going right:

  https://salsa.debian.org/philh/busybox/-/jobs/4523808#L3668

  Line 3668 here, saying "PASS: syslogd-works" indicates that we
  succeeded in grepping the test string in /var/log/syslog

The difference between these two is simply disabling
CONFIG_FEATURE_REMOTE_LOG, as seen here:

  
https://salsa.debian.org/philh/busybox/-/commit/89c253f75690dd41487b6fd6f9356a1e890a6ac2

I'm not proposing that as a fix, but it does seem to indicate where the
problem may be located. I'm afraid I've failed to work out what's
actually going wrong here (my C's pretty rusty).

BTW At one point I thought I'd narrowed it down to the while loop here:

  
https://salsa.debian.org/philh/busybox/-/commit/328fdfbe43cd8d9e4425c3ee1c68aadfa44ee434

but if that did work, it does no longer. Either I was mistaken about it
having worked earlier (I'm at least 80% sure that's not the case) or
something non-deterministic is going on ... which makes me wonder if the
underlying cause might be something to do with using uninitialised data
somewhere.

Hopefully this will be of some use to those more familiar with the code.

Cheers, Phil.
-- 
Philip Hands -- https://hands.com/~phil


signature.asc
Description: PGP signature


Bug#1039710: debian-installer: Grub installation fails and /var/log/syslog is empty

2023-08-03 Thread Steve McIntyre
On Wed, Jul 12, 2023 at 11:15:57AM +0200, Cyril Brulebois wrote:
>Hi Michael,
>
>Cyril Brulebois  (2023-06-28):
>> Control: reassign -1 busybox-udeb 1:1.36.1-3
>
>[…]
>
>> With a local build, confirmed -3 is buggy, and that reverting only
>> busybox-udeb to -1 is sufficient to restore syslog support in the
>> installer.
>> 
>> Reassigning there; the GRUB thing can be filed separately once we have
>> actual information via syslog.
>
>A fix would be appreciated, we've got reports piling up about things we
>have no logs for.

After weeks with this breakage, I've just uploaded a minimal NMU to
fix it, reverting the syslog changes since -1. I've buit and tested
successfully locally.

Here's the NMU diff.

-- 
Steve McIntyre, Cambridge, UK.st...@einval.com
You raise the blade, you make the change... You re-arrange me 'til I'm sane...
diff -Nru busybox-1.36.1/debian/changelog busybox-1.36.1/debian/changelog
--- busybox-1.36.1/debian/changelog 2023-06-14 22:01:54.0 +0100
+++ busybox-1.36.1/debian/changelog 2023-08-03 21:22:44.0 +0100
@@ -1,3 +1,11 @@
+busybox (1:1.36.1-3.1) unstable; urgency=medium
+
+  * NMU
+  * Revert recent changes that have broken syslogd in d-i.
+Closes: #1039710
+
+ -- Steve McIntyre <93...@debian.org>  Thu, 03 Aug 2023 21:22:44 +0100
+
 busybox (1:1.36.1-3) unstable; urgency=medium
 
   * syslogd-avoid-nulling-devlog.patch - fix overriding dev/log
diff -Nru busybox-1.36.1/debian/patches/series 
busybox-1.36.1/debian/patches/series
--- busybox-1.36.1/debian/patches/series2023-06-14 21:55:08.0 
+0100
+++ busybox-1.36.1/debian/patches/series2023-08-03 21:22:44.0 
+0100
@@ -14,6 +14,7 @@
 platform-linux.diff
 fix-non-linux-build.patch
 #
-syslogd-decrease-stack-usage-50-bytes.patch
-syslogd-daemonize-after-init-make-errs-visible.patch
-syslogd-avoid-nulling-devlog.patch
+syslogd-fork-after-init-not-before.patch
+#syslogd-decrease-stack-usage-50-bytes.patch
+#syslogd-daemonize-after-init-make-errs-visible.patch
+#syslogd-avoid-nulling-devlog.patch
diff -Nru 
busybox-1.36.1/debian/patches/syslogd-fork-after-init-not-before.patch 
busybox-1.36.1/debian/patches/syslogd-fork-after-init-not-before.patch
--- busybox-1.36.1/debian/patches/syslogd-fork-after-init-not-before.patch  
1970-01-01 01:00:00.0 +0100
+++ busybox-1.36.1/debian/patches/syslogd-fork-after-init-not-before.patch  
2023-08-03 21:22:44.0 +0100
@@ -0,0 +1,58 @@
+From: Michael Tokarev 
+Date: Tue, 6 Jun 2023 17:08:18 +0300
+Subject: [PATCH] syslogd: fork after init on a regular system, not before
+
+Current syslogd performs all init after daemonizing, meanwhile
+main process exits successfully. This means any errors during init
+will not be even shown up because at this time the process has its
+stderr redirected to /dev/null already.
+
+On a MMU system, delay daemonizing to after init.
+On non-MMU system, keep current code.
+
+Signed-off-by: Michael Tokarev 
+---
+ sysklogd/syslogd.c | 13 -
+ 1 file changed, 12 insertions(+), 1 deletion(-)
+
+diff --git a/sysklogd/syslogd.c b/sysklogd/syslogd.c
+index 6ddfd771a..2f9a727cd 100644
+--- a/sysklogd/syslogd.c
 b/sysklogd/syslogd.c
+@@ -1025,7 +1025,6 @@ static void do_syslogd(void)
+   signal(SIGALRM, do_mark);
+   alarm(G.markInterval);
+ #endif
+-  xmove_fd(create_socket(), STDIN_FILENO);
+ 
+   if (option_mask32 & OPT_circularlog)
+   ipcsyslog_init();
+@@ -1033,6 +1032,16 @@ static void do_syslogd(void)
+   if (option_mask32 & OPT_kmsg)
+   kmsg_init();
+ 
++  {
++  int fd = create_socket();
++#if BB_MMU
++  if (!(option_mask32 & OPT_nofork)) {
++  bb_daemonize(DAEMON_CHDIR_ROOT);
++  }
++#endif
++  xmove_fd(fd, STDIN_FILENO);
++  }
++
+   timestamp_and_log_internal("syslogd started: BusyBox v" BB_VER);
+   write_pidfile_std_path_and_ext("syslogd");
+ 
+@@ -1179,9 +1188,11 @@ int syslogd_main(int argc UNUSED_PARAM, char **argv)
+   G.hostname = safe_gethostname();
+   *strchrnul(G.hostname, '.') = '\0';
+ 
++#if !BB_MMU
+   if (!(opts & OPT_nofork)) {
+   bb_daemonize_or_rexec(DAEMON_CHDIR_ROOT, argv);
+   }
++#endif
+ 
+   do_syslogd();
+   /* return EXIT_SUCCESS; */


Bug#1039710: debian-installer: Grub installation fails and /var/log/syslog is empty

2023-07-12 Thread Cyril Brulebois
Hi Michael,

Cyril Brulebois  (2023-06-28):
> Control: reassign -1 busybox-udeb 1:1.36.1-3

[…]

> With a local build, confirmed -3 is buggy, and that reverting only
> busybox-udeb to -1 is sufficient to restore syslog support in the
> installer.
> 
> Reassigning there; the GRUB thing can be filed separately once we have
> actual information via syslog.

A fix would be appreciated, we've got reports piling up about things we
have no logs for.


Cheers,
-- 
Cyril Brulebois (k...@debian.org)
D-I release manager -- Release team member -- Freelance Consultant


signature.asc
Description: PGP signature


Bug#1039710: debian-installer: Grub installation fails and /var/log/syslog is empty

2023-06-28 Thread Cyril Brulebois
Control: reassign -1 busybox-udeb 1:1.36.1-3

Cyril Brulebois  (2023-06-28):
> busybox seems to me like the most likely suspect here. deb-reversion'ing
> bookworm's version as a version that's newer than the one in unstable,
> stashing its binaries under build/localudebs and building say a
> netboot(-gtk) image should be a quick way to test that hypothesis.

With a local build, confirmed -3 is buggy, and that reverting only
busybox-udeb to -1 is sufficient to restore syslog support in the
installer.

Reassigning there; the GRUB thing can be filed separately once we have
actual information via syslog.


Cheers,
-- 
Cyril Brulebois (k...@debian.org)
D-I release manager -- Release team member -- Freelance Consultant


signature.asc
Description: PGP signature


Processed: Re: Bug#1039710: debian-installer: Grub installation fails and /var/log/syslog is empty

2023-06-28 Thread Debian Bug Tracking System
Processing control commands:

> reassign -1 busybox-udeb 1:1.36.1-3
Bug #1039710 [debian-installer] debian-installer: Grub installation fails and 
/var/log/syslog is empty
Bug reassigned from package 'debian-installer' to 'busybox-udeb'.
Ignoring request to alter found versions of bug #1039710 to the same values 
previously set
Ignoring request to alter fixed versions of bug #1039710 to the same values 
previously set
Bug #1039710 [busybox-udeb] debian-installer: Grub installation fails and 
/var/log/syslog is empty
Marked as found in versions busybox/1:1.36.1-3.

-- 
1039710: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1039710
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#1039710: debian-installer: Grub installation fails and /var/log/syslog is empty

2023-06-28 Thread Cyril Brulebois
Roland Clobus  (2023-06-28):
> My findings so far:
> * The command line arguments of syslogd and klogd (both from Busybox) have not
> changed between Bookworm and Trixie.
> * At the moment of the failure, the /var/log folder contains only 3 files [3]:
> syslog (a single line, stating that syslog was started from Busybox [4]),
> partman and Xorg.0.log
> * When running `logger`, an entry should have been created in /var/log/syslog,
> but that doesn't happen. The netinst image from Bookworm works correctly.
> * Possibly relevant packages that have been updated: busybox, libc, linux-
> image, bsdutils

busybox seems to me like the most likely suspect here. deb-reversion'ing
bookworm's version as a version that's newer than the one in unstable,
stashing its binaries under build/localudebs and building say a
netboot(-gtk) image should be a quick way to test that hypothesis.


Cheers,
-- 
Cyril Brulebois (k...@debian.org)
D-I release manager -- Release team member -- Freelance Consultant


signature.asc
Description: PGP signature


Bug#1039710: debian-installer: Grub installation fails and /var/log/syslog is empty

2023-06-28 Thread Roland Clobus
Package: debian-installer
Version: daily build 2023-06-28T05:19Z
Severity: grave
Tags: d-i
Justification: renders package unusable
X-Debbugs-Cc: p...@hands.com

Hello Debian-installer maintainers,

On openQA [1] the installation tests with the latest netinst image [2] fail,
because GRUB cannot install.
I've tried to look a bit deeper into the issue, but I cannot proceed, because
/var/log/syslog is empty. So effectively there are possibly two issues in this
report:
1) Failure in grub
2) No logging to /var/log/syslog

My findings so far:
* The command line arguments of syslogd and klogd (both from Busybox) have not
changed between Bookworm and Trixie.
* At the moment of the failure, the /var/log folder contains only 3 files [3]:
syslog (a single line, stating that syslog was started from Busybox [4]),
partman and Xorg.0.log
* When running `logger`, an entry should have been created in /var/log/syslog,
but that doesn't happen. The netinst image from Bookworm works correctly.
* Possibly relevant packages that have been updated: busybox, libc, linux-
image, bsdutils

With kind regards,
Roland Clobus

[1] https://openqa.debian.net/tests/167456
[2] https://get.debian.org/images/daily-builds/daily/arch-latest/amd64/iso-
cd/debian-testing-amd64-netinst.iso
[3] https://openqa.debian.net/tests/167456/file/grub-var_log.tar
[4] https://openqa.debian.net/tests/167456/logfile?filename=DI_syslog.txt

PS: Attached system information is from my personal computer, not the installed
system


-- System Information:
Debian Release: 12.0
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'testing-debug')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 6.1.0-9-amd64 (SMP w/8 CPU threads; PREEMPT)
Locale: LANG=en_GB.UTF-8, LC_CTYPE=en_GB.UTF-8 (charmap=UTF-8), 
LANGUAGE=en_GB:en
Shell: /bin/sh linked to /usr/bin/dash
Init: systemd (via /run/systemd/system)
LSM: AppArmor: enabled