Bug#970678: Network preseeding using http is broken

2020-09-21 Thread Ben Hutchings
On Mon, 2020-09-21 at 20:42 +0100, Ben Hutchings wrote:
> On Mon, 2020-09-21 at 17:43 +0200, Martin Samuelsson wrote:
> > Philip Hands @ 2020-09-21 (Monday), 15:30 (+0200)
> > > Martin Samuelsson  writes:
> > > 
> > > Just to be clear on this point, are you saying [...]
> > 
> > I'm saying there is no /dev/fd/ at all on current daily debian-installer 
> > images and hasn't been since at least 20200818 (which was the oldest one I 
> > could try with current kernel modules).
> [...]
> 
> Then we should investigate and fix that instead of removing code that
> uses it.

It's a udev regression, bug #967546.

Ben.

-- 
Ben Hutchings
Anthony's Law of Force: Don't force it, get a larger hammer.



signature.asc
Description: This is a digitally signed message part


Bug#970678: Network preseeding using http is broken

2020-09-21 Thread Martin Samuelsson

Philip Hands @ 2020-09-21 (Monday), 22:38 (+0200)


BTW The example I pasted was just busybox running on my laptop running
full Debian, so was not supposed to be demonstrating it working under
d-i.


I could likely have been more precise from the beginning about the exact 
cause. Sorry for making you required to ask for clarification.



This mixes the stdout of wget with the %OK% %EOF% stuff, and then puts
it all into the sed, which seems flawed.


And that is why you're a Debian developer and I'm merely a user. I agree.


I'd have thought something like this would do the trick (not tested yet):

local RETVAL=$(
{
  {
wget "$@" 2>&1 >/dev/null && echo 0 >&3
  } | {
grep -q "$file_not_found_pattern" && echo 4 >&3
  } || echo 1 >&3
} 3>&1 | head -1
)


Looks good to me. As in, this code I can actually follow and understand.  
Which was a bit of a stretch for me to claim about for the sed construct.


I don't really understand why the tail would be needed, but am not objecting 
to it.


Replacing the RETVAL assignment expression as above works for me. Both from 
debian-installer and when running this simple test script in a shell:


 #!/bin/sh
 fetch-url http://pxeserver./preseed/base.txt base.txt >/dev/null
 [ $? = 0 ] || exit
 echo "Existing file downloaded"

 fetch-url http://pxeserver./non-existant.txt non-existant.txt >/dev/null
 [ $? = 4 ] || exit
 echo "Non-existing file returned 404"

 fetch-url http://invalid-host./whatever.txt invalid-host.txt >/dev/null
 [ $? = 1 ] || exit
 echo "Unresolvable host return general error"


One could of course use stderr for that (>&2 ... 2>&1) for getting the
echo-ed return codes out, rather than fd #3, but I think this is clearer
(since one really isn't trying to produce stderr output) and AFAIK it
should work fine even if /dev/fd/ is missing.


I agree. Nicely done!


I suspect there may be a way of getting this to work without the need
for a local variable and the echoing for the result codes, so I will
ponder on that...


Maybe it is, and if so it might include early return statements and thus 
save spawning grep on successful fetches. I reckon you could be happy and 
proud with the code you've posted already though.



BTW I just saw Ben's comment that we should just fix the missing /dev/fd
which strikes me as entirely sensible.


It's entirely sensible to also attempt fixing the root cause, but my opinion 
is that it wouldn't hurt to treat missing fd-nodes and an overcomplex 
wget404() as two separate issues. Please feel free to fork the bug report.


One could also envision adding a test case for detecting missing fd:s within 
some test suite for d-i. Are there any tests run on nightly builds, which 
could help catching regressions like these?

--
/Martin



Bug#970678: Network preseeding using http is broken

2020-09-21 Thread Philip Hands
Martin Samuelsson  writes:

> I'm saying there is no /dev/fd/ at all

Oh, fair enough.  That's odd.

BTW The example I pasted was just busybox running on my laptop running
full Debian, so was not supposed to be demonstrating it working under
d-i.

...
> --- http.orig 2020-09-21 17:21:24.159480072 +0200
> +++ http.simplified   2020-09-21 17:21:03.951356698 +0200
> @@ -14,10 +14,10 @@
>  
>   local RETVAL=$( {
>   echo 1
> - wget "$@" 2>&1 >&3 && echo %OK%
> + wget "$@" 2>&1 && echo %OK%
>   echo %EOF%
> - } | ( sed -ne 
> '1{h;d};/'"$file_not_found_pattern"'/{p;s/.*/4/;h;d};/^%OK%$/{s/.*/0/;h;d};$!p;$x;$w
>  /dev/fd/4' >&2 ) 4>&1
> - ) 3>&1
> + } | ( sed -ne 
> '1{h;d};/'"$file_not_found_pattern"'/{p;s/.*/4/;h;d};/^%OK%$/{s/.*/0/;h;d};$!p;$x;$p
>  '>&2 )
> + ) 
>   return $RETVAL
>   }

This mixes the stdout of wget with the %OK% %EOF% stuff, and then puts
it all into the sed, which seems flawed.

One could replace the >&3 with >/dev/null, and keep the sed, but if one
isn't trying to preserve the wget output I don't see the point of
keeping the sed at all.

I'd have thought something like this would do the trick (not tested yet):

local RETVAL=$(
 {
   {
 wget "$@" 2>&1 >/dev/null && echo 0 >&3
   } | {
 grep -q "$file_not_found_pattern" && echo 4 >&3
   } || echo 1 >&3
 } 3>&1 | head -1
)

(the patterns will need to be tweaked to take account of sed vs. grep)

One could of course use stderr for that (>&2 ... 2>&1) for getting the
echo-ed return codes out, rather than fd #3, but I think this is clearer
(since one really isn't trying to produce stderr output) and AFAIK it
should work fine even if /dev/fd/ is missing.

I suspect there may be a way of getting this to work without the need
for a local variable and the echoing for the result codes, so I will
ponder on that...

BTW I just saw Ben's comment that we should just fix the missing /dev/fd
which strikes me as entirely sensible. Even if nothing uses /dev/fd/ in
d-i (other than here) it seems wrong to simply ignore the fact that it's
missing, since people might well try to use it in preseed scripts etc.

Cheers, Phil.
-- 
|)|  Philip Hands  [+44 (0)20 8530 9560]  HANDS.COM Ltd.
|-|  http://www.hands.com/http://ftp.uk.debian.org/
|(|  Hugo-Klemm-Strasse 34,   21075 Hamburg,GERMANY


signature.asc
Description: PGP signature


Bug#970678: Network preseeding using http is broken

2020-09-21 Thread Ben Hutchings
On Mon, 2020-09-21 at 17:43 +0200, Martin Samuelsson wrote:
> Philip Hands @ 2020-09-21 (Monday), 15:30 (+0200)
> > Martin Samuelsson  writes:
> > 
> > Just to be clear on this point, are you saying [...]
> 
> I'm saying there is no /dev/fd/ at all on current daily debian-installer 
> images and hasn't been since at least 20200818 (which was the oldest one I 
> could try with current kernel modules).
[...]

Then we should investigate and fix that instead of removing code that
uses it.

Ben.

-- 
Ben Hutchings
The Peter principle: In a hierarchy, every employee tends to rise to
their level of incompetence.




signature.asc
Description: This is a digitally signed message part


Bug#970678: Network preseeding using http is broken

2020-09-21 Thread Martin Samuelsson

Geert Stappers @ 2020-09-21 (Monday), 17:18 (+0200)


Under which circumstance does the bug shows itself?


As far as I understand /dev/fd seems to be completely missing. Haven't dug 
into it. For what its worth, it seems /proc/self/fd is still available. I 
did experiment with redirecting sed there instead before realizing it wasn't 
really necessary to retain wget's output.

--
/Martin



Bug#970678: Network preseeding using http is broken

2020-09-21 Thread Martin Samuelsson

Philip Hands @ 2020-09-21 (Monday), 15:30 (+0200)

Martin Samuelsson  writes:

Just to be clear on this point, are you saying [...]


I'm saying there is no /dev/fd/ at all on current daily debian-installer 
images and hasn't been since at least 20200818 (which was the oldest one I 
could try with current kernel modules).


Missing fd:s is the root cause, and a different issue which I hope someone 
else is addressing in case it is a bug. (Nothing else in d-i failed for me 
due to it.)



To illustrate this, here's some output from busybox shell, running on my
laptop:


Yet those examples are not from the context of a recent debian-installer 
nightly image, right? 


If you noticed that it's not there when e.g. simply listing the contents
of /dev/fd then that's normal AFAIK.


The issue I reported is that wget404() is unnecessarily complex and fragile, 
for no reasonable reason. (You've stated so yourself in the README.wget404)


A suggested fix for /usr/lib/fetch-url/http looks as below:

--- http.orig   2020-09-21 17:21:24.159480072 +0200
+++ http.simplified 2020-09-21 17:21:03.951356698 +0200
@@ -14,10 +14,10 @@

local RETVAL=$( {
echo 1
-   wget "$@" 2>&1 >&3 && echo %OK%
+   wget "$@" 2>&1 && echo %OK%
echo %EOF%
-   } | ( sed -ne 
'1{h;d};/'"$file_not_found_pattern"'/{p;s/.*/4/;h;d};/^%OK%$/{s/.*/0/;h;d};$!p;$x;$w 
/dev/fd/4' >&2 ) 4>&1
-   ) 3>&1
+   } | ( sed -ne 
'1{h;d};/'"$file_not_found_pattern"'/{p;s/.*/4/;h;d};/^%OK%$/{s/.*/0/;h;d};$!p;$x;$p 
'>&2 )
+		) 
		return $RETVAL

}

I just tried patching as above at the first of the interactive debug prompts 
when setting the DEBUG bootflag high. Installation succeeded, and fetch-url 
still returns 0 on success, 1 on general failure and 4 on http not found.  
The only difference is that any output from wget is discarded already inside 
wget404() rather than by the call to wget404().


It seems me mentioning the issue on irc made you fix typos in the README.  
The preservation of wget's stdout and stderr looks fenomenal, but it fills 
no practical purpose. Please consider killing your darling. Simple code is 
beautiful and robust code.

--
/Martin



Bug#970678: Network preseeding using http is broken

2020-09-21 Thread Geert Stappers
On Mon, Sep 21, 2020 at 03:30:27PM +0200, Philip Hands wrote:
> Martin Samuelsson  writes:
> 
> > Booting the installer with DEBCONF_DEBUG=5 and debuging /bin/preseed_fetch, 
> > /bin/fetch-url and /usr/lib/fetch_url/http shows that wget404() in the 
> > latter is what's failing. It seems the pipeline fails since /dev/fd/4 does 
> > not exist.
> 
> Just to be clear on this point, are you saying that /dev/fd/4 does not
> exist when you look for it in a shell, or rather specifically when doing
> so in a context where it is in use?
> 
> If you noticed that it's not there when e.g. simply listing the contents
> of /dev/fd then that's normal AFAIK.
> 
> To illustrate this, here's some output from busybox shell, running on my
> laptop:
> 
> =-=-=-=-
> ~ $ echo /dev/fd/*
> /dev/fd/0 /dev/fd/1 /dev/fd/10 /dev/fd/2 /dev/fd/3
> ~ $ ( echo /dev/fd/* ) 4>&1
> /dev/fd/0 /dev/fd/1 /dev/fd/10 /dev/fd/2 /dev/fd/3 /dev/fd/4
> ~ $ ( echo /dev/fd/* ) 4>&1 5>&1
> /dev/fd/0 /dev/fd/1 /dev/fd/10 /dev/fd/2 /dev/fd/3 /dev/fd/4 /dev/fd/5
> ~ $  ( echo /dev/fd/* ) 6>&1
> /dev/fd/0 /dev/fd/1 /dev/fd/10 /dev/fd/2 /dev/fd/3 /dev/fd/6
> ~ $
> =-=-=-=-
> 


On Mon, Sep 21, 2020 at 02:58:36PM +0200, Philip Hands wrote:
 ...
> 
> The fact that /dev/fd/4 is missing does seem to be a separate bug, but a
> quick grep -lr suggests that this is the only place it's used in d-i, so
> perhaps a bug we need not worry about too much.
> 


Under which circumstance does the bug shows itself?



Regards
Geert Stappers
-- 
Silence is hard to parse



Bug#970678: Network preseeding using http is broken

2020-09-21 Thread Philip Hands
Martin Samuelsson  writes:

> Booting the installer with DEBCONF_DEBUG=5 and debuging /bin/preseed_fetch, 
> /bin/fetch-url and /usr/lib/fetch_url/http shows that wget404() in the 
> latter is what's failing. It seems the pipeline fails since /dev/fd/4 does 
> not exist.

Just to be clear on this point, are you saying that /dev/fd/4 does not
exist when you look for it in a shell, or rather specifically when doing
so in a context where it is in use?

If you noticed that it's not there when e.g. simply listing the contents
of /dev/fd then that's normal AFAIK.

To illustrate this, here's some output from busybox shell, running on my
laptop:

=-=-=-=-
~ $ echo /dev/fd/*
/dev/fd/0 /dev/fd/1 /dev/fd/10 /dev/fd/2 /dev/fd/3
~ $ ( echo /dev/fd/* ) 4>&1
/dev/fd/0 /dev/fd/1 /dev/fd/10 /dev/fd/2 /dev/fd/3 /dev/fd/4
~ $ ( echo /dev/fd/* ) 4>&1 5>&1
/dev/fd/0 /dev/fd/1 /dev/fd/10 /dev/fd/2 /dev/fd/3 /dev/fd/4 /dev/fd/5
~ $  ( echo /dev/fd/* ) 6>&1
/dev/fd/0 /dev/fd/1 /dev/fd/10 /dev/fd/2 /dev/fd/3 /dev/fd/6
~ $
=-=-=-=-

Cheers, Phil.
-- 
|)|  Philip Hands  [+44 (0)20 8530 9560]  HANDS.COM Ltd.
|-|  http://www.hands.com/http://ftp.uk.debian.org/
|(|  Hugo-Klemm-Strasse 34,   21075 Hamburg,GERMANY


signature.asc
Description: PGP signature


Bug#970678: Network preseeding using http is broken

2020-09-21 Thread Philip Hands
Martin Samuelsson  writes:

...
> Reading README.wget404[1] clearly states this output redirection dance is 
> never actually used, and that this convoluted expression merely exists 
> because it could possibly-maybe be useful some day. As far as I can see the 
> callers of wget404() does indeed not use its output.
>
> Given that this construct does cause real world problems, without adding any 
> actual value, how would you consider having wget404() simplified to simply 
> consume the error message and thus skip all usage of &3 and &4?

Seems fair enough to me -- if the wget output is actually needed at some
point in the future, one can always get hold of the sed incantation from
git.

That being the case, a warning could be added to the modified wget404 to
say that it now discards output, and that if that's a problem one should
look at  for a version that manages to preserve it, as
long as /dev/fd/4 is available.

The fact that /dev/fd/4 is missing does seem to be a separate bug, but a
quick grep -lr suggests that this is the only place it's used in d-i, so
perhaps a bug we need not worry about too much.

Cheers, Phil.
-- 
|)|  Philip Hands  [+44 (0)20 8530 9560]  HANDS.COM Ltd.
|-|  http://www.hands.com/http://ftp.uk.debian.org/
|(|  Hugo-Klemm-Strasse 34,   21075 Hamburg,GERMANY


signature.asc
Description: PGP signature


Bug#970678: Network preseeding using http is broken

2020-09-21 Thread Martin Samuelsson

Package: debian-installer
Version: 20200920

Debian installer fails to fetch preseed files over http.

How to reproduce:

Boot the installer with url=http://pxeserver./example.txt

Where example.txt contains:
d-i preseed/include string something.txt \
   other.txt \
   more.txt

The installer will fail to fetch at least one of the required files, and 
catching pcaps will show no fetch attempts are made.


Booting the installer with DEBCONF_DEBUG=5 and debuging /bin/preseed_fetch, 
/bin/fetch-url and /usr/lib/fetch_url/http shows that wget404() in the 
latter is what's failing. It seems the pipeline fails since /dev/fd/4 does 
not exist.


Changing /dev/fd/4 to /dev/null is a workaround making the problem go away.

Reading README.wget404[1] clearly states this output redirection dance is 
never actually used, and that this convoluted expression merely exists 
because it could possibly-maybe be useful some day. As far as I can see the 
callers of wget404() does indeed not use its output.


Given that this construct does cause real world problems, without adding any 
actual value, how would you consider having wget404() simplified to simply 
consume the error message and thus skip all usage of &3 and &4?


[1]: 
https://sources.debian.org/src/debian-installer-utils/1.133/README.wget404/#L119