from:"Elliott Mitchell"

Bug#1071480: libldap: sends some IPv6 addresses as server name

2024-05-20 Thread Elliott Mitchell

On Mon, May 20, 2024 at 04:25:57PM -0700, Quanah Gibson-Mount wrote:
> 
> --On Monday, May 20, 2024 3:45 PM -0700 Elliott Mitchell 
>  wrote:
> 
> Side note - I did raise this issue with the rest of the OpenLDAP project, 
> and Howard noted:
> 
> "DNS names are required to begin with a letter. RFC 1035, sec 2.3.1. The 
> fact that gnutls allows names that are all numeric is certainly their bug".
> 
> So I guess two bugs here.

According to what I found, that requirement was removed (RFC 1123,
section 2.1 relaxed it to allow a leading digit).  This doesn't
invalidate the fact that no top-level domain consists exclusively of
numbers (in fact I'm pretty sure none have any numbers).

I'm proposing checking only for nul-characters and passing everything
else through.  Principle being anything handling SNI must handle the
case of a string which fails to match a known entry.  If a server program
chose to honor strings which violate RFC 6066, GnuTLS doesn't need to get
in the way of that.  Simply terminating the connection really isn't too
helpful (it could simply be a bug).
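
A minimal sketch of the check proposed above, assuming the name arrives as
a buffer with an explicit length (as TLS extension data does).  The name
`sni_has_embedded_nul` is hypothetical and is not part of GnuTLS:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a GnuTLS function.
 * Returns 1 if the length-delimited name contains a NUL byte, else 0.
 * Everything else is passed through for the server to match or ignore. */
static int sni_has_embedded_nul(const unsigned char *name, size_t len)
{
	return len > 0 && memchr(name, '\0', len) != NULL;
}
```

A receiver using this would reject only names for which the function
returns 1 and treat any other unknown string as "no matching entry".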

-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1071480: libldap: sends some IPv6 addresses as server name

2024-05-20 Thread Elliott Mitchell

On Mon, May 20, 2024 at 12:46:34PM -0700, Ryan Tandy wrote:
> However, I tested your patch, and I'm not sure it's correct.
> 
> If the IPv6 address contains a letter a-f before the first colon, I 
> think the code you changed is never reached. On seeing the first 
> non-digit, we break the loop with numeric=0, and never reach the colon.
> 
> Have I missed something?
> 
> I would appreciate if you would pursue this issue upstream. If the fix 
> needs further review or discussion with the upstream developers, I'd 
> really rather not be a middleman in that conversation.

No, you haven't missed something.  %-)  Turns out I goofed when reading
the loop.  Indeed the `if(!isdigit(*c)) {` needs to have the `break;`
removed too, then it will work.

The person writing the loop was thinking of the most commonly used block
of IPv6 addresses which start with "2001:".  Yet IPv6 is hexadecimal and
"fd00::/8" is part of a validly used block.
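
The corrected scan can be sketched as a standalone function.  This is an
illustration only: in tls2.c the logic is inline in
`ldap_int_tls_connect()`, the body after the `isdigit()` test is assumed
from the description quoted above, and the name `sni_for_host` is
hypothetical:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical standalone version of the check.
 * Returns NULL when host looks like a literal address (no SNI should be
 * sent), otherwise returns host unchanged. */
static const char *sni_for_host(const char *host)
{
	int numeric = 1;
	const unsigned char *c;

	for (c = (const unsigned char *)host; *c; c++) {
		if (*c == ':')		/* IPv6 address */
			return NULL;
		if (*c == '.')
			continue;
		if (!isdigit(*c))
			numeric = 0;	/* no break: keep scanning for ':' */
	}
	return numeric ? NULL : host;	/* digits and dots only: IPv4 */
}
```

With the `break;` gone, "fd12:3456:7890:abcd::3" reaches its first colon
and is rejected, while "ldap.example.com" is still returned for SNI.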



On Mon, May 20, 2024 at 01:13:11PM -0700, Quanah Gibson-Mount wrote:
> 
> --On Monday, May 20, 2024 1:46 PM -0700 Ryan Tandy  wrote:
> 
> > Control: tag -1 upstream moreinfo
> >
> > Hi Elliott, thank you for investigating this issue and contributing a
> > patch.
> 
> [snip]
> 
> > I would appreciate if you would pursue this issue upstream. If the fix
> > needs further review or discussion with the upstream developers, I'd
> > really rather not be a middleman in that conversation.
> 
> Upstream generally does not accept 3rd party patch contributions, so asking 
> debian to contribute it wil likely result in it not being accepted.  So 
> it's better to work directly with the OpenLDAP project.  I'd start by 
> filing an issue in the issue tracker if one doesn't already exist:
> 
> https://bugs.openldap.org
> 
> and then apply for a gitlab account with the OpenLDAP project:
> 
> https://git.openldap.org
> 
> After the account is approved, you can open a PR to have your patch 
> evaluated.

Debian policy for maintainers is they're required to take care of pushing
issues upstream.  I didn't want to deal with the OpenLDAP bug tracker and
those steps, so pushing to the Debian project seemed handiest.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1071480: libldap: sends some IPv6 addresses as server name

2024-05-19 Thread Elliott Mitchell

Seems there were two bugs in #1070033.  The part for OpenLDAP is pretty
simple.  When detecting an IPv6 address (via ':' in the string),
the function `ldap_int_tls_connect()` triggers a `break;`, but this
requires `numeric=1` to still be in effect.  Since IPv6 addresses are
hexadecimal, this isn't always true.
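
The failure mode can be sketched as a standalone function.  This is a
reconstruction for illustration: the lines following the `isdigit()` test
are assumed rather than copied from tls2.c, and the name
`old_scan_suppresses_sni` is hypothetical:

```c
#include <ctype.h>

/* Hypothetical reconstruction of the pre-patch scan.
 * Returns 1 if the host would be treated as numeric (SNI suppressed),
 * 0 if the host would be sent as SNI. */
static int old_scan_suppresses_sni(const char *host)
{
	int numeric = 1;
	const unsigned char *c;

	for (c = (const unsigned char *)host; *c; c++) {
		if (*c == ':')		/* IPv6 address */
			break;		/* only helps while numeric is still 1 */
		if (*c == '.')
			continue;
		if (!isdigit(*c)) {
			numeric = 0;
			break;		/* leaves before any ':' is seen */
		}
	}
	return numeric;
}
```

Under these assumptions "2001:db8::1" is suppressed because only digits
precede the first colon, whereas "fd12:3456:7890:abcd::3" exits at the
leading 'f' and gets sent as SNI.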

Patch attached.  Given how small it is, any license acceptable to the
Debian project is acceptable to me.  I'll let the maintainer forward it
to the OpenLDAP project.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445


From: Elliott Mitchell 
Date: Sun, 19 May 2024 09:49:36 -0700
Subject: [PATCH] tls: fix handling of numeric IPv6 addresses for SNI

A colon in the SNI is a strong indicator of an IPv6 address.  Since IPv6
addresses are hexadecimal, `numeric` may already be false and falling
through to the test doesn't work.  Address this by preemptively setting
`sni` to invalid (NULL).

Fixes: b8f34888 ("ITS#9176 check for numeric addrs before passing SNI")
---
 libraries/libldap/tls2.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/libraries/libldap/tls2.c b/libraries/libldap/tls2.c
index f9dcbfc8d..d433e6508 100644
--- a/libraries/libldap/tls2.c
+++ b/libraries/libldap/tls2.c
@@ -399,8 +399,10 @@ ldap_int_tls_connect( LDAP *ld, LDAPConn *conn, const char *host )
 		int numeric = 1;
 		unsigned char *c;
 		for ( c = (unsigned char *)sni; *c; c++ ) {
-			if ( *c == ':' )	/* IPv6 address */
+			if ( *c == ':' ) {	/* IPv6 address */
+				sni = NULL;
 				break;
+			}
 			if ( *c == '.' )
 				continue;
 			if ( !isdigit( *c )) {
-- 
2.39.2

Bug#1070033: libgnutls30: rejects numeric IPv6 addresses during connection

2024-05-18 Thread Elliott Mitchell

On Sat, May 18, 2024 at 10:47:55AM +0200, Andreas Metzler wrote:
> On 2024-05-18 Elliott Mitchell  wrote:
> > On Sat, May 18, 2024 at 08:16:25AM +0200, Andreas Metzler wrote:
> [...]
> 
> >> You seem to argue that it is major problem for a gnutls client to *send*
> >> e.g. "127.0.0.1" as SNI. My point is that this is not a problem but at
> >> most uncomely since client-side certificate verification will fail.
> >> Even for a trusted certificate name checking is done (if gnutls is
> >> correctly used). And this will not succeed if the CN or SAN is an IP
> >> address. (I have tried with test certificates and gnutls-cli/-serv. My
> >> testing might be flawed of course.)
> 
> > This is purely hypothetical since this case isn't being observed.
> 
> > What #1070033 is about is, a program was configured to directly connect
> > to a server via IPv6.  This address was provided to libgnutls.  libgnutls
> > sent the provided address to the server as SNI without verifying it was
> > valid for SNI.
> 
> > The usual approach is be conservative in what you send, but liberal in
> > what you accept.  This means libgnutls needs to check whether what is
> > provided is acceptable before sending it, but the server side could
> > allow an IP address which violates RFC 6066.
> 
> > `gnutls-cli` is a very poor simulcrum for this case.  `gnutls-cli` does
> > lots of checking which specialized clients may skip.  `gnutls-cli` also
> > assumes name service is fully available.  Whereas `nslcd` cannot rely on
> > name service being operational as it may provide name service.

> Let's assume
> a) _gnutls_server_name_send_params() was changed to reject
>e.g. "127.0.0.1"[1] and
> b) this stopped libgnutls from sending "127.0.0.1" to the server as SNI.
> 
> How would this help you, or how is this related to this bug report? In
> this bug report perhaps an IPv6 address was used which is already
> rejected by _gnutls_server_name_send_params().

This is not something I proposed and indeed this wouldn't help me.

_gnutls_server_name_recv_params() does some rough filtering which catches
IPv6 addresses, but not IPv4 addresses.

_gnutls_server_name_send_params() does NO filtering and thus sends both
IPv4 and IPv6 addresses.

libgnutls is being conservative in what it accepts, but liberal in what
it sends.  This breaks interoperability.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1070033: libgnutls30: rejects numeric IPv6 addresses during connection

2024-05-18 Thread Elliott Mitchell

On Sat, May 18, 2024 at 08:16:25AM +0200, Andreas Metzler wrote:
> On 2024-05-18 Elliott Mitchell  wrote:
> > On Sat, May 18, 2024 at 07:40:13AM +0200, Andreas Metzler wrote:
> >> On 2024-05-18 Elliott Mitchell  wrote:
> >>> On Sat, May 18, 2024 at 06:55:06AM +0200, Andreas Metzler wrote:
> [...]
> >>>> Afaict it is a short-cut to save more expensive processing for obvious
> >>>> errors. gnutls_session_get_verify_cert_status() (with
> >>>> gnutls_session_set_verify_cert() set correctly) or
> >>>> gnutls_x509_crt_check_hostname()/gnutls_certificate_verify_peers3()
> >>>> does more elaborate stuff on the data,
> >>>> gnutls_certificate_verify_peers2() requires a separate
> >>>> gnutls_x509_crt_check_hostname().
> 
> >>> Which seems to argue the more urgent issue is
> >>> _gnutls_server_name_send_params() needs to do checking of the provided
> >>> server hostname before sending it as SNI.
> 
> >> Why is this urgent or even relevant? Certificate checking (client-side)
> >> will not accept IP adresses as SNI field.
> 
> > Not relevant.  If the certificate comes from a local file, it is assumed
> > trusted.  If the certificate comes from the server, then it is only
> > available *after* connection and the SNI has already been sent.
> [...]

> You seem to argue that it is major problem for a gnutls client to *send*
> e.g. "127.0.0.1" as SNI. My point is that this is not a problem but at
> most uncomely since client-side certificate verification will fail.
> Even for a trusted certificate name checking is done (if gnutls is
> correctly used). And this will not succeed if the CN or SAN is an IP
> address. (I have tried with test certificates and gnutls-cli/-serv. My
> testing might be flawed of course.)

This is purely hypothetical since this case isn't being observed.

What #1070033 is about is, a program was configured to directly connect
to a server via IPv6.  This address was provided to libgnutls.  libgnutls
sent the provided address to the server as SNI without verifying it was
valid for SNI.

The usual approach is to be conservative in what you send, but liberal in
what you accept.  This means libgnutls needs to check whether what is
provided is acceptable before sending it, but the server side could
allow an IP address which violates RFC 6066.

`gnutls-cli` is a very poor simulacrum for this case.  `gnutls-cli` does
lots of checking which specialized clients may skip.  `gnutls-cli` also
assumes name service is fully available.  Whereas `nslcd` cannot rely on
name service being operational as it may provide name service.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1070033: libgnutls30: rejects numeric IPv6 addresses during connection

2024-05-18 Thread Elliott Mitchell

On Sat, May 18, 2024 at 07:40:13AM +0200, Andreas Metzler wrote:
> On 2024-05-18 Elliott Mitchell  wrote:
> > On Sat, May 18, 2024 at 06:55:06AM +0200, Andreas Metzler wrote:
> [...]
> > > > > I notice the `_gnutls_dnsname_is_valid()` function in
> > > > > gnutls28-3.8.5/lib/str.h accepts IPv4 addresses (which are NOT valid 
> > > > > in
> > > > > DNS), but rejects IPv6 addresses.
> 
> > > At a very bare level an IPv4 address is a valid DNS name (alnum, dashes,
> > > and dots), an IPv6 adress is not. That is what gnutls is checking here.
> 
> > No, there isn't any IPv4 address which is a valid DNS name.  No top-level
> > domain consists purely of decimal digits, whereas IPv4 addresses consist
> > of purely decimal digits.  In fact I don't believe there are any
> > top-level domains which have even a single decimal digit in them.

> which is totally irrelevant if my reading (quoted below) that this is
> not a policy check but a performance optimization is correct.

Most recent change to the line, commit 71d921edc4:

  Add GNUTLS_E_RECEIVED_DISALLOWED_NAME for illegal SNI names

  An illegal/disallowed SNI server name previously generated
  the misleading message "An illegal parameter has been received.".

  This commit changes it to
"A disallowed SNI server name has been received.".

That commit message clearly indicates the author was thinking of it as a
policy check.

> > > Afaict it is a short-cut to save more expensive processing for obvious
> > > errors. gnutls_session_get_verify_cert_status() (with
> > > gnutls_session_set_verify_cert() set correctly) or
> > > gnutls_x509_crt_check_hostname()/gnutls_certificate_verify_peers3()
> > > does more elaborate stuff on the data,
> > > gnutls_certificate_verify_peers2() requires a separate
> > > gnutls_x509_crt_check_hostname().
> 
> > Which seems to argue the more urgent issue is
> > _gnutls_server_name_send_params() needs to do checking of the provided
> > server hostname before sending it as SNI.
> 
> Why is this urgent or even relevant? Certificate checking (client-side)
> will not accept IP adresses as SNI field.

Not relevant.  If the certificate comes from a local file, it is assumed
trusted.  If the certificate comes from the server, then it is only
available *after* connection and the SNI has already been sent.

The issue is libgnutls's API requires providing the library with the
server being connected to.  libgnutls then assumes the provided server
can be used for SNI, which is untrue (in this case IP addresses violate
RFC 6066).


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1070033: libgnutls30: rejects numeric IPv6 addresses during connection

2024-05-17 Thread Elliott Mitchell

On Sat, May 18, 2024 at 06:55:06AM +0200, Andreas Metzler wrote:
> On 2024-05-17 Elliott Mitchell  wrote:
> > On Thu, May 16, 2024 at 07:06:49PM -0700, Elliott Mitchell wrote:
> > > On Tue, May 14, 2024 at 06:22:09PM +0200, Andreas Metzler wrote:
> [...]
> > > > Could you please post the requested output, although there are no
> > > > obvious clues there to your eyes?
> > > 
> > > Problem is that provides rather a lot of data about this network setup.
> > > The quantity of information is enough for me to be rather uncomfortable
> > > with providing it via public channel.
> [...]
> 
> > > I notice the `_gnutls_dnsname_is_valid()` function in
> > > gnutls28-3.8.5/lib/str.h accepts IPv4 addresses (which are NOT valid in
> > > DNS), but rejects IPv6 addresses.

> At a very bare level an IPv4 address is a valid DNS name (alnum, dashes,
> and dots), an IPv6 adress is not. That is what gnutls is checking here.

No, there isn't any IPv4 address which is a valid DNS name.  No top-level
domain consists purely of decimal digits, whereas IPv4 addresses consist
of purely decimal digits.  In fact I don't believe there are any
top-level domains which have even a single decimal digit in them.
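
To make the disagreement concrete, here is a rough sketch of a bare
character-class check (alnum, dashes, and dots, as described in the quoted
text) next to a test for an all-digit final label.  Both function names
are hypothetical and neither is the GnuTLS implementation:

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: accepts any string made only of alnum, '-' and '.'. */
static int charclass_name_ok(const char *s)
{
	for (; *s; s++) {
		unsigned char ch = (unsigned char)*s;
		if (!(isalnum(ch) || ch == '-' || ch == '.'))
			return 0;
	}
	return 1;
}

/* Hypothetical: returns 1 if the final label is non-empty and consists
 * only of decimal digits, which no top-level domain does. */
static int last_label_all_digits(const char *s)
{
	const char *dot = strrchr(s, '.');
	const char *label = dot ? dot + 1 : s;

	if (*label == '\0')
		return 0;
	for (; *label; label++)
		if (!isdigit((unsigned char)*label))
			return 0;
	return 1;
}
```

The first check passes "127.0.0.1" and fails any IPv6 literal because of
the colons; the second flags "127.0.0.1" but not "ldap.example.com".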

> Afaict it is a short-cut to save more expensive processing for obvious
> errors. gnutls_session_get_verify_cert_status() (with
> gnutls_session_set_verify_cert() set correctly) or
> gnutls_x509_crt_check_hostname()/gnutls_certificate_verify_peers3()
> does more elaborate stuff on the data,
> gnutls_certificate_verify_peers2() requires a separate
> gnutls_x509_crt_check_hostname().

Which seems to argue the more urgent issue is
_gnutls_server_name_send_params() needs to do checking of the provided
server hostname before sending it as SNI.

I've got an initial implementation of this here, but I'm left wondering
how far verification should go.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1070033: libgnutls30: rejects numeric IPv6 addresses during connection

2024-05-17 Thread Elliott Mitchell

On Thu, May 16, 2024 at 07:06:49PM -0700, Elliott Mitchell wrote:
> On Tue, May 14, 2024 at 06:22:09PM +0200, Andreas Metzler wrote:
> > On 2024-05-14 Elliott Mitchell  wrote:
> > > On Wed, May 01, 2024 at 01:45:00PM +0200, Andreas Metzler wrote:
> > [...]
> > >> well you could post the complete output of
> > >> gnutls-cli --port 636 fd12:3456:7890:abcd::3
> > >> perhaps even with -d10? I would reassign to openldap then if there are
> > >> no obvious clues.
> > 
> > > `gnutls-cli` doesn't yield anything obvious.
> > [...]
> 
> > Could you please post the requested output, although there are no
> > obvious clues there to your eyes?
> 
> Problem is that provides rather a lot of data about this network setup.
> The quantity of information is enough for me to be rather uncomfortable
> with providing it via public channel.
> 
> 
> I did get the connection to proceed further than before though.  If I add
> the IPv6 address of the LDAP server to /etc/hosts, and then use the
> hostname instead of IPv6 address for the uri line of /etc/nslcd.conf
> things get further (I believe over IPv6, but I haven't satisfactorily
> verified this).
> 
> This suggests #1070033 is either in libgnutls30 or slapd.  The issue
> could be slapd is passing an IPv6 address to a portion of libgnutls30's
> API which requires a hostname.  The issue could be libgnutls30 rejects
> IPv6 addresses in some place(s) where they should be valid by the API.
> 
> I notice the `_gnutls_dnsname_is_valid()` function in
> gnutls28-3.8.5/lib/str.h accepts IPv4 addresses (which are NOT valid in
> DNS), but rejects IPv6 addresses.

Then I look deeper and find RFC 6066
(https://www.rfc-editor.org/rfc/rfc6066), page 7:

Literal IPv4 and IPv6 addresses are not permitted in "HostName".

This suggests there are at least 2, possibly 3 or more bugs.

#1  RFC 6066 says neither are legal, yet _gnutls_dnsname_is_valid()
accepts IPv4 addresses (including the 32-bit integer version), but
rejects IPv6 addresses.  This sort of inconsistency leads to security
breaches.

#2  The gnutls library uses the SNI extension without checking
whether it was passed a literal address.

#3  nslcd always passes the host string provided to its "uri"
configuration setting to the gnutls API without checking whether it is a
literal address.

#1 is definitely a bug present in the libgnutls30 package.  At least one
of #2 and #3 is definitely a bug, but both may very well be bugs.  Seems
better to check in the library as it could affect multiple programs using
the library.
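
One possible shape for such a check, sketched with POSIX `inet_pton()`.
The name `host_is_ip_literal` is hypothetical and this is not code from
GnuTLS or nslcd.  Note that `inet_pton()` only accepts dotted-decimal
IPv4, so the 32-bit integer form mentioned in #1 is not caught by this
sketch:

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: returns 1 if host parses as a literal IPv4
 * (dotted-decimal) or IPv6 address, 0 otherwise. */
static int host_is_ip_literal(const char *host)
{
	struct in_addr v4;
	struct in6_addr v6;

	return inet_pton(AF_INET, host, &v4) == 1 ||
	       inet_pton(AF_INET6, host, &v6) == 1;
}
```

A caller would skip the SNI extension whenever this returns 1 and send
the name otherwise.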

-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1070033: libgnutls30: rejects numeric IPv6 addresses during connection

2024-05-16 Thread Elliott Mitchell

On Tue, May 14, 2024 at 06:22:09PM +0200, Andreas Metzler wrote:
> On 2024-05-14 Elliott Mitchell  wrote:
> > On Wed, May 01, 2024 at 01:45:00PM +0200, Andreas Metzler wrote:
> [...]
> >> well you could post the complete output of
> >> gnutls-cli --port 636 fd12:3456:7890:abcd::3
> >> perhaps even with -d10? I would reassign to openldap then if there are
> >> no obvious clues.
> 
> > `gnutls-cli` doesn't yield anything obvious.
> [...]

> Could you please post the requested output, although there are no
> obvious clues there to your eyes?

Problem is that provides rather a lot of data about this network setup.
The quantity of information is enough for me to be rather uncomfortable
with providing it via public channel.

I did get the connection to proceed further than before though.  If I add
the IPv6 address of the LDAP server to /etc/hosts, and then use the
hostname instead of IPv6 address for the uri line of /etc/nslcd.conf
things get further (I believe over IPv6, but I haven't satisfactorily
verified this).

This suggests #1070033 is either in libgnutls30 or slapd.  The issue
could be slapd is passing an IPv6 address to a portion of libgnutls30's
API which requires a hostname.  The issue could be libgnutls30 rejects
IPv6 addresses in some place(s) where they should be valid by the API.

I notice the `_gnutls_dnsname_is_valid()` function in
gnutls28-3.8.5/lib/str.h accepts IPv4 addresses (which are NOT valid in
DNS), but rejects IPv6 addresses.

-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1070033: libgnutls30: rejects numeric IPv6 addresses during connection

2024-05-13 Thread Elliott Mitchell

affects 1070033 nslcd
quit

On Wed, May 01, 2024 at 01:45:00PM +0200, Andreas Metzler wrote:
> On 2024-04-30 Elliott Mitchell  wrote:
> > On Tue, Apr 30, 2024 at 05:55:15AM +0200, Andreas Metzler wrote:
> > > On 2024-04-29 Elliott Mitchell  wrote:
> [...] 
> > > > From `nslcd` on clients I was getting the message:
> > > > nslcd[12345]: [1a2b3c]  failed to bind to LDAP 
> > > > server ldaps://[fd12:3456:7890:abcd::3]/: Can't contact LDAP server: 
> > > > The TLS connection was non-properly terminated.: Resource temporarily 
> > > > unavailable
> [...] 
> > > > Once I finally figured out `slapd`'s debug mode ('-h ldaps:/// 
> > > > ldapi:///'
> > > > is two arguments, the ldaps and ldapi are a single argument).  I got
> > > > traces from `slapd`: (serial numbers filed off)
> > > 
> > > > tls_read: want=5, got=5
> > > >   :  16 03 01 01 8f
> > > 
> > > > tls_read: want=399, got=399
> > > >   0160:fd12  
> > > >   0170::3456:7890:abcd:  
> > > >   0180::3.-.@.   
> > > > TLS: can't accept: A disallowed SNI server name has been received..
> > > > connection_read(13): TLS accept failure error=-1 id=1005, closing
> [...]
> > > I guess you used the IPv6 address as either CN or Subject Alternative
> > > Name. Both take names, not IP addresses. There is a different field for
> > > IP addresses.
> > > 
> > > gnutls-cli --port 636 fd12:3456:7890:abcd::3 
> > > 
> > > will probably give more info.
> > > 
> > > FWIW I have just generated a local test certificate with "IPAddress:"
> > > set to '::1' and things work for me as expected.
> 
> > Hmm, `gnutls-cli --port ldaps` gave a different result.  The connection
> > successfully established and I was left being able to type to `slapd`.
> [...]
> > Anything further is purely guesswork.

> well you could post the complete output of
> gnutls-cli --port 636 fd12:3456:7890:abcd::3
> perhaps even with -d10? I would reassign to openldap then if there are
> no obvious clues.

`gnutls-cli` doesn't yield anything obvious.

Problem is there are at least 3 packages where the bug could lurk:

libgnutls30's API could indicate numeric addresses are legal somewhere,
but not accept IPv6 addresses (something gets fed to
_gnutls_dnsname_is_valid() which shouldn't be).

I notice the libgnutls30 function _gnutls_dnsname_is_valid() will return
true for "127.0.0.1".  This function is almost certainly wrong as it
accepts IPv4 addresses (which are not valid in DNS), but rejects IPv6
addresses.

nslcd could be passing something which could be an IP address to the
wrong part of the libgnutls30 API.  nslcd might also be sending an IP
address in LDAP somewhere it is required to send a hostname.

slapd could be passing something which could be an IP address to the
wrong part of the libgnutls30 API.  slapd might also be assuming
something in LDAP is a hostname when it is valid to be an IP address.

Right now _gnutls_dnsname_is_valid() seems highly suspect.

-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1070033: libgnutls30: rejects numeric IPv6 addresses during connection

2024-04-29 Thread Elliott Mitchell

On Tue, Apr 30, 2024 at 05:55:15AM +0200, Andreas Metzler wrote:
> On 2024-04-29 Elliott Mitchell  wrote:
> > Package: libgnutls30
> > Version: 3.7.9-2+deb12u2
> > Severity: important
> 
> > Long story to finding this one.  Trying to get LDAP setup on this
> > network.  As a recent deployment it seemed appropriate to use IPv6.
> 
> > From `nslcd` on clients I was getting the message:
> > nslcd[12345]: [1a2b3c]  failed to bind to LDAP server 
> > ldaps://[fd12:3456:7890:abcd::3]/: Can't contact LDAP server: The TLS 
> > connection was non-properly terminated.: Resource temporarily unavailable
> 
> > Running `nslcd` in debug mode failed to yield any additional useful
> > information.
> 
> > Once I finally figured out `slapd`'s debug mode ('-h ldaps:/// ldapi:///'
> > is two arguments, the ldaps and ldapi are a single argument).  I got
> > traces from `slapd`: (serial numbers filed off)
> 
> > tls_read: want=5, got=5
> >   :  16 03 01 01 8f
> 
> > tls_read: want=399, got=399
> >   0160:fd12  
> >   0170::3456:7890:abcd:  
> >   0180::3.-.@.   
> > TLS: can't accept: A disallowed SNI server name has been received..
> > connection_read(13): TLS accept failure error=-1 id=1005, closing
> 
> > Further tracing of the error message appears to point to the function
> > `_gnutls_dnsname_is_valid()` in gnutls/lib/str.h.  Seems libgnutls30 is
> > incompatible with numeric IPv6 addresses.
> 
> > While IPv6-only hosts are presently uncommon, there is now quite a bit of
> > IPv6 traffic in many places.  I think this is worthy of having a severity
> > of "critical" as "bookworm" may remain as "stable" past when there is
> > more IPv6 traffic than IPv4 traffic.  For "trixie" this seems very
> > likely.
> [...]
> 
> Good morning,
> 
> I guess you used the IPv6 address as either CN or Subject Alternative
> Name. Both take names, not IP addresses. There is a different field for
> IP addresses.
> 
> gnutls-cli --port 636 fd12:3456:7890:abcd::3 
> 
> will probably give more info.
> 
> FWIW I have just generated a local test certificate with "IPAddress:"
> set to '::1' and things work for me as expected.

Hmm, `gnutls-cli --port ldaps` gave a different result.  The connection
successfully established and I was left being able to type to `slapd`.

Unfortunately that leaves 3 packages which could be the one responsible
for the problem.  Could be libgnutls30 as I originally suspected.  Yet
`slapd` and `nslcd` could also be responsible.

The string "A disallowed SNI server name has been received." is found in
`libgnutls.so.30`.

The string "connection_read(%d): input error=%d id=%lu, closing." is
found in `/usr/sbin/slapd`.

Anything further is purely guesswork.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1070033: libgnutls30: rejects numeric IPv6 addresses during connection

2024-04-28 Thread Elliott Mitchell

Package: libgnutls30
Version: 3.7.9-2+deb12u2
Severity: important

Long story to finding this one.  Trying to get LDAP set up on this
network.  As a recent deployment it seemed appropriate to use IPv6.

From `nslcd` on clients I was getting the message:
nslcd[12345]: [1a2b3c]  failed to bind to LDAP server 
ldaps://[fd12:3456:7890:abcd::3]/: Can't contact LDAP server: The TLS 
connection was non-properly terminated.: Resource temporarily unavailable

Running `nslcd` in debug mode failed to yield any additional useful
information.

Once I finally figured out `slapd`'s debug mode ('-h ldaps:/// ldapi:///'
is two arguments; the ldaps and ldapi URLs form a single argument), I got
traces from `slapd` (serial numbers filed off):

tls_read: want=5, got=5
  :  16 03 01 01 8f

tls_read: want=399, got=399
  0160:fd12  
  0170::3456:7890:abcd:  
  0180::3.-.@.   
TLS: can't accept: A disallowed SNI server name has been received..
connection_read(13): TLS accept failure error=-1 id=1005, closing

Further tracing of the error message appears to point to the function
`_gnutls_dnsname_is_valid()` in gnutls/lib/str.h.  Seems libgnutls30 is
incompatible with numeric IPv6 addresses.

While IPv6-only hosts are presently uncommon, there is now quite a bit of
IPv6 traffic in many places.  I think this is worthy of having a severity
of "critical" as "bookworm" may remain as "stable" past when there is
more IPv6 traffic than IPv4 traffic.  For "trixie" this seems very
likely.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1069264: grub: chooses stale RAID1 mirror over fresh mirror

2024-04-18 Thread Elliott Mitchell

Package: grub
Version: 2.06-13+deb12u1

From `dmesg`:

md: kicking non-fresh  from array!

This is using MD-RAID1.  Appears GRUB is opting to load grub.cfg, kernel
and initial ramdisk off of this device, rather than the still operational
mirror.  The result is without manual intervention an older kernel
potentially gets loaded and causes problems.

I must argue this qualifies as "critical" since an older kernel might
have security holes or other problems.  For now I'll leave this as
"important" since I'm unsure how many are affected by this.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#988477: Also observing #988477

2024-01-18 Thread Elliott Mitchell

tags 988477 - moreinfo
found 988477 4.17.2+76-ge1f9cb16e2-1~deb12u1
affects 988477 src:linux
severity 988477 critical
quit

I am also observing #988477 occur.  This machine has an AMD Zen 4
processor.  The first observation was when motherboard/processor was
swapped out, the older motherboard/processor was several generations old.

The pattern which is emerging is Linux MD RAID1 plus recent AMD processor
which has full IOMMU functionality.  The older machine was believed to
have an IOMMU, but the BIOS wasn't creating appropriate ACPI tables
(IVRS) and thus Xen was unable to utilize it.

This seems to be occurring with a small percentage of write operations.
Subsequent read operations appear to be fine.

I am not convinced this is a Xen bug.  I suspect this is instead a bug
in the Linux MD subsystem.  In particular, the DMA interface may have been
designed assuming only a single device would ever access any page, while
the MD RAID1 driver reuses the same page for both devices.

IOMMU page release could be handled by marking the page unused in a
device data structure and later removed by sweeping a table.  In such
case if the MD-RAID1 driver was to redirect the page to another device
between these two steps, the entry for a subsequent device could be wiped
out when trying to invalidate an entry for a prior device.


Anyway, I'm also observing bug #988477.  This could also be a kernel bug.
So far no crashes/confirmed data loss have occurred, but sweeping the
mirror does turn up small numbers of inconsistencies.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#810964: #810964 is more kernel driver than Xen

2023-10-02 Thread Elliott Mitchell



reassign 810964 src:linux
tags 810964 -moreinfo
affects 810964 src:xen
found 810964 5.10.191-1
found 810964 6.1.52-1
found 810964 6.5.3-1
found 810964 5.10.127-2~bpo10+1
found 810964 6.1.38-4~bpo11+1
found 810964 6.4.4-3~bpo12+1
quit

Upon further investigation, while some part of #810964 may be in Xen,
the biggest issue is in the Linux kernel.

Appears MCE/EDAC support for Xen was implemented around 2008-2012.  Since
that time the maintainer has changed and the new maintainer was unaware
the driver was supposed to function on Xen.

As such the current maintainer has been adding in constructs which are
incompatible with operation on Xen, and at 767f4b620eda overtly broke Xen
support.

Part of the fix may require adjustments to Xen, but right now the
immediate source of breakage is the Linux kernel.

As such I'm reassigning this to src:linux.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1050030: Similar reproduction

2023-08-23 Thread Elliott Mitchell

On Fri, Aug 18, 2023 at 02:05:31PM -0700, Elliott Mitchell wrote:
> From reading the available information I suspect Tianocore/EDK2 may have
> tried to move some functionality to a distinct build and neither setup
> quite works.  Notably there is now a "OvmfPkg/OvmfXen.dsc" build
> configuration.  The OVMF.fd for Qemu for Xen functionality may have been
> moved /here/.  There might also be an attempt at functionality similar to
> "ArmVirtPkg/ArmVirtXen.dsc" (Debian 978595) for x86.

Now confirmed reverting to 2020.11-2+deb11u1 takes care of the issues I'm
running into.  I've been able to build OvmfPkg/OvmfXen.dsc, but haven't
gotten it to do anything.  I'm suspecting the support for running
headless didn't get into OvmfXen.  I'm interacting with someone
knowledgeable, but nothing yet.

I suspect the "ovmf" package needs to be split.  I've gotten the
impression the build needed for normal `qemu` isn't going to be the same
as the build needed for xen-qemu.

I think what is really needed is for xen-utils-X.YY to Recommend a
virtual package "xen-domu-bootloader" which is then provided by tools
which can load VMs.  The current other in-service tool is grub-xen-host,
but it appears OvmfXen may also be able to provide the service.

I'm attaching two patches which should help organize the source package.
These leave all the "./edksetup.sh" lines identical.  Perhaps make use
of this to make the build cleaner?


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445


From 910f6592733dbef2166ceb469320b8e21c4fa977 Mon Sep 17 00:00:00 2001
Message-Id: <910f6592733dbef2166ceb469320b8e21c4fa977.1692832840.git.ehem+deb...@m5p.com>
From: Elliott Mitchell 
Date: Wed, 20 Jan 2021 17:40:15 -0800
Subject: [PATCH 1/2] debian/rules: Rework edksetup.sh invocations

Instead of using "set -e", instead overtly test return codes using
a conditional.  Move commonly used build flags to front of command as a
precursor to merging into a macro.  This also makes the varying flags
more overt by being on the end.
---
 debian/rules | 45 +++--
 1 file changed, 19 insertions(+), 26 deletions(-)

diff --git a/debian/rules b/debian/rules
index 116c9c74b7..36b1ffc045 100755
--- a/debian/rules
+++ b/debian/rules
@@ -59,8 +59,7 @@ undefine CONF_PATH
 override_dh_auto_build: build-qemu-efi-aarch64 build-qemu-efi-arm build-ovmf build-ovmf32
 
 debian/setup-build-stamp:
-	set -e; . ./edksetup.sh; \
-	make -C BaseTools ARCH=$(EDK2_BUILD_ARCH)
+	. ./edksetup.sh && make -C BaseTools ARCH=$(EDK2_BUILD_ARCH)
 	touch $@
 
 OVMF_INSTALL_DIR = debian/ovmf-install
@@ -95,11 +94,10 @@ build-ovmf32: $(OVMF32_BINARIES) $(OVMF32_IMAGES)
 $(OVMF32_BINARIES) $(OVMF32_IMAGES): debian/setup-build-stamp
 	rm -rf $(OVMF32_INSTALL_DIR)
 	mkdir $(OVMF32_INSTALL_DIR)
-	set -e; . ./edksetup.sh; \
-		build -a IA32 \
-			-t $(EDK2_TOOLCHAIN) \
+	. ./edksetup.sh && build -b $(BUILD_TYPE) -t $(EDK2_TOOLCHAIN) \
+			-a IA32 \
 			-p OvmfPkg/OvmfPkgIa32.dsc \
-			$(OVMF32_4M_SMM_FLAGS) -b $(BUILD_TYPE)
+			$(OVMF32_4M_SMM_FLAGS)
 	cp $(OVMF32_BUILD_DIR)/FV/OVMF_CODE.fd \
 		$(OVMF32_INSTALL_DIR)/OVMF32_CODE_4M.secboot.fd
 	cp $(OVMF32_BUILD_DIR)/FV/OVMF_VARS.fd \
@@ -109,38 +107,34 @@ build-ovmf: $(OVMF_BINARIES) $(OVMF_IMAGES) $(OVMF_PREENROLLED_VARS)
 $(OVMF_BINARIES) $(OVMF_IMAGES): debian/setup-build-stamp
 	rm -rf $(OVMF_INSTALL_DIR)
 	mkdir $(OVMF_INSTALL_DIR)
-	set -e; . ./edksetup.sh; \
-		build -a X64 \
-			-t $(EDK2_TOOLCHAIN) \
+	. ./edksetup.sh && build -b $(BUILD_TYPE) -t $(EDK2_TOOLCHAIN) \
+			-a X64 \
 			-p OvmfPkg/OvmfPkgX64.dsc \
-			$(OVMF_2M_FLAGS) -b $(BUILD_TYPE)
+			$(OVMF_2M_FLAGS)
 	cp $(OVMF_BUILD_DIR)/FV/OVMF_CODE.fd \
 		$(OVMF_BUILD_DIR)/FV/OVMF.fd $(OVMF_INSTALL_DIR)/
 	cp $(OVMF_BUILD_DIR)/FV/OVMF_VARS.fd $(OVMF_INSTALL_DIR)/
 	rm -rf Build/OvmfX64
-	set -e; . ./edksetup.sh; \
-		build -a IA32 -a X64 \
-			-t $(EDK2_TOOLCHAIN) \
+	. ./edksetup.sh && build -b $(BUILD_TYPE) -t $(EDK2_TOOLCHAIN) \
+			-a IA32 -a X64 \
 			-p OvmfPkg/OvmfPkgIa32X64.dsc \
-			$(OVMF_4M_FLAGS) -b $(BUILD_TYPE)
+			$(OVMF_4M_FLAGS)
 	cp $(OVMF3264_BUILD_DIR)/FV/OVMF_CODE.fd \
 		$(OVMF_INSTALL_DIR)/OVMF_CODE_4M.fd
 	cp $(OVMF3264_BUILD_DIR)/FV/OVMF_VARS.fd \
 		$(OVMF_INSTALL_DIR)/OVMF_VARS_4M.fd
 	rm -rf Build/OvmfX64
-	set -e; . ./edksetup.sh; \
-		build -a X64 \
-			-t $(EDK2_TOOLCHAIN) \
+	. ./edksetup.sh && build -b $(BUILD_TYPE) -t $(EDK2_TOOLCHAIN) \
+			-a X64 \
 			-p OvmfPkg/OvmfPkgX64.dsc \
-			$(OVMF_2M_SMM_FLAGS) -b $(BUILD_TYPE)
+			$(OVMF_2M_SMM_FLAGS)
 	cp $(OVMF_BUILD_DIR)/FV/OVMF_CODE.fd \
 		$(OVMF_INSTALL_DIR)/OVMF_CODE.secboot.fd
 	rm -rf

Bug#1050030: Similar reproduction

2023-08-18 Thread Elliott Mitchell

affects 1050030 src:xen
quit

I'm seeing a similar situation, though instead using FreeBSD/x86 in the
VM.

For FreeBSD the bootloader appears to operate normally, but something
fails quickly after loading the kernel:

Loading kernel...
/boot/kernel/kernel text=0x18aa98 text=0xdfd150 text=0x675154 data=0x140 
data=0x1c38e8+0x43b718 0x8+0x18fe70+0x8+0x1ae449/
Loading configured modules...
/boot/entropy size=0x1000
/etc/hostid size=0x25
staging 0xe3e0 (not copying) tramp 0xe351b000 PT4 0xe3512000
Start @ 0x8038b000 ...
EFI framebuffer information:
addr, size 0xf000, 0x1d5000
dimensions 800 x 600
stride 800
masks  0x00ff, 0xff00, 0x00ff, 0xff00

I believe all these messages are from FreeBSD's bootloader.  The first
message from the kernel should be "---<<BOOT>>---", yet that message
never shows.  Xen shows the domain spinning on a single processor which
makes me believe the FreeBSD kernel has loaded, panic()ed and the
debugger is loaded (but there is no VGA console).


From reading the available information I suspect Tianocore/EDK2 may have
tried to move some functionality to a distinct build and neither setup
quite works.  Notably there is now a "OvmfPkg/OvmfXen.dsc" build
configuration.  The OVMF.fd for Qemu for Xen functionality may have been
moved /here/.  There might also be an attempt at functionality similar to
"ArmVirtPkg/ArmVirtXen.dsc" (Debian 978595) for x86.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#978595: #978595 is looking higher priority

2023-08-17 Thread Elliott Mitchell

On Tue, Jul 04, 2023 at 11:56:39PM +0300, Michael Tokarev wrote:
> Out of curiosity, what value is it to boot a xen domU (or qemu) guest in
> uefi mode?
> I mean, bios mode is still recommended for at least commercial virt
> solutions such as vmware, and it works significantly faster in qemu and
> xen too.  It is more, qemu ships minimal bios (qboot) to eliminate all
> boot-time cruft which is not needed in a vm most of the time.

First, the known high value portion of #978595 is getting
ArmVirtPkg/ArmVirtXen.dsc built and packaged.  This results in a
XEN_EFI.fd file.  As such the presently verified value only applies to
ARM.

What you do with XEN_EFI.fd is you configure an ARM domain with
'kernel = "${edk2_install_dir}/XEN_EFI.fd"'

The resultant domain has no extra daemons emulating hardware.  Inside the
domain, Tianocore/EDK2 will search via its normal means for a boot.efi
file and load that if it can.

This is similar to PyGRUB versus PvGRUB.  If the OS being loaded has
native Xen drivers, you've gotten rid of the Qemu process hanging around
in domain 0 providing security holes.

So far this is reliably booting the WIP FreeBSD/arm64.  I imagine this
could also load GRUB.

I believe OvmfPkg/OvmfXen.dsc aims to be something similar for x86, but
I've yet to achieve results from that.  My hope is this could load
FreeBSD/x86 in a PVH domain.

On Tue, Jul 04, 2023 at 10:30:34PM +0200, Paul Leiber wrote:
> As the Windows systems are not usable anymore, Xen is significantly
> reduced in functionality after the upgrade. Is this existing bug report
> the right place to file this, or should I open a new bug report? If this
> bug report is the right place, its priority should indeed be raised, at
> least to important (linux PVH DomUs are still working fine). If I should
> open a new bug report, for which package?

New report.  The topic for #978595 is I was hoping for some other build
types of EDK2/Tianocore to be built and packaged.  What you're describing
is a regression and certainly not merely a wishlist packaging request.

I'm unsure, but at first thought this would be src:xen.  On that note a
FreeBSD VM I've got has been having difficulty since the 4.14 -> 4.17
upgrade.  I'm still fighting other upgrade issues right now.

Some portions of the EDK2/Tianocore packaging look suspicious, so I
wouldn't be surprised if the failure was there.

On a very different note, I'm concerned about commit 5e68feec5b2.  If you
would care to examine patch #3 attached to message #22 on bug 978595, you
will notice it bears a strong resemblance.

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=978595#22
https://bugs.debian.org/cgi-bin/bugreport.cgi?att=3;bug=978595;filename=0003-debian-rules-Switch-to-truncate-from-dd.patch;msg=22

I believe 5e68feec5b2 is simply parallel development (that was an ugly
use of `dd` and `truncate` is known).  Yet I find it discouraging I
pointed to the issue more than a full 2 years earlier and it was ignored.

-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#452721: irt: irt: Bug#452721 notes from explorations

2023-08-17 Thread Elliott Mitchell

Synthesizing things since I hadn't been copied on previous message...

On Mon Jul 31 18:10:34 BST 2023, zithro wrote:
> 
> On 31 Jul 2023 03:39, Elliott Mitchell wrote:
> 
> > Presently I hope to convince the Xen core to allow full Python in domain
> > configuration files, but no news on that front so far.  This would mean
> > /etc/default/xendomains would need to change to match Python syntax.
> 
> There was an answer today on xen-devel: the ability to use scripts in
> domU cfg files has been explicitly removed for various reasons.
> This does not prevent you from "source"-ing the cfg files in your
> script(s) if they are proper Python syntax. Or you could simply
> parse/regex the values you want.

Though the reasons given seem orthogonal to my thinking.  I'm thinking
use libpython as the parser since that allows dictionaries and guarantees
the syntax remains a subset of Python.  Whereas the responses read like
they think I'm asking for full Python scripts as domain configurations
(which is a very large superset of what I'm proposing).

> And as Marek suggested in his answer, you can also put any arbitrary
> settings in the comments.

I had already thought of that as it is a common strategy for such things.
This though has substantial limitations and since Python has all the
capabilities needed, strategies based on Python seem very attractive.

I was thinking Perl for a bit, but Python provides a simple strategy for
extracting required information out of configurations.  Crucially the
UUID which lets you match running domains to their configuration.
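
A minimal sketch of what I mean by using Python as the parser, assuming
the configuration really is a subset of Python.  The function name and
the sample values (including the UUID) are made up for illustration.
Nothing from the file is executed, only literal values are read.

```python
import ast


def read_domain_config(text):
    """Collect top-level 'key = literal' assignments from xl-style config text."""
    settings = {}
    for node in ast.parse(text).body:
        if (isinstance(node, ast.Assign) and len(node.targets) == 1
                and isinstance(node.targets[0], ast.Name)):
            try:
                settings[node.targets[0].id] = ast.literal_eval(node.value)
            except ValueError:
                pass  # not a plain literal; skip rather than execute anything
    return settings


# Hypothetical domain configuration, for illustration only.
sample = '''
name = "bar"
uuid = "01234567-89ab-cdef-0123-456789abcdef"
memory = 512
disk = ['phy:/dev/vg/bar,xvda,w']
'''
config = read_domain_config(sample)
print(config["name"], config.get("uuid"))
```

Since dictionaries are also literals, something like an 'init = {...}'
entry would come back as a normal Python dict with no extra parsing.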

> Although ...
> 
> > My thinking for adding to domain configuration files would be something
> > along these lines:
> >
> > init = {
> >   'tool': 'xendomains-ng',
> >   'version': 0,
> >   'order': 9,
> >   'startwait': 60,
> >   'stopaction': 'save',
> > }
> 
> The problem with adding this to a domU config file is that it could
> cause problems for (live) migrations. The start/stop order is "per
> dom0", and may be different on another one.
> Imagine two dom0s, one storing the domain files "locally", while the
> other uses NFS. Only in the second case the domU should wait for the NFS
> server/domain to be available.
> 
> To me, the start/stop logic should be in a dom0 config file.

I'm not understanding the situation you're thinking of.

The closest I can come is you're thinking of a situation which would be
handled by having host defaults, but also overrides in domain.cfg files.
Generic VMs would act according to the host settings, only domains which
had overridden values would act differently.

You could have a network of VM hosts where normal hosts specify 'migrate'
in /etc/default/xendomains.  Then you have the magic host which specifies
'save' or 'shutdown'.  You would also specify something other than
'migrate' for domains handling services local to a particular host.
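
Roughly, the host defaults plus per-domain overrides could look like the
sketch below.  The key names follow my earlier 'init' example; the
default values shown are assumptions, not anything currently read from
/etc/default/xendomains.

```python
# Hypothetical host-wide defaults, as might be loaded from the host settings.
HOST_DEFAULTS = {"order": 0, "startwait": 0, "stopaction": "migrate"}


def effective_settings(host_defaults, domain_init=None):
    """Start from the host defaults, then apply any per-domain overrides."""
    merged = dict(host_defaults)
    merged.update(domain_init or {})
    return merged


# A generic VM follows the host; a host-local service overrides one value.
generic = effective_settings(HOST_DEFAULTS)
local_service = effective_settings(HOST_DEFAULTS, {"stopaction": "save"})
print(generic["stopaction"], local_service["stopaction"])
```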

> > 'startwait' would tell the script to wait that long before starting
> > subsequent domains.
> 
> A time-based wait may be useful for when everything goes well, but what
> about when there are problems ?
> If you want to be sure a domain is up (ie. ready to serve), you would
> need to peek at the related "service".
> For example, to be sure a DNS domU is up, you would have to try a DNS
> request, as a ping or "xl list" would not be enough.
> Also, domains in xen/auto are started with a mix of serialization AND
> parallelization, as "xl create" returns once the domain has started (ie.
> in the Xen point of view, not the user's).

Indeed.  I'm well aware what I'm suggesting has major limitations.  I'm
proposing what I consider feasible given available time.  What you're
suggesting could be a feature for v2, which might be written based on
what I manage.

> > 'stopaction' would allow different actions if the machine was to stop.
> > The 3 options which come to mind are 'stop' (shutdown), 'save' (save to
> > specified storage location), and 'migrate'.
> 
> Then, each time you do NOT want to follow the usual action, you'd have
> to edit -each- domU cfg file ?

Usually if you didn't want to follow the usual action, you would invoke
`xl` manually.

What has come to mind though is perhaps the action should be uploaded to
the xenstore.  Then when an unusual action was desired, the xenstore
information could be changed and the action would follow the domain.
This though seems a feature for a future version.

> > If full Python doesn't become available, this might take the format:
> > init = 'tool=xendomains-ng,version=0,order=9,startwait=60,stopaction=save'
> > Not needing to parse the string though does make one's life simpler.
> 
> Well, it makes -your- life easier, not

Bug#1049450: New rpc.mountd rejects -N 2 option

2023-08-16 Thread Elliott Mitchell

On Wed, Aug 16, 2023 at 08:57:16AM +0200, Salvatore Bonaccorso wrote:
> 
> On Tue, Aug 15, 2023 at 04:13:59PM -0700, Elliott Mitchell wrote:
> > Package: nfs-kernel-server
> > Version: 1:2.6.2-4
> > 
> > Hopefully SSIA.
> > 
> > `rpc.mountd` has a -N option to disable versions of NFS.
> > 
> > I had been previously using "-N 2", but that is now broken.  The error
> > message was quite non-helpful ("nfsd2" if I recall correctly).  Upon
> > removing "-N 2", luckily NFSv2 didn't get enabled, but this was still
> > annoying to deal with.  At worst using a deprecated setting should merely
> > generate a warning.
> 
> Removal of NFSv2 support was documented with a Debian NEWS entry for
> 1:2.6.1-1~exp1, cf. #1006650.
> 
> nfs-utils (1:2.6.1-1~exp1) unstable; urgency=medium
> 
>   Support for NFSv2 has been removed from nfs-kernel-server.  It was
>   previously disabled by default, but still available.
> 
>  -- Ben Hutchings   Sun, 13 Mar 2022 19:05:02 +0100

Removing NFSv2 support shouldn't invalidate "-N 2".  "-N 2" is supposed
to disable NFSv2 at runtime, as such removing all NFSv2 support should
merely render "-N 2" 100% redundant and at worst produce a warning.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1049450: New rpc.mountd rejects -N 2 option

2023-08-15 Thread Elliott Mitchell

Package: nfs-kernel-server
Version: 1:2.6.2-4

Hopefully SSIA.

`rpc.mountd` has a -N option to disable versions of NFS.

I had been previously using "-N 2", but that is now broken.  The error
message was quite non-helpful ("nfsd2" if I recall correctly).  Upon
removing "-N 2", luckily NFSv2 didn't get enabled, but this was still
annoying to deal with.  At worst using a deprecated setting should merely
generate a warning.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#452721: irt: Bug#452721 notes from explorations

2023-07-30 Thread Elliott Mitchell

Even though there hasn't been any discussion recently, bug #452721 is
very much still of major concern to me.

First issue is how to parse domain configuration files.  Reason being a
foo.cfg file might have the configuration 'name = "bar"'.  This would
also let the script retrieve the UUID if that has been set.

Turns out while Python in domain configuration files isn't supported,
the syntax is still a proper subset of the Python language.  This makes
Python the ideal programming language for a replacement script.  Only
weakness is being able to have full Python syntax in configuration files
might make the task simpler.

Presently I hope to convince the Xen core to allow full Python in domain
configuration files, but no news on that front so far.  This would mean
/etc/default/xendomains would need to change to match Python syntax.


My thinking for adding to domain configuration files would be something
along these lines:

init = {
'tool': 'xendomains-ng',
'version': 0,
'order': 9,
'startwait': 60,
'stopaction': 'save',
}

Mainly a Python dictionary holding key/value pairs.  The thought behind
the 'tool' and 'version' values is to allow some form of compatibility
if such scripts were to become common.

My thinking is 'order' would indicate sequence.  Domains with higher
order get started first (same order would nominally allow parallel
start).  If a domain.cfg file didn't define order then its order is 0.
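
As a rough sketch of the ordering rule just described (the domain names
and the function are hypothetical), domains could be grouped into
batches, highest order first, with a missing 'init' or 'order' treated
as 0:

```python
from itertools import groupby


def start_batches(domains):
    """domains maps a domain name to its init dict (or None).

    Returns a list of batches; each batch could nominally start in parallel.
    """
    def order_of(name):
        return (domains[name] or {}).get("order", 0)

    # Highest order first; sort by name within an order for stable output.
    ranked = sorted(domains, key=lambda name: (-order_of(name), name))
    return [list(group) for _, group in groupby(ranked, key=order_of)]


domains = {
    "dns": {"order": 9},
    "nfs": {"order": 9},
    "web": {"order": 1},
    "scratch": None,        # no init entry, so order defaults to 0
}
print(start_batches(domains))
```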

'startwait' would tell the script to wait that long before starting
subsequent domains.

'stopaction' would allow different actions if the machine was to stop.
The 3 options which come to mind are 'stop' (shutdown), 'save' (save to
specified storage location), and 'migrate'.


If full Python doesn't become available, this might take the format:

init = 'tool=xendomains-ng,version=0,order=9,startwait=60,stopaction=save'

Not needing to parse the string though does make one's life simpler.
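
For comparison, parsing the string form is not hard, just one more thing
to get wrong.  A minimal sketch, assuming values never contain ',' or
'=' (the function name is made up):

```python
def parse_init_string(value):
    """Split 'key=value,key=value' into a dict, converting plain digits to int."""
    settings = {}
    for item in value.split(","):
        key, _, raw = item.partition("=")
        key, raw = key.strip(), raw.strip()
        settings[key] = int(raw) if raw.isdigit() else raw
    return settings


init = parse_init_string(
    "tool=xendomains-ng,version=0,order=9,startwait=60,stopaction=save")
print(init)
```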


Other concerns include:
Sometimes you may want to take a distinct action during stop.  Ie if
you're doing restarts for kernel updates, you'll want to override and
have domains reboot.

It may be handier to have distinct options for 'restart'.  Full restarts
can follow proper order, or could simply involve bouncing domains based
on order.  Notably with HVM domains and Qemu updates, you could do:

order 0 down, order 1 down, order 9 down, order 9 up, order 1 up, order 0 up

Or you could do:

order 9 down, order 9 up, order 1 down, order 1 up, order 0 down, order 0 up
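
A sketch of generating the two kinds of sequence, reading the first as
"stop everything lowest order first, then start highest order first" and
the second as "bounce one order at a time, highest first".  The function
names are hypothetical.

```python
def full_restart(orders):
    """Stop all orders from lowest to highest, then start them in reverse."""
    low_to_high = sorted(set(orders))
    return ([f"order {o} down" for o in low_to_high]
            + [f"order {o} up" for o in reversed(low_to_high)])


def rolling_restart(orders):
    """Bounce each order in turn, highest order first."""
    steps = []
    for o in sorted(set(orders), reverse=True):
        steps += [f"order {o} down", f"order {o} up"]
    return steps


print(", ".join(full_restart([0, 1, 9])))
print(", ".join(rolling_restart([0, 1, 9])))
```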


I'm basically certain writing a new xendomains script in Python is the
way to go.  Now to get an answer as to whether full Python in domain
configuration files could be reenabled.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1036364: zfsutils-linux: please remove GPT creation bug

2023-05-19 Thread Elliott Mitchell

Package: zfsutils-linux
Version: 2.0.3-9+deb11u1

Would the Debian ZFS maintainers be so kind as to remove the GPT creation
bug from zfsutils-linux?

Full details are at: https://github.com/openzfs/zfs/issues/94

The issue is simply zpool's create/replace and other subcommands try to
unconditionally create GPT labels on anything resembling a whole device.
Unfortunately whatever algorithm is used for detecting whole devices is
poor and generates almost as many false positives and false negatives as
true positives and true negatives.

Attempts to get the OpenZFS project to address this seem to have met with
deaf ears.  Notably this was reported as a bug a decade ago, but nothing
has happened (notice the dates on the upstream bug report).

At this point I hope the Debian maintainers are willing to patch out this
bug.  Hopefully maintainers of other projects will follow and upstream
might notice they're approaching the losing end of a fork.

This has been a problem for so long that the workarounds are headed
towards being well-established techniques.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1034811: linux: consider CONFIG_HW_RANDOM_VIRTIO=n

2023-04-24 Thread Elliott Mitchell

Package: src:linux
Version: 6.0.3-1~bpo11+1
Severity: wishlist

Looks like someone had the idea of a virtualized HW RNG.  Yet looking at
the kernel source, there isn't a single actual implementation.  Unless
I'm missing something, having CONFIG_HW_RANDOM_VIRTIO simply wastes
processor time during build and enlarges the package for no gain.
Perhaps time for Debian to quit packaging this unused idea?

Looks like on-processor HW RNGs are what are taking over.  Possibly also
the HW RNG from the vTPM implementation.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1034463: closing 1034463

2023-04-16 Thread Elliott Mitchell

On Sun, Apr 16, 2023 at 07:08:03AM +0200, Salvatore Bonaccorso wrote:
> CONFIG_AGP is built-in in Debian, in particular for:
> 
> debian/config/alpha/config:CONFIG_AGP=y
> debian/config/amd64/config:CONFIG_AGP=y
> debian/config/hppa/config.parisc64:CONFIG_AGP=y
> debian/config/ia64/config:CONFIG_AGP=y
> debian/config/kernelarch-powerpc/config:CONFIG_AGP=y
> debian/config/kernelarch-x86/config:CONFIG_AGP=y

I hadn't checked all architectures, but was well-aware it is built-in
for amd64.  I was suggesting it should change from being built-in to
being a module.

The reason being AGP is very rare on amd64 motherboards.  According to
the handy reference, AGP was starting to disappear just as amd64 hardware
started hitting the market.

I'm unsure where other architectures stand on the issue.  Yet on amd64 it
shouldn't be built-in.

-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1034463: linux: consider CONFIG_AGP=m

2023-04-15 Thread Elliott Mitchell

Package: src:linux
Version: 5.10.158+2
Severity: wishlist

Could AGP support be turned into a module for Debian kernels?

I'm tempted to suggest it shouldn't even be built for amd64, but does
seem reasonable for i686 kernels.  Given this, module seems to make
sense.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1032480: xen: Important cherry-picks for bookworm/updates

2023-03-18 Thread Elliott Mitchell

On Tue, Mar 07, 2023 at 01:13:56PM -0800, Elliott Mitchell wrote:
> 
> ad15a0a8ca2515d8ac58edfc0bc1d3719219cb77
> x86/time: prevent overflow with high frequency TSCs

Okay, looks like this one had already been grabbed.  Sorry for the way
too late alert.  Thanks for staying on top of what was happening with
upstream Xen.

> I haven't found a patch for the other one yet.  There is some issue with
> the latest generation which needs "x2apic=false" on Xen's command-line
> in order to get interrupts to domain 0.  I'm guessing the latest from AMD
> broke the PIC emulation.
> 
> If this isn't actually patched yet, I suspect it soon will be.  I haven't
> observed anything on xen-devel, so perhaps the workaround was found too
> quickly to get noticed as urgent.

This one though looks potentially more and less serious.  The workaround
is simpler than the above ("x2apic=false" on Xen's command-line, instead
of "tsc_mode = 1" for *every* VM).  Yet the underlying problem could be
more severe.

-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1032480: xen: Important cherry-picks for bookworm/updates

2023-03-07 Thread Elliott Mitchell

Package: src:xen
Version: 4.17.0+46-gaaf74a532c-1
Severity: important

Two major bugs have shown with the release of new hardware from AMD.
Since the new hardware is likely to become common during the life of
Debian/bookworm, you may wish to grab them early:

ad15a0a8ca2515d8ac58edfc0bc1d3719219cb77
x86/time: prevent overflow with high frequency TSCs

Turns out the latest generation is fast enough to cause overflows.



I haven't found a patch for the other one yet.  There is some issue with
the latest generation which needs "x2apic=false" on Xen's command-line
in order to get interrupts to domain 0.  I'm guessing the latest from AMD
broke the PIC emulation.

If this isn't actually patched yet, I suspect it soon will be.  I haven't
observed anything on xen-devel, so perhaps the workaround was found too
quickly to get noticed as urgent.


Now to continuing the work on figuring out the consequences from
upgrading hardware a bit too early...


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#921187: IRT: backports for Xen

2023-01-18 Thread Elliott Mitchell

From looking, it doesn't appear necessary to remove the dependency of
QEMU on libxenmiscX.YY to make backports possible.  According to DPKG,
multiple versions of libxenmisc can be installed at the same time, so
the issue is simply whether multiple versions of QEMU can be installed
at the same time.

Last time I tried, it was /almost/ possible to install the testing
version of Xen on an otherwise stable system.  The only dependency issue
was the testing version of Xen needed an incompatible version of libc.

Backports already look 99% possible.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1026914: arcanist client improperly uploading files

2022-12-23 Thread Elliott Mitchell

Package: arcanist
Version: 0~git20200925-1
Severity: grave

If one has one or more commits in /some/repo one can create a
Phabricator diff by running `arc diff $oldver`.  If there are
untracked files in the directory the arcanist client gives the message:

8<-8<
You have untracked files in this working copy.

  Working copy: /some/repo

  Untracked changes in working copy:
  (To ignore these 1 change(s), add them to ".git/info/exclude".)
file0
file1
file2

Ignore these 3 untracked file(s) and continue? [y/N]
8<-8<

Suspicious resemblance to what `git status` might give.  If one then goes
to an appropriate version of Phabricator, on the right column between
"Tags" and "Subscribers" will be "Referenced Files".

I have noticed "Referenced Files" appears when untracked files are
present.  Diffs done from repository directories with no untracked files
do not have the "Referenced Files".

As such I reasonably believe arcanist is NOT ignoring these files.  At a
minimum it is uploading metadata about them to Phabricator, at worst it
is uploading them to the server without notification.

Privacy and security violation.  This is visible enough I suspect many
people have already noticed.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1006418: #1006418: Linux stubdomains?

2022-09-21 Thread Elliott Mitchell

Not a proper In-Reply-To since that message ended up /somewhere/ and I'm
thus going back to the bug DB for this reply.


I guess I'm neutral-ish on Linux versus Mini-OS for doing stub domains
for Debian on Xen.  I suspect development on Xen's Mini-OS isn't all that
active.  On the flip side due to its limited requirements, Mini-OS might
not need much updating.

My major concern is can current Linux kernels be made small enough for
this to be worthwhile?  I've done some small Debian domains and they
really want a minimum of 192MB of memory, which seems a bit large.

Really the big issue seems to be someone simply needs to play with this
a *lot* in order to figure the thing out.  The information which is out
there isn't easy to understand.

I suspect the simplest may be to examine Qubes OS as they have figured
the thing out.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1017944: Another reproduction of #1017944

2022-09-11 Thread Elliott Mitchell

X-Debbugs-Cc: pkg-xen-de...@lists.alioth.debian.org

Guess we're finding out where everyone's update windows are.  Some though
may report before resolving the issue or somewhat after.

Yet another reproducer of the issue here.  I also observed the failure in
Xen's dmesg and confirm the issue occurs with PVH VMs.

I haven't tried rebuilding with Valentin Kleibel's patch, but another
potential workaround is to add:

deb https://snapshot.debian.org/archive/debian/20220801T032804Z/ bullseye main

To /etc/apt/sources.list, then *hold* the GRUB packages at 2.04-20.

I'm wondering whether we should subscribe
pkg-xen-de...@lists.alioth.debian.org to this bug as it has a kind of
major impact.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#737564: #737564 is becoming more urgent

2022-04-18 Thread Elliott Mitchell

For some time the Linux kernel hasn't guaranteed the order of block
devices.  #737564 is a good solution to this issue.

(yeah, suddenly running into devices getting different designations due
to restart)


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1009793: linux-source 5.10.106-1 changes block device order

2022-04-17 Thread Elliott Mitchell

Package: src:linux
Version: 5.10.106-1

Between 5.10.103-1 and 5.10.106-1 (image -13) something changed which
reliably causes what used to show as /dev/sda to show as /dev/sdb.  Other
block devices plugged into the SCSI subsystem may have swapped around,
but I've yet to untangle the others.

A few utilities are still sensitive to block device order and this
causes issues for those.  Nothing on the hardware explains this.  The
controller thinks the device has a lower number, and the device should
respond much faster.

The lowest level is the cciss driver.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1008911: initscripts: /run often mounted nodev, "/run/rootdev" likely to fail

2022-04-03 Thread Elliott Mitchell

Package: initscripts
Version: 3.02-1

Often /run is mounted with the "nodev" option, at which point doing a
`mknod` "/run/rootdev", then trying to `fsck` that doesn't work as a
fallback.  Perhaps "/dev/fsckfallbackdev"?
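
A minimal sketch of how a script could detect this situation before
trying the fallback.  The function name has_nodev is hypothetical; it
only inspects an option string and does not touch the system.

```shell
#!/bin/sh
# Return 0 if a comma-separated mount option string contains "nodev".
# Wrapping the string in commas avoids matching a longer option name.
has_nodev() {
    case ",$1," in
        *,nodev,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on a live system (findmnt is part of util-linux):
#   has_nodev "$(findmnt -n -o OPTIONS /run)" && echo "/run is nodev"
```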


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1008910: mount-functions: Only allows for LABEL/UUID

2022-04-03 Thread Elliott Mitchell

found 1008910 3.02-1
found 1008910 2.96-7+deb11u1
found 1008910 2.93-8
quit

On Mon, Apr 04, 2022 at 12:48:07AM +0200, Thorsten Glaser wrote:
> On Sun, 3 Apr 2022, Elliott Mitchell wrote:
> 
> > Perhaps the test should be: "[A-Z][A-Z]*[A-Z][A-Z]=*"?
> 
> No, that's a shellglob, no BRE.

Indeed.  That was the strictest pattern I could come up with likely to
match new additions and exclude other things.  All other matches start
with "/" so nominally "[A-Z]*" would be enough by itself.

> I think it's best here to update the list with whatever findfs(8)
> comes up when it does come up; anything else would require either
> ksh extglobs or really excessive parsing attempts few would want
> to maintain.
> 
> -  LABEL=*|UUID=*)
> +  LABEL=*|UUID=*|PARTUUID=*|PARTLABEL=*)

Not my decision to make, I simply narrowed down the issue and stated it
was a problem.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1008910: mount-functions: Only allows for LABEL/UUID

2022-04-03 Thread Elliott Mitchell

Package: initscripts
Version: 3.01-1

This is *almost* #677420, but not quite.

The test in /lib/init/mount-functions.sh, _read_fstab() tests for
"LABEL=*|UUID=*" before resorting to `findfs`.  Thing is `findfs` has
two other cases it can handle and that test misses those two.

Perhaps the test should be: "[A-Z][A-Z]*[A-Z][A-Z]=*"?

That matches the two currently supported extra cases and would hopefully
catch further additions to what `findfs` can handle.
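
A small sketch comparing the existing test with the proposed pattern, as
shell `case` globs.  The function names are made up for illustration and
this is not the code from mount-functions.sh.

```shell
#!/bin/sh
# Keep [A-Z] limited to ASCII uppercase regardless of the user's locale.
LC_ALL=C
export LC_ALL

# The test as it stands today: only LABEL= and UUID= reach findfs.
matches_current() {
    case "$1" in
        LABEL=*|UUID=*) return 0 ;;
        *) return 1 ;;
    esac
}

# The proposed pattern: four or more characters, starting and ending with
# two uppercase letters, followed by "=".
matches_proposed() {
    case "$1" in
        [A-Z][A-Z]*[A-Z][A-Z]=*) return 0 ;;
        *) return 1 ;;
    esac
}

for spec in LABEL=root UUID=1234 PARTUUID=abcd PARTLABEL=boot /dev/sda1; do
    if matches_current "$spec"; then cur=yes; else cur=no; fi
    if matches_proposed "$spec"; then new=yes; else new=no; fi
    printf '%-16s current=%s proposed=%s\n' "$spec" "$cur" "$new"
done
```

Note the "*" in the middle matches any characters, so the proposed
pattern is looser than an explicit list of tags.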


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1008857: irt: fsck: Automatic filesystem check skipped

2022-04-03 Thread Elliott Mitchell

Come to think of it, my initial message may have pointed to the root
cause.  May very well be `fsck` skips checks on filesystems on USB
devices.

Problem is this behavior is taking precedence over checking filesystems
listed in /etc/fstab.  If the root filesystem is located on USB it is
irrelevant the device can easily be removed; if the device is removed
the system will be in a very problematic state, so acting as if it is
non-removable is the correct behavior.

Similar situation for any device in /etc/fstab which isn't marked
"noauto".


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1008857: fsck: Automatic filesystem check skipped

2022-04-02 Thread Elliott Mitchell

Package: util-linux
Version: 2.36.1-8+devuan2
Severity: important

For some reason on this aarch64 device, the automated filesystem checks
which should be done via `fsck -T -M -A -a -t ext4` are getting skipped.
When trying to run this manually, no error messages of any sort were
observed.

During boot I am observing the message "warning: maximal mount count
reached, running e2fsck is recommended".
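
For confirming the condition behind that warning, a sketch which reads
`tune2fs -l` output on standard input and compares the two mount count
fields.  The function name and the device in the usage note are
placeholders.

```shell
#!/bin/sh
# Exit 0 if "Mount count" has reached "Maximum mount count" in tune2fs -l
# output read from stdin.  A maximum of -1 or 0 means the check is
# disabled, so that case is reported as not exceeded.
mount_count_exceeded() {
    awk -F: '
        $1 == "Mount count"         { count = $2 + 0 }
        $1 == "Maximum mount count" { max = $2 + 0 }
        END { exit !(max > 0 && count >= max) }
    '
}

# Usage (device name is a placeholder):
#   tune2fs -l /dev/sda2 | mount_count_exceeded && echo "fsck is due"
```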

I'm aware of two system quirks which might cause `fsck` to fail.

This is an aarch64 (ARM64) system.

The main storage device is UAS and features a hybrid MBR.  Most utilities
and the Linux kernel find the valid GPT and ignore the hybrid MBR.

I recall running into an issue like this on a mipsel system many years
ago.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1008308: radicale: TLS broken with several clients

2022-03-26 Thread Elliott Mitchell

On Sat, Mar 26, 2022 at 05:38:21PM +0100, Jonas Smedegaard wrote:
> Quoting Elliott Mitchell (2022-03-26 16:35:53)
> > Has been reported upstream:
> > https://github.com/Kozea/Radicale/issues/1183
> > 
> > Upstream has been completely unresponsive.  No fix is available.
> 
> Thanks for reporting this upstream where it belongs.
> 
> For the Debian packaging of Radicale the recommended use is to *not* 
> handle TLS directly but let another frontend web service handle that. 
> Upstream calls this approach "Reverse Proxy": 
> https://radicale.org/v3.html#reverse-proxy

"Recommended" means other configurations should function.  Notably
the documentation suggests running as a daemon is in theory
supported: https://radicale.org/v3.html#running-as-a-service

Reverse-proxy is also a specialized configuration not appropriate for
all situations.  For the setup I've got adding Apache or nginx would
more than double the size of the installation.  This would also add
Apache or nginx's security vulnerabilities to this setup (they've been
pretty good, but that is not perfect).

> Lowering severity accordingly.

important:
"a bug which has a major effect on the usability of a package, without
rendering it completely unusable to everyone."

Broken seems the very definition of a major effect on usability.  In fact
I believe "grave" is appropriate for this issue.

I don't know the frequencies of the various types of configuration.  I
have a suspicion standalone daemon is a very common configuration and
most users are unaware they're relying on the security of WPA2.

> > With no fix available this renders the Radicale package useless unless 
> > one wishes to run in with an insecure configuration (disable TLS/SSL).
> 
> No.  Radicale is certainly not useless.

Okay, that is true.  It is simply broken for the type of setup I've got
and no assistance has been forthcoming from upstream.

My hope was your channels as a package maintainer might be able to place
more pressure on upstream to address a grave bug.

-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1008308: radicale: TLS broken with several clients

2022-03-26 Thread Elliott Mitchell

Package: radicale
Version: 3.0.6-3
Severity: important

Has been reported upstream:
https://github.com/Kozea/Radicale/issues/1183

Upstream has been completely unresponsive.  No fix is available.  Their
changelog fails to mention any fix for this.  Reputedly upstream plans
to force upgrades and doing so would violate Debian policy.  With no fix
available this renders the Radicale package useless unless one wishes to
run it with an insecure configuration (disable TLS/SSL).

Sorry to say this, but perhaps the Radicale package needs to be removed
from Debian if this is the support level.


Clients known to be affected include iPhone and DAVx5 (Android).  I
suspect this only manifests if Radicale is in the standalone configuration
(likely not when set up as an Apache module).

Presently the only visible solution is to remain with the old stable
version of Radicale.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1006595: libexec move patch update

2022-02-27 Thread Elliott Mitchell

This still seems a Good Idea(tm) to start the process of moving to
/usr/libexec, but I do update patches if I discover issues.

Note, for the shorter term it makes sense to leave things in /usr/lib.
Until a few revisions pass with both /usr/lib and /usr/libexec copies,
xen-utils-common must keep using /usr/lib.  Issue is once
xen-utils-common uses /usr/libexec, older installations break.  Best to
keep compatibility with old builds for a while.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445


From b7477e7fab01b48b663d3e89e4f4c7bd352c8b7e Mon Sep 17 00:00:00 2001
From: Elliott Mitchell 
Date: Sat, 26 Feb 2022 17:15:46 -0800
Subject: [PATCH] debian: Initial phase of moving xen-utils-* to libexec,
 future compat

At some future point the executables will be moved to /usr/libexec.
Ensure current versions of the package will be compatible with future
xen-utils-common packages which expect the files in /usr/libexec.

Signed-off-by: Elliott Mitchell 
---
 debian/rules                      | 4 ++++
 debian/xen-utils-V.install.vsn-in | 3 +++
 debian/xen-utils-common.install   | 3 +++
 3 files changed, 10 insertions(+)

diff --git a/debian/rules b/debian/rules
index ba2567b4de..095ad07c51 100755
--- a/debian/rules
+++ b/debian/rules
@@ -298,6 +298,10 @@ xenstore_rm = $(addprefix debian/xen-utils-common/,		\
 override_dh_install:
 	debian/shuffle-binaries $(upstream_version)
 	:
+	mkdir $(t)/usr/libexec
+	ln -s /usr/lib/xen-$(upstream_version)/bin $(t)/usr/libexec/xen-$(upstream_version)
+	ln -s /usr/lib/xen-common/bin $(t)/usr/libexec/xen
+	:
 	debian/shuffle-boot-files $(upstream_version) $(flavour)
 	:
 	dh_install $(dh_install_excludes)
diff --git a/debian/xen-utils-V.install.vsn-in b/debian/xen-utils-V.install.vsn-in
index da04b59d42..66dc5cd190 100644
--- a/debian/xen-utils-V.install.vsn-in
+++ b/debian/xen-utils-V.install.vsn-in
@@ -1,3 +1,6 @@
+# initial phase of moving to libexec, future compatibility
+usr/libexec/xen-@version@
+
 usr/lib/xen-@version@/bin
 usr/lib/xen-@version@/lib/python
 
diff --git a/debian/xen-utils-common.install b/debian/xen-utils-common.install
index 620825ad18..121d45d8a0 100755
--- a/debian/xen-utils-common.install
+++ b/debian/xen-utils-common.install
@@ -29,3 +29,6 @@ usr/share/man
 
 ../scripts/xen-toolstack-wrapper	usr/lib/xen-common/bin
 ../scripts/xen-toolstack		usr/lib/xen-common/bin
+
+# initial phase of moving to libexec, future compatibility
+usr/libexec/xen
-- 
2.30.2

Bug#1005176: xen-utils-4 library dependencies need update

2022-02-25 Thread Elliott Mitchell

On Fri, Feb 25, 2022 at 06:40:23PM +0100, Hans van Kranenburg wrote:
> 
> However, I hope you understand that there's no way we can help when you 
> use something else than the actual packages in Debian, do not provide 
> any error messages seen, and describe what you see instead as "it felt 
> like everything wanted to explode".

I'm aware I've got things in a state which is outside the support
envelope.  I was hoping observations might also apply inside the support
envelope.

> For me, Xen 4.16 does run OK on my test servers, FWIW.

That doesn't surprise me, it didn't take long to get things into a
working state for me.  Just I was able to get things into a problematic
state which the packaging is supposed to prevent.

xen-utils-4.16 depends on: libxencall1, libxenevtchn1,
libxenforeignmemory1, libxengnttab1, and libxentoollog1.

On a system being upgraded there will be 3 versions of each of these
libraries available.

4.14.3+32-g9de3671772-1
4.14.3+32-g9de3671772-1~deb11u1
4.16.0-1~exp1

Issue is the rebuilt xen-hypervisor-4.16 and xen-utils-4.16 could be
installed without updating libxencall1, libxenevtchn1,
libxenforeignmemory1, libxengnttab1, and libxentoollog1.

With the 4.14.3+32-g9de3671772-1~deb11u1 versions of libraries things
were broken.  I'm unsure which one(s) was the problem, though the
problem disappeared once all 5 were updated.
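
A sketch of a check for this mismatch: given "package version" lines, it
lists every package not at the expected version.  The function name
stale_libs is hypothetical.

```shell
#!/bin/sh
# Read "package version" lines on stdin and print those whose version
# differs from the expected one given as the first argument.
stale_libs() {
    expected=$1
    while read -r pkg ver; do
        [ "$ver" = "$expected" ] || printf '%s %s\n' "$pkg" "$ver"
    done
}

# Usage on a live system:
#   dpkg-query -W -f='${Package} ${Version}\n' libxencall1 libxenevtchn1 \
#       libxenforeignmemory1 libxengnttab1 libxentoollog1 \
#       | stale_libs '4.16.0-1~exp1'
```

An empty result would mean all five libraries match the version of
xen-utils-4.16.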

That enough for you?

-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#466064: xserver-xorg-core: -novtswitch is still broken

2022-02-13 Thread Elliott Mitchell

found 466064 2:1.20.11-1+deb11u1
quit

I almost wonder whether I'm seeing a distinct bug since #466064 is so
old.  -novtswitch continues(?) to be problematic.  In the current version
the option doesn't work.

Not switching VTs is rather valuable for having multiple X-servers
started by init and running on distinct VTs.  Combined with VMs all sorts
of interesting things become possible.

Having a broken -novtswitch option causes all sorts of trouble for such a
setup.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1002670: grub2-common: Unable to force MBR/embedding installation

2022-02-12 Thread Elliott Mitchell

On Mon, Jan 03, 2022 at 05:17:19PM +, Steve McIntyre wrote:
> 
> On Mon, Jan 03, 2022 at 08:52:48AM -0800, Elliott Mitchell wrote:
> >

> arm64 machines categorically do *not* have any capability to run this
> way. It has never been a thing. Instead, systems running GRUB will
> load GRUB as a UEFI binary from an EFI System Partition (ESP) and go
> from there. Depending on the exact installation on your system, you
> may have an ESP on removable storage (SD?), on hard drive / SSD, or
> maybe on internal eMMC or similar. It's possible you could be loading
> from the network too, but I assume you'd know if that was happening.

Finally figured out what was occurring.  I believe your statement "a UEFI
binary from an EFI System Partition (ESP)" is wrong.  Appears Tianocore
will load UEFI binaries from disk slices which simply contain filesystems
it understands.

> You have not identified the exact platform you're using, so I've no
> idea exactly which of the above options is most likely.

A cheap and popular ARM64 device which got a Tianocore implementation
fairly quickly due to its level of popularity.  Trick is it doesn't have
NVRAM available for storage of EFI variables, so `grub-install` cannot
make itself the default boot method.

I would suggest the "EFI variables are not supported on this system."
warning/error message needs more information.

In Tianocore's configuration (boot menu) I needed to go in and select
"Add boot method" and navigate to where `grub-install` put grubaa64.efi.
Then I simply needed to tell Tianocore to have that as the highest
priority boot method.

> >Or at least that is a simple explanation for why traces of
> >2.02+dfsg1-20+deb10u4 continue to persist, while 2.04-20 appears
> >reluctant.
> 
> My first guess would be that either:
> 
>  * You have more than one ESP on your system somewhere, and the system
>    is finding an old grub binary that way; or
> 
>  * The old grub is in the removable media path and you haven't managed
>    to replace it yet (see my first mail for details on how to do
>    that).

I finally found the 2.02+dfsg1-20+deb10u4 "grubaa64.efi".

It was on /dev/sda128 which was marked with a type UUID of
bc13c2ff-59e6-4262-a352-b275fd6f7172 ("Linux extended boot" according to
`fdisk`).

/dev/sda128 contained an ISO9660 filesystem which was the previous
Debian netinst ISO image.
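
For anyone hunting a stale loader the same way, a sketch: mount each
candidate partition read-only and search it for EFI binaries.  The
function name and the mount point in the usage note are placeholders.

```shell
#!/bin/sh
# List every file ending in .efi (any case) below the given directory.
# -iname is not in POSIX, but GNU find and busybox find both accept it.
find_efi_loaders() {
    find "$1" -type f -iname '*.efi' 2>/dev/null
}

# Usage (mount point is a placeholder):
#   mount -o ro /dev/sda128 /mnt/probe
#   find_efi_loaders /mnt/probe
```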

The message "EFI variables are not supported on this system." was less
than wonderful for figuring out what to do next.

-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1005176: xen-utils-4 library dependencies need update

2022-02-08 Thread Elliott Mitchell

Package: src:xen
Version: 4.16.0-1~exp1

I'm guilty of pulling in later Xen source and building it based on the
experimental 4.16 packaging.  As such this may actually only be an issue
for a package version beyond 4.16.0.

I'm uncertain which it is, but xen-utils-4.16 appears to need an update
to one or more of libxencall1, libxenevtchn1, libxenforeignmemory1,
libxengnttab1 and/or libxentoollog1 in order to function.

During my initial update I merely updated libxenmisc4.16 and
libxenstore4.  In this condition something (I suspect xenstored) was
rather broken and things were unusable.

Notably `xl list` was hanging.  I was unable to get VMs started and it
felt like everything wanted to explode.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#989560: Bug #989560 solved by update?

2022-02-04 Thread Elliott Mitchell

Nothing further has been heard.  Was bug #989560 resolved by updating to
the GRUB 2.04 packages?  Possibly as part of upgrading to bullseye?

The provided information looks like what one might expect from trying to
load Xen on ARM via GRUB 2.02.  As such I'm left suspecting this was
resolved by updating.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1002670: grub2-common: Unable to force MBR/embedding installation

2022-01-03 Thread Elliott Mitchell

On Mon, Jan 03, 2022 at 05:17:19PM +, Steve McIntyre wrote:
> 
> On Mon, Jan 03, 2022 at 08:52:48AM -0800, Elliott Mitchell wrote:
> >On Mon, Jan 03, 2022 at 02:35:48PM +, Steve McIntyre wrote:
> >> 
> >> What you're asking for here won't work; arm64 devices don't/can't use
> >> the embedding MBR/gap style of GRUB installation - that's x86 only. 
> >> Instead,
> >> what you need is to do an EFI installation but with a couple of extra
> >> options chosen. Run "dpkg-reconfigure -plow grub-efi-arm64" and say:
> >> 
> >>  * "yes" to "Force extra installation to the EFI removable media path?"
> >>  * "no" to "Update NVRAM variables to automatically boot into Debian?"
> >> 
> >> and you should be fine from now on.
> >
> >Justify the statement "arm64 devices don't/can't use the embedding
> >MBR/gap style of GRUB installation".  I concur that is not the normal way
> >of doing EFI installation on ARM64 devices, but in this case I've got a
> >device which is unable to store persistent variables (if you sacrifice a
> >SD Card it can store them, but otherwise it loses them on restart).
> 
> The MBR/gap style is *totally specific* to the old-school x86 BIOS way
> of doing things:
> 
>  * A tiny 16-bit x86 asm loader is added into the boot sector; it uses
>    BIOS routines to load the GRUB core image from the raw space after
>    the partition table and execute it.
> 
>  * The core image contains enough functionality (display, filesystems,
>    storage drivers, etc.) to be able to find and load further modules from
>    the /boot filesystem. It loads those, runs the menu, etc.
> 
> arm64 machines categorically do *not* have any capability to run this
> way. It has never been a thing. Instead, systems running GRUB will
> load GRUB as a UEFI binary from an EFI System Partition (ESP) and go
> from there. Depending on the exact installation on your system, you
> may have an ESP on removable storage (SD?), on hard drive / SSD, or
> maybe on internal eMMC or similar. It's possible you could be loading
> from the network too, but I assume you'd know if that was happening.
> 
> You have not identified the exact platform you're using, so I've no
> idea exactly which of the above options is most likely.

The platform is Tianocore built for the particular hardware.

While the hardware does have a SD controller, since there is no card
present it couldn't be loaded from there (no eMMC).  As such it is
definitely coming from what Linux sees as /dev/sda (via USB3 since that
hardware is available).

Now, since everything on the VFAT filesystem used by both the initial
stage loader and Tianocore was moved to a subdirectory, the older
version isn't coming from there (unless Tianocore does the equivalent of
a `find` during load, which seems unlikely).

As such I'm pretty sure Tianocore is finding the older GRUB by looking in
the gap between the GPT entries and data start.


> >Yet somehow despite restarting from a mostly clean slate an older
> >installation of 2.02+dfsg1-20+deb10u4 keeps managing to manifest, while
> >2.04-20 is unable to be loaded.
> >
> >My conclusion is 2.02+dfsg1-20+deb10u4 was able to successfully install
> >in the embedding area despite not being supposed to work.  Meanwhile I'm
> >guessing Tianocore/ARM64 inherited the ability to boot from the embedding
> >area, despite using that being strongly discouraged.
> 
> Nope, sorry.
> 
> >Or at least that is a simple explanation for why traces of
> >2.02+dfsg1-20+deb10u4 continue to persist, while 2.04-20 appears
> >reluctant.
> 
> My first guess would be that either:
> 
>  * You have more than one ESP on your system somewhere, and the system
>    is finding an old grub binary that way; or
> 
>  * The old grub is in the removable media path and you haven't managed
>    to replace it yet (see my first mail for details on how to do
>    that).

The former isn't possible.  I am though wondering if use of the
"--removable" option will resolve the situation.  When it comes down to
it, all storage media is removable; it is just an issue of how difficult
it is to remove and replace.  Since this one is via USB3 it is pretty
simple to swap out, but I could see internal storage being attached via
USB3...
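
A sketch of what a "--removable" installation might look like, written
as a dry run which only prints the command.  The function name and the
ESP mount point are assumptions; the `grub-install` options themselves
are existing ones.

```shell
#!/bin/sh
# Print (not run) a grub-install command for the removable media path.
# --no-nvram skips writing EFI variables, which this device cannot store.
print_grub_install_cmd() {
    efi_dir=$1
    printf 'grub-install --target=arm64-efi --efi-directory=%s --removable --no-nvram\n' "$efi_dir"
}

# Usage: print_grub_install_cmd /boot/efi
```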


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1002670: grub2-common: Unable to force MBR/embedding installation

2022-01-03 Thread Elliott Mitchell

On Mon, Jan 03, 2022 at 02:35:48PM +, Steve McIntyre wrote:
> 
> On Sun, Dec 26, 2021 at 05:12:38PM -0800, Elliott Mitchell wrote:
> >
> >Hopefully the subject tells the tale.  Due to some odd hardware, I need
> >to force `grub-install` to install the EFI version of GRUB into the
> >MBR/boot area gap.  Unfortunately the documentation suggest none of
> >`grub-install`'s options can get this result.  As a result I've got a
> >problem.
> >
> >The background:  I'm trying to get GRUB installed on a very popular ARM64
> >device which has a full Tianocore/UEFI image available.  Unfortunately
> >while it is full Tianocore, the device lacks any private NVRAM and thus
> >is unable to store EFI variables.
> >
> >`grub-install` tries to do a "normal" UEFI installation, which fails due
> >to lack of EFI variables.  As a result I need GRUB to install in the
> >MBR/GPT gap, but none of `grub-install`'s options appear to cause this.
> 
> What you're asking for here won't work; arm64 devices don't/can't use
> the embedding MBR/gap style of GRUB installation - that's x86 only. Instead,
> what you need is to do an EFI installation but with a couple of extra
> options chosen. Run "dpkg-reconfigure -plow grub-efi-arm64" and say:
> 
>  * "yes" to "Force extra installation to the EFI removable media path?"
>  * "no" to "Update NVRAM variables to automatically boot into Debian?"
> 
> and you should be fine from now on.

Justify the statement "arm64 devices don't/can't use the embedding
MBR/gap style of GRUB installation".  I concur that is not the normal way
of doing EFI installation on ARM64 devices, but in this case I've got a
device which is unable to store persistent variables (if you sacrifice a
SD Card it can store them, but otherwise it loses them on restart).

Yet somehow despite restarting from a mostly clean slate an older
installation of 2.02+dfsg1-20+deb10u4 keeps managing to manifest, while
2.04-20 is unable to be loaded.

My conclusion is 2.02+dfsg1-20+deb10u4 was able to successfully install
in the embedding area despite not being supposed to work.  Meanwhile I'm
guessing Tianocore/ARM64 inherited the ability to boot from the embedding
area, despite using that being strongly discouraged.

Or at least that is a simple explanation for why traces of
2.02+dfsg1-20+deb10u4 continue to persist, while 2.04-20 appears
reluctant.

(now back to pondering whether grub-uboot may still be a more
maintainable option for this installation)

-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1002670: grub2-common: Unable to force MBR/embedding installation

2021-12-26 Thread Elliott Mitchell

Package: grub2-common
Version: 2.04-20
Severity: important

Hopefully the subject tells the tale.  Due to some odd hardware, I need
to force `grub-install` to install the EFI version of GRUB into the
MBR/boot area gap.  Unfortunately the documentation suggests none of
`grub-install`'s options can get this result.  As a result I've got a
problem.

The background:  I'm trying to get GRUB installed on a very popular ARM64
device which has a full Tianocore/UEFI image available.  Unfortunately
while it is full Tianocore, the device lacks any private NVRAM and thus
is unable to store EFI variables.

`grub-install` tries to do a "normal" UEFI installation, which fails due
to lack of EFI variables.  As a result I need GRUB to install in the
MBR/GPT gap, but none of `grub-install`'s options appear to cause this.

Plan B might be to remove the EFI System UUID from the boot area, but
this solution seems wrong.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#991967: (Presently) Not in 5.10 source

2021-12-06 Thread Elliott Mitchell

Having finally gotten to test this, the issue does NOT affect 5.10.70-1.
So far I've only gotten to try reboot, but that went fine.

Might have been an ACPI or Xen mismerge into 4.19.  Alas this may simply
disappear into history.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1000147: radicale: Non-working init script

2021-11-18 Thread Elliott Mitchell

On Thu, Nov 18, 2021 at 07:26:50PM +0100, Jonas Smedegaard wrote:
> 
> Quoting Elliott Mitchell (2021-11-18 16:45:58)
> > Appears the documentation for `start-stop-daemon` is misleading or 
> > wrong, and the "--exec" option is needed if "--startas" is given a 
> > pathname.
> 
> This sounds like a bug in start-stop-daemon: please report against the 
> package dpkg which seems to provide start-stop-daemon, and provide more 
> details on how it fails to work.
> 
> 
> > Might be this is an issue for me, but not others since the "radicale" 
> > user's shell had been set to `/bin/false`.  As this is strongly 
> > recommended security hardening, the radicale package should work with 
> > a system setup this way.
> 
> Not sure what you are saying here, but seems a separate issue (even if 
> affecting the other one).
> 
> If you mean to say that using shell /usr/sbin/nologin for radicale 
> account is strongly discouraged, then please file a separate bugreport 
> about that - preferably with more details, as that is not obvious to me.
> 
> Also, please file a separate bugreport if you believe radicale should 
> work with custom shell setting and fails to do so (but works without 
> such change).  Because I agree that should work, and am surprised if it 
> doesn't (but I don't use sysV init system myself so cannot easily test).

My guess is this could be a documentation problem for
`start-stop-daemon`.

Based upon observed behavior, I suspect "--exec" changes to the
appropriate user and then does an execve() of the specified executable.
Whereas "--startas" is instead executing the shell of the specified
user with arguments as specified.

The latter requires the shell be valid.  Unless there is an
overwhelmingly important reason for the radicale user's shell to be
valid, it should instead be `/bin/false`.  This though requires use of
"--exec".

Since Radicale appears to function properly when started with "--exec"
that seems a vastly superior approach (doesn't result in security
concerns).
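
A sketch of an "--exec" based start line, as a dry run which only prints
the command.  The executable path, pidfile location and function name
are assumptions, not taken from the packaged init script.  The
start-stop-daemon manual also cautions that "--exec" matching may not
behave as intended with interpreted scripts, which could matter for the
stop case.

```shell
#!/bin/sh
# Print (not run) a start-stop-daemon invocation using --exec.
# Arguments: user, executable path, pidfile path (all placeholders).
print_ssd_start() {
    user=$1
    exe=$2
    pidfile=$3
    printf 'start-stop-daemon --start --background --make-pidfile --pidfile %s --chuid %s --exec %s\n' \
        "$pidfile" "$user" "$exe"
}

# Usage: print_ssd_start radicale /usr/bin/radicale /run/radicale.pid
```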


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#1000147: radicale: Non-working init script

2021-11-18 Thread Elliott Mitchell

Package: radicale
Version: 3.0.6-3
Severity: important

The init script `/etc/init.d/radicale` which is included with the 3.0.6-3
package failed to start Radicale for me.

Radicale's "--daemon" option was apparently removed with 3.0.6-3.
Attempting to use the "--daemon" option resulted in an error.
`start-stop-daemon`'s "-b" option was able to work around this.


Appears the documentation for `start-stop-daemon` is misleading or wrong,
and the "--exec" option is needed if "--startas" is given a pathname.

Might be this is an issue for me, but not others since the "radicale"
user's shell had been set to `/bin/false`.  As this is strongly
recommended security hardening, the radicale package should work with a
system setup this way.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#972950: ncal: cal fails to highlight current date (and rejects -h flag)

2021-10-24 Thread Elliott Mitchell

Yet another person who has noticed this.  Highlighting the current date
is rather handy for interactive use.

The basis of #904839 is incorrect.  Without that change `cal` uses
isatty() to determine whether output is a terminal.  If not a terminal,
highlighting is disabled (compare `ncal -b` and `ncal -b | cat`).  As
such, if the reporter for #904839 really was attempting to parse the
output, that wouldn't have been affected by highlighting.
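
The same terminal check can be sketched in shell with `[ -t 1 ]`, the
shell counterpart of isatty() on standard output.  This illustrates the
behavior described above and is not `cal`'s actual code; the function
name is made up.

```shell
#!/bin/sh
# Print a day number, in reverse video only when stdout is a terminal.
highlight_day() {
    if [ -t 1 ]; then
        printf '\033[7m%s\033[0m\n' "$1"
    else
        printf '%s\n' "$1"
    fi
}

# Usage: "highlight_day 24" highlights on a terminal, while
# "highlight_day 24 | cat" prints a plain 24 that parses cleanly.
```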

There was only a single reporter for #904839 (after the feature had been
in place for 5 years), there are now 5 complaints after it has been
absent for less than a year.  That seems like rather overwhelming support
for keeping highlighting.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#996988: Should Provide: flash-kernel on ARM(64)

2021-10-21 Thread Elliott Mitchell

Package: pv-grub-menu
Version: 1.3

SSIA.  On ARM(64) systems typical Linux kernel packages Recommend
"flash-kernel", but for VMs this is quite undesirable.  As such I would
suggest pv-grub-menu should be marked as providing flash-kernel on
ARM(64).

(I suspect this is harmless on other architectures, but is only really
needed on ARM(64))


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#996666: Xen PVH domains lack console

2021-10-16 Thread Elliott Mitchell

Package: grub-xen-host
Version: 2.04-20

I'm unsure which versions from stable were tried, but at a minimum
2.02+dfsg1-20+deb10u4 was and also had this issue.  I'm also unsure
whether this is actually a GRUB bug versus a Linux kernel bug.

When booting in x86 PVH mode the Linux kernel fails to load the Xen
virtual console as console.  As a result the kernel's dmesg is
unavailable for debugging during boot.

When using grub-x86_64-xen.bin (x86 PV mode) as Xen kernel:
$ cat /proc/consoles
tty0                 -WU (EC p  )    4:1
hvc0                 -W- (E  p  )  229:0
$ 

When using grub-i386-xen_pvh.bin (x86 PVH mode) as Xen kernel:
$ cat /proc/consoles
tty0                 -WU (EC p  )    4:1
$ 


When using Tianocore's XEN_EFI.fd (arm64 PVH mode) as Xen kernel:
$ cat /proc/consoles
hvc0                 -W- (EC p  )  229:0
$ 

Presently I think GRUB is more likely the culprit, but this is far from
certain.  Notably there are a few messages from the ACPI code, so I'm
wondering if GRUB sets up an ACPI table which isn't quite right.

I'm surprised at "tty0" being listed given the complete lack of any
sort of potential console device.  Yet this is x86, so I can understand.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#996608: linux-source-5.10: Mising dependency: dwarves

2021-10-15 Thread Elliott Mitchell

Package: linux-source-5.10
Version: 5.10.70-1

SSIA.  Debian's 5.10 configuration will NOT build without the "dwarves"
package (`pahole`).  In light of this some package, likely
linux-source-5.10, should recommend "dwarves".


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#452721: [Pkg-xen-devel] Bug#452721: "xendomains" does not restore domains in same order as it would start them

2021-09-28 Thread Elliott Mitchell

On Tue, Sep 28, 2021 at 11:39:49PM +0200, Diederik de Haas wrote:
> On Tuesday, 28 September 2021 13:41:57 CEST Andy Smith wrote:
> >
> > > Could the domain ID be used for that?
> >
> > I don't like it because it only says how recent a domain was
> > started relative to others, not any intention about start/stop
> > order. Shut one down manually (or crash) and start it again and it
> > gets a new domid higher than all existing.
>
> It is a (really) simple heuristic and likely too simple.
> But at first glance it seemed (to me) to actually do the right thing.

It is *definitely* too simple to do a good job; however, this has the
advantages of being a significant improvement and simple enough to be in 
service quickly.


On Wed, Sep 29, 2021 at 01:24:58AM +0200, Diederik de Haas wrote:
> On Wednesday, 29 September 2021 00:02:46 CEST Andy Smith wrote:
> > On Tue, Sep 28, 2021 at 11:39:49PM +0200, Diederik de Haas wrote:
> > The idea of the domid controlling/influencing order of shutdown
> 
> It was just an idea that popped in my head. All in all I've likely spent less
> than a minute thinking about the domid idea.
> Don't spend more on it than you already have ;)

The record shows I suggested it first:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=452721#35

This isn't an adequate solution, but is a distinct improvement.


> > > I really agree with the 'upstream' tag as not only should it be
> > > fixed/adjusted there, but it also engages a (much) larger audience who
> > > think of scenarios we likely didn't think about.
> > 
> > Should we move discussion to xen-us...@lists.xen.org then?
> 
> I can make a case for both xen-users and xen-devel.
> xen-users:
> It could be that a solution already exists. I know that in Qubes (which uses 
> Xen) has some dependency mechanism in that if you start vmA which depends on 
> vmB, then it first starts vmB and then vmA. I don't know if that is a Qubes 
> 'extension' or that they simply use available functionality of Xen.

Could be interesting to learn what solutions are already out there and
which features are must-haves.  Most existing solutions likely have
problems.  Some may be GPL-incompatible.  Most are likely very limited.


> xen-devel:
> If needed functionality doesn't yet exist and needs to be built anew, then 
> xen-devel is the right place to discuss that.
> 
> It could be that the best place to start is xen-users which then may/could 
> 'transition' to xen-devel.
> 
> Let's hear others first what they think is the best approach.

Perhaps.  Question is how much person-time is available for this?

If a great deal of xen-devel person-time can be devoted to this a very
ambitious solution might be viable.  If only a little bit of xen-devel
person-time is available, the approach would need to be very limited.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#452721: [Pkg-xen-devel] Bug#452721: #452721 moreinfo?

2021-09-27 Thread Elliott Mitchell

On Mon, Sep 27, 2021 at 05:13:04PM +, Andy Smith wrote:
> On Sun, Sep 26, 2021 at 08:07:58PM -0700, Elliott Mitchell wrote:
> > During a full downtime when all VMs were fully shut down, this effect
> > can be achieved by including numbers in the filename.  Say
> > /etc/xen/auto/0_ldap.cfg, /etc/xen/auto/1_fileserver.cfg,
> > /etc/xen/auto/9_everything_else.cfg.
> 
> I also do this to control start up order, though I use a prefix of
> NNN-.
> 
> The main missing functionality from my point of view is not being
> able to control the order of save/shutdown. As you say the script
> for saving everything or shutting everything down just does a read
> of all existing domids and does the action on them one by one in
> increasing order.

Seems we're running into the same problems, coming up with the same
first-tier workaround and now we all need a common complete solution.


> I think the "auto" directory is a pretty good and simple interface,
> so how about using it for save/shutdown as well? So, instead of just
> enumerating all running domids, enumerate all files in
> /etc/xen/auto/ in REVERSE order, parsing the name of the domain out
> of each one and doing the action on that name. When all files have
> been exhausted, THEN do the action on any remaining running domains.
> 
> This has the advantages of:
> 
> - still working even if administrator does not use ordering in
>   /etc/xen/auto. Filename format there does not change from what it
>   is now, where ordering is already possible but is optional.
> 
> - being quite obvious behaviour - save/shutdown order is reverse of
>   start order.

This though requires something which understands the format of those
files, can retrieve name or uuid, and then resolve that to something
suitable for `xl {save|shutdown}`.  Alternatively this requires
`xl {save|shutdown}` to be able to select the target domain based on the
configuration file (documentation reads like this might be halfway
implemented).

Additionally this needs a tool to identify domains which are NOT listed
in /etc/xen/auto/ then do save/shutdown on them first.
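
The ordering being discussed can be sketched as below.  This is a rough
illustration in Python, not anything `xendomains` does today.  It assumes
the hard part (resolving each file in /etc/xen/auto/ to a domain name) has
already been done, and the function name `shutdown_order` is hypothetical.

```python
def shutdown_order(auto_ordered, running):
    """Compute a save/shutdown order.

    auto_ordered: domain names in start order, as already resolved from
                  the files in /etc/xen/auto/ (assumed input).
    running:      names of the currently running domains, with Domain-0
                  already filtered out by the caller.
    """
    listed = set(auto_ordered)
    running_set = set(running)
    # Domains which are NOT listed in the auto directory go first
    # (sorted only to make the result deterministic).
    unlisted = sorted(d for d in running_set if d not in listed)
    # Listed domains follow in the reverse of their start order.
    ordered = [d for d in reversed(auto_ordered) if d in running_set]
    return unlisted + ordered
```

For example, with a start order of ldap, fileserver, web and an extra
unlisted domain "scratch" running, this yields scratch, web, fileserver,
ldap.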


> That seems like a good minimal improvement, but if one wanted to
> explicitly control save/shutdown order then perhaps the next
> enhancement could be an /etc/xen/shutdown/ directory with similar
> purpose to the "auto" one? i.e.:
> 
> 1. Enumerate files in "shutdown" directory in reverse order, getting
>    name from each and doing shutdown action on it
> 
> 2. If there were no files there, instead use "auto" directory for
>    this purpose
> 
> 3. Then do shutdown action on every remaining running domain as
>    usual
> 
> Again this still results in everything getting a shutdown action if
> administrator does not want to do any of this.
> 
> It's an open question for me whether step 2 (falling back to
> enumerating "auto" directory) only happens when "shutdown" directory
> is empty or if it should happen all of the time.

This strikes me (note, I am NOT a Debian maintainer) as likely to involve
too much work for too little gain.  For complex setups this won't be
enough, for simple setups this will be overkill.


> > If the hypervisor is rebooted and VMs are saved to /var/lib/xen/save;
> > they will be paused in identifier order, but saved by domain name.  When
> > scanning /var/lib/xen/save, `xendomains` goes by filename which means VMs
> > are restored in a distinct (and often problematic) order.
> > 
> > A minimal solution would be for `xendomains` to save VMs in
> > /var/lib/xen/save - and then use `sort -n` during restore.
> 
> If by this you mean it would be good if the "save all" action picked
> the filename from the filename in the "auto" directory, to replicate
> that directory's ordering, then I agree.
> 
> If however you mean the actual Xen domid of the running domain then
> I'm not sure what that would buy us. If I had a domain with a
> filename of 010-ldap0.cfg it might get started first and have domid
> 1, but then I reboot it and it has domid 99, I wouldn't want it
> saved as /var/lib/xen/save/99-ldap0, I'd still want it saved as
> /var/lib/xen/save/010-ldap0,

Minimal meaning very simple to implement, but very limited.

The idea is domains which start later get higher domain Ids.  As long as
crucial domains rarely get restarted, they will tend to keep low domain
Ids.  This fails when a crucial domain gets restarted late due to some
reason, but this might capture enough low-hanging fruit to be worthwhile.


> > A better approach would be to have an LSB-style header specifying
> > dependencies to flag VMs which should be saved or shut down late,
> > and VMs which should be saved or shut down early.

Bug#452721: #452721 moreinfo?

2021-09-26 Thread Elliott Mitchell

I'm surprised #452721 is tagged moreinfo since it seems simple, but that
may depend on installation capability.

Note, I am not the original reporter, so I might actually be observing
something distinct.  I doubt this, but I cannot be certain.


The issue is this: a hypervisor machine could have tens or even hundreds of
VMs.  There could be ordering dependencies during startup and shutdown.

Notably there are core services, such as LDAP, DHCP, fileserver and DNS.
Often these need to be up before anything else and they may need to come
up in a particular order.  Most often the LDAP server (which can be a
distinct VM) needs to be up first.  Meanwhile for downtimes, a fileserver
(which can also be a VM) needs to go down last.

During a full downtime when all VMs were fully shut down, this effect
can be achieved by including numbers in the filename.  Say
/etc/xen/auto/0_ldap.cfg, /etc/xen/auto/1_fileserver.cfg,
/etc/xen/auto/9_everything_else.cfg.

If the hypervisor is rebooted and VMs are saved to /var/lib/xen/save;
they will be paused in identifier order, but saved by domain name.  When
scanning /var/lib/xen/save, `xendomains` goes by filename which means VMs
are restored in a distinct (and often problematic) order.


A minimal solution would be for `xendomains` to save VMs in
/var/lib/xen/save - and then use `sort -n` during restore.
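
To illustrate why a numeric sort matters here, a small Python sketch
follows.  It assumes save files whose names begin with a number followed by
the domain name; that naming is a hypothetical example and not what
`xendomains` currently produces.  The function name `restore_order` is
likewise made up.

```python
import re


def restore_order(filenames):
    """Order save files by their leading number, roughly like `sort -n`."""
    def key(name):
        m = re.match(r"\d+", name)
        # Names without a leading number count as 0; ties fall back to a
        # plain comparison of the whole name.
        return (int(m.group()) if m else 0, name)
    return sorted(filenames, key=key)
```

A plain lexical sort of "10-web", "9-db" and "1-ldap" puts "10-web" ahead
of "9-db", whereas the numeric key above keeps 1, 9, 10 in that order.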

A better approach would be to have an LSB-style header specifying
dependencies to flag VMs which should be saved or shut down late, and VMs
which should be saved or shut down early.

A ridiculous overkill solution might be to turn the /etc/xen/*.cfg files
into full init scripts.  This could be done by having a script which
understood domain configuration files well enough to identify the
name/UUID and then start/stop the domain as specified by $1.  Use that
script as the interpreter (#! line), then it could find the configuration
via $0.  Then normal init script handling tools could take care of
ordering.

(geeze, that really does actually seem kind of like a semi-workable
solution despite seeming rather crazy at first)
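
A very rough sketch of the core of such an interpreter script, in Python.
It assumes the configuration uses the usual `name = "..."` line and only
handles "start" and "stop".  The function names are hypothetical, and a
real init script would need far more (status, LSB headers, error
handling).

```python
import re


def domain_name(cfg_text):
    """Extract the domain name from the text of a domain config file."""
    m = re.search(r'^\s*name\s*=\s*["\']([^"\']+)["\']', cfg_text,
                  re.MULTILINE)
    return m.group(1) if m else None


def xl_command(action, cfg_path, cfg_text):
    """Map an init-script action ($1) onto an `xl` command line.

    cfg_path plays the role of $0, i.e. the configuration file itself.
    """
    if action == "start":
        return ["xl", "create", cfg_path]
    if action == "stop":
        name = domain_name(cfg_text)
        if name is None:
            raise ValueError(f"no domain name found in {cfg_path}")
        return ["xl", "shutdown", name]
    raise ValueError(f"unsupported action: {action!r}")
```

The returned list could be handed to something like `subprocess.run()`;
ordering would then be left entirely to the normal init script tools.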


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#939186: irt: Bug #939186 and 4.11/4.14

2021-09-26 Thread Elliott Mitchell

found 939186 4.14.2+25-gb6a8c4f72d-2
found 939186 4.14.3-1
tags 939186 upstream
quit

Upon a bit more experimentation, seems my minimal example had become too
minimal.  Bring in the less minimal example and things explode again.

Finally set up an appropriate downtime window and it reproduced.  After
some fighting to get Xen's console operational, detail on the panic was
recorded.

My wild guess from the output is some combination of enabling "nestedhvm"
and this being an AMD processor machine.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#939186: Bug #939186 and 4.11/4.14

2021-09-26 Thread Elliott Mitchell

Control: found 939186 4.11.4+107-gef32c7afa2-1

Certainly reproduced with 4.11.  Most recently I tried 4.14 and it
*didn't* occur.

Problem is I've got two guesses:

First, the system had most VMs shut down for some experimentation.  Could be
having >50% of memory allocated is needed for this to occur.

Second, could be the issue got fixed sometime before 4.14.


I *really* hope it is the second, but #939186 should remain open for a
while so that experimentation can occur.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#991967: Simply ACPI powerdown/reset issue?

2021-09-25 Thread Elliott Mitchell

On Tue, Sep 21, 2021 at 06:33:20AM -0400, Chuck Zmudzinski wrote:
> I presume you are suggesting I try booting 4.19.181-1 on the
> current version of Xen-4.14 for bullseye as a dom0. I am not
> inclined to try it until an official Debian developer endorses
> your opinion that the bug I am seeing is distinct
> from #991967, at which point I will report the bug I am
> seeing as a new bug.

Chuck Zmudzinski, you are getting rather close to my threshold for calling
harassment.  You're not /quite/ there, but I'm concerned.

Since the purpose of the bug reports is to find and diagnose bugs, I did
a bit of experimentation and made some observations.

I checked out the Debian Xen source via git.  I got the current
"master" branch which is presently the candidate 4.14.3-1 version,
which includes urgent fixes.  The hash is:
e7a17db0305c8de891b366ad3528e5a43015

On top of this I cherry-picked 3 commits from Xen's main branch:
5a4087004d1adbbb223925f3306db0e5824a2bdc
0f089bbf43ecce6f27576cb548ba4341d0ec46a8
bc141e8ca56200bdd0a12e04a6ebff3c19d6c27b

(these can be retrieved via Xen's gitweb at
https://xenbits.xen.org/gitweb/?p=xen.git;a=patch;h=<$hash> which is
suitable for the `git am` command)
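
For anyone scripting this, the URL pattern above can be filled in with a
small helper.  This is just a convenience sketch in Python; the function
name `gitweb_patch_url` is made up for the example.

```python
import re

GITWEB = "https://xenbits.xen.org/gitweb/"


def gitweb_patch_url(commit):
    """Build the gitweb patch URL for a Xen commit hash."""
    # Accept only abbreviated or full hexadecimal commit hashes.
    if not re.fullmatch(r"[0-9a-f]{7,40}", commit):
        raise ValueError(f"not a commit hash: {commit!r}")
    return f"{GITWEB}?p=xen.git;a=patch;h={commit}"
```

The output fetched from such a URL is what gets fed to `git am`.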

With these I built 4.14.3-1 and then tried kernels 4.19.181-1 and
4.19.194-3 (this system is presently mostly on oldstable).  The results
were:

Xen 4.14.3-1 with Linux 4.19.181-1: system reboots were successful

Xen 4.14.3-1 with Linux 4.19.194-3: system reboots hung

Unfortunately I was too quick at installing the rebuilt 4.14.3-1 and I
missed trying the vanilla Debian 4.14.2+25-gb6a8c4f72d-2 with
Linux 4.19.181-1.  I believe this combination would have hung during
reboot.

As such, I believe there are in fact two distinct bugs being observed.
The presence of EITHER of these is sufficient to cause hangs during
powerdown or reboot.

First, some patch originally from Linux's main branch which breaks Xen
reboots was backported somewhere between 4.19.181-1 and 4.19.194-3.  This may
either have been introduced before 5.10 diverged from main, or may also
have been backported to 5.10.  THIS is Debian bug #991967.

Second, the Xen patch 3c428e9ecb1f290689080c11e0c37b793425bef1 which is
valuable to ARM devices breaks reboots and powerdowns on x86.  This is
correctly fixed by 0f089bbf43ecce6f27576cb548ba4341d0ec46a8.  Presently
this has no Debian bug report.

The first is presently unidentified, someone enthusiastic either needs to
read git logs/source code, or bisect and build to find where it got
broken.

For the second we seem to have a fix.  The only question is how many
patches to cherry-pick.  bc141e8ca562 is non-urgent as it is merely
superficial and not needed for functionality.  5a4087004d1a is a
workaround for Linux kernel breakage, but how likely are we to see that
fixed in the Linux kernel packages?  The fix is well-contained and needed
for some highly popular ARM devices.

-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#991967: #991967: Simply ACPI powerdown/reset issue?

2021-09-20 Thread Elliott Mitchell

On Mon, Sep 20, 2021 at 10:23:39PM -0400, Chuck Zmudzinski wrote:
> 
> On 9/20/21 7:39 PM, Diederik de Haas wrote:
> > On dinsdag 21 september 2021 01:15:15 CEST Elliott Mitchell wrote:
> >> Merely having the path is a sufficiently strong indicator for me to
> >> simply wave it past.  I would, though, suggest Debian should instead
> >> cherry-pick commit 0f089bbf43ecce6f27576cb548ba4341d0ec46a8.
> >>
> >> This is available as a patch at:
> >>
> >> https://xenbits.xen.org/gitweb/?p=xen.git;a=patch;h=0f089bbf43ecce6f27576cb548ba4341d0ec46a8
> > You probably then also want the following commit, which is a fix on that 
> > patch:
> > https://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=bc141e8ca56200bdd0a12e04a6ebff3c19d6c27b
> >
> > Found that via the following url/query:
> > https://xenbits.xen.org/gitweb/?p=xen.git;a=search;h=HEAD;st=commit;s=x86%2FACPI
> >
> > I don't know whether others should be used from that as well.
> 
> I tried these two commits (adapted for the xen-4.14 branch) but this
> approach did not fix the bug - with these patches applied the dom0
> did not power down.
> 
> My advice for the Debian Xen Team is to consult with upstream and
> get their advice on whether or not it is advisable for Debian to
> retain the patches from the Xen-4.16 branch that have been
> added to the Debian 4.14 package in an attempt to support
> some arm devices that panic during on an unpatched Xen-4.14.
> If upstream cannot help Debian backport fixes for arm panics
> from Xen-4.16/unstable to Xen-4.14 stable, I think the Debian
> Xen team should remove aggressive patches that really have now
> turned the Debian Xen-4.14 package into a Frankenstein version
> that is a mixture of Xen-4.14 and Xen-4.16, and decide that support
> for those arm devices must wait until Debian gets Xen 4.16 up
> and running on the unstable and hopefully soon, testing distribution.

It is still not established you're running into #991967.  Unless the patch
you're pointing towards was backported to the Xen 4.11 packages (which I
doubt) it cannot explain #991967, since 4.11 was in use at the time.

Could be this is a second bug with symptoms similar to #991967.  Now
that a fix for the second bug has been identified, you might try a
4.19.181-1 kernel and see whether that fixes things.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#991967: #991967: Simply ACPI powerdown/reset issue?

2021-09-20 Thread Elliott Mitchell

On Mon, Sep 20, 2021 at 06:29:49PM -0400, Chuck Zmudzinski wrote:
> On 9/20/21 1:43 PM, Chuck Zmudzinski wrote:
> >
> > On 9/20/21 12:27 AM, Elliott Mitchell wrote:
> >> On Sun, Sep 19, 2021 at 01:05:56AM -0400, Chuck Zmudzinski wrote:
> >>
> >>> I suspect the following patch is the culprit for problems
> >>> shutting down on the amd64 architecture:
> >>>
> >>> 0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
> >>> This patch does affect amd64 acpi code, and is probably causing
> >>> the problem on my amd64 system, so my build of the xen-4.14
> >>> hypervisor without this patch fixed the problem.
> >> Of the ones listed that is the only one which has any overlap with x86
> >> code.  The next reproduction step is `apt-get source xen &&
> >> patch -p1 -R < 0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
> >> && dpkg-buildpackage -b`.  Then try with this to confirm that patch
> >> is what does it.
> >>
> >> Thing is that delta is rather small.  I don't have a simulator, but that
> >> is rather small to be the culprit.
> >
> > I just tested the build with
> > patch -p1 -R < 0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
> > applied before building the package and I can confirm that this is the
> > patch causing the trouble for dom0 poweroff on x86/amd64. Reverting this
> > patch fixes it on my amd64 system. But this would probably break the arm
> > build.
> >
> > I think one possible fix would require modifying
> > 0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
> > so it only applies at runtime to the arm architecture. I will try some
> > modifications to the patch instead of removing it, and if I get something
> > that works on amd64 and also might work on arm, I will post it
> > for Elliott to try.
> 
> I have an encouraging result. I found a very simple patch
> to xen/arch/x86/acpi/lib.c that fixes the dom0 poweroff
> bug on my system and it should not affect the arm patches
> at all:
> --
> This patch partially reverts previous patch
> 0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
> 
> This hopefully fixes #911976
> 
> --- a/xen/arch/x86/acpi/lib.c   2021-09-20 16:49:08.0 -0400
> +++ b/xen/arch/x86/acpi/lib.c   2021-09-20 16:25:05.572038000 -0400
> @@ -46,10 +46,6 @@
>      if ((phys + size) <= (1 * 1024 * 1024))
>          return __va(phys);
> 
> -    /* No further arch specific implementation after early boot */
> -    if (system_state >= SYS_STATE_boot)
> -        return NULL;
> -
>      offset = phys & (PAGE_SIZE - 1);
>      mapped_size = PAGE_SIZE - offset;
>      set_fixmap(FIX_ACPI_END, phys);
> --
> 
> Can you try this patch to src:xen and see if your
> arm devices are OK with it?

Merely having the path is a sufficiently strong indicator for me to
simply wave it past.  I would, though, suggest Debian should instead
cherry-pick commit 0f089bbf43ecce6f27576cb548ba4341d0ec46a8.

This is available as a patch at:

https://xenbits.xen.org/gitweb/?p=xen.git;a=patch;h=0f089bbf43ecce6f27576cb548ba4341d0ec46a8


The other commit I would suggest being picked by src:xen is
5a4087004d1adbbb223925f3306db0e5824a2bdc

This is for device-tree funkiness which got added between linux-5.10.0
and linux-5.10.y (if the Debian kernel team wants to maintain a fix in
Debian's kernel source, that works too).

BTW have I mentioned I've become rather skeptical of device-trees being
a usable way of representing hardware information?


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#991967: #991967: Simply ACPI powerdown/reset issue?

2021-09-19 Thread Elliott Mitchell

On Sun, Sep 19, 2021 at 01:05:56AM -0400, Chuck Zmudzinski wrote:
> xen hypervisor version: 4.14.2+25-gb6a8c4f72d-2, amd64
> 
> linux kernel version: 5.10.46-4 (the current amd64 kernel
> for bullseye)
> 
> Boot system: EFI, not using secure boot, booting xen
> hypervisor and dom0 bullseye with grub-efi package for
> bullseye, and it boots the xen-4.14-amd64.gz file, not
> the xen-4.14-amd64.efi file.

> I also tested a buster dom0 with the 4.19 series kernel
> on the xen-4.14 hypervisor from bullseye and saw the
> problem, but I did not see the problem with either
> a buster (linux 4.19) or bullseye (linux 5.10) dom0 on
> the xen-4.11 hypervisor, so I think the problem is
> with the Debian version of the xen-4.14 hypervisor,
> not with src:linux.

You're referencing several software versions which are mismatches for
#991967.  #991967 was observed with Xen 4.11 and Linux kernel 4.19.194-3,
but not Linux kernel 4.19.181.

The fact it correlates with a Linux kernel update rather strongly points
to the Linux kernel.  I could believe the situation is partially the
fault of both though.

> I suspect the following patch is the culprit for problems
> shutting down on the amd64 architecture:
> 
> 0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch

> This patch does affect amd64 acpi code, and is probably causing
> the problem on my amd64 system, so my build of the xen-4.14
> hypervisor without this patch fixed the problem.

Of the ones listed that is the only one which has any overlap with x86
code.  The next reproduction step is `apt-get source xen &&
patch -p1 -R < 0030-xen-acpi-Rework-acpi_os_map_memory-and-acpi_os_unmap.patch
&& dpkg-buildpackage -b`.  Then try with this to confirm that patch
is what does it.

Thing is that delta is rather small.  I don't have a simulator, but that
is rather small to be the culprit.

> I think this bug should be re-classified as a bug in src:xen.

There could be a separate bug in src:xen, but that is not #991967.

> I also would inquire with the Debian Xen Team about why they
> are backporting patches from the upstream xen unstable
> branch into Debian's 4.14 package that is currently shipping
> on Debian stable (bullseye). IMHO, the aforementioned
> patches that are not in the stable 4.14 branch upstream
> should not be included in the xen package for Debian stable.

It was requested since someone trying to have Xen operational on a device
needed those for operation.  Rather a lot of bugfix or very small
standalone feature patches get cherry-picked.

Presently I haven't been convinced this is a Xen bug (though it does
affect Xen installations).

Any chance you've got the tools to build and try a 5.5.0 or 5.10.0 Linux
kernel?  I'm suspecting something got incorrectly backported on the Linux side
(alternatively the Xen project seems a bit poor at keeping needed patches
in Linux).

-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#991967: #991967: Simply ACPI powerdown/reset issue?

2021-09-19 Thread Elliott Mitchell

On Sun, Sep 19, 2021 at 01:05:56AM -0400, Chuck Zmudzinski wrote:
> On Sat, 11 Sep 2021 13:29:12 +0200 Salvatore Bonaccorso wrote:
>  >
>  > On Fri, Sep 10, 2021 at 06:47:12PM -0700, Elliott Mitchell wrote:
>  > > An experiment led to a potential alternative explanation for #991967.
>  > > The issue may be ACPI (non-UEFI) powerdown/reset was broken at
>  > > 4.19.194-3. Presence of Xen on the system may be unrelated.
>  > >
>  > > Failing that, it could be Xen and non-UEFI systems are affected. (Xen
>  > > was tried on a UEFI system and the issue wasn't observed)
>  >
>  > Following up on https://bugs.debian.org/991967#12
>  >
>  > Did you succeed in bisecting the issue as you seem to have it
>  > reproducible?
> 
> I noticed this bug on bullseye ever since I have been
> running bullseye as a dom0, but my testing indicates
> there is no problem with src:linux but the problem
> appeared in src:xen with the 4.14 version of xen on
> bullseye.
> 
> I ask Elliott if you are only seeing the problem on Debian's
> xen-4.14 hypervisor? Also, which architecture, arm or
> amd64? I only see the problem on the Debian xen-4.14
> hypervisor, and I have only tested on amd64, and I
> have found a fix for my amd64 system which is as
> follows:
> 
> Motherboard: ASRock B85M Pro4, BIOS P2.50 12/11/2015,
> with a Haswell CPU (core i5-4590S)
> 
> xen hypervisor version: 4.14.2+25-gb6a8c4f72d-2, amd64
> 
> linux kernel version: 5.10.46-4 (the current amd64 kernel
> for bullseye)

Nope.  As per the report the problem appeared with kernel 4.19.194-3 and
at the time using Xen 4.11.

The kernel you're listing is rather more recent, which might suggest a
patch which had been backported from 5.x to 4.19.

I could believe a Xen security update being the trigger though (I don't
recall there being one at the right time, but I wouldn't rule it out).

> Boot system: EFI, not using secure boot, booting xen
> hypervisor and dom0 bullseye with grub-efi package for
> bullseye, and it boots the xen-4.14-amd64.gz file, not
> the xen-4.14-amd64.efi file.
> 
> I also tested a buster dom0 with the 4.19 series kernel
> on the xen-4.14 hypervisor from bullseye and saw the
> problem, but I did not see the problem with either
> a buster (linux 4.19) or bullseye (linux 5.10) dom0 on
> the xen-4.11 hypervisor, so I think the problem is
> with the Debian version of the xen-4.14 hypervisor,
> not with src:linux.

Just to make sure, the kernel you were testing was 4.19.194-3?  The
issue didn't manifest with kernels earlier than that.

Could be we're seeing distinct bugs.

> This patch does affect amd64 acpi code, and is probably causing
> the problem on my amd64 system, so my build of the xen-4.14
> hypervisor without this patch fixed the problem.

While that commit modifies the code path the processor takes, the
modified path appears identical.

> I also would inquire with the Debian Xen Team about why they
> are backporting patches from the upstream xen unstable
> branch into Debian's 4.14 package that is currently shipping
> on Debian stable (bullseye). IMHO, the aforementioned
> patches that are not in the stable 4.14 branch upstream
> should not be included in the xen package for Debian stable.

Some people are asking for those.  Those are bugfixes for an extremely
popular device which panics on boot without the patches.

Meanwhile it turned out between 5.10.0 and 5.10.30 the ARM64 device-trees
were modified in a way which broke Xen 4.14 on ARM64.  The change
violated Linux's own standards for device-trees, yet still appeared in a
stable branch.

In other news, if you see device-trees compared to ACPI tables, they're
not very comparable.  99% of ACPI tables work for all versions of all
OSes.  Any given device-tree is only likely to work for a single version
of a single OS.  While a useful abstraction for portions of kernel code,
device-trees are utter garbage compared to ACPI tables.

-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#991967: #991967: Simply ACPI powerdown/reset issue?

2021-09-12 Thread Elliott Mitchell

On Sat, Sep 11, 2021 at 01:29:12PM +0200, Salvatore Bonaccorso wrote:
> On Fri, Sep 10, 2021 at 06:47:12PM -0700, Elliott Mitchell wrote:
> > An experiment led to a potential alternative explanation for #991967.
> > The issue may be ACPI (non-UEFI) powerdown/reset was broken at
> > 4.19.194-3.  Presence of Xen on the system may be unrelated.
> > 
> > Failing that, it could be Xen and non-UEFI systems are affected.  (Xen
> > was tried on a UEFI system and the issue wasn't observed)
> 
> Following up on https://bugs.debian.org/991967#12
> 
> Did you succeeded in bisecting the issue as you seem to have it
> reproducible?

The problem is that bisecting means rather a lot of kernel builds, which
also means a lot of downtime...   Right now the distribution update seems
worthy of greater attention.

The one notable bit is the one I sent in the last message.  The system
does NOT have UEFI, and a test system with UEFI seemed to have no
problem.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#991967: #991967: Simply ACPI powerdown/reset issue?

2021-09-10 Thread Elliott Mitchell

An experiment led to a potential alternative explanation for #991967.
The issue may be ACPI (non-UEFI) powerdown/reset was broken at
4.19.194-3.  Presence of Xen on the system may be unrelated.

Failing that, it could be Xen and non-UEFI systems are affected.  (Xen
was tried on a UEFI system and the issue wasn't observed)


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#991967: linux-src 4.19.194-3 breaks Xen Dom0 powerdown and reboot

2021-08-06 Thread Elliott Mitchell

Package: src:linux
Version: 4.19.194-3
Control: affects -1 src:xen

SSIA.  Previous versions of 4.19 had no issues (4.19.181-1 according to
notes), but this cropped up with 4.19.194-3 (-1 and -2 weren't tested).

When a Xen domain 0 tries to reboot or powerdown the computer, it hangs
with the display off, but the power supply is active.

I'm rebuilding from source, so I imagine this also affects
linux-image-4.19.0-17-amd64.

Seems .194 caused multiple problems for Xen given 990642.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#989560: Bug #989560 is grub-common, not xen-hypervisor-common

2021-08-03 Thread Elliott Mitchell

I rate #989560 as a grub-common bug, *not* a xen-hypervisor-common bug.
As you've noticed, the problem is with the file /etc/grub.d/20_linux_xen,
which is part of grub-common, not xen-hypervisor-common.

A working grub.cfg will be generated by the version of the file from
GRUB 2.04.  If you can deal with installing *only* GRUB from testing,
that should work.

The bug should be reassigned to grub-common, but marked as affecting
Xen so duplicate reports don't show up (actually I'm pretty sure reports
against grub-common or src:grub2 already exist).


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#979548: u-boot: Package Xen build

2021-01-14 Thread Elliott Mitchell

On Thu, Jan 07, 2021 at 11:34:44PM -0800, Vagrant Cascadian wrote:
> On 2021-01-07, Elliott Mitchell wrote:
> > Might it be possible to get a u-boot-xen-arm64 package built?  While
> > "PyGRUB" is great for Linux, it isn't so good for booting other OSes.
> 
> Do you mean:
> 
>   
> https://gitlab.denx.de/u-boot/u-boot/-/blob/master/doc/board/xen/xenguest_arm64.rst
> 
> This doesn't describe how to use it or, importantly, what files we would
> need to ship in the package. If you could help clarify that (possibly
> provide a patch), and ideally get it clarified in the upstream
> documentation, then I would think we would be able to ship such a
> package.

Appears the build issue wasn't libfdt-dev, but instead `dwz` and
`debhelper`.  I suspect libfdt-dev:any or libfdt-dev may now be
sufficient for building (I'm not 100% sure since I have a workaround in
place).

Anyway I now have something which looks like a first pass at having
U-Boot/ARM boot a Xen VM.  Some progress has been made, but I haven't
confirmed full operation yet.

The build was achieved by copying configs/xenguest_arm64_defconfig over
qemu_arm64_defconfig and then cross-building for arm64.  This suggests
extra steps for "qemu" are also appropriate for "xenguest".

Once complete, the file ./usr/lib/u-boot/qemu_arm64/u-boot.bin was
copied to the host machine.  A configuration file was created for xl:
the value for "kernel" was pointed at the u-boot.bin file, and both
"bootloader" and "ramdisk" options were left unset.

Upon attempt to boot this VM (`xl create -c u-boot.cfg`) I ended up at a
prompt "xenguest# ".  The command-line appeared to act how I would expect
U-Boot to act, so I conclude U-Boot had successfully loaded.

The next task is to get the OS I wish to run in the VM loaded by U-Boot.

As of 2020.10+dfsg-2, appears the "xenguest" defconfig disables all
EFI/GPT support.  I must recommend the U-Boot maintainers advise upstream
to set CONFIG_EFI_LOADER=y and CONFIG_CMD_PART=y in the Xen defconfig.

While some smaller VMs may not need EFI support, it appears to be gaining
traction everywhere with ARM64.  I note SuSE uses it as an intermediate
stage between U-Boot and GRUB.  FreeBSD's ARM64 VM images appear to
*assume* EFI is in use.

I haven't gotten U-Boot/Xen to successfully load FreeBSD's bootloader
yet, but progress is being made.

-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#979548: u-boot: Package Xen build

2021-01-11 Thread Elliott Mitchell

On Thu, Jan 07, 2021 at 11:34:44PM -0800, Vagrant Cascadian wrote:
> This doesn't describe how to use it or, importantly, what files we would
> need to ship in the package. If you could help clarify that (possibly
> provide a patch), and ideally get it clarified in the upstream
> documentation, then I would think we would be able to ship such a
> package.

I think 2 or 3 files would be useful to ship in such a package.  First
would be "u-boot.bin" or whatever the output filename is.  Second might
be a README mentioning the 3 values needing to be set in a domu.cfg file.
Third might be a /etc/xen/xlexample.u-boot file.

The 3 values which need to be set in the domain configuration file are:

kernel = "/usr/lib/u-boot/xen/u-boot.bin"
# ramdisk =
# extra =

Mainly the "ramdisk" and "extra" settings should be left unset, while
"kernel" points at the U-Boot image.  A /etc/xen/xlexample.u-boot would
be a copy of Xen's /etc/xen/xlexample.pvlinux with the 3 values set
appropriately.
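
The settings above can be sketched as a small generator for such a
domain configuration.  This is only an illustration: the domain name,
memory size, vcpu count and disk path are hypothetical placeholders, and
the U-Boot path is the one proposed above rather than a path any package
is known to ship.

```python
# Sketch only: renders a minimal xl domain configuration that boots U-Boot.
# The domain name, memory size and disk target are hypothetical placeholders.

def render_uboot_domain_cfg(name, uboot_path, memory_mb, disk_target):
    """Return xl.cfg text with "kernel" set and "ramdisk"/"extra" left unset."""
    lines = [
        f'name = "{name}"',
        f'kernel = "{uboot_path}"',
        "# ramdisk and extra are deliberately left unset",
        f"memory = {memory_mb}",
        "vcpus = 1",
        f"disk = [ 'format=raw, vdev=xvda, access=rw, target={disk_target}' ]",
    ]
    return "\n".join(lines) + "\n"


cfg = render_uboot_domain_cfg(
    name="u-boot-test",
    uboot_path="/usr/lib/u-boot/xen/u-boot.bin",
    memory_mb=512,
    disk_target="/dev/vg0/guest-disk",
)
print(cfg)
```

The output could be saved as something like /etc/xen/xlexample.u-boot and
started with `xl create -c`.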

Then https://wiki.debian.org/Xen should be adjusted to mention U-Boot
being available to boot user domains for Xen.  In fact I'm trying to
find out whether Xen/U-Boot can load OSes besides Linux.

Note, this is presently theory as src:u-boot has a problematic set of
package requirements.

Presently libfdt-dev doesn't allow installation of multiple architecture
versions.  My build VM is setup to target Xen which needs the host
package, while the u-boot build needs the build package.  Grr!

-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#979548: u-boot: Package Xen build

2021-01-08 Thread Elliott Mitchell

On Thu, Jan 07, 2021 at 11:34:44PM -0800, Vagrant Cascadian wrote:
> On 2021-01-07, Elliott Mitchell wrote:
> > Might it be possible to get a u-boot-xen-arm64 package built?  While
> > "PyGRUB" is great for Linux, it isn't so good for booting other OSes.
> 
> Do you mean:
> 
>   
> https://gitlab.denx.de/u-boot/u-boot/-/blob/master/doc/board/xen/xenguest_arm64.rst
> 
> This doesn't describe how to use it or, importantly, what files we would
> need to ship in the package. If you could help clarify that (possibly
> provide a patch), and ideally get it clarified in the upstream
> documentation, then I would think we would be able to ship such a
> package.

I'm less than 100% sure myself.  :-)

Most likely you simply configure xenguest_arm64_defconfig, build the
configuration and the package would be the copyright plus one output
file.

In order to use this you setup a VM/domain configuration file where the
single output file is specified as the "kernel" parameter.  This would
cause the U-Boot image to be loaded as if it was an OS kernel and be
loaded into the resultant VM and started.  Then in theory U-Boot loads
configuration parameters from VM disk devices as it normally would.

-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#974755: smartd: Problematic memory activity

2020-12-12 Thread Elliott Mitchell

Hmm, don't see a copy of the follow-up message anywhere.  Sent to the bug
and not me?

6 devices are being monitored, they're behind a HP controller (cciss
driver).

I don't know for certain that triggering self-tests is the cause, this is
merely obvious speculation.  My most recent observations seem to suggest
this is incorrect as the oom-killer was triggered at a time when
self-tests shouldn't be run.

I'm open to other programs on the system being the actual cause,
allocating memory quickly and pushing the limits.  Thing is I've never
observed any program besides `smartd` triggering the oom-killer.
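
One way to test whether `smartd` itself is growing would be to log its
resident memory over time.  A minimal sketch, assuming a Linux /proc
filesystem; the function names are made up for illustration and the
sample text stands in for the contents of /proc/<pid>/status.

```python
import re


def vmrss_kib(status_text):
    """Extract the VmRSS value (in kB) from /proc/<pid>/status text, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def read_vmrss_kib(pid):
    """Read the resident set size of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return vmrss_kib(handle.read())


# Stand-in for the contents of /proc/<pid>/status of a smartd process.
sample = "Name:\tsmartd\nVmPeak:\t   12000 kB\nVmRSS:\t    3456 kB\nThreads:\t1\n"
print(vmrss_kib(sample))
```

Calling read_vmrss_kib() periodically with the PID of `smartd` and
comparing the values against the oom-killer timestamps would show whether
its memory use actually spikes.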


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#976123: u-boot-rpi: Unreliable USB with storage+keyboard

2020-11-29 Thread Elliott Mitchell

Package: u-boot-rpi
Version: 2020.10+dfsg-1+b1
Severity: important

Hopefully SSIA.

U-Boot's USB support is highly unreliable.  Trying to interact with an
advanced bootloader (GRUB) via USB-keyboard is highly troublesome if the
Raspberry PI is also booting from a USB storage device.

There is some level of magic timing needed to get U-Boot to detect both
and provide access for long enough for the bootloader (GRUB here) to
finish its job.

UEFI is very new to U-Boot, but USB is highly unreliable.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#976122: u-boot-rpi: Fails with mini-UART

2020-11-29 Thread Elliott Mitchell

Package: u-boot-rpi
Version: 2020.10+dfsg-1+b1
Severity: important

Appears "standard" device trees for the Raspberry PI 4B connect the
serial pins to the mini-UART.  This is troublesome due to the mini-UART's
baud rate changing when the processor clock changes.

Often Raspberry PI devices have an initial boot phase where the
processor clock is locked at maximum for a period, and then decreased.
If/when that decrease occurs, the baud rate changes and suddenly serial
communication becomes corrupt.

3 strategies come to mind for U-Boot:

1>  Dynamically modify the baud rate register as the processor clock
changes.  The register holds a divisor, so if the processor clock is
increased, increase the baud rate register.  If the processor clock is
decreased, decrease the baud rate register.
2>  Peg the processor clock at maximum until EFI boot mode is exited.
3>  Peg the processor clock at minimum until EFI boot mode is exited.

The first is ideal, but requires U-Boot to monitor the processor clock
as it changes dynamically.  The next two are suboptimal, but not too
likely to cause problems.
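
The first strategy can be made concrete with a little arithmetic.  This
is a minimal sketch, assuming the mini-UART follows the divisor formula
from Broadcom's BCM2835 peripherals documentation, baud = clock / (8 *
(register + 1)), and that the Raspberry PI 4B behaves the same way.  The
500 MHz and 250 MHz clock values are purely illustrative.  Under that
formula the register holds a divisor, so its value moves in the same
direction as the clock feeding the mini-UART.

```python
def mini_uart_baud(clock_hz, reg):
    """Baud rate produced by a given divisor register value."""
    return clock_hz / (8 * (reg + 1))


def mini_uart_reg(clock_hz, baud):
    """Nearest divisor register value for the requested baud rate."""
    return round(clock_hz / (8 * baud)) - 1


TARGET_BAUD = 115200
fast_clock = 500_000_000  # illustrative clock while pegged at maximum
slow_clock = 250_000_000  # illustrative clock after the decrease

reg_fast = mini_uart_reg(fast_clock, TARGET_BAUD)
reg_slow = mini_uart_reg(slow_clock, TARGET_BAUD)

# If the register is left alone when the clock drops, the baud rate drops
# with it, which is the corruption described above.
stale_baud = mini_uart_baud(slow_clock, reg_fast)

print("register at fast clock:", reg_fast)
print("register at slow clock:", reg_slow)
print("baud with stale register:", round(stale_baud))
```

Strategies 2 and 3 avoid recomputing the register at all by keeping the
clock term constant.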

The likely cause for a bootloader (GRUB) to remain active is user
interaction via the serial port.  In this case a stable baud rate is
crucial.  GRUB is likely to issue halt instructions while waiting and
this should keep processor temperature down.  As such I feel pegging the
processor clock to max is better than pegging to minimum.  Once an OS
takes over, EFI boot services should be exited and then it is no longer
a U-Boot issue.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#939633: More severe #939633 for RP4 on 5.8?

2020-11-27 Thread Elliott Mitchell

found 935456 5.9.6-1~bpo10+1
quit

After having spent several hours on kernel compiles and experimenting
with the situation, I'm fairly sure this also applies to
linux-source-5.9.

Odd thing is, when I booted the device using the Tianocore implementation
it came right up with no problems.  I'm getting this odd suspicion
someone deliberately broke the device-trees in Debian's kernel source.
The goal being to force everyone onto the Tianocore/ACPI implementation
and try to kill device-trees.

Right now I think this is conspiracy theory territory, but I'm left
wondering how such a serious bug could hang around so long...


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#824954: IRT: [bug #52939] [PATCH] 10_linux: support loading device trees

2020-11-26 Thread Elliott Mitchell

The patch to have GRUB load a device-tree is interesting.  This is
certainly worthy of discussion.

Three issues come up when looking though:

First, your patch modifies /etc/grub.d/10_linux, but misses
/etc/grub.d/20_linux_xen.  /etc/grub.d/20_linux_xen needs a fairly
similar treatment.

Second, rather than having this get buried inside Debian bug #824954, you
should instead file a new bug against grub-common.

Third, there may be a need for extra guarding to ensure these sections
*only* get invoked on ARM devices (I'm fairly sure the *exact* *same*
file is shipped for all architectures).


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#963962: /etc/grub.d/20_linux_xen generates non-functional menu entries

2020-11-26 Thread Elliott Mitchell

found 963962 2.02+dfsg1-20+deb10u2 2.04-10
quit

I was going to report I'd never observed this bug, but then I examined
the grub.cfg files and discovered the entries are present.  I would tend
to rate this as minor, but the original submitter didn't adjust severity.

With 2.04-10 the xen-4.*.config file entries are absent, but entries
for both the .efi file and the other are produced.  On an aarch64 system
the .efi file can be booted by GRUB 2.04.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#824954: flash-kernel: GRUB? via U-Boot?

2020-11-26 Thread Elliott Mitchell

For a Raspberry PI, I've got the initial workings of a script to
accomplish this goal.

First, install u-boot-rpi, raspi-firmware, and grub-efi-arm64.

Next, create a filesystem on a device the Raspberry PI will boot from.
For anything pre-RP4, this will have to be VFAT and show up in an MBR.  A
system I've done has a GPT with entry #3, which matches with entry #1 in
MBR.  The Raspberry PI will find this and boot from it, Linux will see it
as /dev/sda3.  Mount this filesystem on /boot/efi.


Do the following:

cp /usr/lib/raspi-firmware/* /boot/efi
# cp /usr/share/doc/raspi-firmware/copyright /boot/efi/LICENSE.broadcom

cp /usr/lib/u-boot/rpi_arm64/u-boot.bin /boot/efi/u-boot64.bin

cp /usr/lib/u-boot/rpi_3/u-boot.bin /boot/efi/u-boot3.bin
cp /usr/lib/u-boot/rpi_4/u-boot.bin /boot/efi/u-boot4.bin

cp /boot/dtbs/`uname -r`/broadcom/bcm2*-rpi*.dtb /boot/efi

grub-install --bootloader-id=BOOT
cp /boot/efi/EFI/BOOT/grubaa64.efi /boot/efi/EFI/BOOT/bootaa64.efi

echo bootaa64 > /boot/efi/startup.nsh


Now, I'm using SuSE as a starting point.  They copy a series of
device-tree overlays into /boot/efi/overlays.  These may come from the
Raspberry PI Foundation for optional hardware/configuration the RPF
provides.

Next would be to create /boot/efi/config.txt.  I'm unsure of which
directives would be appropriate for Debian.  Debian would certainly need
to configure distinct "kernel=" lines depending upon which variant was
being booted.

This is rather badly damaged by bug #939633.  Until the device-trees are
fixed, this is completely broken.

Not ready for most people, but almost there...


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#940628: Working in 2.04-8 and 2.04-10

2020-11-26 Thread Elliott Mitchell

As of 2.04-8 it was possible to boot Xen on ARM.  The funky mechanism by
which GRUB loads its modules does a good job of obscuring which modules
to confirm presence of.

Seeing 'xen_loader="xen_hypervisor"' makes one expect to find
"/usr/lib/grub/arm64-efi/xen_hypervisor.mod", not for it to be taken care
of by "/usr/lib/grub/arm64-efi/xen_boot.mod".
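
A sketch of how one might look up which module provides a command.  It
assumes GRUB's module directory carries a command.lst file with
"command: module" lines (an optional leading "*" marks some entries);
the sample text is illustrative, not copied from a real installation.

```python
def parse_command_list(text):
    """Map GRUB command names to the module assumed to provide them."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        command, module = line.split(":", 1)
        mapping[command.lstrip("*").strip()] = module.strip()
    return mapping


# Illustrative stand-in for /usr/lib/grub/arm64-efi/command.lst.
sample = "*acpi: acpi\nxen_hypervisor: xen_boot\nxen_module: xen_boot\n"
commands = parse_command_list(sample)
print(commands["xen_hypervisor"] + ".mod")
```

With a real command.lst read from disk, the same lookup would say which
.mod file to check for.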


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#939633: More severe #939633 for RP4 on 5.8?

2020-11-25 Thread Elliott Mitchell

found 939633 5.8.10-1~bpo10+1
severity 939633 important
merge 935456 939633
quit

I'm left suspecting bugs #935456 and #939633 are in reality a single
bug: Raspberry Pi device trees were garbled during Debian's 5.2 kernel
development.

They appear to remain very garbled, to the point of being pretty well
useless.  I've built a kernel from Debian's 5.8 kernel source and the
device tree binary produced doesn't appear to allow a Raspberry PI 4B
to complete its boot.  Might be USB functionality is operational, but
neither ethernet interface nor display function.

Ironically, the additional ACPI/EFI support DOES function.  This means
the Tianocore image for Raspberry PI 4B works better with the current
source.

I'm unsure whether badly breaking all Raspberry PI variants quite
justifies critical or grave (popular machine, but kernel issues by nature
cause 10x the damage so severities should be somewhat damped).

I certainly hope to see the 5.9 release since that has additional
high-value improvements...


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#939186: [Pkg-xen-devel] Bug#939186: HVM + Balloon crashes Xen hypervisor

2020-11-25 Thread Elliott Mitchell

On Wed, Nov 25, 2020 at 01:32:10PM +0100, Hans van Kranenburg wrote:
> Can you still reproduce this with Xen 4.11 or 4.14?
> If not, can you mail 939186-cl...@bugs.debian.org to close it?
> 
> I just tried a few things with maxmem and memory with a PVH guest on Xen
> 4.14, and it just seems to work like it should.

I /think/ I tried it with 4.11 and it continued to reproduce.  That,
though, was long enough ago that I need to recheck to be
certain.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#921547: u-boot: Please consider making u-boot* arch:all

2020-11-24 Thread Elliott Mitchell

My thinking mirrors one of Jonathan McDowell's:  One should be able to
build an installation image for $device/$architecture on
$random_device/$random_architecture.

This is very useful for exactly the same situations where using
`debootstrap --foreign` is.  Say if one has a desktop already running
proper Debian and a target device which needs to get U-Boot.

As such let me suggest this should also be considered for all of the
u-boot-* packages.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#975685: grub-install fails with U-Boot EFI

2020-11-24 Thread Elliott Mitchell

Package: grub2-common
Version: 2.04-10

`grub-install` fails to install properly when run on a system using
U-Boot's implementation of the EFI protocol (potentially also affects
package grub-efi-arm64, perhaps this should be against src:grub2).

Since a Tianocore-based implementation of the EFI protocol is also
available, I can provide more information.  A useful distinction is
U-Boot's EFI implementation does NOT implement EFI variables.  This seems
a plausible method to distinguish U-Boot's partial EFI implementation
from Tianocore's complete EFI implementation.

On the U-Boot implementation grubaa64.efi needs to be installed as
/boot/efi/EFI/BOOT/bootaa64.efi instead.  Roughly akin to
--bootloader-id=BOOT, plus an extra rename.  I suspect I may be filing
other bugs soon.

(the platform is a Raspberry Pi 4B, the Tianocore implementation is
quite workable except too many pieces of software assume device-tree
on ARM and won't work with ACPI)
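
The distinction described above could be sketched roughly as follows.
This is an illustration, not `grub-install`'s actual logic: it assumes a
populated /sys/firmware/efi/efivars directory is a usable sign of EFI
variable support, and the function names are hypothetical.  The commands
are only printed, never run.

```python
import os


def efi_variables_available(efivars_dir="/sys/firmware/efi/efivars"):
    """True when the directory exists and contains at least one entry."""
    try:
        with os.scandir(efivars_dir) as entries:
            return any(True for _ in entries)
    except OSError:
        return False


def install_plan(has_efi_variables, efi_dir="/boot/efi"):
    """Return the commands to run, as argument lists."""
    if has_efi_variables:
        # Full implementation: the default EFI/debian location works.
        return [["grub-install"]]
    # No variables (U-Boot): install into EFI/BOOT, then add the extra copy.
    boot_dir = os.path.join(efi_dir, "EFI", "BOOT")
    return [
        ["grub-install", "--bootloader-id=BOOT"],
        ["cp", os.path.join(boot_dir, "grubaa64.efi"),
         os.path.join(boot_dir, "bootaa64.efi")],
    ]


for command in install_plan(efi_variables_available()):
    print(" ".join(command))
```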


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#824954: flash-kernel: GRUB? via U-Boot?

2020-11-23 Thread Elliott Mitchell

There may be several distinct bugs involved with #824954.  For one, I
suspect `grub-install`'s behavior needs to change if EFI variables aren't
supported.  I use this as a flag which could distinguish installation on
top of a full EFI implementation (perhaps Tianocore-derived), versus
U-Boot's rather primitive EFI implementation.

Notably right now `grub-install` tries to install to
/boot/efi/EFI/debian by default.  This is appropriate for a full EFI
implementation where boot entries can be added by adding variables.  Yet
with U-Boot's limited implementation, the files must go in EFI/BOOT
(--bootloader-id=BOOT).

Right now I'm simply trying to figure out what others have done to reuse
it for my own purposes.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#948712: Returned mail: see transcript for details

2020-11-23 Thread Elliott Mitchell

reopen 948712
quit

There should be a rather obvious use case where absent /boot/firmware is
quite appropriate.  For someone needing a copy of the firmware, but using
other tools to build the boot area.

Notably one might use raspi-firmware to retrieve start*.elf/fixup*.dat.
Then add u-boot-rpi for second stage bootloader.  Next grub-efi-arm* for
third stage.  Lastly flash-kernel to glue all the pieces together.

Not one of these requires the existence of /boot/firmware.  In fact, not
one of these needs the installation of dosfstools.

Perhaps the raspi-firmware package should be split into pieces so as to
allow merely installing the actually required portions?

(raspi-firmware-bin which depends upon: raspi1-firmware-bin,
raspi2-firmware-bin, raspi3-firmware-bin, and raspi4-firmware-bin?)


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#563204: Recommends is really too strong for os-prober

2020-11-22 Thread Elliott Mitchell

Commenting since the report still exists in the bug DB...

I've found `os-prober` often produces many false positive OS installation
detections.  As such I really find Recommends too strong; simply
including it during installation and then merely using Suggests would be
better.

If someone removes it, likely that is due to not needing it and the
choice should be honored.

Worse, on VM systems searching for additional OS installations is a major
security risk due to potential for finding VMs instead of host.  If one
of those manages to boot on bare hardware, everything is compromised.

Producing messages during updates though is reasonable.  Just don't be
too pushy, even keeping recommends packages from being reinstalled
requires a careful eye.

Mostly documenting some people *really* don't want os-prober, even though
we are likely the minority.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#968965: xen: FTBFS woes in sid

2020-11-20 Thread Elliott Mitchell

On Fri, Nov 20, 2020 at 08:02:26PM +0100, Hans van Kranenburg wrote:
> So,
> 
> On 9/21/20 4:16 PM, Hans van Kranenburg wrote:
> > [...]
> > 
> > gcc-Wl,-z,relro -Wl,-z,now -pthread -Wl,-soname
> > -Wl,libxentoolcore.so.1 -shared -Wl,--version-script=libxentoolcore.map
> > -o libxentoolcore.so.1.0 handlereg.opic
> > /usr/bin/ld: i386:x86-64 architecture of input file `handlereg.opic' is
> > incompatible with i386 output
> > /usr/bin/ld: handlereg.opic: file class ELFCLASS64 incompatible with
> > ELFCLASS32
> > /usr/bin/ld: final link failed: file in wrong format
> > collect2: error: ld returned 1 exit status
> 
> This one is caused by "debian/rules: Combine shared Make args". I
> reverted that change for now.
> 
> When retrying the i386 build, I run into yet another failure, sigh:
> 
>  >8 
> 
> dh_install: warning: Cannot find (any matches for)
> "usr/lib/debug/usr/lib/xen-*/boot/*" (tried in ., debian/tmp)
> 
> dh_install: warning: xen-utils-4.14 missing files:
> usr/lib/debug/usr/lib/xen-*/boot/*
> dh_install: error: missing files, aborting
> 
>  >8 
> 
> I can only find CONFIG_PV_SHIM=n in the build log. What is going on
> here? Attached is the build log.
> 
> My WIP branch is here (including the make-patches commit, it's ready to
> build). I also forwarded the thing to latest stable-4.14.
> 
> https://salsa.debian.org/xen-team/debian-xen/-/commits/knorrie/4.14/

I was going to type, "That can't be true!  Both sections are identical,
so that commit *couldn't* have done it!"

Being the careful sort, look closer.  Look closer.  Then realize if one
reads fast they look identical, but they're getting *slightly* different
values for ${XEN_TARGET_ARCH}.  Mainly for $(make_args_xen),
${XEN_TARGET_ARCH} gets $(xen_arch_$(flavour)), but for
$(make_args_tools), ${XEN_TARGET_ARCH} gets $(xen_arch_$(DEB_HOST_ARCH)).

Three of us and we didn't spot that difference.  Should still combine
${XEN_COMPILE_ARCH} which remains identical for both values.
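
To spell out the difference, here is a sketch of the two lookups.  The
mapping values and the scenario (an i386 package build whose hypervisor
flavour is amd64) are assumptions chosen for illustration; only the two
variable names used for the lookup come from debian/rules as described
above.

```python
# Assumed stand-in for the xen_arch_* variables in debian/rules.
XEN_ARCH = {"amd64": "x86_64", "i386": "x86_32", "arm64": "arm64", "armhf": "arm32"}


def target_arch_for_xen(flavour):
    """XEN_TARGET_ARCH as make_args_xen derives it: from the flavour."""
    return XEN_ARCH[flavour]


def target_arch_for_tools(deb_host_arch):
    """XEN_TARGET_ARCH as make_args_tools derives it: from DEB_HOST_ARCH."""
    return XEN_ARCH[deb_host_arch]


# Hypothetical i386 build: the hypervisor flavour is amd64, the tools are i386.
hypervisor_arch = target_arch_for_xen("amd64")
tools_arch = target_arch_for_tools("i386")
print(hypervisor_arch, tools_arch)
```

Sharing one set of arguments would hand the hypervisor's value to the
tools build, which matches the ELFCLASS64 versus ELFCLASS32 link failure
quoted above.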


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#546392: Isn't bug #546392 complete?

2020-11-20 Thread Elliott Mitchell

I'm pretty sure bug #546392 was completed several /years/ ago, yet the
bug was never marked complete.  I don't recall when, but perhaps near
version 2.01 or earlier?


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#975062: Python 3 (pygrub) in 4.14 packages

2020-11-18 Thread Elliott Mitchell

On Wed, Nov 18, 2020 at 04:32:00PM +0100, Hans van Kranenburg wrote:
> I also have a little snippet from IRC, which is about this, where Ian
> reports that he's seen it working.
> 
> https://salsa.debian.org/xen-team/debian-xen/-/snippets/500
> 
> So, apparently there are cases in which pygrub 'works' and in which it
> does not, and apparently using pygrub with "amd64 kernel and Xen tools
> but i386 userland" is problematic, and I remember some remarks which I
> can't find back about that that use case was probably already broken
> always, in the past.
> 
> I wanted to find out about this and set up some test cases to reproduce
> things (I've never used pygrub yet), but that obviously did not happen
> yet. I have some stuff going on in my personal life that is taking up a
> lot of time currently. What is rather easy for *me* is to help
> organizing the work and managing todo lists etc, but not learning new
> stuff ATM.
> 
> So, my current questions are:
> 
> 1. Is pygrub a blocker for having Xen 4.14 in unstable? Because that
> should be our first team-goal now.
> 2. What exactly is going on, can we make a list/table/whatever about in
> which cases pygrub 'does not work' (in more detail, how does it fail).
> 3. pygrub keeps being the thing that always causes problems. What would
> be your (asking anyone who wants to think along) ideas about which
> well-defined situations/test-cases we should have to execute instead of
> having the users report problems after big package changes?
> 
> Hans
> 
> P.S. Next message after the commercials will be on #968965 which is the
> other biggest issue for Xen 4.14 in unstable now.

Due to working with Pry Mar, I can state the cross-compilation of the
Python shared objects may not be 100% functional yet.  Looks very much
like Python's "distutils" took 2 steps forward and then 2 steps backward
during the Python 2 -> Python 3 transition.

(Great!  Linking and compilation got separated.  Ewww!  CFLAGS gets
appended to LDFLAGS.  Great!  We'll add support for architecture-specific
compilation directories.  Ewww!  There isn't a good way to pass in the
architecture triplet.)

I've got an initial patch for working around an issue here, but the
quality doesn't look great to me.  Something along those lines should be
submitted to Xen, but I'm unsure of all the issues.
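
The naming problem can be sketched as follows.  Python 3 builds the
extension filename from the interpreter's own EXT_SUFFIX, which on Linux
typically embeds the architecture of the machine running the build.  The
helper name and the sample filename are hypothetical; the substitution
mirrors the idea of the patch below rather than reproducing it.

```python
import sysconfig


def retarget_ext_filename(filename, compile_arch, target_arch):
    """Swap the build machine's architecture tag for the target's."""
    return filename.replace(compile_arch, target_arch)


# Suffix the running interpreter would use for extension modules.
print(sysconfig.get_config_var("EXT_SUFFIX"))

# Hypothetical filename produced on an x86_64 build machine for an arm64 target.
native_name = "xc.cpython-39-x86_64-linux-gnu.so"
print(retarget_ext_filename(native_name, "x86_64", "aarch64"))
```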


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445


From 1bb407482fa82ad5034a4e4bdfa34dfa3a828f9a Mon Sep 17 00:00:00 2001
From: Elliott Mitchell 
Date: Thu, 1 Oct 2020 15:19:33 -0700
Subject: [PATCH] tools/python: Correct extension filenames for Python 3

Appears Python became *more* difficult to properly cross-compile between
Python 2 and Python 3.  This takes care of the naming of the xc.so/xs.so
extension shared objects for Python.

Signed-off-by: Elliott Mitchell 
---
 tools/pygrub/setup.py | 16 +++-
 tools/python/setup.py | 16 +++-
 2 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/tools/pygrub/setup.py b/tools/pygrub/setup.py
index 91019e97e7..dfe01e6220 100644
--- a/tools/pygrub/setup.py
+++ b/tools/pygrub/setup.py
@@ -11,6 +11,19 @@ except KeyError: pass
 
 XEN_ROOT = "../.."
 
+from distutils import command
+import distutils.command.build_ext
+class BuildExtArch(distutils.command.build_ext.build_ext):
+	arch_map = {
+		'x86_64':	'amd64',
+		'x86_32':	'i386',
+		'arm64':	'aarch64',
+		'arm32':	'armel',
+	}
+	def get_ext_filename(self, ext_name):
+		name = super().get_ext_filename(ext_name)
+		return name.replace(os.getenv("XEN_COMPILE_ARCH"), self.arch_map[os.getenv("XEN_TARGET_ARCH")])
+
 xenfsimage = Extension("xenfsimage",
 extra_compile_args = extra_compile_args,
 extra_link_args = extra_link_args,
@@ -30,5 +43,6 @@ setup(name='pygrub',
   package_dir={'grub': 'src', 'fsimage': 'src'},
   scripts = ["src/pygrub"],
   packages=pkgs,
-  ext_modules = [ xenfsimage ]
+  ext_modules = [ xenfsimage ],
+  cmdclass = {'build_ext': BuildExtArch},
   )
diff --git a/tools/python/setup.py b/tools/python/setup.py
index 8faf1c0ddc..6b95d30b89 100644
--- a/tools/python/setup.py
+++ b/tools/python/setup.py
@@ -13,6 +13,19 @@ PATH_LIBXC= XEN_ROOT + "/tools/libxc"
 PATH_LIBXL= XEN_ROOT + "/tools/libxl"
 PATH_XENSTORE = XEN_ROOT + "/tools/xenstore"
 
+from distutils import command
+import distutils.command.build_ext
+class BuildExtArch(distutils.command.build_ext.build_ext):
+	arch_map = {
+		'x86_64':	'amd64',
+		'x86_32':	'i386',
+		'arm64':	'aarch64',
+		'arm32':	'

Bug#974756: idle3-tools: Needs support for drives behind controllers

2020-11-14 Thread Elliott Mitchell

Package: idle3-tools
Version: 0.9.1-2

`idle3ctl` needs an implementation of `smartctl`'s -d option in order to
talk to disks behind hardware RAID controllers.

This is arguably a bug in smartmontools: the code for the -d option
needs to be turned into a library so other low-level tools can utilize it.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#974755: smartd: Problematic memory activity (triggers oom-killer)

2020-11-14 Thread Elliott Mitchell

Package: smartmontools
Version: 6.6-1

`smartd` is doing some sort of activity which tends to trigger the kernel
oom-killer.  I suspect this may relate to triggering self-tests.

System in question has plenty of swap available, and presently reports
more than 50MB of available memory.

Presently, adding "choom -p $$ -n +500 >/dev/null" to
/etc/default/smartmontools works around the worst of the damage, as this
makes the oom-killer tend to kill smartd instead of rather more critical
daemons.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#774129: dpkg-buildpackage: Should set the cross build profile automatically

2020-10-25 Thread Elliott Mitchell

(sending a second copy to the body of the message since
<774...@bugs.debian.orgg> didn't quite work)
retitle 774129 dpkg-buildpackage: Should set the cross build profile automatically
severity 774129 normal
quit

Setting the "cross" build profile could be the difference between a
successful cross package build and a build failure.  As such I believe
this rates a normal bug as the -a/-t options are effectively broken right
now.

Stating in the man page that "-Pcross" must be used whenever -a or -t is
used might reduce this to minor, though I really think the full solution
should be the goal.



As stated in my last message, I also think setting the cross profile
should cause "noocaml" and potentially a few other "no" profiles to be set.
This would simplify supporting cross-building, since packages wouldn't
need to detect the cross profile at every point the noocaml profile is
detected.  It would also simplify a future transition once OCAML becomes
cross build-friendly: some packages wouldn't need further adjustment to
work, and those which did need adjustment would cause bug reports
mentioning the situation.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#971397: dpkg-dev: dpkg-buildpackage -P option behavior change in update

2020-09-29 Thread Elliott Mitchell

Package: dpkg-dev
Version: 1.19.7
Severity: important

Between versions 1.19.6 and 1.19.7 the behavior of the -P option for
dpkg-buildpackage changed.  At 1.19.6 if there was no string directly on
the -P option, the following argument would be interpreted as the
profiles to set.  At 1.19.7 the string MUST be part of the same argument.

i.e. at 1.19.6, `dpkg-buildpackage -a arm64 -P cross` worked, while
1.19.7 *requires* `dpkg-buildpackage -a arm64 -Pcross` (the latter may
also have worked with 1.19.6, but the former certainly did).

I see good arguments both for and against allowing the profile list to
be a separate argument.  Overtly user-visible behavior though should NOT
change with patch-level changes (should be minor-version).
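
To make the difference concrete, here is a toy model of the two parsing styles.  This is a sketch under assumptions, not dpkg's actual option handling: the function `parse_profiles` and its `allow_separate` flag are hypothetical, and what 1.19.7 really does with the stray "cross" argument is not shown here.

```python
def parse_profiles(argv, allow_separate):
    """Toy parser for a -P option (hypothetical, not dpkg's code).

    allow_separate=True models the 1.19.6 behavior described above, where
    a bare "-P" takes the following argument as its profile list.
    allow_separate=False models a parser that only accepts "-Pvalue".
    Returns (profiles, remaining_arguments).
    """
    profiles = []
    rest = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg.startswith("-P"):
            value = arg[2:]
            if not value and allow_separate and i + 1 < len(argv):
                # Bare "-P": consume the next argument as the profile list.
                i += 1
                value = argv[i]
            profiles.extend(p for p in value.split(",") if p)
        else:
            rest.append(arg)
        i += 1
    return profiles, rest


print(parse_profiles(["-a", "arm64", "-P", "cross"], allow_separate=True))
print(parse_profiles(["-a", "arm64", "-P", "cross"], allow_separate=False))
print(parse_profiles(["-a", "arm64", "-Pcross"], allow_separate=False))
```

In the second call the profile list comes back empty and "cross" is left over as an ordinary argument, which is the kind of silent change a script written against the old behavior would trip over.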


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#961511: [Pkg-xen-devel] Bug#961511: [PATCH] d/xen-utils-common.xen.init: disable oom killer for xenstored

2020-09-22 Thread Elliott Mitchell

On Tue, Sep 22, 2020 at 02:39:09PM +0200, Hans van Kranenburg wrote:
> How did you test it and how did you get a working process without the --?

By reading the man page, noticing there was no mention of "--", then
trying `choom -n +5 sleep 5` and finding that worked.  When you sent this
message I checked and GNU `sleep` does have "--version", thus I tried
`choom -n +5 sleep 5 --version` and found *that* failed.

A "--" seemed natural, but documentation omitting crucial details is a
problem.  Never mind.
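
The failure is what one would expect from a GNU-style parser which permutes arguments.  The sketch below uses Python's `getopt.gnu_getopt` as a stand-in; that `choom` parses its command line this way is an assumption consistent with what I observed, and the option strings here are a guessed subset rather than its real table.

```python
import getopt
import os

# gnu_getopt stops permuting if POSIXLY_CORRECT is set; clear it so the
# demonstration behaves the same everywhere.
os.environ.pop("POSIXLY_CORRECT", None)

# Assumed subset of options, for illustration only.
SHORT_OPTS = "n:p:"
LONG_OPTS = ["version"]


def parse(argv):
    """Return (options claimed by the wrapper, arguments left for the command)."""
    return getopt.gnu_getopt(argv, SHORT_OPTS, LONG_OPTS)


# Without "--" the wrapper claims --version for itself.
without_sep = parse(["-n", "+5", "sleep", "5", "--version"])
# With "--" everything after it is left for the wrapped command.
with_sep = parse(["-n", "+5", "--", "sleep", "5", "--version"])

print(without_sep)
print(with_sep)
```

With the separator, "--version" stays in the list handed to the wrapped command; without it, the wrapper swallows the option.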

Nice find, I did at one point have the oom-killer get the wrong process
and saw *problems*.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#961511: [PATCH] d/xen-utils-common.xen.init: disable oom killer for xenstored

2020-09-20 Thread Elliott Mitchell

This is fun.  It actually isn't too difficult to trigger: simply slowly
reduce the memory Xen allocates to Dom0 and eventually the oom-killer is
likely to trigger (having tried to shrink Dom0 as far as possible,
believe me, I know).  I had been wondering which of the Xen daemons could
be safely restarted since it is handy to restart daemons instead of the
whole machine for security updates...

Interestingly running `xenstored --help` mentions:
  -I, --internal-db   store database in memory, not on disk

There is a run/xenstored/tdb file so I end up wondering if newer versions
are in fact storing everything in a file and restarting isn't so bad.

The patch switches the arguments from:
--exec "$try_xenstored" -- ...
to:
--exec /usr/bin/choom -- -n -1000 "$try_xenstored" -- ...

I'm pretty sure start-stop-daemon is consuming the "--" and the second
"--" shouldn't be there.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#774129: dpkg-*: Doesn't set cross-build profile with -a or -t

2020-09-13 Thread Elliott Mitchell

found 774129 1.19.7
quit

You might consider -a/--target-arch or -t/--target-type to merely be
conveniences, but /not/ enabling the cross profile when the build arch
differs from the host arch is stopping a decimeter shy of the goal line.

Is it even possible someone /wouldn't/ want the cross profile if
cross-building?  (a --no-default-profiles option could be worthwhile)

While the original bug report is about `dpkg-buildpackage`, this also
applies to `dpkg-checkbuilddeps`.

I'm inclined to rate this as more than "wishlist".  At a minimum, the
man page for `dpkg-buildpackage` would need to suggest using "-P cross"
when using -a or -t.


What is worthy of consideration is whether enabling the cross profile
should cause the noocaml profile (possibly others) to be enabled.

Cross-compiling Python bindings is pretty simple: depend on libpython-dev
and python-dev:any.  OCAML is presently very broken for cross-compilation
(pretty well impossible).  I'm unsure of the other profiles.

My concern is that if enabling cross doesn't enable noocaml, any package
set up for cross-compilation which includes some OCAML will have to
disable OCAML if either profile is enabled.  In the future, once OCAML
becomes cross-friendly, every single such package will then need to
reenable OCAML.  Whereas if dpkg-buildpackage is handling the situation,
merely removing cross => noocaml could take care of those which don't
need adjustment to OCAML invocation (and would then generate new reports
for packages which do need adjustment).
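
A rough sketch of what I mean by cross => noocaml, to show how small the central change could be.  Nothing here exists in dpkg: `IMPLIED_PROFILES` and `expand_profiles` are hypothetical names, and the table contains only the one implication proposed above.

```python
# Hypothetical table of implied build profiles (the proposal, not dpkg).
IMPLIED_PROFILES = {
    "cross": ["noocaml"],
}


def expand_profiles(profiles, implied=IMPLIED_PROFILES):
    """Return the requested profiles plus anything they imply, in order."""
    result = []
    pending = list(profiles)
    while pending:
        profile = pending.pop(0)
        if profile in result:
            continue
        result.append(profile)
        # Queue whatever this profile implies; duplicates are skipped above.
        pending.extend(implied.get(profile, []))
    return result


print(expand_profiles(["cross"]))
# Once OCAML cross-builds properly, dropping the table entry is the whole fix:
print(expand_profiles(["cross"], implied={}))
```

Packages would keep testing only for noocaml, and the future transition becomes a one-line removal in one place.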


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445

Bug#965245: [Pkg-xen-devel] Bug#965245: Cross-build issues

2020-07-18 Thread Elliott Mitchell

On Sat, Jul 18, 2020 at 04:08:50PM +0200, Hans van Kranenburg wrote:
> On 7/18/20 5:53 AM, Elliott Mitchell wrote:
> > Package: src:xen
> > Version: 4.13
> > Tags: patch
> > 
> > I've been playing try to get Xen 4.13 to cross-build for ARM.  In the
> > process I've been running into bunches of problems, so here are fixes.
> 
> Can you:
> * add a 'why' line to the commit message of the first patch
> * add Signed-off-by lines
> * and then mailbomb (git send-email) it to
> pkg-xen-de...@lists.alioth.debian.org with Cc to Ian Jackson
> ? Just all of it in 1 mail thread? (So,
> with 0/10 cover letter which does not have to contain anything else than
> something like 'Hi! See #965245, kthxbye'.)
> 
> Then we can collect some Reviewed-by etc.

Will do, may end up collecting an extra patch or two in the process (one
of these has been sent upstream, Debian builds are unfinished for me).


> > OCAML/xenstored is being problematic, that looks like outright bugs on
> > ocaml-nox making it unusable for cross-building.
> 
> The cxenstored is also still there. The init scripts look if oxenstored
> is installed, and if not, it falls back to using normal xenstored. So, I
> suspect if you patch it out of the build for this arch, then no other
> changes are necessary. (Normally both are built now, so that if a user
> wants, in case of problems or whatever, they can switch back).

The problem is OCAML is basically utterly broken for cross-building.
There is the "-cc" argument for `ocamlc` which looks like someone started
work on making it work cross-architecture, but never finished.

In light of this, that is pretty much what I've done.  In order to get
dh_install to cooperate and ensure xen-utils-wrapper functions with
distinct builds, I need substitutes for oxenstored.conf and oxenstored.


> > I'm including copies of 3 patches from Julien Grall.  Upstream source for
> > this is: git://xenbits.xen.org/people/julieng/xen-unstable.git  The
> > branch "arm-dma/v2".
> 
> Ok, these patches are in Xen 4.14 I see. First thing I want to do going
> forward  is forwarding the packaging to that. I hope this will also only
> make your life easier.

Hmm, thought they were against 4.13.  Might be these revised ones are
targeting 4.14, but the code is the same on 4.13.


> But, keep the 3 upstream patches in the set for now, so that it's
> explicit that you need them for this.
> 
> > Why yes, I am trying to get Xen operational on a Raspberry PI.  Why do
> > you ask?  :-)
> 
> Haha. Exciting. I like it. Looking forward to see it working and help
> testing it here. I didn't do cross-building yet, so time to learn
> something new.

There appear to be a *bunch* of people trying to get Xen operational on
Raspberry PI 4b devices.  I'm aiming for what I consider to be a
straightforward approach, which is to use existing packaging tools.


-- 
(\___(\___(\__  --=> 8-) EHM <=--  __/)___/)___/)
 \BS (| ehem+sig...@m5p.com  PGP 87145445 |)   /
  \_CS\   |  _  -O #include  O-   _  |   /  _/
8A19\___\_|_/58D2 7E3D DDF4 7BA6 <-PGP-> 41D1 B375 37D0 8714\_|_/___/5445
