Bug#1067151: xen-utils-common: vif-openvswitch ignores MTU

2024-03-19 Thread Hans van Kranenburg

Hi Aleksi,

Thanks for the report. I actually ran into the same situation recently,
wanting to set up a PPPoE connection from within a Xen domU, also using
openvswitch as bridge.

On 19/03/2024 12:21, Aleksi Suhonen wrote:
> Package: src:xen
> Version: 4.17.3+10-g091466ba55-1~deb12u1
> Severity: wishlist
> 
> I wasn't sure if this script comes from Debian or Xen or somewhere else, 
> so I thought it safest to report it here.

These scripts/vif-* files are located in tools/hotplug/Linux in the Xen
source tree, and we ship them as such in the Debian package. So, yes,
changes to them should first go upstream. However, it's perfectly fine
to have a discussion here, so we can figure out what the right changes
should be.

> /etc/xen/scripts/vif-bridge handles MTU settings in the vif, but the 
> otherwise similar /etc/xen/scripts/vif-openvswitch does not. I added it 
> in, here's the diff-c and the full fixed file is also attached.
> 
> *** vif-openvswitch.orig	2024-03-19 11:53:13.0 +0200
> --- vif-openvswitch	2024-03-19 11:56:17.0 +0200
> ***************
> *** 89,94 ****
> --- 89,95 ----
>       add|online)
>           check_tools
>           setup_virtual_bridge_port $dev
> +         set_mtu "$bridge" "$dev" "$type_if"
>           add_to_openvswitch $dev
>           ;;

Ah, interesting. I had some difficulties getting it to work back then.
But, when putting the set_mtu line back like this, it also gives me the
desired outcome now!

My use case is about setting up a PPPoE connection from a Xen domU over
vlan 6. I want an mtu of 1500 for the traffic inside the PPPoE
connection, so I need mtu 1508 for the connection between the PPPoE
client in the domU -> openvswitch in the dom0 -> physical interface ->
switchports -> ISP NTU device.
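
As a rough illustration of where the 1508 comes from: a PPPoE session
adds 8 bytes in front of each payload (a 6 byte PPPoE header plus a 2
byte PPP protocol field). The sketch below only does that arithmetic;
the constant and function names are made up for this example.

```python
# Illustrative arithmetic only; the names below are made up for this sketch.
PPPOE_HEADER_BYTES = 6     # PPPoE session header
PPP_PROTOCOL_ID_BYTES = 2  # PPP protocol field in front of the payload
PPPOE_OVERHEAD = PPPOE_HEADER_BYTES + PPP_PROTOCOL_ID_BYTES


def link_mtu_for(inner_mtu):
    """Ethernet MTU needed on every hop that carries the PPPoE frames."""
    return inner_mtu + PPPOE_OVERHEAD


def inner_mtu_for(link_mtu):
    """Largest PPP payload that fits when the Ethernet MTU is link_mtu."""
    return link_mtu - PPPOE_OVERHEAD


print(link_mtu_for(1500))   # 1508
print(inner_mtu_for(1500))  # 1492
```

The second value is why a PPPoE connection over a plain 1500 byte
Ethernet path ends up with an inner MTU of 1492.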

For some reason I had trouble getting the vifX.Y interface, as seen
inside dom0, set to mtu 1508. It seemed not to have any effect (using ip
link set mtu MTU dev DEV), or, openvswitch kept resetting it back to
1500 all the time. When I would use ovs-vsctl set interface DEV
mtu_request=MTU instead, it actually stuck. That's what I remember.

I just did some more testing, and I cannot really reproduce that
situation... :| I can also just use ip link in the dom0 now.
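
For comparison, the two approaches mentioned above could be sketched as
follows. This is only a sketch, not what the hotplug scripts do: the
helper names and the device name "vif1.0" are made up, and by default
the commands are only printed, not executed.

```python
import subprocess


def ip_link_cmd(dev, mtu):
    """argv for: ip link set dev DEV mtu MTU"""
    return ["ip", "link", "set", "dev", dev, "mtu", str(mtu)]


def ovs_mtu_request_cmd(dev, mtu):
    """argv for: ovs-vsctl set Interface DEV mtu_request=MTU"""
    return ["ovs-vsctl", "set", "Interface", dev, f"mtu_request={mtu}"]


def show_or_run(cmd, dry_run=True):
    """Print the command; only execute it when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


# "vif1.0" is a made-up example name for a vifX.Y device in the dom0.
show_or_run(ip_link_cmd("vif1.0", 1508))
show_or_run(ovs_mtu_request_cmd("vif1.0", 1508))
```

Running the real commands requires root in the dom0 and an existing
interface, so the dry run default is deliberate.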

Interesting, but good, since it would mean that we can indeed just
(re)use that set_mtu function! :) I'm still curious what the problem was
when I tried earlier... Maybe anyone else reading this knows more?

Are you familiar with the process of sending patches upstream? Otherwise
we (Debian Xen team) can assist with that.

Regards,
Hans

Bug#1063270: The "64bits time_t transition" in Debian/Xen

2024-02-12 Thread Hans van Kranenburg

Hi,

On 2/12/24 18:43, Andrew Cooper wrote:
> On 12/02/2024 5:27 pm, zithro wrote:
>> Hey all,
>>
>> the Debian project is focused on the "2038 time_t" switch.
>> So the maintainers of the Debian Xen package must ensure that all
>> imported Xen code conforms to the new Debian standards.
>>
>> I was asked by Andrew Cooper to post here about this, I'll quote him :
>> "So I had been idly wondering whether Xen would match up to Debian's new
>> policy, and it appears not
>> this topic really needs to be brought up on the xen-devel mailing list
>> do you have any more details as to what has gone wrong?
>> this is something we ought to arrange to happen in CI by default
>> but it sounds like there's some work needed first"
>>
>> (Not answering the question because I'm just a messenger).
> 
> xen.git/xen$ git grep -w time_t -- :/
> ../tools/console/client/main.c:106: time_t start, now;
> ../tools/console/daemon/io.c:272:   time_t now = time(NULL);
> ../tools/libs/light/libxl_qmp.c:116:    time_t timeout;
> ../tools/libs/light/libxl_qmp.c:585:   
> time_t ask_timeout)
> ../tools/libs/light/libxl_x86.c:516:    time_t t;
> ../tools/libs/toollog/xtl_logger_stdio.c:61:    time_t now = time(0);
> ../tools/tests/xenstore/test-xenstore.c:453:    time_t stop;
> ../tools/xenmon/xenbaked.c:98:time_t start_time;
> ../tools/xenstored/core.c:109:  time_t now;
> ../tools/xenstored/core.h:150:  time_t ta_start_time;
> ../tools/xenstored/domain.c:143:    time_t mem_last_msg;
> ../tools/xenstored/domain.c:188:static time_t wrl_log_last_warning; /*
> 0: no previous warning */
> ../tools/xenstored/domain.c:1584:   time_t now;
> ../tools/xenstored/lu.c:160:    time_t now = time(NULL);
> ../tools/xenstored/lu.c:185:    time_t now = time(NULL);
> ../tools/xenstored/lu.c:292:    time_t now = time(NULL);
> ../tools/xenstored/lu.h:32: time_t started_at;
> ../tools/xentop/xentop.c:947:   time_t curt;
> ../tools/xl/xl_info.c:742:static char *current_time_to_string(time_t now)
> ../tools/xl/xl_info.c:759:static void print_dom0_uptime(int short_mode,
> time_t now)
> ../tools/xl/xl_info.c:810:static void print_domU_uptime(uint32_t domuid,
> int short_mode, time_t now)
> ../tools/xl/xl_info.c:847:    time_t now;
> ../tools/xl/xl_vmcontrol.c:336:    time_t start;
> ../tools/xl/xl_vmcontrol.c:495:    time_t now;
> ../tools/xl/xl_vmcontrol.c:504:    if (now == ((time_t) -1)) {
> ../tools/xs-clients/xenstore_control.c:33:    time_t time_start;
> arch/x86/cpu/mcheck/mce.h:224:    uint64_t time; /* wall time_t when
> error was detected */
> arch/x86/time.c:1129: * machines were long is 32-bit! (However, as
> time_t is signed, we
> 
> 
> I don't see any ABI problems from using a 64bit time_t.  The only header
> file with a time_t is xenstored/lu.h which is a private header and not a
> public ABI.
> 
> I guess we fell into the "could not be analysed via
> abi-compliance-checker" case?

Thanks for also looking into this!

Maximilian mentioned in #debian-xen that doing a Debian package build
with DEB_BUILD_OPTIONS=abi=+lfs and _FILE_OFFSET_BITS=64 and
_TIME_BITS=64 resulted in the exact same binaries for shared libs.
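
As background on what _TIME_BITS=64 is about, here is a small sketch of
the limit of a signed 32-bit time_t. It is plain arithmetic, not
anything taken from the Xen or Debian tooling, and the function name is
made up.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def last_representable(bits):
    """Latest UTC time a signed time_t of the given width can hold.

    Only meant for bits=32 here; the 64-bit limit is far outside the
    range that datetime can represent.
    """
    return EPOCH + timedelta(seconds=2 ** (bits - 1) - 1)


print(last_representable(32).isoformat())  # 2038-01-19T03:14:07+00:00
```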

What we also found is these reports:

1. Enabling lfs, which has no effect:
https://adrien.dcln.fr/misc/armhf-time_t/2024-02-06T16%3A48%3A00/compat_reports/libxen-dev/base_to_lfs/compat_report.html

2. Enabling the 64-bit time_t as well:
https://adrien.dcln.fr/misc/armhf-time_t/2024-02-06T16%3A48%3A00/compat_reports/libxen-dev/lfs_to_time_t/compat_report.html
In there, see "Problems with Data Types, Low Severity  2 " about
struct_timeval:

 >8 

  [+] struct timeval
Change -> Effect
1 Type of field tv_sec has been changed from __time_t to __time64_t.
-> Recompilation of a client program may be broken.
2 Type of field tv_usec has been changed from __suseconds_t to
__suseconds64_t. -> Recompilation of a client program may be broken.

  [+] affected symbols: 3 (0.2%)
* libxl_osevent_afterpoll ( libxl_ctx* ctx, int nfds, struct pollfd
const* fds, struct timeval now ) -> 4th parameter 'now' is of type
'struct timeval'.
* libxl_osevent_beforepoll ( libxl_ctx* ctx, int* nfds_io, struct
pollfd* fds, int* timeout_upd, struct timeval now ) -> 5th parameter
'now' is of type 'struct timeval'.
* libxl_osevent_register_hooks ( libxl_ctx* ctx, libxl_osevent_hooks
const* hooks, void* user ) -> Field 'hooks.timeout_modify.p2' in 2nd
parameter 'hooks' (pointer) has base type 'struct timeval'.

 >8 

So, the question is: is this correct, and would it cause a problem?

If so, it also means that those functions are in a versioned lib,
libxenlight.so.4.17.0 (in binary package libxenmisc4.17).
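
To picture why a changed struct timeval could matter for those three
functions, here is a simplified sketch. It assumes a 32-bit
little-endian layout (the report is for armhf) with two 32-bit fields
before the change and two 64-bit fields after it, and it models the
struct as a plain byte buffer, which glosses over how arguments are
really passed by value.

```python
import struct

# Assumed 32-bit little-endian layouts of struct timeval, for illustration.
OLD_TIMEVAL = struct.Struct("<ii")  # tv_sec, tv_usec as 32-bit fields
NEW_TIMEVAL = struct.Struct("<qq")  # tv_sec, tv_usec as 64-bit fields

print(OLD_TIMEVAL.size, NEW_TIMEVAL.size)  # 8 16

# A caller built the new way hands over 16 bytes ...
new_bytes = NEW_TIMEVAL.pack(1707756180, 250000)

# ... and a callee built the old way only interprets the first 8 of them.
tv_sec, tv_usec = OLD_TIMEVAL.unpack(new_bytes[:OLD_TIMEVAL.size])
print(tv_sec, tv_usec)  # 1707756180 0
```

In this model the seconds happen to survive, but the microseconds are
read from the upper half of the 64-bit tv_sec and come out as 0.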

Coincidentally, we are currently preparing the upload to switch from Xen
4.17 to Xen 4.18 in Debian unstable. So, if we just go ahead with doing
that, and make sure it's built in the new way already...

then...

tada.wav!

We just immediately have the correct

Bug#1053246: Security support ended for Xen 4.14 in Bullseye

2023-09-29 Thread Hans van Kranenburg

Package: debian-security-support
Version: 1:11+2023.05.04
Severity: normal

Hi,

Upstream security support for Xen 4.14 has ended recently. This also
means that security support for Xen in Debian Bullseye has ended.

The complexity of the software involved does not really allow for anyone
other than the upstream developers, with a deep understanding of the
inner workings of the hypervisor code, to apply/backport new patches.

For security-support-ended.deb11, this could be a line like:

xen 4.14.6-1 2023-09-21
https://xenbits.xen.org/docs/4.14-testing/SUPPORT.html#release-support

Note: This 4.14.6-1 package version is not visible for bullseye yet,
right now, in the archive. It was submitted for the bullseye point
release, and has just been accepted into it:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1053177

Thanks,
Hans

Bug#1053177: bullseye-pu: package xen/4.14.6-1

2023-09-28 Thread Hans van Kranenburg

Hi Adam,

On 9/28/23 19:09, Adam D. Barratt wrote:
> On Thu, 2023-09-28 at 18:27 +0200, Hans van Kranenburg wrote:
>> Xen 4.14 support (and security support) has ended upstream. The
>> upstream
>> stable branch for version 4.14 is frozen now, and a final maintenance
>> release version 4.14.6 has been released. We'd like to put this final
>> update into Bullseye, to properly finish the Xen work for Bullseye.
>> Also, a few security fixes (regarding CVE-2023-20593 CVE-2023-20569
>> CVE-2022-40982) are included.
>>
>> https://xenbits.xen.org/docs/4.14-testing/SUPPORT.html#release-support
>>
> 
> --- xen-4.14.5+94-ge49571868d/automation/scripts/qemu-smoke-x86-64.sh	2023-03-21 13:07:44.0 +0100
> +++ xen-4.14.6/automation/scripts/qemu-smoke-x86-64.sh	2023-08-07 14:11:14.0 +0200
> @@ -5,11 +5,6 @@
>  # variant should be either pv or pvh
>  variant=$1
>  
> -# Install QEMU
> -export DEBIAN_FRONTENT=noninteractive
> -apt-get -qy update
> -apt-get -qy install qemu-system-x86
> 
> I realise this is an upstream change, but is it really intended to stop
> installing QEMU in a QEMU smoke test?

This particular change can be seen as the contents of the following
commit, in this case for 4.14:

 8< 

commit 98ec8ad2eeb96eb9d4b7f9bfd1ef3a994c63af17
Refs: RELEASE-4.14.5-103-g98ec8ad2eeb9
Author: Michal Orzel 
AuthorDate: Wed Apr 26 09:29:45 2023 +0200
Commit: Jan Beulich 
CommitDate: Wed Apr 26 09:29:45 2023 +0200

automation: Remove installation of packages from test scripts

Now, when these packages are already installed in the respective
containers, we can remove them from the test scripts.

Signed-off-by: Michal Orzel 
Reviewed-by: Stefano Stabellini 
master commit: 72cfe1c3ad1fae95f4f0ac51dbdd6838264fdd7f
master date: 2022-12-09 14:55:33 -0800

 >8 

This is part of a change to the upstream test machinery. The commit that
it's picked from (the 72cfe1c3ad1 thing) lockstep follows a previous
change to the development / master branch:

 8< 

commit 1ed7da301020ee1e16177cb3d9caa817f195a59a
Author: Michal Orzel 
Date:   Thu Nov 17 17:16:42 2022 +0100

automation: Install packages required by tests in containers

Installation of additional packages from the test scripts when running
the tests has some drawbacks. It is slower than cloning containers and
can fail due to some network issues (apparently it often happens on x86
rackspace). This patch is adding the packages required by the tests to
be installed when building the containers.

From qemu-alpine-x86_64.sh into debian:stretch:
 - cpio,
 - busybox-static.

From qemu-smoke-*-{arm,arm64}.sh into debian:unstable-arm64v8:
 - u-boot-qemu,
 - u-boot-tools,
 - device-tree-compiler,
 - curl,
 - cpio,
 - busybox-static.

The follow-up patch will remove installation of these packages from the
test scripts. This is done in order not to break the CI in-between.

Signed-off-by: Michal Orzel 
Reviewed-by: Stefano Stabellini 

 >8 

The Xen Project OSSTest machinery is used to run testing for the current
development version of Xen, as well as for the stable branch lines that
are still under active support.

After building/compiling the source code, all kinds of test scenarios
are executed, comprising tests for different virtualization modes, or
different kinds of functionality, but also different kinds of actual
hardware. AIUI, wanting to be able to do all of this quickly boils down
to a 'feet in the mud' situation, which involves automating interaction
with PDUs to be able to physically cut off power from a misbehaving
piece of server hardware, or, capturing actual serial console cable
output. I can understand that, at least for practical reasons, there is
no desire to duplicate/replicate all of this for each supported Xen version.

AIUI, the Xen source tree contains code/scripts to help set up the
test cases, as well as to be able to run them. For the first part, the
current development code is used (the master branch), and for the second
part, well, whatever is in a branch line needs to be able to behave
correctly in that environment.

This is the reason why we can find the change with title "automation:
Install packages required by tests in containers" only once, committed
to the master branch at the time the change took place, and why similar
but possibly different variations on "automation: Remove installation of
packages from test scripts" do exist in various other branches, such as
stable-4.17 and stable-4.14 etc.

Also, note that for Debian, we don't do anything with this part of the
upstream source tree, or, at least, I mean, changes in there must not
cause changes in the actual debs that we ship.

Thanks for the question, it was a fun small exe

Bug#1051862: (Debian) Bug#1051862: server flooded with xen_mc_flush warnings with xen 4.17 + linux 6.1

2023-09-13 Thread Hans van Kranenburg

Hi Radoslav,

Thanks for your report...

Hi Juergen, Boris and xen-devel,

At Debian, we got the report below. (Also at
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1051862)

This hardware, with only Xen and Dom0 running is hitting the failed
multicall warning and logging in arch/x86/xen/multicalls.c. Can you help
advise what we can do to further debug this issue?

Since this looks like pretty low level Xen/hardware stuff, I'd rather
ask upstream for directions first. If needed the Debian Xen Team can
assist the end user with the debugging process.

Thanks,

More reply inline...

On 9/13/23 20:12, Radoslav Bodó wrote:
> Package: xen-system-amd64
> Version: 4.17.1+2-gb773c48e36-1
> Severity: important
> 
> Hello,
> 
> after upgrade from Bullseye to Bookworm one of our dom0's
> became unusable due to logs/system being continuously flooded
> with warnings from arch/x86/xen/multicalls.c:102 xen_mc_flush.
>
> The issue starts at some point where system services start to come up,
> but nothing very special is on that box (dom0, nftables, fail2ban,
> prometheus-node-exporter, 3x domU). We have tried to disable all domU's
> and fail2ban as the name of the process would suggest, but the issue is
> still present. We have also tried some other things, but none of
> them have helped so far:
>
> * the issue arises when xen 4.17 + linux >= 6.1 is booted
> * xen + bookworm-backports linux-image-6.4.0-0.deb12.2-amd64 has the same issue
> * without xen hypervisor, linux 6.1 runs just fine
> * systemrescue cd boot and xfs_repair rootfs did not help
> * memtest seems to be fine running for hours

Thanks for already trying out all these combinations.

> As a workaround we have booted xen 4.17 + linux 5.10.0-25 (5.10.191-1)
> and the system is running fine, as it has for the last few months.
> 
> Hardware:
> * Dell PowerEdge R750xs
> * 2x Intel Xeon Silver 4310 2.1G
> * 256GB RAM
> * PERC H755 Adapter, 12x 18TB HDDs

I have a few quick additional questions already:

1. For clarification.. From your text, I understand that only this one
single server is showing the problem after the Debian version upgrade.
Does this mean that this is the only server you have running with
exactly this combination of hardware (and BIOS version, CPU microcode
etc etc)? Or, is there another one with same hardware which does not
show the problem?

2. Can you reply with the output of 'xl dmesg' when the problem happens?
Or, if the system gets unusable too quickly, do you have a serial console
connection to capture the output?

3. To confirm... I understand that there are many of these messages.
Since you pasted only one, does that mean that all of them look exactly
the same, with "1 of 1 multicall(s) failed: cpu 10" "call  1: op=1
arg=[a1a9eb10] result=-22"? Or are there variations? If so, can
you reply with a few different ones?
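
To check for variations, the failed call lines could be grouped with
something like the sketch below. It assumes the line format quoted
above; the regular expression and the helper name are made up for this
example, and mapping the negated result to an errno name assumes the
usual Linux numbering.

```python
import errno
import re
from collections import Counter

# Matches e.g. "call  1: op=1 arg=[a1a9eb10] result=-22"
CALL_RE = re.compile(
    r"call\s+(\d+): op=(\d+) arg=\[([0-9a-f]+)\] result=(-?\d+)")


def summarize(lines):
    """Count distinct (op, result) pairs found in the given log lines."""
    counts = Counter()
    for line in lines:
        m = CALL_RE.search(line)
        if m:
            counts[(int(m.group(2)), int(m.group(4)))] += 1
    return counts


sample = [
    "1 of 1 multicall(s) failed: cpu 10",
    "call  1: op=1 arg=[a1a9eb10] result=-22",
]
for (op, result), n in summarize(sample).items():
    name = errno.errorcode.get(-result, "?")
    print(f"op={op} result={result} ({name}): {n}x")
```

Feeding it the full kernel log instead of the two sample lines would
show whether all failures share the same op and result.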

Since this very much looks like an issue in Xen related code where the
Xen hypervisor, dom0 kernel and hardware have to work together correctly
(and not a Debian packaging problem), I'm already asking upstream for
advice about what we should/could do next, instead of trying to make a
guess myself.

Thanks,
Hans

> Any help, advice or bug confirmation would be appreciated
> 
> Best regards
> bodik
> 
> 
> (log also in attachment)
> 
> ```
> kernel: [   99.762402] WARNING: CPU: 10 PID: 1301 at 
> arch/x86/xen/multicalls.c:102 xen_mc_flush+0x196/0x220
> kernel: [   99.762598] Modules linked in: nvme_fabrics nvme_core bridge 
> xen_acpi_processor xen_gntdev stp llc xen_evtchn xenfs xen_privcmd 
> binfmt_misc intel_rapl_msr ext4 intel_rapl_common crc16 
> intel_uncore_frequency_common mbcache ipmi_ssif jbd2 nfit libnvdimm 
> ghash_clmulni_intel sha512_ssse3 sha512_generic aesni_intel acpi_ipmi 
> nft_ct crypto_simd cryptd mei_me mgag200 ipmi_si iTCO_wdt intel_pmc_bxt 
> ipmi_devintf drm_shmem_helper dell_smbios nft_masq iTCO_vendor_support 
> isst_if_mbox_pci drm_kms_helper isst_if_mmio dcdbas mei intel_vsec 
> isst_if_common dell_wmi_descriptor wmi_bmof watchdog pcspkr 
> intel_pch_thermal ipmi_msghandler i2c_algo_bit acpi_power_meter button 
> nft_nat joydev evdev sg nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 
> nf_defrag_ipv4 nf_tables nfnetlink drm fuse loop efi_pstore configfs 
> ip_tables x_tables autofs4 xfs libcrc32c crc32c_generic hid_generic 
> usbhid hid dm_mod sd_mod t10_pi crc64_rocksoft crc64 crc_t10dif 
> crct10dif_generic ahci libahci xhci_pci libata xhci_hcd
> kernel: [   99.762633]  megaraid_sas tg3 crct10dif_pclmul 
> crct10dif_common crc32_pclmul crc32c_intel bnxt_en usbcore scsi_mod 
> i2c_i801 libphy i2c_smbus usb_common scsi_common wmi
> kernel: [   99.764765] CPU: 10 PID: 1301 Comm: python3 Tainted: G 
> W  6.1.0-12-amd64 #1  Debian 6.1.52-1
> kernel: [   99.764989] Hardware name: Dell Inc. PowerEdge R750xs/0441XG, 
> BIOS 1.8.2 09/14/2022
> kernel: [   99.765214] RIP: e030:xen_mc_flush+0x196/0x220
> kernel: [   99.765436] Code: e2 06 48 01 da 85 c0 0f

Bug#1042842: network interface names wrong in domU (>10 interfaces)

2023-08-08 Thread Hans van Kranenburg

Hi,

On 8/8/23 15:22, Valentin Kleibel wrote:
>> On [0], you can read "In both cases the device naming is subject to the 
>> usual guest or backend domain facilities for renaming network devices".
>> It says "naming/renaming", but you can assume "detecting".
>>
>>> I also checked which net_ids udev knows about and the only things that 
>>> pop up are:
>>> ID_NET_NAMING_SCHEME=v247
>>> ID_NET_NAME_MAC=enx00163efd832b
>>> ID_OUI_FROM_DATABASE=Xensource, Inc.

What I do is stuff like this:

-$ cat /etc/udev/rules.d/70-persistent-vifname.rules

SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{nodename}=="device/vif/0", NAME="vlan2"
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{nodename}=="device/vif/1", NAME="vlan3"
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{nodename}=="device/vif/2", NAME="vlan4"
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{nodename}=="device/vif/3", NAME="vlan6"
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{nodename}=="device/vif/4", NAME="vlan9"
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{nodename}=="device/vif/5", NAME="vlan10"
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{nodename}=="device/vif/6", NAME="vlan11"
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{nodename}=="device/vif/7", NAME="vlan12"
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{nodename}=="device/vif/8", NAME="vlan13"
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{nodename}=="device/vif/9", NAME="vlan14"
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{nodename}=="device/vif/10", NAME="vlan15"
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{nodename}=="device/vif/11", NAME="vlan16"

The vif/X always matches the order in which you define the interfaces
inside the guest config file.

After starting to build router VMs (well before the whole interface
naming madness was a thing), it took only the first time we wanted to
throw away a vlan to realize that all the following ethX numbers would
shift by one, and from then on, I've always been using this to set my
own style of predictable names (whenever there's more than one,
otherwise it's just eth0).
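
Since the rules only differ in the vif index and the name, a file like
the one above could also be generated from the list of names in guest
config order. A minimal sketch, with a made-up helper name and example
names:

```python
RULE = ('SUBSYSTEM=="net", DRIVERS=="?*", '
        'ATTRS{{nodename}}=="device/vif/{index}", NAME="{name}"')


def vif_rules(names):
    """One rule per interface, in the order of the guest config file."""
    return [RULE.format(index=i, name=n) for i, n in enumerate(names)]


# Example names only; use whatever matches the vif list in the config.
for rule in vif_rules(["vlan2", "vlan3", "vlan4", "vlan6"]):
    print(rule)
```

Each rule is emitted on a single line, since a rule that is broken
across lines would not be read as one rule by udev.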

>> Is it from dom0 or domU ?
>> Are you using "net.ifnames=0" on the domU kernel command line ?
>> "v247" looks like systemd "predictive naming scheme" (eth -> enX).
>>  From bookworm on, domUs vifs get named enXN (enX0, enX1, ...).
>> Read on :
>> https://www.debian.org/releases/stable/i386/release-notes/ch-information.en.html#xen-network
> 
> This is from the domU, running bullseye with a bookworm dom0.
> 
>> See how ethN interfaces get messed up, like in your setup, but 
>> predictable names would work, as you can see in "altname enXN" :
>> eth1 (:01) -> enX1
>> eth2 (:10) -> enX10
>> eth3 (:02) -> enX2

But yeah, so, even while not depending on whatever order it gets
initialized, and still having it function correctly, this is still just
pretty annoying... If I'm doing stuff around here, and just quickly want
to look up things (e.g. messing around with vlan15 settings), and
quickly type ip a instead of having to spend more time typing ip a show
dev vlan15 jadijadi, I still every time get this short "WTF huh, argh",
raises arms, does table flip, grmbl grbml feeling for a split second.

2: vlan2: [...]

> I could not get our bullseye domU to show the "predictable names" even 
> though i tried installing the bullseye-backports kernel 6.1.
> After you wrote this i installed udev 252.5 from backports and it now 
> uses the correct enXn interface names, even with kernel 5.10.
> 
>> So, my answer does not tell you if something changed in Xen itself, only 
>> in Debian.
>> But I guess it relates to what Xen devs told us : vifs detection order 
>> cannot be relied upon, that's why "predictable names" were invented.
>> The vif detection part is related to the domains kernels, not Xen itself 
>> (at least that's what I understood).
>>
>> Using eth0 nowadays is a bit like using /dev/sda for hard drives, it's 
>> considered legacy as it may create problems in some setups, like yours 
>> (ie. for disks, it's recommended to use UUIDs or /dev/disk/by-*).
>>
>> I hope this answers your question.
> 
> Thank you, yes it does.
> 
> In our case the dom0 was updated to bookworm while the domU is still 
> running bullseye.
> -> updated Xen so the vif detection order changed (which we relied on)

I didn't read the other mailthread on the xen list fully yet. But, I
think it shouldn't be very hard to find the code changes and see if
it's deterministic and can just be fixed. Simply just to decrease the
totally unnecessary amount of silliness.

> -> the predictable network names for Xen don't work with bullseye
> 
> So my new resolution for bullseye domUs on a bookworm dom0 is to install 
> udev from backports and change the domUs network config to use the new 
> enXn naming scheme instead of ethn.

Or the "device/vif/X" way...

So, anyway, did someone already do some tests "just because we can" to
see how many network interfaces you can get added for fun, and if the
pattern keeps looking the same, also with enX4 enX40 .. enX49 enX5 etc?
:D enX1 enX10 enX100 .. enX109 enX11 enX110 argh o_O
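
As a small illustration of that pattern: it is what you get from plain
string comparison of the names. Whether the actual order seen in the
guest follows string comparison is an assumption here, not something
verified; the helper name is made up.

```python
import re

names = [f"enX{i}" for i in (0, 1, 2, 10, 11)]


def natural_key(name):
    """Split into text and number parts so that enX2 sorts before enX10."""
    return [int(p) if p.isdigit() else p for p in re.split(r"(\d+)", name)]


print(sorted(names))
# ['enX0', 'enX1', 'enX10', 'enX11', 'enX2']
print(sorted(names, key=natural_key))
# ['enX0', 'enX1', 'enX2', 'enX10', 'enX11']
```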

Have fun,
Hans

Bug#1027456: gcc-10: gcc segfaults> 'tree-optimization/99824' patch is a fix

2023-01-27 Thread Hans van Kranenburg

Control: tags -1 + fixed-upstream confirmed patch

Hi all,

I also ran into this issue while trying to build src:linux 6.1.7-1
targeting bullseye-backports.

I can confirm that I was able to build the kernel packages successfully
using gcc-10/10.2.1-6, with only the following patch on top:

https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=ee15832c53d52656e562c29110f2be1cfb66c450

ee15832c53 "tree-optimization/99824 - avoid excessive integer type
precision in VN"

So, in order to be able to do the next 'official' bullseye-backports for
src:linux I guess we first need this fix for gcc-10 to go into bullseye
via a stable point release?

Thanks,
Hans (Knorrie)

From ee15832c53d52656e562c29110f2be1cfb66c450 Mon Sep 17 00:00:00 2001
From: Richard Biener 
Date: Tue, 30 Mar 2021 11:22:52 +0200
Subject: [PATCH] tree-optimization/99824 - avoid excessive integer type
 precision in VN

VN sometimes builds new integer types to handle accesss where precision
of the access type does not match the access size.  The way
ao_ref_init_from_vn_reference is computing the access size ignores
the access type in case the ref operands have an outermost
COMPONENT_REF which, in case it is an array for example, can be
way larger than the access size.  This can cause us to try
building an integer type with precision larger than WIDE_INT_MAX_PRECISION
eventually leading to memory corruption.

The following adjusts ao_ref_init_from_vn_reference to only lower
access sizes via the outermost COMPONENT_REF but otherwise honor
the access size as specified by the access type.

It also places an assert in integer type building that we remain
in the limits of WIDE_INT_MAX_PRECISION.  I chose the shared code
where we set TYPE_MIN/MAX_VALUE because that will immediately
cross the wide_ints capacity otherwise.

2021-03-30  Richard Biener  

	PR tree-optimization/99824
	* stor-layout.c (set_min_and_max_values_for_integral_type):
	Assert the precision is within the bounds of
	WIDE_INT_MAX_PRECISION.
	* tree-ssa-sccvn.c (ao_ref_init_from_vn_reference): Use
	the outermost component ref only to lower the access size
	and initialize that from the access type.

	* gcc.dg/torture/pr99824.c: New testcase.
---
 gcc/stor-layout.c                      |  2 ++
 gcc/testsuite/gcc.dg/torture/pr99824.c | 33 ++
 gcc/tree-ssa-sccvn.c                   | 24 +++
 3 files changed, 49 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr99824.c

diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index bde6fa22b58a..57c8a2516d95 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -2816,6 +2816,8 @@ set_min_and_max_values_for_integral_type (tree type,
   if (precision < 1)
     return;
 
+  gcc_assert (precision <= WIDE_INT_MAX_PRECISION);
+
   TYPE_MIN_VALUE (type)
     = wide_int_to_tree (type, wi::min_value (precision, sgn));
   TYPE_MAX_VALUE (type)
diff --git a/gcc/testsuite/gcc.dg/torture/pr99824.c b/gcc/testsuite/gcc.dg/torture/pr99824.c
new file mode 100644
index 000000000000..9022d4a4b8e7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr99824.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+
+unsigned int
+strlenx(char *s)
+{
+  char *orig_s = s;
+  for (; *s; ++s)
+    ;
+  return s - orig_s;
+}
+
+struct i2c_adapter {
+    char name[48];
+};
+
+struct {
+    int instance;
+    struct i2c_adapter i2c_adap[];
+} * init_cx18_i2c_cx;
+
+const struct i2c_adapter cx18_i2c_adap_template = {""};
+int init_cx18_i2c___trans_tmp_1;
+
+void
+init_cx18_i2c()
+{
+  int i = 0;
+  for (;; i++) {
+      init_cx18_i2c_cx->i2c_adap[i] = cx18_i2c_adap_template;
+      init_cx18_i2c___trans_tmp_1
+	= strlenx(init_cx18_i2c_cx->i2c_adap[i].name);
+  }
+}
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index 4b280f21006e..926b4a976aec 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -996,22 +996,26 @@ ao_ref_init_from_vn_reference (ao_ref *ref,
   poly_offset_int size = -1;
   tree size_tree = NULL_TREE;
 
-  /* First get the final access size from just the outermost expression.  */
+  machine_mode mode = TYPE_MODE (type);
+  if (mode == BLKmode)
+    size_tree = TYPE_SIZE (type);
+  else
+    size = GET_MODE_BITSIZE (mode);
+  if (size_tree != NULL_TREE
+      && poly_int_tree_p (size_tree))
+    size = wi::to_poly_offset (size_tree);
+
+  /* Lower the final access size from the outermost expression.  */
   op = &ops[0];
+  size_tree = NULL_TREE;
   if (op->opcode == COMPONENT_REF)
     size_tree = DECL_SIZE (op->op0);
   else if (op->opcode == BIT_FIELD_REF)
     size_tree = op->op0;
-  else
-    {
-      machine_mode mode = TYPE_MODE (type);
-      if (mode == BLKmode)
-	size_tree = TYPE_SIZE (type);
-      else
-	size = GET_MODE_BITSIZE (mode);
-    }
   if (size_tree != NULL_TREE
-      && poly_int_tree_p (size_tree))
+      && poly_int_tree_p (size_tree)
+      && (!known_size_p (size)
+	  || known_lt (wi::to_poly_offset (size_tree), size)))
     size = wi::to_poly_offset (size_tree);

Bug#1028251: New Patch (Was: Re: Bug#1028251: xen: FTBFS when building xen binary packages for sid on x86_64)

2023-01-13 Thread Hans van Kranenburg

Hi,

On 1/13/23 22:45, Chuck Zmudzinski wrote:
> On 1/13/23 7:39 AM, Marek Marczykowski-Górecki wrote:
>> On Fri, Jan 13, 2023 at 12:58:29AM -0500, Chuck Zmudzinski wrote:
>>> On 1/11/2023 10:58 PM, Chuck Zmudzinski wrote:
>>>> On 1/9/23 12:55 PM, Hans van Kranenburg wrote:
>>>>> Hi!
[...]
Yolo style cutting out lines here...
[...]
>>>
>>> Regarding the systemd files causing ftbfs, this explains it:
>>>
>>> https://salsa.debian.org/xen-team/debian-xen/-/blob/master/m4/systemd.m4#L119
>>>
>>> and this:
>>>
>>> https://salsa.debian.org/xen-team/debian-xen/-/blob/master/tools/configure.ac#L480
>>>
>>> The comments indicate that using AX_AVAILABLE_SYSTEMD() will
>>> by default enable systemd if systemd development files are on the
>>> build system, and AX_ALLOW_SYSTEMD() means --enable-systemd
>>> must explicitly be passed to tools/configure to enable it. Upstream
>>> uses the former, so build systems with systemd development files
>>> by default will ftbfs because that produces missing files that dh_missing
>>> in debian/rules does not like.
>>>
>>> So the reason there is ftbfs on my system is that my system has
>>> the systemd development package installed.
>>
>> By the way, maybe a better fix would be to pass --enable-systemd, add 
>> libsystemd-dev
>> build-dep and list them in the package? They might require patching to
>> support Debian-specific upgrade machinery, though...
>>
>> Not installing xendriverdomain.service is one of things missing for
>> driver domains support
>> (https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=922033).
>>
> 
> Hi Marek,
> 
> I wouldn't be against fixing it that way. In fact, I would prefer
> that Debian packaged Xen with full support for native systemd units.
> I am willing to wait until if/when the package maintainers have
> full systemd support in the Xen packages.
> 
> Perhaps this is an opportunity for you to try to fix 922033 again.
> I see it has been sitting there for a few years now. Let's see
> what Hans thinks.

Yeah, well, so, the thing here is...

When Debian started to package Xen (thanks! Bastian, in 200X), the
upstream init scripts were copy pasted, and adjusted to have the ability
to have different Hypervisor-ABI-incompatible versions installed at the
same time. Also, this is related to the collection of Makefile patches
we carry around to have ABI-incompatible stuff end up in a directory
like /usr/lib/xen-4.14/ and /usr/lib/xen-4.17/ !

What does this mean? Well, in the most basic sense it means that you
could apt-get (dist-)upgrade and then still be able to xl shutdown a
domU afterwards before doing reboot, because it will choose the right
tools which match with the ABI of the *now* running hypervisor instead
of being left with a dumpster fire, which in the end causes you to shout
curse words and cause you to have to go to the machine and hold the
power button for 5 seconds to force power it off.
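
The idea of "choose the tools that match the running hypervisor" can be
sketched roughly as below. This is not the actual Debian wrapper code,
only an illustration: the function names are made up, and reading the
version from /sys/hypervisor/version assumes a Linux dom0 where those
sysfs files are present.

```python
from pathlib import Path, PurePosixPath


def tool_dir(major, minor, prefix="/usr/lib"):
    """Directory holding the tools for hypervisor version major.minor."""
    return PurePosixPath(prefix) / f"xen-{major}.{minor}"


def running_version(sysfs="/sys/hypervisor/version"):
    """Read the running Xen version; assumes the sysfs files exist."""
    base = Path(sysfs)
    return (int((base / "major").read_text()),
            int((base / "minor").read_text()))


# After an upgrade both directories may be installed at the same time;
# the version of the running hypervisor decides which one gets used.
print(tool_dir(4, 14))  # /usr/lib/xen-4.14
print(tool_dir(4, 17))  # /usr/lib/xen-4.17
```

On a dom0, tool_dir(*running_version()) would then point at the
directory matching the hypervisor that was booted, not the one that was
most recently installed.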

This is what matters when you upgrade from Xen 4.14 to Xen 4.17
during the upgrade from Debian 11/Bullseye to Debian 12/Bookworm: if
booting the whole new thing is a huge failure, it allows you to reset
the computer, and in grub, choose to use the previous Xen (possibly in
combination with the previous Debian linux kernel) and then have a
system where you at least can start your domUs again *) and first have
a good rest, a night of sleep, before starting to dig into what's going
wrong.

So, this is exactly the same way of doing stuff like how you can also
reboot back into the previous Linux kernel (ABI-compatible) one during a
system upgrade, even if you're not using Xen at all!

I like this very much. This is the kind of thing that helps admins of
systems that have just local disks and a few domUs. Like, the case where
you support some non-profit organization with their server stuff running
on donated hardware. (Yes, I also do some of those, I do!) And, in case
something does fail (there could always be something like a misbehaving
mpt3sas card in the hardware or anything that no one else spotted yet),
the admin does not have to end up in total panic mode after doing the
upgrade on a Friday afternoon lying upside down inside a broom closet,
but they can just at least recover from the situation and have something
that's running again, and then a day later, or 2 or 3 days or a week
later return on another planned moment to fix it, after asking around.

Upstream Xen stuff doesn't have anything like that.

But, they actually look at us, and they think, ooh, this is actually
nice, we should have that also by default.

The fact that we have these changed/altered/divergent init scripts in
Debian is the main reason that we cannot just enable

Bug#1028251: [Pkg-xen-devel] Bug#1028251: xen: FTBFS when building xen binary packages for sid on x86_64

2023-01-09 Thread Hans van Kranenburg

Hi!

On 09/01/2023 18:44, Chuck Zmudzinski wrote:
> Control: tag -1 + moreinfo
> 
> thanks
> 
> On 1/9/23 8:09 AM, Hans van Kranenburg wrote:
>> Hi Chuck,
>>
>> On 1/8/23 23:18, Chuck Zmudzinski wrote:
>>> [...]
>>>
>>> The build failed:
>>>
>>>    debian/rules override_dh_missing
>>> make[1]: Entering directory '/home/chuckz/sources-sid/xen/xen-4.17.0'
>>> dh_missing --list-missing
>>> dh_missing: warning: usr/lib/modules-load.d/xen.conf exists in debian/tmp but is not installed to anywhere
>>> dh_missing: warning: usr/lib/systemd/system/proc-xen.mount exists in debian/tmp but is not installed to anywhere
>>> dh_missing: warning: usr/lib/systemd/system/xen-init-dom0.service exists in debian/tmp but is not installed to anywhere
>>> dh_missing: warning: usr/lib/systemd/system/xen-qemu-dom0-disk-backend.service exists in debian/tmp but is not installed to anywhere
>>> dh_missing: warning: usr/lib/systemd/system/xen-watchdog.service exists in debian/tmp but is not installed to anywhere
>>> dh_missing: warning: usr/lib/systemd/system/xenconsoled.service exists in debian/tmp but is not installed to anywhere
>>> dh_missing: warning: usr/lib/systemd/system/xendomains.service exists in debian/tmp but is not installed to anywhere
>>> dh_missing: warning: usr/lib/systemd/system/xendriverdomain.service exists in debian/tmp but is not installed to anywhere
>>> dh_missing: warning: usr/lib/systemd/system/xenstored.service exists in debian/tmp but is not installed to anywhere
>>
>> I cannot reproduce this error here locally and the CI build also succeeds:
>>
>> https://salsa.debian.org/xen-team/debian-xen/-/pipelines/481577
> 
> I thought I had a fairly clean sid install, but I think the problem
> on my system could be caused by some obscure grandfathered in
> setting because the sid I am using was updated from all the way back to
> an original install of jessie many years ago...
> 
> It might be time for me to refresh my sid with a clean installation.
> 
> Out of curiosity and if you have time, can you answer a couple of
> questions if you know the answer?
> 
> 1. Do the builds on a clean environment produce the missing files
> listed in my build?

No, after my local package build, there are no such things in there:

~/build/xen/debian-xen/debian/tmp/usr/lib m (master) 1-$ ll
total 0
drwxr-xr-x 1 knorrie knorrie  110 Jan  8 23:51 debug
drwxr-xr-x 1 knorrie knorrie 2048 Jan  8 23:50 x86_64-linux-gnu
drwxr-xr-x 1 knorrie knorrie   20 Jan  8 23:51 xen-4.17

> 
> 2. Are those systemd service files installed anywhere in the xen
> binary packages, either in arch=x86_64 packages or for the arch=all
> packages such as xen-utils-common?

No, they are not:

https://packages.debian.org/search?searchon=contents&keywords=xenconsoled.service&mode=path&suite=unstable&arch=any

> If you don't know the answer to these questions I will investigate
> myself to find the answers, so you can work on more important things.
> 
>>
>> How are you building the packages? In a clean build environment, using
>> for example sbuild or pbuilder, or in an environment where unrelated
>> other build dependencies could be present, that are not included in the
>> xen list, but maybe 'wake up and do something' if they're present?
> 
> As I said, I am building on a sid install that might have some
> stuff grandfathered in from old releases going back to jessie.
> I also might have some stale stuff around from my private builds
> of the traditional device model available from xen that is not
> part of the Debian packages. I will investigate these possible causes.
> 
> I use debuild as a frontend to dpkg-buildpackage to build the packages.

Yes. So (I'm not entirely sure how it works, but as example, just making
something up here): After doing something else first, you might end up
with a system that has for example dh-systemd-yolo-all-the-things-helper
installed. And, it might be that only it being present means that the
package build process changes. It might even be a 'feature' of that
helper... "just add it to your build depends, and it will automatically
do all the things for you!!!~``1"

This is why it is very much recommended to build the packages using
something like sbuild, so that you can be sure that every time it will
start with a super minimal chroot which only has some essential things,
and that the only build dependencies used will be the ones that are
explicitly defined in the debian/control of the package.

>> You can also compare your own build output with the full one from the CI
>> job:
>>
>> https://salsa.debian.org/xen-team/debian-xen/-/jobs/3767564/raw
> 
> I will take a look at that when I get a chance.
> 
> This is not a real high priority for me, so I am content to let this
> be until I get a chance to investigate the quirks of my current
> installation of sid, and I also added the moreinfo tag, so you can
> ignore this bug if you wish until I do some further research. 

Sure, no problem.

Have fun,
Hans

Bug#1028251: [Pkg-xen-devel] Bug#1028251: xen: FTBFS when building xen binary packages for sid on x86_64

2023-01-09 Thread Hans van Kranenburg

Hi Chuck,

On 1/8/23 23:18, Chuck Zmudzinski wrote:
> [...]
> 
> The build failed:
> 
>    debian/rules override_dh_missing
> make[1]: Entering directory '/home/chuckz/sources-sid/xen/xen-4.17.0'
> dh_missing --list-missing
> dh_missing: warning: usr/lib/modules-load.d/xen.conf exists in debian/tmp but is not installed to anywhere
> dh_missing: warning: usr/lib/systemd/system/proc-xen.mount exists in debian/tmp but is not installed to anywhere
> dh_missing: warning: usr/lib/systemd/system/xen-init-dom0.service exists in debian/tmp but is not installed to anywhere
> dh_missing: warning: usr/lib/systemd/system/xen-qemu-dom0-disk-backend.service exists in debian/tmp but is not installed to anywhere
> dh_missing: warning: usr/lib/systemd/system/xen-watchdog.service exists in debian/tmp but is not installed to anywhere
> dh_missing: warning: usr/lib/systemd/system/xenconsoled.service exists in debian/tmp but is not installed to anywhere
> dh_missing: warning: usr/lib/systemd/system/xendomains.service exists in debian/tmp but is not installed to anywhere
> dh_missing: warning: usr/lib/systemd/system/xendriverdomain.service exists in debian/tmp but is not installed to anywhere
> dh_missing: warning: usr/lib/systemd/system/xenstored.service exists in debian/tmp but is not installed to anywhere

I cannot reproduce this error here locally and the CI build also succeeds:

https://salsa.debian.org/xen-team/debian-xen/-/pipelines/481577

How are you building the packages? In a clean build environment, using
for example sbuild or pbuilder, or in an environment where unrelated
other build dependencies could be present, that are not included in the
xen list, but maybe 'wake up and do something' if they're present?

You can also compare your own build output with the full one from the CI
job:

https://salsa.debian.org/xen-team/debian-xen/-/jobs/3767564/raw
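
Comparing the two logs could look like the following rough sketch. It
assumes both logs were saved to local files first, and it only looks at the
dh_missing warning lines. The function names and the file names in the usage
line are made up for this example.

```shell
#!/bin/sh
# Sketch only: show dh_missing warnings that appear in a local build log but
# not in the CI build log. Function names are hypothetical.
dh_missing_warnings() {
    # Keep only the warning lines, sorted and de-duplicated for comm(1).
    sed -n '/dh_missing: warning:/p' "$1" | sort -u
}

only_in_local_log() {
    local_log="$1"
    ci_log="$2"
    tmp_local=$(mktemp) || return 1
    tmp_ci=$(mktemp) || return 1
    dh_missing_warnings "$local_log" > "$tmp_local"
    dh_missing_warnings "$ci_log" > "$tmp_ci"
    # comm -23 prints the lines that are unique to the first file.
    comm -23 "$tmp_local" "$tmp_ci"
    rm -f "$tmp_local" "$tmp_ci"
}
```

Usage would be something like: only_in_local_log local-build.log ci-build.log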

Hans

Bug#1021668: xen: CVE-2022-33749 CVE-2022-33748 CVE-2022-33747 CVE-2022-33746

2022-11-05 Thread Hans van Kranenburg

Hi :)

On 04/11/2022 22:51, Salvatore Bonaccorso wrote:
> Hi Hans
> 
> On Fri, Nov 04, 2022 at 02:59:29PM +0100, Hans van Kranenburg wrote:
>> Aha!
>>
>> On 02/11/2022 21:53, Salvatore Bonaccorso wrote:
>>> Hi,
>>>
>>> On Wed, Nov 02, 2022 at 08:02:26PM +0100, Hans van Kranenburg wrote:
>>>> Hi,
>>>>
>>>> On 10/19/22 21:55, Moritz Muehlenhoff wrote:
>>>>>>> For the latest set of Xen issues my estimate is that we can postpone
>>>>>>> them until the next batch, they seem all of moderate/limited impact.
>>>>>>> But let me know if you think otherwise.
>>>>>>
>>>>>> I agree. Let's do them together with the new stuff that's planned for
>>>>>> Nov 1st, https://xenbits.xen.org/xsa/
>>>>>
>>>>> Ack, I've updated the Security Tracker.
>>>>
>>>> I'm having a look at this now, and while writing the changelog entry, I
>>>> run into the following thing:
>>>>
>>>> XSA-403 has 4 CVE numbers. AFAIUI the first two are about the fixes done
>>>> to Linux, and the other two are about changes to Xen. Shouldn't the
>>>> Debian security tracker reflect that?
>>>>
>>>> CVE-2022-26365 CVE-2022-33740 -> src:linux only ?
>>>> CVE-2022-33741 CVE-2022-33742 -> src:xen only ?
>>>
>>> Speaking for src:linux I do not think we need to change the tracking:
>>>
>>> CVE-2022-26365: 2f446ffe9d73 ("xen/blkfront: fix leaking data in shared pages")
>>> CVE-2022-33740: 307c8de2b023 ("xen/netfront: fix leaking data in shared pages")
>>> CVE-2022-33741: 4491001c2e0f ("xen/netfront: force data bouncing when backend is untrusted")
>>> CVE-2022-33742: 2400617da7ee ("xen/blkfront: force data bouncing when backend is untrusted")
>>
>> Riiight. Thanks, now I get why I cannot find any CVE number related to
>> XSA-403 listed in the Xen upstream changes (at least for 4.14 which I'm
>> working on now). They're all over there at the Linux side.
> 
> It looks that there are still changes needed on the xen side, at least
> that is my understanding reading through 
> https://xenbits.xen.org/xsa/advisory-403.html 
> Quoting the advisory:
> 
> | For the stable branches (Xen 4.16.x - Xen 4.13.x) patch 1 introduces support to
> | libxl for libxl_{disk,nic}_backend_untrusted environment variable to be used in
> | order to set whether disk and network backends should be trusted.  Patch 2
> | reverts patch 1 and instead provides the more fine grained per-device options
> | that break the libxl ABI.
> | 
> | Note that applying patch 2 to any of the stable releases will require a rebuild
> | of any consumers of the libxl library, as it introduces an ABI breakage and
> | hence won't be applied to the official repository stable branches.  Users of
> | stable releases wanting to use the functionality provided by patch 2 will need
> | to apply it manually.
> 
> This is the reason that in fact for those four CVEs, we have marked
> for bullseye:
> 
> [bullseye] - xen  (Too intrusive too backport)
> 
> The "signaling of whether a frontend should consider a backend as
> potentially malicious can be done **from either the Linux kernel
> command line or the toolstack.**" (highlighting is added by me).
> 
> So IMHO it is similarly correct to track src:xen under those CVEs, and
> they are marked as fixed with 4.16.2-1. *But* for bullseye, they can
> be ignored due to above reasons.

Yes, so the Xen part is about the "reporting whether the backend is to
be trusted".

That 'patch 1', the all-or-nothing option to signal the guest kernel, is
now included with this update. But neither that change, nor the more
fine-grained patch 2, is directly linked to a CVE number. That change by
itself will not fix anything for any of the 4 CVE numbers.

Also, for 4.16 the story is the same, by the way. It's only in 4.17,
which is to be released in the upcoming week, that the otherwise libxl
ABI breaking changes are fully included, but even that doesn't change
anything for the CVE administration.

After all, it is a bit of a moot point for us. The only scenario in
which all of this is relevant is when using a 'driver domain' to
delegate the blk/net backend part to another untrusted guest domain.
Using this functionality is not properly enabled/supported out of the
box in our package builds for Debian.

Sometimes these XSA are like a little scavenger hunt.

Hans

Bug#1021668: xen: CVE-2022-33749 CVE-2022-33748 CVE-2022-33747 CVE-2022-33746

2022-11-04 Thread Hans van Kranenburg

Aha!

On 02/11/2022 21:53, Salvatore Bonaccorso wrote:
> Hi,
> 
> On Wed, Nov 02, 2022 at 08:02:26PM +0100, Hans van Kranenburg wrote:
>> Hi,
>>
>> On 10/19/22 21:55, Moritz Muehlenhoff wrote:
>>>>> For the latest set of Xen issues my estimate is that we can postpone
>>>>> them until the next batch, they seem all of moderate/limited impact.
>>>>> But let me know if you think otherwise.
>>>>
>>>> I agree. Let's do them together with the new stuff that's planned for
>>>> Nov 1st, https://xenbits.xen.org/xsa/
>>>
>>> Ack, I've updated the Security Tracker.
>>
>> I'm having a look at this now, and while writing the changelog entry, I
>> run into the following thing:
>>
>> XSA-403 has 4 CVE numbers. AFAIUI the first two are about the fixes done
>> to Linux, and the other two are about changes to Xen. Shouldn't the
>> Debian security tracker reflect that?
>>
>> CVE-2022-26365 CVE-2022-33740 -> src:linux only ?
>> CVE-2022-33741 CVE-2022-33742 -> src:xen only ?
> 
> Speaking for src:linux I do not think we need to change the tracking:
> 
> CVE-2022-26365: 2f446ffe9d73 ("xen/blkfront: fix leaking data in shared pages")
> CVE-2022-33740: 307c8de2b023 ("xen/netfront: fix leaking data in shared pages")
> CVE-2022-33741: 4491001c2e0f ("xen/netfront: force data bouncing when backend is untrusted")
> CVE-2022-33742: 2400617da7ee ("xen/blkfront: force data bouncing when backend is untrusted")

Riiight. Thanks, now I get why I cannot find any CVE number related to
XSA-403 listed in the Xen upstream changes (at least for 4.14 which I'm
working on now). They're all over there at the Linux side.

Hans

Bug#1021668: [Pkg-xen-devel] Bug#1021668: xen: CVE-2022-33749 CVE-2022-33748 CVE-2022-33747 CVE-2022-33746

2022-11-02 Thread Hans van Kranenburg

Hi,

On 10/19/22 21:55, Moritz Muehlenhoff wrote:
>>> For the latest set of Xen issues my estimate is that we can postpone
>>> them until the next batch, they seem all of moderate/limited impact.
>>> But let me know if you think otherwise.
>>
>> I agree. Let's do them together with the new stuff that's planned for
>> Nov 1st, https://xenbits.xen.org/xsa/
> 
> Ack, I've updated the Security Tracker.

I'm having a look at this now, and while writing the changelog entry, I
run into the following thing:

XSA-403 has 4 CVE numbers. AFAIUI the first two are about the fixes done
to Linux, and the other two are about changes to Xen. Shouldn't the
Debian security tracker reflect that?

CVE-2022-26365 CVE-2022-33740 -> src:linux only ?
CVE-2022-33741 CVE-2022-33742 -> src:xen only ?

And for XSA-403, at first upstream was unsure about what to do for older
Xen versions where the patches would be an ABI breaker. In the end, they
did apply the more coarse-grained patch to at least offer some kind of
mitigation in case a user wants to use it.

So, the changelog line I'm including now will just be:
  - Linux disk/nic frontends data leaks
XSA-403 CVE-2022-33741 CVE-2022-33742

HTH,
Hans

Bug#1021668: [Pkg-xen-devel] Bug#1021668: xen: CVE-2022-33749 CVE-2022-33748 CVE-2022-33747 CVE-2022-33746

2022-10-19 Thread Hans van Kranenburg

Hi,

On 18/10/2022 22:31, Moritz Muehlenhoff wrote:
> On Tue, Oct 18, 2022 at 02:17:32PM +0200, Hans van Kranenburg wrote:
>> Does explicitly opening a BTS bug mean that, like we used to call it,
>> "these CVEs warrant a DSA",
> 
> No, in general we aim to file bugs for any open CVEs regardless of
> the DSA state. This allows people to see that an issue is known
> (and some maintainers might also not have noticed in time).

Ok!

>> and that it is a request for an ASAP package
>> update and preparing a security update for stable, or, is this a new
>> thing where BTS bugs are opened for packages, just in case the
>> maintainer did not already track security issues themselves actively?
> 
> For the latest set of Xen issues my estimate is that we can postpone
> them until the next batch, they seem all of moderate/limited impact.
> But let me know if you think otherwise.

I agree. Let's do them together with the new stuff that's planned for
Nov 1st, https://xenbits.xen.org/xsa/

Hans

Bug#1021668: [Pkg-xen-devel] Bug#1021668: xen: CVE-2022-33749 CVE-2022-33748 CVE-2022-33747 CVE-2022-33746

2022-10-18 Thread Hans van Kranenburg

Hi!

On 10/12/22 19:38, Moritz Mühlenhoff wrote:
> Source: xen
> X-Debbugs-CC: t...@security.debian.org
> Severity: important
> Tags: security
> 
> Hi,
> 
> The following vulnerabilities were published for xen.
> 
> CVE-[...]

Thanks for the overview. The XAPI one indeed does not apply to src:xen.

I have a question, since the 'bug' report does not contain a question,
or explicit call for action, and I have not seen it in this way before.

Does explicitly opening a BTS bug mean that, like we used to call it,
"these CVEs warrant a DSA", and that it is a request for an ASAP package
update and preparing a security update for stable, or, is this a new
thing where BTS bugs are opened for packages, just in case the
maintainer did not already track security issues themselves actively?

I'm just wondering...

Thanks,
Hans

Bug#1021215: Kind request for backports of libtraceevent and libtracefs

2022-10-03 Thread Hans van Kranenburg

Package: src:libtraceevent
Version: 1:1.1.2-1

Hi maintainer, :)

Linux commit fe4d0d5dde ("rtla/Makefile: Properly handle dependencies")
helps making the dependency on libtraceevent and libtracefs more
explicit, so that the users run into less weird problems on the go.

Linux 5.19 is in Debian unstable now, and for the stable-backports
packages that our kernel team is providing, this means that it will
FTBFS, unless we either:
- exclude rtla for bullseye-backports
- have backports for libtraceevent and libtracefs present

So, the question for you is: Do you want to also provide
bullseye-backports packages for libtraceevent and libtracefs?

About making dependencies explicit in the kernel package:
https://salsa.debian.org/kernel-team/linux/-/merge_requests/539

Currently shipping without rtla, so far:
https://salsa.debian.org/benh/linux/-/commit/15b6859742d404abdcd68bcb589f8a8e2dfb6ce4

Thanks,
Hans

Bug#1020787: Xen: After updating to 5.19 kernel the VMs are started without XSAVE CPU flags

2022-09-28 Thread Hans van Kranenburg

Hi!

On 9/28/22 00:55, Diederik de Haas wrote:
> On Wednesday, 28 September 2022 00:24:27 CEST Patrick wrote:
>> I just applied the patch
>> (xen.git-c3bd0b83ea5b7c0da6542687436042eeea1e7909.patch) to the xen
>> packages and can confirm that this fixes the problems. The xsave flags are
>> available again and thus the binaries work too.
> 
> That is awesome, thank you :-)
> 
> IIUC:
> - Xen upstream will backport the patch to the stable branches; I do not know 
> when that will happen
> - Debian's package will probably be updated before that and 4.16.2-2 will be 
> uploaded to Sid Soon (tm) with that patch applied

Thanks for doing the investigation!

I'm currently preparing 4.16.2-2 which includes the fix.

Hans

Bug#1016547: [Pkg-xen-devel] Bug#1016547: /etc/default/grub.d/xen.cfg: Extraneous output line causes error message at boot

2022-08-03 Thread Hans van Kranenburg

Hi John,

On 8/2/22 19:50, John E. Krokes wrote:
> Package: xen-hypervisor-common
> Version: 4.14.5+24-g87d90d511c-1
> Severity: minor
> File: /etc/default/grub.d/xen.cfg
> 
> Dear Maintainer,
> 
> When invoked via grub-mkconfig, xen.cfg outputs this as its first line:
>   Including Xen overrides from /etc/default/grub.d/xen.cfg
> 
> The output of grub-mkconfig is expected to be redirected into a grub.cfg file.
> Grub will read the grub.cfg at boot. Unfortunately, "Including" is not a
> valid grub command. So when booting, grub emits this error message before
> displaying its menu:
>   error: can't find command `Including'.

Aha! Nice catch. That's indeed something that should be improved.

> [...]
> 
> The error message is obscured very quickly. It does not affect functionality
> in any way. It requires booting on a VERY slow machine in order to read
> the error message at all.
> 
> 
> If I add a '#' to the start of the "Including", the resulting grub config file
> boots with no error.
>   echo "#Including Xen overrides from /etc/default/grub.d/xen.cfg"
> 
> I'm not sure if this line was intended to go into the generated config
> file as a comment, or if it was intended to be shown to the user while
> grub-mkconfig is running.

I'm sure it's the latter, yes. Just some 'hey! I'm doing this now' message.

> I have observed this and tested my fix against version
> 4.14.5+24-g87d90d511c-1 of xen-hypervisor-common. I have also checked
> with the debian git at 
> https://salsa.debian.org/xen-team/debian-xen/-/blob/master/debian/tree/xen-hypervisor-common/etc/default/grub.d/xen.cfg.
>  This line
> has not changed in a very long time.
> 
> 
> I can also duplicate the behavior using grub-emu, with the output redirected
> to a file.
> 
> 
> I am running devuan, and originally reported this to their BTS but was
> redirected to debian. So my version number does not match. Apologies for
> that.

It's ok. The changes/improvements for this will end up in the Xen 4.16
package that's in Debian unstable now, anyway.

So, in our grub.d/xen.cfg file, there are two places that cause text
output: the 'Including Xen overrides ...' informational one, and the
notification/warning about overriding GRUB_DEFAULT.

The grub.d/* files are executed (sourced) in the context of the
grub-mkconfig itself using '.'. In there, I can see that similar status
messages are just redirected to stderr. We can do the same here. For the
warning, there's a grub_warn helper function, which we can use.

So, that results in the following changes I have here now:

diff --git a/default/grub.d/xen.cfg b/default/grub.d/xen.cfg
index d35744e..42670eb 100644
--- a/default/grub.d/xen.cfg
+++ b/default/grub.d/xen.cfg
@@ -5,7 +5,7 @@
 # The configuration in here makes it possible to have different options set
 # for the linux kernel when booting with or without Xen.

-echo "Including Xen overrides from /etc/default/grub.d/xen.cfg"
+echo "Including Xen overrides from /etc/default/grub.d/xen.cfg" >&2

 ###
 # Xen Hypervisor Command Line Options
@@ -83,8 +83,8 @@ GRUB_CMDLINE_LINUX_XEN_REPLACE="earlyprintk=xen console=hvc0 noresume"
 #XEN_OVERRIDE_GRUB_DEFAULT=
 #
 if [ "$XEN_OVERRIDE_GRUB_DEFAULT" = "" ]; then
-   echo "WARNING: GRUB_DEFAULT changed to boot into Xen by default!"
-   echo " Edit /etc/default/grub.d/xen.cfg to avoid this warning."
+   grub_warn "GRUB_DEFAULT changed to boot into Xen by default!" \
+ "Edit /etc/default/grub.d/xen.cfg to avoid this warning."
    XEN_OVERRIDE_GRUB_DEFAULT=1
 fi
 if [ "$XEN_OVERRIDE_GRUB_DEFAULT" = "1" ]; then

None of this output will now be mixed with the generated config any more.
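
To illustrate why the redirect helps, here is a small stand-alone sketch. It
is not the real xen.cfg or grub-mkconfig. The names snippet and
generate_config are made up, they only mimic a sourced snippet and a
generator whose stdout ends up in the generated file.

```shell
#!/bin/sh
# Sketch only: a stand-in for a configuration snippet that gets sourced by a
# generator whose stdout is the generated file. Not the real xen.cfg.
snippet() {
    echo "Including overrides (stdout)"        # would land in the output file
    echo "Including overrides (stderr)" >&2    # only shown to the user
}

generate_config() {
    snippet
    echo "set timeout=5"                       # the actual generated content
}
```

When running generate_config > demo.cfg, the stderr line still shows up on
the terminal, while demo.cfg only gets the stdout line and the generated
content.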

This will be in the next package upload.

https://salsa.debian.org/xen-team/debian-xen/-/commits/wip/sid

Thanks,
Hans

Bug#1008048: RM: xen [i386] -- ROM; ANAIS; stop building for i386

2022-03-21 Thread Hans van Kranenburg


Package: ftp.debian.org
Severity: normal
X-Debbugs-CC: pkg-xen-de...@lists.alioth.debian.org

Hi,

Starting with Xen version 4.16, we're dropping support for the i386 arch.

There are currently already no reverse dependencies left on i386 
specific packages in unstable. We have worked together with collectd, 
libvirt and qemu maintainers to have their packages changed to remove 
i386 related xen things.


So, I understand that we now can ask for removal of the leftover Xen 
4.14 packages in i386, which will unblock the migration of Xen 4.16 to 
testing.


Thanks,
Hans van Kranenburg
Debian Xen Team

Bug#988333: [Pkg-xen-devel] Bug#988333: Bug#988333: libxenmisc4.16: libxl fails to grant necessary I/O memory access for gfx_passthru of Intel IGD

2022-03-08 Thread Hans van Kranenburg


On 3/7/22 18:30, Chuck Zmudzinski wrote:

[...]


Thanks for adding all the info and researching this, Chuck!

Hans

Bug#921187: Getting rid of rdepends on libxenmisc4.X so we can do backports

2022-02-27 Thread Hans van Kranenburg

I see the mail thread 'RFC: qemu and Xen ABI-unstable libs' on the 
upstream xen-devel mailing list did not get referenced from this Debian 
bug yet:


https://lists.xenproject.org/archives/html/xen-devel/2020-09/threads.html#01299

It contains a lot of info about the actual work that needs to be done.

Hans

Bug#1005176: xen-utils-4 library dependencies need update

2022-02-25 Thread Hans van Kranenburg


tags 1005176 + moreinfo
thanks

Hi Elliott, :)

On 2/8/22 14:19, Elliott Mitchell wrote:

> Package: src:xen
> Version: 4.16.0-1~exp1
> 
> I'm guilty of pulling in later Xen source and building it based on
> the experimental 4.16 packaging.  As such this may actually only be
> an issue for a package version beyond 4.16.0.
> 
> I'm uncertain which it is, but xen-utils-4.16 appears to need an
> update to one or more of libxencall1, libxenevtchn1,
> libxenforeignmemory1, libxengnttab1 and/or libxentoollog1 in order to
> function.
> 
> During my initial update I merely updated libxenmisc4.16 and
> libxenstore4.  In this condition something (I suspect xenstored) was
> rather broken and things were unusable.
> 
> Notably `xl list` was hanging.  I was unable to get VMs started and
> it felt like everything wanted to explode.


This one is really too vague to be able to react to in any sensible manner.

Reading it was a fun experience though. It made me think of creating a
bingo card with 30 possible things that a bug reporter can say that are 
a synonym of "it doesn't work".


I hope this message does not come across as offensive, it's in no way 
meant as such. :D


I do appreciate your contributions and you sharing thoughts about 
possible things that could be done and could be improved.


However, I hope you understand that there's no way we can help when you 
use something else than the actual packages in Debian, do not provide 
any error messages seen, and describe what you see instead as "it felt 
like everything wanted to explode".


For me, Xen 4.16 does run OK on my test servers, FWIW.

Have fun,
Hans

Bug#1004269: Debian Bug#1004269: Linker segfault while building src:xen

2022-01-23 Thread Hans van Kranenburg


(To both the Debian bug # and xen-devel list, reply-all is fine)
Hi Xen people,

I just filed a bug at Debian on the binutils package, because since the 
latest binutils package update (Debian 2.37.50.20220106-2), Xen (both 
4.14 and 4.16) fails to build with a segfault at the following point:


x86_64-linux-gnu-ld -mi386pep --subsystem=10 
--image-base=0x82d04000 --stack=0,0 --heap=0,0 
--section-alignment=0x20 --file-alignment=0x20 
--major-image-version=4 --minor-image-version=16 --major-os-version=2 
--minor-os-version=0 --major-subsystem-version=2 
--minor-subsystem-version=0 --no-insert-timestamp --build-id=sha1 -T 
efi.lds -N prelink.o 
/builds/xen-team/debian-xen/debian/output/source_dir/xen/common/symbols-dummy.o 
-b pe-x86-64 efi/buildid.o -o 
/builds/xen-team/debian-xen/debian/output/source_dir/xen/.xen.efi.0x82d04000.0 
&& :

Segmentation fault (core dumped)

Full message and links to build logs etc are in the initial bug message, 
to be seen at https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1004269


We (Debian Xen Team) are awaiting response, but I thought to also let 
you know already.


* Does the above error 'ring a bell'?
* Can you maybe also reproduce this in a development environment with 
very latest binutils?
* Maybe someone has a useful comment for the Debian binutils maintainer 
about what's happening in this step of the build?

* Any suggestions about what we can do to help figure this out?
* We'll try to help debug, but will surely appreciate upstream help if 
things get too technical. It's simply the case that I did not have to 
look into a very similar issue before, so it's new.


Thanks!
Hans

Bug#1004269: Linker segfault while building src:xen

2022-01-23 Thread Hans van Kranenburg


Package: src:binutils
Version: 2.37.50.20220106-2
X-Debbugs-CC: pkg-xen-de...@lists.alioth.debian.org

Hi,

With the last binutils version src:xen starts to FTBFS.

 >8  Xen 4.16 for experimental  >8 

* Last passed build, using binutils 2.37-10.
Job overview:
 https://salsa.debian.org/xen-team/debian-xen/-/pipelines/329021
Full log:
 https://salsa.debian.org/xen-team/debian-xen/-/jobs/2290845/raw

* First failed build, using the same source code, and using binutils 
2.37.50.20220106-2:

Job overview:
 https://salsa.debian.org/xen-team/debian-xen/-/pipelines/338409
Full log:
 https://salsa.debian.org/xen-team/debian-xen/-/jobs/2375744/raw

At the end of the full log, the failure can be observed:

x86_64-linux-gnu-ld -mi386pep --subsystem=10 
--image-base=0x82d04000 --stack=0,0 --heap=0,0 
--section-alignment=0x20 --file-alignment=0x20 
--major-image-version=4 --minor-image-version=16 --major-os-version=2 
--minor-os-version=0 --major-subsystem-version=2 
--minor-subsystem-version=0 --no-insert-timestamp --build-id=sha1 -T 
efi.lds -N prelink.o 
/builds/xen-team/debian-xen/debian/output/source_dir/xen/common/symbols-dummy.o 
-b pe-x86-64 efi/buildid.o -o 
/builds/xen-team/debian-xen/debian/output/source_dir/xen/.xen.efi.0x82d04000.0 
&& :

Segmentation fault (core dumped)

The above logs are for src:xen 4.16.0-1~exp1 which we were about to 
upload to experimental.


 >8  Xen 4.14 currently in unstable  >8 

I also triggered a CI run again for the current src:xen 
4.14.3+32-g9de3671772-1. The same segfault happens there, and both for 
the amd64 and i386 build test (i386 is no longer included for Xen 4.16).


Job overview:
 https://salsa.debian.org/xen-team/debian-xen/-/pipelines/340556
Full logs:
 https://salsa.debian.org/xen-team/debian-xen/-/jobs/2394079/raw
 https://salsa.debian.org/xen-team/debian-xen/-/jobs/2394080/raw

 >8 

So, this is what we observe. In the Debian Xen team, there's not a great 
amount of knowledge about the exact internals of what happens here.


* At least, we can let you know there's a regression.
* Currently progress on our Xen 4.16 upload is blocked, and we also 
can't do updates of the current Xen 4.14 packages (e.g. because of 
security fixes).
* We're available to help debugging this issue if needed. We'll need 
guidance, so it will mean that we'll work based on your instructions.
* After sending this report and getting the confirmation from the BTS, 
I'll send a reply with the upstream Xen development mailing list in Cc.


Thanks in advance,
Hans van Kranenburg

Bug#1002658: [Pkg-xen-devel] Bug#1002658: FTBFS with OCaml 4.13.1

2021-12-27 Thread Hans van Kranenburg

Hi Stéphane,

On 12/26/21 9:06 PM, Stéphane Glondu wrote:
> [...]
> 
> Your package FTBFS with OCaml 4.13.1 with the following error:
>> [...]
>>57 | #define Some_val(v) Field(v,0)
>>   | 
>> In file included from /usr/lib/ocaml/caml/alloc.h:24,
>>  from xentoollog_stubs.c:23:
>> /usr/lib/ocaml/caml/mlvalues.h:404: note: this is the location of the 
>> previous definition
>>   404 | #define Some_val(v) Field(v, 0)
>>   | 
>> cc1: all warnings being treated as errors

Thanks for the report.

There is an upstream fix for the ocaml redefinition issues, so that's at
least a good thing.

This fix is already in the released Xen 4.16, but not in Xen 4.14 that
is in unstable now.

https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=2d1a35f1e6c2113a6322fdb758a198608c90e4bd

We're currently preparing the upload of the new Xen 4.16 to Debian
experimental->unstable. So, either when that one reaches unstable, or,
also, when we have to do another intermediate Xen 4.14 upload to
unstable first (e.g. some more urgent security fixes), we can resolve this.

Let me know if you have specific wishes about deadlines for completing
the OCaml transition. It doesn't take much effort for us to do a -2
upload to unstable that includes only the above upstream fix. So, if we
end up in the critical path of your work, feel free to ask for that.

Thanks,
Hans

Bug#992909: xen-utils-4.14: please stop recommending libc6-xen on i386

2021-12-17 Thread Hans van Kranenburg

Hi Aurelien,

On 8/24/21 10:58 PM, Aurelien Jarno wrote:
> 
> Due to the removal of 32-bit PV in Linux kernel 5.9 and the removal of
> the "nosegneg" hwcap from glibc 2.32, the libc6-xen package is not built
> anymore by the glibc package. This is already the case in experimental,
> and will be soon the case in testing. Could you please update
> xen-utils-4.14 to stop recommending this package?

Yes! It will be part of the Xen 4.16 upload that we're preparing now.

Thanks,
Hans

Bug#994899: [Pkg-xen-devel] Bug#994899: xen-hypervisor-4.14-amd64 breaks system poweroff on bullseye

2021-11-27 Thread Hans van Kranenburg

Hi all,

On 10/5/21 2:16 AM, Chuck Zmudzinski wrote:
> On 10/4/2021 1:51 PM, Diederik de Haas wrote:
>> On Monday, 4 October 2021 17:27:22 CEST Chuck Zmudzinski wrote:
>>>   I can confirm these 4 fix the bug on my hardware.
>> \o/
>> Thanks for testing and reporting back :-)
> 
> Thank you, Diederik, for your good work finding the commits
> from upstream that fix the bug. And also thanks to you, Andy,
> for helping fix this bug in the IRC and for your interest and
> support of the Debian Xen Team's work.

So, we're in the process of actually doing a package update now, which
includes these fixes.

I can confirm that my HP DL360 hardware at work also did not fully power
off. And now, it does:

-------- >8 --------

[  OK  ] Reached target Late Shutdown Services.
[  OK  ] Finished System Power Off.
[  OK  ] Reached target System Power Off.
[16302.044148] reboot: Power down
(XEN) Disabling non-boot CPUs ...
(XEN) Broke affinity for IRQ12, new: ,
(XEN) Broke affinity for IRQ1, new: ,
(XEN) Broke affinity for IRQ9, new: ,
(XEN) Broke affinity for IRQ16, new: ,
(XEN) Broke affinity for IRQ17, new: ,
(XEN) Broke affinity for IRQ99, new: ,
(XEN) Broke affinity for IRQ112, new: ,
(XEN) Broke affinity for IRQ113, new: ,
(XEN) Broke affinity for IRQ114, new: ,
(XEN) Broke affinity for IRQ115, new: ,
(XEN) Broke affinity for IRQ116, new: ,
(XEN) Broke affinity for IRQ117, new: ,
(XEN) Broke affinity for IRQ118, new: ,
(XEN) Broke affinity for IRQ119, new: ,
(XEN) Broke affinity for IRQ120, new: ,
(XEN) Broke affinity for IRQ121, new: ,
(XEN) Broke affinity for IRQ122, new: ,
(XEN) Broke affinity for IRQ123, new: ,
(XEN) Broke affinity for IRQ124, new: ,
(XEN) Broke affinity for IRQ125, new: ,
(XEN) Broke affinity for IRQ126, new: ,
(XEN) Broke affinity for IRQ127, new: ,
(XEN) Entering ACPI S5 state.
 The server is not powered on.  The Virtual Serial Port is not available.

-------- >8 --------

So, this one will get closed when we do the upload to unstable.

Besides that, it will of course also be fixed in stable if we get the
same fixes in there in the next few days.

Hans

Bug#988333: [Pkg-xen-devel] linux-image-5.10.0-6-amd64: VGA Intel IGD Passthrough to Debian Xen HVM DomUs not working, but Windows Xen HVMs do work

2021-10-23 Thread Hans van Kranenburg

Hi!

On 10/19/21 5:44 AM, Chuck Zmudzinski wrote:
> On 5/10/2021 1:33 PM, Chuck Zmudzinski wrote:
>> [...] with buster and bullseye running as the Dom0, I can only get the 
>> VGA/Passthrough feature to work with Windows Xen HVMs. I would expect both 
>> Windows and Linux HVMs to work comparably well.

You don't mention the Xen version (Debian package version) you use on
buster and bullseye anywhere, so I'll assume it's the latest
4.14.3-1(~deb11u1) one.

> [...]
> 
> The biggest problems were that the Dom0 reported problems
> with IRQ 16 being disabled after starting the bullseye HVM DomU,
> and only xl destroy could be used to stop the corrupted process.

Well, at least we have an error somewhere already. That's a starting point.

Can you share the domU config file?

Also, can you share the other configuration you need to have in place to
keep the devices from being seen as normal devices directly in dom0? (I
haven't used passthrough myself yet, but I read that this is needed.)

Can you share the more verbose logging that xl produces when you start
the domU with xl -vvv create?
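
To keep that output in a file that can be attached to the bug report, a
small wrapper like the sketch below could be used. The function name and
the default log file name are made up for illustration; the only real
command in it is xl -vvv create.

```shell
# Hypothetical helper: run "xl -vvv create" on a domU config file and keep
# a copy of everything it prints (stdout and stderr) in a log file.
xl_verbose_create() {
    cfg="$1"                      # path to the domU config file
    log="${2:-xl-create.log}"     # arbitrary default name for the log file
    xl -vvv create "$cfg" 2>&1 | tee "$log"
}
```

It would be called with the path of the domU config file as first
argument, and optionally a log file name as second argument.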

But, AFAIK, what you want to do should be possible, yes.

> The bullseye HVM DomU still fails to boot on an up-to-date
> bullseye Xen Dom0 configured to pass through the same PCI/IGD
> devices. The bullseye HVM DomU with IGD passthrough has so
> far only been verified to work on an old, slightly modified
> jessie Xen Dom0.
> 
> More Details: These latest tests are with linux version 5.10.70-1
> for bullseye stable. For the jessie Dom0, which worked with the
> unmodified bullseye HVM DomU, I had to add a few patches to
> the old jessie Xen packages so the unmodified bullseye Xen HVM

Ok, yes, clear, that makes the domU kernel not the primary suspect.

> These tests demonstrate that a fix for this bug is possible in src:xen
> rather than in src:linux, but the patches needed to fix this bug in
> Xen 4.14, which is the version of Xen on bullseye, are not yet
> identified.

It might also be possible (just a wild guess) that for Xen 4.14, the
options in the domU config file need to be different than for Xen 4.4.

> I will continue to investigate this issue and try to bisect the problem
> as it recurs in Dom0 for some version of Xen > 4.4 and <= 4.14. It
> will obviously take some time since there are so many differences
> between Xen 4.4 and 4.14.

If you can make progress on that, and find an actual commit that changes
the behavior, then we're probably at 95% towards finding a cause and
solution. :) That'd be great.

A possible time-saver that I can recommend is to send a post to the
upstream xen-users list [0] about this already. Like "Hi all, I'm
starting a HVM Linux domU with Linux 5.10.70 on a Xen 4.14.3 system with
also 5.10.70 dom0 kernel, with this and this domU config file. It fails
to start, this is the xl -vvv create output, and this error (the irq
stuff) appears in the dom0 kernel log.". Try to keep it simple and not
too long initially, without the surrounding stories, to increase the
chance of it being fully read.

> If I find a fix in src:xen for Xen >=4.14 Dom0 on bullseye or sid, I will
> reassign #988333 to src:xen myself. Until then, I will leave it to the
> discretion of the Debian Kernel Team to decide whether or not to
> reassign it to src:xen now.

Yes, that makes sense indeed, I'll do it in a minute. Even though we
don't know if it has to do with the Xen or dom0 kernel code, it's more
likely that in either case, we'll end up asking the upstream Xen people
about it.

Have fun,
Hans

[0] https://lists.xenproject.org/mailman/listinfo/xen-users

Bug#991967: Simply ACPI powerdown/reset issue?

2021-10-04 Thread Hans van Kranenburg

Hi Elliott and others,

Also including #994899 for once, since that's the bug number for the Xen
issue now.

On 9/26/21 5:27 AM, Elliott Mitchell wrote:
> On Tue, Sep 21, 2021 at 06:33:20AM -0400, Chuck Zmudzinski wrote:
>> I presume you are suggesting I try booting 4.19.181-1 on the
>> current version of Xen-4.14 for bullseye as a dom0. I am not
>> inclined to try it until an official Debian developer endorses
>> your opinion that the bug I am seeing is distinct
>> from #991967, at which point I will report the bug I am
>> seeing as a new bug.
> 
> Chuck Zmudzinski you are getting rather close to my threshold for calling
> harrassment.  You're not /quite/ there, but I'm concerned.
> 
> 
> Since the purpose of the bug reports is to find and diagnose bugs, I did
> a bit of experimentation and made some observations.
> 
> I checked out the Debian Xen source via git.  I got the current
> "master" branch which is presently the candidate 4.14.3-1 version,
> which includes urgent fixes.  The hash is:
> e7a17db0305c8de891b366ad3528e5a43015
> 
> On top of this I cherry-picked 3 commits from Xen's main branch:
> 5a4087004d1adbbb223925f3306db0e5824a2bdc
> 0f089bbf43ecce6f27576cb548ba4341d0ec46a8
> bc141e8ca56200bdd0a12e04a6ebff3c19d6c27b
> 
> (these can be retrieved via Xen's gitweb at
> https://xenbits.xen.org/gitweb/?p=xen.git;a=patch;h=<$hash> which is
> suitable for the `git am` command)
> 
> With these I built 4.14.3-1 and then tried kernels 4.19.181-1 and
> 4.19.194-3 (this system is presently mostly on oldstable).  The results
> were:
> 
> Xen 4.14.3-1 with Linux 4.19.181-1: system reboots were successful
> 
> Xen 4.14.3-1 with Linux 4.19.194-3: system reboots hung

Ok, so it included 0f089bbf43, which is probably the most important of
the 3 fixes that we need indeed. And, it's good that the above
difference is still visible afterwards, since it confirms that we're
looking at two distinct problems.
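
For anyone who wants to repeat the test from the quoted text, fetching
and applying the three commits could look roughly like the sketch below.
The URL pattern and the hashes are the ones quoted above; the function
name is made up, and the loop only prints the commands, so they can be
reviewed before running them inside a checkout of the packaging
repository.

```shell
# Build the gitweb patch URL for one commit hash (pattern as quoted above).
gitweb_patch_url() {
    printf 'https://xenbits.xen.org/gitweb/?p=xen.git;a=patch;h=%s\n' "$1"
}

# Dry run: print one "download and apply" command per commit.
for hash in \
    5a4087004d1adbbb223925f3306db0e5824a2bdc \
    0f089bbf43ecce6f27576cb548ba4341d0ec46a8 \
    bc141e8ca56200bdd0a12e04a6ebff3c19d6c27b
do
    echo "curl -fsS '$(gitweb_patch_url "$hash")' | git am"
done
```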

> Unfortunately I was too quick at installing the rebuilt 4.14.3-1 and I
> missed trying the vanilla Debian 4.14.2+25-gb6a8c4f72d-2 with
> Linux 4.19.181-1.  I believe this combination would have hung during
> reboot.

The Xen related breakage was introduced in 4.14.0+88-g1d1d1f5391-2, so
with that combination, I would expect you would experience both of the
bugs at the same time, yes.

> As such, I believe there are in fact two distinct bugs being observed.
> The presence of EITHER of these is sufficient to cause hangs during
> powerdown or reboot.
> 
> First, some patch originally from Linux's main branch breaks Xen reboots
> was backported somewhere between 4.19.181-1 and 4.19.194-3.  This may
> either have been introduced before 5.10 diverged from main, or may also
> have been backported to 5.10.  THIS is Debian bug #991967.
> 
> Second, the Xen patch 3c428e9ecb1f290689080c11e0c37b793425bef1 which is
> valuable to ARM devices breaks reboots and powerdowns on x86.  This is
> correctly fixed by 0f089bbf43ecce6f27576cb548ba4341d0ec46a8.  Presently
> this has no Debian bug report.

Correct. Thanks a lot for your help with hunting down and confirming this.

And now we have #994899 for it. So, I would like to kindly ask everyone
to stop hijacking this one, #991967, for discussing the Xen problem.

> The first is presently unidentified, someone enthusiastic either needs to
> read git logs/source code, or bisect and build to find where it got
> broken.
> 
> The second we seem to have a fix.  The only question is how many patches
> to cherry pick?  bc141e8ca562 is non-urgent as it is merely superficial
> and not needed for functionality.
> 5a4087004d1a is a workaround for Linux kernel breakage, but how likely
> are we to see that fixed in the Linux kernel packages?  The fix is
> well-contained and needed for some highly popular ARM devices.

Diederik also helped with testing changes, and when combining results,
the best thing we can do is pick the 4 changes that were initially
posted in Nov 2020 as "x86: ACPI and DMI table mapping fixes", and ended
up in Xen 4.15 as well.

-------- >8 --------

commit 8b6d55c1261820bb9db8d867ce9ee77397d05203
Author: Jan Beulich
Date:   Tue Nov 24 11:26:02 2020 +0100

    x86/ACPI: fix mapping of FACS

commit f390941a92f102ece1b54be206a602187fd7
Author: Jan Beulich
Date:   Tue Nov 24 11:26:34 2020 +0100

    x86/DMI: fix table mapping when one lives above 1Mb

commit 0f089bbf43ecce6f27576cb548ba4341d0ec46a8
Author: Jan Beulich
Date:   Tue Jan 5 13:09:55 2021 +0100

    x86/ACPI: fix S3 wakeup vector mapping

commit 16ca5b3f873f17f4fbdaecf46c133e1aa3d623b2
Author: Jan Beulich
Date:   Tue Jan 5 13:11:04 2021 +0100

    x86/ACPI: don't invalidate S5 data when S3 wakeup vector cannot be determined

-------- >8 --------

The 4th one is not explicitly tagged with Fixes: 1c4aa69ca1e1, but I
agree with Diederik that we should keep them all together.

I do not know whether this is also what Chuck tested in the end; I'm a
bit lost in the walls of text that were produced in these two bugs.

Bug#990642: linux-image-4.19.0-17-amd64: kernel panic on xen dom0 with Broadcom Limited NetXtreme II BCM5709

2021-09-30 Thread Hans van Kranenburg

Hi spi, Salvatore,

On 8/5/21 1:58 PM, s...@gmxpro.de wrote:
> 
> In preparation for the bug report for upstream I did some more
> investigation.
> 
> The kernel panic also occurs without bonding interfaces but needs much
> more time to happen. With a bonding interface it happens within some
> seconds. Without bonding interfaces it needs like a minute with the
> network discovery being re-launched for 2 or 3 times. The kernel panic
> is still the same about the bnx2 driver.
> 
> In the constellation without a bonding interface the kernel panic only
> occurs if
> - opnsense as a domU is running (this domU bounds all bridged interfaces
> as default gateway for all networks)

Just FWIW, I'm seeing this bug-mail-thread now, and it rings a bell.

I spent some time in the past debugging crashing BCM5719 (4x1G) nics in
HP DL360 G8/9 series servers. In this case, the firmware inside the nic
crashed, so the symptoms were different. This happened only when having
a Xen domU active as router, which was routing incoming packets (from
outside the box) back to the outside again.

02:00.0 Ethernet controller: Broadcom Limited NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
02:00.1 Ethernet controller: Broadcom Limited NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
02:00.2 Ethernet controller: Broadcom Limited NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)
02:00.3 Ethernet controller: Broadcom Limited NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01)

Also, 2x 1G were bonded, I use openvswitch with LACP for that.

The symptoms are obviously different, mine looked like this:

tg3 :02:00.2 eth1: transmit timed out, resetting
tg3 :02:00.2 eth1: 0x: 0x165714e4, 0x00100546, 0x0201,
0x00800010
tg3 :02:00.2 eth1: 0x0010: 0x92b3000c, 0x, 0x92b4000c,
0x
tg3 :02:00.2 eth1: 0x0020: 0x92b5000c, 0x, 0x,
0x22be103c

tg3 :02:00.2 eth1: 0x7000: 0x0808, 0x, 0x,
0x4cd8
tg3 :02:00.2 eth1: 0x7010: 0xdbbd2b97, 0x010080f3, 0x00d70081,
0x03008200
tg3 :02:00.2 eth1: 0x7020: 0x, 0x, 0x0406,
0x10004000
tg3 :02:00.2 eth1: 0x7030: 0x0002, 0x4cdc, 0x001f,
0x
tg3 :02:00.2 eth1: 0: Host status block
[0001:0070:(:0563:):(:0094)]
tg3 :02:00.2 eth1: 0: NAPI info
[0070:0070:(016a:0094:01ff)::(068c:::)]
tg3 :02:00.2 eth1: 1: Host status block
[0001:0083:(::):(015b:)]
tg3 :02:00.2 eth1: 1: NAPI info
[0051:0051:(::01ff):0124:(0124:0124::)]
tg3 :02:00.2 eth1: 2: Host status block
[0001:00d8:(0e96::):(:)]
tg3 :02:00.2 eth1: 2: NAPI info
[00a4:00a4:(::01ff):0e5b:(065b:065b::)]
tg3 :02:00.2 eth1: 3: Host status block
[0001:0013:(::):(:)]
tg3 :02:00.2 eth1: 3: NAPI info
[00f8:00f8:(::01ff):072f:(072f:072f::)]
tg3 :02:00.2 eth1: 4: Host status block
[0001:009c:(::0736):(:)]
tg3 :02:00.2 eth1: 4: NAPI info
[007c:007c:(::01ff):0716:(0716:0716::)]
tg3 :02:00.2: tg3_stop_block timed out, ofs=1400 enable_bit=2
tg3 :02:00.2: tg3_stop_block timed out, ofs=c00 enable_bit=2
tg3 :02:00.2 eth1: Link is down
tg3 :02:00.2 eth1: Link is up at 1000 Mbps, full duplex
tg3 :02:00.2 eth1: Flow control is off for TX and off for RX
tg3 :02:00.2 eth1: EEE is disabled

> - sysctl parameter net.bridge.bridge-nf-call-ip6tables is set to 0.
> 
> If both conditions are not met no kernel panic occurs.

What I found out in the end is that `ethtool -K $iface tso off` works as
a workaround: with TSO disabled, the obscure bug inside the nic that
makes it crash is no longer triggered.

So, my actual suggestion would be this: even though it does not look
like the same thing, it's still Broadcom stuff, which can have *cough*
weird issues... If you can reliably reproduce the problem, can you try
setting tso off on the physical interfaces in dom0 and try again? In
Dutch we say "nooit geschoten altijd mis" (roughly: a shot never fired
always misses).
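
As a sketch of that workaround, assuming the physical interfaces in dom0
are called eth0 and eth1 (placeholder names, adjust them to the real
ones): the helper below is made up for illustration and only prints the
ethtool commands, so nothing changes until they are actually run as root.

```shell
# Hypothetical helper: print the command that disables TCP segmentation
# offload (TSO) for every interface name given as an argument.
tso_off_cmds() {
    for iface in "$@"; do
        printf 'ethtool -K %s tso off\n' "$iface"
    done
}

# eth0 and eth1 are placeholder names for the physical interfaces in dom0.
tso_off_cmds eth0 eth1
```

Afterwards, `ethtool -k eth0` lists the offload settings, so the
tcp-segmentation-offload line can be checked there. A setting made this
way does not survive a reboot.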

> Other IPv6 related sysctl parameters are set on dom0 like
> net.ipv6.conf.all.disable_ipv6 = 1
> net.ipv6.conf.default.disable_ipv6 = 1
> net.ipv6.conf.lo.disable_ipv6 = 1
> 
> 
> The layer2-iptables settings are
> net.bridge.bridge-nf-call-ip6tables = 0 ***
> 
> 
> net.bridge.bridge-nf-call-iptables = 1
> 
> 
> net.bridge.bridge-nf-call-arptables = 0
> 
> 
> 
> 
> As said, if I don't set the one marked with *** to 0 there is no kernel
> panic.
> 
> I wonder if this still is a kernel issue but still wouldn't expect a
> kernel panic to happen.
> 
> Cheers,
> spi
> 

Have fun,
Hans

Bug#994870: [Pkg-xen-devel] Bug#994870: Bug#994870: Bug#994870: Memory allocation problem for VM after xen security update

2021-09-30 Thread Hans van Kranenburg

Hi!

On 9/30/21 12:45 AM, Andy Smith wrote:
> Hi Alex,
> 
> On Thu, Sep 30, 2021 at 12:10:32AM +0200, Alexander Dahl wrote:
>> Am 22.09.21 um 20:54 schrieb Hans van Kranenburg:
>>> At this point I would really recommend to not wait for a fix to arrive
>>> which makes it start again, but change your VM to use a 64-bit kernel.
>>
>> How?
> 
> This was answered in earlier comments on this bug; please see:
> 
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=994870#15
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=994870#20
> 
> The brief summary is, "start out like a crossgrade, but only do the
> kernel". Very simple and quite safe.
> 
> You haven't said how you boot your guest though (show us your
> /etc/xen/guest.cfg file). If it's pvgrub, that has a 32-bit and a
> 64-bit version so you'll need to change those as well. If it's
> pygrub you probably don't need to do anything, though pygrub has its
> own issues outside the scope of this bug.
> 
>> FWIW, Debian 10 VMs with 32 bit running with PVH work fine. My important VM
>> is still Debian 9 however due to a software I can not simply upgrade.
> 
> I've found PVH needs at least 4.19 guest kernel to work, which can
> be achieved in Debian 9 (stretch) today by using kernel from
> stretch-backports, so perhaps that is an option for you.

You can certainly do that and then run PVH.

Since stretch-backports is no longer updated now that stretch has become
oldoldstable, new 4.19 backports kernels for Stretch are released
through the security updates channel. Be aware of this.

https://lists.debian.org/debian-lts-announce/2020/08/msg00019.html

Latest in stretch-backports (frozen) is 4.19.118, and stretch security
is now at 4.19.194. So double check you end up following the right one.

Hans

Bug#995233: Files stored under /usr/lib/debug/ have a too specific xen version in their name

2021-09-28 Thread Hans van Kranenburg

Package: src:xen
Version: 4.14.3-1

Conversation in #debian-security, 25 Sep 2021

18:28 < adsb> hmmm, were the filename changes in
https://release.debian.org/proposed-updates/bullseye_diffs/xen_4.14.3-1~deb11u1_amd64.debdiff.html
expected?

-------- >8 --------
Files only in first set of .debs, found in package xen-hypervisor-4.14-amd64
-rw-r--r-- root/root /usr/lib/debug/xen-4.14.3-pre.efi.map.gz
-rw-r--r-- root/root /usr/lib/debug/xen-syms-4.14.3-pre.gz
-rw-r--r-- root/root /usr/lib/debug/xen-syms-4.14.3-pre.map.gz
New files in second set of .debs, found in package xen-hypervisor-4.14-amd64
-rw-r--r-- root/root /usr/lib/debug/xen-4.14.3.efi.map.gz
-rw-r--r-- root/root /usr/lib/debug/xen-syms-4.14.3.gz
-rw-r--r-- root/root /usr/lib/debug/xen-syms-4.14.3.map.gz
-------- >8 --------

^^^ file names in a binary package should not change during an update of
a package in the Debian stable release

22:09 < Knorrie> adsb: no, I did not expect that, but after looking at
                 it for a few minutes, I understand why, and this is a
                 small packaging bug that exists for years already
                 apparently (the same thing has been happening during
                 buster also already all the time)
22:09 < Knorrie> adsb: the '.3-pre' and '.3' parts should be stripped
                 out of the filename, just like the files in /boot (e.g.
                 /boot/xen-4.14-amd64.gz). so, just '4.14'
22:10 < Knorrie> we have a thing for that, which fixes up all the files
                 in boot, the same should be added for those files in
                 the debug location
                 https://salsa.debian.org/xen-team/debian-xen/-/blob/master/debian/shuffle-boot-files
22:10  * Knorrie takes note
22:10 < Knorrie> for some reason this was apparently not spotted yet,
                 because it's not on a todo list.
22:11 < Knorrie> so, let me rephrase, given the current packaging code
                 (which did not change), technically, I see why this is
                 actually expected, but it's not meant to be so
22:18  * h01ger learned today there's a new DSA acronym: Distributed
Switch Architecture :)
22:54 < adsb> Knorrie: cool, thanks for looking
22:58 < Knorrie> adsb: thanks for sharing the observation

FTR,
Knorrie
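
The renaming described in the conversation above (strip '.3-pre' and
'.3', keep just '4.14') can be illustrated with the sketch below. This is
not the code from debian/shuffle-boot-files; the function name and the
sed expression are assumptions, made up only to show the intended file
names.

```shell
# Hypothetical helper: remove the point release and an optional "-pre"
# suffix from a xen debug file name, leaving only the major.minor version.
strip_point_release() {
    printf '%s\n' "$1" |
        sed -E 's/^(xen(-syms)?-[0-9]+\.[0-9]+)\.[0-9]+(-pre)?/\1/'
}

# Show old and new name for some of the files from the debdiff above.
for name in xen-4.14.3-pre.efi.map.gz xen-syms-4.14.3.gz xen-syms-4.14.3.map.gz; do
    echo "$name -> $(strip_point_release "$name")"
done
```

With this, both xen-syms-4.14.3-pre.gz and xen-syms-4.14.3.gz end up as
xen-syms-4.14.gz, so the name stays the same across point releases.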

Bug#994870: Memory allocation problem for VM after xen security update

2021-09-24 Thread Hans van Kranenburg

Hi!

Please don't (accidentally) drop the debian bug email from the recipient
list. This information might also be useful for others later.

On 9/23/21 1:47 AM, H.-R. Oberhage wrote:
> Good evening Hans,
> 
> On 22.09.2021 20:54, Hans van Kranenburg wrote:
>> Hi Ruediger,
>>
>> On 9/22/21 11:37 AM, H.-R. Oberhage wrote:
>>> Package: xen-system-amd64
>>> Version: 4.14.3-1~deb11u1
>>>
>>> After applying the buster security update to xen, my VM won't start
>>> any longer, complaining about a memory allocation error.
>>
>> Can you confirm that this is a virtual machine that tries to boot a
>> 32-bit kernel as PV type?
> 
> yes, your assumption ...
> 
>> The error message you are seeing is not particularly helpful, but it is
>> most likely related to this.
> 
> ... is correct.
> 
>> The fact that with this package update 32-bit PV guests fail to start 
>> is
>> indeed a regression problem, which is quite inconvenient for you, right 
>> now.
> 
> Ok, then I will put the Xen-package on "hold" for now.
> 
>> At this point I would really recommend to not wait for a fix to arrive
>> which makes it start again, but change your VM to use a 64-bit kernel.
> 
> It really is a shame, that 32-bit isn't supported properly any longer.
> The address- and data-overhead in 64-bit machines only using a 32-bit
> address- and data-space is considerable.
> 
> I already experienced, "bullseye" not supporting a dom0-Kernel for the
> i686-pae architecture any longer :-(. A shame that it doesn't come with
> a kernel before 5.9, which would still allow this.
> 
>> Let me know if you need help or run into problems while making this 
>> change.
> 
> Would you know of a "simple" way to convert/clone a 32-bit VM to a
> work-alike 64-bit one? One has to replace all the .debs for this, after
> all.

The smallest amount of work to initially get your VM going again is to
only install the 64 bit kernel and keep running a 32 bit user land.

The process to fully change a system from 32-bit to 64-bit (in place) is
called 'crossgrading'. I found instructions at
https://wiki.debian.org/CrossGrading

I never did this myself, though.
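
As an untested sketch of the "kernel only" variant, assuming an i386
guest that gets the amd64 kernel through multiarch: the helper name is
made up, and it only prints the commands, which would have to be run as
root inside the guest.

```shell
# Hypothetical helper: print the commands for installing an amd64 kernel
# next to an existing i386 userland (the first steps of a crossgrade).
kernel_only_crossgrade_cmds() {
    cat <<'EOF'
dpkg --add-architecture amd64
apt-get update
apt-get install linux-image-amd64:amd64
EOF
}

kernel_only_crossgrade_cmds
```

The rest of the userland stays 32-bit in this case; only the kernel that
the guest boots changes.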

>> Running 32-bit PV at all is already 'on life support' upstream for 
>> quite
>> a while now, and it also not under security support any more.
> 
> Well it's a Debian "stretch" one, so it's just working for now :-).

One of the main reasons why it's so problematic to keep around is that
in the 32 bit PV case, there are no possibilities to implement fixes for
all the speculative execution vulnerabilities that have been very much
in the news in recent years.

More about this: https://xenbits.xen.org/xsa/advisory-370.html

>> In the long run, I'd suggest working towards having 64-bit guests in 
>> PVH
>> mode, since that's one of the best options we have these days.
> 
> Thanks, I'll consider this for any newer VMs.
> Are 64-bit PV VMs automatically "moved" to or executed as PVH?
> I would even be willing to edit the .xml/.cfg-file manually.
> I see "bullseye's" virt-manager/libvirt offering only choices for
> "xen (fullvirt)", "xen (paravirt)", or xen", when creating a new
> VM.

It should be as simple as changing type="pv" to type="pvh" in the config
file. In Debian, using PVH has been possible since Buster. Using the Xen
variant of grub2 (grub-xen and grub-xen-host) is also possible.
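
A minimal sketch of that change, assuming an xl-style config file that
already contains a type="pv" line with double quotes (a libvirt XML file
looks different and is not covered here). The function name is made up;
it reads the config on stdin and prints the changed version, so the
original file is not touched.

```shell
# Hypothetical helper: rewrite a line like  type = "pv"  to  type = "pvh".
# Lines that already say "pvh" or "hvm" are left alone.
pv_to_pvh() {
    sed -E 's/^([[:space:]]*type[[:space:]]*=[[:space:]]*)"pv"/\1"pvh"/'
}

# Example input; "guest" is a placeholder domU name.
printf 'name = "guest"\ntype = "pv"\n' | pv_to_pvh
```

A config file without any type line is passed through unchanged by this
sketch.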

More info:
https://wiki.xenproject.org/wiki/Understanding_the_Virtualization_Spectrum

Have fun,
Hans

Bug#994870: [Pkg-xen-devel] Bug#994870: Memory allocation problem for VM after xen security update

2021-09-22 Thread Hans van Kranenburg

Hi Ruediger,

On 9/22/21 11:37 AM, H.-R. Oberhage wrote:
> Package: xen-system-amd64
> Version: 4.14.3-1~deb11u1
> 
> After applying the buster security update to xen, my VM won't start
> any longer, complaining about a memory allocation error.

Can you confirm that this is a virtual machine that tries to boot a
32-bit kernel as PV type?

The error message you are seeing is not particularly helpful, but it is
most likely related to this.

The fact that with this package update 32-bit PV guests fail to start is
indeed a regression problem, which is quite inconvenient for you, right now.

At this point I would really recommend to not wait for a fix to arrive
which makes it start again, but change your VM to use a 64-bit kernel.

Let me know if you need help or run into problems while making this change.

Running 32-bit PV at all has been 'on life support' upstream for quite
a while now, and it is also not under security support any more.

In the long run, I'd suggest working towards having 64-bit guests in PVH
mode, since that's one of the best options we have these days.

If there's a reason you really cannot switch to a 64-bit kernel or move
the functionality of this virtual machine to a new fully 64 bit system,
switching the virtualization type from PV to HVM would also be an option.

> Switching back to the previous version 4.14.2+25-gb6a8c4f72d-2 lets
> the VM start (again,) normally.
> 
> /var/log/libvirt/libxl/libxl-driver.log:
> 2021-09-21 14:01:44.645+: xc: panic: xc_dom_boot.c:120: 
> xc_dom_boot_mem_init: can't allocate low memory for domain: Out of 
> memory
> 2021-09-21 14:01:44.653+: libxl: libxl_dom.c:593:libxl__build_dom: 
> xc_dom_boot_mem_init failed: Die Operation wird nicht unterstützt 
> [means: the operation is not supported]
> 2021-09-21 14:01:44.662+: libxl: 
> libxl_create.c:1576:domcreate_rebuild_done: Domain 1:cannot (re-)build 
> domain: -3
> 
> The error is triggered, regardless if there was a boot-parameter
> "dom0_mem=1024M:max=2048M" set or not.
> /etc/xen/xl.conf was unaltered, i.e. 'autoballoon' was implicitely set
> to "auto".
> 
> I am "on" Buster, kernel 5.10.0-8-amd64 (5.10.46-4), all relevant fixes
> included.

Apologies for the inconvenience,

Hans

Bug#993168: Security support ended for Xen 4.11 in Buster

2021-08-28 Thread Hans van Kranenburg

Package: debian-security-support
Version: 2020.06.21~deb10u1
Severity: normal

Hi,

Upstream security support for Xen 4.11 has ended recently. This also
means that security support for it in Debian has ended.

The complexity of the software involved does not really allow for anyone
else than the upstream developers, with a deep understanding of the
inner workings of the hypervisor code, to apply/backport new patches.

For security-support-ended.deb10, this would be a line like:

xen 4.11.4+107-gef32c7afa2-1
https://xenbits.xen.org/docs/4.11-testing/SUPPORT.html#release-support

Thanks,
Hans

Bug#989656: [Pkg-xen-devel] Bug#989656: Xen misusing syslog

2021-08-05 Thread Hans van Kranenburg

reassign 989656 src:xen 4.14.1+11-gb0b734a8b3-1
tags 989656 + upstream
thanks

Hi Phillip,

On 6/9/21 5:04 PM, Phillip Susi wrote:
> Package: xen-utils-common
> Version: 4.14.1+11-gb0b734a8b3-1
> 
> My syslog has entries that look like this:
> 
> Jun 09 10:54:26 hyper1 root[621]: /etc/xen/scripts/block: add
> XENBUS_PATH=backend/vbd/1/768
> 
> The third field is supposed to be the program name, which I would expect
> to either be xen or xl or something, but instead it appears to be
> passing $USER.

Yeah, that's a bit weird yes.

I guess this is one of the many things that have to be dealt with when
doing a great overhaul of all the ancient scripts-stuff in xen.

I'm marking it 'upstream' now, since we cannot fix this in the Debian
packaging process, any solution should go 'upstream-first'.

Hans

Bug#989560: [Pkg-xen-devel] Bug#989560: Bug #989560 is grub-common, not xen-hypervisor-common

2021-08-05 Thread Hans van Kranenburg

tags 989560 + moreinfo
thanks

Hi,

On 8/4/21 4:00 AM, Elliott Mitchell wrote:
> I rate #989560 as a grub-common bug, *not* a xen-hypervisor-common bug.
> As you've noticed, the problem is with the file /etc/grub.d/20_linux_xen,
> which is part of grub-common, not xen-hypervisor-common.
> 
> A working grub.cfg will be generated by the version of the file from
> GRUB 2.04.  If you can deal with installing *only* GRUB from testing,
> that should work.
> 
> The bug should be reassigned to grub-common, but marked as effecting
> Xen so duplicate reports don't show up (actually I'm pretty sure reports
> against grub-common or src:grub2 already exist).

The /etc/grub.d/20_linux_xen is indeed part of grub-common, but, I'm not
just going to NIMBA reassign, since the grub-common maintainer will not
have any idea what to do with it, unless you guys find out what's wrong
first and have clear directions and questions and patches about how to
improve the situation.

Currently, the only thing I can do before doing new unstable uploads or
stable/security stuff is do smoke testing on amd64.

That doesn't mean I don't care. It does mean however that extra help in
the team is really appreciated.

Have fun,
Hans

Bug#988477: [Pkg-xen-devel] Bug#988477: Acknowledgement (xen-hypervisor-4.14-amd64: xen dmesg shows (XEN) AMD-Vi: IO_PAGE_FAULT on sata pci device)

2021-08-05 Thread Hans van Kranenburg

severity 988477 normal
tags 988477 + moreinfo + upstream - bullseye-ignore
thanks

Hi!

On 6/13/21 3:58 PM, Imre Szőllősi wrote:
> i tested on 4th hw
> 
> 4. asus m4n78 pro, phenom ii x4 905e, md raid1, 2x samsung 1TB 860evo, 
> lvm: problem does not appear
> 
> as i see, not all mb/chipset/sata pcie device affected

Thanks for your report, and for trying out different combinations of
hardware.

While doing a short internet search about the problems you're seeing
while using AMD ryzen, sata, nvme and iommu, I suspect this problem does
not have a lot to do with Xen specifically, but more with the hardware
and its firmware.

This also means that it's not a Debian packaging problem, and it cannot
be fixed by me (or the Debian Xen team). If you want to research this
problem more, I can maybe be of some help by providing suggestions.
Still, you will have to do all of the actual work, since I do not have
your hardware here.

The first thing I would suggest is to try to reproduce the problem when
booting with just Linux without Xen, and then trying the dbench test.

If you don't actually need to directly pass-through hardware to a Xen
guest, you can also try disabling iommu, or researching other iommu=
options that can serve as a workaround.

In any case, further reports will need to have more detailed
information. For example, instead of "there are a lot of messages",
provide a text attachment with a piece of logging that shows these messages.

I'm tagging this bug 'moreinfo' now, since it will depend on your
availability and abilities to work on it to have it advance.

Have fun,
Hans van Kranenburg

Bug#987030: linux-image-5.10.0-6-amd64 - Fans speed maximum - CPU load < 1%

2021-05-08 Thread Hans van Kranenburg

Oh,

On 4/16/21 11:44 AM, Hans van Kranenburg wrote:
> [...]
> 
> I have the same issue here, it started at the moment I moved from the
> 4.19 kernel to 5.9, and now 5.10. For totally non-obvious reasons fans
> start blowing like crazy regularly for a few seconds. When observing
> system load, it's just hovering around 0.2 - 0.4, no peaks observed.
> 
> [...]

So, while the fan misbehavior started around the time of the kernel
upgrade, the reason turned out to be a lot simpler.

For me, it was a dust problem. After thorough cleaning, the problem is
gone. :D

Hans

Bug#987030: linux-image-5.10.0-6-amd64 - Fans speed maximum - CPU load < 1%

2021-04-16 Thread Hans van Kranenburg

Hi,

On 4/15/21 10:51 PM, klak wrote:
> Package:  linux-image-5.10.0-6-amd64
> Version:  5.10.28-1
> 
> Hello Maintainer,
> 
> every few minutes the fans turn to maximum for a few seconds. The CPU
> load is less than 1 %, but the fans are turning maximum. The problem
> starts with version 5.9. I didn't see anything conspicuous in the
> syslog. The machine is a KVM host and the problem also occurs when it
> is idle.

I have the same issue here, it started at the moment I moved from the
4.19 kernel to 5.9, and now 5.10. For totally non-obvious reasons fans
start blowing like crazy regularly for a few seconds. When observing
system load, it's just hovering around 0.2 - 0.4, no peaks observed.

This is an Intel NUC with just a Buster system used as mostly inactive
desktop.

FWIW, attached are output of lshw and lspci -v.

> Board + CPU :
> =
> DMI: Intel Corporation S5520HC/S5520HC, BIOS
> S5500.86B.01.00.0064.050520141428 05/05/2014
> 
> smpboot: CPU0: Intel(R) Xeon(R) CPU   L5640  @ 2.27GHz (family:
> 0x6, model: 0x2c, stepping: 0x2)
> 
> Performance Events: PEBS fmt1+, Westmere events, 16-deep LBR, Intel PMU
> driver.
> DMAR: Intel(R) Virtualization Technology for Directed I/O

Hans
dorothy
    description: Mini PC
    product: NUC8i5BEH (BOXNUC8i5BEH)
    vendor: Intel(R) Client Systems
    version: J72747-305
    serial: G6BE94400JDL
    width: 64 bits
    capabilities: smbios-3.2.1 dmi-3.2.1 smp vsyscall32
    configuration: boot=normal chassis=mini family=Intel NUC sku=BOXNUC8i5BEH uuid=889D1A9F-A26C-DC8C-5D1C-1C697A09E4C6
  *-core
       description: Motherboard
       product: NUC8BEB
       vendor: Intel Corporation
       physical id: 0
       version: J72692-307
       serial: GEBE94TU
       slot: Default string
     *-firmware
          description: BIOS
          vendor: Intel Corp.
          physical id: 0
          version: BECFL357.86A.0072.2019.0524.1801
          date: 05/24/2019
          size: 64KiB
          capacity: 16MiB
          capabilities: pci upgrade shadowing cdboot bootselect socketedrom edd int13floppy1200 int13floppy720 int13floppy2880 int5printscreen int14serial int17printer acpi usb biosbootspecification uefi
     *-memory
          description: System Memory
          physical id: 3b
          slot: System board or motherboard
          size: 32GiB
        *-bank:0
             description: SODIMM DDR4 Synchronous 2400 MHz (0.4 ns)
             product: CT16G4SFD824A.M16FE
             vendor: 859B
             physical id: 0
             serial: E3029A84
             slot: SODIMM1
             size: 16GiB
             width: 64 bits
             clock: 2400MHz (0.4ns)
        *-bank:1
             description: SODIMM DDR4 Synchronous 2400 MHz (0.4 ns)
             product: CT16G4SFD824A.M16FE
             vendor: 859B
             physical id: 1
             serial: E302B39D
             slot: SODIMM2
             size: 16GiB
             width: 64 bits
             clock: 2400MHz (0.4ns)
     *-cache:0
          description: L1 cache
          physical id: 45
          slot: L1 Cache
          size: 256KiB
          capacity: 256KiB
          capabilities: synchronous internal write-back unified
          configuration: level=1
     *-cache:1
          description: L2 cache
          physical id: 46
          slot: L2 Cache
          size: 1MiB
          capacity: 1MiB
          capabilities: synchronous internal write-back unified
          configuration: level=2
     *-cache:2
          description: L3 cache
          physical id: 47
          slot: L3 Cache
          size: 6MiB
          capacity: 6MiB
          capabilities: synchronous internal write-back unified
          configuration: level=3
     *-cpu
          description: CPU
          product: Intel(R) Core(TM) i5-8259U CPU @ 2.30GHz
          vendor: Intel Corp.
          physical id: 48
          bus info: cpu@0
          version: Intel(R) Core(TM) i5-8259U CPU @ 2.30GHz
          serial: To Be Filled By O.E.M.
          slot: U3E1
          size: 3139MHz
          capacity: 3800MHz
          width: 64 bits
          clock: 100MHz
          capabilities: lm fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp x86-64 constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d cpufreq
          configuration: cores=4 enabledcores=4

Bug#983862: PVH -- cannot remove vm with pci passthrough

2021-03-07 Thread Hans van Kranenburg

reassign 983862 src:xen 4.11.4+57-g41a822c392-2
tags 983862 + upstream
thanks

Hi Adi!

On 3/2/21 12:46 PM, Adi Kriegisch wrote:
> Package: xen-utils-4.11
> Version: 4.11.4+57-g41a822c392-2
> Severity: minor
> 
> Dear maintainers,
> 
> we, by accident, added a pci passthrough device config to a pvh vm and were
> able to boot that machine. But shutdown did not work with the following
> error message:
>   | xl: libxl_pci.c:1427: do_pci_remove: Assertion `type == LIBXL_DOMAIN_TYPE_PV' failed.
> To remove the virtual machine and free its resources a reboot of Dom0 was
> necessary. A corresponding assert when creating the machine seems to be
> missing.
> We consider this to be a bug, because we should not have been able to 'xl
> create' that machine in the first place (or would have needed a way to
> dispose the vm).

Aha. Interesting. I just had a look at libxl_pci.c in the latest
upstream code, and I think the same bug still exists.

PCI passthrough is still not supported for PVH, so it should refuse, a
bit higher up in the call stack. Probably not with an assert, but with a
nice error message. :)

I think I'm going to have a closer look at it somewhere in the next week.

Note that the bug fix will not reach Xen 4.11 (or our package) any more.

Regards,
Hans van Kranenburg

Bug#981052: xen: XSA-360: IRQ vector leak on x86

2021-01-27 Thread Hans van Kranenburg

Hi,

On 1/25/21 8:08 PM, Salvatore Bonaccorso wrote:
> Source: xen
> Version: 4.14.0+88-g1d1d1f5391-2
> Severity: important
> Tags: security upstream
> X-Debbugs-Cc: car...@debian.org, Debian Security Team 
> 
> 
> Hi
> 
> For details see https://xenbits.xen.org/xsa/advisory-360.html . 
> 
> It does not affect version in buster afaict.

Indeed. Currently upstream stable-4.11 is at commit 310ab79875, which is
actually the same as our buster-security package
(4.11.4+57-g41a822c392-2) because the last upload was done using the
embargoed patches.

Unless something really interesting suddenly happens next Tuesday, there
won't be a buster security update happening together with the 10.8 point
release.

For unstable, I plan to do something at the end of this week, and base
it on the current stable-4.14, of course with the XSA-360 fix included.
And, we will have reproducible builds, woohoo!

Thanks,
Hans

Bug#977148: Removing Xen hypervisor packages does not update-grub

2020-12-11 Thread Hans van Kranenburg

Package: src:xen
Version: 4.11.4+57-g41a822c392-1
Severity: normal

When removing the Xen packages, the grub menu entries to boot should be
removed.

Currently, thanks to a missing postfix of a postrm filename...
 xen-hypervisor-V-F.postrm
vs.
 xen-hypervisor-V-F.postrm.vsn-in
...this script is ignored and not installed. This is the maintainer
script that contains the update-grub command.

The result of this packaging bug is that after removing Xen, the system
remains unbootable, except when interacting manually with the grub menu.

I'd like to also apply the fix to Debian stable since:
* The script was there before, the bug was introduced during a
refactoring in the Xen 4.11 packaging.
* The fix is very small and targeted.
* It's rather embarrassing to cause a system to be unbootable for someone.
* We cannot require all Debian users to have proper OOB in place to deal
with a situation like this.

Hans

Bug#962267: [Pkg-xen-devel] Bug#962267: Bug#962267: xen: please consider to not install NEWS into runtime library packages

2020-12-05 Thread Hans van Kranenburg

Hi,

On 6/5/20 1:34 PM, Ansgar wrote:
> On Fri, 2020-06-05 at 12:09 +0200, Hans van Kranenburg wrote:
>>> Installing NEWS into xen*, but not libxen* probably still reaches all
>>> relevant users.
>>
>> Yes, that makes sense.
>>
>> OTOH, what if there was a really weird problem with libxenmisc4.11 that
>> we would like to pro-actively inform users about?
> 
> In that case shipping NEWS in libxen* would of course be fine in my
> opinion even if it also includes some additional information that might
> only be relevant to users of the other NEWS files.
> 
>> I guess there is only one NEWS per source package?
> 
> You can have `debian/.NEWS` for per-binary NEWS when using
> `dh_installchangelogs` or install them in some other way. But it
> increases overhead and I would personally avoid having per-binary NEWS
> for this reason.

Can you help me by explaining to me what your current expectations
regarding this issue are? You ask for not having NEWS in specific binary
packages, but then subsequently explain that you'd prefer to avoid doing
exactly that.

Adding information to NEWS is quite an exceptional thing to happen. It's
only used for cases in which the user needs to take actions or needs to
be aware of a real problem that needs to be solved outside of the
context of what we can do in the packaging and Debian. My opinion is
that adding the extra complexity is not warranted to fix the accidental
annoyance of pressing a key on the keyboard for a user who chooses to
install apt-listchanges.

Thanks,
Hans

Bug#976597: Xen Python dependencies are not specific enough

2020-12-05 Thread Hans van Kranenburg

Package: src:xen
Version: 4.14.0+80-gd101b417b7-1
Severity: normal

There is indeed something wrong that should be fixed. Creating a Debian
bug for it now.

tl;dr Currently the Xen package needs Python 3.9 as the default
/usr/bin/python3, but the Xen packages went unstable->testing while
testing has 3.8 as default, causing pygrub to fail to find and import
xenfsimage.cpython-39-x86_64-linux-gnu.so.
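
To make the mismatch a bit more concrete, here is a minimal Python sketch of how an interpreter derives the file names it accepts for an extension module. `candidate_filenames` is a hypothetical helper written for this illustration, and the Python 3.8 suffix list is an assumption about a typical Debian amd64 build, not something taken from the build logs.

```python
import importlib.machinery


def candidate_filenames(module, suffixes=None):
    """Return the file names an interpreter would accept for `module`."""
    if suffixes is None:
        # Suffixes of the interpreter that is running this script.
        suffixes = importlib.machinery.EXTENSION_SUFFIXES
    return [module + suffix for suffix in suffixes]


# Assumption: suffix list of a Python 3.8 interpreter on Debian amd64.
PY38_SUFFIXES = [".cpython-38-x86_64-linux-gnu.so", ".abi3.so", ".so"]

# File name as shipped in the package, according to the report.
shipped = "xenfsimage.cpython-39-x86_64-linux-gnu.so"

# Is the shipped file one of the names a 3.8 interpreter looks for?
print(shipped in candidate_filenames("xenfsimage", PY38_SUFFIXES))

# Names the interpreter running this script would look for.
print(candidate_filenames("xenfsimage"))
```

Under that assumption, the shipped file is not among the names a 3.8 interpreter tries for `xenfsimage`, which would match the ModuleNotFoundError from the report. Running the last line with /usr/bin/python3 on an affected system should show what the default interpreter actually expects.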

On 12/5/20 10:43 AM, Alexander Dahl wrote:
> Hello,
> 
> FTR, we found what seems to be the problem in IRC yesterday, see
> below.
> 
> On Thu, Dec 03, 2020 at 10:56:12PM +0100, Alexander Dahl wrote:
>> On Tue, Nov 24, 2020 at 05:41:42PM +0100, Hans van Kranenburg wrote:
>>> [...]
>>>
>>> Any help with testing is appreciated, especially since there are so many
>>> combinations of hardware, different architectures and use cases (using
>>> legacy BIOS or EFI, PV, PVH, HVM, different boot loaders like pvgrub,
>>> pygrub, etc etc).
>>
>> x86_64 host here, and some old 32 bit virtual machines, no weird
>> network or hardware pass through setup, rather simple.  HVM and pvgrub
>> based DomU VMs run fine so far.  pygrub based VMs, both 32 bit and 64
>> bit fail with the following error:
>>
>>   Traceback (most recent call last):
>>     File "/usr/lib/xen-4.14/bin/pygrub", line 27, in <module>
>>       import xenfsimage
>>   ModuleNotFoundError: No module named 'xenfsimage'
> 
> Testing has python 3.8 as default at the moment, while unstable
> already has 3.9.  The file packaged for testing however is
> 'xenfsimage.cpython-39-x86_64-linux-gnu.so' but that seems to be for
> python 3.9, not for 3.8.  When starting pygrub manually with python3.9
> that error goes away, but I suppose that would not work from within
> the xen config, or does it?

I have not dived further into this yet, but I can think of the following
TODO items, if anyone wants to help with research and fixing (yes please):

* Look at the build logs (buildd logs are linked from the PTS), and try
to understand why in Xen 4.11 (with Python 2) we just have fsimage.so
but with 4.14 and Python 3 we have this more specific longer name with
cpython-39-x86_64-linux-gnu in it.
* Figure out what we need to do to make a python dependency more
specific, so that the xen packages would have been blocked from the
transition to testing as long as python3-defaults in testing is not
pointing to the needed version.

Maybe there's something to be found in the Debian Python Policy?
https://www.debian.org/doc/packaging-manuals/python-policy/

Or, maybe if it's not super obvious we can ask some python packaging IRC
or mailing list or otherwise debian-devel@ for help.

Hans

Bug#976109: [Pkg-xen-devel] Bug#976109: xen: CVE-2020-29040

2020-11-30 Thread Hans van Kranenburg

Hi,

On 11/29/20 8:50 PM, Salvatore Bonaccorso wrote:
> Source: xen
> Version: 4.14.0+80-gd101b417b7-1
> Severity: grave
> Tags: security upstream
> Justification: user security hole
> X-Debbugs-Cc: car...@debian.org, Debian Security Team 
> 
> 
> Hi,
> 
> The following vulnerability was published for xen.
> 
> CVE-2020-29040[0]:
> | An issue was discovered in Xen through 4.14.x allowing x86 HVM guest
> | OS users to cause a denial of service (stack corruption), cause a data
> | leak, or possibly gain privileges because of an off-by-one error.
> | NOTE: this issue is caused by an incorrect fix for CVE-2020-27671.

Yes, there's also a limited number of cases in which this is possible,
and you just left that text out, which makes it sound a lot more
horrible: "Only x86 HVM guests which have physical devices passed
through to them can leverage the vulnerability.".

I suspect that anyone who is using Debian testing to run Xen today and
is also passing through devices is doing that to test performance use
cases, and not for untrusted guests.

> If you fix the vulnerability please also make sure to include the
> CVE (Common Vulnerabilities & Exposures) id in your changelog entry.

Yes, it will of course be included in the next upload.

Hans

Bug#942611: [Pkg-xen-devel] Bug#942611: xen-doc: Various text files stored as .txt.gz, but index references .txt

2020-11-26 Thread Hans van Kranenburg

tags 942611 + pending
thanks

Hi Diederik,

On 10/19/19 2:19 AM, Diederik de Haas wrote:
> Package: xen-doc
> Version: 4.11.1+92-g6c33308a8d-2+b1
> Severity: normal
> 
> file:///usr/share/doc/xen/html/index.html contains a link to
> file:///usr/share/doc/xen/html/misc/vtd.txt (VT-d HOWTO), but that file
> doesn't exist. There is a .../misc/vtd.txt.gz file though.
> A similar pattern can be found with various other .txt files, but not all.
> Since this is HTML documentation and presumably meant to be read in a browser 
> (which is what I did), I think those .txt.gz files should be stored as .txt, 
> so
> they can be viewed in the browser and it would make the hyperlink actually 
> work.

Yes, you are right. I do agree.

We already have the html documentation collection in a separate package,
xen-doc.

So, when someone installs that package, they explicitly choose to do so.
If they want to browse around at file:///usr/share/doc/xen/html/ then
there should not be broken links all over the place.
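
As a rough way to find such broken links, the following Python sketch collects the relative links from an HTML page and reports the ones whose target file does not exist. `LinkCollector` and `broken_relative_links` are hypothetical names for this illustration, and the demo builds its own temporary directory instead of reading the real /usr/share/doc/xen/html tree.

```python
import os
import tempfile
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href targets of all <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def broken_relative_links(html_text, base_dir):
    """Return relative link targets that do not exist below base_dir."""
    parser = LinkCollector()
    parser.feed(html_text)
    broken = []
    for link in parser.links:
        if "://" in link or link.startswith("#"):
            continue  # skip absolute URLs and in-page anchors
        if not os.path.exists(os.path.join(base_dir, link)):
            broken.append(link)
    return broken


# Demo: the index links to misc/vtd.txt, but only the compressed
# misc/vtd.txt.gz is present, as in the reported situation.
with tempfile.TemporaryDirectory() as doc_root:
    os.makedirs(os.path.join(doc_root, "misc"))
    open(os.path.join(doc_root, "misc", "vtd.txt.gz"), "w").close()
    index = '<a href="misc/vtd.txt">VT-d HOWTO</a>'
    demo_result = broken_relative_links(index, doc_root)
    print(demo_result)
```

Pointing `broken_relative_links` at the contents of the installed index.html, with its directory as `base_dir`, would be one way to check the package before and after a change.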

The difference between compressing or not compressing is 5.1M vs 5.2M
measured by doing dpkg-deb -x on the xen-doc .deb before and after and
then doing du -sch on the directory in which it was unpacked.

https://salsa.debian.org/xen-team/debian-xen/-/commit/38cde19f59ee4121e048b23cfe7e9ea4ddcbdf60
(commit id will vanish because of heavily rebasing later, it's "d/rules:
do not compress /usr/share/doc/xen/html")

> (Sidenote: I doubt including a file mentioning how to compile your own 2.6.18 
> kernel 
> to include support for VT-d is useful, and it is 10+ y/o, but that's probably 
> an 
> upstream issue)

Heh, yes. Patches to remove obsolete documentation can be sent upstream
directly.

Have fun,
Hans

Bug#963607: xen-hypervisor-4.11-amd64: Xen Hypervisor kernel fails to load arcmsr module with "arcmsr0: dma_alloc_coherent got error" message.

2020-11-26 Thread Hans van Kranenburg

tags 963607 + moreinfo
thanks

Hi Alex,

On 7/2/20 9:26 AM, debianb...@red-sand.com wrote:
> 
> [...]
> 
> I am about to purchase a new SAS HBA card to test as we have a number of
> these servers with Areca cards that I imagine will have the same problem
> on Xen 4.11.   I am leaning towards mpt3 driver cards but we have had
> problems with mpt3 previously so I am hesitating there too.  mpt2 has
> been rock solid. 
> 
> If you can think of anything else that I could try that would be
> excellent.  
> 
> [...]

No, I don't really have a suggestion.

Did you get new hardware? What do you want to do with this bug report?
There are no actionable items for us, we cannot solve a hardware problem
with packaging changes :) so I'd rather close it.

Thanks,
Hans

Bug#944247: xen domU crashes under high i/o load if you use qcow2 images

2020-11-26 Thread Hans van Kranenburg

tags 944247 + moreinfo
severity 944247 normal
thanks

Hi Mario,

On 11/6/19 4:46 PM, mario wrote:
> Source: xen
> Severity: important
> 
> Dear Maintainer,
> 
> we have updated our server from debian oldstable (which unfortunately wasn't 
> running stable after the last update, bug reported) to debian buster.
> 
> unfortunately xen doesn't work reliably there either:
> 
> the virtual server crashes every 1-2 week with i/o problems and sometimes 
> also takes other domU instances with it.
> we use qcow2 images.
> 
> the harddisk of the domU is simply no longer accessible for the linux kernel, 
> no logfiles are available. in the xl console the following last lines can be 
> read, login not possible:
> 
> [ 1450.976415] INFO: task nginx:376 blocked for more than 120 seconds.
> [ 1450.976423] Not tainted 4.9.0-9-amd64 #1 Debian 4.9.168-1+deb9u5
> [ 1450.976428] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
> this message.
> [ 1450.976469] INFO: task nginx:377 blocked for more than 120 seconds.
> [ 1450.976474] Not tainted 4.9.0-9-amd64 #1 Debian 4.9.168-1+deb9u5
> [ 1450.976479] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables 
> this message.
> [ 1450.976624] INFO: task nginx:378 blocked for more than 120 seconds.
> 
> the process varies:
> [1523692.508073] INFO: task jbd2/xvda2-8:159 blocked for more than 120 seconds
> [1523692.508084] Not tainted [...]
> 
> all hard disk accesses fail as if the i/o system is completely dead.
> only "xl destroy " and recreate will help

This report is now a year old. Unfortunately it did not get any reply.
This might have several reasons, and one of them is probably that
nobody else reading it uses the same storage configuration and also
runs into the same problem.

> you can easily reproduce this with the tool stress "stress -c 8 -i 8 -d 8".
> it takes a maximum of 10 minutes until the vm crashes.
> 
> in our experience, as a workaround you can convert all images to raw. after 
> our tests, the error will no longer occur. 
> but since we need the snapshot functions of qcow2 images, this is not a 
> permanent solution.
> 
> does anyone else have problems with qcow2 images and xen under buster?
> maybe this also concerns qemu?
> 
> [...]

To be honest, I do not know.

Have you been able to find out more about the problem yet, in the last
year? Have you taken steps to try to narrow down the problem by
investigating other combinations of used software with/without xen? I
mean, for example, reboot into just Linux and mount the qcow2 image
somewhere and do the same load test to see if it's also happening when
eliminating Xen from the equation?

The bug report right now is not really actionable for anyone else than
yourself. As Debian Xen team we unfortunately do not have the bandwidth
to go set up a test server with the same configuration as you have and
try to hammer on it and cause the same problem to happen.

Thanks,
Hans

Bug#955994: [Pkg-xen-devel] Bug#955994: xen-utils-common: Could not start vif

2020-11-26 Thread Hans van Kranenburg

reassign 955994 src:xen
tags 955994 + pending
thanks

Hi Samuel,

On 4/5/20 9:14 PM, Samuel Thibault wrote:
> Package: xen-utils-common
> Version: 4.11.3+24-g14b62ab3e5-1
> Severity: normal
> Tags: patch
> 
> Hello,
> 
> I was having issues with starting domains with vif-nat: 
> 
> ♭ xl cr -c mydom
> Parsing config from mydom
> libxl: error: libxl_exec.c:117:libxl_report_child_exitstatus: 
> /etc/xen/scripts/vif-nat online [27191] exited with error status 1
> libxl: error: libxl_device.c:1286:device_hotplug_child_death_cb: script: 
> /etc/xen/scripts/vif-nat failed; error detected.
> libxl: error: libxl_create.c:1519:domcreate_attach_devices: Domain 25:unable 
> to add vif devices
> libxl: error: libxl_domain.c:1034:libxl__destroy_domid: Domain 
> 25:Non-existant domain
> libxl: error: libxl_domain.c:993:domain_destroy_callback: Domain 25:Unable to 
> destroy guest
> libxl: error: libxl_domain.c:920:domain_destroy_cb: Domain 25:Destruction of 
> domain failed
> 
> It happens that it seems that's merely because handle_iptable() does not
> pass a return value, and I guess the return value is thus that of the
> latest command, which may not be true, and that makes vif-nat fail. The
> attached patch fixes that.

Yes, you are completely right. Thanks for spotting this.

> [...]

I just added the explicit 0. I did not change the second return line,
since that code is unreachable anyway and it's patching upstream content.
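
For anyone wondering how a missing return value can make the whole hook fail, here is a small sketch. It uses Python only as a wrapper to run a tiny shell script through `sh`; the shell function `handle` is a hypothetical stand-in and not the real handle_iptable() from vif-nat.

```python
import subprocess

# Hypothetical stand-in for a hook function. Its last command is a test
# that fails, so without an explicit return the function itself reports
# that failure to its caller.
SCRIPT_TEMPLATE = """
handle() {
    true
    [ -n "" ] && echo "not reached"
%s
}
handle
"""


def exit_status(extra_line):
    """Run the script with extra_line appended to the function body."""
    script = SCRIPT_TEMPLATE % extra_line
    return subprocess.run(["sh", "-c", script]).returncode


# Without an explicit return, the status of the failed test leaks out.
print(exit_status(""))

# With an explicit "return 0", the function reports success.
print(exit_status("    return 0"))
```

The first call gives a non-zero exit status even though nothing really went wrong, which is the same kind of failure that libxl then reports for the hotplug script.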

https://salsa.debian.org/xen-team/debian-xen/-/commits/knorrie/sid

Thanks,
Hans

Bug#939186: [Pkg-xen-devel] Bug#939186: HVM + Balloon crashes Xen hypervisor

2020-11-25 Thread Hans van Kranenburg

Hi,

On 9/2/19 5:18 AM, Elliott Mitchell wrote:
> Package: xen-hypervisor-4.8-amd64
> Version: 4.8.5+shim4.10.2+xsa282-1+deb9u11
> 
> Trying to create a HVM domain with memory != maxmem reliably crashes
> Debian's build of Xen 4.8.  This may be a nonsensical configuration, but
> it still shouldn't cause everything, including the hypervisor to crash.
> 
> I recall running into this with 4.4 as well.

Can you still reproduce this with Xen 4.11 or 4.14?
If not, can you mail 939186-cl...@bugs.debian.org to close it?

I just tried a few things with maxmem and memory with a PVH guest on Xen
4.14, and it just seems to work like it should.

Hans

Bug#912975: xen-hypervisor-4.8-amd64: Dom0 crashes randomly without logs on Debian Stretch with Xen 4.8.4

2020-11-22 Thread Hans van Kranenburg

Hi,

This bug was reported against Xen 4.8 (which is out of support and out
of security support now) and there has not been any activity for almost
two years.

I'm cleaning up old open bugs, and I will close the issue now.

If you found a solution to this problem, please let us know, so the
information is added in the bug report for anyone else who might run
into the same situation.

If the problem still persists with Xen 4.11 in Debian stable, please
reply and reopen.

Thanks,
Hans

Bug#934786: xen-system-amd64: xen host crashes when calling "npm run build" in a vm (reproducible)

2020-11-22 Thread Hans van Kranenburg

Hi Mario,

On 8/14/19 11:00 PM, mario wrote:
> Package: xen-system-amd64
> Version: 4.8.5+shim4.10.2+xsa282-1+deb9u11
> Severity: important
> 
> hello everyone,
> 
> we have a vm with kernel 4.9.0-5-amd64 running debian oldstable
> if we run an "npm run build" on one of the virtual machines, the whole 
> xen-host system will restart (reset)
> there is no message, neither in the kernel nor in the syslog.
> 
> if we update the kernel of the virtual machine to 4.9.0-9-amd64, the problem 
> is no longer there and the build runs without errors.
> 
> the kernel of the hosts system doesn't matter, even the latest 4.9.0-9-amd64 
> doesn't help.
> 
> we have a second server (with different hardware) also here i can crash the 
> complete xen-server from the vm with all vms that are running.
> 
> the whole thing works reproducible by starting "npm run build" on this 
> special vm
> 
> any idea how we can narrow that down and provide more information???

Interesting. Do you know if it's Xen crashing, or the dom0 Linux kernel?

The Xen wiki has some hints about debugging:
https://wiki.xen.org/wiki/Debugging_Xen

The first thing I would recommend is getting the server serial port
working properly so that you can see Xen messages there.

This bug was reported against Xen 4.8 (which is out of support and out
of security support now). I would highly recommend to first upgrade to
Xen 4.11 in current Debian stable and see if you can still reproduce
this problem.

If you no longer have this problem, or found a solution, please let us know.

If there's no activity, I will close this issue in about a month.

Thanks,
Hans

Bug#968965: [Pkg-xen-devel] Bug#968965: xen: FTBFS woes in sid

2020-11-21 Thread Hans van Kranenburg

On 11/20/20 8:02 PM, Hans van Kranenburg wrote:
> So,
> 
> On 9/21/20 4:16 PM, Hans van Kranenburg wrote:
>> [...]
>>
> [...]
>  >8 
> 
> dh_install: warning: Cannot find (any matches for)
> "usr/lib/debug/usr/lib/xen-*/boot/*" (tried in ., debian/tmp)
> 
> dh_install: warning: xen-utils-4.14 missing files:
> usr/lib/debug/usr/lib/xen-*/boot/*
> dh_install: error: missing files, aborting
> 
>  >8 
> 
> I can only find CONFIG_PV_SHIM=n in the build log. What is going on
> here? Attached is the build log.

Ok, this probably has something to do with upstream commit 8845155c83
"pvshim: make PV shim build selectable from configure" (Xen 4.12) which
causes the shim not to be built during our i386 build any more.

In Xen 4.11 we have commit a516bddbd3 "tools/firmware/Makefile:
CONFIG_PV_SHIM: enable only on x86_64". The part of this file that this
patch changes is removed in the above mentioned commit.

Because all of this is such a big mess, I just tried to revert
8845155c83 and then do 0b898ccc2 and a516bddbd3 on top of the previous
code again.

And, yes, now it goes through, and ./usr/lib/xen-4.14/boot/xen-shim is
included in the i386 package. At least we have a workaround now.

> My WIP branch is here (including the make-patches commit, it's ready to
> build). I also forwarded the thing to latest stable-4.14.

Again at:

> https://salsa.debian.org/xen-team/debian-xen/-/commits/knorrie/4.14/

I'll rerun both the amd64 and i386 build here and actually boot the
amd64 packages in a test environment. If that succeeds, I'm going to
try to put this in experimental again so we can see if it all succeeds
on the buildds.

Then after final review we should be able to upload to unstable at the
beginning of next week.

K

Bug#968965: xen: FTBFS woes in sid

2020-11-21 Thread Hans van Kranenburg

On 11/21/20 5:40 AM, Elliott Mitchell wrote:
> On Fri, Nov 20, 2020 at 08:02:26PM +0100, Hans van Kranenburg wrote:
>> So,
>>
>> On 9/21/20 4:16 PM, Hans van Kranenburg wrote:
>>> [...]
>>>
>>> gcc -Wl,-z,relro -Wl,-z,now -pthread -Wl,-soname
>>> -Wl,libxentoolcore.so.1 -shared -Wl,--version-script=libxentoolcore.map
>>> -o libxentoolcore.so.1.0 handlereg.opic
>>> /usr/bin/ld: i386:x86-64 architecture of input file `handlereg.opic' is
>>> incompatible with i386 output
>>> /usr/bin/ld: handlereg.opic: file class ELFCLASS64 incompatible with
>>> ELFCLASS32
>>> /usr/bin/ld: final link failed: file in wrong format
>>> collect2: error: ld returned 1 exit status
>>
>> This one is caused by "debian/rules: Combine shared Make args". I
>> reverted that change for now.
>>
>> [...]
> 
> I was going to type, "That can't be true!  Both sections are identical,
> so that commit *couldn't* have done it!"
> 
> Being the careful sort, look closer.  Look closer.  Then realize if one
> reads fast they look identical, but they're getting *slightly* different
> values for ${XEN_TARGET_ARCH}.  Mainly for $(make_args_xen),
> ${XEN_TARGET_ARCH} gets $(xen_arch_$(flavour)), but for
> $(make_args_tools), ${XEN_TARGET_ARCH} gets $(xen_arch_$(DEB_HOST_ARCH)).
> 
> Three of us and we didn't spot that difference.  Should still combine
> ${XEN_COMPILE_ARCH} which remains identical for both values.

Ok, I will make it a partial revert and add the above information about it.

Thanks.

Hans

Bug#975062: Python 3 (pygrub) in 4.14 packages

2020-11-18 Thread Hans van Kranenburg

Hi!

On 11/18/20 6:45 PM, Ian Jackson wrote:
> Hans van Kranenburg writes ("Bug#975062: Python 3 (pygrub) in 4.14 packages"):
>> So, apparently there are cases in which pygrub 'works' and in which it
>> does not, and apparently using pygrub with "amd64 kernel and Xen tools
>> but i386 userland" is problematic, and I remember some remarks which I
>> can't find back about that that use case was probably already broken
>> always, in the past.
> 
> The problem with pygrub with 32-bit userland is as follows:
> 
>  * Xen has to be 64-bit since there is no 64-bit Xen.
                                            ^^ 32?

>  * dom0 kernel bitness and Xen tools bitness must match because
>    Xen 32/64 compat ABI understands only one bitness for dom0
>    and Xen dom0 tools make hypercalls directly so must match
>    the kernel.
> 
>  * 32-bit kernels are starting not to be able to drive hardware
>    (big PCI bars, bugs, etc.) so you want a 32-bit kernel.
                                              ^^ 64?

>  * So you must have libxen*:amd64.
> 
>  * pygrub uses python, obviously.  It needs to load the xenfsimage
>    library, which is part of xen tools, since that is the userland
>    library that understands the guest filesystem to fish out the guest
>    kernel.
> 
>  * The xenfsimage library is not in its own package [1] - it's in with
>    some other Xen libraries.  But it is going to be loaded into a
>    python interpreter, so it needs to match the bitness of the python
>    interpreter.
> 
>  * You can't install a 64-bit python interpreter without basically
>    doing the whole 32-to-64 crossgrade.  (That crossgrade is what I
>    ended up doing on my home machine.)
> 
>  * You can't co-install libxen*:i386 because the Xen libraries aren't
>    properly multiarched. [2]
> 
>  * This gets worse now that the Xen packages use python3.  Previously
>    with a minimal but modern system you might be able to get away with
>    having a 64-bit python2 and a 32-bit python3.
> 
> Both [1] and [2] are in principle bugs in the Xen packages.  Upstream
> sentiment seems to be that 32-bit userland is not really a very good
> idea any more anyway.  So we could solve this by fixing [1] or [2]

My personal opinion is that there are more interesting horses to fry (or
what was the saying) for our very bandwidth limited team. So yes, let's
move this from the perfection into the known issues department for now
(bullseye).

> or
> we could expect people to use 64-bit dom0 userland (and crossgrade if
> need be).

When someone shows up with a real world issue who's really panicking
after recklessly upgrading (after the bullseye release halfway through
2021), we can probably help by giving pointers and instructions. Until
that actually happens, we should not spend time writing documentation
about how to do that etc.
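
If someone does want to check the bitness situation on their own system, a sketch like the following might help. It compares the pointer size of the running Python interpreter with the ELF class byte of a shared library. `interpreter_bits` and `elf_bits` are hypothetical helper names, and the demo feeds in hand-made header bytes instead of reading a real library file.

```python
import struct


def interpreter_bits():
    """Pointer size of the running Python interpreter, in bits."""
    return struct.calcsize("P") * 8


def elf_bits(header):
    """Return 32 or 64 for the first bytes of an ELF file."""
    if len(header) < 5 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class = header[4]  # EI_CLASS: 1 = ELFCLASS32, 2 = ELFCLASS64
    if ei_class == 1:
        return 32
    if ei_class == 2:
        return 64
    raise ValueError("unknown ELF class %d" % ei_class)


def loadable(header):
    """True if a library with this header matches the interpreter."""
    return elf_bits(header) == interpreter_bits()


print(interpreter_bits())
print(elf_bits(b"\x7fELF\x02"))  # hand-made header of a 64-bit object
```

For a real file, the header can be obtained with something like `open(path, "rb").read(5)`. A 64-bit xenfsimage library and a 32-bit python3 would show up here as a mismatch.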

>> I wanted to find out about this and set up some test cases to reproduce
>> things (I've never used pygrub yet), but that obviously did not happen
>> yet. I have some stuff going on in my personal life that is taking up a
>> lot of time currently. What is rather easy for *me* is to help
>> organizing the work and managing todo lists etc, but not learning new
>> stuff ATM.
>>
>> So, my current questions are:
>>
>> 1. Is pygrub a blocker for having Xen 4.14 in unstable? Because that
>> should be our first team-goal now.
> 
> I think yes, a working pygrub ought to be a blocker for 4.14 in
> unstable.
> 
> But I think we have that - I rebuilt the existing packages for buster
> and it WFM.

OK.

>> 2. What exactly is going on, can we make a list/table/whatever about in
>> which cases pygrub 'does not work' (in more detail, how does it fail).
>> 3. pygrub keeps being the thing that always causes problems. What would
>> be your (asking anyone who wants to think along) ideas about which
>> well-defined situations/test-cases we should have to execute instead of
>> having the users report problems after big package changes?
> 
> IDK about any other problems than the bitness one above.

Ok, thanks a lot for the write up, both, and now we have a debian bug to
look back to which is a bit easier to track than mailing list messages.

So, I'm crossing out this issue now as blocker.

Hans

Bug#975062: Python 3 (pygrub) in 4.14 packages

2020-11-18 Thread Hans van Kranenburg

On 11/18/20 9:39 PM, Ian Jackson wrote:
> It seems I was distracted when I wrote this mail.
> 
> Ian Jackson writes ("Re: Bug#975062: Python 3 (pygrub) in 4.14 packages"):
>> The problem with pygrub with 32-bit userland is as follows:
>>
>>  * Xen has to be 64-bit since there is no 64-bit Xen.
> ^^ 32
> 
>>  * 32-bit kernels are starting not to be able to drive hardware
>>(big PCI bars, bugs, etc.) so you want a 32-bit kernel.
>   ^^ 64
> 

HAH, thanks. \:D/

Hans

Bug#975062: Python 3 (pygrub) in 4.14 packages

2020-11-18 Thread Hans van Kranenburg

Package: src:xen
Version: 4.14.0-1~exp1
Control: submitter -1 ehem+deb...@m5p.com
X-Debbugs-CC: ehem+deb...@m5p.com, ijack...@chiark.greenend.org.uk

Hi, I think this should be in a bug report in the BTS to track it in a
better way.

8< Forwarded Message 8<
Subject: [Pkg-xen-devel] Python 3 in 4.14 packages
Date: Sat, 26 Sep 2020 22:44:25 -0700
From: Elliott Mitchell 
To: pkg-xen-de...@alioth-lists.debian.net

I was trying to test `pygrub` and found the Python 3 version is
definitely broken in the 4.14 packages.  I was able to get the script to
display the help message by adding "/usr/lib/xen-4.14/lib/" to
sys.path.

The existing line:
sys.path.insert(1, sys.path[0] + '/../lib/python')

Is distinctly odd, usually this is better expressed:
sys.path.append(os.path.join(sys.path[0], "libexec"))

(though I suppose we can assume Linux, but this is Bad Practice)
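
A minimal sketch of the more portable form of that path computation,
assuming the script lives in a bin directory that sits next to
lib/python. The helper name lib_dir_for is made up for illustration and
is not part of pygrub:

```python
import os.path


def lib_dir_for(script_dir):
    """Return the lib/python directory that sits next to script_dir.

    Hypothetical helper: it joins path components with os.path instead
    of concatenating strings with a hard-coded '/' separator.
    """
    return os.path.normpath(
        os.path.join(script_dir, os.pardir, "lib", "python"))


# Illustrative use; a real script would pass sys.path[0] instead:
#   sys.path.insert(1, lib_dir_for(sys.path[0]))
print(lib_dir_for("/usr/lib/xen-4.14/bin"))
```

The normpath call also collapses the "bin/.." detour, so the entry that
ends up in sys.path is easier to read when debugging import problems.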

The way some portions of `pygrub` are packaged is distinctly odd.
Certainly xc.so and xs.so are linked to core Xen libraries and need to be
version-specific.  Yet libfsimage.so appears independent of the Xen
version and should likely track `pygrub`, rather than matching the system
Xen version.

>8 Forwarded Message >8

I also have a little snippet from IRC, which is about this, where Ian
reports that he's seen it working.

https://salsa.debian.org/xen-team/debian-xen/-/snippets/500

So, apparently there are cases in which pygrub 'works' and cases in
which it does not, and apparently using pygrub with "amd64 kernel and
Xen tools but i386 userland" is problematic. I also remember some
remarks, which I can't find any more, saying that this use case was
probably already broken in the past.

I wanted to find out about this and set up some test cases to reproduce
things (I've never used pygrub yet), but that obviously did not happen
yet. I have some stuff going on in my personal life that is taking up a
lot of time currently. What is rather easy for *me* is to help
organizing the work and managing todo lists etc, but not learning new
stuff ATM.

So, my current questions are:

1. Is pygrub a blocker for having Xen 4.14 in unstable? Because that
should be our first team-goal now.
2. What exactly is going on, can we make a list/table/whatever about in
which cases pygrub 'does not work' (in more detail, how does it fail).
3. pygrub keeps being the thing that always causes problems. What would
be your (asking anyone who wants to think along) ideas about which
well-defined situations/test-cases we should have to execute instead of
having the users report problems after big package changes?

Hans

P.S. Next message after the commercials will be on #968965 which is the
other biggest issue for Xen 4.14 in unstable now.

Bug#970802: gcc-10: armhf: false positive when using -O2 and -Werror=format-truncation

2020-09-23 Thread Hans van Kranenburg

On 9/23/20 4:59 PM, Julien Grall wrote:
> X-Debbugs-CC: i...@xenproject.org
> X-Debbugs-CC: h...@knorrie.org
> Package: gcc-10
> Version: 10.2.0-9
> Severity: important
> 
> Dear Maintainer,
> 
> There was an FTBFS for Xen when building using GCC 10 on armhf (see
> bug #968965 [1]).

FYI [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=968965#22

> After investigation, it looks like a problem with the optimizer in GCC
> for armhf.
> 
> [...]

Thanks,
Hans

Bug#961511: [Pkg-xen-devel] Bug#961511: [PATCH] d/xen-utils-common.xen.init: disable oom killer for xenstored

2020-09-22 Thread Hans van Kranenburg

notfixed 961511 xen/4.14.0-1~exp1
thanks

Right... so in the end I made an off-by-one error while rebasing and
totally lost that commit. It's not actually in 4.14.0-1~exp1 now. That's
bad.

On 9/21/20 3:50 AM, Elliott Mitchell wrote:
> This is fun.  Actually isn't too difficult to trigger, simply slowly
> reduce the memory Xen allocates to Dom0 and eventually the oom-killer is
> likely to trigger (having tried to shrink Dom0 as far as possible,
> believe me, I know).  I had been wondering which of the Xen daemons could
> be safely restarted since it is handy to restart daemons instead of whole
> machine for security updates...
> 
> Interestingly running `xenstored --help` mentions:
>   -I, --internal-db   store database in memory, not on disk
> 
> There is a run/xenstored/tdb file so I end up wondering if newer versions
> are in fact storing everything in a file and restarting isn't so bad.

Not by default, and I don't know if it's actually considered best
practice. I could not find any info about this yet. I suspect it's not
recommended.

oxenstored has the following option in /etc/xen/oxenstored.conf:

# Activate filed base backend
persistent = false

When enabling this, the file /run/xenstored/db gets rewritten a lot and
I also see it's out of sync with what's in xenstore-ls after doing some
things. So, it might be inconsistent when the process is oom-killed.

> The patch switches the arguments from:
> --exec "$try_xenstored" -- ...
> to:
> --exec /usr/bin/choom -- -n -1000 "$try_xenstored" -- ...
> 
> I'm pretty sure start-stop-daemon is consuming the "--" and the second
> "--" shouldn't be there.

Well, I tested it and found out that it's needed...

-# start-stop-daemon --start \
   --pidfile "/run/xenstore.pid" \
   --exec /usr/bin/choom -- -n -1000 \
   /usr/lib/xen-4.14/bin/oxenstored --pid-file "/run/xenstore.pid"
/usr/bin/choom: unrecognized option '--pid-file'
Try 'choom --help' for more information.

-# start-stop-daemon --start \
   --pidfile "/run/xenstore.pid" \
   --exec /usr/lib/xen-4.14/bin/oxenstored --test
Would start /usr/lib/xen-4.14/bin/oxenstored .

and with the extra separator:

-# start-stop-daemon --start \
   --pidfile "/run/xenstore.pid" \
   --exec /usr/bin/choom -- -n -1000 \
   /usr/lib/xen-4.14/bin/oxenstored -- --pid-file "/run/xenstore.pid"

-# grep . /proc/$(pidof /usr/lib/xen-4.14/bin/oxenstored)/oom_*
/proc/363043/oom_adj:-17
/proc/363043/oom_score:0
/proc/363043/oom_score_adj:-1000

-# cat /proc/$(pidof /usr/lib/xen-4.14/bin/oxenstored)/cmdline
/usr/lib/xen-4.14/bin/oxenstored--pid-file/run/xenstore.pid
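
To spell out why both separators seem to be needed, here is a toy model
of the two parsing levels. It is only a sketch of my reading of the
transcript above, not the real option parsing code of start-stop-daemon
or choom, and the helper name split_at_separator is made up:

```python
def split_at_separator(argv):
    """Split argv at the first '--' and drop the separator itself."""
    if "--" in argv:
        i = argv.index("--")
        return argv[:i], argv[i + 1:]
    return argv, []


# Everything on the command line after "--exec /usr/bin/choom".
tail = ["--", "-n", "-1000", "/usr/lib/xen-4.14/bin/oxenstored",
        "--", "--pid-file", "/run/xenstore.pid"]

# Level 1: start-stop-daemon keeps what precedes the first "--" and
# hands the remainder to the program named by --exec (choom).
_, choom_argv = split_at_separator(tail)

# Level 2: choom stops looking for its own options at the next "--";
# what follows is left alone and ends up as arguments of oxenstored.
choom_part, daemon_args = split_at_separator(choom_argv)

print(choom_part)
print(daemon_args)
```

Without the second separator there is nothing that tells choom to stop
at the program name, which matches the "unrecognized option
'--pid-file'" error in the first attempt.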

How did you test it and how did you get a working process without the --?

Hans

Bug#968965: [Pkg-xen-devel] Bug#968965: Bug#968965: xen: FTBFS in sid

2020-09-21 Thread Hans van Kranenburg

notfixed -1 xen/4.14.0-1~exp1
reopen
found -1 xen/4.14.0-1~exp1
thanks

Hi,

On 9/4/20 1:55 PM, Hans van Kranenburg wrote:
> 
> On 8/24/20 7:03 PM, Gianfranco Costamagna wrote:
>> Source: xen
>> Version: 4.11.4+24-gddaaccbbab-1
>> Severity: serious
>>
>> Hello, looks like xen is FTBFS because of some bd-uninstallable python 
>> package and a gcc-10 related build failure. 
> 
> [...]
Well, it seems we have more FTBFS, let's reuse this bug number to track
it again?

https://buildd.debian.org/status/package.php?p=xen&suite=experimental

--->8--- arm64 --->8---

gcc -MMD -MP -MF ./.mem_access.o.d -DBUILD_ID -fno-strict-aliasing
-std=gnu99 -Wall -Wstrict-prototypes -Wdeclaration-after-statement
-Wno-unused-but-set-variable -Wno-unused-local-typedefs   -O2
-fomit-frame-pointer -nostdinc -fno-builtin -fno-common -Werror
-Wredundant-decls -Wno-pointer-arith -Wvla -pipe -D__XEN__ -include
/<>/xen/include/xen/config.h -Wa,--strip-local-absolute
-mcpu=generic -mgeneral-regs-only   -I/<>/xen/include
-fno-stack-protector -fno-exceptions -fno-asynchronous-unwind-tables
-fcf-protection=none -Wnested-externs '-D__OBJECT_FILE__="mem_access.o"'
 -c mem_access.c -o mem_access.o
mem_access.c: In function ‘p2m_mem_access_check’:
mem_access.c:227:6: note: parameter passing for argument of type ‘const
struct npfec’ changed in GCC 9.1
  227 | bool p2m_mem_access_check(paddr_t gpa, vaddr_t gla, const struct
npfec npfec)
  |  ^~~~

--->8--- armhf --->8---

gcc  -marm -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall
-Wstrict-prototypes -Wdeclaration-after-statement
-Wno-unused-but-set-variable -Wno-unused-local-typedefs   -O2
-fomit-frame-pointer
-D__XEN_INTERFACE_VERSION__=__XEN_LATEST_INTERFACE_VERSION__ -MMD -MP
-MF .xenpmd.o.d -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE
-D_LARGEFILE64_SOURCE  -g -O2 -fdebug-prefix-map=/<>=.
-fstack-protector-strong -Wformat -Werror=format-security -Wdate-time
-D_FORTIFY_SOURCE=2 -Werror
-I/<>/tools/xenpmd/../../tools/xenstore/include
-I/<>/tools/xenpmd/../../tools/include  -c -o xenpmd.o
xenpmd.c
xenpmd.c: In function ‘get_next_battery_file’:
xenpmd.c:92:37: error: ‘%s’ directive output may be truncated writing
between 4 and 2147483645 bytes into a region of size 271
[-Werror=format-truncation=]
   92 | #define BATTERY_STATE_FILE_PATH "/tmp/battery/%s/state"
  | ^~~
xenpmd.c:117:52: note: in expansion of macro ‘BATTERY_STATE_FILE_PATH’
  117 | snprintf(file_name, sizeof(file_name),
BATTERY_STATE_FILE_PATH,
  |
^~~
xenpmd.c:92:51: note: format string is defined here
   92 | #define BATTERY_STATE_FILE_PATH "/tmp/battery/%s/state"
  |   ^~
In file included from /usr/include/stdio.h:867,
 from xenpmd.c:35:
/usr/include/arm-linux-gnueabihf/bits/stdio2.h:67:10: note:
‘__builtin___snprintf_chk’ output between 24 and 2147483665 bytes into a
destination of size 284
   67 |   return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL
- 1,
  |
^~~~
   68 |__bos (__s), __fmt, __va_arg_pack ());
  |~
xenpmd.c:91:36: error: ‘%s’ directive output may be truncated writing
between 4 and 2147483645 bytes into a region of size 271
[-Werror=format-truncation=]
   91 | #define BATTERY_INFO_FILE_PATH "/tmp/battery/%s/info"
  |^~
xenpmd.c:114:52: note: in expansion of macro ‘BATTERY_INFO_FILE_PATH’
  114 | snprintf(file_name, sizeof(file_name),
BATTERY_INFO_FILE_PATH,
  |
^~
xenpmd.c:91:50: note: format string is defined here
   91 | #define BATTERY_INFO_FILE_PATH "/tmp/battery/%s/info"
  |  ^~
In file included from /usr/include/stdio.h:867,
 from xenpmd.c:35:
/usr/include/arm-linux-gnueabihf/bits/stdio2.h:67:10: note:
‘__builtin___snprintf_chk’ output between 23 and 2147483664 bytes into a
destination of size 284
   67 |   return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL
- 1,
  |
^~~~
   68 |__bos (__s), __fmt, __va_arg_pack ());
  |~
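
The sizes in the armhf diagnostics can be cross-checked by hand. The
sketch below only redoes GCC's arithmetic with the numbers from the log
above (destination size 284, %s range 4 to 2147483645); the function
and constant names are made up for illustration:

```python
FILE_NAME_SIZE = 284          # "destination of size 284" in the log
PREFIX = "/tmp/battery/"      # literal text before %s in both macros

# GCC's range for the %s argument, copied from the diagnostic.
S_MIN, S_MAX = 4, 2147483645


def snprintf_bounds(suffix, s_min, s_max):
    """Bytes written for PREFIX + %s + suffix, including the final NUL."""
    fixed = len(PREFIX) + len(suffix) + 1
    return fixed + s_min, fixed + s_max


# Room left for %s and everything after it once the prefix is written.
region = FILE_NAME_SIZE - len(PREFIX)

print(region)                                   # 271
print(snprintf_bounds("/state", S_MIN, S_MAX))  # (24, 2147483665)
print(snprintf_bounds("/info", S_MIN, S_MAX))   # (23, 2147483664)
```

So the "region of size 271" is just the 284 byte buffer minus the 13
byte prefix, and the warning is about the upper bound GCC assumes for
the %s argument, not about the fixed parts of the format string.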

--->8--- i386 --->8---

gcc-Wl,-z,relro -Wl,-z,now -pthread -Wl,-soname
-Wl,libxentoolcore.so.1 -shared -Wl,--version-script=libxentoolcore.map
-o libxentoolcore.so.1.0 handlereg.opic
/usr/bin/ld: i386:x86-64 architecture of input file `handlereg.opic' is
incompatible with i386 output
/usr/bin/ld: handlereg.opic: file class ELFCLASS64 incompatible with
ELFCLASS32
/usr/bin/ld: final link failed: file in wrong format
collect2: error: ld returned 1 exit status
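
The i386 failure is the linker refusing to mix a 64-bit object into a
32-bit output. A quick way to see which class an object file has is to
look at the EI_CLASS byte of its ELF header. This is a small sketch
with made-up helper names, not something taken from the Xen build
system:

```python
ELF_MAGIC = b"\x7fELF"
EI_CLASS = 4  # index of the class byte within e_ident
CLASS_NAMES = {1: "ELFCLASS32", 2: "ELFCLASS64"}


def elf_class(header):
    """Return the ELF class name for the first bytes of a file."""
    if len(header) <= EI_CLASS or header[:4] != ELF_MAGIC:
        raise ValueError("not an ELF file")
    return CLASS_NAMES.get(header[EI_CLASS], "unknown")


def elf_class_of_file(path):
    """Read just enough of the file at path to classify it."""
    with open(path, "rb") as f:
        return elf_class(f.read(EI_CLASS + 1))
```

Running elf_class_of_file on handlereg.opic in the failing build tree
would be expected to report ELFCLASS64, in line with the ld message.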

Hans

Bug#927071: [Pkg-xen-devel] Bug#927071: xen: More balloon-leak observation

2020-09-18 Thread Hans van Kranenburg

Hi again,

On 5/1/19 12:55 AM, Elliott Mitchell wrote:
> On Mon, Apr 22, 2019 at 04:02:28PM +0200, Hans van Kranenburg wrote:
>> On 4/22/19 1:10 AM, Elliott Mitchell wrote:
>>> There is plenty of free memory for creating additional VMs (perhaps too
>>> much, and that confused Xen?), so this is really puzzling that memory is
>>> being ballooned away from Dom0.  At this point I plan after the next
>>> restart to double the allocation for Dom0 and see whether Dom0 is able
>>> to last more than a week.
>>
>> Weird. Can you log memory stats over time, so that you can see when it
>> happens, and correlate it to other events?
> 
> At this point there is only one real pattern I've noticed:  Always
> `smartd` was the process which triggered the kernel OOM-killer.
> 
> Originally I was attributing this to `smartd` doing some large memory
> allocation during its night-time tasks (which I would attribute to
> perhaps `smartd` not being that well written).  Yet now, I never saw
> anything else trigger the OOM-killer and I'm now willing to speculate
> some I/O operation `smartd` was doing triggers a bug in Xen.

At first I replied with "I haven't heard about this symptom before your
report.", but later I realized that I am totally seeing the same kind of
behaviour.

During some debian-xen day in Feb 2020, I even had a bit of a closer
look at this together with Ian, and we ended up thinking that there's
actually some kind of obscure miscalculation bug happening. If you look
closely at the numbers in xl info and xl list, then you'll see that the
numbers just do not add up.

The dom0 gets some kind of fake-down-ballooning which is an accounting
error.
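
The kind of bookkeeping meant here can be sketched as follows. It
assumes the total and free memory figures from xl info and the per
domain Mem column from xl list, all in MiB. The sample values are made
up and not taken from a real host, and the function name is
hypothetical:

```python
def unaccounted_mib(total_mib, free_mib, domain_mib):
    """Memory not explained by free memory plus per-domain allocations."""
    return total_mib - free_mib - sum(domain_mib.values())


# Made-up sample values for illustration only.
sample_domains = {"Domain-0": 4096, "guest1": 8192, "guest2": 2048}
print(unaccounted_mib(32768, 18000, sample_domains))
```

Some difference is to be expected, since the hypervisor itself also
uses memory. What would be interesting is logging this number over time
and seeing whether it moves when the dom0 value drops.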

I can't provide more proof right now, because I have to reproduce the
thing in a simplified environment to be able to provide a kind of
walk-through scenario with all the output of the numbers.

And yes, I have seen oom killers do stuff in customer production
environments because of this. O_O

A team member in my team has been busy doing storage migrations where we
attach new block devices to domUs and then sync all their data to the
new filesystem (moving from ext4 to btrfs and also to new iSCSI storage)
and later reboot after a final sync and then swap block devices, etc.
From the graphs we've been looking at, combined with when migration
stuff is happening, I have gotten a suspicion that it looks like the
fake dom0 down-ballooning is related to grant mappings, since it seems
like the dom0 memory is not decreasing when attaching the new disk, but
it is when starting activity using it.

To be continued

Hans

Bug#968501: btrfs-heatmap: Please depend on python3:any or drop python3 dependency

2020-09-15 Thread Hans van Kranenburg

Hi!

On 8/16/20 3:36 PM, Elrond wrote:
> Package: btrfs-heatmap
> Version: 8-1
> Severity: wishlist
> User: multiarch-de...@lists.alioth.debian.org
> Usertags: multiarch
> 
> Hi,
> 
> btrfs-heatmap currently depends on just python3.
> As btrfs-heatmap is Architecture=all, it probably
> should depend on python3:any.
> Alternatively, the python3 dependency could be dropped, as
> the dependency on python3-btrfs will already pull in
> an appropriate python3.

Yes, you're right.

btrfs-heatmap has a hard dependency on python3-btrfs, so I'll drop the
python3 dependency for btrfs-heatmap in the next upload.

Thanks,
Hans

Bug#961511: [PATCH] d/xen-utils-common.xen.init: disable oom killer for xenstored

2020-09-07 Thread Hans van Kranenburg

tag -1 + pending
thanks

On 9/7/20 12:40 PM, Ian Jackson wrote:
> ~Hans van Kranenburg writes ("[PATCH] d/xen-utils-common.xen.init: disable 
> oom killer for xenstored"):
>> In case of oom killer terminating some process, we'd rather not see
>> xenstored go. Xenstored has an in-memory database, and when starting the
>> process again, it would be empty, which is very inconvenient. Xenstored
>> should already score quite low and have a fairly low memory footprint,
>> but according to the user report, it happened.
>>
>> Closes: #961511
>> Suggested-by: Samuel Thibault 
>> Signed-off-by: Hans van Kranenburg 
> 
> Acked-by: Ian Jackson 

Thanks, added.

Hans

Bug#961511: [PATCH] d/xen-utils-common.xen.init: disable oom killer for xenstored

2020-09-06 Thread Hans van Kranenburg

In case of oom killer terminating some process, we'd rather not see
xenstored go. Xenstored has an in-memory database, and when starting the
process again, it would be empty, which is very inconvenient. Xenstored
should already score quite low and have a fairly low memory footprint,
but according to the user report, it happened.

Closes: #961511
Suggested-by: Samuel Thibault 
Signed-off-by: Hans van Kranenburg 
---
Cc: Ian Jackson 
---
This is in my knorrie/4.14-extra branch now. I think we should do this.
---
 debian/xen-utils-common.xen.init | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/debian/xen-utils-common.xen.init b/debian/xen-utils-common.xen.init
index 54aaba89d320..2a4c09fa3f71 100644
--- a/debian/xen-utils-common.xen.init
+++ b/debian/xen-utils-common.xen.init
@@ -226,7 +226,8 @@ xenstored_start()
eval "try_xenstored=\$$try_xenstored_var"
if [ -x $try_xenstored ]; then
if start-stop-daemon --start --quiet \
-   --pidfile "$XENSTORED_PIDFILE" --exec "$try_xenstored" -- \
+   --pidfile "$XENSTORED_PIDFILE" \
+   --exec /usr/bin/choom -- -n -1000 "$try_xenstored" -- \
$XENSTORED_ARGS --pid-file "$XENSTORED_PIDFILE"; then
started_xenstored=$try_xenstored
break
-- 
2.20.1

Bug#961511: [Pkg-xen-devel] Bug#961511: xen-utils-common: Protect xenstored/xenconsoled against OOM

2020-09-06 Thread Hans van Kranenburg

Hi,

On 5/25/20 3:18 PM, Samuel Thibault wrote:
> Samuel Thibault, le lun. 25 mai 2020 15:11:44 +0200, a ecrit:
>> I'm currently using a hack such as
>>
>> for i in $(pgrep xenconsoled) ; do
>> echo -1000 > /proc/$i/oom_score_adj
>> done
>>
>> in /etc/init.d/xen, but there are cleaner ways to do this :)
> 
> For instance, using choom:
> 
>   start-stop-daemon --start --quiet --pidfile "$XENCONSOLED_PIDFILE" 
> --exec /usr/bin/choom -- \
>   -n -1000 "$XENCONSOLED" $XENCONSOLED_ARGS --pid-file 
> "$XENCONSOLED_PIDFILE" \

That's a nice idea! Especially for xenstored, because it only keeps
state in memory.

xenconsoled can be started again if it's ever oom killed. So, I'd like
to limit this to xenstored only.

E.g. in my situation at work, it's mostly openvswitch that gets killed
first, if there's really a situation in which something has to go. If I
can choose between that (which disrupts vm traffic) or xenconsoled
(which does not impact customer stuff directly), then I'd rather see the
last one go temporarily.

I had to insert another -- before $XENCONSOLED_ARGS to actually make it
work.

After reboot:

-# grep . /proc/$(pidof /usr/lib/xen-4.14/bin/oxenstored)/oom_*
/proc/7478/oom_adj:-17
/proc/7478/oom_score:0
/proc/7478/oom_score_adj:-1000
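
The same check can be scripted, e.g. for a monitoring probe. A minimal
sketch, assuming the usual /proc layout shown above; the function names
and the proc_root parameter are made up for illustration:

```python
import os


def read_oom_values(pid, proc_root="/proc"):
    """Return {filename: int} for the oom_* files of one process."""
    proc_dir = os.path.join(proc_root, str(pid))
    values = {}
    for name in ("oom_adj", "oom_score", "oom_score_adj"):
        with open(os.path.join(proc_dir, name)) as f:
            values[name] = int(f.read().strip())
    return values


def is_oom_exempt(values):
    """True when oom_score_adj is -1000, the lowest value the kernel accepts."""
    return values["oom_score_adj"] == -1000
```

With the values from the output above, is_oom_exempt returns True for
the oxenstored process.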

Hans

Bug#968965: [Pkg-xen-devel] Bug#968965: xen: FTBFS in sid

2020-09-04 Thread Hans van Kranenburg

Hi Gianfranco,

On 8/24/20 7:03 PM, Gianfranco Costamagna wrote:
> Source: xen
> Version: 4.11.4+24-gddaaccbbab-1
> Severity: serious
> 
> Hello, looks like xen is FTBFS because of some bd-uninstallable python 
> package and a gcc-10 related build failure. 

Yes. Thanks for the report.

Currently (actually, also today!) Ian Jackson and I are working on this.
We want to have Xen 4.14 in Debian unstable, and the two big things that
are needed are GCC 10 fixes and getting rid of python 2 usage.

So, just to let you know it's known and being worked on.

> gcc  -m64 -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall 
> -Wstrict-prototypes -Wdeclaration-after-statement 
> -Wno-unused-but-set-variable -Wno-unused-local-typedefs   -O2 
> -fomit-frame-pointer 
> -D__XEN_INTERFACE_VERSION__=__XEN_LATEST_INTERFACE_VERSION__ -MMD -MF 
> .tdb.o.d -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE  -g -O2 
> -fdebug-prefix-map=/build/xen-4.11.4+24-gddaaccbbab=. 
> -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time 
> -D_FORTIFY_SOURCE=2 -Werror -I. -include 
> /build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/config.h 
> -I./include 
> -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libs/evtchn/include
>  -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include 
> -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libxc/include 
> -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libs/toollog/include
>  -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include 
> -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libs/foreignmemory/include
>  -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include 
> -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libs/devicemodel/include
>  -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include 
> -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include 
> -D__XEN_TOOLS__ 
> -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libs/toolcore/include
>  -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include 
> -DXEN_LIB_STORED="\"/var/lib/xenstored\"" 
> -DXEN_RUN_STORED="\"/var/run/xenstored\""  
> -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libs/gnttab/include
>  -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include  -c -o 
> tdb.o tdb.c 
> gcc  -m64 -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall 
> -Wstrict-prototypes -Wdeclaration-after-statement 
> -Wno-unused-but-set-variable -Wno-unused-local-typedefs   -O2 
> -fomit-frame-pointer 
> -D__XEN_INTERFACE_VERSION__=__XEN_LATEST_INTERFACE_VERSION__ -MMD -MF 
> .talloc.o.d -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE  -g -O2 
> -fdebug-prefix-map=/build/xen-4.11.4+24-gddaaccbbab=. 
> -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time 
> -D_FORTIFY_SOURCE=2 -Werror -I. -include 
> /build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/config.h 
> -I./include 
> -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libs/evtchn/include
>  -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include 
> -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libxc/include 
> -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libs/toollog/include
>  -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include 
> -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libs/foreignmemory/include
>  -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include 
> -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libs/devicemodel/include
>  -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include 
> -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include 
> -D__XEN_TOOLS__ 
> -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libs/toolcore/include
>  -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include 
> -DXEN_LIB_STORED="\"/var/lib/xenstored\"" 
> -DXEN_RUN_STORED="\"/var/run/xenstored\""  
> -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libs/gnttab/include
>  -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include  -c -o 
> talloc.o talloc.c 
> gcc xs_tdb_dump.o utils.o tdb.o talloc.o-Wl,-z,relro -Wl,-z,now  -o 
> xs_tdb_dump 
> /usr/bin/ld: utils.o:./tools/xenstore/utils.h:27: multiple definition of 
> `xprintf'; xs_tdb_dump.o:./tools/xenstore/utils.h:27: first defined here
> collect2: error: ld returned 1 exit status
> make[6]: *** [Makefile:97: xs_tdb_dump] Error 1
> make[6]: Leaving directory '/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore'
> make[5]: *** [/build/xen-4.11.4+24-gddaaccbbab/tools/../tools/Rules.mk:253: 
> subdir-install-xenstore] Error 2
> make[5]: Leaving directory '/build/xen-4.11.4+24-gddaaccbbab/tools'
> make[4]: *** [/build/xen-4.11.4+24-gddaaccbbab/tools/../tools/Rules.mk:248: 
> subdirs-install] Error 2
> make[4]: Leaving directory

Bug#964494: File system corruption with ext3 + kernel-4.19.0-9-amd64

2020-07-20 Thread Hans van Kranenburg

Hi,

On Wed, 15 Jul 2020 20:52:40 -0700 Sarah Newman  wrote:
> On 7/7/20 8:13 PM, Ben Hutchings wrote:
> > Control: reassign -1 src:linux
> > Control: tag -1 moreinfo
> > 
> > On Tue, 2020-07-07 at 17:30 -0700, Sarah Newman wrote:
> >> Package: linux-signed-amd64
> >> Version: 4.19.0-9-amd64
> >>
> >> We've had two separate reports now of debian buster users running
> >> 4.19.0-9-amd64 who experienced serious file system corruption.
> > 
> > Which version?  (I.e. what does "uname -v" or
> > "dpkg -s linux-image-4.19.0-9-amd64" say?)
> > 
> >> - Both were using ext3
> >> - Both are running Xen HVM, but I do not have reason to believe this to be 
> >> related
> 
> [...]

I have servers which run 4.19.118-2 as dom0 kernel and a Xen 4.11.4-1
rebuild for Buster.

One example is a smallish 6-server cluster that got a reboot cycle 48
days ago.

It contains a few heavily loaded domUs with 4.19.118 or 4.19.131 based
kernels.

No problems, disk corruption or anything like that have been seen so
far. The dom0 filesystem is ext4, domUs use a mix of ext4 and btrfs
(over iscsi). So, no ext3 anywhere.

We haven't got bug reports against Debian Xen packages in the BTS about
this.

I have not yet tried to make an ext3 fs on a block device in a test domU
and then have it do things with the fs and reboot it now and then. If
wanted, I can do that and see if there's any problem after a week or
two. Just to add chaos to help correlating.

FWIW,
Hans

Bug#965245: [Pkg-xen-devel] Bug#965245: Cross-build issues

2020-07-18 Thread Hans van Kranenburg

Hi Elliott,

On 7/18/20 5:53 AM, Elliott Mitchell wrote:
> Package: src:xen
> Version: 4.13
> Tags: patch
> 
> I've been trying to get Xen 4.13 to cross-build for ARM.  In the
> process I've been running into bunches of problems, so here are fixes.

Can you:
* add a 'why' line to the commit message of the first patch
* add Signed-off-by lines
* and then mailbomb (git send-email) it to
pkg-xen-de...@lists.alioth.debian.org with Cc to Ian Jackson?

Just all of it in 1 mail thread? (So, with a 0/10 cover letter which
does not have to contain anything other than something like 'Hi! See
#965245, kthxbye'.)

Then we can collect some Reviewed-by etc.

> OCAML/xenstored is being problematic, that looks like outright bugs on
> ocaml-nox making it unusable for cross-building.

The cxenstored is also still there. The init scripts check whether
oxenstored is installed and, if not, fall back to using the normal
xenstored. So, I suspect that if you patch it out of the build for this
arch, no other changes are necessary. (Normally both are built now, so
that a user can switch back if they want to, in case of problems or
whatever.)
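
The fallback logic amounts to "take the first candidate that is
executable", like the [ -x ... ] test in the init script. A rough
sketch of that idea, with a made-up function name; the second path in
the list is a guess for illustration only:

```python
import os


def pick_xenstored(candidates):
    """Return the first candidate that exists and is executable, else None."""
    for path in candidates:
        if os.path.isfile(path) and os.access(path, os.X_OK):
            return path
    return None


# Preference order: OCaml implementation first, C implementation second.
# The second path is an assumption, not checked against the package.
CANDIDATES = [
    "/usr/lib/xen-4.14/bin/oxenstored",
    "/usr/lib/xen-4.14/bin/xenstored",
]
print(pick_xenstored(CANDIDATES))
```

On a build without the OCaml binary, the first entry simply does not
exist and the second one is picked without any further changes.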

> I'm including copies of 3 patches from Julien Grall.  Upstream source for
> this is: git://xenbits.xen.org/people/julieng/xen-unstable.git  The
> branch "arm-dma/v2".

Ok, these patches are in Xen 4.14 I see. The first thing I want to do
going forward is moving the packaging to that version. I hope this will
only make your life easier.

Like I said on IRC, the two other things before we can push it to Debian
experimental asap are making sure python 2 is not used any more
anywhere, and of course a proper debian/changelog. :) And then making
noise on the list to find users to try it out. And, a small pile of
backlog of things that are waiting, and then hopefully not too long
after the official Xen 4.14 release it can go into Debian unstable.

But, keep the 3 upstream patches in the set for now, so that it's
explicit that you need them for this.

> Why yes, I am trying to get Xen operational on a Raspberry PI.  Why do
> you ask?  :-)

Haha. Exciting. I like it. Looking forward to seeing it work and to
helping test it here. I didn't do cross-building yet, so time to learn
something new.

Hans (Knorrie)

Bug#964793: odd qemu/xen crashes + toolchain rings a bell

2020-07-13 Thread Hans van Kranenburg

However,

On 7/13/20 4:19 PM, Hans van Kranenburg wrote:
> (Adding more To:; Note that mailing the bug number does not make it end
> up at the submitter automatically, only the package maintainer).
> 
> Hi Christian,
> 
> thanks for the hints!
> 
> On Mon, 13 Jul 2020 09:01:18 +0200 Christian Ehrhardt
>  wrote:
>> Hi,
>> I was seeing the bug updates flying by and just wanted to mention that we
>> have seen something similar in Ubuntu - but back then things weren't
>> replicable on Debian so we couldn't contribute things back.
>> It seemed to be due to the newer and different-defaults toolchain that we
>> had in Ubuntu at the time.
>>
>> But here qemu/xen crashes + new toolchain come together again which
>> reminded me.
>>
>> So without any promises that it really is related I wanted to FYI you to
>> these two fixes we needed for Xen:
>> https://git.launchpad.net/ubuntu/+source/xen/tree/debian/patches/1001-strip-note-gnu-property.patch?h=ubuntu/groovy-devel
> 
> I guess this first one would be the one needed? "Force fcf-protection off
> when using -mindirect-branch".
> 
> In that case we want this one, it's not backported to 4.11-stable:
> 
> "x86/build: Unilaterally disable -fcf-protection"
> 
> https://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=3a218961b16f1f4feb1147f56338faf1ac8f5703

However, this is a workaround for a gcc bug that is fixed in:

https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=a03efb266f

This fix is included in gcc-9 in Debian since 9.3.0-12:

https://salsa.debian.org/toolchain-team/gcc/-/blob/gcc-9-debian/debian/changelog#L55
(it's the PR target/93654 (x86))

Reporter says the 4.11.4-1 package is used, which is built using gcc
9.3.0-13:

https://buildd.debian.org/status/fetch.php?pkg=xen&arch=all&ver=4.11.4-1&stamp=1590602099&raw=0

>> https://git.launchpad.net/ubuntu/+source/xen/tree/debian/patches/1000-flags-fcs-protect-none.patch?h=ubuntu/groovy-devel
> 
> This one is about the build failing.
> 
>> This would seem more applicable if the new toolchain would have recently
>> rebuilt xen and not qemu as in this case. But as an FYI it is still worth a
>> ping.
> 
> 小太, can you do...
> 
>   xl create -vvv 
> 
> ...which should show how qemu is invoked. Can you show that command?
> 
> I can provide you with some test packages with the mentioned upstream
> patch applied (on top of 4.11.4+24-gddaaccbbab-1), so you can test if
> your domU starts with them.
> 
> If so, we can request the backport upstream and/or maybe pick it for
> Debian 4.11 into the patch queue, whatever happens earlier.

So, the above info tells us that this probably is not the issue that
we're looking at. (I'm fine with still making some test packages for
reporter to test with to 100% check this.)

Then, let's see what shows up in the xl -vvv output and if there's
anything that can be debugged when starting the qemu process with those
args?

> Thanks,
> Hans (Debian Xen Team)
>

Bug#964793: odd qemu/xen crashes + toolchain rings a bell

2020-07-13 Thread Hans van Kranenburg

(Adding more To:; Note that mailing the bug number does not make it end
up at the submitter automatically, only the package maintainer).

Hi Christian,

thanks for the hints!

On Mon, 13 Jul 2020 09:01:18 +0200 Christian Ehrhardt
 wrote:
> Hi,
> I was seeing the bug updates flying by and just wanted to mention that we
> have seen something similar in Ubuntu - but back then things weren't
> replicable on Debian so we couldn't contribute things back.
> It seemed to be due to the newer and different-defaults toolchain that we
> had in Ubuntu at the time.
> 
> But here qemu/xen crashes + new toolchain come together again which
> reminded me.
> 
> So without any promises that it really is related I wanted to FYI you to
> these two fixes we needed for Xen:
> https://git.launchpad.net/ubuntu/+source/xen/tree/debian/patches/1001-strip-note-gnu-property.patch?h=ubuntu/groovy-devel

I guess this first one would be the one needed? "Force fcf-protection off
when using -mindirect-branch".

In that case we want this one, it's not backported to 4.11-stable:

"x86/build: Unilaterally disable -fcf-protection"

https://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=3a218961b16f1f4feb1147f56338faf1ac8f5703

> https://git.launchpad.net/ubuntu/+source/xen/tree/debian/patches/1000-flags-fcs-protect-none.patch?h=ubuntu/groovy-devel

This one is about the build failing.

> This would seem more applicable if the new toolchain would have recently
> rebuilt xen and not qemu as in this case. But as an FYI it is still worth a
> ping.

小太, can you do...

  xl create -vvv 

...which should show how qemu is invoked. Can you show that command?

I can provide you with some test packages with the mentioned upstream
patch applied (on top of 4.11.4+24-gddaaccbbab-1), so you can test if
your domU starts with them.

If so, we can request the backport upstream and/or maybe pick it for
Debian 4.11 into the patch queue, whatever happens earlier.

Thanks,
Hans (Debian Xen Team)

Bug#964482: buster-pu: xen/4.11.4+24-gddaaccbbab-1~deb10u1

2020-07-08 Thread Hans van Kranenburg

Hi,

On 7/8/20 9:35 AM, Moritz Muehlenhoff wrote:
> On Tue, Jul 07, 2020 at 10:56:18PM +0200, Hans van Kranenburg wrote:
>> Additional To: t...@security.debian.org
>>
>> Hi Security team,
>>
>> After our last security update, which was
>> 4.11.3+24-g14b62ab3e5-1~deb10u1, we found out that there is a bugfix to
>> be done to help users upgrade from Buster to Bullseye. This fix was
>> included in the unstable xen 4.11.4-1 upload (it also helps for the
>> future from there) and has been in unstable for 41 days now.
>>
>> I have chosen to not bother you with a new security upload for 4.11.4 to
>> Buster at that time (while it included security fixes) because I didn't
>> want to skip going through the stable release process because of this
>> packaging change.
>>
Now, we're on the verge of a new buster point release.
>>
>> Can you please read https://bugs.debian.org/964482 and ack that we can
>> do a combination of the security updates and this packaging change for
>> stable?
> 
> Ack, we can piggyback the fix for 964482 to the buster-security update,
> no problem.

Ok, clear. In that case it will be a security update with the fix
included. I was just trying to be more 'compliant'. :)

Upstream Xen testing finished and has all the commits in stable-4.11
now. I did the upload for Debian unstable already, it's processed now.

https://packages.debian.org/source/sid/xen

So, I changed the changelog to buster-security, and did another build
and test run here, all is looking good.

https://salsa.debian.org/xen-team/debian-xen/-/commit/0da17d8b443233e521c84886c2fc913ea4ee4480

Since I'm a DM I guess I need a sponsor for the security upload. Can
someone from the security team do this? I put everything here, signed
and well:

https://syrinx.knorrie.org/~knorrie/tmp/xen/

I have another question, which is about timing. I asked around a bit a
few weeks ago, but did not get any response on this:

For the users, who are running some Xen cluster, it's really useful to
get Xen and Linux kernel changes at the same time, to reduce the amount
of 'reboot stress' we're causing them. Does anyone have a brilliant idea
about how to improve this? I mean, if we do this security update now,
then next week the new kernel is in the point release. In general, if
the kernel team does a security update, or if a point release happens,
it would be useful to push out a Xen update as well at the same time...

I can of course write some dirty script that polls kernel team git all
the time and then emails me with "hola! activity in a -security branch!"...
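
A minimal sketch of what such a dirty script could look like. The
function name, the snapshot files and the assumption that the interesting
branches end in "-security" are all made up for illustration, and the
repository URL in the usage comment is an assumption too. It only compares
two saved "git ls-remote --heads" listings and prints the branches that
moved; the emailing part is left out.

```shell
# Hypothetical helper: print the names of branches ending in "-security"
# whose tip differs between two "git ls-remote --heads" snapshots.
# $1 = older snapshot file, $2 = newer snapshot file.
security_branch_changes() {
    old_snapshot=$1
    new_snapshot=$2
    grep -e '-security$' "$new_snapshot" | while IFS= read -r line; do
        # A line is "<sha><TAB>refs/heads/<branch>". If the exact same line
        # is not in the old snapshot, the branch is new or has moved.
        if ! grep -q -x -F -e "$line" "$old_snapshot"; then
            printf '%s\n' "${line##*refs/heads/}"
        fi
    done
}

# Usage (URL is an assumption, adjust to the real repository):
#   git ls-remote --heads https://salsa.debian.org/kernel-team/linux.git > new.txt
#   security_branch_changes old.txt new.txt
#   mv new.txt old.txt
```

Running this from cron and mailing any non-empty output would give the
"hola!" notification, without having to clone the whole repository.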

Thanks,
Hans

Bug#964482: buster-pu: xen/4.11.4+24-gddaaccbbab-1~deb10u1

2020-07-07 Thread Hans van Kranenburg

On 7/7/20 9:51 PM, Adam D. Barratt wrote:
> Control: tags -1 + moreinfo
> 
> On Tue, 2020-07-07 at 21:16 +0200, Hans van Kranenburg wrote:
>> I'd like to update the xen packages in buster to
>> 4.11.4+24-gddaaccbbab-1~deb10u1 for the 10.5 point release. This is
>> an update to keep following the stable-4.11 upstream Xen code, which
>> mainly contains security fixes.
>>
>> https://salsa.debian.org/xen-team/debian-xen/-/blob/10f1a4a8f15b6748459cd1c826d3808694682faf/debian/changelog
> 
> In that case, please attach a source debdiff between the current stable
> package and the proposed package (built and tested on stable) to this
> request.

I can do that. Are you sure you want to read through the upstream
changes in a way that collapses everything and removes the context of
the original git commits with any useful information about whether it's
related to an XSA, or if it's a backport of a critical bug that crashes
systems for our stable users or if it's a commit that really needs to be
included before the security fix will actually work?

I'm trying to run this through the stable release process because
there's an (one) actual packaging change involved.

If we only had upstream changes, we'd do this as a regular security update.

>> I also have 4.11.4+24-gddaaccbbab-1 for unstable ready for upload
>> here.
>> All of it is right now waiting for the upstream testing at the Xen
>> project to finish, which is regression testing the latest additions
>> for todays published security advisories (
>> https://xenbits.xen.org/xsa/,
>> 2020-07-07). But, I'm already sending the request.
> 
> It's fine to send the request now, but the unstable upload needs to
> happen first.

That's for sure!

Hans

Bug#964482: buster-pu: xen/4.11.4+24-gddaaccbbab-1~deb10u1

2020-07-07 Thread Hans van Kranenburg

Package: release.debian.org
Severity: normal
Tags: buster
User: release.debian@packages.debian.org
Usertags: pu

Hi,

I'd like to update the xen packages in buster to
4.11.4+24-gddaaccbbab-1~deb10u1 for the 10.5 point release. This is an
update to keep following the stable-4.11 upstream Xen code, which mainly
contains security fixes.

https://salsa.debian.org/xen-team/debian-xen/-/blob/10f1a4a8f15b6748459cd1c826d3808694682faf/debian/changelog

I also have 4.11.4+24-gddaaccbbab-1 for unstable ready for upload here.
All of it is right now waiting for the upstream testing at the Xen
project to finish, which is regression testing the latest additions for
today's published security advisories (https://xenbits.xen.org/xsa/,
2020-07-07). But, I'm already sending the request.

Both unstable and Buster are on Xen 4.11. Currently buster has
4.11.3+24-g14b62ab3e5-1~deb10u1, so in the changelog you can see we'll
be syncing it up with unstable again.
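
As a small illustration of why the ~deb10u1 suffix works for this: in
Debian version ordering a tilde sorts before anything, even the end of
the string, so the stable version stays lower than the unstable one and
upgrades keep working. The helper name below is made up; dpkg
--compare-versions is the real tool.

```shell
# Hypothetical helper: succeed when Debian version $1 sorts strictly
# before Debian version $2.
sorts_before() {
    dpkg --compare-versions "$1" lt "$2"
}

if sorts_before "4.11.4+24-gddaaccbbab-1~deb10u1" "4.11.4+24-gddaaccbbab-1"; then
    echo "the buster version sorts before the unstable version"
fi
```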

The 4.11.4-1 package version contained an actual packaging change, that
fixes a bug for upgrading to a new Xen version. This is something we
want to have in Buster for our users. It means fixing upgrading from
Buster to Bullseye, but also for whoever follows Debian unstable now.
It's the stuff related to #932759 and these are the changes:

Init scripts:

https://salsa.debian.org/xen-team/debian-xen/-/commit/420d05e8b5950cb79b03a613f791cad400390bb8

NEWS:

https://salsa.debian.org/xen-team/debian-xen/-/commit/10baa2d48db43a5ff675bddf5482717f60fb748a

Testing and code review can also be seen in:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=932759#38

So, since 4.11.4-1 is in unstable already, these changes have been out
there for weeks now. We have not seen any user report about any regression.

Thanks,
Hans van Kranenburg

Bug#963607: [Pkg-xen-devel] Bug#963607: xen-hypervisor-4.11-amd64: Xen Hypervisor kernel fails to load arcmsr module with "arcmsr0: dma_alloc_coherent got error" message.

2020-06-30 Thread Hans van Kranenburg

Hi,

On 6/25/20 1:44 PM, Alex Sanderson wrote:
> 
> Hi Hans,
> 
> Thank you for your assistance with this.  I hesitated to log this with
> xen-dev but thought I should wait for a response here first. 
> 
> 
> On 25/06/2020 01:30, Hans van Kranenburg wrote:
>> Hi Alex,
>>
>> On 6/24/20 12:31 PM, Alex Sanderson wrote:
>>> Package: xen-hypervisor-4.11-amd64
>>> Version: 4.11.3+24-g14b62ab3e5-1~deb10u1
>>> Severity: important
>>>
>>> Dear Maintainer,
>>>
>>> After updating to Buster and Xen 4.11 our machine no longer boots the Xen 
>>> kernel.  The default kernel 4.19.118-2+deb10u1 boots normally.
>> When booting with Xen, the computer first starts the Xen hypervisor
>> code. This is the part where you see all the lines with (XEN) at the
>> beginning appear.
>>
>> Afterwards, it starts the same 4.19.118-2+deb10u1 Linux kernel that is
>> used when running without Xen, but it's started as the first virtual
>> machine, that has extra privileges to access all hardware.
>>
>> So, Linux vs. Xen + Linux.
>>
>>> The machine has an Areca 1882IX-16 card in it when the arcmsr module
>>> tries to load the following error appears. 
>>>
>>> Areca RAID Controller0: Model ARC-1882, F/W V1.56 2019-07-30
>>> arcmsr0: dma_alloc_coherent got error
>>>
>>> No drives are discovered and the initramfs prompt is shown.
>> Ok, so booting the Xen part succeeded, but apparently, when starting the
>> Linux kernel inside, there's apparently a problem with accessing the
>> raid controller hardware. Interesting.
>>
>> This likely means it's not a problem in the Debian packaging part, it's
>> a problem somewhere in the upstream Xen or Linux code. That means that I
>> cannot solve this for you, but I can help with tips to gather the right
>> information, and help finding out what the best place is where we can
>> report the issue.
>>
>>> The machine:
>>>  * Supermicro X9DRW 
>>>  * Dual Intel(R) Xeon(R) CPU E5-2630L v2 @ 2.40GHz
>>>  * 128G RAM
>>>  * Areca ARC-1882IX-16 (1G onboard cache)
>>>
>>> Nothing I have tried is effective:
>>>  * Turning on BIOS above 4G decoding stops the Intel 10GBE ixgbe driver 
>>> from functioning and doesn't fix the arcmsr
>>>  * Unloading and reloading the arcmsr module from initramfs prompt
>>>  * Downgrading the Areca 1882 bios to v1.52 as per 
>>> http://faq.areca.com.tw/index.php?action=artikel=7=902=en
>>>  * Kernel parameters
>>>  ** pci=nocrs 
>>>  ** dom0_mem=8G 
>>>  ** mem=3072M
>>>  ** mem2048M cma=1024M
>>>  ** cma=2048
>>>  ** cma=3076@512M
>>>  ** iommu=1 intel_iommu=1 
>>>  ** arcmsr.host_can_queue=64 as per 
>>> http://faq.areca.com.tw/index.php?action=artikel=15=387=en
>>>
>>> I expected the arcmsr module to load and detect disks as it does with
>>> the stock kernel.
>>>
>>> I can provide sysctl and dmesg output if it helps.
>> Yes. The first thing needed is full startup logs, and for the Xen part
>> preferably extra logging. In /etc/default/grub.d/xen.cfg in the
>> GRUB_CMDLINE_XEN_DEFAULT setting, you can add loglvl=all, and then run
>> update-grub and try to boot Xen+Linux again.
>>
>> Do you have a way to capture the logging during boot? Like, a working
>> serial console or something similar?
>>
>> The output of dmesg when starting Linux without Xen is of course also
>> interesting, so we can compare both scenarios.
>>
>> Hans
> 
> I tried using debian's paste https://paste.debian.net but it always
> thought it was spam.
> 
> dmesg output Xen Hypervisor 4.11 https://pastebin.com/3wUyYg0P

This one shows a Linux kernel boot, not the Xen Hypervisor, which should
go first (with all the (XEN) lines). By default the Xen output should
show up on your (serial) console. If you do dmesg after starting Linux
as dom0 after starting Xen, then you just get the Linux part of it.

If it actually boots and it's usable to login and get a shell prompt
etc, then you can immediately use xl dmesg to see the xen part, and if
it doesn't, then you need to make sure you have some sort of serial
console to capture the lines.
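
If the system does get to a shell prompt, collecting both parts could
look roughly like this. It is only a sketch: the function name and the
output file names are made up, and the xl part is skipped when xl is not
installed.

```shell
# Hypothetical helper: save the Linux kernel log and, when available, the
# Xen hypervisor log into the directory given as $1.
collect_boot_logs() {
    outdir=$1
    mkdir -p "$outdir"
    # Linux part only (this is what plain dmesg shows in dom0).
    dmesg > "$outdir/linux-dmesg.txt" 2>&1 || true
    # Xen part, with all the (XEN) lines.
    if command -v xl >/dev/null 2>&1; then
        xl dmesg > "$outdir/xen-dmesg.txt" 2>&1 || true
    fi
}

# Usage, as root in dom0:
#   collect_boot_logs /root/boot-logs
```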

To do a bug report upstream, we'll need that information.

> dmesg output Debian Kernel 4.19.118-2+deb10u1 https://pastebin.com/GHzzW3vi

K

Bug#963607: [Pkg-xen-devel] Bug#963607: xen-hypervisor-4.11-amd64: Xen Hypervisor kernel fails to load arcmsr module with "arcmsr0: dma_alloc_coherent got error" message.

2020-06-24 Thread Hans van Kranenburg

Hi Alex,

On 6/24/20 12:31 PM, Alex Sanderson wrote:
> Package: xen-hypervisor-4.11-amd64
> Version: 4.11.3+24-g14b62ab3e5-1~deb10u1
> Severity: important
> 
> Dear Maintainer,
> 
> After updating to Buster and Xen 4.11 our machine no longer boots the Xen 
> kernel.  The default kernel 4.19.118-2+deb10u1 boots normally.

When booting with Xen, the computer first starts the Xen hypervisor
code. This is the part where you see all the lines with (XEN) at the
beginning appear.

Afterwards, it starts the same 4.19.118-2+deb10u1 Linux kernel that is
used when running without Xen, but it's started as the first virtual
machine, that has extra privileges to access all hardware.

So, Linux vs. Xen + Linux.

> The machine has an Areca 1882IX-16 card in it when the arcmsr module
> tries to load the following error appears. 
> 
>   Areca RAID Controller0: Model ARC-1882, F/W V1.56 2019-07-30
>   arcmsr0: dma_alloc_coherent got error
> 
> No drives are discovered and the initramfs prompt is shown.

Ok, so booting the Xen part succeeded, but when starting the Linux
kernel inside, there's apparently a problem with accessing the raid
controller hardware. Interesting.

This likely means it's not a problem in the Debian packaging part, it's
a problem somewhere in the upstream Xen or Linux code. That means that I
cannot solve this for you, but I can help with tips to gather the right
information, and help finding out what the best place is where we can
report the issue.

> The machine:
>  * Supermicro X9DRW 
>  * Dual Intel(R) Xeon(R) CPU E5-2630L v2 @ 2.40GHz
>  * 128G RAM
>  * Areca ARC-1882IX-16 (1G onboard cache)
> 
> Nothing I have tried is effective:
>  * Turning on BIOS above 4G decoding stops the Intel 10GBE ixgbe driver from 
> functioning and doesn't fix the arcmsr
>  * Unloading and reloading the arcmsr module from initramfs prompt
>  * Downgrading the Areca 1882 bios to v1.52 as per 
> http://faq.areca.com.tw/index.php?action=artikel=7=902=en
>  * Kernel parameters
>  ** pci=nocrs 
>  ** dom0_mem=8G 
>  ** mem=3072M
>  ** mem2048M cma=1024M
>  ** cma=2048
>  ** cma=3076@512M
>  ** iommu=1 intel_iommu=1 
>  ** arcmsr.host_can_queue=64 as per 
> http://faq.areca.com.tw/index.php?action=artikel=15=387=en
> 
> I expected the arcmsr module to load and detect disks as it does with
> the stock kernel.
> 
> I can provide sysctl and dmesg output if it helps.

Yes. The first thing needed is full startup logs, and for the Xen part
preferably extra logging. In /etc/default/grub.d/xen.cfg in the
GRUB_CMDLINE_XEN_DEFAULT setting, you can add loglvl=all, and then run
update-grub and try to boot Xen+Linux again.
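
For example, the relevant part of the file could end up looking like the
fragment below. This is a sketch that assumes the setting was empty
before; if there are already options in there, keep them and append
loglvl=all.

```shell
# /etc/default/grub.d/xen.cfg (fragment)
# Ask the Xen hypervisor to log messages of all levels during boot.
GRUB_CMDLINE_XEN_DEFAULT="loglvl=all"

# Afterwards, as root, regenerate the grub configuration and reboot
# into Xen+Linux:
#   update-grub
```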

Do you have a way to capture the logging during boot? Like, a working
serial console or something similar?

The output of dmesg when starting Linux without Xen is of course also
interesting, so we can compare both scenarios.

Hans

Bug#962267: [Pkg-xen-devel] Bug#962267: xen: please consider to not install NEWS into runtime library packages

2020-06-05 Thread Hans van Kranenburg

Hi Ansgar,

On 6/5/20 11:57 AM, Ansgar wrote:
> Source: xen
> Version: 4.11.4-1
> Severity: minor
> File: /usr/share/doc/libxenmisc4.11/NEWS.Debian.gz
> 
> Please consider to not install debian/NEWS into runtime library
> packages.  They get pulled into systems that do not run Xen at all in
> which case the NEWS aren't very helpful (just noise that
> apt-listchanges shows).  For my system for example:
> 
> +---
> | % aptitude why libxenmisc4.11
> | i   qemu-kvmDepends qemu-system-x86
> | i A qemu-system-x86 Depends libxenmisc4.11
> +---
> 
> Installing NEWS into xen*, but not libxen* probably still reaches all
> relevant users.

Yes, that makes sense.

OTOH, what if there was a really weird problem with libxenmisc4.11 that
we would like to pro-actively inform users about?

I guess there is only one NEWS per source package?

Hans

Bug#932759: marked as done (After upgrade from stretch to buster, removal of obsolete xen 4.8 packages seems to trigger shutdown of xenconsoled)

2020-05-27 Thread Hans van Kranenburg

Hi,

On 5/27/20 7:39 PM, Debian Bug Tracking System wrote:
> Your message dated Wed, 27 May 2020 17:36:26 +
> with message-id 
> and subject line Bug#932759: fixed in xen 4.11.4-1
> has caused the Debian Bug report #932759,
> regarding After upgrade from stretch to buster, removal of obsolete xen 4.8 
> packages seems to trigger shutdown of xenconsoled
> to be marked as done.
> 
> This means that you claim that the problem has been dealt with.

To avoid confusion, yes, this one closes with the upload of 4.11.4 to
unstable which has the fix.

However, it's still present in 4.11.3+24-g14b62ab3e5-1~deb10u1 in
buster. So, the same fix will also go into buster later, to in the end
help users upgrade from buster to bullseye.

Hans

Bug#932759: [PATCH 2/2] debian/rules: --no-start for xen dh_installinit

2020-05-26 Thread Hans van Kranenburg

Hi,

On 5/26/20 12:44 PM, Ian Jackson wrote:
> Hans van Kranenburg writes ("[PATCH 2/2] debian/rules: --no-start for xen 
> dh_installinit"):
>> When debugging the xen-utils postinst/prerm to find the cause of the
>> mysteriously disappearing xenconsoled processes, I discovered that the
>> xen-utils-common postinst and prerm stop and start the xen init script
>> as well!
>>
>> These commands are not visible in the packaging code, but they are added
>> by dh_installdeb into the postinst and prerm during package build time.
>>
>> We only want to call the script from xen-utils-V, so disable this
>> behavior by using --no-start
>>
>> Closes: #932759 (2/2)
>> Signed-off-by: Hans van Kranenburg 
> 
> Reviewed-by: Ian Jackson 

Thanks.

> I think it would be wise to look at the generated .debs and see that
> they contain (only) the expected pieces in their maintscripts.

Yes, I did this while testing by diffing the files installed in
/var/lib/dpkg/info with the old ones and verifying that exactly that
part went away.

(for 4.11:)

-$ diff -u ~/xen-utils-common.postinst xen-utils-common.postinst
--- /home/beheer/xen-utils-common.postinst  2020-05-26 13:08:45.738926207 +0200
+++ xen-utils-common.postinst   2020-05-25 14:14:28.0 +0200
@@ -31,13 +31,7 @@
 # Automatically added by dh_installinit/13.1
 if [ "$1" = "configure" ] || [ "$1" = "abort-upgrade" ] || [ "$1" = "abort-deconfigure" ] || [ "$1" = "abort-remove" ] ; then
     if [ -x "/etc/init.d/xen" ]; then
-        update-rc.d xen defaults 20 21 >/dev/null
-        if [ -n "$2" ]; then
-            _dh_action=restart
-        else
-            _dh_action=start
-        fi
-        invoke-rc.d xen $_dh_action || exit 1
+        update-rc.d xen defaults 20 21 >/dev/null || exit 1
     fi
 fi
 # End automatically added section


-$ diff -u ~/xen-utils-common.prerm xen-utils-common.prerm
--- /home/beheer/xen-utils-common.prerm 2020-05-26 13:09:01.570617331 +0200
+++ xen-utils-common.prerm  2020-05-25 14:14:28.0 +0200
@@ -1,10 +1,5 @@
 #!/bin/sh
 set -e
-# Automatically added by dh_installinit/13.1
-if [ -x "/etc/init.d/xen" ] && [ "$1" = remove ]; then
-    invoke-rc.d xen stop || exit 1
-fi
-# End automatically added section
 # Automatically added by dh_installdeb/13.1
 dpkg-maintscript-helper rm_conffile /etc/default/xend 4.11.1-2\~ -- "$@"
 dpkg-maintscript-helper rm_conffile /etc/xen/xend-config.sxp 4.11.1-2\~ -- "$@"
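
The same check can also be done mechanically instead of by reading a
diff. A rough sketch, with a made-up helper name, that reports whether a
maintainer script still starts or stops the xen init script:

```shell
# Hypothetical helper: succeed when the maintainer script given as $1
# still contains an "invoke-rc.d xen ..." call. The "( |$)" part keeps
# calls for the separate xendomains init script from matching.
calls_xen_init() {
    grep -E -q 'invoke-rc\.d xen( |$)' "$1"
}

# Usage on an installed system:
#   for f in /var/lib/dpkg/info/xen-utils-common.postinst \
#            /var/lib/dpkg/info/xen-utils-common.prerm; do
#       if calls_xen_init "$f"; then echo "$f still calls the init script"; fi
#   done
```

For a .deb that is not installed yet, dpkg-deb -e can extract the
maintainer scripts into a directory first.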

Hans

Bug#932759: [PATCH 0/2] Bug#932759 Fix misfiring init scripts

2020-05-25 Thread Hans van Kranenburg

This should be enough to finally fix the problem of the mysteriously
disappearing xenconsoled process. We have tried to fix this before, but
it turned out the fix was incomplete.

The two attached patches...
* revert the previous fix
* prevent the xen-utils-V prerm and postinst from calling the xen init
  script when V != running version X.
* remove even more additional extra bonus redundant superfluous
  supererogatory inordinate loquacious start/stop calls from the
  xen-utils-common maintainer scripts, which were put there by
  dh_installinit and went unnoticed so far.

Ian, can you give your A-B on this? It will have to go into buster as
well, to help users upgrade to Bullseye without these problems.

The test scenario I used (all on current Debian unstable):
[x] Reproduce the problem (disappearing xenconsoled) with current
packages
[x] Install fixed 4.11 packages, and check that when upgrading 4.11 to
4.11, the init script stop/start is called
[x] Install fixed 4.13 packages, and check that the init script is not
called when installing xen-utils-4.13 and when upgrading
xen-utils-common
[x] Reboot into Xen 4.13
[x] Remove xen-utils-4.11 and check that the stop action on the init
script is not called.
[x] Install xen-utils-4.11 again and check that the start action is
not called.
[x] Reboot into just Linux without Xen
[x] Remove xen-utils-4.11 and check that this works well enough. It's
allowed to print some complaints on the screen and behave a little
weird, but it should not totally explode.

Now, there's a last edge case I can think of, which is installing
xen-utils-V in a domU. In there, the /usr/lib/xen-common/bin/xen-version
script will return the Xen version of the host that is carrying this
domU and then do a thing. I do not think we actively support doing
interesting things inside a domU with these packages however.

Hans van Kranenburg (2):
  xen init/maint scripts: Do nothing if running for wrong Xen package
  debian/rules: --no-start for xen dh_installinit

 debian/rules   |  2 +-
 debian/xen-utils-V.postinst.vsn-in | 10 +-
 debian/xen-utils-V.prerm.vsn-in| 10 +-
 debian/xen-utils-common.xen.init   | 27 ---
 4 files changed, 19 insertions(+), 30 deletions(-)

-- 
2.20.1

Bug#932759: [PATCH 2/2] debian/rules: --no-start for xen dh_installinit

2020-05-25 Thread Hans van Kranenburg

When debugging the xen-utils postinst/prerm to find the cause of the
mysteriously disappearing xenconsoled processes, I discovered that the
xen-utils-common postinst and prerm stop and start the xen init script
as well!

These commands are not visible in the packaging code, but they are added
by dh_installinit into the postinst and prerm during package build time.

We only want to call the script from xen-utils-V, so disable this
behavior by using --no-start

Closes: #932759 (2/2)
Signed-off-by: Hans van Kranenburg 
---
 debian/rules | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/debian/rules b/debian/rules
index 23c982eb414b..73232ca20efe 100755
--- a/debian/rules
+++ b/debian/rules
@@ -282,7 +282,7 @@ override_dh_python2:
 
 # We have two init scripts.  (There used to be xend too.)
 override_dh_installinit:
-   dh_installinit --name xen -- defaults 20 21
+   dh_installinit --name xen --no-start -- defaults 20 21
    dh_installinit --name xendomains --no-start -- defaults 21 20
 
 # dh_strip in dh compat 10 and earlier (which we are at so this
-- 
2.20.1

Bug#932759: [PATCH 1/2] xen init/maint scripts: Do nothing if running for wrong Xen package

2020-05-25 Thread Hans van Kranenburg

After trying to fix this issue in the init script, we found out that the
problem still happened for systems running with systemd.

The xen-utils-V postinst and prerm have DPKG_MAINTSCRIPT_PACKAGE in
their environment. When calling invoke-rc.d xen  under systemd,
the whole circus of translation and compatibility layers is used to
finally end up running the /etc/init.d/xen script again. However, when
ending up there, the DPKG_MAINTSCRIPT_PACKAGE variable is lost.

So, instead of trying to fix this in the init script, avoid calling
invoke-rc.d altogether, when installing or removing for a different
version of Xen than the currently running one.

Since we only call this from two places, and the check is a one liner,
directly put it into the prerm and postinst.

Carefully quote the values on both sides of the comparison. For example,
when removing a xen-utils-V package after rebooting into just Linux
without Xen, the version retrieval helper will print an error like
"ERROR:  Can't find hypervisor information in sysfs!", there will be no
useful output on stdout and it will compare an empty string with the
version of the xen-utils package, resulting in the right action, not
trying to stop or start anything.
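
The effect of the quoting can be tried out in isolation. A minimal
sketch, in which should_call_init and fake_xen_version_without_xen are
hypothetical stand-ins (the real check is the one-liner placed directly
in the prerm and postinst, and the exit status of the stand-in is an
assumption):

```shell
# Hypothetical helper: succeed only when the running Xen version equals
# the version this xen-utils-V package is for. Both sides are quoted, so
# an empty running version is compared as an empty string.
should_call_init() {
    running_version=$1
    package_version=$2
    [ "$running_version" = "$package_version" ]
}

# Stand-in for the version helper on a system booted without Xen: it
# complains on stderr and prints nothing on stdout.
fake_xen_version_without_xen() {
    echo "ERROR:  Can't find hypervisor information in sysfs!" >&2
    return 1
}

running=$(fake_xen_version_without_xen 2>/dev/null) || true

if should_call_init "$running" "4.11"; then
    echo "would call invoke-rc.d xen stop"
else
    echo "leaving running daemons alone"
fi

# Without the quotes, the test would expand to [ = 4.11 ] here, which is
# a syntax error for the test command instead of a clean "not equal".
```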

To avoid hitting the disappearing xenconsoled scenario, the fix has to
be present in the maintainer scripts of the to be removed *old*
xen-utils-V package. This means users will have to first upgrade to a
package with this fix before upgrading to a different Xen version.

Signed-off-by: Hans van Kranenburg 
Closes: #932759 (1/2)
Fixes: cc85504103 "xen init script: Do nothing if running for wrong Xen package"
---
 debian/xen-utils-V.postinst.vsn-in | 10 +-
 debian/xen-utils-V.prerm.vsn-in| 10 +-
 debian/xen-utils-common.xen.init   | 27 ---
 3 files changed, 18 insertions(+), 29 deletions(-)

diff --git a/debian/xen-utils-V.postinst.vsn-in b/debian/xen-utils-V.postinst.vsn-in
index 581327f09ffd..0acebf836bb2 100644
--- a/debian/xen-utils-V.postinst.vsn-in
+++ b/debian/xen-utils-V.postinst.vsn-in
@@ -6,7 +6,15 @@ case "$1" in
     configure)
         update-alternatives --remove xen-default /usr/lib/xen-@version@
         if [ -x "/etc/init.d/xen" ]; then
-            invoke-rc.d xen start || exit $?
+            # Only call the init script when this xen-utils-@version@ package
+            # matches the currently running version of Xen. This means, doing
+            # in-place updates (e.g. a security update for same version).
+            #
+            # When installing a xen-utils package for any other Xen version,
+            # leave the running system alone.
+            if [ "$(/usr/lib/xen-common/bin/xen-version)" = "@version@" ]; then
+                invoke-rc.d xen start || exit $?
+            fi
         fi
     ;;
 
diff --git a/debian/xen-utils-V.prerm.vsn-in b/debian/xen-utils-V.prerm.vsn-in
index 1aa2cae65fda..f1cb4299c30c 100644
--- a/debian/xen-utils-V.prerm.vsn-in
+++ b/debian/xen-utils-V.prerm.vsn-in
@@ -6,7 +6,15 @@ case "$1" in
     remove|upgrade)
         update-alternatives --remove xen-default /usr/lib/xen-@version@
         if [ -x "/etc/init.d/xen" ]; then
-            invoke-rc.d xen stop || exit $?
+            # Only call the init script when removing or while upgrading for
+            # the currently running version of Xen.
+            #
+            # Otherwise, for example after a Xen version upgrade, autoremoval
+            # of an obsolete xen-utils-V package would inadvertently stop
+            # running daemons like xenconsoled.
+            if [ "$(/usr/lib/xen-common/bin/xen-version)" = "@version@" ]; then
+                invoke-rc.d xen stop || exit $?
+            fi
         fi
     ;;
 
diff --git a/debian/xen-utils-common.xen.init b/debian/xen-utils-common.xen.init
index f66ce6b8db18..05521733494e 100644
--- a/debian/xen-utils-common.xen.init
+++ b/debian/xen-utils-common.xen.init
@@ -26,33 +26,6 @@ xen) ;;
 esac
 
 VERSION=$(/usr/lib/xen-common/bin/xen-version)
-
-# The arrangements for the `xen' init script are a bit odd.
-# This script is part of xen-utils-common, of which there is one
-# version installed regardless of the Xen version.
-#
-# But it is called by the prerm and postinsts of xen-utils-VERSION.
-# The idea is that (for example) if xen-utils-VERSION is upgraded, the
-# daemons are restarted.
-#
-# However, this means that this script may be called by the
-# maintscript of a xen-utils-V package for a different V to the
-# running version of Xen (X, say).  Such a xen-utils-V package does
-# not actually want to start or stop its daemons.  Indeed, the version
-# selection machinery would redirect its efforts to the xen-utils-X
-# utilities.  But this is not right: we don't actually want to (for
-# example) stop xenconsoled from xen-utils-X just because some
-# not-currently-relevant xen-util

Bug#939560: [Pkg-xen-devel] Bug#939560: xen: Various problems in debian/rules

2020-05-25 Thread Hans van Kranenburg

Hi Guillem,

On 9/6/19 12:55 PM, Guillem Jover wrote:
> [...]
> 
> During the debhelper recommendation thread there was a mail from Ian
> pointing out to the xen debian/rules file, I took a look and noticed
> the following. :)
> 
> The debian/rules file [...]

Thanks for the report! The fixes will be in the new Xen 4.13 packages,
which are going to be in experimental very soon and in unstable in a few
weeks, hopefully (we need users to upgrade to the last 4.11 package for
an upgrade fix regarding #932759).

If you would like to review the changes, it's the three commits by Ian,
named...
- debian/rules: Set DEB_BUILD_MAINT_OPTIONS in shell
- debian/rules: Improve comment about hardening options
- debian/rules: Drop redundant sequence numbers in dh_installinit

...which you can find at:
  https://salsa.debian.org/xen-team/debian-xen/-/commits/knorrie/4.13

I'm still finishing up all of that, so can't give commit ids because of
the force-pushing going on.

Thanks,
Hans

Bug#938108: [Pkg-xen-devel] Bug#938108: python-pyxenstore: Python2 removal in sid/bullseye

2020-05-12 Thread Hans van Kranenburg

On 5/9/20 9:57 PM, Moritz Mühlenhoff wrote:
> On Sat, May 09, 2020 at 02:36:24AM +0200, Thomas Goirand wrote:
>> On 5/8/20 9:35 PM, Moritz Mühlenhoff wrote:
>>> On Fri, Aug 30, 2019 at 07:45:40AM +, Matthias Klose wrote:
>>>> Package: src:python-pyxenstore
>>>> Version: 0.0.2-1
>>>> Severity: normal
>>>> Tags: sid bullseye
>>>> User: debian-pyt...@lists.debian.org
>>>> Usertags: py2removal
>>>>
>>>> Python2 becomes end-of-live upstream, and Debian aims to remove
>>>> Python2 from the distribution, as discussed in
>>>> https://lists.debian.org/debian-python/2019/07/msg00080.html
>>>>
>>>> Your package either build-depends, depends on Python2, or uses Python2
>>>> in the autopkg tests.  Please stop using Python2, and fix this issue
>>>> by one of the following actions.
>>>
>>> Hi,
>>> python-pyxenstore is dead upstream and there are no reverse deps, let's 
>>> remove?
>>>
>>> Cheers,
>>> Moritz
>>
>> By all means, yes, remove this.
>> I believe it is in Debian when I attempted to package XCP (aka: xen-api,
>> aka xen-server, etc.), and that's long gone from Debian.
> 
> Ack, I've just filed an RM bug.

(seeing it happening)

Also ACK from me.

A while ago this confused me because I initially thought this was a
binary package produced by src:xen, but it was not. At some point (I
think it was our latest IRL work together day of the Debian Xen team) I
realized that it really was not, and from that POV, I can confirm that
it is not used by anything in there.

Thanks,

Hans

Bug#952958: rrdtool crashes after the DLA-2131-1 security update

2020-03-02 Thread Hans van Kranenburg

Hi,

I filed 952964 because I failed to find this one first, apparently. I
merged it now, please ignore 952964.

The problem is that upstream commits around this issue are quite a bit
of a mess, with a number of trial and error fixup commits. So, a half
broken version of the fix was now included in the Jessie security update.

See...

https://github.com/oetiker/rrdtool-1.x/commits/master?after=caf8f7e4a06cd36a69142a46326e58296850781d+69%5B%5D=src%5B%5D=rrd_graph.c

...and then the 'a proper fix to...' and a bunch of newer commits, like
'fix character class definition' and more.

So, a bit more inspection of the history of that file is necessary to
collect the pieces for a proper fix together.

I can help testing a new package if you want.

Thanks,
Hans van Kranenburg

Bug#952964: Security update breaks graph generation: 'range out of order in character class'

2020-03-02 Thread Hans van Kranenburg

Package: rrdtool
Version: 1.4.8-1.2+deb8u1

Hi, the patch in the Jessie security update that was just released
properly breaks creating graphs.

The patch contains the following line:

#define FLOAT_STRING "%[+- 0#]?[0-9]*([.][0-9]+)?l[eEfF]"

Now, [+- 0#] is not a valid character class for a regex, because the -
defines a range, and a range from '+' to ' ' is not valid.
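
The same kind of failure can be shown with grep in the C locale. This is
only a sketch: try_class is a made-up helper, and the reordered class,
with the - moved to the front so it is taken literally, is one common way
to write such a class, not necessarily what upstream did in its fixup.

```shell
LC_ALL=C
export LC_ALL

# Hypothetical helper: match the string $2 against the extended regex $1
# and return grep's exit status (0 = match, 1 = no match, 2 = error such
# as an invalid range in the pattern).
try_class() {
    printf '%s\n' "$2" | grep -E -q -e "$1" 2>/dev/null
}

broken=0
try_class '^[+- 0#]$' '+' || broken=$?

fixed=0
try_class '^[-+ 0#]$' '+' || fixed=$?

echo "class from the patch: exit $broken, reordered class: exit $fixed"
```

In ASCII, '+' is 0x2B and ' ' is 0x20, so the range written in the patch
runs backwards, which is what the regex compiler is complaining about.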

[RRD ERROR] Unable to graph
/var/lib/munin/cgi-tmp/munin-cgi-graph/[...].png : cannot compile
regular expression: Error while compiling regular expression
^(?:[^%]+|%%)*%[+-
0#]?[0-9]*([.][0-9]+)?l[eEfF](?:[^%]+|%%)*%s(?:[^%]+|%%)*$ at char 18:
range out of order in character class (^(?:[^%]+|%%)*%[+-
0#]?[0-9]*([.][0-9]+)?l[eEfF](?:[^%]+|%%)*%s(?:[^%]+|%%)*$)

Upstream did a fixup commit, 1615689e259bfd67e43cf7711948abc23f998ca9,
which was not included:

https://github.com/oetiker/rrdtool-1.x/commit/1615689e259bfd67e43cf7711948abc23f998ca9

Thanks,
Hans van Kranenburg

Bug#796095: ftp.debian.org: Please allow uploads for DMs to security-master

2020-01-08 Thread Hans van Kranenburg

Hi,

Friendly ping for this issue. Today I ran into the situation that after
getting DM status to be able to help with the security updates for a
specific package (Xen), I found out I was not able to actually make this
happen, since I still need to send the result of my package build to
someone else in the Debian Xen team who is a DD to have this person swap
the GPG signature and do the upload.

Thanks,
Hans

Bug#947944: xen: Several CVEs open for xen (CVE-2018-12207 CVE-2019-11135 CVE-2019-18420 CVE-2019-18421 CVE-2019-18422 CVE-2019-18423 CVE-2019-18424 CVE-2019-18425 CVE-2019-19577 CVE-2019-19578 CVE-20

2020-01-08 Thread Hans van Kranenburg

On 1/7/20 11:34 PM, Hans van Kranenburg wrote:
> [...]
> 
> Today I have finally been working on this. The result is that I at least
> have a new (WIP) version for buster. I'm running it on a dom0 right now
> and did smoke testing, live migrate, restarting domUs etc. It just works
> (tm).
> 
> This was the easy part, most of the work was assembling the changelog by
> copy-pasting things. I cross-checked with your list (below), which is
> nice, since we can check that way that the info from different points of
> view is the same (except for one entry it is).
> 
> https://salsa.debian.org/xen-team/debian-xen/commits/knorrie/buster-security
> 
> Now the interesting part begins, which is not so much about the stable
> security update, but more about what to do with unstable. We currently
> still have the same Xen version in unstable and in Buster.
> 
> So, the most logical thing, which I mentioned before would be to have
> 4.11.3+24-g14b62ab3e5-1 in unstable and 4.11.3+24-g14b62ab3e5-1~deb10u1
> in stable.

Ok, this will just be ok, since I was confused about the
python-pyxenstore package, and thought it was a by-product from our
src:xen. This is not the case, it's a separate thing. So, false alarm.

> [...]

That means that the original plan will suffice for now.

The whole python2 situation will be resolved when we prepare Xen 4.13 or
4.14, or whichever one will be the Bullseye one.

The result:

https://salsa.debian.org/xen-team/debian-xen/tree/knorrie/unstable
https://salsa.debian.org/xen-team/debian-xen/tree/knorrie/buster-security

I just built and tested both of the resulting piles of packages, on
buster and on a bullseye dom0. All looks fine, I can live migrate,
restart things etc etc...

So, next step is getting things uploaded to the right place.

Hans

Bug#947944: xen: Several CVEs open for xen (CVE-2018-12207 CVE-2019-11135 CVE-2019-18420 CVE-2019-18421 CVE-2019-18422 CVE-2019-18423 CVE-2019-18424 CVE-2019-18425 CVE-2019-19577 CVE-2019-19578 CVE-20

2020-01-07 Thread Hans van Kranenburg

Hi,

Today I have finally been working on this. The result is that I at least
have a new (WIP) version for buster. I'm running it on a dom0 right now
and did smoke testing, live migrate, restarting domUs etc. It just works
(tm).

This was the easy part; most of the work was assembling the changelog by
copy-pasting things. I cross-checked with your list (below), which is
nice, since that way we can check that the info from different points of
view is the same (it is, except for one entry).

https://salsa.debian.org/xen-team/debian-xen/commits/knorrie/buster-security

Now the interesting part begins, which is not so much about the stable
security update, but more about what to do with unstable. We currently
still have the same Xen version in unstable and in Buster.

So, the most logical thing, which I mentioned before would be to have
4.11.3+24-g14b62ab3e5-1 in unstable and 4.11.3+24-g14b62ab3e5-1~deb10u1
in stable.

However... https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=938843
And on Dec 15, python-pyxenstore was removed from testing.

So, I guess we're not supposed to upload something new to unstable that
includes this package again and/or uses python 2.

Also, we of course do not like a situation where the package in stable
has a newer version number than the one in unstable.
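The version ordering concern can be made concrete with a small sketch. The code below is a simplified Python re-implementation of the Debian version comparison rules (non-digit runs compared character by character with '~' sorting before everything, digit runs compared numerically). It assumes versions without an epoch, and the function names are made up for this example; on a real system, dpkg --compare-versions is the authoritative tool.

```python
def _order(ch):
    """Sort weight of one character inside a non-digit run."""
    if ch == "~":
        return -1          # tilde sorts before everything, even end of string
    if ch == "" or ch.isdigit():
        return 0           # end of string, or the start of a digit run
    if ch.isalpha():
        return ord(ch)     # letters sort before other punctuation
    return ord(ch) + 256


def _compare_part(a, b):
    """Compare two upstream-version or revision strings: -1, 0 or 1."""
    i = j = 0
    while i < len(a) or j < len(b):
        # Step 1: compare the non-digit runs character by character.
        while (i < len(a) and not a[i].isdigit()) or (
            j < len(b) and not b[j].isdigit()
        ):
            wa = _order(a[i] if i < len(a) else "")
            wb = _order(b[j] if j < len(b) else "")
            if wa != wb:
                return -1 if wa < wb else 1
            i += 1
            j += 1
        # Step 2: compare the digit runs that follow as integers.
        start_i, start_j = i, j
        while i < len(a) and a[i].isdigit():
            i += 1
        while j < len(b) and b[j].isdigit():
            j += 1
        na = int(a[start_i:i] or "0")
        nb = int(b[start_j:j] or "0")
        if na != nb:
            return -1 if na < nb else 1
    return 0


def compare_versions(v1, v2):
    """Compare two Debian versions (no epoch support in this sketch)."""
    def split(version):
        upstream, sep, revision = version.rpartition("-")
        return (upstream, revision) if sep else (version, "")

    u1, r1 = split(v1)
    u2, r2 = split(v2)
    return _compare_part(u1, u2) or _compare_part(r1, r2)
```

For example, compare_versions("4.11.3+24-g14b62ab3e5-1~deb10u1", "4.11.3+24-g14b62ab3e5-1") returns -1: the ~deb10u1 suffix makes the stable version sort below the unstable one, which is the desired relationship between the two suites.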

Checkmate...

We (as in, Debian Xen team, which is Ian and I who are currently active)
haven't been working on getting the latest greatest Xen into unstable
for Bullseye yet. The most recent Xen release (4.13) includes python3
support which fixes that issue, but getting that in means we have to
actively start working on newer packages now. This mostly means
reserving a few days to work on it, since it's not a really trivial
undertaking.

Another duct-tape option is to put the same thing in unstable again,
while stripping out python-pyxenstore from the control file, since it's
not a required package for the average use case. Still, xen-utils-4.11
contains a bunch of python 2 files, which apparently are still under the
radar.

I'm thinking out loud here, and am curious about what you and Ian can
come up with.

On 1/2/20 3:57 PM, Salvatore Bonaccorso wrote:
> [...]
> 
> There are several CVEs open for xen up to unstable, compiling a list
> from the information from the security-tracker it looks those below.
> 
> Any progress in getting those fixed at least for unstable already?
> 
> CVE-2018-12207[0]:

check, XSA-304

> CVE-2019-11135[1]:

check, XSA-305

> CVE-2019-18420[2]:

check, XSA-296

> CVE-2019-18421[3]:

check, XSA-299

> CVE-2019-18422[4]:

check, XSA-303

> CVE-2019-18423[5]:

check, XSA-301

> CVE-2019-18424[6]:

check, XSA-302

> CVE-2019-18425[7]:

check, XSA-298

> CVE-2019-19577[8]:

check, XSA-311

> CVE-2019-19578[9]:

check, XSA-309

> CVE-2019-19579[10]:

check, XSA-306

> CVE-2019-19580[11]:

check, XSA-310

> CVE-2019-19581[12]:

check, XSA-307

> CVE-2019-19582[13]:

check, XSA-307

> CVE-2019-19583[14]:

check, XSA-308

In the changelog, I also have a fix for:
 XSA-295 CVE-2019-17349 CVE-2019-17350
 https://xenbits.xen.org/xsa/advisory-295.html

> If you fix the vulnerabilities please also make sure to include the
> CVE (Common Vulnerabilities & Exposures) ids in your changelog entry.

I also added a commit to put in the CVE numbers in previous changelog
entries:

https://salsa.debian.org/xen-team/debian-xen/commit/0ee295f5caf6178f64febeb976d7ea968e44a191

Is this ok/wanted/great/what-you-like? I'm asking because the CVE numbers
are often not available yet when we push out the update.

Thanks,
Hans van Kranenburg

Bug#821254: systemd[1]: xendomains.service start operation timed out.

2020-01-06 Thread Hans van Kranenburg

Hi,

On 1/3/20 5:42 PM, Martin Maney wrote:
> 
> [...]
> 
> Yes, the shutdown hang is a different issue, but I'm going to hope that
> the real systemd units mentioned in this bug will fix my problem, too.

What you could already do now is test those scripts by just shutting
down and starting up the domUs, without actually rebooting the machine.
That way we can learn whether we could use them as a drop-in replacement
or not.

The xendomains init script that we have in Debian is:

https://salsa.debian.org/xen-team/debian-xen/blob/master/debian/xen-utils-common.xendomains.init

The upstream one (which is quite a bit different) is:

https://salsa.debian.org/xen-team/debian-xen/blob/master/tools/hotplug/Linux/xendomains.in

Or, it seems that last one gets installed in a location for helper
scripts and it's just called from both the init.d script and the systemd
service:

https://salsa.debian.org/xen-team/debian-xen/blob/master/tools/hotplug/Linux/init.d/xendomains.in

https://salsa.debian.org/xen-team/debian-xen/blob/master/tools/hotplug/Linux/systemd/xendomains.service.in

It would be really helpful if you were willing to spend some time on this.

Speaking for myself, I either deal with clusters and use live migration
to empty a server before shutting it down, or otherwise I'd rather have my
own way to carefully shut down things before typing a reboot command,
combined with a molly-guard script to prevent accidental reboots while
something is still running. That way there's still an option to
debug/salvage a misbehaving domU before shutdown.

Hans

Bug#944612: [Pkg-xen-devel] system still crashes with bullseye and kernel v5.3

2019-12-18 Thread Hans van Kranenburg

Hi Alexander,

On 12/18/19 9:24 PM, Alexander Dahl wrote:
> 
> meanwhile I'm running bullseye with kernel v5.3, but the problem
> persists and my Xen system is annoyingly unstable due to this bug. I
> attach some more logs from the last days and add the debian xen devel
> list in Cc. Maybe someone over there has an idea how to fix this. After
> all the log shows plenty of hints it could have something to do with Xen.

I think the xen parts you see in the stack trace listings are usual
calls that show that a domU is asking dom0 via the hypervisor to do some
disk read/writes or send data over the network (the 'upcall').

https://wiki.xen.org/wiki/Event_Channel_Internals

So, after getting that request, the dom0 Linux kernel tries to execute
it, which is e.g. the enqueue function to throw a network packet at the
physical network interface.

The first error we see is the "transmit queue 0 timed out". This looks
like the Linux kernel is looking at the network port hardware, and
expects it to accept the packet, deal with it and put it on the wire.

When this does not happen, and the network port hardware seems frozen
and times out, it's forcibly reset (I don't know if the thing is
resetting itself because it crashed, or if the Linux kernel does
something to reset it). "Reset adapter unexpectedly" gives me the
feeling that the firmware inside the network card crashed and something
inside there also reset it.

> Anyone care to help debug this? I have no idea where to start. Can
> kernel or xen generate coredumps one could analyze? Or is the log output
> the only thing?
> 
> (If you look at the logs, the strange thing is the system does not crash
> and reboot immediately, but later after lots of errors with storage, but
> comes back fine after reboot.)

The ata errors (disk fails to process a command) happen after all of the
above happens. Usually disk errors that look like this point at broken
disk hardware or bugs in the firmware in the disk. However, if it
consistently happens 6 to 7 seconds after the network card disaster, it
might be a symptom of the former.

The first thing I would recommend is disabling TCP segmentation
offloading on the network card in dom0 (ethtool -K enp1s0 tso off) and
seeing if that prevents the network card from choking on some kind of
input. If not, play with more settings like transmit checksum offloading
(tx off).
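As a sketch of how to try these offload settings one at a time, the following Python helper builds the ethtool invocations and only prints them unless told to run them. The -K option is the one ethtool uses to change offload features; the interface name enp1s0 comes from this report, while the helper names and the dry-run default are assumptions made for this example.

```python
import subprocess


def offload_commands(iface, features, state="off"):
    """Build one 'ethtool -K' command line per offload feature."""
    if state not in ("on", "off"):
        raise ValueError("state must be 'on' or 'off'")
    return [["ethtool", "-K", iface, feature, state] for feature in features]


def apply(commands, dry_run=True):
    """Print the commands, or run them as root when dry_run is False."""
    for argv in commands:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)
```

Calling apply(offload_commands("enp1s0", ["tso"])) prints the command for the first experiment; adding "tx" to the list covers the second. The current settings can be inspected with ethtool -k enp1s0 (lowercase k). Changing one feature at a time makes it easier to tell which setting, if any, makes the difference.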

If this does not help, we can start asking some Xen developers if they
have an idea how we can help with debugging and what we should do. (I
help maintain the Xen packages in Debian; my knowledge of the internals
is mostly limited to all the been-there-done-thats during the years of
using it as a user.)

I expect the problem to be related to Linux and the hardware, and not
specifically Xen. Knowing if the same happens when just booting Linux
without Xen is valuable debugging info. However, I realize that it's
likely a bit complicated to, in that case, try triggering the problem by
generating the same workload that's now coming from the domUs.

Curious to hear what happens,
Thanks,
Hans van Kranenburg

Bug#880554: [Pkg-xen-devel] Bug#880554: #880554: max grant frames problem

2019-11-28 Thread Hans van Kranenburg

On 7/18/19 1:30 AM, Hans van Kranenburg wrote:
> Hi,
> 
> On 10/23/18 7:34 PM, Ian Jackson wrote:
>> Control: retitle -1 max grant frames problem (domu freeze with 
>> linux-image-4.9.0-4-amd64)
>> Control: severity -1 important
>> Control: reassign -1 src:xen 4.8.3+xsa267+shim4.10.1+xsa267-1+deb9u9
> 
> my last comment in this bts bug was about:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=29d11cfd8698038b87458ba4d1329b9da81150a5
> 
> ..which is in since linux 4.13-rc2, and buster has 4.19+
> 
> Is there anyone who would wants to try reproduce the max grant frames
> problem on buster with Xen 4.11 and Linux 4.19 dom0/domU?
> 
> The 'xen/grant-table: max_grant_frames reached' should show up on the
> serial console. I'd like to see a test report of it actually happening.

I actually just did this, by putting max_grant_frames = 4 in a domU
config file and starting it (Linux 4.19 domU on Xen 4.11):

Welcome to Debian GNU/Linux 10 (buster)!

[5.499058] systemd[1]: Set hostname to .
[5.552968] xen:grant_table: xen/grant-table: max_grant_frames
reached cur=4 extra=1 limit=4 gnttab_free_count=3 req_entries=1
[5.554012] xen:grant_table: xen/grant-table: max_grant_frames
reached cur=4 extra=1 limit=4 gnttab_free_count=3 req_entries=1
[5.555858] xen:grant_table: xen/grant-table: max_grant_frames
reached cur=4 extra=1 limit=4 gnttab_free_count=3 req_entries=1
[5.556950] xen:grant_table: xen/grant-table: max_grant_frames
reached cur=4 extra=1 limit=4 gnttab_free_count=3 req_entries=1
[5.557082] xen:grant_table: xen/grant-table: max_grant_frames
reached cur=4 extra=1 limit=4 gnttab_free_count=3 req_entries=1
[5.557295] xen:grant_table: xen/grant-table: max_grant_frames
reached cur=4 extra=1 limit=4 gnttab_free_count=3 req_entries=1
[5.557636] xen:grant_table: xen/grant-table: max_grant_frames
reached cur=4 extra=1 limit=4 gnttab_free_count=3 req_entries=1
[5.558960] xen:grant_table: xen/grant-table: max_grant_frames
reached cur=4 extra=1 limit=4 gnttab_free_count=3 req_entries=1
[5.559800] xen:grant_table: xen/grant-table: max_grant_frames
reached cur=4 extra=1 limit=4 gnttab_free_count=3 req_entries=1
[6.014291] gnttab_expand: 159 callbacks suppressed
[6.014296] xen:grant_table: xen/grant-table: max_grant_frames
reached cur=4 extra=1 limit=4 gnttab_free_count=0 req_entries=3
[6.014351] xen:grant_table: xen/grant-table: max_grant_frames
reached cur=4 extra=1 limit=4 gnttab_free_count=0 req_entries=8
[6.033683] xen:grant_table: xen/grant-table: max_grant_frames
reached cur=4 extra=1 limit=4 gnttab_free_count=0 req_entries=3
[6.055013] xen:grant_table: xen/grant-table: max_grant_frames
reached cur=4 extra=1 limit=4 gnttab_free_count=0 req_entries=3
[6.055729] xen:grant_table: xen/grant-table: max_grant_frames
reached cur=4 extra=1 limit=4 gnttab_free_count=0 req_entries=26
[6.060256] xen:grant_table: xen/grant-table: max_grant_frames
reached cur=4 extra=1 limit=4 gnttab_free_count=0 req_entries=3
[6.077000] xen:grant_table: xen/grant-table: max_grant_frames
reached cur=4 extra=1 limit=4 gnttab_free_count=0 req_entries=3
[6.109760] xen:grant_table: xen/grant-table: max_grant_frames
reached cur=4 extra=1 limit=4 gnttab_free_count=0 req_entries=3
[6.138126] xen:grant_table: xen/grant-table: max_grant_frames
reached cur=4 extra=1 limit=4 gnttab_free_count=0 req_entries=3
[6.148626] xen:grant_table: xen/grant-table: max_grant_frames
reached cur=4 extra=1 limit=4 gnttab_free_count=0 req_entries=3

Yay. Better info for the users!
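For anyone collecting test reports, a small Python sketch like the following can pull the numbers out of a captured serial console log. The field names are taken from the messages above; the function name and the idea of summarising the output this way are assumptions for illustration only.

```python
import re

# Matches one 'max_grant_frames reached' message, even when the mail
# client wrapped it onto two lines (any whitespace is accepted).
GRANT_RE = re.compile(
    r"max_grant_frames\s+reached\s+cur=(\d+)\s+extra=(\d+)\s+limit=(\d+)"
    r"\s+gnttab_free_count=(\d+)\s+req_entries=(\d+)"
)

FIELDS = ("cur", "extra", "limit", "gnttab_free_count", "req_entries")


def parse_grant_warnings(console_text):
    """Return one dict of integer fields per warning found in the text."""
    return [
        dict(zip(FIELDS, (int(value) for value in match.groups())))
        for match in GRANT_RE.finditer(console_text)
    ]
```

Running this over the output above yields entries with cur=4 and limit=4, which confirms that the guest hit the limit configured in the domU config file.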

Also, there's a patch in review that can improve the situation:

https://lists.xenproject.org/archives/html/xen-devel/2019-11/msg01607.html

The biggest annoyance in our Xen 4.11 now is that the default value of
the gnttab_max_frames hypervisor command line option was raised from 32
to 64 a while ago, but the toolstack overrides this again with a default
of 32. The patch attempts to fix that.

Hans

Bug#924360: xen-hypervisor-4.11-amd64 HVM Boot failure: "ERR: Bootloader shutdown EFI x64 boot services!" - also on stable

2019-08-09 Thread Hans van Kranenburg

Hi all (reporters on 924360, 901599),

On 8/6/19 5:43 PM, Gerald Wodni wrote:
> 
> I would like to confirm this bug in stable, as I have exactly the same
> issue (dom0 works/xen hangs/error message) since upgrading from stretch
> to buster.

Thanks for your report(s). Sorry for letting you wait without a reply
for some time.

Unfortunately booting Xen/dom0 with EFI is not something that is very
well tested in Debian. One of the reasons for this is simply that none
of the package maintainers is using EFI.

For these kinds of cases, we rely on users who encounter the problem and
who have the ability/skills/etc to help debug the problem.

I suspect the problem is caused by some intricacies concerning
interaction between grub, xen, etc. There are some other reports on the
upstream xen-users mailing list about this, but to be honest I have no
idea if those are related. The problems might or might not be specific
to Debian, I don't know.

I'm available to facilitate the process, for example by creating new
packages with a specific patch to test, but unfortunately I don't have
spare hardware and time to try to reproduce the problems myself and dig
deep into it.

Thanks,
Hans van Kranenburg

Bug#932759: [Pkg-xen-devel] Bug#932759: After upgrade from stretch to buster, removal of obsolete xen 4.8 packages seems to trigger shutdown of xenconsoled

2019-08-01 Thread Hans van Kranenburg

On 7/23/19 4:07 PM, niek wrote:
> [...]
> 2019-07-21 07:38:40 status installed xen-utils-4.8:amd64
> 4.8.5+shim4.10.2+xsa282-1+deb9u11
> 2019-07-21 07:38:40 remove xen-utils-4.8:amd64
> 4.8.5+shim4.10.2+xsa282-1+deb9u11 
> 2019-07-21 07:38:40 status half-configured xen-utils-4.8:amd64
> 4.8.5+shim4.10.2+xsa282-1+deb9u11
> 2019-07-21 07:38:41 status half-installed xen-utils-4.8:amd64
> 4.8.5+shim4.10.2+xsa282-1+deb9u11
> 2019-07-21 07:38:41 status config-files xen-utils-4.8:amd64
> 4.8.5+shim4.10.2+xsa282-1+deb9u11
> 2019-07-21 07:38:41 status not-installed xen-utils-4.8:amd64 
> 2019-07-21 07:38:41 status installed libxen-4.8:amd64
> 4.8.5+shim4.10.2+xsa282-1+deb9u11
> 2019-07-21 07:38:41 remove libxen-4.8:amd64
> 4.8.5+shim4.10.2+xsa282-1+deb9u11 
> 2019-07-21 07:38:41 status half-configured libxen-4.8:amd64
> 4.8.5+shim4.10.2+xsa282-1+deb9u11
> 2019-07-21 07:38:41 status half-installed libxen-4.8:amd64
> 4.8.5+shim4.10.2+xsa282-1+deb9u11
> 2019-07-21 07:38:41 status config-files libxen-4.8:amd64
> 4.8.5+shim4.10.2+xsa282-1+deb9u11
> 2019-07-21 07:38:41 status not-installed libxen-4.8:amd64 
> [...]
> 2019-07-21 07:38:42 status installed xen-hypervisor-4.8-amd64:amd64
> 4.8.5+shim4.10.2+xsa282-1+deb9u11
> 2019-07-21 07:38:42 remove xen-hypervisor-4.8-amd64:amd64
> 4.8.5+shim4.10.2+xsa282-1+deb9u11 
> 2019-07-21 07:38:42 status half-configured
> xen-hypervisor-4.8-amd64:amd64 4.8.5+shim4.10.2+xsa282-1+deb9u11
> 2019-07-21 07:38:42 status half-installed xen-hypervisor-4.8-amd64:amd64
> 4.8.5+shim4.10.2+xsa282-1+deb9u11
> 2019-07-21 07:39:41 status config-files xen-hypervisor-4.8-amd64:amd64
> 4.8.5+shim4.10.2+xsa282-1+deb9u11

Ok, so, the most interesting question for me is...

On line 50 in the init script:

https://salsa.debian.org/xen-team/debian-xen/blob/master/debian/xen-utils-common.xen.init

    case $DPKG_MAINTSCRIPT_PACKAGE in
    xen-utils-$VERSION) ;;        # xen-utils-V maintscript, under Xen X=V
    xen-utils-*)        exit 0;;  # xen-utils-V maintscript, but under Xen X!=V
    *)                  ;;        # maybe not under dpkg, etc.
    esac

What is the value of this $DPKG_MAINTSCRIPT_PACKAGE when it happens?

Could it be something other than a value beginning with xen-utils-?
I have a suspicion that the "systemd[1]: Reloading." message has
something to do with it. Or the triggers?

Anyway, if DPKG_MAINTSCRIPT_PACKAGE gets lost *anywhere* in whatever
happens, it might end up empty, and then it only matches the * branch.
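The suspected fall-through can be modelled outside the init script. The sketch below mimics the three branches of the case statement in Python, using fnmatch for the glob pattern. It is a simulation for reasoning about the logic, not the script itself; the function name and the return strings are made up for this example.

```python
from fnmatch import fnmatchcase


def init_script_action(maintscript_package, version):
    """Mimic the case statement: which branch does a given value take?"""
    # Branch 1: maintscript of the xen-utils matching the running Xen.
    if maintscript_package == "xen-utils-" + version:
        return "continue"
    # Branch 2: maintscript of some other xen-utils-V: do nothing.
    if fnmatchcase(maintscript_package, "xen-utils-*"):
        return "exit 0"
    # Branch 3: anything else, including an empty or unset variable.
    return "continue"
```

With version "4.11", the value "xen-utils-4.8" gives "exit 0" as intended, but an empty string gives "continue", so the rest of the script would run even though an old package is being removed. The same happens for any package name that does not start with xen-utils-.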

But, we really need to find out how to reproduce it in a test environment. :|

Hans

Bug#932759: [Pkg-xen-devel] Bug#932759: After upgrade from stretch to buster, removal of obsolete xen 4.8 packages seems to trigger shutdown of xenconsoled

2019-07-22 Thread Hans van Kranenburg

Hi niek,

Thanks for the report!

On 7/22/19 8:32 PM, niek wrote:
> Package:  xen-hypervisor-4.11-amd64
> Version: 4.11.1+92-g6c33308a8d-2
> 
> What happened:
> - upgraded Debian Xen Dom0 from stretch to buster and rebooted, as
> described in
> https://www.debian.org/releases/buster/amd64/release-notes/ch-upgrading.en.html
> 
> - started some Linux pv domu without problems
> 
> - removed obsolete packages with 'apt autoremove'. This removed (among
> others)
> xen-hypervisor-4.8-amd64:amd64 (4.8.5+shim4.10.2+xsa282-1+deb9u11),
> libxen-4.8:amd64 (4.8.5+shim4.10.2+xsa282-1+deb9u11),
> xen-utils-4.8:amd64 (4.8.5+shim4.10.2+xsa282-1+deb9u11)
> 
> [...]
> - xenconsoled was not running
> 
> - searching system logs revealed that xenconsoled seemed to have stopped
> when 'apt autoremove' removed the obsolete xen 4.8
> packages after upgrading to xen 4.11.

Well, there it is again. We tried to make a fix, exactly for this...

https://salsa.debian.org/xen-team/debian-xen/commit/ef242a700765a971a6afc12d25ee19944dd3a27a

...and apparently there's another scenario in which even this doesn't work?

Can you show the lines from /var/log/dpkg.log from that moment, the
seconds around 07:38:40? It tells exactly what got removed, in what
second, just to confirm?
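If it helps, the relevant lines can be extracted with a few lines of Python. This sketch assumes the usual dpkg.log layout of a date, a time and then the action text on each line; the function name and the five-second default window are choices made for this example.

```python
from datetime import datetime, timedelta


def entries_around(log_lines, moment, seconds=5):
    """Return (timestamp, text) pairs logged within +/- seconds of moment."""
    window = timedelta(seconds=seconds)
    hits = []
    for line in log_lines:
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue  # too short to be a log entry
        try:
            stamp = datetime.strptime(
                parts[0] + " " + parts[1], "%Y-%m-%d %H:%M:%S"
            )
        except ValueError:
            continue  # line does not start with a timestamp
        if abs(stamp - moment) <= window:
            hits.append((stamp, parts[2].rstrip()))
    return hits
```

Feeding it the lines of /var/log/dpkg.log together with a datetime for the moment in question gives exactly the removals and status changes that happened in those seconds.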

I'm pretty sure I tried to reproduce this after we added the fix I just
referenced, and I was unable to. So, I'm very interested in finding out
what's still going on here.

Usually, being able to reproduce a problem is one of the biggest steps
towards finding a solution, since the failure can then be triggered over
and over again while narrowing down what exactly causes it. So, finding
the right sequence of steps to make it happen again is crucial here.

Do you think the systemd reload has anything to do with it? Maybe the
whole systemd init-script-wrapper-trickery is misbehaving in some way?

Can you reproduce this by manually grabbing the
xen-hypervisor-4.8-amd64, libxen-4.8 and xen-utils-4.8 from stretch
again, installing them and removing them again? Do you have any other idea?

Thanks,
Hans

Bug#880554: [Pkg-xen-devel] Bug#880554: #880554: max grant frames problem

2019-07-17 Thread Hans van Kranenburg

Hi,

On 10/23/18 7:34 PM, Ian Jackson wrote:
> Control: retitle -1 max grant frames problem (domu freeze with 
> linux-image-4.9.0-4-amd64)
> Control: severity -1 important
> Control: reassign -1 src:xen 4.8.3+xsa267+shim4.10.1+xsa267-1+deb9u9

My last comment in this BTS bug was about:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=29d11cfd8698038b87458ba4d1329b9da81150a5

...which has been in Linux since 4.13-rc2, and buster has 4.19+.

Is there anyone who wants to try to reproduce the max grant frames
problem on buster with Xen 4.11 and Linux 4.19 dom0/domU?

The 'xen/grant-table: max_grant_frames reached' should show up on the
serial console. I'd like to see a test report of it actually happening.

No further adjustments/fixes will go into the Stretch Xen packages at
this stage.

Having better documentation about how to set hypervisor and guest
options to deal with all of this is still a TODO. I would really like to
get some people together to start cleaning out the whole Xen related
wiki section for Debian, and actually provide some helpful content,
including FAQ stuff like max grants, PVH, PVH+grub etc...

Whoever wants to participate in that, just reply with a Yay!

Doing documentation work might seem boring, but it's write once, read
many all the way.

Hans

Bug#932085: grub-common: Grub can't load initrd for Xen after upgrade to Buster

2019-07-17 Thread Hans van Kranenburg

On 7/14/19 11:43 PM, Colin Watson wrote:
> On Sun, Jul 14, 2019 at 01:27:23PM -0700, Slava Kryvel wrote:
>> After upgrade from Debian 9.9 to Debian 10 I have got unbootable system.
>>
>> I'm using Xen hypervisor, which was also upgraded from 4.8 to 4.11
>> during OS upgrade.
>> UEFI is enabled.
>>
>> After upgrade was finished, I was unable to boot again to Xen kernel.
>> But normal Debian kernel was still bootable.
> 
> [...]
> 
> I'm CCing a few folks who've contributed to GRUB's Xen support in one
> way or another in the recent past; hopefully at least one of them can
> help here?

Just to be transparent here, not all possible functionality is tested by
the package maintainers (currently Ian and me) before throwing a new
package into Debian. This is simply not practically feasible for us. [0]

We rely on the upstream tests to know that the upstream Xen code will
probably work. For Debian specific things, we do test our own use cases,
but e.g. UEFI is not one of them. For this, we rely on active users to
report problems and help solving them. So, yes, things like this can happen.

Thanks for reporting this. The next step would be to follow Roger's
instructions, and provide config dumps, serial console output etc...

We're certainly available to include changes / etc to fix things, given
proper information / testing reports from the user. But, the user has to
actively help to make that happen.

Hans van Kranenburg (with Debian Xen team hat on)

[0]
https://alioth-lists.debian.net/pipermail/pkg-xen-devel/2018-October/007438.html

Bug#930797: unblock: xen/4.11.1+92-g6c33308a8d-1

2019-06-22 Thread Hans van Kranenburg

Control: tags -1 - moreinfo

Hi Paul,

On 6/21/19 10:02 PM, Paul Gevers wrote:
> Control: tags -1 moreinfo
> 
> Hi Hans,
> 
> On 20-06-2019 21:14, Hans van Kranenburg wrote:
>>   * Note that the fixes for XSA-297 will only have effect when also loading
>> updated cpu microcode with MD_CLEAR functionality. When using the
>> intel-microcode package to include microcode in the dom0 initrd, it
>> has to
>> be loaded by Xen. Please refer to the hypervisor command line
>> documentation about the 'ucode=scan' option.
> 
> I asked this question recently for another unblock report (not by you)
> as well, but don't you think this is worth mentioning in NEWS? So that
> people that use apt-listchanges are warned about this?

Yes, it surely is. I realized the same thing, but only after the upload
was done.

What do you think about the following (also added as attachment):

https://salsa.debian.org/xen-team/debian-xen/commit/ce3646253ebb7d4834a83a8ee813d7bef9b7ffe2

I'm building it now to see if everything ends up in the right place in
the resulting packages.

Thanks,
Hans
commit ce3646253ebb7d4834a83a8ee813d7bef9b7ffe2 (HEAD -> knorrie/4.11, origin/knorrie/4.11)
Author: Hans van Kranenburg 
Date:   Sat Jun 22 11:45:34 2019 +0200

Update to 4.11.1+92-g6c33308a8d-2 with MDS documentation

Following up feedback from the release team, add a NEWS file mentioning
the MDS mitigations with some instructions, so that it will be more
visible to people using apt-listchanges.

Mention the ucode option in our default documented set of "usually used
options", so that users doing a new install will get a hint about the
existence of this option, and what it does.

diff --git a/debian/NEWS b/debian/NEWS
new file mode 100644
index 00..e32955a161
--- /dev/null
+++ b/debian/NEWS
@@ -0,0 +1,20 @@
+xen (4.11.1+92-g6c33308a8d-1) unstable; urgency=high
+
+This update contains the mitigations for the Microarchitectural Data
+Sampling speculative side channel attacks. Only Intel based processors are
+affected.
+
+Note that these fixes will only have effect when also loading updated cpu
+microcode with MD_CLEAR functionality. When using the intel-microcode
+package to include microcode in the dom0 initrd, it has to be loaded by
+Xen. Please refer to the hypervisor command line documentation about the
+'ucode=scan' option.
+
+For the fixes to be fully effective, it is currently also needed to disable
+hyper-threading, which can be done in BIOS settings, or by using smt=no on
+the hypervisor command line.
+
+Additional information is available in the upstream Xen security advisory:
+https://xenbits.xen.org/xsa/advisory-297.html
+
+ -- Hans van Kranenburg   Tue, 18 Jun 2019 09:50:19 +0200
diff --git a/debian/changelog b/debian/changelog
index 9c64ee1326..4d2fc62b5b 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,11 @@
+xen (4.11.1+92-g6c33308a8d-2) unstable; urgency=high
+
+  * Mention MDS and the need for updated microcode and disabling
+hyper-threading in NEWS.
+  * Mention the ucode=scan option in the grub.d/xen documentation.
+
+ -- Hans van Kranenburg   Sat, 22 Jun 2019 11:15:08 +0200
+
 xen (4.11.1+92-g6c33308a8d-1) unstable; urgency=high
 
   * Update to new upstream version 4.11.1+92-g6c33308a8d, which also
diff --git a/debian/tree/xen-hypervisor-common/etc/default/grub.d/xen.cfg b/debian/tree/xen-hypervisor-common/etc/default/grub.d/xen.cfg
index e3853c33ca..900c12df5d 100644
--- a/debian/tree/xen-hypervisor-common/etc/default/grub.d/xen.cfg
+++ b/debian/tree/xen-hypervisor-common/etc/default/grub.d/xen.cfg
@@ -44,6 +44,11 @@ echo "Including Xen overrides from /etc/default/grub.d/xen.cfg"
 #   Do not automatically reboot after an error. This is useful for catching
 #   debug output.
 #
+# ucode=scan (only for x86)
+#   Scan the multiboot images mentioned in grub configuration for an cpio image
+#   that contains cpu microcode. This enables loading microcode that is stored
+#   in the dom0 initrd.img.
+#
 # Please also refer to the "Xen Hypervisor Command Line Options"
 # documentation for the version of Xen you have installed. This
 # documentation can be found at https://xenbits.xen.org/

Bug#930797: unblock: xen/4.11.1+92-g6c33308a8d-1

2019-06-20 Thread Hans van Kranenburg

Package: release.debian.org
User: release.debian.org@packages.debian.org
Usertags: unblock
Severity: normal

Please unblock package src:xen

Hi release team,

Yesterday we uploaded a security update for Xen. This update also
contains the mitigations for Microarchitectural Data Sampling.

The upstream source is forwarded from commit 87f51bf366 to commit
6c33308a8d:
https://xenbits.xen.org/gitweb/?p=xen.git;a=shortlog;hp=87f51bf366;h=6c33308a8d

There are no further packaging changes (except for the changelog, of
course):

 >8 

xen (4.11.1+92-g6c33308a8d-1) unstable; urgency=high

  * Update to new upstream version 4.11.1+92-g6c33308a8d, which also
    contains the following security fixes:
    - Fix: grant table transfer issues on large hosts
      XSA-284 (no CVE yet) (Closes: #929991)
    - Fix: race with pass-through device hotplug
      XSA-285 (no CVE yet) (Closes: #929998)
    - Fix: x86: steal_page violates page_struct access discipline
      XSA-287 (no CVE yet) (Closes: #930001)
    - Fix: x86: Inconsistent PV IOMMU discipline
      XSA-288 (no CVE yet) (Closes: #929994)
    - Fix: missing preemption in x86 PV page table unvalidation
      XSA-290 (no CVE yet) (Closes: #929996)
    - Fix: x86/PV: page type reference counting issue with failed IOMMU
      update
      XSA-291 (no CVE yet) (Closes: #929995)
    - Fix: x86: insufficient TLB flushing when using PCID
      XSA-292 (no CVE yet) (Closes: #929993)
    - Fix: x86: PV kernel context switch corruption
      XSA-293 (no CVE yet) (Closes: #92)
    - Fix: x86 shadow: Insufficient TLB flushing when using PCID
      XSA-294 (no CVE yet) (Closes: #929992)
    - Fix: Microarchitectural Data Sampling speculative side channel
      XSA-297 CVE-2018-12126 CVE-2018-12127 CVE-2018-12130 CVE-2019-11091
      (Closes: #929129)
  * Note that the fixes for XSA-297 will only have effect when also loading
    updated cpu microcode with MD_CLEAR functionality. When using the
    intel-microcode package to include microcode in the dom0 initrd, it has to
    be loaded by Xen. Please refer to the hypervisor command line
    documentation about the 'ucode=scan' option.
  * Fixes for XSA-295 "Unlimited Arm Atomics Operations" will be added in the
    next upload.

 -- Hans van Kranenburg   Tue, 18 Jun 2019 09:50:19 +0200

 >8 

We prefer to keep releasing from the upstream stable release branches,
because:

(i) upstream only puts bugfixes and security fixes on their stable
branches;
(ii) trying to assemble our own subset of the patches is riskier than
taking upstream's collection;
(iii) the upstream stable release branch has undergone extensive
testing, which we cannot repeat in Debian.

The binary packages built from src:xen are:

libxencall1
libxencall1-dbgsym
libxen-dev
libxendevicemodel1
libxendevicemodel1-dbgsym
libxenevtchn1
libxenevtchn1-dbgsym
libxenforeignmemory1
libxenforeignmemory1-dbgsym
libxengnttab1
libxengnttab1-dbgsym
libxenmisc4.11
libxenmisc4.11-dbgsym
libxenstore3.0
libxenstore3.0-dbgsym
libxentoolcore1
libxentoolcore1-dbgsym
libxentoollog1
libxentoollog1-dbgsym
xen-doc
xen-hypervisor-4.11-amd64
xen-hypervisor-common
xenstore-utils
xenstore-utils-dbgsym
xen-system-amd64
xen-utils-4.11
xen-utils-4.11-dbgsym
xen-utils-common
xen-utils-common-dbgsym

The source debdiff is attached for sake of completeness.

Please unblock.

Thanks a lot,
Hans van Kranenburg
Debian Xen Team



debdiff_xen_4.11.1+26-g87f51bf366-3_xen_4.11.1+92-g6c33308a8d-1.txt.gz
Description: application/gzip

Bug#929129: [Pkg-xen-devel] Bug#929129: closed by Hans van Kranenburg (Bug#929129: fixed in xen 4.11.1+92-g6c33308a8d-1)

2019-06-19 Thread Hans van Kranenburg

On 6/19/19 4:43 PM, Wiebe Cazemier wrote:
> This is an update to the unstable release. What is one running Debian
> stable (9), with Xen Hypervisor 4.8, to do?

This is not meant as a middle finger to users of stable.

All of the bug numbers will be closed twice, also by the 4.8 upload,
which also has to mention them. This is confusing, however the automated
behaviour after uploading any of them is to close the bug with that report.

At least 4.11 is out now. The last thing I heard about 4.8 was that
there are issues compiling the current 4.8-stable upstream branch in
Stretch, and that's quite an important prerequisite for continuing. :|
Ian needs to work on that.

I will see if I can manipulate them a bit. All the other ones mentioned
in the changelog should also have the info that it's found in current
version in stable attached to them, so that the version graph shows both.

Hans
