Bug#1067151: xen-utils-common: vif-openvswitch ignores MTU
Hi Aleksi,

Thanks for the report. I actually ran into the same situation recently, wanting to set up a PPPoE connection from within a Xen domU, also using openvswitch as bridge.

On 19/03/2024 12:21, Aleksi Suhonen wrote:
> Package: src:xen
> Version: 4.17.3+10-g091466ba55-1~deb12u1
> Severity: wishlist
>
> I wasn't sure if this script comes from Debian or Xen or somewhere else,
> so I thought it safest to report it here.

These scripts/vif-* files are located in tools/hotplug/Linux in the Xen source tree, and we ship them as-is in the Debian package. So, yes, changes to them should first go upstream. However, it's perfectly fine to have a discussion here, so we can figure out what the right changes should be.

> /etc/xen/scripts/vif-bridge handles MTU settings in the vif, but the
> otherwise similar /etc/xen/scripts/vif-openvswitch does not. I added it
> in, here's the diff-c and the full fixed file is also attached.
>
> *** vif-openvswitch.orig 2024-03-19 11:53:13.0 +0200
> --- vif-openvswitch 2024-03-19 11:56:17.0 +0200
> ***
> *** 89,94
> --- 89,95
>   add|online)
>   check_tools
>   setup_virtual_bridge_port $dev
> + set_mtu "$bridge" "$dev" "$type_if"
>   add_to_openvswitch $dev
>   ;;

Ah, interesting. I had some difficulties getting it to work back then. But, when putting the set_mtu line back like this, it also gives me the desired outcome now!

My use case is about setting up a PPPoE connection from a Xen domU over vlan 6. I want an mtu of 1500 for the traffic inside the PPPoE connection, so I need mtu 1508 for the connection between the PPPoE client in the domU -> openvswitch in the dom0 -> physical interface -> switchports -> ISP NTU device.

For some reason I had trouble getting the vifX.Y interface, as seen inside dom0, set to mtu 1508. Either it seemed not to have any effect (using ip link set mtu dev), or openvswitch kept resetting it back to 1500 all the time. When I would use ovs-vsctl set interface mtu_request= instead, it actually stuck. That's what I remember.
I just did some more testing, and I cannot really reproduce that situation... :| I can also just use ip link in the dom0 now.

Interesting, but good, since it would mean that we can indeed just (re)use that set_mtu function! :)

I'm still curious what the problem was when I tried earlier... Maybe someone else reading this knows more?

Are you familiar with the process of sending patches upstream? Otherwise we (Debian Xen team) can assist with that.

Regards,
Hans
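
For readers following along, here is a minimal sketch of the MTU arithmetic from the use case above. It assumes the usual PPPoE overhead of 8 bytes (a 6-byte PPPoE header plus a 2-byte PPP protocol field); the function names and the hop names are made up for illustration and are not part of any Xen or openvswitch tooling.

```python
# PPPoE puts 8 bytes in front of each payload: a 6-byte PPPoE header
# plus a 2-byte PPP protocol identifier.
PPPOE_OVERHEAD = 6 + 2


def required_link_mtu(inner_mtu: int) -> int:
    """MTU every Ethernet hop needs so the PPPoE session can carry inner_mtu."""
    return inner_mtu + PPPOE_OVERHEAD


def too_small_hops(inner_mtu: int, hop_mtus: dict) -> list:
    """Return the names of hops whose MTU is below the requirement."""
    needed = required_link_mtu(inner_mtu)
    return [name for name, mtu in hop_mtus.items() if mtu < needed]


if __name__ == "__main__":
    # Hypothetical path: the dom0 side of the vif is still at the default 1500.
    hops = {"domU eth0": 1508, "dom0 vifX.Y": 1500, "dom0 physical": 1508}
    print(required_link_mtu(1500))
    print(too_small_hops(1500, hops))
```

A single hop that stays at 1500 is enough to break the 1500-byte inner MTU, which is why the vifX.Y interface in dom0 matters here.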
Bug#1063270: The "64bits time_t transition" in Debian/Xen
Hi,

On 2/12/24 18:43, Andrew Cooper wrote:
> On 12/02/2024 5:27 pm, zithro wrote:
>> Hey all,
>>
>> the Debian project is focused on the "2038 time_t" switch.
>> So the maintainers of the Debian Xen package must ensure that all
>> imported Xen code conforms to the new Debian standards.
>>
>> I was asked by Andrew Cooper to post here about this, I'll quote him :
>> "So I had been idly wondering whether Xen would match up to Debian's new
>> policy, and it appears not
>> this topic really needs to be brought up on the xen-devel mailing list
>> do you have any more details as to what has gone wrong?
>> this is something we ought to arrange to happen in CI by default
>> but it sounds like there's some work needed first"
>>
>> (Not answering the question because I'm just a messenger).
>
> xen.git/xen$ git grep -w time_t -- :/
> ../tools/console/client/main.c:106: time_t start, now;
> ../tools/console/daemon/io.c:272: time_t now = time(NULL);
> ../tools/libs/light/libxl_qmp.c:116: time_t timeout;
> ../tools/libs/light/libxl_qmp.c:585: time_t ask_timeout)
> ../tools/libs/light/libxl_x86.c:516: time_t t;
> ../tools/libs/toollog/xtl_logger_stdio.c:61: time_t now = time(0);
> ../tools/tests/xenstore/test-xenstore.c:453: time_t stop;
> ../tools/xenmon/xenbaked.c:98:time_t start_time;
> ../tools/xenstored/core.c:109: time_t now;
> ../tools/xenstored/core.h:150: time_t ta_start_time;
> ../tools/xenstored/domain.c:143: time_t mem_last_msg;
> ../tools/xenstored/domain.c:188:static time_t wrl_log_last_warning; /* 0: no previous warning */
> ../tools/xenstored/domain.c:1584: time_t now;
> ../tools/xenstored/lu.c:160: time_t now = time(NULL);
> ../tools/xenstored/lu.c:185: time_t now = time(NULL);
> ../tools/xenstored/lu.c:292: time_t now = time(NULL);
> ../tools/xenstored/lu.h:32: time_t started_at;
> ../tools/xentop/xentop.c:947: time_t curt;
> ../tools/xl/xl_info.c:742:static char *current_time_to_string(time_t now)
> ../tools/xl/xl_info.c:759:static void print_dom0_uptime(int short_mode, time_t now)
> ../tools/xl/xl_info.c:810:static void print_domU_uptime(uint32_t domuid, int short_mode, time_t now)
> ../tools/xl/xl_info.c:847: time_t now;
> ../tools/xl/xl_vmcontrol.c:336: time_t start;
> ../tools/xl/xl_vmcontrol.c:495: time_t now;
> ../tools/xl/xl_vmcontrol.c:504: if (now == ((time_t) -1)) {
> ../tools/xs-clients/xenstore_control.c:33: time_t time_start;
> arch/x86/cpu/mcheck/mce.h:224: uint64_t time; /* wall time_t when error was detected */
> arch/x86/time.c:1129: * machines were long is 32-bit! (However, as time_t is signed, we
>
>
> I don't see any ABI problems from using a 64bit time_t. The only header
> file with a time_t is xenstored/lu.h which is a private header and not a
> public ABI.
>
> I guess we fell into the "could not be analysed via
> abi-compliance-checker" case?

Thanks for also looking into this!

Maximilian mentioned in #debian-xen that doing a Debian package build with DEB_BUILD_OPTIONS=abi=+lfs and _FILE_OFFSET_BITS=64 and _TIME_BITS=64 resulted in the exact same binaries for shared libs.

What we also found is these reports:

1. Enabling lfs, which has no effect:
https://adrien.dcln.fr/misc/armhf-time_t/2024-02-06T16%3A48%3A00/compat_reports/libxen-dev/base_to_lfs/compat_report.html

2. Enabling the 64-bit time_t as well:
https://adrien.dcln.fr/misc/armhf-time_t/2024-02-06T16%3A48%3A00/compat_reports/libxen-dev/lfs_to_time_t/compat_report.html

In there, see "Problems with Data Types, Low Severity 2" about struct_timeval:

>8
[+] struct timeval

   Change -> Effect
 1 Type of field tv_sec has been changed from __time_t to __time64_t.
   -> Recompilation of a client program may be broken.
 2 Type of field tv_usec has been changed from __suseconds_t to __suseconds64_t.
   -> Recompilation of a client program may be broken.

[+] affected symbols: 3 (0.2%)
 * libxl_osevent_afterpoll ( libxl_ctx* ctx, int nfds, struct pollfd const* fds, struct timeval now )
   -> 4th parameter 'now' is of type 'struct timeval'.
 * libxl_osevent_beforepoll ( libxl_ctx* ctx, int* nfds_io, struct pollfd* fds, int* timeout_upd, struct timeval now )
   -> 5th parameter 'now' is of type 'struct timeval'.
 * libxl_osevent_register_hooks ( libxl_ctx* ctx, libxl_osevent_hooks const* hooks, void* user )
   -> Field 'hooks.timeout_modify.p2' in 2nd parameter 'hooks' (pointer) has base type 'struct timeval'.
>8

So, the question is: is this correct, and would it cause a problem?

If so, it also means that those functions are in a versioned lib, libxenlight.so.4.17.0 (in binary package libxenmisc4.17).

Coincidentally, we are currently preparing the upload to switch from Xen 4.17 to Xen 4.18 in Debian unstable. So, if we just go ahead with doing that, and make sure it's built in the new way already... then... tada.wav! We just immediately have the correct
Bug#1053246: Security support ended for Xen 4.14 in Bullseye
Package: debian-security-support
Version: 1:11+2023.05.04
Severity: normal

Hi,

Upstream security support for Xen 4.14 has ended recently. This also means that security support for Xen in Debian Bullseye has ended.

The complexity of the software involved does not really allow for anyone other than the upstream developers, with a deep understanding of the inner workings of the hypervisor code, to apply/backport new patches.

For security-support-ended.deb11, this could be a line like:

xen 4.14.6-1 2023-09-21 https://xenbits.xen.org/docs/4.14-testing/SUPPORT.html#release-support

Note: This 4.14.6-1 package version is not visible for bullseye in the archive yet. It was submitted for the bullseye point release, and has just been accepted into it:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1053177

Thanks,
Hans
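
As an illustration of the proposed entry, the sketch below splits such a line into its parts. The field layout (package, last supported version, end date, free-form info) is only inferred from the single example line above, and the names `EndedEntry` and `parse_ended_line` are hypothetical; this is not how debian-security-support itself reads the file.

```python
from datetime import date
from typing import NamedTuple


class EndedEntry(NamedTuple):
    package: str
    last_version: str
    end_date: date
    info: str


def parse_ended_line(line: str) -> EndedEntry:
    """Split one whitespace-separated entry; the 4th field keeps the rest of the line."""
    package, version, day, info = line.split(None, 3)
    return EndedEntry(package, version, date.fromisoformat(day), info.strip())


if __name__ == "__main__":
    example = ("xen 4.14.6-1 2023-09-21 "
               "https://xenbits.xen.org/docs/4.14-testing/SUPPORT.html#release-support")
    print(parse_ended_line(example))
```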
Bug#1053177: bullseye-pu: package xen/4.14.6-1
Hi Adam,

On 9/28/23 19:09, Adam D. Barratt wrote:
> On Thu, 2023-09-28 at 18:27 +0200, Hans van Kranenburg wrote:
>> Xen 4.14 support (and security support) has ended upstream. The upstream
>> stable branch for version 4.14 is frozen now, and a final maintenance
>> release version 4.14.6 has been released. We'd like to put this final
>> update into Bullseye, to properly finish the Xen work for Bullseye.
>> Also, a few security fixes (regarding CVE-2023-20593 CVE-2023-20569
>> CVE-2022-40982) are included.
>>
>> https://xenbits.xen.org/docs/4.14-testing/SUPPORT.html#release-support
>>
>
> --- xen-4.14.5+94-ge49571868d/automation/scripts/qemu-smoke-x86-64.sh 2023-03-21 13:07:44.0 +0100
> +++ xen-4.14.6/automation/scripts/qemu-smoke-x86-64.sh 2023-08-07 14:11:14.0 +0200
> @@ -5,11 +5,6 @@
> # variant should be either pv or pvh
> variant=$1
>
> -# Install QEMU
> -export DEBIAN_FRONTENT=noninteractive
> -apt-get -qy update
> -apt-get -qy install qemu-system-x86
>
> I realise this is an upstream change, but is it really intended to stop
> installing QEMU in a QEMU smoke test?

This particular change can be seen as the contents of the following commit, in this case for 4.14:

8<
commit 98ec8ad2eeb96eb9d4b7f9bfd1ef3a994c63af17
Refs: RELEASE-4.14.5-103-g98ec8ad2eeb9
Author:     Michal Orzel
AuthorDate: Wed Apr 26 09:29:45 2023 +0200
Commit:     Jan Beulich
CommitDate: Wed Apr 26 09:29:45 2023 +0200

    automation: Remove installation of packages from test scripts

    Now, when these packages are already installed in the respective
    containers, we can remove them from the test scripts.

    Signed-off-by: Michal Orzel
    Reviewed-by: Stefano Stabellini
    master commit: 72cfe1c3ad1fae95f4f0ac51dbdd6838264fdd7f
    master date: 2022-12-09 14:55:33 -0800
>8

This is part of a change to the upstream test machinery.
The commit that it's picked from (the 72cfe1c3ad1 thing) follows, in lockstep, a previous change to the development / master branch:

8<
commit 1ed7da301020ee1e16177cb3d9caa817f195a59a
Author: Michal Orzel
Date:   Thu Nov 17 17:16:42 2022 +0100

    automation: Install packages required by tests in containers

    Installation of additional packages from the test scripts when running
    the tests has some drawbacks. It is slower than cloning containers and
    can fail due to some network issues (apparently it often happens on
    x86 rackspace). This patch is adding the packages required by the tests
    to be installed when building the containers.

    From qemu-alpine-x86_64.sh into debian:stretch:
     - cpio,
     - busybox-static.

    From qemu-smoke-*-{arm,arm64}.sh into debian:unstable-arm64v8:
     - u-boot-qemu,
     - u-boot-tools,
     - device-tree-compiler,
     - curl,
     - cpio,
     - busybox-static.

    The follow-up patch will remove installation of these packages from
    the test scripts. This is done in order not to break the CI in-between.

    Signed-off-by: Michal Orzel
    Reviewed-by: Stefano Stabellini
>8

The Xen Project OSSTest machinery is used to run testing for the current development version of Xen, as well as for the stable branch lines that are still under active support. After building/compiling the source code, all kinds of test scenarios are executed, comprising tests for different virtualization modes, or different kinds of functionality, but also different kinds of actual hardware.

AIUI, wanting to be able to do all of this quickly boils down to a 'feet in the mud' situation, which involves automating interaction with PDUs to be able to physically cut off power from a misbehaving piece of server hardware, or capturing actual serial console cable output. I can understand that, at least for practical reasons, there is no desire to duplicate/replicate all of this for each supported Xen version.

AIUI, the Xen source tree contains code/scripts to help set up the test cases, as well as to be able to run them.
For the first part, the current development code is used (the master branch), and for the second part, well, whatever is in a branch line needs to be able to behave correctly in that environment.

This is the reason why we can find the change with title "automation: Install packages required by tests in containers" only once, committed to the master branch at the time the change took place, and why similar but possibly different variations on "automation: Remove installation of packages from test scripts" do exist in various other branches, such as stable-4.17 and stable-4.14 etc.

Also, note that for Debian, we don't do anything with this part of the upstream source tree, or, at least, I mean, changes in there must not cause changes in the actual debs that we ship.

Thanks for the question, it was a fun small exe
Bug#1051862: (Debian) Bug#1051862: server flooded with xen_mc_flush warnings with xen 4.17 + linux 6.1
Hi Radoslav,

Thanks for your report...

Hi Juergen, Boris and xen-devel,

At Debian, we got the report below. (Also at https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1051862)

This hardware, with only Xen and Dom0 running, is hitting the failed multicall warning and logging in arch/x86/xen/multicalls.c. Can you help advise what we can do to further debug this issue?

Since this looks like pretty low level Xen/hardware stuff, I'd rather ask upstream for directions first. If needed the Debian Xen Team can assist the end user with the debugging process.

Thanks,

More reply inline...

On 9/13/23 20:12, Radoslav Bodó wrote:
> Package: xen-system-amd64
> Version: 4.17.1+2-gb773c48e36-1
> Severity: important
>
> Hello,
>
> after upgrade from Bullseye to Bookworm one of our dom0's
> became unusable due to logs/system being continuously flooded
> with warnings from arch/x86/xen/multicalls.c:102 xen_mc_flush, and the
> system become unusable.
>
> The issue starts at some point where system services starts to come up,
> but nothing very special is on that box (dom0, nftables, fail2ban,
> prometheus-node-exporter, 3x domU). We have tried to disable all domU's
> and fail2ban as the name of the process would suggest, but issue is
> still present. We have tried also some other elaboration but none of
> them have helped so far:
>
> * the issue arise when xen 4.17 + linux >= 6.1 is booted
> * xen + bookworm-backports linux-image-6.4.0-0.deb12.2-amd64 have same isuue
> * without xen hypervisor, linux 6.1 runs just fine
> * systemrescue cd boot and xfs_repair rootfs did not helped
> * memtest seem to be fine running for hours

Thanks for already trying out all these combinations.

> As a workaround we have booted xen 4.17 + linux 5.10.0-25 (5.10.191-1)
> and the system is running fine as for last few months.
>
> Hardware:
> * Dell PowerEdge R750xs
> * 2x Intel Xeon Silver 4310 2.1G
> * 256GB RAM
> * PERC H755 Adapter, 12x 18TB HDDs

I have a few quick additional questions already:

1. For clarification: from your text, I understand that only this one single server is showing the problem after the Debian version upgrade. Does this mean that this is the only server you have running with exactly this combination of hardware (and BIOS version, CPU microcode etc etc)? Or, is there another one with the same hardware which does not show the problem?

2. Can you reply with the output of 'xl dmesg' when the problem happens? Or, if the system gets unusable too quickly, do you have a serial console connection to capture the output?

3. To confirm... I understand that there are many of these messages. Since you pasted only one, does that mean that all of them look exactly the same, with "1 of 1 multicall(s) failed: cpu 10" "call 1: op=1 arg=[a1a9eb10] result=-22"? Or are there variations? If so, can you reply with a few different ones?

Since this very much looks like an issue in Xen related code, where the Xen hypervisor, dom0 kernel and hardware have to work together correctly (and not a Debian packaging problem), I'm already asking upstream for advice about what we should/could do next, instead of trying to make a guess myself.
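
To help compare many of these messages, a small sketch like the following could pull the fields out of the "call N: op=... arg=[...] result=..." part and translate a negative result into an errno name. It assumes that the result field is a negated Linux-style errno value (22 is EINVAL on Linux); the regular expression and the name `decode_call` are made up for this example and are based only on the single message quoted in question 3.

```python
import errno
import re

# Matches e.g. "call 1: op=1 arg=[a1a9eb10] result=-22"
CALL_RE = re.compile(r"call\s+(\d+): op=(\d+) arg=\[([0-9a-f]+)\] result=(-?\d+)")


def decode_call(line: str):
    """Return the fields of one failed-multicall line, or None if it does not match."""
    match = CALL_RE.search(line)
    if match is None:
        return None
    index, op, arg, result = match.groups()
    result = int(result)
    # Assumption: a negative result is a negated errno value.
    name = errno.errorcode.get(-result, "unknown") if result < 0 else "ok"
    return {"index": int(index), "op": int(op), "arg": arg,
            "result": result, "errno_name": name}


if __name__ == "__main__":
    print(decode_call("call 1: op=1 arg=[a1a9eb10] result=-22"))
```

Counting the distinct (op, result) pairs over a whole log would quickly answer whether all messages are the same or whether there are variations.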
Thanks,
Hans

> Any help, advice or bug confirmation would be appreciated
>
> Best regards
> bodik
>
>
> (log also in attachment)
>
> ```
> kernel: [ 99.762402] WARNING: CPU: 10 PID: 1301 at arch/x86/xen/multicalls.c:102 xen_mc_flush+0x196/0x220
> kernel: [ 99.762598] Modules linked in: nvme_fabrics nvme_core bridge
> xen_acpi_processor xen_gntdev stp llc xen_evtchn xenfs xen_privcmd
> binfmt_misc intel_rapl_msr ext4 intel_rapl_common crc16
> intel_uncore_frequency_common mbcache ipmi_ssif jbd2 nfit libnvdimm
> ghash_clmulni_intel sha512_ssse3 sha512_generic aesni_intel acpi_ipmi
> nft_ct crypto_simd cryptd mei_me mgag200 ipmi_si iTCO_wdt intel_pmc_bxt
> ipmi_devintf drm_shmem_helper dell_smbios nft_masq iTCO_vendor_support
> isst_if_mbox_pci drm_kms_helper isst_if_mmio dcdbas mei intel_vsec
> isst_if_common dell_wmi_descriptor wmi_bmof watchdog pcspkr
> intel_pch_thermal ipmi_msghandler i2c_algo_bit acpi_power_meter button
> nft_nat joydev evdev sg nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6
> nf_defrag_ipv4 nf_tables nfnetlink drm fuse loop efi_pstore configfs
> ip_tables x_tables autofs4 xfs libcrc32c crc32c_generic hid_generic
> usbhid hid dm_mod sd_mod t10_pi crc64_rocksoft crc64 crc_t10dif
> crct10dif_generic ahci libahci xhci_pci libata xhci_hcd
> kernel: [ 99.762633] megaraid_sas tg3 crct10dif_pclmul
> crct10dif_common crc32_pclmul crc32c_intel bnxt_en usbcore scsi_mod
> i2c_i801 libphy i2c_smbus usb_common scsi_common wmi
> kernel: [ 99.764765] CPU: 10 PID: 1301 Comm: python3 Tainted: G W 6.1.0-12-amd64 #1 Debian 6.1.52-1
> kernel: [ 99.764989] Hardware name: Dell Inc. PowerEdge R750xs/0441XG, BIOS 1.8.2 09/14/2022
> kernel: [ 99.765214] RIP: e030:xen_mc_flush+0x196/0x220
> kernel: [ 99.765436] Code: e2 06 48 01 da 85 c0 0f
Bug#1042842: network interface names wrong in domU (>10 interfaces)
Hi,

On 8/8/23 15:22, Valentin Kleibel wrote:
>> On [0], you can read "In both cases the device naming is subject to the
>> usual guest or backend domain facilities for renaming network devices".
>> It says "naming/renaming", but you can assume "detecting".
>>
>>> I also checked which net_ids udev knows about and the only things that
>>> pop up are:
>>> ID_NET_NAMING_SCHEME=v247
>>> ID_NET_NAME_MAC=enx00163efd832b
>>> ID_OUI_FROM_DATABASE=Xensource, Inc.

What I do is stuff like this:

-$ cat /etc/udev/rules.d/70-persistent-vifname.rules
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{nodename}=="device/vif/0", NAME="vlan2"
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{nodename}=="device/vif/1", NAME="vlan3"
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{nodename}=="device/vif/2", NAME="vlan4"
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{nodename}=="device/vif/3", NAME="vlan6"
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{nodename}=="device/vif/4", NAME="vlan9"
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{nodename}=="device/vif/5", NAME="vlan10"
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{nodename}=="device/vif/6", NAME="vlan11"
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{nodename}=="device/vif/7", NAME="vlan12"
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{nodename}=="device/vif/8", NAME="vlan13"
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{nodename}=="device/vif/9", NAME="vlan14"
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{nodename}=="device/vif/10", NAME="vlan15"
SUBSYSTEM=="net", DRIVERS=="?*", ATTRS{nodename}=="device/vif/11", NAME="vlan16"

The vif/X always matches the order in which you define the interfaces inside the guest config file.

After starting to build router VMs (well before the whole interface naming madness was a thing), it took only the first time we wanted to throw away a vlan to realize that all the ethX numbers after it would shift by one. From then on, I've always been using this to set my own style of predictable names (whenever there's more than one interface, otherwise it's just eth0).

>> Is it from dom0 or domU ?
>> Are you using "net.ifnames=0" on the domU kernel command line ?
>> "v247" looks like systemd "predictive naming scheme" (eth -> enX).
>> From bookworm on, domUs vifs get named enXN (enX0, enX1, ...).
>> Read on :
>> https://www.debian.org/releases/stable/i386/release-notes/ch-information.en.html#xen-network
>
> This is from the domU, running bullseye with a bookworm dom0.
>
>> See how ethN interfaces get messed up, like in your setup, but
>> predictable names would work, as you can see in "altname enXN" :
>> eth1 (:01) -> enX1
>> eth2 (:10) -> enX10
>> eth3 (:02) -> enX2

But yeah, so, even while not depending on whatever order it gets initialized, and still having it function correctly, this is still just pretty annoying...

If I'm doing stuff around here, and just quickly want to look up things (e.g. messing around with vlan15 settings), and quickly type ip a instead of having to spend more time typing ip a show dev vlan15 jadijadi, I still every time get this short "WTF huh, argh", raises arms, does table flip, grmbl grbml feeling for a split second.

2: vlan2:

> I could not get our bullseye domU to show the "predictable names" even
> though i tried installing the bullseye-backports kernel 6.1.
> After you wrote this i installed udev 252.5 from backports and it now
> uses the correct enXn interface names, even with kernel 5.10.
>
>> So, my answer does not tell you if something changed in Xen itself, only
>> in Debian.
>> But I guess it relates to what Xen devs told us : vifs detection order
>> cannot be relied upon, that's why "predictable names" were invented.
>> The vif detection part is related to the domains kernels, not Xen itself
>> (at least that's what I understood).
>>
>> Using eth0 nowadays is a bit like using /dev/sda for hard drives, it's
>> considered legacy as it may create problems in some setups, like yours
>> (ie. for disks, it's recommended to use UUIDs or /dev/disk/by-*).
>>
>> I hope this answers your question.
>
> Thank you, yes it does.
>
> In our case the dom0 was updated to bookworm while the domU is still
> running bullseye.
> -> updated Xen so the vif detection order changed (which we relied on)

I didn't read the other mailthread on the xen list fully yet. But, I think it shouldn't be very hard to find the code changes and see if it's deterministic and can just be fixed. Simply just to decrease the totally unnecessary amount of silliness.

> -> the predictable network names for Xen don't work with bullseye
>
> So my new resolution for bullseye domUs on a bookworm dom0 is to install
> udev from backports and change the domUs network config to use the new
> enXn naming scheme instead of ethn.

Or the "device/vif/X" way...

So, anyway, did someone already do some test "just because we can" to see how many network interfaces you can get added for fun, and if the pattern keeps looking the same, also with enX4 enX40 .. enX49 enX5 etc? :D

enX1 enX10 enX100 .. enX109 enX11 enX110 argh o_O

Have fun,
Hans
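
The two things discussed in this message can be sketched in a few lines: generating "device/vif/X" udev rules from the order of the interfaces in the guest config, and why a plain string sort puts enX10 before enX2. This is only an illustration; the helper names (`udev_rule`, `natural_key`) are invented here, and the rule text simply copies the format of the rules file shown above.

```python
import re


def udev_rule(vif_index: int, name: str) -> str:
    """One rule in the same format as the 70-persistent-vifname.rules example."""
    return ('SUBSYSTEM=="net", DRIVERS=="?*", '
            f'ATTRS{{nodename}}=="device/vif/{vif_index}", NAME="{name}"')


def natural_key(name: str) -> list:
    """Sort key that compares digit runs as numbers, so enX2 sorts before enX10."""
    return [int(part) if part.isdigit() else part
            for part in re.split(r"(\d+)", name)]


if __name__ == "__main__":
    # Names in the order the interfaces appear in the guest config file.
    for index, name in enumerate(["vlan2", "vlan3", "vlan4"]):
        print(udev_rule(index, name))

    names = ["enX0", "enX1", "enX2", "enX10", "enX11"]
    print(sorted(names))                   # plain string comparison
    print(sorted(names, key=natural_key))  # numeric-aware comparison
```

Because the rule matches on the vif index and not on the detection order, removing or reordering entries only requires regenerating the file from the guest config.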
Bug#1027456: gcc-10: gcc segfaults> 'tree-optimization/99824' patch is a fix
Control: tags -1 + fixed-upstream confirmed patch

Hi all,

I also ran into this issue while trying to build src:linux 6.1.7-1 targeting bullseye-backports.

I can confirm that I was able to build the kernel packages successfully using gcc-10/10.2.1-6, with only the following patch on top:

https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;h=ee15832c53d52656e562c29110f2be1cfb66c450

ee15832c53 "tree-optimization/99824 - avoid excessive integer type precision in VN"

So, in order to be able to do the next 'official' bullseye-backports for src:linux I guess we first need this fix for gcc-10 to go into bullseye via a stable point release?

Thanks,
Hans (Knorrie)

From ee15832c53d52656e562c29110f2be1cfb66c450 Mon Sep 17 00:00:00 2001
From: Richard Biener
Date: Tue, 30 Mar 2021 11:22:52 +0200
Subject: [PATCH] tree-optimization/99824 - avoid excessive integer type precision in VN

VN sometimes builds new integer types to handle accesss where precision of the access type does not match the access size. The way ao_ref_init_from_vn_reference is computing the access size ignores the access type in case the ref operands have an outermost COMPONENT_REF which, in case it is an array for example, can be way larger than the access size. This can cause us to try building an integer type with precision larger than WIDE_INT_MAX_PRECISION eventually leading to memory corruption.

The following adjusts ao_ref_init_from_vn_reference to only lower access sizes via the outermost COMPONENT_REF but otherwise honor the access size as specified by the access type.

It also places an assert in integer type building that we remain in the limits of WIDE_INT_MAX_PRECISION. I chose the shared code where we set TYPE_MIN/MAX_VALUE because that will immediately cross the wide_ints capacity otherwise.

2021-03-30 Richard Biener

PR tree-optimization/99824
* stor-layout.c (set_min_and_max_values_for_integral_type): Assert the precision is within the bounds of WIDE_INT_MAX_PRECISION.
* tree-ssa-sccvn.c (ao_ref_init_from_vn_reference): Use the outermost component ref only to lower the access size and initialize that from the access type.

* gcc.dg/torture/pr99824.c: New testcase.
---
 gcc/stor-layout.c | 2 ++
 gcc/testsuite/gcc.dg/torture/pr99824.c | 33 ++
 gcc/tree-ssa-sccvn.c | 24 +++
 3 files changed, 49 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr99824.c

diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index bde6fa22b58a..57c8a2516d95 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -2816,6 +2816,8 @@ set_min_and_max_values_for_integral_type (tree type,
   if (precision < 1)
     return;
 
+  gcc_assert (precision <= WIDE_INT_MAX_PRECISION);
+
   TYPE_MIN_VALUE (type)
     = wide_int_to_tree (type, wi::min_value (precision, sgn));
   TYPE_MAX_VALUE (type)
diff --git a/gcc/testsuite/gcc.dg/torture/pr99824.c b/gcc/testsuite/gcc.dg/torture/pr99824.c
new file mode 100644
index ..9022d4a4b8e7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr99824.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+
+unsigned int
+strlenx(char *s)
+{
+  char *orig_s = s;
+  for (; *s; ++s)
+    ;
+  return s - orig_s;
+}
+
+struct i2c_adapter {
+    char name[48];
+};
+
+struct {
+    int instance;
+    struct i2c_adapter i2c_adap[];
+} * init_cx18_i2c_cx;
+
+const struct i2c_adapter cx18_i2c_adap_template = {""};
+int init_cx18_i2c___trans_tmp_1;
+
+void
+init_cx18_i2c()
+{
+  int i = 0;
+  for (;; i++) {
+    init_cx18_i2c_cx->i2c_adap[i] = cx18_i2c_adap_template;
+    init_cx18_i2c___trans_tmp_1
+      = strlenx(init_cx18_i2c_cx->i2c_adap[i].name);
+  }
+}
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index 4b280f21006e..926b4a976aec 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -996,22 +996,26 @@ ao_ref_init_from_vn_reference (ao_ref *ref,
   poly_offset_int size = -1;
   tree size_tree = NULL_TREE;
 
-  /* First get the final access size from just the outermost expression. */
+  machine_mode mode = TYPE_MODE (type);
+  if (mode == BLKmode)
+    size_tree = TYPE_SIZE (type);
+  else
+    size = GET_MODE_BITSIZE (mode);
+  if (size_tree != NULL_TREE
+      && poly_int_tree_p (size_tree))
+    size = wi::to_poly_offset (size_tree);
+
+  /* Lower the final access size from the outermost expression. */
   op = &ops[0];
+  size_tree = NULL_TREE;
   if (op->opcode == COMPONENT_REF)
     size_tree = DECL_SIZE (op->op0);
   else if (op->opcode == BIT_FIELD_REF)
     size_tree = op->op0;
-  else
-    {
-      machine_mode mode = TYPE_MODE (type);
-      if (mode == BLKmode)
-        size_tree = TYPE_SIZE (type);
-      else
-        size = GET_MODE_BITSIZE (mode);
-    }
   if (size_tree != NULL_TREE
-      && poly_int_tree_p (size_tree))
+      && poly_int_tree_p (size_tree)
+      && (!known_size_p (size)
+          || known_lt (wi::to_poly_offset (size_tree), size)))
     size = wi::to_poly_offset (size_tree);
Bug#1028251: New Patch (Was: Re: Bug#1028251: xen: FTBFS when building xen binary packages for sid on x86_64)
Hi,

On 1/13/23 22:45, Chuck Zmudzinski wrote:
> On 1/13/23 7:39 AM, Marek Marczykowski-Górecki wrote:
>> On Fri, Jan 13, 2023 at 12:58:29AM -0500, Chuck Zmudzinski wrote:
>>> On 1/11/2023 10:58 PM, Chuck Zmudzinski wrote:
>>>> On 1/9/23 12:55 PM, Hans van Kranenburg wrote:
>>>>> Hi!

[...] Yolo style cutting out lines here... [...]

>>>
>>> Regarding the systemd files causing ftbfs, this explains it:
>>>
>>> https://salsa.debian.org/xen-team/debian-xen/-/blob/master/m4/systemd.m4#L119
>>>
>>> and this:
>>>
>>> https://salsa.debian.org/xen-team/debian-xen/-/blob/master/tools/configure.ac#L480
>>>
>>> The comments indicate that using AX_AVAILABLE_SYSTEMD() will
>>> by default enable systemd if systemd development files are on the
>>> build system, and AX_ALLOW_SYSTEMD() means --enable-systemd
>>> must explicitly be passed to tools/configure to enable it. Upstream
>>> uses the former, so build systems with systemd development files
>>> by default will ftbfs because that produces missing files that dh_missing
>>> in debian/rules does not like.
>>>
>>> So the reason there is ftbfs on my system is that my system has
>>> the systemd development package installed.
>>
>> By the way, maybe a better fix would be to pass --enable-systemd, add
>> libsystemd-dev build-dep and list them in the package? They might
>> require patching to support Debian-specific upgrade machinery, though...
>>
>> Not installing xendriverdomain.service is one of things missing for
>> driver domains support
>> (https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=922033).
>>
>
> Hi Marek,
>
> I wouldn't be against fixing it that way. In fact, I would prefer
> that Debian packaged Xen with full support for native systemd units.
> I am willing to wait until if/when the package maintainers have
> full systemd support in the Xen packages.
>
> Perhaps this is an opportunity for you to try to fix 922033 again.
> I see it has been sitting there for a few years now. Let's see
> what Hans thinks.
Yeah, well, so, the thing here is...

When Debian started to package Xen (thanks! Bastian, in 200X), the upstream init scripts were copy pasted, and adjusted to have the ability to have different Hypervisor-ABI-incompatible versions installed at the same time. Also, this is related to the collection of Makefile patches we carry around to have ABI-incompatible stuff end up in a directory like /usr/lib/xen-4.14/ and /usr/lib/xen-4.17/ !

What does this mean? Well, in the most basic sense it means that you could apt-get (dist-)upgrade and then still be able to xl shutdown a domU afterwards before doing reboot, because it will choose the right tools which match the ABI of the *now* running hypervisor, instead of being left with a dumpster fire, which in the end causes you to shout curse words and to have to go to the machine and hold the power button for 5 seconds to force power it off.

The same applies when you upgrade from Xen 4.14 to Xen 4.17 during the upgrade from Debian 11/Bullseye to Debian 12/Bookworm. If booting the whole new thing is a huge failure, it will allow you to reset the computer, and in grub, choose to use the previous Xen (and possibly do that in combination with the previous Debian linux kernel), and then have a system where you at least can start your domUs again *) and first have a good rest, a night of sleep, before starting to dig into what's going wrong.

So, this is exactly the same way of doing stuff as how you can also reboot back into the previous (ABI-compatible) Linux kernel during a system upgrade, even if you're not using Xen at all!

I like this very much. This is the kind of thing that helps admins of systems that have just local disks and a few domUs. Like, the case where you support some non-profit organization with their server stuff running on donated hardware. (Yes, I also do some of those, I do!)
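
The idea of "choose the tools which match the now running hypervisor" can be sketched roughly as below. This is a hypothetical illustration only, not the actual Debian wrapper code: the function name `tools_dir_for` is invented, and it assumes that the tools live in directories named /usr/lib/xen-<major>.<minor> as in the examples above, and that the running hypervisor version is already known as a string.

```python
from pathlib import PurePosixPath
from typing import Optional


def tools_dir_for(running_version: str, installed: list) -> Optional[PurePosixPath]:
    """Pick the installed tools directory matching the running hypervisor.

    running_version is something like "4.14.6"; only major.minor is used,
    since that is what the directory names in the examples above carry.
    """
    major_minor = ".".join(running_version.split(".")[:2])
    wanted = f"xen-{major_minor}"
    for path in installed:
        candidate = PurePosixPath(path)
        if candidate.name == wanted:
            return candidate
    return None  # no matching tools installed for this hypervisor


if __name__ == "__main__":
    dirs = ["/usr/lib/xen-4.14/", "/usr/lib/xen-4.17/"]
    # Right after the dist-upgrade, the old hypervisor is still running:
    print(tools_dir_for("4.14.6", dirs))
    # After rebooting into the new one:
    print(tools_dir_for("4.17.0", dirs))
```

The point of the design is that the lookup depends on what is running, not on what was installed most recently.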
And, in case something does fail (there could always be something like a misbehaving mpt3sas card in the hardware or anything that no one else spotted yet), the admin does not have to end up in total panic mode after doing the upgrade on a Friday afternoon lying upside down inside a broom closet. They can at least recover from the situation and have something that's running again, and then return at another planned moment, a day or a week later, to fix it, after asking around.

Upstream Xen stuff doesn't have anything like that. But, they actually look at us, and they think, ooh, this is actually nice, we should have that also by default.

The fact that we have these changed/altered/divergent init scripts in Debian is the main reason that we cannot just enable
Bug#1028251: [Pkg-xen-devel] Bug#1028251: xen: FTBFS when building xen binary packages for sid on x86_64
Hi! On 09/01/2023 18:44, Chuck Zmudzinski wrote: > Control: tag -1 + moreinfo > > thanks > > On 1/9/23 8:09 AM, Hans van Kranenburg wrote: >> Hi Chuck, >> >> On 1/8/23 23:18, Chuck Zmudzinski wrote: >>> [...] >>> >>> The build failed: >>> >>>debian/rules override_dh_missing >>> make[1]: Entering directory '/home/chuckz/sources-sid/xen/xen-4.17.0' >>> dh_missing --list-missing >>> dh_missing: warning: usr/lib/modules-load.d/xen.conf exists in debian/tmp >>> but is not installed to anywhere >>> dh_missing: warning: usr/lib/systemd/system/proc-xen.mount exists in >>> debian/tmp but is not installed to anywhere >>> dh_missing: warning: usr/lib/systemd/system/xen-init-dom0.service exists in >>> debian/tmp but is not installed to anywhere >>> dh_missing: warning: >>> usr/lib/systemd/system/xen-qemu-dom0-disk-backend.service exists in >>> debian/tmp but is not installed to anywhere >>> dh_missing: warning: usr/lib/systemd/system/xen-watchdog.service exists in >>> debian/tmp but is not installed to anywhere >>> dh_missing: warning: usr/lib/systemd/system/xenconsoled.service exists in >>> debian/tmp but is not installed to anywhere >>> dh_missing: warning: usr/lib/systemd/system/xendomains.service exists in >>> debian/tmp but is not installed to anywhere >>> dh_missing: warning: usr/lib/systemd/system/xendriverdomain.service exists >>> in debian/tmp but is not installed to anywhere >>> dh_missing: warning: usr/lib/systemd/system/xenstored.service exists in >>> debian/tmp but is not installed to anywhere >> >> I cannot reproduce this error here locally and the CI build also succeeds: >> >> https://salsa.debian.org/xen-team/debian-xen/-/pipelines/481577 > > I thought I had a fairly clean sid install, but I think the problem > on my system could be caused by some obscure grandfathered in > setting because the sid I am using was updated from all the way back to > an original install of jessie many years ago... 
> > It might be time for me to refresh my sid with a clean installation. > > Out of curiosity and if you have time, can you answer a couple of > question if you know the answer? > > 1. Do the builds on a clean environment produce the missing files > listed in my build? No, after my local package build, there's no such things in there: ~/build/xen/debian-xen/debian/tmp/usr/lib m (master) 1-$ ll total 0 drwxr-xr-x 1 knorrie knorrie 110 Jan 8 23:51 debug drwxr-xr-x 1 knorrie knorrie 2048 Jan 8 23:50 x86_64-linux-gnu drwxr-xr-x 1 knorrie knorrie 20 Jan 8 23:51 xen-4.17 > > 2. Are those systemd service files installed anywhere in the xen > binary packages, either in arch=x86_64 packages or for the arch=all > packages such as xen-utils-common? No, they are not: https://packages.debian.org/search?searchon=contents=xenconsoled.service=path=unstable=any > If you don't know the answer to these questions I will investigate > myself to find the answers, so you can work on more important things. > >> >> How are you building the packages? In a clean build environment, using >> for example sbuild or pbuilder, or in an environment where unrelated >> other build dependencies could be present, that are not included in the >> xen list, but maybe 'wake up and do something' if they're present? > > As I said, I am building on a sid install that might have some > stuff grandfathered in from old releases going back to jessie. > I also might have some stale stuff around from my private builds > of the traditional device model available from xen that is not > part of the Debian packages. I will investigate these possible causes. > > I use debuild as a frontend to dpkg-buildpackage to build the packages. Yes. So (I'm not entirely sure how it works, but as example, just making something up here): After doing something else first, you might end up with a system that has for example dh-systemd-yolo-all-the-things-helper installed. 
And, it might be that only it being present means that the package build process changes. It might even be a 'feature' of that helper... "just add it to your build depends, and it will automatically do all the things for you!!!~``1" This is why it is very much recommended to build the packages using something like sbuild, so that you can be sure that every time it will start with a super minimal chroot which only has some essential things, and that the only build dependencies used will be the ones that are explicitly defined in the debian/control of the package. >> You can also compare your own build output with the full one from the CI >> job: >> >> https://salsa.debian.org/xen-team/debian-xen/-/jobs/3767564/raw > > I will take a look at that when I get a chance. > > This is not a real high priority for me, so I am content to let this > be until I get a chance to investigate the quirks of my current > installation of sid, and I also added the moreinfo tag, so you can > ignore this bug if you wish until I do some further research. Sure, no problem. Have fun, Hans
Bug#1028251: [Pkg-xen-devel] Bug#1028251: xen: FTBFS when building xen binary packages for sid on x86_64
Hi Chuck, On 1/8/23 23:18, Chuck Zmudzinski wrote: > [...] > > The build failed: > >debian/rules override_dh_missing > make[1]: Entering directory '/home/chuckz/sources-sid/xen/xen-4.17.0' > dh_missing --list-missing > dh_missing: warning: usr/lib/modules-load.d/xen.conf exists in debian/tmp but > is not installed to anywhere > dh_missing: warning: usr/lib/systemd/system/proc-xen.mount exists in > debian/tmp but is not installed to anywhere > dh_missing: warning: usr/lib/systemd/system/xen-init-dom0.service exists in > debian/tmp but is not installed to anywhere > dh_missing: warning: > usr/lib/systemd/system/xen-qemu-dom0-disk-backend.service exists in > debian/tmp but is not installed to anywhere > dh_missing: warning: usr/lib/systemd/system/xen-watchdog.service exists in > debian/tmp but is not installed to anywhere > dh_missing: warning: usr/lib/systemd/system/xenconsoled.service exists in > debian/tmp but is not installed to anywhere > dh_missing: warning: usr/lib/systemd/system/xendomains.service exists in > debian/tmp but is not installed to anywhere > dh_missing: warning: usr/lib/systemd/system/xendriverdomain.service exists in > debian/tmp but is not installed to anywhere > dh_missing: warning: usr/lib/systemd/system/xenstored.service exists in > debian/tmp but is not installed to anywhere I cannot reproduce this error here locally and the CI build also succeeds: https://salsa.debian.org/xen-team/debian-xen/-/pipelines/481577 How are you building the packages? In a clean build environment, using for example sbuild or pbuilder, or in an environment where unrelated other build dependencies could be present, that are not included in the xen list, but maybe 'wake up and do something' if they're present? You can also compare your own build output with the full one from the CI job: https://salsa.debian.org/xen-team/debian-xen/-/jobs/3767564/raw Hans
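For comparing a local build log with the CI one, something along these lines might help. It is only a sketch: the function names are made up, and it assumes that each dh_missing warning sits on a single line in the log, in the same wording as quoted above.

```shell
#!/bin/sh
# Sketch only: compare dh_missing warnings between two build logs.
# Assumption: every warning is on one line and looks like
#   dh_missing: warning: <path> exists in debian/tmp but is not installed to anywhere
# The function names are hypothetical.

# Print the sorted, unique paths that dh_missing complained about.
missing_paths() {
    sed -n 's/.*dh_missing: warning: \([^ ]*\) exists in debian\/tmp.*/\1/p' "$1" | sort -u
}

# Print the paths reported in the first log but not in the second.
only_in_first() {
    a=$(mktemp)
    b=$(mktemp)
    missing_paths "$1" > "$a"
    missing_paths "$2" > "$b"
    comm -23 "$a" "$b"
    rm -f "$a" "$b"
}
```

Calling only_in_first with the local log and a downloaded copy of the CI log then lists exactly the files that only the local environment produces, which is a good starting point for finding the package that generates them.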
Bug#1021668: xen: CVE-2022-33749 CVE-2022-33748 CVE-2022-33747 CVE-2022-33746
Hi :) On 04/11/2022 22:51, Salvatore Bonaccorso wrote: > Hi Hans > > On Fri, Nov 04, 2022 at 02:59:29PM +0100, Hans van Kranenburg wrote: >> Aha! >> >> On 02/11/2022 21:53, Salvatore Bonaccorso wrote: >>> Hi, >>> >>> On Wed, Nov 02, 2022 at 08:02:26PM +0100, Hans van Kranenburg wrote: >>>> Hi, >>>> >>>> On 10/19/22 21:55, Moritz Muehlenhoff wrote: >>>>>>> For the latest set of Xen issues my estimate is that we can postpone >>>>>>> them until the next batch, they seem all of moderate/limited impact. >>>>>>> But let me know if you think otherwise. >>>>>> >>>>>> I agree. Let's do them together with the new stuff that's planned for >>>>>> Nov 1st, https://xenbits.xen.org/xsa/ >>>>> >>>>> Ack, I've updated the Security Tracker. >>>> >>>> I'm having a look at this now, and while writing the changelog entry, I >>>> run into the following thing: >>>> >>>> XSA-403 has 4 CVE numbers. AFAIUI the first two are about the fixes done >>>> to Linux, and the other two are about changes to Xen. Shouldn't the >>>> Debian security tracker reflect that? >>>> >>>> CVE-2022-26365 CVE-2022-33740 -> src:linux only ? >>>> CVE-2022-33741 CVE-2022-33742 -> src:xen only ? >>> >>> Speaking for src:linux I do not think we need to change the tracking: >>> >>> CVE-2022-26365: 2f446ffe9d73 ("xen/blkfront: fix leaking data in shared >>> pages") >>> CVE-2022-33740: 307c8de2b023 ("xen/netfront: fix leaking data in shared >>> pages") >>> CVE-2022-33741: 4491001c2e0f ("xen/netfront: force data bouncing when >>> backend is untrusted") >>> CVE-2022-33742: 2400617da7ee ("xen/blkfront: force data bouncing when >>> backend is untrusted") >> >> Riiight. Thanks, now I get why I cannot find any CVE number related to >> XSA-403 listed in the Xen upstream changes (at least for 4.14 which I'm >> working on now). They're all over there at the Linux side. 
> > It looks that there are still changes needed on the xen side, at least > that is my understanding reading through > https://xenbits.xen.org/xsa/advisory-403.html > Quoting the advisory: > > | For the stable branches (Xen 4.16.x - Xen 4.13.x) patch 1 introduces > support to > | libxl for libxl_{disk,nic}_backend_untrusted environment variable to be > used in > | order to set whether disk and network backends should be trusted. Patch 2 > | reverts patch 1 and instead provides the more fine grained per-device > options > | that break the libxl ABI. > | > | Note that applying patch 2 to any of the stable releases will require a > rebuild > | of any consumers of the libxl library, as it introduces an ABI breakage and > | hence won't be applied to the official repository stable branches. Users of > | stable releases wanting to use the functionality provided by patch 2 will > need > | to apply it manually. > > This is the reason that in fact for those four CVEs, weh ave marked > for bullseye: > > [bullseye] - xen (Too intrusive too backport) > > The "signaling of whether a frontend should consider a backend as > potentially malicious can be done **from either the Linux kernel > command line or the toolstack.**" (highlighting is added by me). > > So IMHO it is similarly correct to track src:xen under those CVEs, and > they are marked as fixed with 4.16.2-1. *But* for bullseye, they can > be ignored due to above reasons. Yes, so the Xen part is about the "reporting whether the backend is to be trusted". That 'patch 1', the all-or-nothing option to signal the guest kernel is now included with this update. But neither that change, nor the more fine-grained patch 2 is directly linked to a CVE number. That change on itself will not fix anything for any of the 4 CVE numbers. Also, for 4.16 the story is the same, by the way. 
It's only in 4.17, which is to be released in the upcoming week, that the otherwise libxl ABI-breaking changes are fully included, but even that doesn't change anything for the CVE administration. After all, it is a bit of a moot point for us. The only scenario in which all of this is relevant is when using a 'driver domain' to delegate the blk/net backend part to another untrusted guest domain. Using this functionality is not properly enabled/supported out of the box in our package builds for Debian. Sometimes these XSAs are like a little scavenger hunt. Hans
Bug#1021668: xen: CVE-2022-33749 CVE-2022-33748 CVE-2022-33747 CVE-2022-33746
Aha! On 02/11/2022 21:53, Salvatore Bonaccorso wrote: > Hi, > > On Wed, Nov 02, 2022 at 08:02:26PM +0100, Hans van Kranenburg wrote: >> Hi, >> >> On 10/19/22 21:55, Moritz Muehlenhoff wrote: >>>>> For the latest set of Xen issues my estimate is that we can postpone >>>>> them until the next batch, they seem all of moderate/limited impact. >>>>> But let me know if you think otherwise. >>>> >>>> I agree. Let's do them together with the new stuff that's planned for >>>> Nov 1st, https://xenbits.xen.org/xsa/ >>> >>> Ack, I've updated the Security Tracker. >> >> I'm having a look at this now, and while writing the changelog entry, I >> run into the following thing: >> >> XSA-403 has 4 CVE numbers. AFAIUI the first two are about the fixes done >> to Linux, and the other two are about changes to Xen. Shouldn't the >> Debian security tracker reflect that? >> >> CVE-2022-26365 CVE-2022-33740 -> src:linux only ? >> CVE-2022-33741 CVE-2022-33742 -> src:xen only ? > > Speaking for src:linux I do not think we need to change the tracking: > > CVE-2022-26365: 2f446ffe9d73 ("xen/blkfront: fix leaking data in shared > pages") > CVE-2022-33740: 307c8de2b023 ("xen/netfront: fix leaking data in shared > pages") > CVE-2022-33741: 4491001c2e0f ("xen/netfront: force data bouncing when backend > is untrusted") > CVE-2022-33742: 2400617da7ee ("xen/blkfront: force data bouncing when backend > is untrusted") Riiight. Thanks, now I get why I cannot find any CVE number related to XSA-403 listed in the Xen upstream changes (at least for 4.14 which I'm working on now). They're all over there at the Linux side. Hans
Bug#1021668: [Pkg-xen-devel] Bug#1021668: xen: CVE-2022-33749 CVE-2022-33748 CVE-2022-33747 CVE-2022-33746
Hi, On 10/19/22 21:55, Moritz Muehlenhoff wrote: >>> For the latest set of Xen issues my estimate is that we can postpone >>> them until the next batch, they seem all of moderate/limited impact. >>> But let me know if you think otherwise. >> >> I agree. Let's do them together with the new stuff that's planned for >> Nov 1st, https://xenbits.xen.org/xsa/ > > Ack, I've updated the Security Tracker. I'm having a look at this now, and while writing the changelog entry, I run into the following thing: XSA-403 has 4 CVE numbers. AFAIUI the first two are about the fixes done to Linux, and the other two are about changes to Xen. Shouldn't the Debian security tracker reflect that? CVE-2022-26365 CVE-2022-33740 -> src:linux only ? CVE-2022-33741 CVE-2022-33742 -> src:xen only ? And for XSA-403, at first upstream was unsure about what to do for older Xen versions where the patches would be an ABI breaker. In the end, they did apply the more coarse-grained patch to at least offer some kind of mitigation in case a user wants to use it. So, the changelog line I'm including now will just be: - Linux disk/nic frontends data leaks XSA-403 CVE-2022-33741 CVE-2022-33742 HTH, Hans
Bug#1021668: [Pkg-xen-devel] Bug#1021668: xen: CVE-2022-33749 CVE-2022-33748 CVE-2022-33747 CVE-2022-33746
Hi, On 18/10/2022 22:31, Moritz Muehlenhoff wrote: > On Tue, Oct 18, 2022 at 02:17:32PM +0200, Hans van Kranenburg wrote: >> Does explicitly opening a BTS bug mean that, like we use to call it, >> "these CVEs warrant a DSA", > > No, in general we aim to file bugs for any open CVEs regardless of > the DSA state. This allows people to see that an issue is known > (and some maintainers might also not have noticed in time). Ok! >> and that it is a request for an ASAP package >> update and preparing a security update for stable, or, is this a new >> thing where BTS bugs are opened for packages, just in case the >> maintainer did not already track security issues themselves actively? > > For the latest set of Xen issues my estimate is that we can postpone > them until the next batch, they seem all of moderate/limited impact. > But let me know if you think otherwise. I agree. Let's do them together with the new stuff that's planned for Nov 1st, https://xenbits.xen.org/xsa/ Hans
Bug#1021668: [Pkg-xen-devel] Bug#1021668: xen: CVE-2022-33749 CVE-2022-33748 CVE-2022-33747 CVE-2022-33746
Hi! On 10/12/22 19:38, Moritz Mühlenhoff wrote: > Source: xen > X-Debbugs-CC: t...@security.debian.org > Severity: important > Tags: security > > Hi, > > The following vulnerabilities were published for xen. > > CVE-[...] Thanks for the overview. The XAPI one indeed does not apply to src:xen. I have a question, since the 'bug' report does not contain a question, or explicit call for action, and I have not seen it in this way before. Does explicitly opening a BTS bug mean that, like we use to call it, "these CVEs warrant a DSA", and that it is a request for an ASAP package update and preparing a security update for stable, or, is this a new thing where BTS bugs are opened for packages, just in case the maintainer did not already track security issues themselves actively? I'm just wondering... Thanks, Hans
Bug#1021215: Kind request for backports of libtraceevent and libtracefs
Package: src:libtraceevent
Version: 1:1.1.2-1

Hi maintainer, :)

Linux commit fe4d0d5dde ("rtla/Makefile: Properly handle dependencies") helps make the dependency on libtraceevent and libtracefs more explicit, so that users run into fewer weird problems along the way.

Linux 5.19 is in Debian unstable now, and for the stable-backports packages that our kernel team is providing, this means that it will FTBFS, unless we either:

- exclude rtla for bullseye-backports
- have backports for libtraceevent and libtracefs present

So, the question for you is: Do you want to also provide bullseye-backports packages for libtraceevent and libtracefs?

About making dependencies explicit in the kernel package:
https://salsa.debian.org/kernel-team/linux/-/merge_requests/539

Currently shipping without rtla, so far:
https://salsa.debian.org/benh/linux/-/commit/15b6859742d404abdcd68bcb589f8a8e2dfb6ce4

Thanks, Hans
Bug#1020787: Xen: After updating to 5.19 kernel the VMs are started without XSAVE CPU flags
Hi! On 9/28/22 00:55, Diederik de Haas wrote: > On Wednesday, 28 September 2022 00:24:27 CEST Patrick wrote: >> I just applied the patch >> (xen.git-c3bd0b83ea5b7c0da6542687436042eeea1e7909.patch) to the xen >> packages and can confirm that this fixes the problems. The xsave flags are >> available again and thus the binaries work too. > > That is awesome, thank you :-) > > IIUC: > - Xen upstream will backport the patch to the stable branches; I do not know > when that will happen > - Debian's package will probably be updated before that and 4.16.2-2 will be > uploaded to Sid Soon (tm) with that patch applied Thanks for doing the investigation! I'm currently preparing 4.16.2-2 which includes the fix. Hans
Bug#1016547: [Pkg-xen-devel] Bug#1016547: /etc/default/grub.d/xen.cfg: Extraneous output line causes error message at boot
Hi John, On 8/2/22 19:50, John E. Krokes wrote: > Package: xen-hypervisor-common > Version: 4.14.5+24-g87d90d511c-1 > Severity: minor > File: /etc/default/grub.d/xen.cfg > > Dear Maintainer, > > When invoked via grub-mkconfig, xen.cfg outputs this as its first line: > Including Xen overrides from /etc/default/grub.d/xen.cfg > > The output of grub-mkconfig is expected to be redirected into a grub.cfg file. > Grub will read the grub.cfg at boot. Unfortunately, "Including" is not a > valid grub command. So when booting, grub emits this error message before > displaying its menu: > error: can't find command `Including'. Aha! Nice catch. That's indeed something that should be improved. > [...] > > The error message is obscured very quickly. It does not affect functionality > in any way. It requires booting on a VERY slow machine in order to read > the error message at all. > > > If I add a '#' to the start of the "Including", the resulting grub config file > boots with no error. > echo "#Including Xen overrides from /etc/default/grub.d/xen.cfg" > > I'm not sure if this line was intended to go into the generated config > file as a comment, or if it was intended to be shown to the user while > grub-mkconfig is running. I'm sure it's the latter, yes. Just some 'hey! I'm doing this now' message. > I have observed this and tested my fix against version > 4.14.5+24-g87d90d511c-1 of xen-hypervisor-common. I have also checked > with the debian git at > https://salsa.debian.org/xen-team/debian-xen/-/blob/master/debian/tree/xen-hypervisor-common/etc/default/grub.d/xen.cfg. > This line > has not changed in a very long time. > > > I can also duplicate the behavior using grub-emu, with the output redirected > to a file. > > > I am running devuan, and originally reported this to their BTS but was > redirected to debian. So my version number does not match. Apologies for > that. It's ok. 
The changes/improvements for this will end up in the Xen 4.16 package that's in Debian unstable now, anyway.

So, in our grub.d/xen.cfg file, there are two places that cause text output: the 'Including Xen overrides ...' informational one, and the notification/warning about overriding GRUB_DEFAULT.

The grub.d/* files are executed (sourced) in the context of grub-mkconfig itself, using '.'. In there, I can see that similar status messages are just redirected to stderr. We can do the same here. For the warning, there's a grub_warn helper function, which we can use.

So, that results in the following changes I have here now:

diff --git a/default/grub.d/xen.cfg b/default/grub.d/xen.cfg
index d35744e..42670eb 100644
--- a/default/grub.d/xen.cfg
+++ b/default/grub.d/xen.cfg
@@ -5,7 +5,7 @@
 # The configuration in here makes it possible to have different options set
 # for the linux kernel when booting with or without Xen.
 
-echo "Including Xen overrides from /etc/default/grub.d/xen.cfg"
+echo "Including Xen overrides from /etc/default/grub.d/xen.cfg" >&2
 
 ###
 # Xen Hypervisor Command Line Options
@@ -83,8 +83,8 @@ GRUB_CMDLINE_LINUX_XEN_REPLACE="earlyprintk=xen console=hvc0 noresume"
 #XEN_OVERRIDE_GRUB_DEFAULT=
 #
 if [ "$XEN_OVERRIDE_GRUB_DEFAULT" = "" ]; then
-	echo "WARNING: GRUB_DEFAULT changed to boot into Xen by default!"
-	echo "         Edit /etc/default/grub.d/xen.cfg to avoid this warning."
+	grub_warn "GRUB_DEFAULT changed to boot into Xen by default!" \
+		"Edit /etc/default/grub.d/xen.cfg to avoid this warning."
 	XEN_OVERRIDE_GRUB_DEFAULT=1
 fi
 if [ "$XEN_OVERRIDE_GRUB_DEFAULT" = "1" ]; then

None of this output will now be mixed with the generated config any more. This will be in the next package upload.

https://salsa.debian.org/xen-team/debian-xen/-/commits/wip/sid

Thanks, Hans
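The effect of the >&2 redirect can be shown in isolation with a small sketch. The emit_config function is a made-up stand-in for a sourced grub.d snippet, and the "set default=0" line is just a placeholder for generated configuration; neither is taken from grub-mkconfig itself.

```shell
#!/bin/sh
# Sketch only: status messages go to stderr, generated config to stdout.
# emit_config is a hypothetical stand-in for a sourced grub.d snippet.

emit_config() {
    # Informational message for the person running the command.
    echo "Including Xen overrides from /etc/default/grub.d/xen.cfg" >&2
    # Placeholder for the real generated configuration.
    echo "set default=0"
}
```

Running emit_config with stdout redirected into a file still shows the 'Including ...' message on the terminal, while the file only contains the configuration line, so nothing that the boot loader cannot parse ends up in it.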
Bug#1008048: RM: xen [i386] -- ROM; ANAIS; stop building for i386
Package: ftp.debian.org Severity: normal X-Debbugs-CC: pkg-xen-de...@lists.alioth.debian.org Hi, Starting with Xen version 4.16, we're dropping support for the i386 arch. There are currently already no reverse dependencies left on i386 specific packages in unstable. We have worked together with collectd, libvirt and qemu maintainers to have their packages changed to remove i386 related xen things. So, I understand that we now can ask for removal of the leftover Xen 4.14 packages in i386, which will unblock the migration of Xen 4.16 to testing. Thanks, Hans van Kranenburg Debian Xen Team
Bug#988333: [Pkg-xen-devel] Bug#988333: Bug#988333: libxenmisc4.16: libxl fails to grant necessary I/O memory access for gfx_passthru of Intel IGD
On 3/7/22 18:30, Chuck Zmudzinski wrote: [...] Thanks for adding all the info and researching this, Chuck! Hans
Bug#921187: Getting rid of rdepends on libxenmisc4.X so we can do backports
I see the mail thread 'RFC: qemu and Xen ABI-unstable libs' on the upstream xen-devel mailing list did not get referenced from this Debian bug yet: https://lists.xenproject.org/archives/html/xen-devel/2020-09/threads.html#01299 It contains a lot of info about the actual work that needs to be done. Hans
Bug#1005176: xen-utils-4 library dependencies need update
tags 1005176 + moreinfo
thanks

Hi Elliott, :)

On 2/8/22 14:19, Elliott Mitchell wrote:
> Package: src:xen
> Version: 4.16.0-1~exp1
>
> I'm guilty of pulling in later Xen source and building it based on the experimental 4.16 packaging. As such this may actually only be an issue for a package version beyond 4.16.0.
>
> I'm uncertain which it is, but xen-utils-4.16 appears to need an update to one or more of libxencall1, libxenevtchn1, libxenforeignmemory1, libxengnttab1 and/or libxentoollog1 in order to function.
>
> During my initial update I merely updated libxenmisc4.16 and libxenstore4. In this condition something (I suspect xenstored) was rather broken and things were unusable. Notably `xl list` was hanging. I was unable to get VMs started and it felt like everything wanted to explode.

This one is really too vague to be able to react to in any sensible manner. Reading it was a fun experience though. It made me think of creating a bingo card with 30 possible things that a bug reporter can say that are a synonym of "it doesn't work".

I hope this message does not come across as offensive, it's in no way meant as such. :D I do appreciate your contributions and you sharing thoughts about possible things that could be done and could be improved.

However, I hope you understand that there's no way we can help when you use something else than the actual packages in Debian, do not provide any error messages seen, and describe what you see instead as "it felt like everything wanted to explode".

For me, Xen 4.16 does run OK on my test servers, FWIW.

Have fun, Hans
Bug#1004269: Debian Bug#1004269: Linker segfault while building src:xen
(To both the Debian bug # and xen-devel list, reply-all is fine) Hi Xen people, I just filed a bug at Debian on the binutils package, because since the latest binutils package update (Debian 2.37.50.20220106-2), Xen (both 4.14 and 4.16) fails to build with a segfault at the following point: x86_64-linux-gnu-ld -mi386pep --subsystem=10 --image-base=0x82d04000 --stack=0,0 --heap=0,0 --section-alignment=0x20 --file-alignment=0x20 --major-image-version=4 --minor-image-version=16 --major-os-version=2 --minor-os-version=0 --major-subsystem-version=2 --minor-subsystem-version=0 --no-insert-timestamp --build-id=sha1 -T efi.lds -N prelink.o /builds/xen-team/debian-xen/debian/output/source_dir/xen/common/symbols-dummy.o -b pe-x86-64 efi/buildid.o -o /builds/xen-team/debian-xen/debian/output/source_dir/xen/.xen.efi.0x82d04000.0 && : Segmentation fault (core dumped) Full message and links to build logs etc are in the initial bug message, to be seen at https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1004269 We (Debian Xen Team) are awaiting response, but I thought to also let you know already. * Does the above error 'ring a bell'? * Can you maybe also reproduce this in a development environment with very latest binutils? * Maybe someone has a useful comment for the Debian binutils maintainer about what's happening in this step of the build? * Any suggestions about what we can do to help figure this out? * We'll try to help debug, but will surely appreciate upstream help if things get too technical. It's simply the case that I did not have to look into a very similar issue before, so it's new. Thanks! Hans
Bug#1004269: Linker segfault while building src:xen
Package: src:binutils Version: 2.37.50.20220106-2 X-Debbugs-CC: pkg-xen-de...@lists.alioth.debian.org Hi, With the last binutils version src:xen starts to FTBFS. >8 Xen 4.16 for experimental >8 * Last passed build, using binutils 2.37-10. Job overview: https://salsa.debian.org/xen-team/debian-xen/-/pipelines/329021 Full log: https://salsa.debian.org/xen-team/debian-xen/-/jobs/2290845/raw * First failed build, using the same source code, and using binutils 2.37.50.20220106-2: Job overview: https://salsa.debian.org/xen-team/debian-xen/-/pipelines/338409 Full log: https://salsa.debian.org/xen-team/debian-xen/-/jobs/2375744/raw At the end of the full log, the failure can be observed: x86_64-linux-gnu-ld -mi386pep --subsystem=10 --image-base=0x82d04000 --stack=0,0 --heap=0,0 --section-alignment=0x20 --file-alignment=0x20 --major-image-version=4 --minor-image-version=16 --major-os-version=2 --minor-os-version=0 --major-subsystem-version=2 --minor-subsystem-version=0 --no-insert-timestamp --build-id=sha1 -T efi.lds -N prelink.o /builds/xen-team/debian-xen/debian/output/source_dir/xen/common/symbols-dummy.o -b pe-x86-64 efi/buildid.o -o /builds/xen-team/debian-xen/debian/output/source_dir/xen/.xen.efi.0x82d04000.0 && : Segmentation fault (core dumped) The above logs are for src:xen 4.16.0-1~exp1 which we were about to upload to experimental. >8 Xen 4.14 currently in unstable >8 I also triggered a CI run again for the current src:xen 4.14.3+32-g9de3671772-1. The same segfault happens there, and both for the amd64 and i386 build test (i386 is no longer included for Xen 4.16). Job overview: https://salsa.debian.org/xen-team/debian-xen/-/pipelines/340556 Full logs: https://salsa.debian.org/xen-team/debian-xen/-/jobs/2394079/raw https://salsa.debian.org/xen-team/debian-xen/-/jobs/2394080/raw >8 So, this is what we observe. In the Debian Xen team, there's not a great amount of knowledge about the exact internals of what happens here. 
* At least, we can let you know there's a regression. * Currently progress on our Xen 4.16 upload is blocked, and we also can't do updates of the current Xen 4.14 packages (e.g. because of security fixes). * We're available to help debugging this issue if needed. We'll need guidance, so it will mean that we'll work based on your instructions. * After sending this report and getting the confirmation from the BTS, I'll send a reply with the upstream Xen development mailing list in Cc. Thanks in advance, Hans van Kranenburg
Bug#1002658: [Pkg-xen-devel] Bug#1002658: FTBFS with OCaml 4.13.1
Hi Stéphane, On 12/26/21 9:06 PM, Stéphane Glondu wrote: > [...] > > Your package FTBFS with OCaml 4.13.1 with the following error: >> [...] >>57 | #define Some_val(v) Field(v,0) >> | >> In file included from /usr/lib/ocaml/caml/alloc.h:24, >> from xentoollog_stubs.c:23: >> /usr/lib/ocaml/caml/mlvalues.h:404: note: this is the location of the >> previous definition >> 404 | #define Some_val(v) Field(v, 0) >> | >> cc1: all warnings being treated as errors Thanks for the report. There is an upstream fix for the ocaml redefinition issues, so that's at least a good thing. This fix is already in the released Xen 4.16, but not in Xen 4.14 that is in unstable now. https://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=2d1a35f1e6c2113a6322fdb758a198608c90e4bd We're currently preparing the upload of the new Xen 4.16 to Debian experimental->unstable. So, either when that one reaches unstable, or, also, when we have to do another intermediate Xen 4.14 upload to unstable first (e.g. some more urgent security fixes), we can resolve this. Let me know if you have specific wishes around deadline etc for completing the ocaml transition. It doesn't take much effort for us to do a -2 upload to unstable which only will include the above upstream fix as change. So, when we would get into the critical path of progress in your work, feel free to ask for that. Thanks, Hans
Bug#992909: xen-utils-4.14: please stop recommending libc6-xen on i386
Hi Aurelien, On 8/24/21 10:58 PM, Aurelien Jarno wrote: > > Due to the removal of 32-bit PV in Linux kernel 5.9 and the removal of > the "nosegneg" hwcap from glibc 2.32, the libc6-xen package is not build > anymore by the glibc package. This is already the case in experimental, > and will be soon the case in testing. Could you please update > xen-utils-4.14 to stop recommending this package? Yes! It will be part of the Xen 4.16 upload that we're preparing now. Thanks, Hans
Bug#994899: [Pkg-xen-devel] Bug#994899: xen-hypervisor-4.14-amd64 breaks system poweroff on bullseye
Hi all, On 10/5/21 2:16 AM, Chuck Zmudzinski wrote: > On 10/4/2021 1:51 PM, Diederik de Haas wrote: >> On Monday, 4 October 2021 17:27:22 CEST Chuck Zmudzinski wrote: >>> I can confirm these 4 fix the bug on my hardware. >> \o/ >> Thanks for testing and reporting back :-) > > Thank you, Diederik, for your good work finding the commits > from upstream that fix the bug. And also thanks to you, Andy, > for helping fix this bug in the IRC and for your interest and > support of the Debian Xen Team's work. So, we're in the process of actually doing a package update now, which includes these fixes. I can confirm that my HP DL360 hardware at work also did not fully power off. And now, it does: >8 [ OK ] Reached target Late Shutdown Services. [ OK ] Finished System Power Off. [ OK ] Reached target System Power Off. [16302.044148] reboot: Power down (XEN) Disabling non-boot CPUs ... (XEN) Broke affinity for IRQ12, new: , (XEN) Broke affinity for IRQ1, new: , (XEN) Broke affinity for IRQ9, new: , (XEN) Broke affinity for IRQ16, new: , (XEN) Broke affinity for IRQ17, new: , (XEN) Broke affinity for IRQ99, new: , (XEN) Broke affinity for IRQ112, new: , (XEN) Broke affinity for IRQ113, new: , (XEN) Broke affinity for IRQ114, new: , (XEN) Broke affinity for IRQ115, new: , (XEN) Broke affinity for IRQ116, new: , (XEN) Broke affinity for IRQ117, new: , (XEN) Broke affinity for IRQ118, new: , (XEN) Broke affinity for IRQ119, new: , (XEN) Broke affinity for IRQ120, new: , (XEN) Broke affinity for IRQ121, new: , (XEN) Broke affinity for IRQ122, new: , (XEN) Broke affinity for IRQ123, new: , (XEN) Broke affinity for IRQ124, new: , (XEN) Broke affinity for IRQ125, new: , (XEN) Broke affinity for IRQ126, new: , (XEN) Broke affinity for IRQ127, new: , (XEN) Entering ACPI S5 state. The server is not powered on. The Virtual Serial Port is not available. >8 So, this one will get closed when we do the upload to unstable. 
Besides that, it will of course also be fixed in stable if we get the same changes in there in the next few days. Hans
Bug#988333: [Pkg-xen-devel] linux-image-5.10.0-6-amd64: VGA Intel IGD Passthrough to Debian Xen HVM DomUs not working, but Windows Xen HVMs do work
Hi! On 10/19/21 5:44 AM, Chuck Zmudzinski wrote: > On 5/10/2021 1:33 PM, Chuck Zmudzinski wrote: >> [...] with buster and bullseye running as the Dom0, I can only get the >> VGA/Passthrough feature to work with Windows Xen HVMs. I would expect both >> Windows and Linux HVMs to work comparably well. You don't mention the used Xen version (Debian package version) for buster and bullseye anywhere, so I'll assume it's the latest 4.14.3-1(~deb11u1) one. > [...] > > The biggest problems were that the Dom0 reported problems > with IRQ 16 being disabled after starting the bullseye HVM DomU, > and only xl destroy could be used to stop the corrupted process. Well, at least we have an error somewhere already. That's a starting point. Can you share the domU config file? And, other configs you need to have in place to exclude the devices from being seen as normal devices directly in dom0? (I haven't used passthrough myself yet, but I read that this is needed.) Can you share more verbose logging done by xl create when using xl -vvv create ? But, AFAIK what you want to do should be possible yes. > The bullseye HVM DomU still fails to boot on an up-to-date > bullseye Xen Dom0 configured to pass through the same PCI/IGD > devices. The bullseye HVM DomU with IGD passthrough has so > far only been verified to work on an old, slightly modified > jessie Xen Dom0. > > More Details: These latest tests are with linux version 5.10.70-1 > for bullseye stable. For the jessie Dom0, which worked with the > unmodified bullseye HVM DomU, I had to add a few patches to > the old jessie Xen packages so the unmodified bullseye Xen HVM Ok, yes, clear, that makes the domU kernel not the primary suspect. > These tests demonstrate that a fix for this bug is possible in src:xen > rather than in src:linux, but the patches needed to fix this bug in > Xen 4.14, which is the version of Xen on bullseye, are not yet > identified. 
It might also be possible (just a wild guess) that for Xen 4.14, the options in the domU config file need to be different than for Xen 4.4. > I will continue to investigate this issue and try to bisect the problem > as it recurs in Dom0 for some version of Xen > 4.4 and <= 4.14. It > will obviously take some time since there are so many differences > between Xen 4.4 and 4.14. If you can make progress on that, and find an actual commit that changes the behavior, then we're probably at 95% towards finding a cause and solution. :) That'd be great. A possible time-saver that I can recommend is to send a post to the upstream xen-users list [0] about this already. Like "Hi all, I'm starting a HVM Linux domU with Linux 5.10.70 on a Xen 4.14.3 system with also 5.10.70 dom0 kernel, with this and this domU config file. It fails to start, this is the xl -vvv create output, and this error (the irq stuff) appears in the dom0 kernel log.". Try to keep it simple and not too long initially, without the surrounding stories, to increase chance of it being fully read. > If I find a fix in src:xen for Xen >=4.14 Dom0 on bullseye or sid, I will > reassign #988333 to src:xen myself. Until then, I will leave it to the > discretion of the Debian Kernel Team to decide whether or not to > reassign it to src:xen now. Yes, that makes sense indeed, I'll do it in a minute. Even while we don't know if it has to do with the Xen or dom0 kernel code, it's more likely that in either case, we'll end up asking the upstream Xen people about it. Have fun, Hans [0] https://lists.xenproject.org/mailman/listinfo/xen-users
Bug#991967: Simply ACPI powerdown/reset issue?
Hi Elliot and others, Also including #994899 for once, since that's the bug number for the Xen issue now. On 9/26/21 5:27 AM, Elliott Mitchell wrote: > On Tue, Sep 21, 2021 at 06:33:20AM -0400, Chuck Zmudzinski wrote: >> I presume you are suggesting I try booting 4.19.181-1 on the >> current version of Xen-4.14 for bullseye as a dom0. I am not >> inclined to try it until an official Debian developer endorses >> your opinion that the bug I am seeing is distinct >> from #991967, at which point I will report the bug I am >> seeing as a new bug. > > Chuck Zmudzinski you are getting rather close to my threshold for calling > harrassment. You're not /quite/ there, but I'm concerned. > > > Since the purpose of the bug reports is to find and diagnose bugs, I did > a bit of experimentation and made some observations. > > I checked out the Debian Xen source via git. I got the current > "master" branch which is presently the candidate 4.14.3-1 version, > which includes urgent fixes. The hash is: > e7a17db0305c8de891b366ad3528e5a43015 > > On top of this I cherry-picked 3 commits from Xen's main branch: > 5a4087004d1adbbb223925f3306db0e5824a2bdc > 0f089bbf43ecce6f27576cb548ba4341d0ec46a8 > bc141e8ca56200bdd0a12e04a6ebff3c19d6c27b > > (these can be retrieved via Xen's gitweb at > https://xenbits.xen.org/gitweb/?p=xen.git;a=patch;h=<$hash> which is > suitable for the `git am` command) > > With these I built 4.14.3-1 and then tried kernels 4.19.181-1 and > 4.19.194-3 (this system is presently mostly on oldstable). The results > were: > > Xen 4.14.3-1 with Linux 4.19.181-1: system reboots were successful > > Xen 4.14.3-1 with Linux 4.19.194-3: system reboots hung Ok, so it included 0f089bbf43, which is probably the most important of the 3 fixes that we need indeed. And, it's good that the above difference is still visible afterwards, since it confirms that we're looking at two distinct problems. 
> Unfortunately I was too quick at installing the rebuilt 4.14.3-1 and I > missed trying the vanilla Debian 4.14.2+25-gb6a8c4f72d-2 with > Linux 4.19.181-1. I believe this combination would have hung during > reboot. The Xen related breakage was introduced in 4.14.0+88-g1d1d1f5391-2, so with that combination, I would expect you would experience both of the bugs at the same time, yes. > As such, I believe there are in fact two distinct bugs being observed. > The presence of EITHER of these is sufficient to cause hangs during > powerdown or reboot. > > First, some patch originally from Linux's main branch breaks Xen reboots > was backported somewhere between 4.19.181-1 and 4.19.194-3. This may > either have been introduced before 5.10 diverged from main, or may also > have been backported to 5.10. THIS is Debian bug #991967. > > Second, the Xen patch 3c428e9ecb1f290689080c11e0c37b793425bef1 which is > valuable to ARM devices breaks reboots and powerdowns on x86. This is > correctly fixed by 0f089bbf43ecce6f27576cb548ba4341d0ec46a8. Presently > this has no Debian bug report. Correct. Thanks a lot for your help with hunting down and confirming this. And now we have #994899 for it. So, I would like to kindly ask everyone to stop hijacking this one, #991967, for discussing the Xen problem. > The first is presently unidentified, someone enthusiastic either needs to > read git logs/source code, or bisect and build to find where it got > broken. > > The second we seem to have a fix. The only question is how many patches > to cherry pick? bc141e8ca562 is non-urgent as it is merely superficial > and not needed for functionality. > 5a4087004d1a is a workaround for Linux kernel breakage, but how likely > are we to see that fixed in the Linux kernel packages? The fix is > well-contained and needed for some highly popular ARM devices. 
Diederik also helped with testing changes, and when combining results, the best thing we can do is pick the 4 changes that were initially posted in Nov 2020 as "x86: ACPI and DMI table mapping fixes", and ended up in Xen 4.15 as well.

>8
commit 8b6d55c1261820bb9db8d867ce9ee77397d05203
Author: Jan Beulich
Date: Tue Nov 24 11:26:02 2020 +0100

    x86/ACPI: fix mapping of FACS

commit f390941a92f102ece1b54be206a602187fd7
Author: Jan Beulich
Date: Tue Nov 24 11:26:34 2020 +0100

    x86/DMI: fix table mapping when one lives above 1Mb

commit 0f089bbf43ecce6f27576cb548ba4341d0ec46a8
Author: Jan Beulich
Date: Tue Jan 5 13:09:55 2021 +0100

    x86/ACPI: fix S3 wakeup vector mapping

commit 16ca5b3f873f17f4fbdaecf46c133e1aa3d623b2
Author: Jan Beulich
Date: Tue Jan 5 13:11:04 2021 +0100

    x86/ACPI: don't invalidate S5 data when S3 wakeup vector cannot be determined
>8

The 4th one is not explicitly tagged with Fixes: 1c4aa69ca1e1, but I agree with Diederik that we should keep them all together. I do not know if this is also the thing Chuck tested in the end, but I'm a bit lost in the walls of text that were produced in these two bugs.
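A minimal sketch of how one of these commits could be pulled from Xen's gitweb and applied to a packaging checkout, assuming the URL pattern quoted earlier in this thread (https://xenbits.xen.org/gitweb/?p=xen.git;a=patch;h=<hash>). The helper name patch_url is made up for illustration, and only a hash that appears in full in this mail is used; the same call would be repeated for the other commits.

```shell
#!/bin/sh
# Hypothetical helper: build the gitweb "patch" URL for one upstream Xen commit.
patch_url() {
    printf 'https://xenbits.xen.org/gitweb/?p=xen.git;a=patch;h=%s\n' "$1"
}

# x86/ACPI: fix S3 wakeup vector mapping
hash=0f089bbf43ecce6f27576cb548ba4341d0ec46a8
patch_url "$hash"

# To actually apply it, run something like this inside the source tree
# (needs network access, so it is left commented out here):
#   curl -fsSL "$(patch_url "$hash")" | git am
```

Keeping the download and the git am step separate from the URL helper makes it easy to review each patch before it lands in the tree.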
Bug#990642: linux-image-4.19.0-17-amd64: kernel panic on xen dom0 with Broadcom Limited NetXtreme II BCM5709
Hi spi, Salvatore, On 8/5/21 1:58 PM, s...@gmxpro.de wrote: > > In preparation for the bug report for upstream I did some more > investigation. > > The kernel panic also occurs without bonding interfaces but needs much > more time to happen. With a bonding interface it happens within some > seconds. Without bonding interfaces it needs like a minute with the > network discovery being re-launched for 2 or 3 times. The kernel panic > is still the same about the bnx2 driver. > > In the constellation without a bonding interface the kernel panic only > occurs if > - opnsense as a domU is running (this domU bounds all bridged interfaces > as default gateway for all networks) Just FWIW, I'm seeing this bug-mail-thread now, and it rings a bell. I spent some time in the past to debug crashing BCM5719 (4x1G) nics in HP DL360 G8/9 series servers. In this case, the firmware inside the nic crashed, so the symptoms were different. This happened only when having a Xen domU active as router, which was routing incoming traffic packets (from outside the box) back to the outside again. 02:00.0 Ethernet controller: Broadcom Limited NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01) 02:00.1 Ethernet controller: Broadcom Limited NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01) 02:00.2 Ethernet controller: Broadcom Limited NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01) 02:00.3 Ethernet controller: Broadcom Limited NetXtreme BCM5719 Gigabit Ethernet PCIe (rev 01) Also, 2x 1G were bonded, I use openvswitch with LACP for that. 
The symptoms are obviously different, mine looked like this: tg3 :02:00.2 eth1: transmit timed out, resetting tg3 :02:00.2 eth1: 0x: 0x165714e4, 0x00100546, 0x0201, 0x00800010 tg3 :02:00.2 eth1: 0x0010: 0x92b3000c, 0x, 0x92b4000c, 0x tg3 :02:00.2 eth1: 0x0020: 0x92b5000c, 0x, 0x, 0x22be103c tg3 :02:00.2 eth1: 0x7000: 0x0808, 0x, 0x, 0x4cd8 tg3 :02:00.2 eth1: 0x7010: 0xdbbd2b97, 0x010080f3, 0x00d70081, 0x03008200 tg3 :02:00.2 eth1: 0x7020: 0x, 0x, 0x0406, 0x10004000 tg3 :02:00.2 eth1: 0x7030: 0x0002, 0x4cdc, 0x001f, 0x tg3 :02:00.2 eth1: 0: Host status block [0001:0070:(:0563:):(:0094)] tg3 :02:00.2 eth1: 0: NAPI info [0070:0070:(016a:0094:01ff)::(068c:::)] tg3 :02:00.2 eth1: 1: Host status block [0001:0083:(::):(015b:)] tg3 :02:00.2 eth1: 1: NAPI info [0051:0051:(::01ff):0124:(0124:0124::)] tg3 :02:00.2 eth1: 2: Host status block [0001:00d8:(0e96::):(:)] tg3 :02:00.2 eth1: 2: NAPI info [00a4:00a4:(::01ff):0e5b:(065b:065b::)] tg3 :02:00.2 eth1: 3: Host status block [0001:0013:(::):(:)] tg3 :02:00.2 eth1: 3: NAPI info [00f8:00f8:(::01ff):072f:(072f:072f::)] tg3 :02:00.2 eth1: 4: Host status block [0001:009c:(::0736):(:)] tg3 :02:00.2 eth1: 4: NAPI info [007c:007c:(::01ff):0716:(0716:0716::)] tg3 :02:00.2: tg3_stop_block timed out, ofs=1400 enable_bit=2 tg3 :02:00.2: tg3_stop_block timed out, ofs=c00 enable_bit=2 tg3 :02:00.2 eth1: Link is down tg3 :02:00.2 eth1: Link is up at 1000 Mbps, full duplex tg3 :02:00.2 eth1: Flow control is off for TX and off for RX tg3 :02:00.2 eth1: EEE is disabled > - sysctl parameter net.bridge.bridge-nf-call-ip6tables is set to 0. > > If both conditions are not met no kernel panic oaccurs. What I found out in the end is that using `ethtool -K $iface tso off` is a workaround to not make it trigger some obscure bug inside the nic that makes it crash. So, I think my actual suggestion would be, even while it does not look like the same thing, but it's still Broadcom stuff which can have *cough* weird issues... 
if you can reliably reproduce the problem, then can you try setting tso off on the physical interfaces in dom0 and try again? In Dutch we say "nooit geschoten altijd mis" (roughly: if you never shoot, you always miss). > Other IPv6 related sysctl parameters are set on dom0 like > net.ipv6.conf.all.disable_ipv6 = 1 > net.ipv6.conf.default.disable_ipv6 = 1 > net.ipv6.conf.lo.disable_ipv6 = 1 > > > The layer2-iptables settings are > net.bridge.bridge-nf-call-ip6tables = 0 *** > > > net.bridge.bridge-nf-call-iptables = 1 > > > net.bridge.bridge-nf-call-arptables = 0 > > > > > As said, if I don't set the one marked with *** to 0 there is no kernel > panic. > > I wonder if this still is a kernel issue but still wouldn't expect a > kernel panic to happen. > > Cheers, > spi > Have fun, Hans
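The TSO workaround suggested above could be tried along these lines. This is only a sketch: the function name disable_tso, the DRY_RUN switch and the interface names eth0 and eth1 are assumptions for illustration, and the real names of the physical ports in dom0 will differ per system.

```shell
#!/bin/sh
# Disable TCP segmentation offload on the given interfaces.
# With DRY_RUN=1 (the default here) the commands are only printed.
disable_tso() {
    for iface in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ethtool -K $iface tso off"
        else
            ethtool -K "$iface" tso off
        fi
    done
}

# eth0 and eth1 are placeholder names for the physical ports in dom0.
disable_tso eth0 eth1
```

After running it for real (DRY_RUN=0, as root), `ethtool -k eth0` should list tcp-segmentation-offload as off. The setting does not survive a reboot unless it is also added to the network configuration.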
Bug#994870: [Pkg-xen-devel] Bug#994870: Bug#994870: Bug#994870: Memory allocation problem for VM after xen security update
Hi! On 9/30/21 12:45 AM, Andy Smith wrote: > Hi Alex, > > On Thu, Sep 30, 2021 at 12:10:32AM +0200, Alexander Dahl wrote: >> On 22.09.21 at 20:54, Hans van Kranenburg wrote: >>> At this point I would really recommend to not wait for a fix to arrive >>> which makes it start again, but change your VM to use a 64-bit kernel. >> >> How? > > This was answered in earlier comments on this bug; please see: > > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=994870#15 > https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=994870#20 > > The brief summary is, "start out like a crossgrade, but only do the > kernel". Very simple and quite safe. > > You haven't said how you boot your guest though (show us your > /etc/xen/guest.cfg file). If it's pvgrub, that has a 32-bit and a > 64-bit version so you'll need to change those as well. If it's > pygrub you probably don't need to do anything, though pygrub has its > own issues outside the scope of this bug. > >> FWIW, Debian 10 VMs with 32 bit running with PVH work fine. My important VM >> is still Debian 9 however due to a software I can not simply upgrade. > > I've found PVH needs at least 4.19 guest kernel to work, which can > be achieved in Debian 9 (stretch) today by using kernel from > stretch-backports, so perhaps that is an option for you. You can certainly do that and then run PVH. Because stretch-backports has been frozen since stretch became oldoldstable, new 4.19 backports kernels for Stretch are released through the security updates channel. Be aware of this. https://lists.debian.org/debian-lts-announce/2020/08/msg00019.html Latest in stretch-backports (frozen) is 4.19.118, and stretch security is now at 4.19.194. So double check you end up following the right one. Hans
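For readers wondering what "start out like a crossgrade, but only do the kernel" might look like in practice, here is a rough sketch using the standard Debian multiarch mechanism. It is an assumption on my side and not copied from the linked bug comments, which remain the authoritative instructions; the exact kernel package name can differ per release (for example when following the 4.19 kernels mentioned above). The function name is made up, and it only prints the steps.

```shell
#!/bin/sh
# Print the steps for adding only a 64-bit kernel to a 32-bit (i386) guest.
# Nothing is executed; run the printed commands as root inside the guest
# after reviewing them.
kernel_only_crossgrade_steps() {
    cat <<'EOF'
dpkg --add-architecture amd64
apt-get update
apt-get install linux-image-amd64:amd64
EOF
}

kernel_only_crossgrade_steps
```

The 32-bit userland stays untouched; only the kernel image comes from the amd64 architecture.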
Bug#995233: Files stored under /usr/lib/debug/ have a too specific xen version in their name
Package: src:xen
Version: 4.14.3-1

Conversation in #debian-security, 25 Sep 2021

18:28 < adsb> hmmm, were the filename changes in https://release.debian.org/proposed-updates/bullseye_diffs/xen_4.14.3-1~deb11u1_amd64.debdiff.html expected?

>8
Files only in first set of .debs, found in package xen-hypervisor-4.14-amd64
-rw-r--r-- root/root /usr/lib/debug/xen-4.14.3-pre.efi.map.gz
-rw-r--r-- root/root /usr/lib/debug/xen-syms-4.14.3-pre.gz
-rw-r--r-- root/root /usr/lib/debug/xen-syms-4.14.3-pre.map.gz

New files in second set of .debs, found in package xen-hypervisor-4.14-amd64
-rw-r--r-- root/root /usr/lib/debug/xen-4.14.3.efi.map.gz
-rw-r--r-- root/root /usr/lib/debug/xen-syms-4.14.3.gz
-rw-r--r-- root/root /usr/lib/debug/xen-syms-4.14.3.map.gz
>8

^^^ files names in binary package should not change during an update of a package in the Debian stable release

22:09 < Knorrie> adsb: no, I did not expect that, but after looking at it for a few minutes, I understand why, and this is a small packaging bug that exists for years already apparently (the same thing has been happening during buster also already all the time)
22:09 < Knorrie> adsb: the '.3-pre' and '.3' parts should be stripped out of the filename, just like the files in /boot (e.g. /boot/xen-4.14-amd64.gz). so, just '4.14'
22:10 < Knorrie> we have a thing for that, which fixes up all the files in boot, the same should be added for those files in the debug location https://salsa.debian.org/xen-team/debian-xen/-/blob/master/debian/shuffle-boot-files
22:10 * Knorrie takes note
22:10 < Knorrie> for some reason this was apparently not spotted yet, because it's not on a todo list.
22:11 < Knorrie> so, let me rephrase, given the current packaging code (which did not change), technically, I see why this is actually expected, but it's not meant to be so
22:18 * h01ger learned today there's a new DSA acronym: Distributed Switch Architecture :)
22:54 < adsb> Knorrie: cool, thanks for looking
22:58 < Knorrie> adsb: thanks for sharing the observation

FTR,
Knorrie
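To illustrate the renaming discussed in the IRC log, a small sketch of stripping the point release (and an optional -pre suffix) from the debug file names follows. This is not the actual shuffle-boot-files code; the function name and the regular expression are assumptions meant only to show the intended result.

```shell
#!/bin/sh
# Reduce e.g. xen-syms-4.14.3-pre.map.gz to xen-syms-4.14.map.gz, so the
# file name stays stable across point releases of the same Xen series.
strip_point_version() {
    printf '%s\n' "$1" |
        sed -E 's/^((xen|xen-syms)-[0-9]+\.[0-9]+)\.[0-9]+(-pre)?/\1/'
}

for f in xen-4.14.3-pre.efi.map.gz xen-syms-4.14.3-pre.gz \
         xen-4.14.3.efi.map.gz xen-syms-4.14.3.map.gz; do
    strip_point_version "$f"
done
```

With this mapping, the "first set" and "second set" names from the debdiff both collapse to the same 4.14 file names.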
Bug#994870: Memory allocation problem for VM after xen security update
Hi!, Please don't (accidentally) drop the debian bug email from the recipient list. This information might also be useful for others later. On 9/23/21 1:47 AM, H.-R. Oberhage wrote: > Good evening Hans, > > On 22.09.2021 20:54, Hans van Kranenburg wrote: >> Hi Ruediger, >> >> On 9/22/21 11:37 AM, H.-R. Oberhage wrote: >>> Package: xen-system-amd64 >>> Version: 4.14.3-1~deb11u1 >>> >>> After applying the buster security update to xen, my VM won't start >>> any longer, complaining about a memory allocation error. >> >> Can you confirm that this is a virtual machine that tries to boot a >> 32-bit kernel as PV type? > > yes, your assumption ... > >> The error message you are seeing is not particularly helpful, but it is >> most likely related to this. > > ... is correct. > >> The fact that with this package update 32-bit PV guests fail to start >> is >> indeed a regression problem, which is quite inconvenient for you, right >> now. > > Ok, then I will put the Xen-package on "hold" for now. > >> At this point I would really recommend to not wait for a fix to arrive >> which makes it start again, but change your VM to use a 64-bit kernel. > > It really is a shame, that 32-bit isn't supported properly any longer. > The address- and data-overhead in 64-bit machines only using a 32-bit > address- and data-space is considerable. > > I already experienced, "bullseye" not supporting a dom0-Kernel for the > i686-pae architecture any longer :-(. A shame that it doesn't come with > a kernel before 5.9, which would still allow this. > >> Let me know if you need help or run into problems while making this >> change. > > Would you know of a "simple" way to convert/clone a 32-bit VM to a > work-alike 64-bit one? One has to replace all the .debs for this, after > all. The smallest amount of work to initially get your VM going again is to only install the 64 bit kernel and keep running a 32 bit user land. 
The process to fully change from a 32 to 64 bit system (in place) is called 'cross grading'. I found instructions at https://wiki.debian.org/CrossGrading I never did this myself, though. >> Running 32-bit PV at all is already 'on life support' upstream for >> quite >> a while now, and it also not under security support any more. > > Well it's a Debian "stretch" one, so it's just working for now :-). One of the main reasons why it's so problematic to keep around is that in the 32 bit PV case, there are no possibilities to implement fixes for all the speculative execution vulnerabilities that have been very much in the news in recent years. More about this: https://xenbits.xen.org/xsa/advisory-370.html >> In the long run, I'd suggest working towards having 64-bit guests in >> PVH >> mode, since that's one of the best options we have these days. > > Thanks, I'll consider this for any newer VMs. > Are 64-bit PV VMs automatically "moved" to or executed as PVH? > I would even be willing to edit the .xml/.cfg-file manually. > I see "bullseye's" virt-manager/libvirt offering only choices for > "xen (fullvirt)", "xen (paravirt)", or xen", when creating a new > VM. It should be as simple as changing type="pv" to type="pvh" in the config file. In Debian, using PVH has been possible since Buster. Also, using the xen variant of grub2 (grub-xen and grub-xen-host) is possible. More info: https://wiki.xenproject.org/wiki/Understanding_the_Virtualization_Spectrum Have fun, Hans
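The type="pv" to type="pvh" change mentioned above can be previewed with a small sed sketch like the one below. It assumes an xl-style config file that already contains a type line; a config without one is not handled. The function name and the demo guest name are placeholders.

```shell
#!/bin/sh
# Print an xl guest config with a 'type = "pv"' line changed to "pvh".
# The input file is not modified.
switch_to_pvh() {
    sed -E 's/^([[:space:]]*type[[:space:]]*=[[:space:]]*)"pv"/\1"pvh"/' "$1"
}

# Demo on a throwaway config; the guest name is a placeholder.
demo_cfg=$(mktemp)
printf 'name = "guest"\ntype = "pv"\nmemory = 1024\n' > "$demo_cfg"
switch_to_pvh "$demo_cfg"
rm -f "$demo_cfg"
```

Printing to standard output instead of editing in place leaves the original config available as a fallback if the guest does not boot in PVH mode.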
Bug#994870: [Pkg-xen-devel] Bug#994870: Memory allocation problem for VM after xen security update
Hi Ruediger, On 9/22/21 11:37 AM, H.-R. Oberhage wrote: > Package: xen-system-amd64 > Version: 4.14.3-1~deb11u1 > > After applying the buster security update to xen, my VM won't start > any longer, complaining about a memory allocation error. Can you confirm that this is a virtual machine that tries to boot a 32-bit kernel as PV type? The error message you are seeing is not particularly helpful, but it is most likely related to this. The fact that with this package update 32-bit PV guests fail to start is indeed a regression problem, which is quite inconvenient for you, right now. At this point I would really recommend to not wait for a fix to arrive which makes it start again, but change your VM to use a 64-bit kernel. Let me know if you need help or run into problems while making this change. Running 32-bit PV at all has already been 'on life support' upstream for quite a while now, and it is also not under security support any more. In the long run, I'd suggest working towards having 64-bit guests in PVH mode, since that's one of the best options we have these days. If there's a reason you really cannot switch to a 64-bit kernel or move the functionality of this virtual machine to a new fully 64 bit system, switching the virtualization type from PV to HVM would also be an option. > Switching back to the previous version 4.14.2+25-gb6a8c4f72d-2 lets > the VM start (again,) normally. > > /var/log/libvirt/libxl/libxl-driver.log: > 2021-09-21 14:01:44.645+: xc: panic: xc_dom_boot.c:120: > xc_dom_boot_mem_init: can't allocate low memory for domain: Out of > memory > 2021-09-21 14:01:44.653+: libxl: libxl_dom.c:593:libxl__build_dom: > xc_dom_boot_mem_init failed: Die Operation wird nicht unterstützt > [means: the operation is not supported] > 2021-09-21 14:01:44.662+: libxl: > libxl_create.c:1576:domcreate_rebuild_done: Domain 1:cannot (re-)build > domain: -3 > > The error is triggered, regardless if there was a boot-parameter > "dom0_mem=1024M:max=2048M" set or not. 
> /etc/xen/xl.conf was unaltered, i.e. 'autoballoon' was implicitely set > to "auto". > > I am "on" Buster, kernel 5.10.0-8-amd64 (5.10.46-4), all relevant fixes > included. Apologies for the inconvenience, Hans
Bug#993168: Security support ended for Xen 4.11 in Buster
Package: debian-security-support Version: 2020.06.21~deb10u1 Severity: normal Hi, Upstream security support for Xen 4.11 has ended recently. This also means that security support for Debian ended. The complexity of the software involved does not really allow for anyone else than the upstream developers, with a deep understanding of the inner workings of the hypervisor code, to apply/backport new patches. For security-support-ended.deb10, this would be a line like: xen 4.11.4+107-gef32c7afa2-1 https://xenbits.xen.org/docs/4.11-testing/SUPPORT.html#release-support Thanks, Hans
Bug#989656: [Pkg-xen-devel] Bug#989656: Xen misusing syslog
reassign 989656 src:xen 4.14.1+11-gb0b734a8b3-1 tags 989656 + upstream thanks Hi Phillip, On 6/9/21 5:04 PM, Phillip Susi wrote: > Package: xen-utils-common > Version: 4.14.1+11-gb0b734a8b3-1 > > My syslog has entries that look like this: > > Jun 09 10:54:26 hyper1 root[621]: /etc/xen/scripts/block: add > XENBUS_PATH=backend/vbd/1/768 > > The third field is supposed to be the program name, which I would expect > to either be xen or xl or something, but instead it appears to be > passing $USER. Yeah, that's a bit weird yes. I guess this is one of the many things that have to be dealt with when doing a great overhaul of all the ancient scripts-stuff in xen. I'm marking it 'upstream' now, since we cannot fix this in the Debian packaging process, any solution should go 'upstream-first'. Hans
Bug#989560: [Pkg-xen-devel] Bug#989560: Bug #989560 is grub-common, not xen-hypervisor-common
tags 989560 + moreinfo thanks Hi, On 8/4/21 4:00 AM, Elliott Mitchell wrote: > I rate #989560 as a grub-common bug, *not* a xen-hypervisor-common bug. > As you've noticed, the problem is with the file /etc/grub.d/20_linux_xen, > which is part of grub-common, not xen-hypervisor-common. > > A working grub.cfg will be generated by the version of the file from > GRUB 2.04. If you can deal with installing *only* GRUB from testing, > that should work. > > The bug should be reassigned to grub-common, but marked as effecting > Xen so duplicate reports don't show up (actually I'm pretty sure reports > against grub-common or src:grub2 already exist). The /etc/grub.d/20_linux_xen file is indeed part of grub-common, but, I'm not just going to NIMBA reassign, since the grub-common maintainer will not have any idea what to do with it, unless you guys find out what's wrong first and have clear directions and questions and patches about how to improve the situation. Currently, the only thing I can do before doing new unstable uploads or stable/security stuff is to do smoke testing on amd64. That doesn't mean I don't care. It does mean however that extra help in the team is really appreciated. Have fun, Hans
Bug#988477: [Pkg-xen-devel] Bug#988477: Acknowledgement (xen-hypervisor-4.14-amd64: xen dmesg shows (XEN) AMD-Vi: IO_PAGE_FAULT on sata pci device)
severity 988477 normal tags 988477 + moreinfo + upstream - bullseye-ignore thanks Hi! On 6/13/21 3:58 PM, Imre Szőllősi wrote: > i tested on 4th hw > > 4. asus m4n78 pro, phenom ii x4 905e, md raid1, 2x samsung 1TB 860evo, > lvm: problem does not appear > > as i see, not all mb/chipset/sata pcie device affected Thanks for your report, and for trying out different combinations of hardware. After a short internet search about the problems you're seeing with AMD Ryzen, SATA, NVMe and IOMMU, I suspect this problem does not have a lot to do with Xen specifically, but more with the hardware and its firmware. This also means that it's not a Debian packaging problem, and it cannot be fixed by me (or the Debian Xen team). If you want to research this problem more, I can maybe be of some help by providing suggestions. Still, you will have to do all of the actual work, since I do not have your hardware here. The first thing I would suggest is to try to reproduce the problem by booting just Linux without Xen and then running the dbench test. If you don't actually need to directly pass through hardware to a Xen guest, you can also try disabling iommu, or researching other iommu= options that can serve as a workaround. In any case, further reports will need to have more detailed information. For example, instead of "there are a lot of messages", provide a text attachment with a piece of logging that shows these messages. I'm tagging this bug 'moreinfo' now, since whether it advances will depend on your availability and ability to work on it. Have fun, Hans van Kranenburg
Bug#987030: linux-image-5.10.0-6-amd64 - Fans speed maximum - CPU load < 1%
Oh, On 4/16/21 11:44 AM, Hans van Kranenburg wrote: > [...] > > I have the same issue here, it started at the moment I moved from the > 4.19 kernel to 5.9, and now 5.10. For totally non-obvious reasons fans > start blowing like crazy regularly for a few seconds. When observing > system load, it's just hovering around 0.2 - 0.4, no peaks observed. > > [...] So, while the fan misbehavior started around the time of the kernel upgrade, the reason turned out to be a lot simpler. For me, it was a dust problem. After thorough cleaning, the problem is gone. :D Hans
Bug#987030: linux-image-5.10.0-6-amd64 - Fans speed maximum - CPU load < 1%
Hi, On 4/15/21 10:51 PM, klak wrote: > Package: linux-image-5.10.0-6-amd64 > Version: 5.10.28-1 > > Hello Maintainer, > > every few minutes the fans turn to maximum for a few seconds. The CPU > load is less than 1 %, but the fans are turning maximum. The problen > starts with version 5.9. I didn't see anything conspicuous in the > syslog. The machine is a KVM host and the problem also occurs when it > is idle. I have the same issue here, it started at the moment I moved from the 4.19 kernel to 5.9, and now 5.10. For totally non-obvious reasons fans start blowing like crazy regularly for a few seconds. When observing system load, it's just hovering around 0.2 - 0.4, no peaks observed. This is an Intel NUC with just a Buster system used as mostly inactive desktop. FWIW, attached are output of lshw and lspci -v. > Board + CPU : > = > DMI: Intel Corporation S5520HC/S5520HC, BIOS > S5500.86B.01.00.0064.050520141428 05/05/2014 > > smpboot: CPU0: Intel(R) Xeon(R) CPU L5640 @ 2.27GHz (family: > 0x6, model: 0x2c, stepping: 0x2) > > Performance Events: PEBS fmt1+, Westmere events, 16-deep LBR, Intel PMU > driver. > DMAR: Intel(R) Virtualization Technology for Directed I/O Hans dorothy description: Mini PC product: NUC8i5BEH (BOXNUC8i5BEH) vendor: Intel(R) Client Systems version: J72747-305 serial: G6BE94400JDL width: 64 bits capabilities: smbios-3.2.1 dmi-3.2.1 smp vsyscall32 configuration: boot=normal chassis=mini family=Intel NUC sku=BOXNUC8i5BEH uuid=889D1A9F-A26C-DC8C-5D1C-1C697A09E4C6 *-core description: Motherboard product: NUC8BEB vendor: Intel Corporation physical id: 0 version: J72692-307 serial: GEBE94TU slot: Default string *-firmware description: BIOS vendor: Intel Corp. 
physical id: 0 version: BECFL357.86A.0072.2019.0524.1801 date: 05/24/2019 size: 64KiB capacity: 16MiB capabilities: pci upgrade shadowing cdboot bootselect socketedrom edd int13floppy1200 int13floppy720 int13floppy2880 int5printscreen int14serial int17printer acpi usb biosbootspecification uefi *-memory description: System Memory physical id: 3b slot: System board or motherboard size: 32GiB *-bank:0 description: SODIMM DDR4 Synchronous 2400 MHz (0.4 ns) product: CT16G4SFD824A.M16FE vendor: 859B physical id: 0 serial: E3029A84 slot: SODIMM1 size: 16GiB width: 64 bits clock: 2400MHz (0.4ns) *-bank:1 description: SODIMM DDR4 Synchronous 2400 MHz (0.4 ns) product: CT16G4SFD824A.M16FE vendor: 859B physical id: 1 serial: E302B39D slot: SODIMM2 size: 16GiB width: 64 bits clock: 2400MHz (0.4ns) *-cache:0 description: L1 cache physical id: 45 slot: L1 Cache size: 256KiB capacity: 256KiB capabilities: synchronous internal write-back unified configuration: level=1 *-cache:1 description: L2 cache physical id: 46 slot: L2 Cache size: 1MiB capacity: 1MiB capabilities: synchronous internal write-back unified configuration: level=2 *-cache:2 description: L3 cache physical id: 47 slot: L3 Cache size: 6MiB capacity: 6MiB capabilities: synchronous internal write-back unified configuration: level=3 *-cpu description: CPU product: Intel(R) Core(TM) i5-8259U CPU @ 2.30GHz vendor: Intel Corp. physical id: 48 bus info: cpu@0 version: Intel(R) Core(TM) i5-8259U CPU @ 2.30GHz serial: To Be Filled By O.E.M. 
slot: U3E1 size: 3139MHz capacity: 3800MHz width: 64 bits clock: 100MHz capabilities: lm fpu fpu_exception wp vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp x86-64 constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp md_clear flush_l1d cpufreq configuration: cores=4 enabledcores=4
Bug#983862: PVH -- cannot remove vm with pci passthrough
reassign 983862 src:xen 4.11.4+57-g41a822c392-2 tags 983862 + upstream thanks Hi Adi! On 3/2/21 12:46 PM, Adi Kriegisch wrote: > Package: xen-utils-4.11 > Version: 4.11.4+57-g41a822c392-2 > Severity: minor > > Dear maintainers, > > we, by accident, added a pci passthrough device config to a pvh vm and were > able to boot that machine. But shutdown did not work with the following > error message: > | xl: libxl_pci.c:1427: do_pci_remove: Assertion `type == > LIBXL_DOMAIN_TYPE_PV' failed. > To remove the virtual machine and free its resources a reboot of Dom0 was > necessary. A corresponding assert when creating the machine seems to be > missing. > We consider this to be a bug, because we should not have been able to 'xl > create' that machine in the first place (or would have needed a way to > dispose the vm). Aha. Interesting. I just had a look at libxl_pci.c in the latest upstream code, and I think the same bug still exists. PCI passthrough is still not supported for PVH, so it should refuse, a bit higher up in the call stack. Probably not with an assert, but with a nice error message. :) I think I'm going to have a closer look at it somewhere in the next week. Note that the bug fix will not reach Xen 4.11 (or our package) any more. Regards, Hans van Kranenburg
Bug#981052: xen: XSA-360: IRQ vector leak on x86
Hi,

On 1/25/21 8:08 PM, Salvatore Bonaccorso wrote:
> Source: xen
> Version: 4.14.0+88-g1d1d1f5391-2
> Severity: important
> Tags: security upstream
> X-Debbugs-Cc: car...@debian.org, Debian Security Team
>
>
> Hi
>
> For details see https://xenbits.xen.org/xsa/advisory-360.html .
>
> It does not affect version in buster afaict.

Indeed. Currently upstream stable-4.11 is at commit 310ab79875, which is actually the same as our buster-security package (4.11.4+57-g41a822c392-2) because the last upload was done using the embargoed patches.

Unless something really interesting suddenly happens next Tuesday, there won't be a buster security update happening together with the 10.8 point release.

For unstable, I plan to do something at the end of this week, and base it on the current stable-4.14, of course with the XSA-360 thing. And, we will have reproducible builds, woohoo!

Thanks,
Hans
Bug#977148: Removing Xen hypervisor packages does not update-grub
Package: src:xen
Version: 4.11.4+57-g41a822c392-1
Severity: normal

When removing the Xen packages, the grub menu entries to boot should be removed. Currently, thanks to a missing postfix of a postrm filename...

  xen-hypervisor-V-F.postrm
  vs.
  xen-hypervisor-V-F.postrm.vsn-in

...this script is ignored and not installed. This is the maintainer script that contains the update-grub command.

The result of this packaging bug is that after removing Xen, the system remains unbootable, except when interacting manually with the grub menu.

I'd like to also apply the fix to Debian stable since:
* The script was there before, the bug was introduced during a refactoring in the Xen 4.11 packaging.
* The fix is very small and targeted.
* It's rather embarrassing to cause a system to be unbootable for someone.
* We cannot require all Debian users to have proper OOB in place to deal with a situation like this.

Hans
Bug#962267: [Pkg-xen-devel] Bug#962267: Bug#962267: xen: please consider to not install NEWS into runtime library packages
Hi, On 6/5/20 1:34 PM, Ansgar wrote: > On Fri, 2020-06-05 at 12:09 +0200, Hans van Kranenburg wrote: >>> Installing NEWS into xen*, but not libxen* probably still reaches all >>> relevant users. >> >> Yes, that makes sense. >> >> OTOH, what if there was a really weird problem with libxenmisc4.11 that >> we would like to pro-actively inform users about? > > In that case shipping NEWS in libxen* would of course be fine in my > opinion even if it also includes some additional information that might > only be relevant to users of the other NEWS files. > >> I guess there is only one NEWS per source package? > > You can have `debian/.NEWS` for per-binary NEWS when using > `dh_installchangelogs` or install them in some other way. But it > increases overhead and I would personally avoid having per-binary NEWS > for this reason. Can you help me by explaining me what your current expectations regarding this issue are? You ask for not having NEWS in specific binary packages, but then subsequently explain that you'd prefer to avoid doing exactly that. Adding information to NEWS is quite an exceptional thing to happen. It's only used for cases in which the user needs to take actions or needs to be aware of a real problem that needs to be solved outside of the context of what we can do in the packaging and Debian. My opinion is that adding the extra complexity is not warranted to fix the accidental annoyance of pressing a key on the keyboard for a user who chooses to install apt-listchanges. Thanks, Hans
Bug#976597: Xen Python dependencies are not specific enough
Package: src:xen Version: 4.14.0+80-gd101b417b7-1 Severity: normal There is indeed something wrong that should be fixed. Creating a Debian bug for it now. tl;dr Currently Xen package needs Python 3.9 as default /usr/bin/python3 but the Xen packages went unstable->testing while testing has 3.8 as default, causing pygrub to fail to find and import xenfsimage.cpython-39-x86_64-linux-gnu.so. On 12/5/20 10:43 AM, Alexander Dahl wrote: > Hello, > > FTR, we found what seems to be the problem in IRC yesterday, see > below. > > On Thu, Dec 03, 2020 at 10:56:12PM +0100, Alexander Dahl wrote: >> On Tue, Nov 24, 2020 at 05:41:42PM +0100, Hans van Kranenburg wrote: >>> [...] >>> >>> Any help with testing is appreciated, especially since there are so many >>> combinations of hardware, different architectures and use cases (using >>> legacy BIOS or EFI, PV, PVH, HVM, different boot loaders like pvgrub, >>> pygrub, etc etc). >> >> x86_64 host here, and some old 32 bit virtual machines, no weird >> network or hardware pass through setup, rather simple. HVM and pvgrub >> based DomU VMs run fine so far. pygrub based VMs, both 32 bit and 64 >> bit fail with the following error: >> >> Traceback (most recent call last): >> File "/usr/lib/xen-4.14/bin/pygrub", line 27, in >> import xenfsimage >> ModuleNotFoundError: No module named 'xenfsimage' > > Testing has python 3.8 as default at the moment, while unstable > already has 3.9. The file packaged for testing however is > 'xenfsimage.cpython-39-x86_64-linux-gnu.so' but that seems to be for > python 3.9, not for 3.8. When starting pygrub manually with python3.9 > that error goes away, but I suppose that would not work from within > the xen config, or does it? 
I have not dived further into this yet, but I can think of the following TODO items, if anyone wants to help with research and fixing (yes please): * Look at the build logs (buildd logs are linked from the PTS), and try to understand why in Xen 4.11 (with Python 2) we just have fsimage.so but with 4.14 and Python 3 we have this more specific longer name with cpython-39-x86_64-linux-gnu in it. * Figure out what we need to do to make a python dependency more specific, so that the xen packages would have been blocked from the transition to testing as long as python3-defaults in testing is not pointing to the needed version. Maybe there's something to be found in the Debian Python Policy? https://www.debian.org/doc/packaging-manuals/python-policy/ Or, maybe if it's not super obvious we can ask some python packaging IRC or mailing list or otherwise debian-devel@ for help. Hans
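A note on the first TODO item: the longer file name is the ABI-tagged extension suffix that CPython 3 uses (introduced by PEP 3149). The following is a minimal sketch, not taken from the Xen packaging, of how one could check whether a given file name is one that an interpreter would consider when importing a module. The helper names are made up for illustration, and the suffix lists in the usage note are typical values for Python 3.8 and 3.9 on amd64, stated here as assumptions.

```python
import importlib.machinery


def candidate_filenames(module_name, suffixes=None):
    """List the file names an interpreter would try for an extension module.

    `suffixes` defaults to the suffixes of the interpreter running this
    code; pass an explicit list to reason about another Python version.
    """
    if suffixes is None:
        suffixes = importlib.machinery.EXTENSION_SUFFIXES
    return [module_name + suffix for suffix in suffixes]


def is_importable_name(module_name, filename, suffixes=None):
    """Return True if `filename` is one of the candidates for `module_name`."""
    return filename in candidate_filenames(module_name, suffixes)
```

With the assumed Python 3.8 suffix list `[".cpython-38-x86_64-linux-gnu.so", ".abi3.so", ".so"]`, the name `xenfsimage.cpython-39-x86_64-linux-gnu.so` is not a candidate, which matches the ModuleNotFoundError reported above. A plain `xenfsimage.so` would be a candidate under either version.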
Bug#976109: [Pkg-xen-devel] Bug#976109: xen: CVE-2020-29040
Hi,

On 11/29/20 8:50 PM, Salvatore Bonaccorso wrote:
> Source: xen
> Version: 4.14.0+80-gd101b417b7-1
> Severity: grave
> Tags: security upstream
> Justification: user security hole
> X-Debbugs-Cc: car...@debian.org, Debian Security Team
>
>
> Hi,
>
> The following vulnerability was published for xen.
>
> CVE-2020-29040[0]:
> | An issue was discovered in Xen through 4.14.x allowing x86 HVM guest
> | OS users to cause a denial of service (stack corruption), cause a data
> | leak, or possibly gain privileges because of an off-by-one error.
> | NOTE: this issue is caused by an incorrect fix for CVE-2020-27671.

Yes, there's also a limited number of cases in which this is possible, and you just left that text out, which makes it sound a lot more horrible: "Only x86 HVM guests which have physical devices passed through to them can leverage the vulnerability."

I suspect that anyone who today is using Debian testing to run Xen and is also passing through devices is doing that to test performance use cases, and is not passing them through to untrusted guests.

> If you fix the vulnerability please also make sure to include the
> CVE (Common Vulnerabilities & Exposures) id in your changelog entry.

Yes, it will of course be included in the next upload.

Hans
Bug#942611: [Pkg-xen-devel] Bug#942611: xen-doc: Various text files stored as .txt.gz, but index references .txt
tags 942611 + pending thanks Hi Diederik, On 10/19/19 2:19 AM, Diederik de Haas wrote: > Package: xen-doc > Version: 4.11.1+92-g6c33308a8d-2+b1 > Severity: normal > > file:///usr/share/doc/xen/html/index.html contains a link to > file:///usr/share/doc/xen/html/misc/vtd.txt (VT-d HOWTO), but that file > doesn't exist. There is a .../misc/vtd.txt.gz file though. > A similar pattern can be found with various other .txt files, but not all. > Since this is HTML documentation and presumably meant to be read in a browser > (which is what I did), I think those .txt.gz files should be stored as .txt, > so > they can be viewed in the browser and it would make the hyperlink actually > work. Yes, you are right. I do agree. We already have the html documentation collection in a separate package, xen-doc. So, when someone installs that package, they explicitly choose to do so. If they want to browse around at file:///usr/share/doc/xen/html/ then there should not be broken links all over the place. The difference between compressing or not compressing is 5.1M vs 5.2M measured by doing dpkg-deb -x on the xen-doc .deb before and after and then doing du -sch on the directory in which it was unpacked. https://salsa.debian.org/xen-team/debian-xen/-/commit/38cde19f59ee4121e048b23cfe7e9ea4ddcbdf60 (commit id will vanish because of heavily rebasing later, it's "d/rules: do not compress /usr/share/doc/xen/html") > (Sidenote: I doubt including a file mentioning how to compile your own 2.6.18 > kernel > to include support for VT-d is useful, and it is 10+ y/o, but that's probably > an > upstream issue) Heh, yes. Patches to remove obsolete documentation can be sent upstream directly. Have fun, Hans
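For anyone who wants to check an installed documentation tree for this pattern, here is a minimal sketch. It assumes a simple case: it only looks at relative `href` values of `<a>` tags in one HTML file and reports those whose target is missing while a `.gz` sibling exists. The function and class names are invented for this example and are not part of any Debian tooling.

```python
import os
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> tag that is fed in."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def links_broken_by_compression(index_path):
    """Return relative links whose target only exists with a .gz suffix."""
    root = os.path.dirname(index_path)
    parser = HrefCollector()
    with open(index_path, encoding="utf-8") as fh:
        parser.feed(fh.read())
    broken = []
    for href in parser.hrefs:
        # Skip absolute URLs and in-page anchors; only local files matter.
        if "://" in href or href.startswith("#"):
            continue
        target = os.path.join(root, href)
        if not os.path.exists(target) and os.path.exists(target + ".gz"):
            broken.append(href)
    return broken
```

Calling `links_broken_by_compression("/usr/share/doc/xen/html/index.html")` on a system with the affected xen-doc version would be expected to list entries such as `misc/vtd.txt`; links with fragments or query strings are not handled by this sketch.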
Bug#963607: xen-hypervisor-4.11-amd64: Xen Hypervisor kernel fails to load arcmsr module with "arcmsr0: dma_alloc_coherent got error" message.
tags 963607 + moreinfo thanks Hi Alex, On 7/2/20 9:26 AM, debianb...@red-sand.com wrote: > > [...] > > I am about to purchase a new SAS HBA card to test as we have a number of > these servers with Areca cards that I imagine will have the same problem > on Xen 4.11. I am leaning towards mpt3 driver cards but we have had > problems with mpt3 previously so I am hesitating there too. mpt2 has > been rock solid. > > If you can think of anything else that I could try that would be > excellent. > > [...] No, I don't really have a suggestion. Did you get new hardware? What do you want to do with this bug report? There are no actionable items for us, we cannot solve a hardware problem with packaging changes :) so I'd rather close it. Thanks, Hans
Bug#944247: xen domU crashes under high i/o load if you use qcow2 images
tags 944247 + moreinfo severity 944247 normal thanks Hi Mario, On 11/6/19 4:46 PM, mario wrote: > Source: xen > Severity: important > > Dear Maintainer, > > we have updated our server from debian oldstable (which unfortunately wasn't > running stable after the last update, bug reported) to debian buster. > > unfortunately xen doesn't work reliably there either: > > the virtual server crashes every 1-2 week with i/o problems and sometimes > also takes other domU instances with it. > we use qcow2 images. > > the harddisk of the domU is simply no longer accessible for the linux kernel, > no logfiles are available. in the xl console the following last lines can be > read, login not possible: > > [ 1450.976415] INFO: task nginx:376 blocked for more than 120 seconds. > [ 1450.976423] Not tainted 4.9.0-9-amd64 #1 Debian 4.9.168-1+deb9u5 > [ 1450.976428] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables > this message. > [ 1450.976469] INFO: task nginx:377 blocked for more than 120 seconds. > [ 1450.976474] Not tainted 4.9.0-9-amd64 #1 Debian 4.9.168-1+deb9u5 > [ 1450.976479] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables > this message. > [ 1450.976624] INFO: task nginx:378 blocked for more than 120 seconds. > > the process varies: > [1523692.508073] INFO: task jbd2/xvda2-8:159 blocked for more than 120 seconds > [1523692.508084] Not tainted [...] > > all hard disk accesses fail as if the i/o system is completely dead. > only "xl destroy " and recreate will help This report is now a year old. Unfortunately it did not get any reply. This might have several reasons, and one of them is probably that there's not someone else around reading it that uses the same storage configuration and as well runs into the same problem. > you can easily reproduce this with the tool stress "stress -c 8 -i 8 -d 8". > it takes a maximum of 10 minutes until the vm crashes. > > in our experience, as a workaround you can convert all images to raw. 
after > our tests, the error will no longer occur. > but since we need the snapshot functions of qcow2 images, this is not a > permanent solution. > > does anyone else have problems with qcow2 images and xen under buster? > maybe this also concerns qemu? > > [...] To be honest, I do not know. Have you been able to find out more about the problem yet, in the last year? Have you taken steps to try narrow down the problem by investigating other combinations of used software with/without xen? I mean, for example, reboot into just Linux and mount the qcow2 image somewhere and do the same load test to see if it's also happening when eliminating Xen from the equation? The bug report right now is not really actionable for anyone else than yourself. As Debian Xen team we unfortunately do not have the bandwidth to go set up a test server with the same configuration as you have and try hammer on it and cause the same problem to happen. Thanks, Hans
Bug#955994: [Pkg-xen-devel] Bug#955994: xen-utils-common: Could not start vif
reassign 955994 src:xen tags 955994 + pending thanks Hi Samuel, On 4/5/20 9:14 PM, Samuel Thibault wrote: > Package: xen-utils-common > Version: 4.11.3+24-g14b62ab3e5-1 > Severity: normal > Tags: patch > > Hello, > > I was having issues with starting domains with vif-nat: > > ♭ xl cr -c mydom > Parsing config from mydom > libxl: error: libxl_exec.c:117:libxl_report_child_exitstatus: > /etc/xen/scripts/vif-nat online [27191] exited with error status 1 > libxl: error: libxl_device.c:1286:device_hotplug_child_death_cb: script: > /etc/xen/scripts/vif-nat failed; error detected. > libxl: error: libxl_create.c:1519:domcreate_attach_devices: Domain 25:unable > to add vif devices > libxl: error: libxl_domain.c:1034:libxl__destroy_domid: Domain > 25:Non-existant domain > libxl: error: libxl_domain.c:993:domain_destroy_callback: Domain 25:Unable to > destroy guest > libxl: error: libxl_domain.c:920:domain_destroy_cb: Domain 25:Destruction of > domain failed > > It happens that it seems that's merely because handle_iptable() does not > pass a return value, and I guess the return value is thus that of the > latest command, which may not be true, and that makes vif-nat fail. The > attached patch fixes that. Yes, you are completely right. Thanks for spotting this. > [...] I just added the explicit 0. I did not change the second return line, since that code is unreachable anyway and it's patching upstream content. https://salsa.debian.org/xen-team/debian-xen/-/commits/knorrie/sid Thanks, Hans
Bug#939186: [Pkg-xen-devel] Bug#939186: HVM + Balloon crashes Xen hypervisor
Hi, On 9/2/19 5:18 AM, Elliott Mitchell wrote: > Package: xen-hypervisor-4.8-amd64 > Version: 4.8.5+shim4.10.2+xsa282-1+deb9u11 > > Trying to create a HVM domain with memory != maxmem reliably crashes > Debian's build of Xen 4.8. This may be a nonsensical configuration, but > it still shouldn't cause everything, including the hypervisor to crash. > > I recall running into this with 4.4 as well. Can you still reproduce this with Xen 4.11 or 4.14? If not, can you mail 939186-cl...@bugs.debian.org to close it? I just tried a few things with maxmem and memory with a PVH guest on Xen 4.14, and it just seems to work like it should. Hans
Bug#912975: xen-hypervisor-4.8-amd64: Dom0 crashes randomly without logs on Debian Stretch with Xen 4.8.4
Hi,

This bug was reported against Xen 4.8 (which is out of support and out of security support now) and there has not been any activity for almost two years.

I'm cleaning up old open bugs, and I will close the issue now.

If you found a solution to this problem, please let us know, so the information is added in the bug report for anyone else who might run into the same situation.

If the problem still persists with Xen 4.11 in Debian stable, please reply and reopen.

Thanks,
Hans
Bug#934786: xen-system-amd64: xen host crashes when calling "npm run build" in a vm (reproducible)
Hi Mario,

On 8/14/19 11:00 PM, mario wrote:
> Package: xen-system-amd64
> Version: 4.8.5+shim4.10.2+xsa282-1+deb9u11
> Severity: important
>
> hello everyone,
>
> we have a vm with kernel 4.9.0-5-amd64 running dabian oldstable
> if we run an "npm run build" on one of the virtual machines, the whole
> xen-host system will restart (reset)
> there is no message, neither in the kernel nor in the syslog.
>
> if we update the kernel of the virtual machine to 4.9.0-9-amd64, the problem
> is no longer there and the build runs without errors.
>
> the kernel of the hosts system doesn't matter, even the latest 4.9.0-9-amd64
> doesn't help.
>
> we have a second server (with different hardware) also here i can crash the
> complete xen-server from the vm with all vms that are running.
>
> the whole thing works reproducible by starting "npm run build" on this
> special vm
>
> any idea how we can narrow that down and provide more information???

Interesting. Do you know if it's Xen crashing, or the dom0 Linux kernel?

The Xen wiki has some hints about debugging:
https://wiki.xen.org/wiki/Debugging_Xen

The first thing I would recommend is getting the server serial port working properly so that you can see Xen messages there.

This bug was reported against Xen 4.8 (which is out of support and out of security support now). I would highly recommend first upgrading to Xen 4.11 in current Debian stable and seeing if you can still reproduce this problem.

If you no longer have this problem, or found a solution, please let us know. If there's no activity, I will close this issue in about a month.

Thanks,
Hans
Bug#968965: [Pkg-xen-devel] Bug#968965: xen: FTBFS woes in sid
On 11/20/20 8:02 PM, Hans van Kranenburg wrote: > So, > > On 9/21/20 4:16 PM, Hans van Kranenburg wrote: >> [...] >> > [...] > >8 > > dh_install: warning: Cannot find (any matches for) > "usr/lib/debug/usr/lib/xen-*/boot/*" (tried in ., debian/tmp) > > dh_install: warning: xen-utils-4.14 missing files: > usr/lib/debug/usr/lib/xen-*/boot/* > dh_install: error: missing files, aborting > > >8 > > I can only find CONFIG_PV_SHIM=n in the build log. What is going on > here? Attached is the build log. Ok, this probably has something to do with upstream commit 8845155c83 "pvshim: make PV shim build selectable from configure" (Xen 4.12) which causes the shim not to be built during our i386 build any more. In Xen 4.11 we have commit a516bddbd3 "tools/firmware/Makefile: CONFIG_PV_SHIM: enable only on x86_64". The part of this file that this patch changes is removed in the above mentioned commit. Because all of this is such a big mess, I just tried to revert 8845155c83 and then do 0b898ccc2 and a516bddbd3 on top of the previous code again. And, yes, now it goes through, and ./usr/lib/xen-4.14/boot/xen-shim is included in the i386 package. At least we have a workaround now. > My WIP branch is here (including the make-patches commit, it's ready to > build). I also forwarded the thing to latest stable-4.14. Again at: > https://salsa.debian.org/xen-team/debian-xen/-/commits/knorrie/4.14/ I'll rerun both the amd64 and i386 build here and actually boot the amd64 packages in a test environment. If success, then I'm going to try put this in experimental again so we can see if it all succeeds on the buildds. Then after final review we should be able to upload to unstable beginning next week. K
Bug#968965: xen: FTBFS woes in sid
On 11/21/20 5:40 AM, Elliott Mitchell wrote: > On Fri, Nov 20, 2020 at 08:02:26PM +0100, Hans van Kranenburg wrote: >> So, >> >> On 9/21/20 4:16 PM, Hans van Kranenburg wrote: >>> [...] >>> >>> gcc-Wl,-z,relro -Wl,-z,now -pthread -Wl,-soname >>> -Wl,libxentoolcore.so.1 -shared -Wl,--version-script=libxentoolcore.map >>> -o libxentoolcore.so.1.0 handlereg.opic >>> /usr/bin/ld: i386:x86-64 architecture of input file `handlereg.opic' is >>> incompatible with i386 output >>> /usr/bin/ld: handlereg.opic: file class ELFCLASS64 incompatible with >>> ELFCLASS32 >>> /usr/bin/ld: final link failed: file in wrong format >>> collect2: error: ld returned 1 exit status >> >> This one is caused by "debian/rules: Combine shared Make args". I >> reverted that change for now. >> >> [...] > > I was going to type, "That can't be true! Both sections are identical, > so that commit *couldn't* have done it!" > > Being the careful sort, look closer. Look closer. Then realize if one > reads fast they look identical, but they're getting *slightly* different > values for ${XEN_TARGET_ARCH}. Mainly for $(make_args_xen), > ${XEN_TARGET_ARCH} gets $(xen_arch_$(flavour)), but for > $(make_args_tools), ${XEN_TARGET_ARCH} gets $(xen_arch_$(DEB_HOST_ARCH)). > > Three of us and we didn't spot that difference. Should still combine > ${XEN_COMPILE_ARCH} which remains identical for both values. Ok, I will make it a partial revert and add the above information about it. Thanks. Hans
Bug#975062: Python 3 (pygrub) in 4.14 packages
Hi! On 11/18/20 6:45 PM, Ian Jackson wrote: > Hans van Kranenburg writes ("Bug#975062: Python 3 (pygrub) in 4.14 packages"): >> So, apparently there are cases in which pygrub 'works' and in which it >> does not, and apparently using pygrub with "amd64 kernel and Xen tools >> but i386 userland" is problematic, and I remember some remarks which I >> can't find back about that that use case was probably already broken >> always, in the past. > > The problem with pygrub with 32-bit userland is as follows: > > * Xen has to be 64-bit since there is no 64-bit Xen. ^^ 32? > * dom0 kernel bitness and Xen tools bitness must match because >Xen 32/64 compat ABI understands only one bitness for dom0 >and Xen dom0 tools make hypercalls directly so must match >the kernel. > > * 32-bit kernels are starting not to be able to drive hardware >(big PCI bars, bugs, etc.) so you want a 32-bit kernel. ^^ 64? > * So you must have libxen*:amd64. > > * pygrub uses python, obviously. It needs to load the xenfsimage >library, which is part of xen tools, since that is the userland >library tht understands the guest filesystem to fish out the guest >kernel. > > * The xenfsimage library is not in its own package [1] - it's in with >some other Xen libraries. But it is going to be loaded into a >python interpreter, so it needs to match the bitness of the python >interpreter. > > * You can't install a 64-bit python interpreter without basically >doing the whole 32-to-64 crossgrade. (That crossgrade is what I >ended up doing on my home machine.) > > * You can't co-install libxen*:i386 because the Xen libraries aren't >properly multiarched. [2] > > * This gets worse now that the Xen packages use python3. Previously >with a minimal but modern system you might be able to get away with >having a 64-bit python2 and a 32-bit python3. > > Both [1] and [2] are in principle bugs in the Xen packages. Upstream > sentiment seems to be that 32-bit userland is not really a very good > idea any more anyway. 
> So we could solve this by fixing [1] or [2]

My personal opinion is that there are more interesting horses to fry (or what was the saying) for our very bandwidth limited team. So yes, let's move this from the perfection into the known issues department for now (bullseye).

> or
> we could expect people to use 64-bit dom0 userland (and crossgrade if
> need be).

When someone shows up with a real world issue who's really panicking after recklessly upgrading (after the bullseye release halfway through 2021) we can probably help by giving pointers and instructions. Until that actually happens, we should not spend time writing documentation about how to do that etc.

>> I wanted to find out about this and set up some test cases to reproduce
>> things (I've never used pygrub yet), but that obviously did not happen
>> yet. I have some stuff going on in my personal life that is taking up a
>> lot of time currently. What is rather easy for *me* is to help
>> organizing the work and managing todo lists etc, but not learning new
>> stuff ATM.
>>
>> So, my current questions are:
>>
>> 1. Is pygrub a blocker for having Xen 4.14 in unstable? Because that
>> should be our first team-goal now.
>
> I think yes, a working pygrub ought to be a blocker for 4.14 in
> unstable.
>
> But I think we have that - I rebuilt the existing packages for buster
> and it WFM.

OK.

>> 2. What exactly is going on, can we make a list/table/whatever about in
>> which cases pygrub 'does not work' (in more detail, how does it fail).
>> 3. pygrub keeps being the thing that always causes problems. What would
>> be your (asking anyone who wants to think along) ideas about which
>> well-defined situations/test-cases we should have to execute instead of
>> having the users report problems after big package changes?
>
> IDK about any other problems than the bitness one above.

Ok, thanks a lot for the write up, both, and now we have a Debian bug to look back to, which is a bit easier to track than mailing list messages.
So, I'm crossing out this issue now as blocker. Hans
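To tell whether a given dom0 is in the problematic "64-bit kernel, 32-bit userland" situation described above, one can compare the pointer size of the Python interpreter that would run pygrub with the machine type the kernel reports. This is only a sketch under assumptions: the set of 64-bit machine names is a small illustrative list, the function names are invented, and a process started under a 32-bit personality (e.g. via linux32) would see a different machine string.

```python
import platform
import struct

# Illustrative, incomplete set of machine strings that denote 64-bit kernels.
SIXTY_FOUR_BIT_MACHINES = {"x86_64", "amd64", "aarch64"}


def interpreter_bits():
    """Pointer width of the running Python interpreter, in bits."""
    return struct.calcsize("P") * 8


def userland_kernel_mismatch(bits=None, machine=None):
    """Return True for a 32-bit interpreter running on a 64-bit kernel."""
    if bits is None:
        bits = interpreter_bits()
    if machine is None:
        machine = platform.machine()
    return bits == 32 and machine in SIXTY_FOUR_BIT_MACHINES
```

If `userland_kernel_mismatch()` returns True, the amd64 xenfsimage library cannot be loaded into that interpreter, which is the bitness problem Ian describes.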
Bug#975062: Python 3 (pygrub) in 4.14 packages
On 11/18/20 9:39 PM, Ian Jackson wrote: > It seems I was distracted when I wrote this mail. > > Ian Jackson writes ("Re: Bug#975062: Python 3 (pygrub) in 4.14 packages"): >> The problem with pygrub with 32-bit userland is as follows: >> >> * Xen has to be 64-bit since there is no 64-bit Xen. > ^^ 32 > >> * 32-bit kernels are starting not to be able to drive hardware >>(big PCI bars, bugs, etc.) so you want a 32-bit kernel. > ^^ 64 > HAH, thanks. \:D/ Hans
Bug#975062: Python 3 (pygrub) in 4.14 packages
Package: src:xen Version: 4.14.0-1~exp1 Control: submitter -1 ehem+deb...@m5p.com X-Debbugs-CC: ehem+deb...@m5p.com, ijack...@chiark.greenend.org.uk Hi, I think this should be in a bug report in the BTS to track it in a better way. 8< Forwarded Message 8< Subject: [Pkg-xen-devel] Python 3 in 4.14 packages Date: Sat, 26 Sep 2020 22:44:25 -0700 From: Elliott Mitchell To: pkg-xen-de...@alioth-lists.debian.net I was trying to test `pygrub` and found the Python 3 version is definitely broken in the 4.14 packages. I was able to get the script to display the help message by adding "/usr/lib/xen-4.14/lib/" to sys.path. The existing line: sys.path.insert(1, sys.path[0] + '/../lib/python') Is distinctly odd, usually this is better expressed: sys.path.append(os.path.join(sys.path[0], "libexec")) (though I suppose we can assume Linux, but this is Bad Practice) The way some portions of `pygrub` are packaged are distinctly odd. Certainly xc.so and xs.so are linked to core Xen libraries and need to be version-specific. Yet libfsimage.so appears independant of the Xen version and should likely track `pygrub`, rather than matching the system Xen version. >8 Forwarded Message >8 I also have a little snippet from IRC, which is about this, where Ian reports that he's seen it working. https://salsa.debian.org/xen-team/debian-xen/-/snippets/500 So, apparently there are cases in which pygrub 'works' and in which it does not, and apparently using pygrub with "amd64 kernel and Xen tools but i386 userland" is problematic, and I remember some remarks which I can't find back about that that use case was probably already broken always, in the past. I wanted to find out about this and set up some test cases to reproduce things (I've never used pygrub yet), but that obviously did not happen yet. I have some stuff going on in my personal life that is taking up a lot of time currently. 
What is rather easy for *me* is to help organizing the work and managing todo lists etc, but not learning new stuff ATM. So, my current questions are: 1. Is pygrub a blocker for having Xen 4.14 in unstable? Because that should be our first team-goal now. 2. What exactly is going on, can we make a list/table/whatever about in which cases pygrub 'does not work' (in more detail, how does it fail). 3. pygrub keeps being the thing that always causes problems. What would be your (asking anyone who wants to think along) ideas about which well-defined situations/test-cases we should have to execute instead of having the users report problems after big package changes? Hans P.S. Next message after the commercials will be on #968965 which is the other biggest issue for Xen 4.14 in unstable now.
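As an illustration of the path handling discussed in the forwarded message, here is a minimal sketch of computing the library directory relative to the script directory with os.path instead of string concatenation. The directory layout (`../lib/python` next to the script's `bin` directory) is taken from the quoted `sys.path.insert` line; the helper names are invented and this is not the actual pygrub code.

```python
import os
import sys


def python_lib_dir(script_dir):
    """Return the normalised '<script_dir>/../lib/python' directory."""
    return os.path.normpath(
        os.path.join(script_dir, os.pardir, "lib", "python"))


def add_lib_dir(path_list, script_dir, index=1):
    """Insert the library directory into `path_list` once, at `index`."""
    lib_dir = python_lib_dir(script_dir)
    if lib_dir not in path_list:
        path_list.insert(index, lib_dir)
    return lib_dir
```

In a script this would be used as `add_lib_dir(sys.path, sys.path[0])`. Normalising the path also makes it easier to see in a traceback which directory was actually searched.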
Bug#970802: gcc-10: armhf: false positive when using -O2 and -Werror=format-truncation
On 9/23/20 4:59 PM, Julien Grall wrote: > X-Debbugs-CC: i...@xenproject.org > X-Debbugs-CC: h...@knorrie.org > Package: gcc-10 > Version: 10.2.0-9 > Severity: important > > Dear Maintainer, > > There was an FTBFS for Xen when building using GCC 10 on armhf (see > bug #9689645 [1]). FYI [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=968965#22 > After investigation, it looks a problem when with the optimizer in GCC > for armhf. > > [...] Thanks, Hans
Bug#961511: [Pkg-xen-devel] Bug#961511: [PATCH] d/xen-utils-common.xen.init: disable oom killer for xenstored
notfixed 961511 xen/4.14.0-1~exp1 thanks Right... so in the end I made an off-by-one error while rebasing and totally lost that commit. It's not actually in 4.14.0-1~exp1 now. That's bad. On 9/21/20 3:50 AM, Elliott Mitchell wrote: > This is fun. Actually isn't too difficult to trigger, simply slowly > reduce the memory Xen allocates to Dom0 and eventually the oom-killer is > likely to trigger (having tried to shrink Dom0 as far as possible, > believe me, I know). I had been wondering which of the Xen daemons could > be safely restarted since it is handy to restart daemons instead of whole > machine for security updates... > > Interestingly running `xenstored --help` mentions: > -I, --internal-db store database in memory, not on disk > > There is a run/xenstored/tdb file so I end up wondering if newer versions > are in fact storing everything in a file and restarting isn't so bad. Not by default, and I don't know if it's actually considered best practice. I could not find any info about this yet. I suspect it's not recommended. oxenstored has the following option in /etc/xen/oxenstored.conf: # Activate filed base backend persistent = false When enabling this, the file /run/xenstored/db gets rewritten a lot and I also see it's out of sync with what's in xenstore-ls after doing some things. So, it might me inconsistent when the process is oom-killed. > The patch switches the arguments from: > --exec "$try_xenstored" -- ... > to: > --exec /usr/bin/choom -- -n -1000 "$try_xenstored" -- ... > > I'm pretty sure start-stop-daemon is consuming the "--" and the second > "--" shouldn't be there. Well, I tested it and found out that it's needed... -# start-stop-daemon --start \ --pidfile "/run/xenstore.pid" \ --exec /usr/bin/choom -- -n -1000 \ /usr/lib/xen-4.14/bin/oxenstored --pid-file "/run/xenstore.pid" /usr/bin/choom: unrecognized option '--pid-file' Try 'choom --help' for more information. 
-# start-stop-daemon --start \ --pidfile "/run/xenstore.pid" \ --exec /usr/lib/xen-4.14/bin/oxenstored --test Would start /usr/lib/xen-4.14/bin/oxenstored . and with the extra separator: -# start-stop-daemon --start \ --pidfile "/run/xenstore.pid" \ --exec /usr/bin/choom -- -n -1000 \ /usr/lib/xen-4.14/bin/oxenstored -- --pid-file "/run/xenstore.pid" -# grep . /proc/$(pidof /usr/lib/xen-4.14/bin/oxenstored)/oom_* /proc/363043/oom_adj:-17 /proc/363043/oom_score:0 /proc/363043/oom_score_adj:-1000 -# cat /proc/$(pidof /usr/lib/xen-4.14/bin/oxenstored)/cmdline /usr/lib/xen-4.14/bin/oxenstored--pid-file/run/xenstore.pid How did you test it and how did you get a working process without the --? Hans
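The test output above suggests that start-stop-daemon only consumes the first "--" and hands everything after it to the program named by --exec, so the second "--" is what stops choom from treating --pid-file as one of its own options. A minimal sketch of that splitting, assuming this reading of the behaviour is right; args_after_first_ddash is a hypothetical helper written for illustration, not part of start-stop-daemon or the init script:

```shell
# Hypothetical helper: print every argument that follows the first "--",
# one per line. This mimics (under the assumption stated above) what a
# wrapper such as start-stop-daemon passes on to the --exec program.
args_after_first_ddash() {
    seen=0
    for arg in "$@"; do
        if [ "$seen" -eq 1 ]; then
            # Already past the first separator: pass the argument through,
            # including any later "--".
            printf '%s\n' "$arg"
        elif [ "$arg" = "--" ]; then
            seen=1
        fi
    done
}

# Example: the argument list from the working invocation above.
args_after_first_ddash --start --pidfile /run/xenstore.pid \
    --exec /usr/bin/choom -- -n -1000 \
    /usr/lib/xen-4.14/bin/oxenstored -- --pid-file /run/xenstore.pid
```

In this sketch the remainder still contains the second "--", which is the part choom would see before the oxenstored arguments.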
Bug#968965: [Pkg-xen-devel] Bug#968965: Bug#968965: xen: FTBFS in sid
notfixed -1 xen/4.14.0-1~exp1 reopen found -1 xen/4.14.0-1~exp1 thanks Hi, On 9/4/20 1:55 PM, Hans van Kranenburg wrote: > > On 8/24/20 7:03 PM, Gianfranco Costamagna wrote: >> Source: xen >> Version: 4.11.4+24-gddaaccbbab-1 >> Severity: serious >> >> Hello, looks like xen is FTBFS because of some bd-uninstallable python >> package and a gcc-10 related build failure. > > [...] Well, it seems we have more FTBFS, let's reuse this bug number to track it again? https://buildd.debian.org/status/package.php?p=xen=experimental --->8--- arm64 --->8--- gcc -MMD -MP -MF ./.mem_access.o.d -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable -Wno-unused-local-typedefs -O2 -fomit-frame-pointer -nostdinc -fno-builtin -fno-common -Werror -Wredundant-decls -Wno-pointer-arith -Wvla -pipe -D__XEN__ -include /<>/xen/include/xen/config.h -Wa,--strip-local-absolute -mcpu=generic -mgeneral-regs-only -I/<>/xen/include -fno-stack-protector -fno-exceptions -fno-asynchronous-unwind-tables -fcf-protection=none -Wnested-externs '-D__OBJECT_FILE__="mem_access.o"' -c mem_access.c -o mem_access.o mem_access.c: In function ‘p2m_mem_access_check’: mem_access.c:227:6: note: parameter passing for argument of type ‘const struct npfec’ changed in GCC 9.1 227 | bool p2m_mem_access_check(paddr_t gpa, vaddr_t gla, const struct npfec npfec) | ^~~~ --->8--- armhf --->8--- gcc -marm -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall -Wstrict-prototypes -Wdeclaration-after-statement -Wno-unused-but-set-variable -Wno-unused-local-typedefs -O2 -fomit-frame-pointer -D__XEN_INTERFACE_VERSION__=__XEN_LATEST_INTERFACE_VERSION__ -MMD -MP -MF .xenpmd.o.d -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -g -O2 -fdebug-prefix-map=/<>=. 
-fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Werror -I/<>/tools/xenpmd/../../tools/xenstore/include -I/<>/tools/xenpmd/../../tools/include -c -o xenpmd.o xenpmd.c xenpmd.c: In function ‘get_next_battery_file’: xenpmd.c:92:37: error: ‘%s’ directive output may be truncated writing between 4 and 2147483645 bytes into a region of size 271 [-Werror=format-truncation=] 92 | #define BATTERY_STATE_FILE_PATH "/tmp/battery/%s/state" | ^~~ xenpmd.c:117:52: note: in expansion of macro ‘BATTERY_STATE_FILE_PATH’ 117 | snprintf(file_name, sizeof(file_name), BATTERY_STATE_FILE_PATH, | ^~~ xenpmd.c:92:51: note: format string is defined here 92 | #define BATTERY_STATE_FILE_PATH "/tmp/battery/%s/state" | ^~ In file included from /usr/include/stdio.h:867, from xenpmd.c:35: /usr/include/arm-linux-gnueabihf/bits/stdio2.h:67:10: note: ‘__builtin___snprintf_chk’ output between 24 and 2147483665 bytes into a destination of size 284 67 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ^~~~ 68 |__bos (__s), __fmt, __va_arg_pack ()); |~ xenpmd.c:91:36: error: ‘%s’ directive output may be truncated writing between 4 and 2147483645 bytes into a region of size 271 [-Werror=format-truncation=] 91 | #define BATTERY_INFO_FILE_PATH "/tmp/battery/%s/info" |^~ xenpmd.c:114:52: note: in expansion of macro ‘BATTERY_INFO_FILE_PATH’ 114 | snprintf(file_name, sizeof(file_name), BATTERY_INFO_FILE_PATH, | ^~ xenpmd.c:91:50: note: format string is defined here 91 | #define BATTERY_INFO_FILE_PATH "/tmp/battery/%s/info" | ^~ In file included from /usr/include/stdio.h:867, from xenpmd.c:35: /usr/include/arm-linux-gnueabihf/bits/stdio2.h:67:10: note: ‘__builtin___snprintf_chk’ output between 23 and 2147483664 bytes into a destination of size 284 67 | return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1, | ^~~~ 68 |__bos (__s), __fmt, __va_arg_pack ()); |~ --->8--- i386 --->8--- gcc-Wl,-z,relro -Wl,-z,now -pthread -Wl,-soname 
-Wl,libxentoolcore.so.1 -shared -Wl,--version-script=libxentoolcore.map -o libxentoolcore.so.1.0 handlereg.opic /usr/bin/ld: i386:x86-64 architecture of input file `handlereg.opic' is incompatible with i386 output /usr/bin/ld: handlereg.opic: file class ELFCLASS64 incompatible with ELFCLASS32 /usr/bin/ld: final link failed: file in wrong format collect2: error: ld returned 1 exit status Hans
Bug#927071: [Pkg-xen-devel] Bug#927071: xen: More balloon-leak observation
Hi again, On 5/1/19 12:55 AM, Elliott Mitchell wrote: > On Mon, Apr 22, 2019 at 04:02:28PM +0200, Hans van Kranenburg wrote: >> On 4/22/19 1:10 AM, Elliott Mitchell wrote: >>> There is plenty of free memory for creating additional VMs (perhaps too >>> much, and that confused Xen?), so this is really puzzling that memory is >>> being ballooned away from Dom0. At this point I plan after the next >>> restart to double the allocation for Dom0 and see whether Dom0 is able >>> to last more than a week. >> >> Weird. Can you log memory stats over time, so that you can see when it >> happens, and correlate it to other events? > > At this point there is only one real pattern I've noticed: Always > `smartd` was the process which triggered the kernel OOM-killer. > > Originally I was attributing this to `smartd` doing some large memory > allocation during its night-time tasks (which I would attribute to > perhaps `smartd` not being that well written). Yet now, I never saw > anything else trigger the OOM-killer and I'm now willing to speculate > some I/O operation `smartd` was doing triggers a bug in Xen. At first I replied with "I haven't heard about this symptom before your report.", but later I realized that I am totally seeing the same kind of behaviour. During some debian-xen day in Feb 2020, I even had a bit of a closer look at this together with Ian, and we ended up thinking that there's actually some kind of obscure miscalculation bug happening. If you look closely at the numbers in xl info and xl list, then you'll see that the numbers just do not add up. The dom0 gets some kind of fake-down-ballooning which is an accounting error. I can't provide more proof right now, because I have to reproduce the thing in a simplified environment to be able to provide a kind of walk-through scenario with all the output of the numbers. And yes, I have seen oom killers do stuff in customer production environments because of this. 
O_O A member of my team has been busy doing storage migrations where we attach new block devices to domUs and then sync all their data to the new filesystem (moving from ext4 to btrfs and also to new iSCSI storage) and later reboot after a final sync and then swap block devices, etc. From the graphs we've been looking at, combined with when migration stuff is happening, I have come to suspect that the fake dom0 down-ballooning is related to grant mappings, since it seems like the dom0 memory is not decreasing when attaching the new disk, but it is when starting activity using it. To be continued Hans
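For the "log memory stats over time" suggestion in this thread, a minimal sketch could look like the following. It assumes the usual layout of `xl info` (a "free_memory : N" line) and `xl list` (a header row, then Name, ID, Mem, ... columns); the function names and the log file path are made up for illustration:

```shell
# Print the free_memory value from `xl info` output read on stdin.
xl_free_memory() {
    awk -F: '$1 ~ /^free_memory[ \t]*$/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Print the Mem column of the first domain row of `xl list` output on stdin
# (row 1 is assumed to be the header).
xl_domain_mem() {
    awk 'NR == 2 { print $3 }'
}

# Append one timestamped sample to a log file. The default path is an
# arbitrary choice for this sketch.
log_xen_memory() {
    logfile=${1:-/var/log/xen-memory.log}
    printf '%s free=%s dom0=%s\n' \
        "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
        "$(xl info | xl_free_memory)" \
        "$(xl list Domain-0 | xl_domain_mem)" >> "$logfile"
}
```

Running log_xen_memory from cron, or in a `while sleep 60` loop, would give a series that can be lined up against events such as attaching block devices or starting a sync.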
Bug#968501: btrfs-heatmap: Please depend on python3:any or drop python3 dependency
Hi! On 8/16/20 3:36 PM, Elrond wrote: > Package: btrfs-heatmap > Version: 8-1 > Severity: wishlist > User: multiarch-de...@lists.alioth.debian.org > Usertags: multiarch > > Hi, > > btrfs-heatmap currently depends on just python3. > As btrfs-heatmap is Architecture=all, it probably > should depend on python3:any. > Alternatively, the python3 dependency could be dropped, as > the dependency on python3-btrfs will already pull in > an appropriate python3. Yes, you're right. btrfs-heatmap has a hard dependency on python3-btrfs, so I'll drop the python3 dependency for btrfs-heatmap in the next upload. Thanks, Hans
Bug#961511: [PATCH] d/xen-utils-common.xen.init: disable oom killer for xenstored
tag -1 + pending thanks On 9/7/20 12:40 PM, Ian Jackson wrote: > ~Hans van Kranenburg writes ("[PATCH] d/xen-utils-common.xen.init: disable > oom killer for xenstored"): >> In case of oom killer terminating some process, we'd rather not see >> xenstored go. Xenstored has an in-memory database, and when starting the >> process again, it would be empty, which is very inconvenient. Xenstored >> should already score quite low and have a fairly low memory footprint, >> but according to the user report, it happened. >> >> Closes: #961511 >> Suggested-by: Samuel Thibault >> Signed-off-by: Hans van Kranenburg > > Acked-by: Ian Jackson Thanks, added. Hans
Bug#961511: [PATCH] d/xen-utils-common.xen.init: disable oom killer for xenstored
In case of oom killer terminating some process, we'd rather not see xenstored go. Xenstored has an in-memory database, and when starting the process again, it would be empty, which is very inconvenient. Xenstored should already score quite low and have a fairly low memory footprint, but according to the user report, it happened. Closes: #961511 Suggested-by: Samuel Thibault Signed-off-by: Hans van Kranenburg --- Cc: Ian Jackson --- This is in my knorrie/4.14-extra branch now. I think we should do this. --- debian/xen-utils-common.xen.init | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/debian/xen-utils-common.xen.init b/debian/xen-utils-common.xen.init index 54aaba89d320..2a4c09fa3f71 100644 --- a/debian/xen-utils-common.xen.init +++ b/debian/xen-utils-common.xen.init @@ -226,7 +226,8 @@ xenstored_start() eval "try_xenstored=\$$try_xenstored_var" if [ -x $try_xenstored ]; then if start-stop-daemon --start --quiet \ - --pidfile "$XENSTORED_PIDFILE" --exec "$try_xenstored" -- \ + --pidfile "$XENSTORED_PIDFILE" \ + --exec /usr/bin/choom -- -n -1000 "$try_xenstored" -- \ $XENSTORED_ARGS --pid-file "$XENSTORED_PIDFILE"; then started_xenstored=$try_xenstored break -- 2.20.1
Bug#961511: [Pkg-xen-devel] Bug#961511: xen-utils-common: Protect xenstored/xenconsoled against OOM
Hi, On 5/25/20 3:18 PM, Samuel Thibault wrote: > Samuel Thibault, le lun. 25 mai 2020 15:11:44 +0200, a ecrit: >> I'm currently using a hack such as >> >> for i in $(pgrep xenconsoled) ; do >> echo -1000 > /proc/$i/oom_score_adj >> done >> >> in /etc/init.d/xen, but there are cleaner ways to do this :) > > For instance, using choom: > > start-stop-daemon --start --quiet --pidfile "$XENCONSOLED_PIDFILE" > --exec /usr/bin/choom -- \ > -n -1000 "$XENCONSOLED" $XENCONSOLED_ARGS --pid-file > "$XENCONSOLED_PIDFILE" \ That's a nice idea! Especially for xenstored, because it only keeps state in memory. xenconsoled can be started again if it's ever oom killed. So, I'd like to limit this to xenstored only. E.g. in my situation at work, it's mostly openvswitch that gets killed first, if there's really a situation in which something has to go. If I can choose between that (which disrupts vm traffic) or xenconsoled (which does not impact customer stuff directly), then I'd rather see the latter go temporarily. I had to insert another -- before $XENCONSOLED_ARGS to actually make it work. After reboot: -# grep . /proc/$(pidof /usr/lib/xen-4.14/bin/oxenstored)/oom_* /proc/7478/oom_adj:-17 /proc/7478/oom_score:0 /proc/7478/oom_score_adj:-1000 Hans
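The check done by hand above (reading oom_score_adj and expecting -1000, the value that exempts a process from the OOM killer) can be wrapped in a small helper. This is only a sketch; oom_score_adj_of and is_oom_protected are hypothetical names, and the optional second argument exists purely so the functions can be exercised against a fake directory instead of the real /proc:

```shell
# Hypothetical helper: print the oom_score_adj value of a PID.
# $1 = PID, $2 = optional replacement for /proc (for testing only).
oom_score_adj_of() {
    pid=$1
    procroot=${2:-/proc}
    cat "$procroot/$pid/oom_score_adj"
}

# Succeed only if the process has oom_score_adj set to -1000.
is_oom_protected() {
    [ "$(oom_score_adj_of "$@")" = "-1000" ]
}
```

A possible use after a reboot would be `is_oom_protected "$(pidof /usr/lib/xen-4.14/bin/oxenstored)" && echo protected`, assuming a single oxenstored process is running.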
Bug#968965: [Pkg-xen-devel] Bug#968965: xen: FTBFS in sid
Hi Gianfranco, On 8/24/20 7:03 PM, Gianfranco Costamagna wrote: > Source: xen > Version: 4.11.4+24-gddaaccbbab-1 > Severity: serious > > Hello, looks like xen is FTBFS because of some bd-uninstallable python > package and a gcc-10 related build failure. Yes. Thanks for the report. Currently (actually, also today!) Ian Jackson and I are working on this. We want to have Xen 4.14 in Debian unstable, and the two big things that are needed are GCC 10 fixes and getting rid of python 2 usage. So, just to let you know it's known and being worked on. > gcc -m64 -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall > -Wstrict-prototypes -Wdeclaration-after-statement > -Wno-unused-but-set-variable -Wno-unused-local-typedefs -O2 > -fomit-frame-pointer > -D__XEN_INTERFACE_VERSION__=__XEN_LATEST_INTERFACE_VERSION__ -MMD -MF > .tdb.o.d -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -g -O2 > -fdebug-prefix-map=/build/xen-4.11.4+24-gddaaccbbab=. > -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time > -D_FORTIFY_SOURCE=2 -Werror -I. 
-include > /build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/config.h > -I./include > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libs/evtchn/include > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libxc/include > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libs/toollog/include > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libs/foreignmemory/include > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libs/devicemodel/include > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include > -D__XEN_TOOLS__ > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libs/toolcore/include > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include > -DXEN_LIB_STORED="\"/var/lib/xenstored\"" > -DXEN_RUN_STORED="\"/var/run/xenstored\"" > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libs/gnttab/include > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include -c -o > tdb.o tdb.c > gcc -m64 -DBUILD_ID -fno-strict-aliasing -std=gnu99 -Wall > -Wstrict-prototypes -Wdeclaration-after-statement > -Wno-unused-but-set-variable -Wno-unused-local-typedefs -O2 > -fomit-frame-pointer > -D__XEN_INTERFACE_VERSION__=__XEN_LATEST_INTERFACE_VERSION__ -MMD -MF > .talloc.o.d -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -g -O2 > -fdebug-prefix-map=/build/xen-4.11.4+24-gddaaccbbab=. > -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time > -D_FORTIFY_SOURCE=2 -Werror -I. 
-include > /build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/config.h > -I./include > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libs/evtchn/include > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libxc/include > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libs/toollog/include > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libs/foreignmemory/include > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libs/devicemodel/include > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include > -D__XEN_TOOLS__ > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libs/toolcore/include > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include > -DXEN_LIB_STORED="\"/var/lib/xenstored\"" > -DXEN_RUN_STORED="\"/var/run/xenstored\"" > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/libs/gnttab/include > -I/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore/../../tools/include -c -o > talloc.o talloc.c > gcc xs_tdb_dump.o utils.o tdb.o talloc.o-Wl,-z,relro -Wl,-z,now -o > xs_tdb_dump > /usr/bin/ld: utils.o:./tools/xenstore/utils.h:27: multiple definition of > `xprintf'; xs_tdb_dump.o:./tools/xenstore/utils.h:27: first defined here > collect2: error: ld returned 1 exit status > make[6]: *** [Makefile:97: xs_tdb_dump] Error 1 > make[6]: Leaving directory '/build/xen-4.11.4+24-gddaaccbbab/tools/xenstore' > make[5]: *** [/build/xen-4.11.4+24-gddaaccbbab/tools/../tools/Rules.mk:253: > subdir-install-xenstore] Error 2 > make[5]: Leaving directory '/build/xen-4.11.4+24-gddaaccbbab/tools' > make[4]: *** [/build/xen-4.11.4+24-gddaaccbbab/tools/../tools/Rules.mk:248: > 
subdirs-install] Error 2 > make[4]: Leaving directory
Bug#964494: File system corruption with ext3 + kernel-4.19.0-9-amd64
Hi, On Wed, 15 Jul 2020 20:52:40 -0700 Sarah Newman wrote: > On 7/7/20 8:13 PM, Ben Hutchings wrote: > > Control: reassign -1 src:linux > > Control: tag -1 moreinfo > > > > On Tue, 2020-07-07 at 17:30 -0700, Sarah Newman wrote: > >> Package: linux-signed-amd64 > >> Version: 4.19.0-9-amd64 > >> > >> We've had two separate reports now of debian buster users running > >> 4.19.0-9-amd64 who experienced serious file system corruption. > > > > Which version? (I.e. what does "uname -v" or > > "dpkg -s linux-image-4.19.0-9-amd64" say?) > > > >> - Both were using ext3 > >> - Both are running Xen HVM, but I do not have reason to believe this to be > >> related > > [...] I have servers which run 4.19.118-2 as dom0 kernel and a Xen 4.11.4-1 rebuild for Buster. One example is a smallish 6-server cluster that got a reboot cycle 48 days ago. It contains a few heavily loaded domUs with 4.19.118 or 4.19.131 based kernels. No problems or disk corruption or anything is seen yet. dom0 filesystem is ext4, domUs use a mix of ext4 and btrfs (over iscsi). So, no ext3 anywhere. We haven't got bug reports against Debian Xen packages in the BTS about this. I have not yet tried to make an ext3 fs on a block device in a test domU and then have it do things with the fs and reboot it now and then. If wanted, I can do that and see if there's any problem after a week or two. Just to add chaos to help correlating. FWIW, Hans
Bug#965245: [Pkg-xen-devel] Bug#965245: Cross-build issues
Hi Elliott, On 7/18/20 5:53 AM, Elliott Mitchell wrote: > Package: src:xen > Version: 4.13 > Tags: patch > > I've been playing try to get Xen 4.13 to cross-build for ARM. In the > process I've been running into bunches of problems, so here are fixes. Can you: * add a 'why' line to the commit message of the first patch * add Signed-off-by lines * and then mailbomb (git send-email) it to pkg-xen-de...@lists.alioth.debian.org with Cc to Ian Jackson ? Just all of it in 1 mail thread? (So, with 0/10 cover letter which does not have to contain anything else than something like 'Hi! See #965245, kthxbye'.) Then we can collect some Reviewed-by etc. > OCAML/xenstored is being problematic, that looks like outright bugs on > ocaml-nox making it unusable for cross-building. The cxenstored is also still there. The init scripts look if oxenstored is installed, and if not, it falls back to using normal xenstored. So, I suspect if you patch it out of the build for this arch, then no other changes are necessary. (Normally both are built now, so that if a user wants, in case of problems or whatever, they can switch back). > I'm including copies of 3 patches from Julien Grall. Upstream source for > this is: git://xenbits.xen.org/people/julieng/xen-unstable.git The > branch "arm-dma/v2". Ok, these patches are in Xen 4.14 I see. First thing I want to do going forward is forwarding the packaging to that. I hope this will also only make your life easier. Like I said on IRC, the two other things before we can push it to Debian experimental asap are making sure python 2 is not used any more anywhere, and of course a proper debian/changelog. :) And then making noise on the list to find users to try it out. And, a small pile of backlog of things that are waiting, and then hopefully not too long after the official Xen 4.14 release it can go into Debian unstable. But, keep the 3 upstream patches in the set for now, so that it's explicit that you need them for this. 
> Why yes, I am trying to get Xen operational on a Raspberry PI. Why do > you ask? :-) Haha. Exciting. I like it. Looking forward to see it working and help testing it here. I didn't do cross-building yet, so time to learn something new. Hans (Knorrie)
Bug#964793: odd qemu/xen crashes + toolchain rings a bell
However, On 7/13/20 4:19 PM, Hans van Kranenburg wrote: > (Adding more To:; Note that mailing the bug number does not make it end > up at the submitter automatically, only the package maintainer). > > Hi Christian, > > thanks for the hints! > > On Mon, 13 Jul 2020 09:01:18 +0200 Christian Ehrhardt > wrote: >> Hi, >> I was seeing the bug updates flying by and just wanted to mention that we >> have seen something similar in Ubuntu - but back then things weren't >> replicable on Debian so we couldn't contribute things back. >> It seemed to be due to the newer and different-defaults toolchain that we >> had in Ubuntu at the time. >> >> But here qemu/xen crashes + new toolchain come together again which >> reminded me. >> >> So without any promises that it really is related I wanted to FYI you to >> these two fixes we needed for Xen: >> https://git.launchpad.net/ubuntu/+source/xen/tree/debian/patches/1001-strip-note-gnu-property.patch?h=ubuntu/groovy-devel > > I guess this first one would be one needed? "Force fcf-protection off > when using -mindirect-branch". > > In that case want this one, it's not backported to 4.11-stable: > > "x86/build: Unilaterally disable -fcf-protection" > > https://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=3a218961b16f1f4feb1147f56338faf1ac8f5703 However, this is a workaround for a gcc bug that is fixed in: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=a03efb266f This fix is included in gcc-9 in Debian since 9.3.0-12: https://salsa.debian.org/toolchain-team/gcc/-/blob/gcc-9-debian/debian/changelog#L55 (it's the PR target/93654 (x86)) Reporter says the 4.11.4-1 package is used, which is built using gcc 9.3.0-13: https://buildd.debian.org/status/fetch.php?pkg=xen=all=4.11.4-1=1590602099=0 >> https://git.launchpad.net/ubuntu/+source/xen/tree/debian/patches/1000-flags-fcs-protect-none.patch?h=ubuntu/groovy-devel > > This one is about the build failing. 
> >> This would seem more applicable if the new toolchain would have recently >> rebuilt xen and not qemu as in this case. But as an FYI it is still worth a >> ping. > > 小太, can you do... > > xl create -vvv > > ...which should show how qemu is invoked. Can you show that command? > > I can provide you with some test packages with the mentioned upstream > patch applied (on top of 4.11.4+24-gddaaccbbab-1), so you can test if > your domU starts with them. > > If so, we can request the backport upstream and/or maybe pick it for > Debian 4.11 into the patch queue, whatever happens earlier. So, the above info tells us that this probably is not the issue that we're looking at. (I'm fine with still making some test packages for reporter to test with to 100% check this.) Then, let's see what shows up in the xl -vvv output and if there's anything that can be debugged when starting the qemu process with those args? > Thanks, > Hans (Debian Xen Team) >
Bug#964793: odd qemu/xen crashes + toolchain rings a bell
(Adding more To:; Note that mailing the bug number does not make it end up at the submitter automatically, only the package maintainer). Hi Christian, thanks for the hints! On Mon, 13 Jul 2020 09:01:18 +0200 Christian Ehrhardt wrote: > Hi, > I was seeing the bug updates flying by and just wanted to mention that we > have seen something similar in Ubuntu - but back then things weren't > replicable on Debian so we couldn't contribute things back. > It seemed to be due to the newer and different-defaults toolchain that we > had in Ubuntu at the time. > > But here qemu/xen crashes + new toolchain come together again which > reminded me. > > So without any promises that it really is related I wanted to FYI you to > these two fixes we needed for Xen: > https://git.launchpad.net/ubuntu/+source/xen/tree/debian/patches/1001-strip-note-gnu-property.patch?h=ubuntu/groovy-devel I guess this first one would be the one needed? "Force fcf-protection off when using -mindirect-branch". In that case we want this one, which is not backported to 4.11-stable: "x86/build: Unilaterally disable -fcf-protection" https://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=3a218961b16f1f4feb1147f56338faf1ac8f5703 > https://git.launchpad.net/ubuntu/+source/xen/tree/debian/patches/1000-flags-fcs-protect-none.patch?h=ubuntu/groovy-devel This one is about the build failing. > This would seem more applicable if the new toolchain would have recently > rebuilt xen and not qemu as in this case. But as an FYI it is still worth a > ping. 小太, can you do... xl create -vvv ...which should show how qemu is invoked. Can you show that command? I can provide you with some test packages with the mentioned upstream patch applied (on top of 4.11.4+24-gddaaccbbab-1), so you can test if your domU starts with them. If so, we can request the backport upstream and/or maybe pick it for Debian 4.11 into the patch queue, whichever happens earlier. Thanks, Hans (Debian Xen Team)
Bug#964482: buster-pu: xen/4.11.4+24-gddaaccbbab-1~deb10u1
Hi, On 7/8/20 9:35 AM, Moritz Muehlenhoff wrote: > On Tue, Jul 07, 2020 at 10:56:18PM +0200, Hans van Kranenburg wrote: >> Additional To: t...@security.debian.org >> >> Hi Security team, >> >> After our last security update, which was >> 4.11.3+24-g14b62ab3e5-1~deb10u1, we found out that there is a bugfix to >> be done to help users upgrade from Buster to Bullseye. This fix was >> included in the unstable xen 4.11.4-1 upload (it also helps for the >> future from there) and has been in unstable for 41 days now. >> >> I have chosen to not bother you with a new security upload for 4.11.4 to >> Buster at that time (while it included security fixes) because I didn't >> want to skip going through the stable release process because of this >> packaging change. >> >> Now, we're at the verge of a new buster point release. >> >> Can you please read https://bugs.debian.org/964482 and ack that we can >> do a combination of the security updates and this packaging change for >> stable? > > Ack, we can piggyback the fix for 964482 to the buster-security update, > no problem. Ok, clear. In that case it will be a security update with the fix included. I was just trying to be more 'compliant'. :) Upstream Xen testing finished and has all the commits in stable-4.11 now. I did the upload for Debian unstable already, it's processed now. https://packages.debian.org/source/sid/xen So, I changed the changelog to buster-security, and did another build and test run here, all is looking good. https://salsa.debian.org/xen-team/debian-xen/-/commit/0da17d8b443233e521c84886c2fc913ea4ee4480 Since I'm a DM I guess I need a sponsor for the security upload. Can someone from the security team do this? I put everything here, signed and well: https://syrinx.knorrie.org/~knorrie/tmp/xen/ I have another question, which is about timing. 
I asked around a bit a few weeks ago, but did not get any response on this: For the users, who are running some Xen cluster, it's really useful to get Xen and Linux kernel changes at the same time, to reduce the amount of 'reboot stress' we're causing them. Does anyone have a brilliant idea about how to improve this? I mean, if we do this security update now, then next week the new kernel is in the point release... In general, if the kernel team does a security update, or if a point release happens, it would be useful to push out a Xen update as well at the same time... I can of course write some dirty script that polls kernel team git all the time and then emails me with "hola! activity in a -security branch!"... Thanks, Hans
Bug#964482: buster-pu: xen/4.11.4+24-gddaaccbbab-1~deb10u1
On 7/7/20 9:51 PM, Adam D. Barratt wrote: > Control: tags -1 + moreinfo > > On Tue, 2020-07-07 at 21:16 +0200, Hans van Kranenburg wrote: >> I'd like to update the xen packages in buster to >> 4.11.4+24-gddaaccbbab-1~deb10u1 for the 10.5 point release. This is >> an update to keep following the stable-4.11 upstream Xen code, which >> mainly contains security fixes. >> >> https://salsa.debian.org/xen-team/debian-xen/-/blob/10f1a4a8f15b6748459cd1c826d3808694682faf/debian/changelog > > In that case, please attach a source debdiff between the current stable > package and the proposed package (built and tested on stable) to this > request. I can do that. Are you sure you want to read through the upstream changes in a way that collapses everything and removes the context of the original git commits with any useful information about whether it's related to an XSA, or if it's a backport of a critical bug that crashes systems for our stable users or if it's a commit that really needs to be included before the security fix will actually work? I'm trying to run this through the stable release process because there's an (one) actual packaging change involved. If we only had upstream changes, we'd do this as a regular security update. >> I also have 4.11.4+24-gddaaccbbab-1 for unstable ready for upload >> here. >> All of it is right now waiting for the upstream testing at the Xen >> project to finish, which is regression testing the latest additions >> for todays published security advisories ( >> https://xenbits.xen.org/xsa/, >> 2020-07-07). But, I'm already sending the request. > > It's fine to send the request now, but the unstable upload needs to > happen first. That's for sure! Hans
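The source debdiff asked for above is normally produced with debdiff from devscripts, given the old and new .dsc files. A minimal sketch, assuming both source packages are present in the current directory; dsc_name and source_debdiff are hypothetical helpers, and the output file name is an arbitrary choice:

```shell
# Hypothetical helper: build the .dsc file name for a source package and
# version. An epoch ("N:"), if present, is not part of the file name.
dsc_name() {
    src=$1
    version=$2
    printf '%s_%s.dsc\n' "$src" "${version#*:}"
}

# Write the debdiff between two versions of a source package to a file.
# $1 = source package, $2 = old version, $3 = new version.
source_debdiff() {
    debdiff "$(dsc_name "$1" "$2")" "$(dsc_name "$1" "$3")" \
        > "$1_$3.debdiff"
}
```

For this request the call would be along the lines of `source_debdiff xen '4.11.3+24-g14b62ab3e5-1~deb10u1' '4.11.4+24-gddaaccbbab-1~deb10u1'`, using the two versions named in the thread.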
Bug#964482: buster-pu: xen/4.11.4+24-gddaaccbbab-1~deb10u1
Package: release.debian.org Severity: normal Tags: buster User: release.debian@packages.debian.org Usertags: pu Hi, I'd like to update the xen packages in buster to 4.11.4+24-gddaaccbbab-1~deb10u1 for the 10.5 point release. This is an update to keep following the stable-4.11 upstream Xen code, which mainly contains security fixes. https://salsa.debian.org/xen-team/debian-xen/-/blob/10f1a4a8f15b6748459cd1c826d3808694682faf/debian/changelog I also have 4.11.4+24-gddaaccbbab-1 for unstable ready for upload here. All of it is right now waiting for the upstream testing at the Xen project to finish, which is regression testing the latest additions for today's published security advisories (https://xenbits.xen.org/xsa/, 2020-07-07). But, I'm already sending the request. Both unstable and Buster are on Xen 4.11. Currently buster has 4.11.3+24-g14b62ab3e5-1~deb10u1, so in the changelog you can see we'll be syncing it up with unstable again. The 4.11.4-1 package version contained an actual packaging change, which fixes a bug when upgrading to a new Xen version. This is something we want to have in Buster for our users. It fixes upgrading from Buster to Bullseye, and also helps whoever follows Debian unstable now. It's the stuff related to #932759 and these are the changes: Init scripts: https://salsa.debian.org/xen-team/debian-xen/-/commit/420d05e8b5950cb79b03a613f791cad400390bb8 NEWS: https://salsa.debian.org/xen-team/debian-xen/-/commit/10baa2d48db43a5ff675bddf5482717f60fb748a Testing and code review can also be seen in: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=932759#38 So, since 4.11.4-1 is in unstable already, these changes have been out there for weeks now. We have not seen any user report about any regression. Thanks, Hans van Kranenburg
Bug#963607: [Pkg-xen-devel] Bug#963607: xen-hypervisor-4.11-amd64: Xen Hypervisor kernel fails to load arcmsr module with "arcmsr0: dma_alloc_coherent got error" message.
Hi, On 6/25/20 1:44 PM, Alex Sanderson wrote: > > Hi Hans, > > Thank you for your assistance with this. I hesitated to log this with > xen-dev but thought I should wait for a response here first. > > > On 25/06/2020 01:30, Hans van Kranenburg wrote: >> Hi Alex, >> >> On 6/24/20 12:31 PM, Alex Sanderson wrote: >>> Package: xen-hypervisor-4.11-amd64 >>> Version: 4.11.3+24-g14b62ab3e5-1~deb10u1 >>> Severity: important >>> >>> Dear Maintainer, >>> >>> After updating to Buster and Xen 4.11 our machine no longer boots the Xen >>> kernel. The default kernel 4.19.118-2+deb10u1 boots normally. >> When booting with Xen, the computer first starts the Xen hypervisor >> code. This is the part where you see all the lines with (XEN) at the >> beginning appear. >> >> Afterwards, it starts the same 4.19.118-2+deb10u1 Linux kernel that is >> used when running without Xen, but it's started as the first virtual >> machine, that has extra privileges to access all hardware. >> >> So, Linux vs. Xen + Linux. >> >>> The machine has an Areca 1882IX-16 card in it when the arcmsr module >>> tries to load the following error appears. >>> >>> Areca RAID Controller0: Model ARC-1882, F/W V1.56 2019-07-30 >>> arcmsr0: dma_alloc_coherent got error >>> >>> No drives are discovered and the initramfs prompt is shown. >> Ok, so booting the Xen part succeeded, but apparently, when starting the >> Linux kernel inside, there's apparently a problem with accessing the >> raid controller hardware. Interesting. >> >> This likely means it's not a problem in the Debian packaging part, it's >> a problem somewhere in the upstream Xen or Linux code. That means that I >> cannot solve this for you, but I can help with tips to gather the right >> information, and help finding out what the best place is where we can >> report the issue. 
>> >>> The machine: >>> * Supermicro X9DRW >>> * Dual Intel(R) Xeon(R) CPU E5-2630L v2 @ 2.40GHz >>> * 128G RAM >>> * Areca ARC-1882IX-16 (1G onboard cache) >>> >>> Nothing I have tried is effective: >>> * Turning on BIOS above 4G decoding stops the Intel 10GBE ixgbe driver >>> from functioning and doesn't fix the arcmsr >>> * Unloading and reloading the arcmsr module from initramfs prompt >>> * Downgrading the Areca 1882 bios to v1.52 as per >>> http://faq.areca.com.tw/index.php?action=artikel=7=902=en >>> * Kernel parameters >>> ** pci=nocrs >>> ** dom0_mem=8G >>> ** mem=3072M >>> ** mem2048M cma=1024M >>> ** cma=2048 >>> ** cma=3076@512M >>> ** iommu=1 intel_iommu=1 >>> ** arcmsr.host_can_queue=64 as per >>> http://faq.areca.com.tw/index.php?action=artikel=15=387=en >>> >>> I expected the arcmsr module to load and detect disks as it does with >>> the stock kernel. >>> >>> I can provide sysctl and dmesg output if it helps. >> Yes. The first thing needed is full startup logs, and for the Xen part >> preferably extra logging. In /etc/default/grub.d/xen.cfg in the >> GRUB_CMDLINE_XEN_DEFAULT setting, you can add loglvl=all, and then run >> update-grub and try to boot Xen+Linux again. >> >> Do you have a way to capture the logging during boot? Like, a working >> serial console or something similar? >> >> The output of dmesg when starting Linux without Xen is of course also >> interesting, so we can compare both scenarios. >> >> Hans > > I tried using debian's paste https://paste.debian.net but it always > thought it was spam. > > dmesg output Xen Hypervisor 4.11 https://pastebin.com/3wUyYg0P This one shows a Linux kernel boot, not the Xen Hypervisor, which should go first (with all the (XEN) lines). By default the Xen output should show up on your (serial) console. If you do dmesg after starting Linux as dom0 after starting Xen, then you just get the Linux part of it. 
If it actually boots and it's usable to login and get a shell prompt etc, then you can immediately use xl dmesg to see the xen part, and if it doesn't, then you need to make sure you have some sort of serial console to capture the lines. To do a bug report upstream, we'll need that information. > dmesg output Debian Kernel 4.19.118-2+deb10u1 https://pastebin.com/GHzzW3vi K
Bug#963607: [Pkg-xen-devel] Bug#963607: xen-hypervisor-4.11-amd64: Xen Hypervisor kernel fails to load arcmsr module with "arcmsr0: dma_alloc_coherent got error" message.
Hi Alex, On 6/24/20 12:31 PM, Alex Sanderson wrote: > Package: xen-hypervisor-4.11-amd64 > Version: 4.11.3+24-g14b62ab3e5-1~deb10u1 > Severity: important > > Dear Maintainer, > > After updating to Buster and Xen 4.11 our machine no longer boots the Xen > kernel. The default kernel 4.19.118-2+deb10u1 boots normally. When booting with Xen, the computer first starts the Xen hypervisor code. This is the part where you see all the lines with (XEN) at the beginning appear. Afterwards, it starts the same 4.19.118-2+deb10u1 Linux kernel that is used when running without Xen, but it's started as the first virtual machine, that has extra privileges to access all hardware. So, Linux vs. Xen + Linux. > The machine has an Areca 1882IX-16 card in it when the arcmsr module > tries to load the following error appears. > > Areca RAID Controller0: Model ARC-1882, F/W V1.56 2019-07-30 > arcmsr0: dma_alloc_coherent got error > > No drives are discovered and the initramfs prompt is shown. Ok, so booting the Xen part succeeded, but apparently, when starting the Linux kernel inside, there's apparently a problem with accessing the raid controller hardware. Interesting. This likely means it's not a problem in the Debian packaging part, it's a problem somewhere in the upstream Xen or Linux code. That means that I cannot solve this for you, but I can help with tips to gather the right information, and help finding out what the best place is where we can report the issue. 
> The machine: > * Supermicro X9DRW > * Dual Intel(R) Xeon(R) CPU E5-2630L v2 @ 2.40GHz > * 128G RAM > * Areca ARC-1882IX-16 (1G onboard cache) > > Nothing I have tried is effective: > * Turning on BIOS above 4G decoding stops the Intel 10GBE ixgbe driver from > functioning and doesn't fix the arcmsr > * Unloading and reloading the arcmsr module from initramfs prompt > * Downgrading the Areca 1882 bios to v1.52 as per > http://faq.areca.com.tw/index.php?action=artikel=7=902=en > * Kernel parameters > ** pci=nocrs > ** dom0_mem=8G > ** mem=3072M > ** mem2048M cma=1024M > ** cma=2048 > ** cma=3076@512M > ** iommu=1 intel_iommu=1 > ** arcmsr.host_can_queue=64 as per > http://faq.areca.com.tw/index.php?action=artikel=15=387=en > > I expected the arcmsr module to load and detect disks as it does with > the stock kernel. > > I can provide sysctl and dmesg output if it helps. Yes. The first thing needed is full startup logs, and for the Xen part preferably extra logging. In /etc/default/grub.d/xen.cfg in the GRUB_CMDLINE_XEN_DEFAULT setting, you can add loglvl=all, and then run update-grub and try to boot Xen+Linux again. Do you have a way to capture the logging during boot? Like, a working serial console or something similar? The output of dmesg when starting Linux without Xen is of course also interesting, so we can compare both scenarios. Hans
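The logging change suggested above could look like the following fragment. This is only a minimal sketch: it assumes that GRUB_CMDLINE_XEN_DEFAULT has no other options set yet. If it already has a value, loglvl=all should be appended to that value rather than replacing it.

```sh
# /etc/default/grub.d/xen.cfg (fragment)
# Assumption: no other hypervisor options were set here before.
# loglvl=all asks the Xen hypervisor to log messages of all levels.
GRUB_CMDLINE_XEN_DEFAULT="loglvl=all"
```

After editing the file, run update-grub as root and boot Xen+Linux again. The (XEN) lines then show up on the console that Xen writes to, which is why a serial console or similar capture method is useful.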
Bug#962267: [Pkg-xen-devel] Bug#962267: xen: please consider to not install NEWS into runtime library packages
Hi Ansgar,

On 6/5/20 11:57 AM, Ansgar wrote:
> Source: xen
> Version: 4.11.4-1
> Severity: minor
> File: /usr/share/doc/libxenmisc4.11/NEWS.Debian.gz
>
> Please consider to not install debian/NEWS into runtime library
> packages. They get pulled into systems that do not run Xen at all in
> which case the NEWS aren't very helpful (just noise that
> apt-listchanges shows). For my system for example:
>
> +---
> | % aptitude why libxenmisc4.11
> | i   qemu-kvm        Depends qemu-system-x86
> | i A qemu-system-x86 Depends libxenmisc4.11
> +---
>
> Installing NEWS into xen*, but not libxen* probably still reaches all
> relevant users.

Yes, that makes sense.

OTOH, what if there was a really weird problem with libxenmisc4.11 that we would like to pro-actively inform users about? I guess there is only one NEWS per source package?

Hans
Bug#932759: marked as done (After upgrade from stretch to buster, removal of obsolete xen 4.8 packages seems to trigger shutdown of xenconsoled)
Hi, On 5/27/20 7:39 PM, Debian Bug Tracking System wrote: > Your message dated Wed, 27 May 2020 17:36:26 + > with message-id > and subject line Bug#932759: fixed in xen 4.11.4-1 > has caused the Debian Bug report #932759, > regarding After upgrade from stretch to buster, removal of obsolete xen 4.8 > packages seems to trigger shutdown of xenconsoled > to be marked as done. > > This means that you claim that the problem has been dealt with. To avoid confusion, yes, this one closes with the upload of 4.11.4 to unstable which has the fix. However, it's still present in 4.11.3+24-g14b62ab3e5-1~deb10u1 in buster. So, the same fix will also go into buster later, to in the end help users upgrade from buster to bullseye. Hans
Bug#932759: [PATCH 2/2] debian/rules: --no-start for xen dh_installinit
Hi, On 5/26/20 12:44 PM, Ian Jackson wrote: > Hans van Kranenburg writes ("[PATCH 2/2] debian/rules: --no-start for xen > dh_installinit"): >> When debugging the xen-utils postinst/prerm to find the cause of the >> mysteriously disappearing xenconsoled processes, I discovered that the >> xen-utils-common postinst and prerm stop and start the xen init script >> as well! >> >> These commands are not visible in the packaging code, but they are added >> by dh_installdeb into the postinst and prerm during package build time. >> >> We only want to call the script from xen-utils-V, so disable this >> behavior by using --no-start >> >> Closes: #932759 (2/2) >> Signed-off-by: Hans van Kranenburg > > Reviewed-by: Ian Jackson Thanks. > I think it would be wise to look at the generated .debs and see that > they contain (only) the expected pieces in their maintscripts. Yes, I did this while testing by diffing the files installed in /var/lib/dpkg/info with the old ones and verifying that exactly that part went away. 
(for 4.11:)

-$ diff -u ~/xen-utils-common.postinst xen-utils-common.postinst
--- /home/beheer/xen-utils-common.postinst 2020-05-26 13:08:45.738926207 +0200
+++ xen-utils-common.postinst 2020-05-25 14:14:28.0 +0200
@@ -31,13 +31,7 @@
 # Automatically added by dh_installinit/13.1
 if [ "$1" = "configure" ] || [ "$1" = "abort-upgrade" ] || [ "$1" = "abort-deconfigure" ] || [ "$1" = "abort-remove" ] ; then
     if [ -x "/etc/init.d/xen" ]; then
-        update-rc.d xen defaults 20 21 >/dev/null
-        if [ -n "$2" ]; then
-            _dh_action=restart
-        else
-            _dh_action=start
-        fi
-        invoke-rc.d xen $_dh_action || exit 1
+        update-rc.d xen defaults 20 21 >/dev/null || exit 1
     fi
 fi
 # End automatically added section

-$ diff -u ~/xen-utils-common.prerm xen-utils-common.prerm
--- /home/beheer/xen-utils-common.prerm 2020-05-26 13:09:01.570617331 +0200
+++ xen-utils-common.prerm 2020-05-25 14:14:28.0 +0200
@@ -1,10 +1,5 @@
 #!/bin/sh
 set -e
-# Automatically added by dh_installinit/13.1
-if [ -x "/etc/init.d/xen" ] && [ "$1" = remove ]; then
-    invoke-rc.d xen stop || exit 1
-fi
-# End automatically added section
 # Automatically added by dh_installdeb/13.1
 dpkg-maintscript-helper rm_conffile /etc/default/xend 4.11.1-2\~ -- "$@"
 dpkg-maintscript-helper rm_conffile /etc/xen/xend-config.sxp 4.11.1-2\~ -- "$@"

Hans
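The manual check described above (confirming that the generated maintainer scripts no longer start or stop the xen init script) could also be automated. The following is a minimal sketch, not part of the Xen packaging: the helper name calls_xen_init and the two throwaway files are made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper, not part of the Xen packaging: succeeds (exit 0)
# if the maintainer script in $1 still starts or stops the xen init script.
# The trailing whitespace in the pattern keeps "xendomains" from matching.
calls_xen_init() {
    grep -Eq -e 'invoke-rc\.d[[:space:]]+xen[[:space:]]' -- "$1"
}

# Self-contained demonstration on two throwaway files.
tmpdir=$(mktemp -d)
printf '%s\n' '#!/bin/sh' 'invoke-rc.d xen stop || exit 1' > "$tmpdir/old.prerm"
printf '%s\n' '#!/bin/sh' 'set -e' > "$tmpdir/new.prerm"

for f in "$tmpdir/old.prerm" "$tmpdir/new.prerm"; do
    if calls_xen_init "$f"; then
        echo "$(basename "$f"): calls the xen init script"
    else
        echo "$(basename "$f"): leaves the xen init script alone"
    fi
done

rm -rf "$tmpdir"
```

On a real build, the maintainer scripts can be extracted from a .deb with dpkg-deb -e into a directory, and the same check applied to the extracted postinst and prerm.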
Bug#932759: [PATCH 0/2] Bug#932759 Fix misfiring init scripts
This should be enough to finally fix the problem of the mysteriously disappearing xenconsoled process. We have tried to fix this before, but it turned out the fix was incomplete.

The two attached patches...
* revert the previous fix
* prevent the xen-utils-V prerm and postinst from calling the xen init script when V != running version X.
* remove even more additional extra bonus redundant superfluous supererogatory inordinate loquacious start/stop calls from the xen-utils-common maintainer scripts, which were put there by dh_installinit and went unnoticed so far.

Ian, can you give your A-B on this? It will have to go into buster as well, to help users upgrade to Bullseye without these problems.

The test scenario I used (all on current Debian unstable):

[x] Reproduce the problem (disappearing xenconsoled) with current packages
[x] Install fixed 4.11 packages, and check that when upgrading 4.11 to 4.11, the init script stop/start is called
[x] Install fixed 4.13 packages, and check that the init script is not called when installing xen-utils-4.13 and when upgrading xen-utils-common
[x] Reboot into Xen 4.13
[x] Remove xen-utils-4.11 and check that the stop action on the init script is not called.
[x] Install xen-utils-4.11 again and check that the start action is not called.
[x] Reboot into just Linux without Xen
[x] Remove xen-utils-4.11 and check that this works well enough. It's allowed to print some complaints on the screen and behave a little weird, but it should not totally explode.

Now, there's one last edge case I can think of, which is installing xen-utils-V in a domU. In there, the /usr/lib/xen-common/bin/xen-version script will return the Xen version of the host that is carrying this domU and then do a thing. I do not think we actively support doing interesting things inside a domU with these packages however.
Hans van Kranenburg (2):
  xen init/maint scripts: Do nothing if running for wrong Xen package
  debian/rules: --no-start for xen dh_installinit

 debian/rules                       |  2 +-
 debian/xen-utils-V.postinst.vsn-in | 10 +-
 debian/xen-utils-V.prerm.vsn-in    | 10 +-
 debian/xen-utils-common.xen.init   | 27 ---
 4 files changed, 19 insertions(+), 30 deletions(-)

--
2.20.1
Bug#932759: [PATCH 2/2] debian/rules: --no-start for xen dh_installinit
When debugging the xen-utils postinst/prerm to find the cause of the mysteriously disappearing xenconsoled processes, I discovered that the xen-utils-common postinst and prerm stop and start the xen init script as well!

These commands are not visible in the packaging code, but they are added by dh_installdeb into the postinst and prerm during package build time.

We only want to call the script from xen-utils-V, so disable this behavior by using --no-start

Closes: #932759 (2/2)
Signed-off-by: Hans van Kranenburg
---
 debian/rules | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/debian/rules b/debian/rules
index 23c982eb414b..73232ca20efe 100755
--- a/debian/rules
+++ b/debian/rules
@@ -282,7 +282,7 @@ override_dh_python2:

 # We have two init scripts. (There used to be xend too.)
 override_dh_installinit:
-	dh_installinit --name xen -- defaults 20 21
+	dh_installinit --name xen --no-start -- defaults 20 21
 	dh_installinit --name xendomains --no-start -- defaults 21 20

 # dh_strip in dh compat 10 and earlier (which we are at so this
--
2.20.1
Bug#932759: [PATCH 1/2] xen init/maint scripts: Do nothing if running for wrong Xen package
After trying to fix this issue in the init script, we found out that the problem still happened for systems running with systemd. The xen-utils-V postinst and prerm have DPKG_MAINTSCRIPT_PACKAGE in their environment. When calling invoke-rc.d xen under systemd, the whole circus of translation and compatibility layers is used to finally end up running the /etc/init.d/xen script again. However, when ending up there, the DPKG_MAINTSCRIPT_PACKAGE variable is lost. So, instead of trying to fix this in the init script, avoid calling invoke-rc.d altogether, when installing or removing for a different version of Xen than the currently running one. Since we only call this from two places, and the check is a one liner, directly put it into the prerm and postinst. Carefully quote the values on both sides of the comparison. For example, when removing a xen-utils-V package after rebooting into just Linux without Xen, the version retrieval helper will print an error like "ERROR: Can't find hypervisor information in sysfs!", there will be no useful output on stdout and it will compare an empty string with the version of the xen-utils package, resulting in the right action, not trying to stop or start anything. To avoid hitting the disappearing xenconsoled scenario, the fix has to be present in the maintainer scripts of the to be removed *old* xen-utils-V package. This means users will have to first upgrade to a package with this fix before upgrading to a different Xen version. 
Signed-off-by: Hans van Kranenburg
Closes: #932759 (1/2)
Fixes: cc85504103 "xen init script: Do nothing if running for wrong Xen package"
---
 debian/xen-utils-V.postinst.vsn-in | 10 +-
 debian/xen-utils-V.prerm.vsn-in    | 10 +-
 debian/xen-utils-common.xen.init   | 27 ---
 3 files changed, 18 insertions(+), 29 deletions(-)

diff --git a/debian/xen-utils-V.postinst.vsn-in b/debian/xen-utils-V.postinst.vsn-in
index 581327f09ffd..0acebf836bb2 100644
--- a/debian/xen-utils-V.postinst.vsn-in
+++ b/debian/xen-utils-V.postinst.vsn-in
@@ -6,7 +6,15 @@ case "$1" in
     configure)
         update-alternatives --remove xen-default /usr/lib/xen-@version@
         if [ -x "/etc/init.d/xen" ]; then
-            invoke-rc.d xen start || exit $?
+            # Only call the init script when this xen-utils-@version@ package
+            # matches the currently running version of Xen. This means, doing
+            # in-place updates (e.g. a security update for same version).
+            #
+            # When installing a xen-utils package for any other Xen version,
+            # leave the running system alone.
+            if [ "$(/usr/lib/xen-common/bin/xen-version)" = "@version@" ]; then
+                invoke-rc.d xen start || exit $?
+            fi
         fi
         ;;

diff --git a/debian/xen-utils-V.prerm.vsn-in b/debian/xen-utils-V.prerm.vsn-in
index 1aa2cae65fda..f1cb4299c30c 100644
--- a/debian/xen-utils-V.prerm.vsn-in
+++ b/debian/xen-utils-V.prerm.vsn-in
@@ -6,7 +6,15 @@ case "$1" in
     remove|upgrade)
         update-alternatives --remove xen-default /usr/lib/xen-@version@
         if [ -x "/etc/init.d/xen" ]; then
-            invoke-rc.d xen stop || exit $?
+            # Only call the init script when removing or while upgrading for
+            # the currently running version of Xen.
+            #
+            # Otherwise, for example after a Xen version upgrade, autoremoval
+            # of an obsolete xen-utils-V package would inadvertently stop
+            # running daemons like xenconsoled.
+            if [ "$(/usr/lib/xen-common/bin/xen-version)" = "@version@" ]; then
+                invoke-rc.d xen stop || exit $?
+            fi
         fi
         ;;

diff --git a/debian/xen-utils-common.xen.init b/debian/xen-utils-common.xen.init
index f66ce6b8db18..05521733494e 100644
--- a/debian/xen-utils-common.xen.init
+++ b/debian/xen-utils-common.xen.init
@@ -26,33 +26,6 @@ xen)
     ;;
 esac
 VERSION=$(/usr/lib/xen-common/bin/xen-version)
-
-# The arrangements for the `xen' init script are a bit odd.
-# This script is part of xen-utils-common, of which there is one
-# version installed regardless of the Xen version.
-#
-# But it is called by the prerm and postinsts of xen-utils-VERSION.
-# The idea is that (for example) if xen-utils-VERSION is upgraded, the
-# daemons are restarted.
-#
-# However, this means that this script may be called by the
-# maintscript of a xen-utils-V package for a different V to the
-# running version of Xen (X, say). Such a xen-utils-V package does
-# not actually want to start or stop its daemons. Indeed, the version
-# selection machinery would redirect its efforts to the xen-utils-X
-# utilities. But this is not right: we don't actually want to (for
-# example) stop xenconsoled from xen-utils-X just because some
-# not-currently-relevant xen-util
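The version check described in the commit message above can be sketched in isolation. This is an illustration only: the helper name should_call_init is hypothetical, and the version strings are passed in as arguments, whereas the real maintainer scripts obtain the running version from /usr/lib/xen-common/bin/xen-version.

```sh
#!/bin/sh
# Hypothetical helper mirroring the check added in the patch: decide whether
# a xen-utils-V maintainer script should touch the init script at all.
#   $1 = version of the running hypervisor (empty when not booted under Xen)
#   $2 = version of the xen-utils package being installed or removed
# Both sides are quoted, so an empty $1 is a clean mismatch, not an error.
should_call_init() {
    [ "$1" = "$2" ]
}

for running in "4.11" "4.13" ""; do
    if should_call_init "$running" "4.11"; then
        echo "running='$running' package=4.11: call invoke-rc.d"
    else
        echo "running='$running' package=4.11: leave daemons alone"
    fi
done
```

The empty-string case corresponds to removing a xen-utils-V package after rebooting into plain Linux. Without the quotes, the test command would be left with only an operator and one operand, which shells typically report as an error rather than as a simple mismatch.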
Bug#939560: [Pkg-xen-devel] Bug#939560: xen: Various problems in debian/rules
Hi Guillem, On 9/6/19 12:55 PM, Guillem Jover wrote: > [...] > > During the debhelper recommendation thread there was a mail from Ian > pointing out to the xen debian/rules file, I took a look and noticed > the following. :) > > The debian/rules file [...] Thanks for the report! The fixes will be in the new Xen 4.13 packages, which are going to be in experimental very soon and in unstable in a few weeks, hopefully (we need users to upgrade to the last 4.11 package for an upgrade fix regarding #932759). If you would like to review the changes, it's the three commits by Ian, named... - debian/rules: Set DEB_BUILD_MAINT_OPTIONS in shell - debian/rules: Improve comment about hardening options - debian/rules: Drop redundant sequence numbers in dh_installinit ...which you can find at: https://salsa.debian.org/xen-team/debian-xen/-/commits/knorrie/4.13 I'm still finishing up all of that, so can't give commit ids because of the force-pushing going on. Thanks, Hans
Bug#938108: [Pkg-xen-devel] Bug#938108: python-pyxenstore: Python2 removal in sid/bullseye
On 5/9/20 9:57 PM, Moritz Mühlenhoff wrote:
> On Sat, May 09, 2020 at 02:36:24AM +0200, Thomas Goirand wrote:
>> On 5/8/20 9:35 PM, Moritz Mühlenhoff wrote:
>>> On Fri, Aug 30, 2019 at 07:45:40AM +, Matthias Klose wrote:
>>>> Package: src:python-pyxenstore
>>>> Version: 0.0.2-1
>>>> Severity: normal
>>>> Tags: sid bullseye
>>>> User: debian-pyt...@lists.debian.org
>>>> Usertags: py2removal
>>>>
>>>> Python2 becomes end-of-life upstream, and Debian aims to remove
>>>> Python2 from the distribution, as discussed in
>>>> https://lists.debian.org/debian-python/2019/07/msg00080.html
>>>>
>>>> Your package either build-depends, depends on Python2, or uses Python2
>>>> in the autopkg tests. Please stop using Python2, and fix this issue
>>>> by one of the following actions.
>>>
>>> Hi,
>>> python-pyxenstore is dead upstream and there are no reverse deps, let's
>>> remove?
>>>
>>> Cheers,
>>> Moritz
>>
>> By all means, yes, remove this.
>> I believe it is in Debian from when I attempted to package XCP (aka: xen-api,
>> aka xen-server, etc.), and that's long gone from Debian.
>
> Ack, I've just filed an RM bug.

(seeing it happening) Also ACK from me.

A while ago this confused me because I initially thought this was a binary package produced by src:xen, but it was not. At some point (I think it was our latest IRL work together day of the Debian Xen team) I realized that it really was not, and from that POV, I can confirm that it is not used by anything in there.

Thanks,
Hans
Bug#952958: rrdtool crashes after the DLA-2131-1 security update
Hi, I filed 952964 because I failed to find this one first, apparently. I merged it now, please ignore 952964. The problem is that upstream commits around this issue are quite a bit of a mess, with a number of trial and error fixup commits. So, a half broken version of the fix was now included in the Jessie security update. See... https://github.com/oetiker/rrdtool-1.x/commits/master?after=caf8f7e4a06cd36a69142a46326e58296850781d+69%5B%5D=src%5B%5D=rrd_graph.c ...and then the 'a proper fix to...' and a bunch of newer commits, like 'fix character class definition' and more. So, a bit more inspection of the history of that file is necessary to collect the pieces for a proper fix together. I can help testing a new package if you want. Thanks, Hans van Kranenburg
Bug#952964: Security update breaks graph generation: 'range out of order in character class'
Package: rrdtool
Version: 1.4.8-1.2+deb8u1

Hi,

the patch in the Jessie security update that was just released properly breaks creating graphs. The patch contains the following line:

#define FLOAT_STRING "%[+- 0#]?[0-9]*([.][0-9]+)?l[eEfF]"

Now, [+- 0#] is not a valid character class for a regex, because the - defines a range, and a range from '+' to ' ' is not valid.

[RRD ERROR] Unable to graph /var/lib/munin/cgi-tmp/munin-cgi-graph/[...].png : cannot compile regular expression: Error while compiling regular expression ^(?:[^%]+|%%)*%[+- 0#]?[0-9]*([.][0-9]+)?l[eEfF](?:[^%]+|%%)*%s(?:[^%]+|%%)*$ at char 18: range out of order in character class (^(?:[^%]+|%%)*%[+- 0#]?[0-9]*([.][0-9]+)?l[eEfF](?:[^%]+|%%)*%s(?:[^%]+|%%)*$)

Upstream did a fixup commit, 1615689e259bfd67e43cf7711948abc23f998ca9, which is missing from the update:

https://github.com/oetiker/rrdtool-1.x/commit/1615689e259bfd67e43cf7711948abc23f998ca9

Thanks,
Hans van Kranenburg
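The character class problem described above can be reproduced outside rrdtool. The sketch below uses grep -E in the C locale as a stand-in for the regex engine rrdtool uses, so the exact error text differs. The helper name regex_compiles is hypothetical, and the repaired pattern (hyphen moved to the front of the class, where it is taken literally) is an assumption for illustration, not necessarily what the upstream fixup commit does.

```sh
#!/bin/sh
# Hypothetical helper: succeeds if the extended regular expression in $1
# compiles. grep exits with status 2 on a pattern error, 0 or 1 otherwise.
regex_compiles() {
    status=0
    printf '' | LC_ALL=C grep -E -e "$1" >/dev/null 2>&1 || status=$?
    [ "$status" -ne 2 ]
}

# Pattern from the security update: "+- " is read as a range from '+' to ' '.
BROKEN='%[+- 0#]?[0-9]*([.][0-9]+)?l[eEfF]'
# Assumed repair for illustration: a leading hyphen is taken literally.
REPAIRED='%[-+ 0#]?[0-9]*([.][0-9]+)?l[eEfF]'

for pattern in "$BROKEN" "$REPAIRED"; do
    if regex_compiles "$pattern"; then
        echo "compiles: $pattern"
    else
        echo "rejected: $pattern"
    fi
done
```

The range is invalid because '+' has a higher character code than the space character, so the range runs backwards.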
Bug#796095: ftp.debian.org: Please allow uploads for DMs to security-master
Hi, Friendly ping for this issue. Today I ran into the situation that after getting DM status to be able to help with the security updates for a specific package (Xen), I found out I was not able to actually make this happen, since I still need to send the result of my package build to someone else in the Debian Xen team who is a DD to have this person swap the GPG signature and do the upload. Thanks, Hans
Bug#947944: xen: Several CVEs open for xen (CVE-2018-12207 CVE-2019-11135 CVE-2019-18420 CVE-2019-18421 CVE-2019-18422 CVE-2019-18423 CVE-2019-18424 CVE-2019-18425 CVE-2019-19577 CVE-2019-19578 CVE-20
On 1/7/20 11:34 PM, Hans van Kranenburg wrote: > [...] > > Today I have finally been working on this. The result is that I at least > have a new (WIP) version for buster. I'm running it on a dom0 right now > and did smoke testing, live migrate, restarting domUs etc. It just works > (tm). > > This was the easy part, most of the work was assembling the changelog by > copy-pasting things. I cross-checked with your list (below), which is > nice, since we can check that way that the info from different points of > view is the same (except for one entry it is). > > https://salsa.debian.org/xen-team/debian-xen/commits/knorrie/buster-security > > Now the interesting part begins, which is not so much about the stable > security update, but more about what to do with unstable. We currently > still have the same Xen version in unstable and in Buster. > > So, the most logical thing, which I mentioned before would be to have > 4.11.3+24-g14b62ab3e5-1 in unstable and 4.11.3+24-g14b62ab3e5-1~deb10u1 > in stable. Ok, this will just be ok, since I was confused about the python-pyxenstore package, and thought it was a by-product from our src:xen. This is not the case, it's a separate thing. So, false alarm. > [...] That means that the original plan will suffice for now. The whole python2 situation will be resolved when we prepare Xen 4.13 or 4.14, or whichever one will be the Bullseye one. The result: https://salsa.debian.org/xen-team/debian-xen/tree/knorrie/unstable https://salsa.debian.org/xen-team/debian-xen/tree/knorrie/buster-security I just built and tested both of the resulting piles of packages, on buster and on a bullseye dom0. All looks fine, I can live migrate, restart things etc etc... So, next step is getting things uploaded to the right place. Hans
Bug#947944: xen: Several CVEs open for xen (CVE-2018-12207 CVE-2019-11135 CVE-2019-18420 CVE-2019-18421 CVE-2019-18422 CVE-2019-18423 CVE-2019-18424 CVE-2019-18425 CVE-2019-19577 CVE-2019-19578 CVE-20
Hi, Today I have finally been working on this. The result is that I at least have a new (WIP) version for buster. I'm running it on a dom0 right now and did smoke testing, live migrate, restarting domUs etc. It just works (tm). This was the easy part, most of the work was assembling the changelog by copy-pasting things. I cross-checked with your list (below), which is nice, since we can check that way that the info from different points of view is the same (except for one entry it is). https://salsa.debian.org/xen-team/debian-xen/commits/knorrie/buster-security Now the interesting part begins, which is not so much about the stable security update, but more about what to do with unstable. We currently still have the same Xen version in unstable and in Buster. So, the most logical thing, which I mentioned before would be to have 4.11.3+24-g14b62ab3e5-1 in unstable and 4.11.3+24-g14b62ab3e5-1~deb10u1 in stable. However... https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=938843 And on Dec 15, python-pyxenstore REMOVED from testing So, I guess we're not supposed to upload something new to unstable that includes this package again and/or uses python 2. Also, we of course do not like a situation where the package in stable has a newer version number than the one in unstable. Checkmate... We (as in, Debian Xen team, which is Ian and I who are currently active) haven't been working on getting the latest greatest Xen into unstable for Bullseye yet. The most recent Xen release (4.13) includes python3 support which fixes that issue, but getting that in means we have to actively start working on newer packages now. This mostly means reserving a few days to work on it, since it's not a really trivial undertaking. Another ducttape-option is to put the same thing in unstable again, while stripping out python-pyxenstore from the control file, since it's not a required package for the average usecase. 
Still, xen-utils-4.11 contains a bunch of python 2 files, which apparently are still under the radar.

I'm thinking out loud here, and am curious about what you and Ian can come up with.

On 1/2/20 3:57 PM, Salvatore Bonaccorso wrote:
> [...]
>
> There are several CVEs open for xen up to unstable, compiling a list
> from the information from the security-tracker it looks those below.
>
> Any progress in getting those fixed at least for unstable already?
>
> CVE-2018-12207[0]: check, XSA-304
> CVE-2019-11135[1]: check, XSA-305
> CVE-2019-18420[2]: check, XSA-296
> CVE-2019-18421[3]: check, XSA-299
> CVE-2019-18422[4]: check, XSA-303
> CVE-2019-18423[5]: check, XSA-301
> CVE-2019-18424[6]: check, XSA-302
> CVE-2019-18425[7]: check, XSA-298
> CVE-2019-19577[8]: check, XSA-311
> CVE-2019-19578[9]: check, XSA-309
> CVE-2019-19579[10]: check, XSA-306
> CVE-2019-19580[11]: check, XSA-310
> CVE-2019-19581[12]: check, XSA-307
> CVE-2019-19582[13]: check, XSA-307
> CVE-2019-19583[14]: check, XSA-308

In the changelog, I also have a fix for:

XSA-295 CVE-2019-17349 CVE-2019-17350
https://xenbits.xen.org/xsa/advisory-295.html

> If you fix the vulnerabilities please also make sure to include the
> CVE (Common Vulnerabilities & Exposures) ids in your changelog entry.

I also added a commit to put in the CVE numbers in previous changelog entries:

https://salsa.debian.org/xen-team/debian-xen/commit/0ee295f5caf6178f64febeb976d7ea968e44a191

Is this ok/wanted/great/what-you-like? Because, regularly, the numbers are not available yet when we push out the update.

Thanks,
Hans van Kranenburg
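The version ordering concern raised in this mail (the package in stable should not end up with a newer version number than the one in unstable) can be checked with dpkg itself. This is a minimal sketch using the two version strings mentioned in the mail. It skips the comparison when dpkg is not installed.

```sh
#!/bin/sh
# Sketch: check that a stable update version sorts below the unstable one.
# In Debian version ordering, '~' sorts before anything, even the end of
# the string, so "-1~deb10u1" is lower than "-1".
unstable='4.11.3+24-g14b62ab3e5-1'
stable='4.11.3+24-g14b62ab3e5-1~deb10u1'

if ! command -v dpkg >/dev/null 2>&1; then
    echo "dpkg not available, skipping comparison"
elif dpkg --compare-versions "$stable" lt "$unstable"; then
    echo "ok: $stable sorts below $unstable"
else
    echo "problem: $stable does not sort below $unstable"
fi
```

This ordering is why uploading the ~deb10u1 version to stable without a matching upload to unstable would leave stable ahead of whatever older version unstable still carries.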
Bug#821254: systemd[1]: xendomains.service start operation timed out.
Hi, On 1/3/20 5:42 PM, Martin Maney wrote: > > [...] > > Yes, the shutdown hang is a different issue, but I'm going to hope that > the real systemd units mentioned in this bug will fix my problem, too. What you could do already now is try testing those scripts, just shutting down and starting up the domUs, without actually rebooting the machine. By doing so we can learn if we could use them as a drop in replacement or not. The xendomains init script that we have in Debian is: https://salsa.debian.org/xen-team/debian-xen/blob/master/debian/xen-utils-common.xendomains.init The upstream one (which is quite a bit different) is: https://salsa.debian.org/xen-team/debian-xen/blob/master/tools/hotplug/Linux/xendomains.in Or, it seems that last one gets installed in a location for helper scripts and it's just called from both the init.d script and the systemd service: https://salsa.debian.org/xen-team/debian-xen/blob/master/tools/hotplug/Linux/init.d/xendomains.in https://salsa.debian.org/xen-team/debian-xen/blob/master/tools/hotplug/Linux/systemd/xendomains.service.in It would be really helpful if you would want to spend some time on this. Speaking for myself, I either deal with clusters and using live migrate to empty a server before shutting it down, or otherwise I rather have my own way to carefully shut down things before typing a reboot command, combined with a molly-guard script to prevent accidental reboots while something is still running. That way there's still an option to debug/salvage a misbehaving domU before shutdown. Hans
Bug#944612: [Pkg-xen-devel] system still crashes with bullseye and kernel v5.3
Hi Alexander,

On 12/18/19 9:24 PM, Alexander Dahl wrote:
>
> meanwhile I'm running bullseye with kernel v5.3, but the problem
> persists and my Xen system is annoyingly unstable due to this bug. I
> attach some more logs from the last days and add the debian xen devel
> list in Cc. Maybe someone over there has an idea how to fix this. After
> all the log shows plenty of hints it could have something to do with Xen.

I think the xen parts you see in the stack trace listings are usual calls that show that a domU is asking dom0 via the hypervisor to do some disk read/writes or send data over the network (the 'upcall').

https://wiki.xen.org/wiki/Event_Channel_Internals

So, after getting that request, the dom0 Linux kernel tries to execute it, which is e.g. the enqueue function to throw a network packet at the physical network interface.

The first error we see is the "transmit queue 0 timed out". This looks like the Linux kernel is looking at the network port hardware, and expects it to accept the packet, deal with it and put it on the wire. When this does not happen, and the network port hardware seems frozen and times out, it's forcibly reset (I don't know if the thing is resetting itself because it crashed, or if the Linux kernel does something to reset it).

"Reset adapter unexpectedly" gives me the feeling that the firmware inside the network card crashed and something inside there also reset it.

> Anyone care to help debug this? I have no idea where to start. Can
> kernel or xen generate coredumps one could analyze? Or is the log output
> the only thing?
>
> (If you look at the logs, the strange thing is the system does not crash
> and reboot immediately, but later after lots of errors with storage, but
> comes back fine after reboot.)

The ata errors (disk fails to process a command) happen after all of the above happens. Usually disk errors that look like this point at broken disk hardware or bugs in the firmware in the disk. However, if it consistently happens 6 to 7 seconds after the network card disaster, it might be a symptom of the former.

The first thing I would recommend is disabling transmit segmentation offloading to the network card in dom0 (ethtool -K enp1s0 tso off) and see if it prevents the network card from choking on some kind of input. If not, play with more settings like transmit checksum offloading (ethtool -K enp1s0 tx off).

If this does not help, we can start asking some Xen developers if they have an idea how we can help with debugging and what we should do. (I help maintaining the Xen packages in Debian, my knowledge about internals of it is mostly limited to all the been-there-done-thats during the years of using it as a user.)

I expect the problem to be related to Linux and the hardware, and not specifically Xen. Knowing if the same happens when just booting Linux without Xen is valuable debugging info. However, I realize that it's likely a bit complicated to, in that case, try triggering the problem by generating the same workload that's now coming from the domUs.

Curious to hear what happens,

Thanks,
Hans van Kranenburg
Bug#880554: [Pkg-xen-devel] Bug#880554: #880554: max grant frames problem
On 7/18/19 1:30 AM, Hans van Kranenburg wrote:
> Hi,
>
> On 10/23/18 7:34 PM, Ian Jackson wrote:
>> Control: retitle -1 max grant frames problem (domu freeze with
>> linux-image-4.9.0-4-amd64)
>> Control: severity -1 important
>> Control: reassign -1 src:xen 4.8.3+xsa267+shim4.10.1+xsa267-1+deb9u9
>
> my last comment in this bts bug was about:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=29d11cfd8698038b87458ba4d1329b9da81150a5
>
> ..which is in since linux 4.13-rc2, and buster has 4.19+
>
> Is there anyone who would wants to try reproduce the max grant frames
> problem on buster with Xen 4.11 and Linux 4.19 dom0/domU?
>
> The 'xen/grant-table: max_grant_frames reached' should show up on the
> serial console. I'd like to see a test report of it actually happening.

I actually just did this, by putting max_grant_frames = 4 in a domU
config file and starting it (Linux 4.19 domU on Xen 4.11):

Welcome to Debian GNU/Linux 10 (buster)!

[5.499058] systemd[1]: Set hostname to .
[5.552968] xen:grant_table: xen/grant-table: max_grant_frames reached cur=4 extra=1 limit=4 gnttab_free_count=3 req_entries=1
[5.554012] xen:grant_table: xen/grant-table: max_grant_frames reached cur=4 extra=1 limit=4 gnttab_free_count=3 req_entries=1
[5.555858] xen:grant_table: xen/grant-table: max_grant_frames reached cur=4 extra=1 limit=4 gnttab_free_count=3 req_entries=1
[5.556950] xen:grant_table: xen/grant-table: max_grant_frames reached cur=4 extra=1 limit=4 gnttab_free_count=3 req_entries=1
[5.557082] xen:grant_table: xen/grant-table: max_grant_frames reached cur=4 extra=1 limit=4 gnttab_free_count=3 req_entries=1
[5.557295] xen:grant_table: xen/grant-table: max_grant_frames reached cur=4 extra=1 limit=4 gnttab_free_count=3 req_entries=1
[5.557636] xen:grant_table: xen/grant-table: max_grant_frames reached cur=4 extra=1 limit=4 gnttab_free_count=3 req_entries=1
[5.558960] xen:grant_table: xen/grant-table: max_grant_frames reached cur=4 extra=1 limit=4 gnttab_free_count=3 req_entries=1
[5.559800] xen:grant_table: xen/grant-table: max_grant_frames reached cur=4 extra=1 limit=4 gnttab_free_count=3 req_entries=1
[6.014291] gnttab_expand: 159 callbacks suppressed
[6.014296] xen:grant_table: xen/grant-table: max_grant_frames reached cur=4 extra=1 limit=4 gnttab_free_count=0 req_entries=3
[6.014351] xen:grant_table: xen/grant-table: max_grant_frames reached cur=4 extra=1 limit=4 gnttab_free_count=0 req_entries=8
[6.033683] xen:grant_table: xen/grant-table: max_grant_frames reached cur=4 extra=1 limit=4 gnttab_free_count=0 req_entries=3
[6.055013] xen:grant_table: xen/grant-table: max_grant_frames reached cur=4 extra=1 limit=4 gnttab_free_count=0 req_entries=3
[6.055729] xen:grant_table: xen/grant-table: max_grant_frames reached cur=4 extra=1 limit=4 gnttab_free_count=0 req_entries=26
[6.060256] xen:grant_table: xen/grant-table: max_grant_frames reached cur=4 extra=1 limit=4 gnttab_free_count=0 req_entries=3
[6.077000] xen:grant_table: xen/grant-table: max_grant_frames reached cur=4 extra=1 limit=4 gnttab_free_count=0 req_entries=3
[6.109760] xen:grant_table: xen/grant-table: max_grant_frames reached cur=4 extra=1 limit=4 gnttab_free_count=0 req_entries=3
[6.138126] xen:grant_table: xen/grant-table: max_grant_frames reached cur=4 extra=1 limit=4 gnttab_free_count=0 req_entries=3
[6.148626] xen:grant_table: xen/grant-table: max_grant_frames reached cur=4 extra=1 limit=4 gnttab_free_count=0 req_entries=3

Yay. Better info for the users!

Also, there's a patch in review that can improve the situation:

https://lists.xenproject.org/archives/html/xen-devel/2019-11/msg01607.html

The biggest annoyance in our Xen 4.11 now is that the default value of
the gnttab_max_frames hypervisor command line option was raised from 32
to 64 a while ago, but the toolstack overrides this again with its own
default of 32. The patch attempts to fix that.

Hans
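
For anyone who wants to check a saved console log for this, a small sketch along these lines might help. The function names are made up for illustration, and it assumes the log was saved to a plain text file with one kernel message per line, in the format shown above.

```shell
#!/bin/sh
# Sketch: summarise 'max_grant_frames reached' messages in a saved console log.
# Function names are illustrative; the message format is the one quoted above.

# Number of logged occurrences. Because the kernel rate-limits these messages
# ("callbacks suppressed"), this is a lower bound on how often the limit was hit.
count_grant_frame_hits() {
    grep -c 'max_grant_frames reached' "$1"
}

# The distinct limit= values that were reported, one per line.
grant_frame_limits() {
    sed -n 's/.*max_grant_frames reached.* limit=\([0-9][0-9]*\).*/\1/p' "$1" | sort -u
}
```

Usage would be something like count_grant_frame_hits console.log and grant_frame_limits console.log, where console.log is whatever file the serial console output was captured to.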
Bug#924360: xen-hypervisor-4.11-amd64 HVM Boot failure: "ERR: Bootloader shutdown EFI x64 boot services!" - also on stable
Hi all (reporters on 924360, 901599),

On 8/6/19 5:43 PM, Gerald Wodni wrote:
>
> I would like to confirm this bug in stable, as I have exactly the same
> issue (dom0 works/xen hangs/error message) since upgrading from stretch
> to buster.

Thanks for your report(s). Sorry for letting you wait without a reply
for some time.

Unfortunately booting Xen/dom0 with EFI is not something that is very
well tested in Debian. One of the reasons for this is simply that none
of the package maintainers is using EFI. For these kinds of cases, we
rely on users who encounter the problem and who have the
ability/skills/etc to help debug it.

I suspect the problem is caused by some intricacies concerning
interaction between grub, xen, etc. There are some other reports on the
upstream xen-users mailing list about this, but to be honest I have no
idea if those are related. The problems might or might not be specific
to Debian, I don't know.

I'm available to facilitate the process, for example by creating new
packages with a specific patch to test, but unfortunately I don't have
spare hardware and time to try to reproduce the problems myself and dig
deep into it.

Thanks,
Hans van Kranenburg
Bug#932759: [Pkg-xen-devel] Bug#932759: After upgrade from stretch to buster, removal of obsolete xen 4.8 packages seems to trigger shutdown of xenconsoled
On 7/23/19 4:07 PM, niek wrote: > [...] > 2019-07-21 07:38:40 status installed xen-utils-4.8:amd64 > 4.8.5+shim4.10.2+xsa282-1+deb9u11 > 2019-07-21 07:38:40 remove xen-utils-4.8:amd64 > 4.8.5+shim4.10.2+xsa282-1+deb9u11 > 2019-07-21 07:38:40 status half-configured xen-utils-4.8:amd64 > 4.8.5+shim4.10.2+xsa282-1+deb9u11 > 2019-07-21 07:38:41 status half-installed xen-utils-4.8:amd64 > 4.8.5+shim4.10.2+xsa282-1+deb9u11 > 2019-07-21 07:38:41 status config-files xen-utils-4.8:amd64 > 4.8.5+shim4.10.2+xsa282-1+deb9u11 > 2019-07-21 07:38:41 status not-installed xen-utils-4.8:amd64 > 2019-07-21 07:38:41 status installed libxen-4.8:amd64 > 4.8.5+shim4.10.2+xsa282-1+deb9u11 > 2019-07-21 07:38:41 remove libxen-4.8:amd64 > 4.8.5+shim4.10.2+xsa282-1+deb9u11 > 2019-07-21 07:38:41 status half-configured libxen-4.8:amd64 > 4.8.5+shim4.10.2+xsa282-1+deb9u11 > 2019-07-21 07:38:41 status half-installed libxen-4.8:amd64 > 4.8.5+shim4.10.2+xsa282-1+deb9u11 > 2019-07-21 07:38:41 status config-files libxen-4.8:amd64 > 4.8.5+shim4.10.2+xsa282-1+deb9u11 > 2019-07-21 07:38:41 status not-installed libxen-4.8:amd64 > [...] > 2019-07-21 07:38:42 status installed xen-hypervisor-4.8-amd64:amd64 > 4.8.5+shim4.10.2+xsa282-1+deb9u11 > 2019-07-21 07:38:42 remove xen-hypervisor-4.8-amd64:amd64 > 4.8.5+shim4.10.2+xsa282-1+deb9u11 > 2019-07-21 07:38:42 status half-configured > xen-hypervisor-4.8-amd64:amd64 4.8.5+shim4.10.2+xsa282-1+deb9u11 > 2019-07-21 07:38:42 status half-installed xen-hypervisor-4.8-amd64:amd64 > 4.8.5+shim4.10.2+xsa282-1+deb9u11 > 2019-07-21 07:39:41 status config-files xen-hypervisor-4.8-amd64:amd64 > 4.8.5+shim4.10.2+xsa282-1+deb9u11 Ok, so, the most interesting question for me is... 
On line 50 in the init script:

https://salsa.debian.org/xen-team/debian-xen/blob/master/debian/xen-utils-common.xen.init

    case $DPKG_MAINTSCRIPT_PACKAGE in
        xen-utils-$VERSION) ;;       # xen-utils-V maintscript, under Xen X=V
        xen-utils-*)        exit 0;; # xen-utils-V maintscript, but under Xen X!=V
        *)                  ;;       # maybe not under dpkg, etc.
    esac

What is the value of this $DPKG_MAINTSCRIPT_PACKAGE when it happens?
Could it be something else than something beginning with xen-utils-?

I have a suspicion that the "systemd[1]: Reloading." has something to do
with it. Or the triggers?

Anyway, if DPKG_MAINTSCRIPT_PACKAGE gets lost *anywhere* in whatever
happens, it might end up empty, and then it matches just *.

But, we really need to find out how to reproduce it in a test
environment. :|

Hans
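
To make the concern about an empty variable concrete, here is a minimal stand-alone sketch of the same case logic. The function name classify_maintscript and the returned words are made up for illustration; only the three patterns are taken from the init script. It shows that an empty or unrelated value falls through to the last branch, so the script carries on instead of exiting early.

```shell
#!/bin/sh
# Sketch of the case statement above, wrapped in a function for experimenting.
# classify_maintscript, "run", "skip" and "run-fallback" are illustrative names.
classify_maintscript() {
    version="$1"   # running Xen version, e.g. 4.11
    pkg="$2"       # value of DPKG_MAINTSCRIPT_PACKAGE, possibly empty
    case $pkg in
        xen-utils-$version) echo "run" ;;          # maintscript for the running Xen
        xen-utils-*)        echo "skip" ;;         # other Xen version: init script does exit 0
        *)                  echo "run-fallback" ;; # empty or unrelated: init script continues
    esac
}

classify_maintscript 4.11 xen-utils-4.11   # prints "run"
classify_maintscript 4.11 xen-utils-4.8    # prints "skip"
classify_maintscript 4.11 ""               # prints "run-fallback"
```

So if the variable is lost somewhere between dpkg and the init script, the protective exit 0 for the old xen-utils version is never reached.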
Bug#932759: [Pkg-xen-devel] Bug#932759: After upgrade from stretch to buster, removal of obsolete xen 4.8 packages seems to trigger shutdown of xenconsoled
Hi niek, Thanks for the report! On 7/22/19 8:32 PM, niek wrote: > Package: xen-hypervisor-4.11-amd64 > Version: 4.11.1+92-g6c33308a8d-2 > > What happened: > - upgraded Debian Xen Dom0 from stretch to buster and rebooted, as > described in > https://www.debian.org/releases/buster/amd64/release-notes/ch-upgrading.en.html > > - started some Linux pv domu without problems > > - removed obsolete packages with 'apt autoremove'. This removed (among > others) > xen-hypervisor-4.8-amd64:amd64 (4.8.5+shim4.10.2+xsa282-1+deb9u11), > libxen-4.8:amd64 (4.8.5+shim4.10.2+xsa282-1+deb9u11), > xen-utils-4.8:amd64 (4.8.5+shim4.10.2+xsa282-1+deb9u11) > > [...] > - xenconsoled was not running > > - searching system logs revealed that xenconsoled seemed to have stopped > when 'apt autoremove' removed the obsolete xen 4.8 > packages after upgrading to xen 4.11. Well, there it is again. We tried to make a fix, exactly for this... https://salsa.debian.org/xen-team/debian-xen/commit/ef242a700765a971a6afc12d25ee19944dd3a27a ...and apparently there's another scenario in which even this doesn't work? Can you show the lines from /var/log/dpkg.log from that moment, the seconds around 07:38:40? It tells exactly what got removed, in what second, just to confirm? I'm pretty sure I tried to reproduce this after we added the fix I just referenced, and I was unable to. So, I'm very interested in finding out what's still going on here. Usually being able to reproduce a problem is one of the biggest steps towards finding a solution. (since it can be done over and over again, finding out what exactly causes it). So, finding the right sequence of steps to make it happen again is crucial here. Do you think the systemd reload has anything to do with it? Maybe the whole systemd init-script-wrapper-trickery is misbehaving in some way? Can you reproduce this by manually grabbing the xen-hypervisor-4.8-amd64, libxen-4.8 and xen-utils-4.8 from stretch again, installing them and removing them again? 
Do you have any other idea? Thanks, Hans
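
In case it helps with collecting the requested lines: a small sketch for pulling the entries around a given moment out of the dpkg log. The function name dpkg_log_around is made up for illustration, and it assumes the usual log layout where every line starts with a "YYYY-MM-DD HH:MM:SS" timestamp.

```shell
#!/bin/sh
# Sketch: print the dpkg.log lines whose timestamp starts with a given prefix.
# dpkg_log_around is an illustrative name; the log path defaults to the
# standard location and can be overridden with a second argument.
dpkg_log_around() {
    prefix="$1"                       # e.g. "2019-07-21 07:38:4"
    logfile="${2:-/var/log/dpkg.log}"
    # index(...) == 1 keeps only lines that begin with the prefix.
    awk -v p="$prefix" 'index($0, p) == 1' "$logfile"
}
```

For example, dpkg_log_around "2019-07-21 07:38:4" would print everything logged between 07:38:40 and 07:38:49. If the log has already been rotated, the older file can be passed as the second argument.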
Bug#880554: [Pkg-xen-devel] Bug#880554: #880554: max grant frames problem
Hi,

On 10/23/18 7:34 PM, Ian Jackson wrote:
> Control: retitle -1 max grant frames problem (domu freeze with
> linux-image-4.9.0-4-amd64)
> Control: severity -1 important
> Control: reassign -1 src:xen 4.8.3+xsa267+shim4.10.1+xsa267-1+deb9u9

my last comment in this bts bug was about:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=29d11cfd8698038b87458ba4d1329b9da81150a5

..which is in since linux 4.13-rc2, and buster has 4.19+

Is there anyone who wants to try to reproduce the max grant frames
problem on buster with Xen 4.11 and Linux 4.19 dom0/domU?

The 'xen/grant-table: max_grant_frames reached' should show up on the
serial console. I'd like to see a test report of it actually happening.

No further adjustments/fixes will go into the Stretch Xen packages at
this stage.

Having better documentation about how to set hypervisor and guest
options to deal with all of this is still a TODO. I would really like
to get some people together to start cleaning out the whole Xen related
wiki section for Debian, and actually provide some helpful content,
including FAQ stuff like max grants, PVH, PVH+grub etc...

Whoever would want to participate in that, just reply with a Yay! Doing
documentation work might seem boring, but it's write once, read many all
the way.

Hans
Bug#932085: grub-common: Grub can't load initrd for Xen after upgrade to Buster
On 7/14/19 11:43 PM, Colin Watson wrote: > On Sun, Jul 14, 2019 at 01:27:23PM -0700, Slava Kryvel wrote: >> After upgrade from Debian 9.9 to Debian 10 I have got unbootable system. >> >> I'm using Xen hypervisor, which was also upgraded from 4.8 to 4.11 >> during OS upgrade. >> UEFI is enabled. >> >> After upgrade was finished, I was unable to boot again to Xen kernel. >> But normal Debian kernel was still bootable. > > [...] > > I'm CCing a few folks who've contributed to GRUB's Xen support in one > way or another in the recent past; hopefully at least one of them can > help here? Just to be transparent here, not all possible functionality is tested by the package maintainers (currently Ian and me) before throwing a new package into Debian. This is simply not practically feasible for us. [0] We rely on the upstream tests to know that the upstream Xen code will probably work. For Debian specific things, we do test our own use cases, but e.g. UEFI is not one of them. For this, we rely on active users to report problems and help solving them. So, yes, things like this can happen. Thanks for reporting this. Next step would be to follow Rogers instructions, and provide config dumps, serial console output etc... We're certainly available to include changes / etc to fix things, given proper information / testing reports from the user. But, the user has to actively help to make that happen. Hans van Kranenburg (with Debian Xen team hat on) [0] https://alioth-lists.debian.net/pipermail/pkg-xen-devel/2018-October/007438.html
Bug#930797: unblock: xen/4.11.1+92-g6c33308a8d-1
Control: tags -1 - moreinfo Hi Paul, On 6/21/19 10:02 PM, Paul Gevers wrote: > Control: tags -1 moreinfo > > Hi Hans, > > On 20-06-2019 21:14, Hans van Kranenburg wrote: >> * Note that the fixes for XSA-297 will only have effect when also loading >> updated cpu microcode with MD_CLEAR functionality. When using the >> intel-microcode package to include microcode in the dom0 initrd, it >> has to >> be loaded by Xen. Please refer to the hypervisor command line >> documentation about the 'ucode=scan' option. > > I asked this question recently for another unblock report (not by you) > as well, but don't you think this is worth mentioning in NEWS? So that > people that use apt-listchanges are warned about this? Yes, it surely is. I realized the same thing, but only after the upload was done. What do you think about the following (also added as attachment): https://salsa.debian.org/xen-team/debian-xen/commit/ce3646253ebb7d4834a83a8ee813d7bef9b7ffe2 I'm building it now to see if everything ends up in the right place in the resulting packages. Thanks, Hans commit ce3646253ebb7d4834a83a8ee813d7bef9b7ffe2 (HEAD -> knorrie/4.11, origin/knorrie/4.11) Author: Hans van Kranenburg Date: Sat Jun 22 11:45:34 2019 +0200 Update to 4.11.1+92-g6c33308a8d-2 with MDS documentation Following up feedback from the release team, add a NEWS file mentioning the MDS mitigations with some instructions, so that it will be more visible to people using apt-listchanges. Mention the ucode option in our default documented set of "usually used options", so that users doing a new install will get a hint about the existence of this option, and what it does. diff --git a/debian/NEWS b/debian/NEWS new file mode 100644 index 00..e32955a161 --- /dev/null +++ b/debian/NEWS @@ -0,0 +1,20 @@ +xen (4.11.1+92-g6c33308a8d-1) unstable; urgency=high + +This update contains the mitigations for the Microarchitectural Data +Sampling speculative side channel attacks. Only Intel based processors are +affected. 
+ +Note that these fixes will only have effect when also loading updated cpu +microcode with MD_CLEAR functionality. When using the intel-microcode +package to include microcode in the dom0 initrd, it has to be loaded by +Xen. Please refer to the hypervisor command line documentation about the +'ucode=scan' option. + +For the fixes to be fully effective, it is currently also needed to disable +hyper-threading, which can be done in BIOS settings, or by using smt=no on +the hypervisor command line. + +Additional information is available in the upstream Xen security advisory: +https://xenbits.xen.org/xsa/advisory-297.html + + -- Hans van Kranenburg Tue, 18 Jun 2019 09:50:19 +0200 diff --git a/debian/changelog b/debian/changelog index 9c64ee1326..4d2fc62b5b 100644 --- a/debian/changelog +++ b/debian/changelog @@ -1,3 +1,11 @@ +xen (4.11.1+92-g6c33308a8d-2) unstable; urgency=high + + * Mention MDS and the need for updated microcode and disabling +hyper-threading in NEWS. + * Mention the ucode=scan option in the grub.d/xen documentation. + + -- Hans van Kranenburg Sat, 22 Jun 2019 11:15:08 +0200 + xen (4.11.1+92-g6c33308a8d-1) unstable; urgency=high * Update to new upstream version 4.11.1+92-g6c33308a8d, which also diff --git a/debian/tree/xen-hypervisor-common/etc/default/grub.d/xen.cfg b/debian/tree/xen-hypervisor-common/etc/default/grub.d/xen.cfg index e3853c33ca..900c12df5d 100644 --- a/debian/tree/xen-hypervisor-common/etc/default/grub.d/xen.cfg +++ b/debian/tree/xen-hypervisor-common/etc/default/grub.d/xen.cfg @@ -44,6 +44,11 @@ echo "Including Xen overrides from /etc/default/grub.d/xen.cfg" # Do not automatically reboot after an error. This is useful for catching # debug output. # +# ucode=scan (only for x86) +# Scan the multiboot images mentioned in grub configuration for an cpio image +# that contains cpu microcode. This enables loading microcode that is stored +# in the dom0 initrd.img. 
+# # Please also refer to the "Xen Hypervisor Command Line Options" # documentation for the version of Xen you have installed. This # documentation can be found at https://xenbits.xen.org/
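
As an illustration of what the NEWS text in this commit asks for, the two hypervisor options could be set roughly as follows. This is a sketch under assumptions: it assumes the hypervisor command line is taken from GRUB_CMDLINE_XEN_DEFAULT in /etc/default/grub or a file under /etc/default/grub.d/, and that no other Xen options are configured yet. Existing options should be kept and merged, not overwritten.

```shell
# Fragment for /etc/default/grub (or a file in /etc/default/grub.d/).
# These files are read as shell variable assignments.
#
# ucode=scan  let Xen look for cpu microcode in the dom0 initrd (x86 only)
# smt=no      disable hyper-threading from the hypervisor command line
#
# Assumption: no other Xen options are set yet; otherwise append to them.
GRUB_CMDLINE_XEN_DEFAULT="ucode=scan smt=no"
```

After changing this, the grub configuration has to be regenerated (update-grub on Debian) and the machine rebooted before the new hypervisor command line takes effect.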
Bug#930797: unblock: xen/4.11.1+92-g6c33308a8d-1
Package: release.debian.org User: release.debian@packages.debian.org Usertags: unblock Severity: normal Please unblock package src:xen Hi release team, Yesterday we uploaded a security update for Xen. This update also contains the mitigations for Microarchitectural Data Sampling. The upstream source is forwarded from commit 87f51bf366 to commit 6c33308a8d: https://xenbits.xen.org/gitweb/?p=xen.git;a=shortlog;hp=87f51bf366;h=6c33308a8d There are no further packaging changes (except for the changelog, of course): >8 xen (4.11.1+92-g6c33308a8d-1) unstable; urgency=high * Update to new upstream version 4.11.1+92-g6c33308a8d, which also contains the following security fixes: - Fix: grant table transfer issues on large hosts XSA-284 (no CVE yet) (Closes: #929991) - Fix: race with pass-through device hotplug XSA-285 (no CVE yet) (Closes: #929998) - Fix: x86: steal_page violates page_struct access discipline XSA-287 (no CVE yet) (Closes: #930001) - Fix: x86: Inconsistent PV IOMMU discipline XSA-288 (no CVE yet) (Closes: #929994) - Fix: missing preemption in x86 PV page table unvalidation XSA-290 (no CVE yet) (Closes: #929996) - Fix: x86/PV: page type reference counting issue with failed IOMMU update XSA-291 (no CVE yet) (Closes: #929995) - Fix: x86: insufficient TLB flushing when using PCID XSA-292 (no CVE yet) (Closes: #929993) - Fix: x86: PV kernel context switch corruption XSA-293 (no CVE yet) (Closes: #92) - Fix: x86 shadow: Insufficient TLB flushing when using PCID XSA-294 (no CVE yet) (Closes: #929992) - Fix: Microarchitectural Data Sampling speculative side channel XSA-297 CVE-2018-12126 CVE-2018-12127 CVE-2018-12130 CVE-2019-11091 (Closes: #929129) * Note that the fixes for XSA-297 will only have effect when also loading updated cpu microcode with MD_CLEAR functionality. When using the intel-microcode package to include microcode in the dom0 initrd, it has to be loaded by Xen. Please refer to the hypervisor command line documentation about the 'ucode=scan' option. 
* Fixes for XSA-295 "Unlimited Arm Atomics Operations" will be added in the next upload. -- Hans van Kranenburg Tue, 18 Jun 2019 09:50:19 +0200 >8 We prefer to keep releasing from the upstream stable release branches, because: (i) upstream only put bugfixes and security fixes on their stable branches (ii) trying to assemble our own subset of the patches is riskier than taking upstream's collection (iii) the upstream stable release branch has undergone extensive testing, which we cannot repeat in Debian. The binary packages built from src:xen are: libxencall1 libxencall1-dbgsym libxen-dev libxendevicemodel1 libxendevicemodel1-dbgsym libxenevtchn1 libxenevtchn1-dbgsym libxenforeignmemory1 libxenforeignmemory1-dbgsym libxengnttab1 libxengnttab1-dbgsym libxenmisc4.11 libxenmisc4.11-dbgsym libxenstore3.0 libxenstore3.0-dbgsym libxentoolcore1 libxentoolcore1-dbgsym libxentoollog1 libxentoollog1-dbgsym xen-doc xen-hypervisor-4.11-amd64 xen-hypervisor-common xenstore-utils xenstore-utils-dbgsym xen-system-amd64 xen-utils-4.11 xen-utils-4.11-dbgsym xen-utils-common xen-utils-common-dbgsym The source debdiff is attached for sake of completeness. Please unblock. Thanks a lot, Hans van Kranenburg Debian Xen Team debdiff_xen_4.11.1+26-g87f51bf366-3_xen_4.11.1+92-g6c33308a8d-1.txt.gz Description: application/gzip
Bug#929129: [Pkg-xen-devel] Bug#929129: closed by Hans van Kranenburg (Bug#929129: fixed in xen 4.11.1+92-g6c33308a8d-1)
On 6/19/19 4:43 PM, Wiebe Cazemier wrote: > This is an update to the unstable release. What is one running Debian > stable (9), with Xen Hypervisor 4.8, to do? This is not meant as a middle finger to users of stable. All of the bug numbers will be closed twice, also by the 4.8 upload, which also has to mention them. This is confusing, however the automated behaviour after uploading any of them is to close the bug with that report. At least the 4.11 is out now, last thing I heard about 4.8 was that there are issues compiling the current 4.8-stable upstream branch in Stretch, and that's quite an important prerequisite for continuing. :| Ian needs to work on that. I will see if I can manipulate them a bit. All the other ones mentioned in the changelog should also have the info that it's found in current version in stable attached to them, so that the version graph shows both. Hans