Bug#816513: Call to configure_network in initramfs script broken due to set -e
Package: mandos-client
Version: 1.6.9-1

Hi,

The mandos initramfs script <${INITRAMFS}/scripts/init-premount/mandos> runs with set -e (#!/bin/sh -e in the shebang). As a result, it aborts whenever a command returns non-zero and the return value is not checked.

The problem is that this script sources /scripts/functions from the initramfs, and /scripts/functions was not designed to work with set -e. So problems may arise whenever the mandos script calls a function sourced from /scripts/functions.

For example, I have found that calling configure_networking causes the mandos script to abort if the DHCP server does not reply in less than 2 seconds. The mandos initramfs script calls this function to configure the network when mandos=connect is specified on the kernel command line.

Let's take a look at the function configure_networking:

configure_networking()
{
    # [... skipped code for clarity ]
    for ROUNDTTT in 2 3 4 6 9 16 25 36 64 100; do

        # The NIC is to be configured if this file does not exist.
        # Ip-Config tries to create this file and when it succeds
        # creating the file, ipconfig is not run again.
        for x in /run/net-"${DEVICE}".conf /run/net-*.conf ; do
            [ -e "$x" ] && break 2
        done

        case ${IP} in
        none|off)
            # Do nothing
            ;;
        ""|on|any)
            # Bring up device
            ipconfig -t ${ROUNDTTT} "${DEVICE}"
            ;;
        dhcp|bootp|rarp|both)
            ipconfig -t ${ROUNDTTT} -c ${IP} -d "${DEVICE}"
            ;;
        *)
            ipconfig -t ${ROUNDTTT} -d $IP

            # grab device entry from ip option
            NEW_DEVICE=${IP#*:*:*:*:*:*}
            if [ "${NEW_DEVICE}" != "${IP}" ]; then
                NEW_DEVICE=${NEW_DEVICE%%:*}
            else
                # wrong parse, possibly only a partial string
                NEW_DEVICE=
            fi
            if [ -n "${NEW_DEVICE}" ]; then
                DEVICE="${NEW_DEVICE}"
            fi
            ;;
        esac
    done
    # [... skipped code for clarity ]
}

This function calls ipconfig (from klibc-utils) with a different ROUNDTTT value on each iteration.
The problem is that ipconfig returns a non-zero value if it fails to get a DHCP reply before the timeout. This is harmless when configure_networking is called without set -e, because the loop simply retries with the next, longer timeout. With set -e, however, the first ipconfig failure aborts the whole script.

This is part of the trace from the initramfs, obtained by booting the machine with the debug parameter on the kernel command line:

Begin: Running /scripts/init-premount ...
+ run_scripts /scripts/init-premount
+ initdir=/scripts/init-premount
+ [ ! -d /scripts/init-premount ]
+ shift
+ . /scripts/init-premount/ORDER
+ /scripts/init-premount/plymouth
+ [ -e /conf/param.conf ]
+ /scripts/init-premount/mandos
calling: settle
IP-Config: eth1 hardware address 0c:14:3a:1b:af:81 mtu 1500 DHCP RARP
IP-Config: eth0 hardware address 0c:14:2a:1b:af:80 mtu 1500 DHCP RARP
IP-Config: no response after 2 secs - giving up
+ [ -e /conf/param.conf ]
+ [ n != y ]
+ log_end_msg
+ _log_msg done.\n
+ [ n = y ]
+ printf done.\n
done.
+ maybe_break mount
+ [ = mount ]
+ log_begin_msg Mounting root file system
+ _log_msg Begin: Mounting root file system ...
+ [ n = y ]
+ printf Begin: Mounting root file system ...
Begin: Mounting root file system ...
+ . /scripts/local
+ . /scripts/nfs
+ . /scripts/local

As you can see, the script /scripts/init-premount/mandos exits as soon as IP-Config fails on its first attempt to get an IP address, with the 2 second timeout.
A possible fix is the following patch:

--- a/usr/share/initramfs-tools/scripts/init-premount/mandos	2016-03-02 10:41:43.437960673 +0100
+++ b/usr/share/initramfs-tools/scripts/init-premount/mandos	2016-03-02 13:00:27.392153826 +0100
@@ -94,7 +94,7 @@
 # If we are connecting directly, run "configure_networking" (from
 # /scripts/functions); it needs IPOPTS and DEVICE
 if [ "${connect+set}" = set ]; then
-    configure_networking
+    configure_networking || true
     if [ -n "$connect" ]; then
 	cat <<-EOF >>/conf/conf.d/mandos/plugin-runner.conf

There are other possibilities as well, such as disabling set -e in the script. Other functions may cause similar trouble. I have checked all the scripts in my initramfs, and only the mandos and udev ones run with set -e.

signature.asc
Description: OpenPGP digital signature
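The behaviour described in this report can be reproduced outside the initramfs. The following is a minimal sketch, not the mandos script itself: flaky_configure is a hypothetical stand-in for configure_networking whose last command fails the way ipconfig does on a timeout, and run_with_errexit is a helper invented for this illustration.

```shell
#!/bin/sh
# Run a snippet in a child shell started with -e (mirroring "#!/bin/sh -e")
# and report whether execution continued past the snippet.
# flaky_configure is a hypothetical stand-in for configure_networking.
run_with_errexit() {
    sh -e -c "
        flaky_configure() { false; }
        $1
        echo reached
    " 2>/dev/null || echo aborted
}

# Unchecked call: the failing function terminates the -e shell.
run_with_errexit 'flaky_configure'           # prints "aborted"

# Checked call, as in the proposed patch: the failure is ignored.
run_with_errexit 'flaky_configure || true'   # prints "reached"
```

The "|| true" form only silences the exit status of that one call; other functions from /scripts/functions that return non-zero would still need the same treatment.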
Bug#785672: Critical ext4 data corruption bug (maybe is dm-crypt related ?)
Are you using dm-crypt? If so, this may be related to another bug that appeared in 4.0. See:

http://thread.gmane.org/gmane.linux.kernel/1942014

The following issue on Red Hat's tracker is also related:

https://bugzilla.redhat.com/show_bug.cgi?id=1223332

I can confirm that last bug (dm-crypt). I experienced this issue after upgrading to 4.0.2 (lots of libata errors). Luckily I noticed it quickly and downgraded to 3.16, and I did not suffer any data corruption or loss (or at least I have not noticed any so far).

Regards.
Bug#708070: enable x32 support for the amd64 kernels
I was just hit by bug https://bugs.debian.org/736659 after installing gcc-multilib and later rebuilding my initramfs.

I don't think it is sustainable in the long run to have several x32 packages in the archive (which other packages depend on) while the official Debian kernel does not support x32 at all.

Please consider enabling CONFIG_X86_X32 on 3.14 and later.

Thanks
Bug#712062: Please enable X86_INTEL_PSTATE (P state power scaling driver)
Source: linux
Version: 3.9.5-1
Severity: wishlist

Please consider enabling X86_INTEL_PSTATE on 3.9 or later.

This is a new CPU power scaling driver optimized specifically for the latest Intel CPUs (Sandy Bridge and Ivy Bridge).

https://lwn.net/Articles/536017/

Thanks!
Bug#573483: linux-headers-3.9-1-amd64 : Depends: linux-kbuild-3.9 but it is not installable
And again...

$ sudo apt-get install linux-headers-3.9-1-amd64
Reading package lists... Done
Building dependency tree
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 linux-headers-3.9-1-amd64 : Depends: linux-kbuild-3.9 but it is not installable
E: Unable to correct problems, you have held broken packages.

$ rmadison linux-headers-3.9-1-amd64
 linux-headers-3.9-1-amd64 | 3.9.4-1 | sid | amd64, i386

$ rmadison linux-kbuild-3.9

$ rmadison -S linux-tools
 libusbip-dev     | 1.1.1+3.2.17-1  | wheezy            | amd64, armel, armhf, i386, ia64, mips, mipsel, powerpc, s390, s390x, sparc
 libusbip-dev     | 1.1.1+3.2.17-1  | jessie            | amd64, armel, armhf, i386, ia64, mips, mipsel, powerpc, s390, s390x, sparc
 libusbip-dev     | 1.1.1+3.8.11-1  | sid               | amd64, armel, armhf, i386, ia64, mips, mipsel, powerpc, s390, s390x, sparc
 linux-kbuild-3.2 | 3.2.1-2~bpo60+1 | squeeze-backports | amd64, armel, i386, ia64, mips, mipsel, powerpc, s390, sparc
 linux-kbuild-3.2 | 3.2.17-1        | wheezy            | amd64, armel, armhf, i386, ia64, mips, mipsel, powerpc, s390, s390x, sparc
 linux-kbuild-3.2 | 3.2.17-1        | jessie            | amd64, armel, armhf, i386, ia64, mips, mipsel, powerpc, s390, s390x, sparc
 linux-kbuild-3.8 | 3.8.11-1        | sid               | amd64, armel, armhf, i386, ia64, mips, mipsel, powerpc, s390, s390x, sparc
 linux-tools      | 3.2.1-2~bpo60+1 | squeeze-backports | source
 linux-tools      | 3.2.17-1        | wheezy            | source
 linux-tools      | 3.2.17-1        | jessie            | source
 linux-tools      | 3.8.11-1        | sid               | source
 linux-tools-3.2  | 3.2.1-2~bpo60+1 | squeeze-backports | amd64, armel, i386, powerpc, s390, sparc
 linux-tools-3.2  | 3.2.17-1        | wheezy            | amd64, armel, armhf, i386, powerpc, s390, s390x, sparc
 linux-tools-3.2  | 3.2.17-1        | jessie            | amd64, armel, armhf, i386, powerpc, s390, s390x, sparc
 linux-tools-3.8  | 3.8.11-1        | sid               | amd64, armel, armhf, i386, powerpc, s390, s390x, sparc
 usbip            | 1.1.1+3.2.17-1  | wheezy            | amd64, armel, armhf, i386, ia64, mips, mipsel, powerpc, s390, s390x, sparc
 usbip            | 1.1.1+3.2.17-1  | jessie            | amd64, armel, armhf, i386, ia64, mips, mipsel, powerpc, s390, s390x, sparc
 usbip            | 1.1.1+3.8.11-1  | sid               | amd64, armel, armhf, i386, ia64, mips, mipsel, powerpc, s390, s390x, sparc
Bug#625922: RE: Bug#625922: Failures with ST2000DL003-9VT166
On 15/06/11 22:15, Paul Faure wrote:
> I upgraded my raid disks to 4 ST32000644NS (Seagate Constellation 2TB) and I haven't had an issue since.
>
> I have also moved the cheaper ST2000DL003-9VT166 disks to a Windows XP box (in a non raid environment) and haven't seen a problem in days now.
>
> There are plenty of references online now popping up saying that green disks are not designed or supported in a raid environment. Whether or not that's because of a physical issue, or a software issue, I'm not sure.
>
> Paul

Just a guess:

One of the features of green drives is that they park the heads after a few seconds without disk activity [1]. Could this be the root of the problem? Perhaps the HDDs are slow to respond while the heads are parked (which happens much more often than with normal drives), and this causes the issue.

You can disable this by forcing a one-hour timeout for parking the heads with:

hdparm -S 242 /dev/sdX

This needs to be done on every boot. Perhaps an init.d script would help.

You can check SMART attribute 193 of your drives, which tells you how many times the drive has parked its heads during its life. You will see that this number is far greater on such green drives than on normal drives.

smartctl -a /dev/sdX | grep 193

Regards!

[1] http://forums.anandtech.com/showthread.php?t=2085685
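The value 242 is not a number of seconds. As a rough illustration of why it corresponds to one hour, the sketch below decodes an hdparm -S argument following the encoding described in hdparm(8), which documents -S as the standby (spindown) timeout. The function name standby_seconds is made up for this example, and the values 252 to 255 (which hdparm(8) treats specially) are only reported as "special".

```shell
#!/bin/sh
# Decode an "hdparm -S" value into a timeout in seconds, per hdparm(8):
#   0        timeout disabled
#   1-240    multiples of 5 seconds
#   241-251  1 to 11 units of 30 minutes
# standby_seconds is a hypothetical helper, not part of hdparm.
standby_seconds() {
    v=$1
    if [ "$v" -eq 0 ]; then
        echo disabled
    elif [ "$v" -ge 1 ] && [ "$v" -le 240 ]; then
        echo $((v * 5))
    elif [ "$v" -ge 241 ] && [ "$v" -le 251 ]; then
        echo $(((v - 240) * 1800))
    else
        echo special
    fi
}

standby_seconds 242   # prints 3600, i.e. one hour
```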
Bug#625922: Failures with ST2000DL003-9VT166
I found the following blog post that contains some useful tips about this issue:

http://paul.sullivan.za.org/kernel-disables-sata-drive-under-heavy-load-action-0x6-frozen/
Bug#672891: WARNING: at drivers/net/wireless/brcm80211/brcmsmac/main.c:8241 brcms_c_wait_for_tx_completion+0x75/0x7f [brcmsmac]()
Hello,

I am hitting this bug too. It usually shows up after a few minutes of uptime and does not show up again. When it happens, network connectivity seems to stop transferring data, and it resumes after a while without disruption (no need to reconnect).

Here is the relevant information:

# cat /etc/debian_version
6.0.4

# uname -srvmo
Linux 3.2.0-0.bpo.3-686-pae #1 SMP Thu Aug 23 08:21:41 UTC 2012 i686 GNU/Linux

# apt-cache policy linux-image-$(uname -r)
linux-image-3.2.0-0.bpo.3-686-pae:
  Installed: 3.2.23-1~bpo60+2
  Candidate: 3.2.23-1~bpo60+2
  Version table:
 *** 3.2.23-1~bpo60+2 0
        100 http://backports.debian.org/debian-backports/ squeeze-backports/main i386 Packages
        100 /var/lib/dpkg/status

# apt-cache policy firmware-brcm80211
firmware-brcm80211:
  Installed: 0.35~bpo60+1
  Candidate: 0.35~bpo60+1
  Version table:
 *** 0.35~bpo60+1 0
        100 http://backports.debian.org/debian-backports/ squeeze-backports/non-free i386 Packages
        100 /var/lib/dpkg/status
     0.28+squeeze1 0
        500 http://ftp.es.debian.org/debian/ squeeze/non-free i386 Packages

# lspci -vvv | grep BCM4313 -A44
44:00.0 Network controller: Broadcom Corporation BCM4313 802.11b/g/n Wireless LAN Controller (rev 01)
	Subsystem: Hewlett-Packard Company Device 145c
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast TAbort- TAbort- MAbort- SERR- PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 19
	Region 0: Memory at d050 (64-bit, non-prefetchable) [size=16K]
	Capabilities: [40] Power Management version 3
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=2 PME-
	Capabilities: [58] Vendor Specific Information: Len=78 ?
	Capabilities: [48] MSI: Enable- Count=1/1 Maskable- 64bit+
		Address: Data:
	Capabilities: [d0] Express (v1) Endpoint, MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s 4us, L1 unlimited
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd- ExtTag- PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Latency L0 4us, L1 64us
			ClockPM+ Surprise- LLActRep+ BwNot-
		LnkCtl:	ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1, TrErr- Train- SlotClk+ DLActive+ BWMgmt- ABWMgmt-
	Capabilities: [100 v1] Advanced Error Reporting
		UESta:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UEMsk:	DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
		UESvrt:	DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
		CESta:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
		CEMsk:	RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
		AERCap:	First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
	Capabilities: [13c v1] Virtual Channel
		Caps:	LPEVC=0 RefClk=100ns PATEntryBits=1
		Arb:	Fixed- WRR32- WRR64- WRR128-
		Ctrl:	ArbSelect=Fixed
		Status:	InProgress-
		VC0:	Caps:	PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
			Arb:	Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
			Ctrl:	Enable+ ID=0 ArbSelect=Fixed TC/VC=ff
			Status:	NegoPending- InProgress-
	Capabilities: [160 v1] Device Serial Number 00-00-12-ff-ff-c1-ac-81
	Capabilities: [16c v1] Power Budgeting ?
	Kernel driver in use: brcmsmac

# dmesg
[]
[ 6196.254263] ieee80211 phy0: brcms_ops_bss_info_changed: qos enabled: false (implement)
[ 6196.254272] ieee80211 phy0: brcmsmac: brcms_ops_bss_info_changed: disassociated
[ 6196.254278] ieee80211 phy0: brcms_ops_bss_info_changed: arp filtering: enabled false, count 1 (implement)
[ 6196.481463] [ cut here ]
[ 6196.481490] WARNING: at /build/buildd-linux_3.2.23-1~bpo60+2-i386-OorA0s/linux-3.2.23/drivers/net/wireless/brcm80211/brcmsmac/main.c:8241 brcms_c_wait_for_tx_completion+0x75/0x7f [brcmsmac]()
[ 6196.481495] Hardware name: HP ProBook 6550b
[ 6196.481497] Modules linked in: ppdev lp rfcomm
Bug#656899: grub-probe: sending ioctl 1261 to a partition! (Re: confirm ioctl issue on kernel 3.2 from squeeze backports)
On 20/06/12 23:40, Jonathan Nieder wrote:
> Carlos Alberto Lopez Perez wrote:
>> Is the fix for this issue going to be backported to 3.2 via sta...@vger.kernel.org ?
>
> Is it fixed in mainline?

Yes. Commit 6d9359280753d2955f86d6411047516a9431eb51 on linux-next [1]

https://lkml.org/lkml/2012/6/15/150

Regards!

[1] http://git.kernel.org/?p=linux/kernel/git/next/linux-next.git;a=commitdiff;h=6d9359280753d2955f86d6411047516a9431eb51

--
~~~
Carlos Alberto Lopez Perez           http://neutrino.es
Igalia - Free Software Engineering   http://www.igalia.com
~~~
Bug#656899: confirm ioctl issue on kernel 3.2 from squeeze backports
Hello,

I have just installed kernel 3.2 from debian-backports, linux-image-3.2.0-0.bpo.2-686-pae (=3.2.18-1~bpo60+1), on a squeeze system:

$ uname -r
3.2.0-0.bpo.2-686-pae

and I have started receiving messages like these:

$ dmesg|tail
[ 274.910635] grub-probe: sending ioctl 1261 to a partition!
[ 274.910638] grub-probe: sending ioctl 1261 to a partition!
[ 274.910967] grub-probe: sending ioctl 1261 to a partition!
[ 274.910968] grub-probe: sending ioctl 1261 to a partition!
[ 275.294382] grub-probe: sending ioctl 1261 to a partition!
[ 275.294385] grub-probe: sending ioctl 1261 to a partition!
[ 275.460448] grub-probe: sending ioctl 1261 to a partition!
[ 275.460451] grub-probe: sending ioctl 1261 to a partition!
[ 275.460881] grub-probe: sending ioctl 1261 to a partition!
[ 275.460882] grub-probe: sending ioctl 1261 to a partition!

Is the fix for this issue going to be backported to 3.2 via sta...@vger.kernel.org ?

--
~~~
Carlos Alberto Lopez Perez           http://neutrino.es
Igalia - Free Software Engineering   http://www.igalia.com
~~~
Bug#666556: Please enable Intel Sandy-Bridge Integrated MC CONFIG_EDAC_SBRIDGE
On 01/04/12 08:20, Ben Hutchings wrote:
> On Sat, 2012-03-31 at 19:30 +0200, Carlos Alberto Lopez Perez wrote:
>> Source: linux-2.6
>> Version: 3.2.12-1
>> Severity: normal
>>
>> Hello,
>>
>> Please enable the support for the Intel Sandy-Bridge memory controller for EDAC so we can take advantage of ECC ram modules on this platform.
>
> As you may know, EDAC is not required for ECC. It does improve error reporting and helps you to identify faulty modules.

Yes, I discovered this after sending the bug report. Sorry for the confusion.

>> Just set CONFIG_EDAC_SBRIDGE=m in the .config
>
> The driver is marked as experimental and there are several important bug fixes post-3.2. If we enable it, we also need to apply those bug fixes and possibly others.
>
> Ben.

The diff between 3.4-rc1 and 3.2 for this driver is small, so it is probably worth applying these bug fixes and enabling it. I believe that Sandy Bridge based systems are going to be widely used during the lifetime of Debian wheezy.

Regards!

--
~~~
Carlos Alberto Lopez Perez           http://neutrino.es
Igalia - Free Software Engineering   http://www.igalia.com
~~~
Bug#666556: Please enable Intel Sandy-Bridge Integrated MC CONFIG_EDAC_SBRIDGE
Source: linux-2.6
Version: 3.2.12-1
Severity: normal

Hello,

Please enable the support for the Intel Sandy-Bridge memory controller for EDAC so we can take advantage of ECC ram modules on this platform.

Just set CONFIG_EDAC_SBRIDGE=m in the .config

Thanks!

--
~~~
Carlos Alberto Lopez Perez           http://neutrino.es
Igalia - Free Software Engineering   http://www.igalia.com
~~~
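To see whether a given kernel already has an option such as CONFIG_EDAC_SBRIDGE set, the build configuration can be inspected. The following is a small sketch under stated assumptions: config_state is a hypothetical helper written for this example, and the path /boot/config-$(uname -r) is where Debian kernel packages normally install the configuration.

```shell
#!/bin/sh
# Print how a kernel option is set in a kconfig-style file:
# "y" (built in), "m" (module), or "unset" (absent or "is not set").
# config_state is a hypothetical helper, not an existing tool.
config_state() {
    opt=$1
    file=$2
    val=$(sed -n "s/^${opt}=\([ym]\)\$/\1/p" "$file")
    echo "${val:-unset}"
}

# Example use on a running system (path is an assumption, see above):
#   config_state CONFIG_EDAC_SBRIDGE "/boot/config-$(uname -r)"
```

Only tristate values are recognised here; options with string or numeric values would need a broader pattern.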
Bug#659169: [2.6.32] BUG: soft lockup - CPU#7 stuck for 17163091979s! [init:9709]
f9e0 88045d2bdfd8
Feb 8 18:38:04 server-i7_920 kernel: [ 4080.447699] 00015780 00015780 88055c60c7e0 88055c60cad8
Feb 8 18:38:04 server-i7_920 kernel: [ 4080.447703] Call Trace:
Feb 8 18:38:04 server-i7_920 kernel: [ 4080.447710] [81309fec] ? schedule_timeout+0x2e/0xdd
Feb 8 18:38:04 server-i7_920 kernel: [ 4080.447715] [81070704] ? vx_update_load+0x18/0x13c
Feb 8 18:38:04 server-i7_920 kernel: [ 4080.447718] [81309ea5] ? wait_for_common+0xde/0x15b
Feb 8 18:38:04 server-i7_920 kernel: [ 4080.447723] [8104a4f0] ? default_wake_function+0x0/0x9
Feb 8 18:38:04 server-i7_920 kernel: [ 4080.447727] [8103a4af] ? activate_task+0x22/0x28
Feb 8 18:38:04 server-i7_920 kernel: [ 4080.447730] [8104de19] ? do_fork+0x2d5/0x34e
Feb 8 18:38:04 server-i7_920 kernel: [ 4080.447735] [8101657d] ? read_tsc+0xa/0x20
Feb 8 18:38:04 server-i7_920 kernel: [ 4080.447738] [810175f5] ? sys_vfork+0x20/0x22
Feb 8 18:38:04 server-i7_920 kernel: [ 4080.447741] [810377d5] ? ia32_ptregs_common+0x25/0x4c

Notice the odd thing between 17:28:57 and 18:20:04: the [timestamps] seem to go backwards (!!). How can this even be possible?

Also, many of the modules listed under "Modules linked in" (btrfs, ufs, qnx4, ...) were *NOT* expected to be in use.
Here is some useful information about this server (gathered after the reboot):

# lsmod
Module                  Size  Used by
xt_limit                1782  1
xt_tcpudp               2319  9
xt_state                1303  2
iptable_mangle          2817  0
iptable_nat             4299  1
nf_nat                 13388  1 iptable_nat
nf_conntrack_ipv4       9833  5 iptable_nat,nf_nat
nf_conntrack           46551  4 xt_state,iptable_nat,nf_nat,nf_conntrack_ipv4
nf_defrag_ipv4          1139  1 nf_conntrack_ipv4
iptable_filter          2258  1
ip_tables              13915  3 iptable_mangle,iptable_nat,iptable_filter
x_tables               12845  5 xt_limit,xt_tcpudp,xt_state,iptable_nat,ip_tables
ext4                  291462  5
jbd2                   67015  1 ext4
crc16                   1319  1 ext4
dummy                   1584  0
loop                   11975  0
snd_pcm                60503  0
i2c_i801                7830  0
snd_timer              15582  1 snd_pcm
i2c_core               15819  1 i2c_i801
snd                    46526  2 snd_pcm,snd_timer
soundcore               4598  1 snd
snd_page_alloc          6249  1 snd_pcm
pcspkr                  1699  0
ioatdma                34892  24
dca                     3761  1 ioatdma
psmouse                49937  0
evdev                   7352  2
serio_raw               3752  0
processor              29951  8
button                  4650  0
ext3                  109110  25
jbd                    37101  1 ext3
mbcache                 5050  2 ext4,ext3
dm_mod                 54320  91
sd_mod                 29921  3
crc_t10dif              1276  1 sd_mod
uhci_hcd               18537  0
ahci                   32534  2
libata                133776  1 ahci
ehci_hcd               32097  0
scsi_mod              126533  2 sd_mod,libata
usbcore               122514  3 uhci_hcd,ehci_hcd
nls_base                6377  1 usbcore
thermal                11674  0
e1000e                110079  0
thermal_sys            11942  2 processor,thermal

# parted /dev/sda print
Model: ATA ST32000641AS (scsi)
Disk /dev/sda: 2000GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt

Number Start End Size File system Name Flags
1 17.4kB 512MB 512MB ext3 boot
2 512MB 2000GB 2000GB lvm

# pvs
  PV         VG   Fmt  Attr PSize PFree
  /dev/sda2  vg   lvm2 a-   1.82t 1.46t

# uname -r
2.6.32-5-vserver-amd64

Thanks!

--
~~~
Carlos Alberto Lopez Perez           http://neutrino.es
Igalia - Free Software Engineering   http://www.igalia.com
~~~
Bug#659169: Re: Bug#659169: [2.6.32] BUG: soft lockup - CPU#7 stuck for 17163091979s! [init:9709]
On -10/01/37 20:59, Ben Hutchings wrote:
> Version: 2.6.32-40
>
> On Wed, Feb 08, 2012 at 10:13:50PM +0100, Carlos Alberto Lopez Perez wrote:
>> Source: linux-2.6
>> Version: 2.6.32-5
>> Severity: normal
>>
>> Hello,
>>
>> Today one of my servers stopped serving some web sites, and ssh'ing into it was impossible. Ping worked and the ssh connection started, but the shell did not show up even after a long wait.
>>
>> Finally a hard reset was needed to bring it back.
>>
>> After the reboot I found this in kern.log:
>>
>> # tail /var/log/kern.log
>> Feb 1 09:41:00 server-i7_920 kernel: [17383037.287331] EXT4-fs (dm-29): mounted filesystem with ordered data mode
>> Feb 5 09:38:15 server-i7_920 kernel: [17727613.769052] NOHZ: local_softirq_pending 100
>> Feb 7 05:56:07 server-i7_920 kernel: [17886689.887577] e1000e: eth0 NIC Link is Down
>> Feb 7 05:59:05 server-i7_920 kernel: [17886867.166464] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
>> Feb 7 05:59:41 server-i7_920 kernel: [17886903.722727] e1000e: eth0 NIC Link is Down
>> Feb 7 06:00:00 server-i7_920 kernel: [17886922.309159] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None
>> Feb 8 17:28:57 server-i7_920 kernel: [18446744016.876326] BUG: soft lockup - CPU#7 stuck for 17163091979s! [init:9709]
> [...]
>
> This appears to be the bug fixed by 'sched, x86: Avoid unnecessary overflow in sched_clock', included in longterm update 2.6.32.50 and Debian package version 2.6.32-40.
>
> That bug would be triggered once the scheduler clock reached 18014398 seconds, which is a little after the last reasonable time seen in this log.
>
> Ben.

Wow! Really amazing, thanks for the reply. I will upgrade the kernel ASAP.

Regards!

--
~~~
Carlos Alberto Lopez Perez           http://neutrino.es
Igalia - Free Software Engineering   http://www.igalia.com
~~~
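The figure of 18014398 seconds quoted above matches 2^54 nanoseconds. A plausible reading, stated here as an assumption rather than something the thread spells out, is that the pre-fix conversion multiplied the clock by a factor scaled by 2^10, so the 64-bit product overflowed once the result reached 2^64 / 2^10 = 2^54 ns. The arithmetic can be checked with a short sketch (it needs a shell with 64-bit arithmetic):

```shell
#!/bin/sh
# Assumption: the overflow threshold is 2^54 nanoseconds (see the lead-in).
overflow_ns=$((1 << 54))

# Convert to whole seconds and whole days of uptime.
overflow_s=$((overflow_ns / 1000000000))
overflow_days=$((overflow_s / 86400))

echo "$overflow_s seconds, about $overflow_days days"
# prints "18014398 seconds, about 208 days"
```

This is consistent with the log: the last sane timestamp is 17886922 seconds, and the lockup is reported roughly a day and a half later.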
Re: Uploading linux-2.6 (3.2.1-1)
On 16/01/12 01:55, Axel Beckert wrote:
> Hi Ben,
>
> Ben Hutchings wrote:
>> I intend to upload linux-2.6 version 3.2.1-1 to unstable early this week. This is the latest upstream version (3.2) together with the first upstream stable update.
>
> I wonder which kernel is targeted for Wheezy. According to http://www.kroah.com/log/linux/stable-status-01-2012.html neither 3.1 nor 3.2 will be longterm supported kernels while 3.0 will get longterm support by upstream.
>
> IIRC there was once at least the question if 3.0 should be used for Wheezy. But I guess that's no more a question as 3.0 is no more in unstable nor testing.
>
> (Sorry if that question had been answered before, but recently that question came up on IRC when someone posted the above mentioned link.)
>
> Regards, Axel

3.2 will be maintained (at least) by Ubuntu until 2017, since they will be using it for their 12.04 LTS release.

There is a thread discussing the Wheezy kernel here:

http://lists.debian.org/debian-kernel/2012/01/msg00254.html

Regards!

--
~~~
Carlos Alberto Lopez Perez           http://neutrino.es
Igalia - Free Software Engineering   http://www.igalia.com
~~~
Bug#655353: XFS lockups on 2.6.32 [task xfssyncd blocked for more than 120 seconds]
] [8104a705] ? default_wake_function+0x0/0x9
[ 5520.764234] [a0270058] ? xfs_reclaim_inode+0x95/0xe0 [xfs]
[ 5520.764251] [a0270975] ? xfs_inode_ag_walk+0x92/0xef [xfs]
[ 5520.764268] [a026ffc3] ? xfs_reclaim_inode+0x0/0xe0 [xfs]
[ 5520.764285] [a0270a43] ? xfs_inode_ag_iterator+0x71/0xb2 [xfs]
[ 5520.764301] [a026ffc3] ? xfs_reclaim_inode+0x0/0xe0 [xfs]
[ 5520.764318] [a0270bd0] ? xfs_sync_worker+0x26/0x5f [xfs]
[ 5520.764335] [a0270338] ? xfssyncd+0x150/0x1bb [xfs]
[ 5520.764351] [a02701e8] ? xfssyncd+0x0/0x1bb [xfs]
[ 5520.764356] [81065bf5] ? kthread+0x79/0x81
[ 5520.764362] [81011baa] ? child_rip+0xa/0x20
[ 5520.764366] [81065b7c] ? kthread+0x0/0x81
[ 5520.764369] [81011ba0] ? child_rip+0x0/0x20

Server 4 running Debian/Squeeze with linux-image-2.6.32-5-vserver-amd64/2.6.32-39
-
[279117.747342] INFO: task xfssyncd:1985 blocked for more than 120 seconds.
[279117.747401] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[279117.747463] xfssyncd D 0 1985 2 0x
[279117.747466] 88023f06c0b0 0046 00800012 8118a673
[279117.747468] 88023f0a6c00 f9e0 88023cd93fd8 00015780
[279117.747470] 00015780 88023c1547e0 88023c154ad8 000298bc2318
[279117.747473] Call Trace:
[279117.747479] [8118a673] ? generic_make_request+0x299/0x2f9
[279117.747483] [810114ce] ? common_interrupt+0xe/0x13
[279117.747501] [a01b45f4] ? xfs_ioend_wait+0x84/0x9c [xfs]
[279117.747503] [8106599e] ? autoremove_wake_function+0x0/0x2e
[279117.747513] [a0199460] ? xfs_ilock_nowait+0x32/0x92 [xfs]
[279117.747520] [a01bce36] ? xfs_sync_inode_data+0x91/0xa8 [xfs]
[279117.747528] [a01bcf61] ? xfs_inode_ag_walk+0x92/0xef [xfs]
[279117.747535] [a01bcda5] ? xfs_sync_inode_data+0x0/0xa8 [xfs]
[279117.747542] [a01bd02f] ? xfs_inode_ag_iterator+0x71/0xb2 [xfs]
[279117.747549] [a01bcda5] ? xfs_sync_inode_data+0x0/0xa8 [xfs]
[279117.747557] [a01bd2fe] ? xfs_sync_data+0x20/0x42 [xfs]
[279117.747564] [a01bd344] ? xfs_flush_inodes_work+0x24/0x31 [xfs]
[279117.747571] [a01bc924] ? xfssyncd+0x150/0x1bb [xfs]
[279117.747578] [a01bc7d4] ? xfssyncd+0x0/0x1bb [xfs]
[279117.747580] [810656d1] ? kthread+0x79/0x81
[279117.747582] [81011baa] ? child_rip+0xa/0x20
[279117.747584] [81065658] ? kthread+0x0/0x81
[279117.747585] [81011ba0] ? child_rip+0x0/0x20

A bit of googling [1] suggests that commit 17b3847 [2] may have fixed this.

I can also say that some weeks ago I switched one of these servers to a vanilla/vserver 3.1 kernel, and I have not seen this lockup since.
-
[1] http://comments.gmane.org/gmane.comp.file-systems.xfs.general/41907
[2] https://git.kernel.org/linus/17b3847

--
~~~
Carlos Alberto Lopez Perez           http://neutrino.es
Igalia - Free Software Engineering   http://www.igalia.com
~~~
Bug#605090: [RFC] Add a grsec featureset to Debian kernels
Hello,

What is the status of this? It has been a long time since the last update.

I am also interested in having a Debian kernel with the grsec+pax featureset, and I am sure that many sysadmins would appreciate this possibility. There is a huge user base of grsec among hosting companies.

I agree that RBAC may not be interesting for everybody, given that it duplicates some functionality (we already have SELinux and TOMOYO). So if you really feel strongly about removing this feature from the debian-grsec-kernel, it can easily be done just by setting CONFIG_GRKERNSEC_NO_RBAC=y in the .config (there is no need to ask upstream to split the patch).

Anyway, I think RBAC is a nice feature and it doesn't hurt: it is far easier to use than SELinux [1], and we already have the user-space tools to work with it in Debian:

CC'ing Laszlo Boszormenyi (maintainer of linux-patch-grsecurity2, paxctl and gradm2)

I would like to see this move forward, so I volunteer to help with the maintenance of this featureset.

Regards!

[1] http://www.cs.virginia.edu/~jcg8f/SELinux%20grsecurity%20paper.pdf

--
~~~
Carlos Alberto Lopez Perez           http://neutrino.es
Igalia - Free Software Engineering   http://www.igalia.com
~~~
Bug#613321: linux-image-amd64: Please enable 'memtest' option for all linux kernels
On 05/12/11 06:53, Ben Hutchings wrote:
> Since this feature requires almost no extra memory (the code is all discardable after boot) I'm prepared to enable it. However, I will modify it to taint the kernel if any memory fault is detected, on the basis that there are likely to be other undetected faults.

Indeed, good idea. I think you should submit this patch upstream.

Regards!

--
~~~
Carlos Alberto Lopez Perez           http://neutrino.es
Igalia - Free Software Engineering   http://www.igalia.com
~~~

--
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/4ee18871.8020...@igalia.com
Bug#627573: linux-image-2.6.39-1-686-pae: Please include aufs in linux-image-2.6.39
On 06/16/2011 09:57 PM, Jonathan Nieder wrote:
> Hi Carlos,
>
> Carlos Alberto Lopez Perez wrote:
>> The Ubuntu oneiric kernel has a working aufs module
>> git://kernel.ubuntu.com/ubuntu/ubuntu-oneiric.git (tag Ubuntu-2.6.39-3.10)
>
> Yes, it would be nice if the two distros could collaborate on this kind of thing. Fortunately in this case, Debian's linux-2.6 2.6.39-2 already has aufs:
>
>   * aufs: Update for 2.6.39 (Closes: #627837)
>
> Is it not working for you?

My fault, I didn't check that it was already included in Debian's kernel. Sorry.

> Does Andy Whitcroft's patch have additional improvements that Debian should adopt?

Andy Whitcroft's patch is the same as the one Debian (and upstream) has [1], except for the following:

* The Ubuntu developers comment out the line
  WARN_ONCE(cnt > AUFS_PLINK_WARN, "unexpectedly many pseudo links, %d\n", cnt);
  (to suppress benign plink warning messages)

[1] http://sourceforge.net/mailarchive/message.php?msg_id=27661607