Bug#810121: linux: KVM guests randomly get I/O errors on VirtIO based devices

2016-01-06 Thread Jordi Mallach
Source: linux
Version: 3.16.7-ckt11-1+deb8u5
Severity: important

Hi kernel maintainers,

We've been seeing a strange bug in KVM guests hosted on a Debian jessie box
(running 3.16.7-ckt11-1+deb8u5).

Basically, we are getting random VirtIO errors inside our guests, resulting in
messages like this:

[4735406.568235] blk_update_request: I/O error, dev vda, sector 142339584
[4735406.572008] EXT4-fs warning (device dm-0): ext4_end_bio:317: I/O error -5 writing to inode 1184437 (offset 0 size 208896 starting block 17729472)
[4735406.572008] Buffer I/O error on device dm-0, logical block 17729472
[ ... ]
[4735406.572008] Buffer I/O error on device dm-0, logical block 17729481
[4735406.643486] blk_update_request: I/O error, dev vda, sector 142356480
[ ... ]
[4735406.748456] blk_update_request: I/O error, dev vda, sector 38587480
[4735411.020309] Buffer I/O error on dev dm-0, logical block 12640808, lost sync page write
[4735411.055184] Aborting journal on device dm-0-8.
[4735411.056148] Buffer I/O error on dev dm-0, logical block 12615680, lost sync page write
[4735411.057626] JBD2: Error -5 detected when updating journal superblock for dm-0-8.
[4735411.057936] Buffer I/O error on dev dm-0, logical block 0, lost sync page write
[4735411.057946] EXT4-fs error (device dm-0): ext4_journal_check_start:56: Detected aborted journal
[4735411.057948] EXT4-fs (dm-0): Remounting filesystem read-only
[4735411.057949] EXT4-fs (dm-0): previous I/O error to superblock detected

(From an Ubuntu 15.04 guest, EXT4 on LVM2)

Or,

Jan 06 03:39:11 titanium kernel: end_request: I/O error, dev vda, sector 1592467904
Jan 06 03:39:11 titanium kernel: EXT4-fs warning (device vda3): ext4_end_bio:317: I/O error -5 writing to inode 31169653 (offset 0 size 0 starting block 199058492)
Jan 06 03:39:11 titanium kernel: Buffer I/O error on device vda3, logical block 198899256
[...]
Jan 06 03:39:12 titanium kernel: Aborting journal on device vda3-8.
Jan 06 03:39:12 titanium kernel: Buffer I/O error on device vda3, logical block 99647488

(From a Debian jessie guest, EXT4 directly on a VirtIO-based block device)

When this happens, it affects multiple guests on the host at the same time.
Normally the errors are severe enough that the guests end up with a read-only
file system, but we've seen a few guests survive with a non-fatal I/O error.
The host's dmesg has nothing interesting to see.
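
For anyone trying to correlate these bursts across guests, here is a minimal
Python sketch that pulls the device and sector out of the two message variants
quoted above and counts errors per device. The function names (io_errors,
errors_per_device) and the sample text are made up for illustration; only the
message formats come from the logs in this report.

```python
import re
from collections import Counter

# Matches both message variants quoted in this report:
#   "blk_update_request: I/O error, dev vda, sector 142339584"
#   "end_request: I/O error, dev vda, sector 1592467904"
# \s+ after "sector" tolerates a mail client wrapping the line there.
IO_ERROR = re.compile(
    r"(?:blk_update_request|end_request): I/O error, "
    r"dev (?P<dev>\S+), sector\s+(?P<sector>\d+)"
)


def io_errors(log_text):
    """Return a list of (device, sector) tuples found in kernel log text."""
    return [(m.group("dev"), int(m.group("sector")))
            for m in IO_ERROR.finditer(log_text)]


def errors_per_device(log_text):
    """Count block-layer I/O error lines per device."""
    return Counter(dev for dev, _sector in io_errors(log_text))


# Illustrative sample assembled from lines in this report.
sample = """\
[4735406.568235] blk_update_request: I/O error, dev vda, sector 142339584
[4735406.643486] blk_update_request: I/O error, dev vda, sector 142356480
Jan 06 03:39:11 titanium kernel: end_request: I/O error, dev vda, sector
1592467904
"""

print(errors_per_device(sample))
```

Feeding it the output of dmesg or journalctl -k from each guest and comparing
the timestamps by hand would be one way to check whether the bursts line up.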

We've seen this happen with quite heterogeneous guests:

- Debian 6, 7 and 8 (Debian kernels 2.6.32, 3.2 and 3.16)
- Ubuntu 14.09 and 15.04 (Ubuntu kernels)
- 32-bit and 64-bit installs.

In short, we haven't seen a clear common characteristic among the guests, other
than the affected ones being those with some sustained I/O load (build
machines, cgit servers, PostgreSQL RDBMSs...). Most of the time, guests that
just sit there doing nothing with their disks are not affected.

The host is a stock Debian jessie install that manages libvirt-based QEMU
guests. All the guests use virtio drivers for their block devices. Some of
them sit on spinning media behind an LSI RAID controller (previously a 3ware
card, which we replaced because we were very suspicious of it, but we are
getting the same results), and some of them on PCIe SSD storage. We have 3
other hosts with a similar setup, except they run Debian wheezy (and honestly
we're not too keen on upgrading them yet, just in case); none of them has ever
shown this kind of problem.

We've been seeing this since last summer, and haven't found a pattern that
tells us where these I/O error bugs are coming from. Google isn't revealing
other people with a similar problem, and we're finding that quite surprising as
our setup is quite basic.

Thanks,
Jordi



Bug#661860: Fixed in stable trees 3.2 and 3.8

2013-05-18 Thread Jordi Mallach
Some months ago I was bitten by this bug, found out the upstream driver
seemed to work better (with problems), etc.

I then got distracted, until today, as I need this wireless card to
work.

With much joy, I can report that the current kernel version in unstable,
based on 3.8.12, works as expected. I've seen that Ben also merged this patch
for 3.2; however, it was released with 3.2.42, and wheezy still has
3.2.41. I hope this will end up reaching wheezy via a point release in
the future.

Thanks Ben!
Jordi
-- 
Jordi Mallach Pérez  --  Debian developer http://www.debian.org/
jo...@sindominio.net jo...@debian.org http://www.sindominio.net/
GnuPG public key information available at http://oskuro.net/




Re: alsa-source

2011-05-20 Thread Jordi Mallach
On Wed, May 18, 2011 at 08:11:59PM +0100, Ben Hutchings wrote:
> > from my point of view we should orphan alsa-source and cancel it
> > from alsa-driver.
> I'd be very happy with this.

Yeah, let's totally do this.

Elimar, was Ben's patch upstreamed? I guess it's still a good idea to
upstream it, even if we're not going to use it anymore.

Jordi


-- 
To UNSUBSCRIBE, email to debian-kernel-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20110520182039.gb8...@aigua.oskuro.net



Bug#602520: linux-2.6: [l10n:ca] Catalan update

2010-11-05 Thread Jordi Mallach
Package: linux-2.6
Severity: wishlist
Tags: l10n

Hi kernel team!

A review of the recently submitted file unveiled a few minor problems with
the new Catalan strings.

Can you update it again with the attached file?

Many thanks!

Jordi

-- System Information:
Debian Release: squeeze/sid
  APT prefers testing
  APT policy: (500, 'testing'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.32-5-amd64 (SMP w/2 CPU cores)
Locale: lang=ca_es.ut...@valencia, lc_ctype=ca_es.ut...@valencia (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash


ca.po.gz
Description: GNU Zip compressed data


Bug#601146: linux-2.6: [l10n:ca] New Catalan translation of Debconf templates

2010-10-23 Thread Jordi Mallach
Package: linux-2.6
Version: 2.6.32-26
Severity: wishlist
Tags: l10n

Attached.


-- System Information:
Debian Release: squeeze/sid
  APT prefers testing
  APT policy: (500, 'testing'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.32-5-amd64 (SMP w/2 CPU cores)
Locale: lang=ca_es.ut...@valencia, lc_ctype=ca_es.ut...@valencia (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash


ca.po.gz
Description: GNU Zip compressed data


Bug#542275: rtc-generic not working on Apple PowerBook G4

2010-01-12 Thread Jordi Mallach
On Tue, Dec 29, 2009 at 04:38:51PM +0100, Moritz Muehlenhoff wrote:
> > -   "hctosys: unable to read the hardware clock\n");
> > +   "hctosys: unable to read the hardware clock (%d)\n",
> > err);
> Jordi, did you test the patch? Do current unstable kernels work for you?

Hi Moritz,

I'm sorry, I haven't been able to, and I'm afraid I won't be able to for some
time, or at all:

http://oskuro.net/blog/life/broken-powerbook-g4-2009-10-30-22-07

Sorry about this. I say close the bug and if I come back as a PPC user, I'll
test 2.6.32-trunk to see what happens. I don't think that will ever happen
though. :(




Bug#542275: rtc-generic not working on Apple PowerBook G4

2009-08-18 Thread Jordi Mallach
Package: linux-image-2.6.30-1-powerpc
Version: 2.6.30-6
Severity: important

When I boot my G4-based laptop, the hardware clock can't be accessed and the
system time ends up being the epoch for the platform, Jan 1st, 1904.

Switching back to 2.6.26 fixes the problem. Although CONFIG_RTC_DRV_GENERIC
was compiled in in July, it seems the switch from rtc-ppc to rtc-generic
doesn't work, at least on my system. Another Debian PPC user reports the
same breakage in his G3.

2.6.26 from lenny:
[7.800992] platform ppc-rtc.0: rtc core: registered ppc_md as rtc0
[7.864990] platform ppc-rtc.0: setting system clock to 2009-08-18 18:37:53 UTC (1250620673)

Latest 2.6.30 from unstable:
[6.473202] rtc-generic rtc-generic: rtc core: registered rtc-generic as rtc0
[6.533547] rtc-generic rtc-generic: hctosys: unable to read the hardware clock
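
As a sanity check on the time values in this report, the following Python
sketch computes the offset between the 1904 platform epoch mentioned above and
the Unix epoch, and decodes the timestamp printed in the 2.6.26 log. The helper
name looks_unset and its one-day slack are assumptions for illustration, not
part of any kernel or Debian tool.

```python
from datetime import datetime, timezone

MAC_EPOCH = datetime(1904, 1, 1, tzinfo=timezone.utc)
UNIX_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

# Seconds between the 1904 platform epoch and the Unix epoch.
EPOCH_OFFSET = int((UNIX_EPOCH - MAC_EPOCH).total_seconds())


def looks_unset(unix_seconds, slack_days=1):
    """True if a Unix time falls within slack_days of 1904-01-01 (UTC).

    Hypothetical heuristic: a clock this close to the platform epoch was
    most likely never set from the hardware clock.
    """
    return abs(unix_seconds + EPOCH_OFFSET) < slack_days * 86400


# Timestamp taken from the 2.6.26 "setting system clock" line.
good_boot = datetime.fromtimestamp(1250620673, tz=timezone.utc)

print(EPOCH_OFFSET)
print(good_boot.isoformat())
print(looks_unset(1250620673), looks_unset(-EPOCH_OFFSET + 30))
```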

Simon Raven, from #debianppc, has a working kernel which after config
inspection seems to differ only in the CONFIG_RTC_INTF_DEV_UIE_EMUL=y setting,
which is not enabled in Debian.
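
A config comparison of this kind can be sketched in Python as follows. The two
config fragments are made-up stand-ins, not the real Debian or working configs,
and parse_config / config_diff are hypothetical helper names.

```python
def parse_config(text):
    """Map CONFIG_* names to values; '# CONFIG_X is not set' maps to None."""
    suffix = " is not set"
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
        elif line.startswith("# CONFIG_") and line.endswith(suffix):
            options[line[2:-len(suffix)]] = None
    return options


def config_diff(a, b):
    """Return {name: (value_in_a, value_in_b)} for options that differ.

    An option missing from one side is treated like 'not set' (None).
    """
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


# Illustrative fragments only (assumed values, not copied from real configs).
debian_cfg = """\
CONFIG_RTC_DRV_GENERIC=y
# CONFIG_RTC_INTF_DEV_UIE_EMUL is not set
"""
working_cfg = """\
CONFIG_RTC_DRV_GENERIC=y
CONFIG_RTC_INTF_DEV_UIE_EMUL=y
"""

print(config_diff(parse_config(debian_cfg), parse_config(working_cfg)))
```

With real files, the text would come from the /boot/config-* file of each
kernel being compared.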

-- Package-specific info:

-- System Information:
Debian Release: squeeze/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (1, 'experimental')
Architecture: powerpc (ppc)

Kernel: Linux 2.6.26-1-powerpc
Locale: LANG=ca_ES.UTF-8, LC_CTYPE=ca_ES.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages linux-image-2.6.30-1-powerpc depends on:
ii  debconf [debconf-2.0] 1.5.27 Debian configuration management sy
ii  initramfs-tools [linux-initra 0.93.4 tools for generating an initramfs
ii  libc6 2.9-24 GNU C Library: Shared libraries
ii  module-init-tools 3.9-2  tools for managing Linux kernel mo

linux-image-2.6.30-1-powerpc recommends no packages.

Versions of packages linux-image-2.6.30-1-powerpc suggests:
pn  linux-doc-2.6.30  none (no description available)
pn  mkvmlinuz none (no description available)

-- debconf information:
  linux-image-2.6.30-1-powerpc/postinst/old-dir-initrd-link-2.6.30-1-powerpc: true
  linux-image-2.6.30-1-powerpc/prerm/removing-running-kernel-2.6.30-1-powerpc: true
  linux-image-2.6.30-1-powerpc/postinst/create-kimage-link-2.6.30-1-powerpc: true
  linux-image-2.6.30-1-powerpc/postinst/kimage-is-a-directory:
  linux-image-2.6.30-1-powerpc/preinst/overwriting-modules-2.6.30-1-powerpc: true
  linux-image-2.6.30-1-powerpc/preinst/initrd-2.6.30-1-powerpc:
  linux-image-2.6.30-1-powerpc/preinst/lilo-initrd-2.6.30-1-powerpc: true
  linux-image-2.6.30-1-powerpc/postinst/old-system-map-link-2.6.30-1-powerpc: true
  shared/kernel-image/really-run-bootloader: true
  linux-image-2.6.30-1-powerpc/preinst/elilo-initrd-2.6.30-1-powerpc: true
  linux-image-2.6.30-1-powerpc/preinst/abort-install-2.6.30-1-powerpc:
  linux-image-2.6.30-1-powerpc/prerm/would-invalidate-boot-loader-2.6.30-1-powerpc: true
  linux-image-2.6.30-1-powerpc/postinst/old-initrd-link-2.6.30-1-powerpc: true
  linux-image-2.6.30-1-powerpc/preinst/bootloader-initrd-2.6.30-1-powerpc: true
  linux-image-2.6.30-1-powerpc/postinst/depmod-error-2.6.30-1-powerpc: false
  linux-image-2.6.30-1-powerpc/postinst/bootloader-error-2.6.30-1-powerpc:
  linux-image-2.6.30-1-powerpc/preinst/abort-overwrite-2.6.30-1-powerpc:
  linux-image-2.6.30-1-powerpc/preinst/failed-to-move-modules-2.6.30-1-powerpc:
  linux-image-2.6.30-1-powerpc/preinst/lilo-has-ramdisk:
  linux-image-2.6.30-1-powerpc/postinst/depmod-error-initrd-2.6.30-1-powerpc: false
  linux-image-2.6.30-1-powerpc/postinst/bootloader-test-error-2.6.30-1-powerpc:






Bug#518231: linux-image-2.6.28-1-powerpc: Fails to configure due to a debconf error

2009-03-04 Thread Jordi Mallach
Package: linux-image-2.6.28-1-powerpc
Version: 2.6.28-1
Severity: serious
Tags: patch

Hi,

I tried installing linux-image-2.6.28-1-powerpc from experimental but it
failed to configure:

run-parts: /etc/kernel/postinst.d/mkvmlinuz exited with return code 20

I added set -x to this script, which revealed:

+ _db_cmd GET mkvmlinuz/bootloaders
+ IFS=  printf %s\n GET mkvmlinuz/bootloaders
+ IFS= read -r _db_internal_line
+ RET=20 Unsupported command update-initramfs: (full line was update-initramfs: Generating /boot/initrd.img-2.6.28-1-powerpc) received from confmodule.
+ return 20

My system has the not-so-usual feature of using GRUB2 instead of yaboot for
booting, and I suspect I'm probably the only one doing this on powerpc.

It turned out that yaboot was installed, and the mkvmlinuz/bootloaders debconf
variable was set to yaboot.

This bug seems a reincarnation of this Ubuntu bug:
https://bugs.launchpad.net/ubuntu/+source/kernel-package/+bug/54346

The following patch to postinst fixed it for me, although I'm not sure it's
the right fix.

--- /tmp/linux-image-2.6.28-1-powerpc.postinst  2009-03-04 23:04:09.0 +0100
+++ /var/lib/dpkg/info/linux-image-2.6.28-1-powerpc.postinst    2009-03-04 22:48:38.0 +0100
@@ -983,7 +983,7 @@
     my $initrd_path = $realimageloc . "initrd.img-$version";
     my $ret = system("$ramdisk_cmd " .
                      ($mkimage ? "-m '$mkimage' " : "") .
-                     "-c -t -k $version");
+                     "-c -t -k $version >&2");
     if ($ret) {
       warn "$ramdisk_cmd failed to create initrd image.\n";
     } else {
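
The trace above shows debconf reading a line of update-initramfs output as if
it were a protocol command. One general way to avoid that kind of clash is to
send a child command's stdout to stderr. A minimal Python sketch, assuming a
POSIX /bin/sh; run_off_stdout is a hypothetical helper and the echo command
merely stands in for update-initramfs:

```python
import subprocess


def run_off_stdout(command):
    """Run a shell command with its stdout redirected to stderr.

    Both streams are captured here only so the effect can be inspected;
    a real maintainer script would simply let stderr pass through.
    """
    return subprocess.run(
        f"( {command} ) >&2",   # subshell so pipelines are redirected as a whole
        shell=True,
        capture_output=True,
        text=True,
    )


# Stand-in for the chatty child process (not the real update-initramfs).
result = run_off_stdout("echo 'update-initramfs: Generating initrd'")

print(repr(result.stdout))   # what a protocol reader on stdout would see
print(repr(result.stderr))   # where the progress message ended up
```

The stdout channel stays empty, so whatever is parsing it never sees the
progress message.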

-- Package-specific info:
** Version:
Linux version 2.6.28-1-powerpc (Debian 2.6.28-1) (m...@debian.org) (gcc version 4.3.3 (Debian 4.3.3-4) ) #1 Mon Feb 23 23:26:08 UTC 2009

** Command line:
BOOT_IMAGE=/boot/vmlinux-2.6.28-1-powerpc root=/dev/hda3 ro quiet

** Not tainted

** Kernel log:
[7.880625] usb usb4: Product: EHCI Host Controller
[7.880628] usb usb4: Manufacturer: Linux 2.6.28-1-powerpc ehci_hcd
[7.880631] usb usb4: SerialNumber: 0001:10:1b.2
[7.887940] ide-gd driver 1.18
[7.887983] hda: max request size: 512KiB
[7.906049] firewire_ohci 0002:24:0e.0: enabling device ( - 0002)
[7.937116] hda: 156301488 sectors (80026 MB), CHS=16383/255/63
[7.937191] hda: cache flushes supported
[7.937300]  hda: [mac] hda1 hda2 hda3 hda4 hda5 hda6
[7.976185] firewire_ohci: Added fw-ohci device 0002:24:0e.0, OHCI version 1.10
[7.976839] sungem.c:v0.98 8/24/03 David S. Miller (da...@redhat.com)
[8.044551] PHY ID: 1410cc1, addr: 0
[8.045269] eth0: Sun GEM (PCI) 10/100/1000BaseT Ethernet 00:0d:93:3b:80:c4
[8.045273] eth0: Found Marvell 88E PHY
[8.464592] udev: renamed network interface eth0 to lan
[8.476343] firewire_core: created device fw0: GUID 000d93fffe3b80c4, S800
[9.179749] kjournald starting.  Commit interval 5 seconds
[9.179772] EXT3-fs: mounted filesystem with ordered data mode.
[   11.350275] udevd version 125 started
[   11.897085] Linux agpgart interface v0.103
[   11.930156] agpgart-uninorth :00:0b.0: Apple UniNorth 2 chipset
[   11.930280] agpgart-uninorth :00:0b.0: configuring for size idx: 8
[   11.930369] agpgart-uninorth :00:0b.0: AGP aperture is 32M @ 0x0
[   11.990716] yenta_cardbus 0001:10:13.0: CardBus bridge found [:]
[   11.990735] yenta_cardbus 0001:10:13.0: Enabling burst memory read transactions
[   11.990741] yenta_cardbus 0001:10:13.0: Using CSCINT to route CSC interrupts to PCI
[   11.990744] yenta_cardbus 0001:10:13.0: Routing CardBus interrupts to PCI
[   11.990750] yenta_cardbus 0001:10:13.0: TI: mfunc 0x1002, devctl 0x60
[   12.092216] yenta_cardbus 0001:10:13.0: ISA IRQ mask 0x, PCI irq 53
[   12.09] yenta_cardbus 0001:10:13.0: Socket status: 3007
[   12.092233] yenta_cardbus 0001:10:13.0: pcmcia: parent PCI bridge I/O window: 0x0 - 0x7f
[   12.092239] yenta_cardbus 0001:10:13.0: pcmcia: parent PCI bridge Memory window: 0xf300 - 0xf3ff
[   12.092244] yenta_cardbus 0001:10:13.0: pcmcia: parent PCI bridge Memory window: 0x8000 - 0xafff
[   12.352270] cfg80211: Using static regulatory domain info
[   12.352277] cfg80211: Regulatory domain: US
[   12.352280]  (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp)
[   12.352284]  (2402000 KHz - 2472000 KHz @ 4 KHz), (600 mBi, 2700 mBm)
[   12.352288]  (517 KHz - 519 KHz @ 4 KHz), (600 mBi, 2300 mBm)
[   12.352292]  (519 KHz - 521 KHz @ 4 KHz), (600 mBi, 2300 mBm)
[   12.352296]  (521 KHz - 523 KHz @ 4 KHz), (600 mBi, 2300 mBm)
[   12.352300]  (523 KHz - 533 KHz @ 4 KHz), (600 mBi, 2300 mBm)
[   12.352304]  (5735000 KHz - 5835000 KHz @ 4 KHz), (600 mBi, 3000 mBm)
[   12.352309] cfg80211: Calling CRDA for country: US
[   13.697952] b43-phy0: Broadcom 4306 WLAN found
[   13.767791] phy0: Selected rate control algorithm 'pid'
[   13.933498] snd-aoa-fabric-layout: found 

Bug#518231: linux-image-2.6.28-1-powerpc: Fails to configure due to a debconf error

2009-03-04 Thread Jordi Mallach
On Wed, Mar 04, 2009 at 11:34:08PM +0100, maximilian attems wrote:
> > run-parts: /etc/kernel/postinst.d/mkvmlinuz exited with return code 20
> what's this stupid script
> please post it.


#!/bin/sh

set -e

. /usr/share/debconf/confmodule

db_get mkvmlinuz/bootloaders
bootloader="$RET"

if [ "$bootloader" = "mkvmlinuz" ]; then
    /usr/sbin/mkvmlinuz "$1" "$2"
fi
