Bug#1051577: iproute2: obsolete conffiles

2023-09-13 Thread Ian Campbell
On Tue, 2023-09-12 at 23:13 +0200, Luca Boccassi wrote:
> On Mon, 11 Sept 2023 at 15:57, Daniel Gröber wrote:
> > 
> > Hi Luca,
> > 
> > On Mon, Sep 11, 2023 at 01:06:06PM +0100, Luca Boccassi wrote:
> > > > I want to question whether removing these conffiles is a good idea at
> > > > all. I'm probably one of the few people that actually muck around in
> > > > there but it seems like this is going to break things for any users
> > > > that do.
> > > 
> > > As far as I understand dpkg's conffile machinery should recognize if
> > > you changed anything, and leave it in place. Upstream moved the
> > > default ones to /usr, so we just follow what they do.
> > 
> > Right. Think of an admin having to adjust these config files though:
> > previously they could just `editor /etc/iproute2/rt_tables` and get on with
> > things. Now anyone needing to do that will have to do a double take, figure
> > out why /etc/iproute2 is missing, realize that it's at /usr/lib/iproute2
> > now, copy that over and finally edit.
> > 
> > Is that friction really warranted to cater to a specialized niche use-case?
> > 
> > Please consider overriding upstream's decision here.
> 
> Yes, it is warranted, both because it's exactly the correct behaviour
> for a package, and also because we are certainly not spending time and
> resources to go against upstream choices, especially when they are the
> right choices.

What is the plan for handling updates? AIUI we've lost the dpkg
conffile handling but it doesn't look like it's been replaced by
anything (e.g. like using ucf to prompt when an update happened
perhaps?).

Ian.
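
The lookup order implied in this thread (an admin's copy under /etc/iproute2 taking precedence over the shipped default under /usr/lib/iproute2) can be sketched roughly as below. This is only an illustration in Python, not iproute2's actual code; the function name `resolve_config` and the exact search order are assumptions drawn from the discussion.

```python
from pathlib import Path
from typing import Optional, Sequence

# Assumed search order: the admin's override directory first, then the
# packaged defaults. Both paths are the ones named in the thread.
DEFAULT_DIRS = ("/etc/iproute2", "/usr/lib/iproute2")


def resolve_config(name: str, dirs: Sequence[str] = DEFAULT_DIRS) -> Optional[Path]:
    """Return the first existing copy of `name`, or None if there is none."""
    for directory in dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Prints the path that would be used for rt_tables on this machine.
    print(resolve_config("rt_tables"))
```

With a layout like this, an admin who has never edited anything has no file in /etc at all, which is the friction Daniel describes, and nothing prompts them when the default in /usr changes, which is the gap Ian asks about.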



Re: increase kernel.pid_max?

2020-03-01 Thread Ian Campbell
On Sun, 2020-03-01 at 16:53 +, Ben Hutchings wrote:
> On Sun, 2020-03-01 at 09:24 +0800, Ian Campbell wrote:
> > I can't actually find when/where this changed in recent history -- I've
> > only noticed these large pids in recent months, but it's entirely
> > possible I'm simply not that observant.
> 
> This was a change in systemd 243:

Thanks, that's one of the few places I didn't think to look!

Ian.



Re: increase kernel.pid_max?

2020-02-29 Thread Ian Campbell
On Sat, 2020-02-29 at 20:14 +0100, Harald Dunkel wrote:
> Hi folks,
> 
> looking at the number of cores and highly parallel applications
> I wonder if it would be reasonable to increase the default
> kernel.pid_max (currently 0x8000) to, let's say, 0x3f?
> 
> Do you expect negative side effects?

It's already 0x40 on my machine and everything is fine. I haven't
(knowingly) tweaked anything, so I guess this is the default on recent
machines. This laptop is running 5.4.0-3-amd64 (testing userspace,
although not updated for a little bit).

I can't actually find when/where this changed in recent history -- I've
only noticed these large pids in recent months, but it's entirely
possible I'm simply not that observant.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=72680a191b934377430032f93af15ef50aafb3a8
was in 2010 and a 2.6.x kernel.

Ian.
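
For anyone wanting to check the value on their own machine, a minimal sketch follows. It assumes a Linux system exposing the setting at /proc/sys/kernel/pid_max as a decimal number; the helper names are made up for illustration.

```python
from pathlib import Path

# Standard procfs location of the sysctl on Linux.
PID_MAX_PATH = Path("/proc/sys/kernel/pid_max")


def parse_pid_max(text: str) -> int:
    """Parse the decimal value the kernel exposes for kernel.pid_max."""
    return int(text.strip())


def read_pid_max(path: Path = PID_MAX_PATH) -> int:
    """Read and parse the current limit from procfs."""
    return parse_pid_max(path.read_text())


if __name__ == "__main__":
    value = read_pid_max()
    # Show both decimal and hex, since the thread quotes hex values.
    print(f"kernel.pid_max = {value} ({value:#x})")
```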



Bug#933294: Kernel quirks on QNAP TS-109 (Marvell orion)

2019-08-07 Thread Ian Campbell
On Sun, 2019-08-04 at 15:51 +0200, Martin Michlmayr wrote:
> Copying the relevant Debian bug (#933294).
> 
> * Matthieu CERDA [2019-08-04 15:42]:
> > In dmesg during boot, which I suspect means that something is wrong in
> > GPIO / PIC communication.
> > 
> > I tried to replicate the way Linux shuts the NAS down (send 'A' over
> > ttyS1,19200n8) and nothing happened, neither shutdown nor line echo.
> > 
> > Are there known ways for the PIC access to be broken, is this specific
> > to my NAS (but QTS seems to work fine with it) or maybe a regression
> > specific to TS-X09 ?
> 
> I believe your analysis is correct.  This sounds very similar to what
> I saw on a TS-x09 in the past.

Sounds like this is more a kernel issue than a qcontrol one? Shall we
reassign?

> 
> I fetched my TS-109 from the attic and will try to see if I see the
> same issue.
> 



Bug#914517: linux-image-4.19.0-4-amd64: System freezes with nomodeset set and boots in a loop without

2019-04-12 Thread Ian Campbell
On Wed, 2019-04-10 at 23:02 +0200, Robert Pommrich wrote:
> Am 10.04.19 um 22:32 schrieb Ben Hutchings:
> > On Wed, 2019-04-10 at 17:26 +0200, Robert Pommrich wrote:
> > > Hi,
> > > 
> > > I really don't understand why the severity of this bug was lowered.
> > > Plus, this was the only action taken on this bug besides my actions.
> > [...]
> > 
> > The phrase "renders package unusable" means that the package is
> > unusable in general, or on a large proportion of Debian systems.
> > 
> > I don't believe that that is the case for this bug.
> 
> Okay, what would be the right phrase for a system that doesn't start?

https://kernel-team.pages.debian.net/kernel-handbook/ch-bugs.html#s9.1.2
has some info on how the kernel team interprets the severity field.

Ian.



Re: Debian kernel bugs

2019-03-14 Thread Ian Campbell
On Thu, 2019-03-14 at 04:35 +, Russell Coker wrote:
> Is there an archive of all the kernels that have been uploaded to Unstable 
> that I could do a binary search on and find out which version had the change 
> that broke things for me?

snapshot.debian.org should have everything, including the ones which went
to experimental etc.

I don't have any specific advice for debugging suspend/resume though,
sorry (I can say it works ok on my 4th gen carbon X1, but I think
that's not the latest by at least a gen if not two).

Ian.
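
The binary search Russell asks about can be sketched as below. This is a generic illustration rather than a Debian tool: `is_bad` stands for whatever manual test is used (install that kernel, try suspend/resume), and the version labels in the usage example are placeholders.

```python
from typing import Callable, Sequence


def first_bad(versions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Binary search for the first bad version.

    Assumes versions are in upload order, versions[0] is known good,
    versions[-1] is known bad, and that once a version is bad every
    later one is bad as well.
    """
    lo, hi = 0, len(versions) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid
        else:
            lo = mid
    return versions[hi]


if __name__ == "__main__":
    # Placeholder labels; in practice these would be package versions
    # fetched from the archive, tested by hand.
    uploads = [f"v{i}" for i in range(1, 9)]
    print(first_bad(uploads, lambda v: int(v[1:]) >= 6))
```

With N uploads this needs roughly log2(N) boots rather than N, which matters when each test is a manual suspend/resume cycle.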



Bug#917533: linux-image-4.9.0-8-marvell: nfs-kernel-server leaks ports and triggers rkhunter/unhide-tcp

2019-01-05 Thread Ian Campbell
On Fri, 2019-01-04 at 18:06 +, Ian Campbell wrote:
> I'll also ping upstream about a possible stable backport shortly.

Sent to https://marc.info/?l=linux-netdev&m=154662768504311&w=2 (but I
forgot to Cc the bug, sorry, will update here as I hear back).

Ian.



Bug#917533: linux-image-4.9.0-8-marvell: nfs-kernel-server leaks ports and triggers rkhunter/unhide-tcp

2019-01-04 Thread Ian Campbell
On Fri, 2018-12-28 at 09:58 +, Ian Campbell wrote:
> I'm next going to reboot into my locally built kernel with the (likely/hopeful)
> fix applied. I'll follow up in a few days (maybe a week to be sure) if I don't
> see this issue recurring. If it is looking positive at that point I'll also
> ping davem and Trond to request upstream backports.

It's been a week and I've not had a recurrence of the issue; previously I
was seeing it every 2-3 days.

I'm attaching the patch I was using which I described earlier as:

> pkg-kernel git's stretch branch at d9cfad89feb2 ('Revert
> "tracing: Use strlcpy() instead of strcpy() in 
> __trace_find_cmdline()"') plus backports of:
> 
> 8d1b8c62e080 SUNRPC: Refactor TCP socket timeout code into a helper function
> 3ffbc1d65583 net/sunrpc/xprt_sock: fix regression in connection error reporting.
> 9b30889c548a SUNRPC: Ensure we always close the socket after a connection shuts down

I'll also ping upstream about a possible stable backport shortly.

Ian.
From 3f47e65ec1a5b9c456cda19655759d43ec476988 Mon Sep 17 00:00:00 2001
From: Ian Campbell 
Date: Tue, 25 Dec 2018 09:28:51 +
Subject: [PATCH] Backport patches to stop NFS kernel server leak

This tweaks rkhunters hidden port checks
---
 debian/changelog  |  3 +
 ...-always-close-the-socket-after-a-con.patch | 82 +
 ...TCP-socket-timeout-code-into-a-helpe.patch | 88 +++
 ...sock-fix-regression-in-connection-er.patch | 50 +++
 debian/patches/series |  3 +
 5 files changed, 226 insertions(+)
 create mode 100644 debian/patches/bugfix/all/SUNRPC-Ensure-we-always-close-the-socket-after-a-con.patch
 create mode 100644 debian/patches/bugfix/all/SUNRPC-Refactor-TCP-socket-timeout-code-into-a-helpe.patch
 create mode 100644 debian/patches/bugfix/all/net-sunrpc-xprt_sock-fix-regression-in-connection-er.patch

diff --git a/debian/changelog b/debian/changelog
index da70135267ba..1c6b7e88734a 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -734,6 +734,9 @@ linux (4.9.144-1) UNRELEASED; urgency=medium
   * Refresh inet-frags-avoid-abi-change-in-4.9.134.patch for context changes
 in 4.9.142
 
+  [ Ian Campbell ]
+  * Backport NFS fixes to not trigger rkhunter hidden port scan.
+
  -- Ben Hutchings   Sat, 08 Dec 2018 20:53:57 +
 
 linux (4.9.135-1) stretch; urgency=medium
diff --git a/debian/patches/bugfix/all/SUNRPC-Ensure-we-always-close-the-socket-after-a-con.patch b/debian/patches/bugfix/all/SUNRPC-Ensure-we-always-close-the-socket-after-a-con.patch
new file mode 100644
index 000000000000..5e99fe42a090
--- /dev/null
+++ b/debian/patches/bugfix/all/SUNRPC-Ensure-we-always-close-the-socket-after-a-con.patch
@@ -0,0 +1,82 @@
+From b0494c706325fdd1ec6b4fdef1d1f0cc12f4f4ad Mon Sep 17 00:00:00 2001
+From: Trond Myklebust 
+Date: Mon, 5 Feb 2018 10:20:06 -0500
+Subject: [PATCH 3/3] SUNRPC: Ensure we always close the socket after a
+ connection shuts down
+
+Ensure that we release the TCP socket once it is in the TCP_CLOSE or
+TCP_TIME_WAIT state (and only then) so that we don't confuse rkhunter
+and its ilk.
+
+Signed-off-by: Trond Myklebust 
+(cherry picked from commit 9b30889c548a4d45bfe6226e58de32504c1d682f)
+---
+ net/sunrpc/xprtsock.c | 23 ++-
+ 1 file changed, 10 insertions(+), 13 deletions(-)
+
+diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
+index d5422d479a22..5be417ed1311 100644
+--- a/net/sunrpc/xprtsock.c
++++ b/net/sunrpc/xprtsock.c
+@@ -798,13 +798,6 @@ static void xs_sock_reset_connection_flags(struct rpc_xprt *xprt)
+ 	smp_mb__after_atomic();
+ }
+ 
+-static void xs_sock_mark_closed(struct rpc_xprt *xprt)
+-{
+-	xs_sock_reset_connection_flags(xprt);
+-	/* Mark transport as closed and wake up all pending tasks */
+-	xprt_disconnect_done(xprt);
+-}
+-
+ /**
+  * xs_error_report - callback to handle TCP socket state errors
+  * @sk: socket
+@@ -824,9 +817,6 @@ static void xs_error_report(struct sock *sk)
+ 	err = -sk->sk_err;
+ 	if (err == 0)
+ 		goto out;
+-	/* Is this a reset event? */
+-	if (sk->sk_state == TCP_CLOSE)
+-		xs_sock_mark_closed(xprt);
+ 	dprintk("RPC:   xs_error_report client %p, error=%d...\n",
+ 			xprt, -err);
+ 	trace_rpc_socket_error(xprt, sk->sk_socket, err);
+@@ -1619,9 +1609,11 @@ static void xs_tcp_state_change(struct sock *sk)
+ 		if (test_and_clear_bit(XPRT_SOCK_CONNECTING,
+ 					&transport->sock_state))
+ 			xprt_clear_connecting(xprt);
++		clear_bit(XPRT_CLOSING, &xprt->state);
+ 		if (sk->sk_err)
+ 			xprt_wake_pending_tasks(xprt, -sk->sk_err);
+-		xs_sock_mark_closed(xprt);
++		/* Trigger the socket release */
++		xs_tcp_force_close(xprt);
+ 	}
+  out:
+ 	read_unlock_bh(&sk->sk_callback_lock);
+@@ -2227,14 +2219,19 @@ static void xs_tcp_shutdown(struct rpc_xprt *xprt)
+ {
+ 	struct sock_xprt *transport = container_of(xprt, struct sock_xprt, xprt);
+ 	struct socket *sock = 

Re: hardening-check can detect whether kernel is protected or not

2019-01-02 Thread Ian Campbell
On Wed, 2019-01-02 at 03:08 +0100, Mikhail Morfikov wrote:
> Also how to get "not stripped" instead of "stripped" kernel?

It is available as the file `vmlinux` at the root of the source tree
after building, if you still have access to that.

There is also the `linux-image-$(uname -r)-dbg` package, which contains
`./usr/lib/debug/boot/vmlinux-$(uname -r)`, which I think (but am not
entirely sure) is that same binary.

That said, Yves-Alexis is correct that despite being an ELF binary the
kernel is in some ways a bit of a special case, so one shouldn't
necessarily expect tools intended for normal userspace ELF files to
DTRT with it.

Ian.



Bug#917533: linux-image-4.9.0-8-marvell: nfs-kernel-server leaks ports and triggers rkhunter/unhide-tcp

2018-12-28 Thread Ian Campbell
Package: src:linux
Version: 4.9.130-2
Severity: normal
Tags: upstream

Dear Maintainer,

Every few days rkhunter starts reporting in its daily report:

Warning: Hidden ports found:
 Port number: TCP:697

Which corresponds to running unhide-tcp:

# unhide-tcp --lsof
Unhide-tcp 20130526
Copyright © 2013 Yago Jesus & Patrick Gouin
License GPLv3+ : GNU GPL version 3 or later
http://www.unhide-forensics.info
Used options: use_lsof 
[*]Starting TCP checking

Found Hidden port that not appears in ss: 697
lsof reports :
[*]Starting UDP checking
root@armitage:~# unhide-tcp --netstat
Unhide-tcp 20130526
Copyright © 2013 Yago Jesus & Patrick Gouin
License GPLv3+ : GNU GPL version 3 or later
http://www.unhide-forensics.info
Used options: use_netscape 
[*]Starting TCP checking

Found Hidden port that not appears in netstat: 697

Running `service nfs-kernel-server restart` clears it up for a day or two. I
think this corresponds to the report at https://lwn.net/Articles/648417/.
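
As background on what the tool is reporting: as far as I understand it, unhide-tcp brute-forces the port range, treats a port as in use when it cannot be bound, and flags any such port that ss/netstat does not list. The sketch below illustrates that idea only; it is not unhide-tcp's code, and the function names are made up.

```python
import errno
import socket
from typing import Iterable, Set


def port_is_busy(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if binding a TCP socket to `port` fails with EADDRINUSE.

    Binding ports below 1024 normally needs root; a permission error is
    reported as "not busy" here, so run as root for a meaningful scan.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            return exc.errno == errno.EADDRINUSE
    return False


def hidden_ports(busy: Iterable[int], listed: Iterable[int]) -> Set[int]:
    """Ports that are in use but absent from the ss/netstat listing."""
    return set(busy) - set(listed)


if __name__ == "__main__":
    # Example with the port from this report; the "listed" set would come
    # from parsing ss or netstat output, which is left out of this sketch.
    print(hidden_ports(busy={22, 697, 2049}, listed={22, 2049}))
```

A kernel-owned socket that stays bound after its connection has gone away, as described in the commit quoted below, would show up in exactly this difference.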

This report was gathered while running 4.9.130-2 but I had already installed
(but not rebooted into) a new locally built version (4.9.144-1~hellion.0)
which corresponds to pkg-kernel git's stretch branch at d9cfad89feb2 ('Revert
"tracing: Use strlcpy() instead of strcpy() in __trace_find_cmdline()"') plus
backports of:

8d1b8c62e080 SUNRPC: Refactor TCP socket timeout code into a helper function
3ffbc1d65583 net/sunrpc/xprt_sock: fix regression in connection error reporting.
9b30889c548a SUNRPC: Ensure we always close the socket after a connection shuts down

Where the first two are needed for a clean backport of the third which is:

commit 9b30889c548a4d45bfe6226e58de32504c1d682f
Author: Trond Myklebust 
Date:   Mon Feb 5 10:20:06 2018 -0500

SUNRPC: Ensure we always close the socket after a connection shuts down

Ensure that we release the TCP socket once it is in the TCP_CLOSE or
TCP_TIME_WAIT state (and only then) so that we don't confuse rkhunter
and its ilk.

Signed-off-by: Trond Myklebust 

I have a second system, also armel, running the same kernel and also serving
NFS where this is not happening. Its logs lack the:

[83135.994133] nfsd: last server has exited, flushing export cache
[83137.951143] NFSD: starting 90-second grace period (net c0590248)

which is seen on this system and which I think might correspond to the issue
recurring. The other system is perhaps a bit busier with NFS traffic overall.

One final piece of information is that I was previously running (for about a
month if my logs are to be believed) linux-image-4.9.0-0.bpo.8-marvell:armel
4.9.110-3+deb9u5~deb8u1 on Jessie userspace and this was not happening. It only
started when I upgraded to Stretch's userspace and kernel (4.9.130-2). I don't
immediately see anything in `git log v4.9.110..v4.9.130 -- net/sunrpc/` which
would explain the change though. The upgrade to stretch took rkhunter from
1.4.2-0.4+deb8u1 to 1.4.2-6+deb9u1, which did include a bump to the default
configuration file, although I also can't see a smoking gun there based on what
etckeeper says changed (but if I were a betting man I would guess it was a
change to the detection process which exposed this rather than a kernel
regression).

I'm next going to reboot into my locally built kernel with the (likely/hopeful)
fix applied. I'll follow up in a few days (maybe a week to be sure) if I don't
see this issue recurring. If it is looking positive at that point I'll also
ping davem and Trond to request upstream backports.

Thanks,
Ian.

-- Package-specific info:
** Version:
Linux version 4.9.0-8-marvell (debian-kernel@lists.debian.org) (gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1) ) #1 Debian 4.9.130-2 (2018-10-27)

** Command line:
console=ttyS0,115200 root=/dev/ram initrd=0xa0,0x90 ramdisk=32768

** Not tainted

** Kernel log:
[7.882180] raid6: using intx1 recovery algorithm
[7.903700] async_tx: api initialized (async)
[7.911087] xor: measuring software checksum speed
[7.955195]arm4regs  :   725.000 MB/sec
[7.999190]8regs :   435.000 MB/sec
[8.043196]32regs:   633.000 MB/sec
[8.047417] xor: using function: arm4regs (725.000 MB/sec)
[8.097711] md: raid6 personality registered for level 6
[8.103102] md: raid5 personality registered for level 5
[8.108456] md: raid4 personality registered for level 4
[8.154600] md: raid10 personality registered for level 10
[8.423667] random: crng init done
[8.427094] random: 7 urandom warning(s) missed due to ratelimiting
[9.166444] EXT4-fs (dm-0): mounting ext3 file system using the ext4 subsystem
[9.200619] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: (null)
[   12.160956] input: gpio_keys as /devices/platform/gpio_keys/input/input0
[   12.306034] m25p80 spi0.0: m25p128 

Bug#915178: i915: unable to handle kernel NULL pointer dereference at 0000000000000008

2018-12-01 Thread Ian Campbell
I think this might be the same as 914495.

On Sat, 2018-12-01 at 17:33 -0800, Bill Brelsford wrote:
> I get the same behavior (except at 0004) with
> linux-image-4.18.0-3-686-pae.  4.18.0-2 was ok.

On my system I observed that `modprobe i915` had hung (it was killable
and I could try again, but it remained hung), in the dmesg I found a
splat at +0x52 in the same function:

[4.971540] [drm:intel_ctx_workarounds_init [i915]] Number of context specific w/a: 0
[4.972175] [drm:i915_gem_contexts_init [i915]] fake context support initialized
[4.973980] [drm] RC6 disabled, disabling runtime PM support
[4.974029] [drm:init_ring_common [i915]] rcs0 initialization failed [head=f69d8000], fudging
[4.974059] BUG: unable to handle kernel NULL pointer dereference at 0004
[4.974065] *pdpt = 2c14b001 *pde =  
[4.974071] Oops:  [#1] SMP
[4.974075] CPU: 1 PID: 408 Comm: systemd-udevd Not tainted 4.18.0-3-686-pae #1 Debian 4.18.20-2
[4.974078] Hardware name: Gigabyte Technology Co., Ltd. EG45M-DS2H/EG45M-DS2H, BIOS F2 07/18/2008
[4.974129] EIP: gen4_render_ring_flush+0x52/0x100 [i915]
[4.974131] Code: 84 ab 00 00 00 ba 16 00 00 00 89 f0 e8 47 fe ff ff 3d 00 
f0 ff ff 77 6b 89 18 c7 40 04 02 40 00 7a 8b 56 44 8b 92 50 01 00 00 <8b> 52 04 
c7 40 0c 00 00 00 00 c7 40 10 00 00 00 00 83 ca 04 89 50 
[4.974167] EAX: f7a8 EBX: 0222 ECX:  EDX: 
[4.974170] ESI: ec754a00 EDI: ec29e200 EBP: f692fc88 ESP: f692fc80
[4.974173] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00010287
[4.974176] CR0: 80050033 CR2: 0004 CR3: 2c1836c0 CR4: 06f0
[4.974180] Call Trace:
[4.974219]  i915_request_alloc+0x21d/0x350 [i915]
[4.974258]  i915_gem_init+0x264/0x440 [i915]
[4.974290]  i915_driver_load+0xa66/0xdd0 [i915]
[4.974298]  ? acpi_dev_found+0x6c/0x80
[4.974331]  ? i915_pci_remove+0x20/0x20 [i915]
[4.974364]  i915_pci_probe+0x3a/0x70 [i915]
[4.974369]  pci_device_probe+0xc7/0x160
[4.974374]  driver_probe_device+0x2be/0x460
[4.974378]  __driver_attach+0xe1/0x110
[4.974382]  ? driver_probe_device+0x460/0x460
[4.974386]  bus_for_each_dev+0x5a/0x90
[4.974390]  driver_attach+0x19/0x20
[4.974393]  ? driver_probe_device+0x460/0x460
[4.974397]  bus_add_driver+0x12f/0x230
[4.974400]  ? pci_bus_num_vf+0x20/0x20
[4.974404]  driver_register+0x56/0xf0
[4.974407]  ? 0xf82f8000
[4.974411]  __pci_register_driver+0x3d/0x40
[4.974446]  i915_init+0x4b/0x50 [i915]
[4.974452]  do_one_initcall+0x42/0x1a9
[4.974457]  ? kfree+0x145/0x160
[4.974460]  ? kfree+0x145/0x160
[4.974465]  ? _cond_resched+0x17/0x40
[4.974469]  ? kmem_cache_alloc_trace+0x3b/0x1e0
[4.974474]  ? do_init_module+0x21/0x1dc
[4.974477]  ? do_init_module+0x21/0x1dc
[4.974481]  do_init_module+0x50/0x1dc
[4.974486]  load_module.constprop.58+0x2054/0x2690
[4.974491]  sys_finit_module+0x8a/0xe0
[4.974496]  do_fast_syscall_32+0x7f/0x1b0
[4.974500]  entry_SYSENTER_32+0x4e/0x7c
[4.974503] EIP: 0xb7f7bd39
[4.974505] Code: 08 8b 80 5c cd ff ff 85 d2 74 02 89 02 5d c3 8b 04 24 c3 
8b 0c 24 c3 8b 1c 24 c3 8b 3c 24 c3 90 90 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 
c3 90 90 90 90 8d 76 00 58 b8 77 00 00 00 cd 80 90 8d 76 
[4.974542] EAX: ffda EBX: 0011 ECX: 0100e4a0 EDX: 
[4.974545] ESI: 00ffc8b0 EDI: 0100d750 EBP: 000a ESP: bfd97c3c
[4.974549] DS: 007b ES: 007b FS:  GS: 0033 SS: 007b EFLAGS: 0296
[4.974553] Modules linked in: snd_hda_codec_realtek(+) intel_powerclamp(-) 
snd_hda_codec_generic coretemp kvm_intel dvb_usb_dib0700(+) snd_hda_intel 
dib7000m kvm dib0090 ppdev i915(+) dib0070 dib3000mc snd_hda_codec irqbypass 
evdev dibx000_common ir_rc6_decoder dvb_usb snd_hda_core dvb_core rc_rc6_mce 
serio_raw pcspkr video cp210x snd_hwdep snd_pcm usbserial mceusb(+) sg iTCO_wdt 
iTCO_vendor_support rc_core drm_kms_helper snd_timer parport_pc parport snd drm 
soundcore button pcc_cpufreq acpi_cpufreq i2c_algo_bit ext4 crc16 mbcache jbd2 
crc32c_generic fscrypto ecb crypto_simd cryptd aes_i586 sr_mod cdrom sd_mod 
ata_generic pata_it8213 ata_piix firewire_ohci firewire_core crc_itu_t i2c_i801 
libata scsi_mod lpc_ich r8169 mii ehci_pci uhci_hcd ehci_hcd usbcore usb_common
[4.974615] CR2: 0004
[4.974618] ---[ end trace 5620155027e66da5 ]---
[4.974667] EIP: gen4_render_ring_flush+0x52/0x100 [i915]
[4.974670] Code: 84 ab 00 00 00 ba 16 00 00 00 89 f0 e8 47 fe ff ff 3d 00 
f0 ff ff 77 6b 89 18 c7 40 04 02 40 00 7a 8b 56 44 8b 92 50 01 00 00 <8b> 52 04 
c7 40 0c 00 00 00 00 c7 40 10 00 00 00 00 83 ca 04 89 50 
[4.974706] EAX: f7a8 EBX: 0222 ECX:  EDX: 

Bug#900581: linux: Enable Buster kernel features for newer ARM64 servers.

2018-09-02 Thread Ian Campbell
On Fri, 2018-08-31 at 14:52 -0700, Geoff Levand wrote:
> In summary, the latest released m400 firmware
> did not support APEI, and so no special work-around or kernel quirk
> support is needed.

That seems reasonable enough to me, no reason to support random back-
channel (un)released firmware. Ben?

Ian.



Bug#905574: linux-image-4.17.0-0.bpo.1-amd64: cryptsetup missing in intitramfs for kernel 4.17

2018-08-06 Thread Ian Campbell
On Mon, 2018-08-06 at 23:47 +0800, Ben Hutchings wrote:
> On Mon, 2018-08-06 at 14:40 +0100, Ian Campbell wrote:
> > On Mon, 2018-08-06 at 21:15 +0800, Ben Hutchings wrote:
> > > > Inspecting the initramfs shows that the cryptsetup related parts are
> > > > missing for 4.17, but still in the 4.16 kernel.
> > > > 
> > > > I was able to mitigate the issue by using the cryptsetup packages from
> > > > buster.
> > > 
> > > This is strange.  Kernel packages do not determine what goes into the
> > > initramfs.
> > 
> > Possibly the cryptsetup package changed (and became broken). Then the
> > 4.17 initramfs was (re)built (due to the install/upgrade of that
> > kernel) while the 4.16 initramfs wasn't rebuilt.
> 
> In stable?  (There's no backport of cryptsetup.)

Ah, sorry, I didn't read closely enough and somehow thought this was in
unstable.

For stable/bpo s/the cryptsetup package/some package/.

> > I expected that there would be triggers in place which should have
> > caused the 4.16 initramfs (in fact, all initramfses) to be updated if a
> > relevant package (e.g. cryptsetup) was changed, but perhaps that was
> > more in hope than expectation and it's only an initramfs-tools update
> > which would trigger that?
> 
> Installing or upgrading cryptsetup will trigger an update of the newest
> installed kernel version's initramfs.

Thanks, that would potentially explain why 4.16 continues to work (it
has an older working initramfs).

Ian.
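
To compare the two images as the reporter did, one rough approach is to filter the file listing of each initramfs for cryptsetup-related paths. The sketch below assumes the `lsinitramfs` tool from initramfs-tools is installed and that an image path is passed on the command line; the helper name is made up.

```python
import subprocess
import sys
from typing import Iterable, List


def cryptsetup_entries(listing: Iterable[str]) -> List[str]:
    """Return the paths in an initramfs listing that mention cryptsetup."""
    return [line.strip() for line in listing if "cryptsetup" in line]


if __name__ == "__main__":
    # Usage: python3 check_initramfs.py /boot/initrd.img-<version>
    result = subprocess.run(
        ["lsinitramfs", sys.argv[1]],
        check=True,
        capture_output=True,
        text=True,
    )
    found = cryptsetup_entries(result.stdout.splitlines())
    print("\n".join(found) if found else "no cryptsetup files in this image")
```

Running it against both the 4.16 and 4.17 images would show whether the older image simply predates the change that dropped the cryptsetup parts.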



Bug#905574: linux-image-4.17.0-0.bpo.1-amd64: cryptsetup missing in intitramfs for kernel 4.17

2018-08-06 Thread Ian Campbell
On Mon, 2018-08-06 at 21:15 +0800, Ben Hutchings wrote:
> > Inspecting the initramfs shows that the cryptsetup related parts are
> > missing for 4.17, but still in the 4.16 kernel.
> > 
> > I was able to mitigate the issue by using the cryptsetup packages from
> > buster.
> 
> This is strange.  Kernel packages do not determine what goes into the
> initramfs.

Possibly the cryptsetup package changed (and became broken). Then the
4.17 initramfs was (re)built (due to the install/upgrade of that
kernel) while the 4.16 initramfs wasn't rebuilt.

I expected that there would be triggers in place which should have
caused the 4.16 initramfs (in fact, all initramfses) to be updated if a
relevant package (e.g. cryptsetup) was changed, but perhaps that was
more in hope than expectation and it's only an initramfs-tools update
which would trigger that?

Ian.



Bug#900581: linux: Enable Buster kernel features for newer ARM64 servers.

2018-06-14 Thread Ian Campbell
On Wed, 2018-06-13 at 12:25 -0700, Geoff Levand wrote:
> On 06/09/2018 05:15 AM, Ian Campbell wrote:
> 
> > I think this is probably something for the arch (or perhaps platform)
> > code to deal with. See for example all the various platform quirks in
> > arch/x86/kernel/acpi/boot.c, which fixup various wrongness and/or
> > disable features.
> 
> I followed your advice and created a fix in the arm64 acpi init
> code of arch/arm64/kernel/acpi.c.  Here's the submission:
> 
>   https://marc.info/?l=linux-acpi&m=152891415600796&w=2
>   https://www.spinics.net/lists/linux-acpi/msg82887.html

Thanks!



Re: "external abort on linefetch (0x814)" on Kirkwood 6282 SoC

2018-06-09 Thread Ian Campbell
On Sat, 2018-06-09 at 16:23 +0200, Andrew Lunn wrote:
> > Debian uses a Marvell specific kernel, so we don't need to worry about
> > the impact on other platforms.
> 
> That i was not sure about. Are there any plans to merge all ARM v5
> kernels together?

Not AFAIK, marvell is the only armv5 flavour left in Debian and armel
is well past the point where more are likely to be added.

> > > Or do we need to figure out why highmem breaks on Kirkwood?
> > 
> > I guess it would be nice from an upstream PoV to know what was going on
> > -- in particular in case there were to be other more subtle side
> > effects or corruption possible.
> 
> I might be able to hack together a 3.5/0.5G split, so forcing some of
> the 512MB of RAM i have in my Kirkwood into highmem. Hopefully i can
> then reproduce the issue.

A 3.5/0.5 split is a good idea, hadn't occurred to me. None of my QNAP
boxes have more than 512M either.

Ian.



Bug#900581: linux: Enable Buster kernel features for newer ARM64 servers.

2018-06-09 Thread Ian Campbell
On Fri, 2018-06-08 at 12:33 -0700, Geoff Levand wrote:
> On 06/05/2018 12:28 AM, Ian Campbell wrote:
> > On Tue, 2018-06-05 at 02:14 +0100, Ben Hutchings wrote:
> > 
> >> I don't think it's OK to cause a regression like this.  Since this
> >> problem affects a specific known platform, the driver ought to
> >> recognise it and disable itself automatically.
> > 
> > Indeed, while the Fedora bug upthread claims such a patch wouldn't be
> > upstreamable, AFAIK it is not uncommon to have such quirks for broken
> > firmware based upon DMI identifiers or similar.
> 
> Just to mention it, Mark Salter submitted one of the work-around patches
> for the m400 firmware.  The reply from the ACPI maintainer wasn't very
> encouraging. See:
> 
>   https://lkml.org/lkml/2018/4/19/1020 (ACPI / scan: Fix regression
>   related to X-Gene UARTs)

He said:
> I'm not convinced that making changes to the core ACPI device
> enumeration code in order to cover up for firmware bugs is the right
> approach.

That response seems fair, changing the core ACPI code at that point
indeed doesn't seem correct, especially with a one-off special case
(most such things are table and callback driven).

I think this is probably something for the arch (or perhaps platform)
code to deal with. See for example all the various platform quirks in
arch/x86/kernel/acpi/boot.c, which fixup various wrongness and/or
disable features.

Although I would also note that there seems to be ~200 existing DMI
matches under drivers/acpi, just not in the core device enumeration
code, I don't read Rafael's response as ruling out a fix somewhere in
the ACPI code, just not in the enumeration paths as presented there.
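
To illustrate what "table and callback driven" means here: the kernel's DMI quirks are C tables of `struct dmi_system_id` entries checked with `dmi_check_system()`, each entry pairing match strings with a callback. The Python sketch below only mirrors that shape; every name, vendor string and feature flag in it is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Features = Dict[str, bool]


@dataclass(frozen=True)
class Quirk:
    ident: str                      # human-readable name for log messages
    matches: Dict[str, str]         # DMI field -> value that must match
    callback: Callable[[Features], None]


def disable_apei(features: Features) -> None:
    """Hypothetical fixup: turn one feature off on affected firmware."""
    features["apei"] = False


# Hypothetical table; a real one would name actual vendor/product strings.
QUIRKS = [
    Quirk(
        ident="Example Corp Example-400",
        matches={"sys_vendor": "Example Corp", "product_name": "Example-400"},
        callback=disable_apei,
    ),
]


def apply_quirks(dmi: Dict[str, str], features: Features,
                 quirks: List[Quirk] = QUIRKS) -> List[str]:
    """Run the callback of every quirk whose fields all match; return idents."""
    applied = []
    for quirk in quirks:
        if all(dmi.get(field) == value for field, value in quirk.matches.items()):
            quirk.callback(features)
            applied.append(quirk.ident)
    return applied
```

The point of the pattern is that supporting another broken platform means adding a table row, not another special case in the enumeration path.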

> CONFIG_ACPI_APEI allows for automated error reporting, so it is something
> that is very desirable[...]

I don't think anyone is disputing that, but there are tradeoffs to be
made here.

> Is this an acceptable solution?

It should be sent upstream. It at least seems to be a more targeted
fix than the one above.

Has anyone tried to detect this "slave device attached to itself"
situation in a more generic way? Perhaps that would also be worth
discussing with upstream too.

It's an expected consequence of ARM & co's push towards the ACPI model
which effectively requires that the (upstream) kernel must deal with
buggy firmware in the field, just like on x86.

I don't think it is right that the distros should have to carry and
support fixes for this sort of thing, it should be done upstream or by
vendors fixing firmware (and I don't hold out much hope for the latter
if x86 is any indication, especially for a platform now as old as the
m400).

Ian.



Re: "external abort on linefetch (0x814)" on Kirkwood 6282 SoC

2018-06-09 Thread Ian Campbell
(adding debian-kernel, context: external aborts on qnap/marvell systems
with 1G of RAM, avoided with VMSPLIT_3G_OPT=y).

On Sat, 2018-06-02 at 21:31 +0200, Andrew Lunn wrote:
> On Sat, Jun 02, 2018 at 09:48:47PM +0300, Timo Jyrinki wrote:
> > 2018-06-02 18:55 GMT+03:00 Ian Campbell :
> > > You need to append a dtb and then encode in u-boot's uImage format.
> > > e.g.
> > >
> > >cat arch/arm/boot/zImage arch/arm/boot/dts/kirkwood-ts419-6281.dtb > x
> > >sudo mkimage -A arm -T kernel -O linux -C none -a 0x8000 -e 0x8000 -d x uImage
> > 
> > Thank you! Now it's all coming back to me, I'm not sure if I've played
> > with these since Neo FreeRunner times.
> > 
> > So the good news is that with this kernel
> > kernel-kirkwood-ts219-6282-split3gopt from
> > https://people.debian.org/~timo/qnap/ (initrd from
> > http://ftp.debian.org/debian/dists/stretch/main/installer-armel/current/images/kirkwood/network-console/qnap/ts-21x/)
> > I'm getting full 1GB RAM without the errors!
> 
> Cool. Thanks for testing.
> 
> Now, the question is, is this an O.K. workaround?

Hard to say for sure. IIRC the downside of the VMSPLIT_3G_OPT
workaround is a slightly smaller virtual address space (from 3G down to
2.75G) for the userspace part of a process, which would mean that
applications which really needed the full space would suffer.
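
The 3G versus 2.75G figures follow from where the kernel mapping starts in the 32-bit address space. A small worked calculation, assuming the usual 32-bit ARM PAGE_OFFSET values of 0xC0000000 for VMSPLIT_3G and 0xB0000000 for VMSPLIT_3G_OPT (stated from memory of the Kconfig, so treat them as assumptions):

```python
from typing import Tuple

GIB = 1 << 30
ADDRESS_SPACE = 1 << 32  # 32-bit virtual address space

# Assumed PAGE_OFFSET per option (start of the kernel's mapping).
PAGE_OFFSET = {
    "VMSPLIT_3G": 0xC0000000,
    "VMSPLIT_3G_OPT": 0xB0000000,
}


def split(page_offset: int) -> Tuple[float, float]:
    """Return (userspace GiB, kernel GiB) for a given PAGE_OFFSET."""
    user = page_offset / GIB
    kernel = (ADDRESS_SPACE - page_offset) / GIB
    return user, kernel


if __name__ == "__main__":
    for option, offset in PAGE_OFFSET.items():
        user, kernel = split(offset)
        print(f"{option}: user {user} GiB, kernel {kernel} GiB")
```

Under those assumptions the kernel side grows from 1G to 1.25G, which would let 1G of RAM be mapped without highmem, consistent with the errors disappearing.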

There are some use cases which need this; linking large packages comes
immediately to mind, but I don't think Debian runs any armel buildds
on armel (they are running as chroots on armhf systems).

With only 1G of physical RAM anything using the full 3G would be
already so far into swapping hell that it seems like it would be pretty
unusable. So maybe we can assert that it is unlikely that there is any
real world usage that would be impacted by this change.

Only other things which come to mind are applications which require a
full 3G of address space but which don't populate it all with RAM
somehow (v. sparse layouts for dynamic languages perhaps?) or which
are simply buggy with the smaller size (I don't know if there are
precedents on other archs or other arm flavours for this). These seem
unlikely to me, but frankly I'm basing that on no data at all.

Debian uses a Marvell specific kernel, so we don't need to worry about
the impact on other platforms.

> Or do we need to figure out why highmem breaks on Kirkwood?

I guess it would be nice from an upstream PoV to know what was going on
-- in particular in case there were to be other more subtle side
effects or corruption possible.

Ian.



Bug#900581: linux: Enable Buster kernel features for newer ARM64 servers.

2018-06-05 Thread Ian Campbell
On Tue, 2018-06-05 at 02:14 +0100, Ben Hutchings wrote:

> I don't think it's OK to cause a regression like this.  Since this
> problem affects a specific known platform, the driver ought to
> recognise it and disable itself automatically.

Indeed, while the Fedora bug upthread claims such a patch wouldn't be
upstreamable, AFAIK it is not uncommon to have such quirks for broken
firmware based upon DMI identifiers or similar.

Ian.



Re: Fixing Linux getrandom() in stable

2018-05-10 Thread Ian Campbell
On Thu, 2018-05-10 at 10:41 -0700, Russ Allbery wrote:

> It means that the configured timeout for which it's reasonable to wait for
> randomness is centralized in one service that can set that based on
> understanding of what's necessary in practice, and timeouts to catch other
> startup problems can remain in place for other services.  Right now, to
> have krb5-kdc wait for randomness requires extending the startup timeout
> of the service as a whole, thus potentially not diagnosing various other
> problems that might be preventing the KDC from starting unrelated to
> randomness.

Would it also mean that the user would see messages like "Waiting for
rng to be ready" instead of "Waiting for $someservice to be ready" in
the boot logs? I think it would and, if so, that seems useful in its
own right as well.

Ian.



Re: Debian kernel packaging no longer accepts typical downstream version numbering.

2018-05-07 Thread Ian Campbell
On Mon, 2018-05-07 at 13:36 +0100, peter green wrote:
> Common practice for downstreams (whether complete derivatives or end
> users) is to version modified packages with a version number like
> 
> 4.16.5-1+something1
> 
> Where "something" is the name of a project, the name of the person
> performing the modification etc.
> 
> Unfortunately with 4.16.5-1 of the kernel package such a version
> number is no longer accepted with the error message "Invalid debian
> linux version". It seems the cause of this was the following change.
> 
>   (?P
> -[^-]+
> +[^-+]+
>   )
> 
> Reverting this change allowed me to get a successful control files
> generation.

This was also reported in #898087
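
To illustrate the effect of the quoted change, here is a minimal sketch.
The two patterns are simplified stand-ins for the revision part of the
check (the full expression is not reproduced above), and `revision_ok`
is a hypothetical helper, not a function from the kernel packaging.

```python
import re

# Simplified stand-ins for the revision character class before and after
# the quoted change; the real pattern is larger and is not shown here.
OLD_REVISION = re.compile(r"[^-]+")
NEW_REVISION = re.compile(r"[^-+]+")


def revision_ok(pattern, version):
    """Return True if the text after the last '-' fully matches pattern."""
    revision = version.rsplit("-", 1)[-1]
    return pattern.fullmatch(revision) is not None


for version in ("4.16.5-1", "4.16.5-1+something1"):
    print(version,
          revision_ok(OLD_REVISION, version),
          revision_ok(NEW_REVISION, version))
```

Under these assumptions the plain "4.16.5-1" passes both patterns, while
the "+something1" suffix only passes the older one, which is consistent
with the behaviour Peter describes.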



Re: D-Link DNS-323 support dropped in Debian stretch

2018-03-28 Thread Ian Campbell
On Tue, 2018-03-27 at 21:25 +0300, Aaro Koskinen wrote:
> Hi,
> 
> On Mon, Mar 26, 2018 at 07:36:26PM +0900, Roger Shimizu wrote:
> > There's one possibility that can bring back qnap, or even D-Link
> > DNS device:
> > - create a new flavour for armel, such as armel-none-mini
> > - the new flavour will disable many features that other common
> > kernels
> > have, such as wireless, crypto, etc.
> 
> Disable all other features, except what's needed for disk access and kexec
> (perhaps still leave serial console :)). Then with simple scripting boot
> the full featured kernel from external storage using kexec. Such minimal
> kernel should be fairly stable from maintenance point of view.

This, and similar things (like chainloading a more capable u-boot),
have been suggested repeatedly over the last few years, what is needed
is for someone to actually try/do it.

Ian.



Re: Bug#870185: armel/marvell kernel size

2017-08-19 Thread Ian Campbell
On Sat, 2017-08-19 at 12:57 +0900, Roger Shimizu wrote:
> I know for bug #870185, Robert fixed his device by modifying uboot
> params, but I guess it's still possible to keep uboot params and only
> change the boot addresses of kernel/initrd in flash-kernel db file.

In https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=870185#35 I
concluded it wasn't, perhaps I'm wrong though.

Can someone confirm whether the issue is that the u-boot load addresses
for kernel and initrd are conflicting or if it is the kernel's
target/decompression address and u-boot's initrd address which are
conflicting? The symptoms seem to suggest the kernel is being
decompressed over the initrd.

Since we don't want to change the u-boot we can't easily influence the
initrd address (since it doesn't use a u-boot header on this platform).
So it would seem the only possibility (other than shrinking the kernel)
would be to change the address at which it decompresses itself so it
doesn't conflict with the initrd.

Not sure how hard that would be -- it might be as simple as tweaking a
constant in the source or a Kconfig option, but I suspect it might
involve changing early boot assembly...

Ian.



Bug#870185: FATAL: kernel 4.11.0-0.bpo.1-marvell does not boot on QNAP TS-219P II

2017-08-13 Thread Ian Campbell
On Sun, 2017-08-13 at 11:56 +0200, Robert Schlabbach wrote:
> From: "Ian Campbell" <i...@debian.org>
> > There is one other option, which is to ask people to adjust their
> > u-boot boot scripts as Robert has done, however the QNAP systems are
> > often run headless and without access to the serial console (it's a
> > special cable which only a minority of users will have access to) so
> > that really is a last resort.
> 
> Note that it is possible to modify the u-boot environment without the
> serial console, using the "fw_setenv" command in a running debian
> system.

Given a correct /etc/fw_env.config, yes. I think the TS-xxx ones are
pretty well known although I don't think we ship any anywhere and we
certainly don't install one automatically.

> So one possibility would indeed be to modify the flash-kernel scripts
> to use fw_printenv, "parse" the environment to detect affected systems
> and, if needed, use fw_setenv to make the necessary changes.
> 
> That's not really a "pretty" solution, though, and any bugs in that
> function could easily brick the device. Then again, there have been
> "bricking" changes in the past, so it wouldn't be an entirely new
> risk ;-)
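
As a rough illustration of the "parse" step suggested above, the sketch
below turns `name=value` lines, the form in which fw_printenv prints the
environment, into a dictionary. The function name and the sample text
are invented for illustration; it only reads text and never calls
fw_setenv, since that is where the bricking risk lies.

```python
def parse_uboot_env(text):
    """Parse 'name=value' lines into a dict, splitting on the first '='."""
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep:  # ignore lines that contain no '=' at all
            env[name] = value
    return env


# Invented sample; a real QNAP environment will look different.
sample = "bootdelay=3\nbootcmd=example load commands; bootm\n"
env = parse_uboot_env(sample)
print(env["bootcmd"])
```

Detecting an affected system would then be a matter of inspecting the
relevant variable, which is left out here because the real variable
contents are not shown in this thread.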

Do you know if this particular brickage is undone by the recovery
procedure (http://www.cyrius.com/debian/kirkwood/qnap/ts-219/recovery/)?
I think one of the mtd devices which is backed up is the boot loader
config (mtd4) so I think the answer must be yes, but I've not tried it.

Ian.

> At least making the change _without_ flashing a new kernel should not
> be harmful as the moved initramfs location appears to be backwards
> compatible (though I've tried only 4.9 in practice).
> 
> Best regards,
> -Robert
> 



Bug#870185: FATAL: kernel 4.11.0-0.bpo.1-marvell does not boot on QNAP TS-219P II

2017-08-13 Thread Ian Campbell
On Sat, 2017-08-12 at 21:49 +0100, Ben Hutchings wrote:
> On Mon, 2017-07-31 at 18:10 +0200, Robert Schlabbach wrote:
> > Ok, I figured it out. I noticed that the 4.11 kernel has a more
> > "generous" memory layout than the 4.9 one:
> > 
> > kernel 4.9:
> > 
> > [0.00] Memory: 504492K/524288K available (3777K kernel code, 371K 
> > rwdata, 1128K rodata, 296K init, 247K bss, 19796K reserved, 0K 
> > cma-reserved, 0K highmem)
> > 
> > kernel 4.11:
> > 
> > [0.00] Memory: 502648K/524288K available (4096K kernel code, 398K 
> > rwdata, 1132K rodata, 1024K init, 248K bss, 21640K reserved, 0K 
> > cma-reserved, 0K highmem)
> > 
> > So I suspected that the 4.11 kernel might be overwriting/corrupting
> > the initrd.img provided in memory before it gets to unpack it, and
> > changed the memory location from 0xa0 to 0xc0:
> [...]
> > Voila! It's finally booting!
> > 
> > So, was the 4.11 kernel compiled/linked with a wrong alignment
> > padding setting? Or should the bootloader environment be changed to
> > permanently use the higher address for passing initrd.img to the
> > kernel?
> 
> Should this be assigned to flash-kernel?

Sadly probably not.

There are three relevant load addresses, the one in the uboot header
added to the kernel (added by flash-kernel) and the two baked into the
uboot boot script, one for the kernel and one for the initrd. In some
systems there is a fourth one in the uboot header on the initrd binary
but the QNAP systems don't appear to use that one, the initrd in flash
is the raw one.

Robert is modifying the boot script load address for the initrd which
flash-kernel has no control over. flash-kernel only controls the
address in the kernel u-boot header and IIRC that has to match a build
time constant in the kernel, so while we could perhaps coordinate a
change here I don't think there would be an appropriate kernel load
address which would help very much here since AIUI the conflicting
addresses which cause overlaps are the boot script ones.
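
A small sketch of the overlap reasoning, under stated assumptions: the
section sizes are taken from the two "Memory:" lines Robert quoted and
are used only as a rough proxy for the kernel's footprint, while the
kernel and initrd addresses below are invented for illustration and are
not the real QNAP layout.

```python
KIB = 1024

# Section sizes in KiB, copied from the quoted "Memory:" lines.
KERNEL_4_9 = {"code": 3777, "rwdata": 371, "rodata": 1128, "init": 296, "bss": 247}
KERNEL_4_11 = {"code": 4096, "rwdata": 398, "rodata": 1132, "init": 1024, "bss": 248}

# Hypothetical layout, chosen only to make the example concrete.
KERNEL_START = 0
INITRD_START = 6 * 1024 * KIB
INITRD_SIZE = 2 * 1024 * KIB


def footprint_kib(sections):
    """Sum the section sizes as a rough estimate of the kernel footprint."""
    return sum(sections.values())


def overlaps(start_a, size_a, start_b, size_b):
    """Return True if the two half-open byte ranges intersect."""
    return start_a < start_b + size_b and start_b < start_a + size_a


for name, sections in (("4.9", KERNEL_4_9), ("4.11", KERNEL_4_11)):
    size = footprint_kib(sections) * KIB
    clash = overlaps(KERNEL_START, size, INITRD_START, INITRD_SIZE)
    print(name, footprint_kib(sections), "KiB, overlaps initrd:", clash)
```

With this made-up layout the smaller kernel stops short of the initrd
and the larger one runs into it, which is the kind of failure the
symptoms suggest.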

The only thing I can think of would be simply reducing the size of the
armel kernel binary. I believe Roger was already looking into that? 

I don't think looking into reducing the size of the initrd will help
since it is loaded second in RAM. I suppose it is worth double checking
that /etc/initramfs-tools/initramfs.conf is using MODULES=dep (instead
of most). I think d-i arranges that automatically on these platforms so
it is highly probable Robert is already using it, Robert can you
confirm?

Relatedly it does seem here like perhaps the kernel's limit on kernel
size on these platforms needs to be shrunk to take into account the
boot time RAM layout considerations and not just the flash partition
sizes. Roger, what do you think?

There is one other option, which is to ask people to adjust their
u-boot boot scripts as Robert has done, however the QNAP systems are
often run headless and without access to the serial console (it's a
special cable which only a minority of users will have access to) so
that really is a last resort.

There's also the chainloaded u-boot solution, but realistically no one
appears to be working on that (me included).

Ian.



Bug#870430: linux-image-4.9.0-3-marvell: Couldn't find DTB in /usr/lib/linux-image-4.9.0-3-marvell or /etc/flash-kernel/dtbs

2017-08-02 Thread Ian Campbell
Control: reassign -1 src:flash-kernel 3.79

On Tue, 2017-08-01 at 23:39 +0200, noone never wrote:
> Package: src:linux
> Version: 4.9.30-2+deb9u2
> Severity: important
> Dear Maintainer,
> 
> When I dist-upgrade my Sheevaplug from jessie to stretch, I get this error:
> Couldn't find DTB  in /usr/lib/linux-image-4.9.0-3-marvell or 
> /etc/flash-kernel/dtbs
> The file /usr/lib/linux-image-4.9.0-3-marvell does exist in the filesystem.
> [...]
> /usr/share/flash-kernel/functions: line 155: warning: command substitution: 
> ignored null byte in input
> I: The initramfs will attempt to resume from /dev/sda3
> I: (UUID=6fcd0aea-6301-4c47-a3fd-9f5eb2c1f8b5)
> I: Set the RESUME variable to override this.
> W: mdadm: /etc/mdadm/mdadm.conf defines no arrays.
> /usr/share/flash-kernel/functions: line 155: warning: command substitution: 
> ignored null byte in input
> Using DTB: kirkwood-sheevaplug.dtb
> Couldn't find DTB  in /usr/lib/linux-image-4.9.0-3-marvell or 
> /etc/flash-kernel/dtbs

These errors are from the flash-kernel hook rather than the kernel
package. You seem to be running stable so I have tagged it as being in
the stable version (3.79) and reassigned. Please let me know if this is
an incorrect assumption.

It says it is looking for "kirkwood-sheevaplug.dtb" and from [0] the
file /usr/lib/linux-image-4.9.0-3-marvell/kirkwood-sheevaplug.dtb is
present in the package -- I suppose it is also present on your
filesystem?

The message:
Couldn't find DTB  in /usr/lib/linux-image-4.9.0-3-marvell or 
/etc/flash-kernel/dtbs
is interesting since the double space in "DTB  in" is supposed to
contain $dtb_name, i.e. the path to look for, but it does not.
Please could you attach the full output of running `sh -x
/usr/sbin/flash-kernel`, maybe that will include a clue as to where
things have gone astray.

There is also the warning from line 155 of `functions`, which is
`machine="$(get_dt_model)"` where:
PROC_DTMODEL="${FK_PROC_DRMODEL:-/proc/device-tree/model}"
[...]
get_dt_model() {
cat "$PROC_DTMODEL"
}

Assuming you haven't had cause to override FK_PROC_DRMODEL (note typo
in there), does /proc/device-tree/model exist on your system, and if so
what does it contain? The output of `cat -vet /proc/device-tree/model`
might be informative since it will escape any special characters and
make them visible. I suspect this is a harmless trailing nul and it
seems like it is correctly deciding to use kirkwood-sheevaplug.dtb so
this "ignored null byte" message may just be a benign warning.
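
For comparison, here is a minimal sketch of reading the model string
while dropping the trailing NUL explicitly (device tree string
properties are NUL-terminated). The helper name and the sample string
are made up for illustration and this is not how flash-kernel itself
does it.

```python
import os
import tempfile


def read_dt_model(path="/proc/device-tree/model"):
    """Read a device tree string property and strip trailing NUL bytes."""
    with open(path, "rb") as f:
        raw = f.read()
    return raw.rstrip(b"\0").decode("utf-8", errors="replace")


# Demonstrate on a temporary file holding an invented model string, so
# the example does not depend on running on a device tree platform.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"Example Board\0")
model = read_dt_model(tmp.name)
os.unlink(tmp.name)
print(repr(model))
```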

Ian.

[0] 
https://packages.debian.org/stretch/armel/linux-image-4.9.0-3-marvell/filelist



Re: armel/marvell kernel size

2017-07-30 Thread Ian Campbell


On 30 July 2017 19:05:18 BST, Roger Shimizu  wrote:
>On Sat, Jul 22, 2017 at 8:40 AM, Ben Hutchings 
>wrote:
>> On Wed, 2017-05-03 at 23:12 +0100, Ben Hutchings wrote:
>>> linux 4.11-1~exp1 FTBFS on armel.  I spent a little while
>modularising
>>> some things that were unnecessarily built-in, but the image size
>will
>>> still be very close to the current limit of 2 MiB.  If it grows
>beyond
>>> this we'll lose support for many QNAP models.
>>
>> It's now (with 4.12.2-1~exp1) over 2 MiB; please look at this.
>
>While I'm still working on this, but I find the latest kernel in
>archive, 4.11.11-1+b1, fails to boot on my kirkwood based Linkstation.
>I tried netconsole, but don't get any log. (netconsole outputs fine on
>working kernels, such as 4.9 and 4.10 series.)
>And I trace to latest working kernel is 4.10.7-1~exp1.
>
>4.11-1_exp[12] FTBFS on armel, and 4.11.3-1_exp1 doesn't boot.
>
>I checked d-kernel and d-arm list, but didn't find similar issue.
>Do you have any clue on this? Thanks!

I found that with 4.11 the initrd was too big for my ts41x. Could it be that?

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Bug#852324: Disable CONFIG_DEBUG_WX in order to avoid this issue.

2017-07-26 Thread Ian Campbell
On Wed, 2017-07-26 at 19:23 +0200, Helio Loureiro wrote:
> Hi,
> 
> I already sent.

Please provide a full serial log when booting with the bad kernel.

Also please let us know what version of Xen you are running and whether
this was running as a guest or as dom0.

> And in the first post in this bug it says:
> 
> "
> When I boot my system with Xen, I get the following section in dmesg:
> [   13.588386] WARNING: CPU: 18 PID: 1 at 
> /build/linux-zDY19G/linux-4.8.15/arch/x86/mm/dump_pagetables.c:225 
> note_page+0x5e8/0x790
> [...]
> [   13.588392] CPU: 18 PID: 1 Comm: swapper/0 Not tainted 4.8.0-2-amd64 #1 
> Debian 4.8.15-2
> [...]
> But when I boot my system 'normally', ie without Xen, the error does
> not
> show up."
> He clearly states it isn't booting.  It is crashing.

The original bug report was running the same kernel as the splat at the
point where reportbug was run, so it is booting at least far enough to
do that. See the "Kernel: Linux 4.8.0-2-amd64" in the original report,
which is also in the warning message.

There is no suggestion anywhere that it isn't booting in the original
report, just that when it does boot this message appears in the logs.
All he says is that this warning (he says "error", but that doesn't
imply a boot failure either) doesn't appear with the normal kernel.

See also the logs at
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=852324#25 which show
the system continuing to successfully boot after the splat.

> How could that not be related?

Because you are experiencing some other bug later on, or perhaps the
check which results in a WARNING for the original poster has a bug in
it which is causing a hang for you, please provide us with the
information we need in order to diagnose which it is rather than
continuing to assert that it is the same issue, otherwise no progress
is going to happen here.

Ian.



Bug#852324: Disable CONFIG_DEBUG_WX in order to avoid this issue.

2017-07-26 Thread Ian Campbell
On Wed, 2017-07-26 at 18:51 +0200, Helio Loureiro wrote:
> Hi,
> 
> It can't be.  It is the same bug as described in this one.
> 
> If you read the first post, it can't boot and shows the same content
> as in the bug I detected now on my system.

Nowhere there does it say anything about failing to boot.

Please boot a working kernel and use reportbug to file a new bug,
describing your configuration (e.g. version of Xen etc) and please then
attach your serial logs of the issue occurring.

> So it isn't a new bug.

Yes, it is.

Ian.



Bug#852324: Disable CONFIG_DEBUG_WX in order to avoid this issue.

2017-07-26 Thread Ian Campbell
This option does not (per its intent) prevent booting, it is just a
check & warning.

There may be a bug with the check which is causing a failure to boot,
but you are the first to report that aspect and that isn't what #852324
was about until now.

Please use `reportbug kernel` to report a fresh bug describing your
specific circumstances and your failure mode.

Ian.

On Wed, 2017-07-26 at 18:32 +0200, Helio Loureiro wrote:
> Hi,
> 
> VM doesn't boot with this parameter enabled, as confirmed by Linus
> mail.  So my upgrade to Stretch led to a complete system outage
> because of this parameter.
> 
> I held on kernel 3.19 from Jessie meanwhile.
> 
> Best Regards,
> Helio Loureiro
> http://helio.loureiro.eng.br
> https://se.linkedin.com/in/helioloureiro
> http://twitter.com/helioloureiro
> 
> 
> 2017-07-26 17:56 GMT+02:00 Ian Campbell <i...@debian.org>:
> > On Wed, 2017-07-26 at 17:13 +0200, Helio Loureiro wrote:
> > > Hi,
> > >
> > > As much as it sounds correct to protect systems in this way, you
> > > broke compatibility.  I'm back to kernel 3.19 until this is fixed.
> > >
> > > So in order to have such parameter enabled, you should at the
> > > least provide a bootparam option to toggle enabled or not.
> > >
> > > From my point of view as user, you should never break backward
> > > compatibility, as bad as it sounds in terms of security.  And you
> > > should never enforce it to users.
> > 
> > Other than a single warning printed to dmesg during boot, what is
> > actually broken for you?
> > 
> > Ian.
> > 
> 
> 



Bug#852324: Disable CONFIG_DEBUG_WX in order to avoid this issue.

2017-07-26 Thread Ian Campbell
On Wed, 2017-07-26 at 17:13 +0200, Helio Loureiro wrote:
> Hi,
> 
> As much as it sounds correct to protect systems in this way, you broke
> compatibility.  I'm back to kernel 3.19 until this is fixed.
> 
> So in order to have such parameter enabled, you should at the least
> provide a bootparam option to toggle enabled or not.
> 
> From my point of view as user, you should never break backward
> compatibility, as bad as it sounds in terms of security.  And you should
> never enforce it to users.

Other than a single warning printed to dmesg during boot, what is
actually broken for you?

Ian.



Re: Removing inactive members from Alioth 'kernel' project

2017-05-17 Thread Ian Campbell
On Wed, 2017-05-17 at 15:51 +0100, Ben Hutchings wrote:
> Finally, Ian Campbell has two accounts as members and I would like to
> remove the guest account (ijc-guest).

Ack.

In fact my ijc-guest hat has just removed himself from the project. It
took a small leap of faith that clicking the trash can next to the
kernel project would leave the project rather than deleting it or that
it would at least prompt me first ;-)

Ian.



Bug#860808: ocfs2 blocks jbd2

2017-05-16 Thread Ian Campbell
On Sat, 2017-05-13 at 08:41 -0500, Russell Mosemann wrote:
> package: src:linux
> Version: 4.9.18-1~bpo8+1
> Severity: important
>  
> Is anyone from ocfs2 addressing these bugs?

They won't see this message because you haven't sent it to them.

Please take this directly to the ocfs2-devel mailing list (details at
[0]). Please feel free to keep cc-ing the Debian bug at
86...@bugs.debian.org (although I think we have more than enough stack
traces at this point, thanks).

The same goes for #841144, I think, assuming you think they are
different bugs. If they are the same they should be merged, let me know
and I can do that, but I think one is fs/ocfs2/alloc.c:1514 and the
other fs/jbd2/transaction.c:297 so it is reasonable to suppose they
might be different until told otherwise by the OCFS2 people.

Thanks,
Ian.

[0] https://oss.oracle.com/mailman/listinfo/ocfs2-devel



Bug#854854: Cherry pick and backport patches for rtc-s35390a (Was: Bug#854854: qcontrol: reboot/poweroff)

2017-04-19 Thread Ian Campbell
Control: retitle -1 Cherry pick and backport patches for rtc-s35390a to Jessie
Control: found -1 3.16.39-1+deb8u2
Control: fixed -1 4.6.4-1

stable@ please see the list of requested backports at the end.

debian-*: I tried (and succeeded!) to reassign to src:linux and tried
(and failed!) to merge with 794266, actually that failure was probably
for the best. I've tried to reflect the fact that 794266 was fixed in
4.6.4-1 but that the issue is still found in Jessie and the stable
3.16.y tree (since I don't see the fix in the changelog of either) with
the above runes.

(the original bug report was all in the $subject, so reproducing here
since the retitle should overwrite it:

After new installation of Debian the system is properly powering 
down either by shutdown or by power button. Also the restart is working
by power button. But after some days running the system the machine
is not powering down completely anymore. The status-LED is blinking
red/green, power LED is blue. The system can only be restarted by
disconnecting from power. The power button is not working in this
moment. This effect happens if I stop by shutdown command as well as
by power button.
)

On Mon, 2017-04-03 at 21:09 +0200, Alexandre Belloni wrote:
> Hi,
> 
> On 16/03/2017 at 09:45:07 +0100, Uwe Kleine-König wrote:
> > Back then I prepared a backport to jessie (which still sits in a
> > topic
> > branch in the debian-kernel repo[2]) but it seems I forgot to merge
> > it into
> > the jessie kernel.
> > 
> > I wonder if the better fix would be to fix this in linux-stable
> > instead.
> > What do you think?
> > 
> 
> I agree that this should be backported to any stable kernel that is used
> by a distribution. I can ack the patch if you send one to stable@

According to 
https://anonscm.debian.org/cgit/kernel/linux.git/commit/?id=1d69dac66c315a290fb61c5f400e056e8d01fe50
linked from
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=854854#15

The set of requested backports is:
f87e904ddd8f rtc: s35390a: fix reading out alarm
8e6583f1b5d1 rtc: s35390a: implement reset routine as suggested by the reference
3bd32722c827 rtc: s35390a: improve irq handling

all of which are in mainline already. They should/could be backported
to the active LTS branches between v3.7..v4.5 AFAICT.

Uwe, can you confirm for stable@ that these are the patches and
releases where these are needed?

I also see
https://anonscm.debian.org/cgit/kernel/linux.git/tree/debian/patches/bugfix/all/rtc-s35390a-initialize-all-fields.patch?id=1d69dac66c315a290fb61c5f400e056e8d01fe50
in your branch but not sure if that is needed upstream or if it was
intended only for the distro packages.

Uwe, maybe you should merge your Debian kernel package branch in the
meantime and we can drop the patches when they hit stable?

Ian.



Bug#841144: kernel BUG at /build/linux-Wgpe2M/linux-4.8.11/fs/ocfs2/alloc.c:1514!

2017-03-14 Thread Ian Campbell
OCFS2 folks, any thoughts on this crash?

On Tue, 2017-01-17 at 02:12 +, Ben Hutchings wrote:
> On Mon, 2017-01-16 at 13:12 -0600, Russell Mosemann wrote:
> [...]
> > Jan 15 17:31:03 vhost032 kernel: [ cut here ]
> > Jan 15 17:31:03 vhost032 kernel: kernel BUG at 
> > /build/linux-Wgpe2M/linux-4.8.11/fs/ocfs2/alloc.c:1514!
> 
> This is:
> 
> static int ocfs2_grow_tree(handle_t *handle, struct ocfs2_extent_tree *et,
>    int *final_depth, struct buffer_head **last_eb_bh,
>    struct ocfs2_alloc_context *meta_ac)
> {
> ...
> BUG_ON(meta_ac == NULL);
> 
> > [...]
> > Jan 15 17:31:03 vhost032 kernel: Call Trace:
> > Jan 15 17:31:03 vhost032 kernel:  [] ? 
> > ocfs2_set_buffer_uptodate+0x35/0x4a0 [ocfs2]
> > Jan 15 17:31:03 vhost032 kernel:  [] ? 
> > __find_get_block+0xa7/0x110
> > Jan 15 17:31:03 vhost032 kernel:  [] ? 
> > ocfs2_split_and_insert+0x307/0x490 [ocfs2]
> > Jan 15 17:31:03 vhost032 kernel:  [] ? 
> > ocfs2_split_extent+0x3ee/0x560 [ocfs2]
> > Jan 15 17:31:03 vhost032 kernel:  [] ? 
> > ocfs2_change_extent_flag+0x273/0x450 [ocfs2]
> > Jan 15 17:31:03 vhost032 kernel:  [] ? 
> > ocfs2_mark_extent_written+0x110/0x1d0 [ocfs2]
> > Jan 15 17:31:03 vhost032 kernel:  [] ? 
> > ocfs2_dio_end_io_write+0x44d/0x600 [ocfs2]
> 
> meta_ac is passed down from ocfs2_dio_end_io_write(), which allocates
> it using ocfs2_lock_allocators()... but the latter only allocates it
> conditionally.  It seems like the condition is wrong somehow.

This still seems to be happening for this user with 4.9.13, looking at
"git log -p v4.9.13..origin/master -- fs/ocfs2" I wonder if
https://git.kernel.org/torvalds/c/3e10b793fc40dfdbe51762e0d084bd6f2c8acaaa
might be relevant?

The commit message mentions meta_ac not getting allocated and an extent
split vs refcount split differentiation and we have ocfs2_split_extent
in the trace. Slim reasoning I know, maybe someone who knows the code
could make a better determination.

As Ben said before the whole bug report can be found at
https://bugs.debian.org/841144

Ian.



Re: Bug#852596: iio-sensor-proxy: spams syslog with an error message every second

2017-01-29 Thread Ian Campbell
On Sun, 2017-01-29 at 14:30 +0530, Ritesh Raj Sarraf wrote:
> On Fri, 2017-01-27 at 22:04 +0100, iqwue Wabv wrote:
> > upstream of i-s-p suggests that this bug should be reassigned to kernel 
> > package.
> > ps.
> > said 27.01.2017 on https://github.com/hadess/iio-sensor-proxy/issues/134
> 
> Yes. I am aware of that. The Debian Kernel Policy, usually, is to not include
> changes that aren't in the kernel already. And this fix seems to have gone for
> 4.10 or 4.11, whereas the Debian Stretch kernel will be based on 4.9 LTS.
> 
> I'm adding debian-kernel to this reply, in case the policy has changed.

It has never been as strict as you suggest, backports from mainline are
ok too (and always have been AFAIK) if justified. For especially urgent
or important fixes then "in a maintainers tree and so well on the way
to mainline" can be the minimum necessary bar.

> If this fix gets included in the Linux 4.9 LTS tree upstream, then Debian 
> would
> automatically pull it for Stretch inclusion.

This is true too, but there's no particular need to wait for that
process to unfold.

> But, at this moment, this isn't even in Linus's tree.

Can you point to the commit in the relevant upstream kernel maintainer's tree?

Ian.



Bug#852324: x86/mm: Found insecure W+X mapping

2017-01-27 Thread Ian Campbell
On Mon, 2017-01-23 at 18:32 +, Ben Hutchings wrote:
> Any ideas, Ian?

I'm afraid I'm not so up to speed on Xen things since I changed jobs
last year, looks like people from xen-devel are already involved
though.

Ian.



Re: [PATCH] x86/kbuild: enable modversions for symbols exported from asm

2016-12-12 Thread Ian Campbell
On Sat, 2016-12-10 at 13:41 +0100, Greg Kroah-Hartman wrote:
> Now I don't work on a distro anymore, but I would think that something
> like this would be really useful, pointing out exactly what changed is
> very important for distro maintainers to determine what they want to do

The .symvers produced by the current scheme aren't completely useless
from this PoV, although they aren't ideal since you need both before
and after trees and if the changes are large or far reaching the diff can
get a bit unwieldy, so better tooling which points directly to the
actual relevant change would be no bad thing.

Ian.



Re: [PATCH] x86/kbuild: enable modversions for symbols exported from asm

2016-12-09 Thread Ian Campbell
On Fri, 2016-12-09 at 13:33 +1000, Nicholas Piggin wrote:
> 
> Well I simply tested the outcome. If you have:
> 
> struct blah {
>   int x;
> };
> int foo(struct blah *blah)
> {
>   return blah->x;
> }
> EXPORT(foo);
> 
> $ nm vmlinux | grep __crc_foo
> a0cf13a0 A __crc_foo
> 
> Now change to
> 
> struct blah {
>   int y;
>   int x;
> };
> 
> $ nm vmlinux | grep __crc_foo
> a0cf13a0 A __crc_foo
> 
> It just doesn't catch these things.

I found the same when I just added your snippet to init/main.c.

_But_ when I moved the struct into include/types.h (which happened to
be included by init/main.c) then, with just x in the struct:

$ make -s init/main.{o,symtypes} && grep -E foo\|blah init/main.symtypes && 
nm init/main.o  | grep __crc_foo
s#blah struct blah { int x ; } 
foo int foo ( s#blah * ) 
0cd0312e A __crc_foo

but adding y:

$ make -s init/main.{o,symtypes} && grep -E foo\|blah init/main.symtypes && 
nm init/main.o  | grep __crc_foo
s#blah struct blah { int x ; int y ; } 
foo int foo ( s#blah * ) 
eda220c6 A __crc_foo

So it does catch things in that case.

With struct blah inline in main.c it was:

$ make -s init/main.{o,symtypes} && grep -E foo\|blah init/main.symtypes && 
nm init/main.o  | grep __crc_foo
s#blah struct blah { UNKNOWN } 
foo int foo ( s#blah * ) 
a0cf13a0 A __crc_foo

So I suppose it only cares about structs which are in headers, which I
guess makes sense. I think it is working in at least one of the
important cases.
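
The behaviour above can be mimicked with a toy model in which the
checksum is computed over the symbol's signature plus the expansion of
every type it references, as recorded in the .symtypes output. This is
only a sketch: `zlib.crc32` is a stand-in for whatever genksyms really
computes, so the values will not match the ones shown above, and
`type_crc` is a hypothetical helper.

```python
import zlib


def type_crc(*expansions):
    """Checksum the joined type expansions (stand-in for the real CRC)."""
    return zlib.crc32(" ".join(expansions).encode()) & 0xFFFFFFFF


FOO = "foo int foo ( s#blah * )"

# Struct defined in a header: its members are part of the expansion.
header_x = type_crc(FOO, "s#blah struct blah { int x ; }")
header_xy = type_crc(FOO, "s#blah struct blah { int x ; int y ; }")

# Struct defined inline in main.c: recorded as UNKNOWN both times.
inline_before = type_crc(FOO, "s#blah struct blah { UNKNOWN }")
inline_after = type_crc(FOO, "s#blah struct blah { UNKNOWN }")

print("header case changed:", header_x != header_xy)
print("inline case changed:", inline_before != inline_after)
```

In this model a layout change is only visible when the struct body makes
it into the expansion, which matches the header versus inline difference
seen in the experiment.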

Ian.



Bug#847154: linux-image-amd64: Disabling vsyscall interface may break docker run

2016-12-06 Thread Ian Campbell
On Tue, 2016-12-06 at 14:17 +, Ben Hutchings wrote:
> 
> But perhaps we should be more explicit in this message, e.g.:
> 
> "This breaks (e)glibc 2.13 and earlier, which may still be installed in
> a chroot or container environment based on Debian 7, RHEL/CentOS 6 or
> earlier versions."

That's a good idea!

Ian.



Bug#845611: linux-image-4.8.0-1-marvell: hard drive not detected on LinkStation Pro (LS-GL)

2016-11-25 Thread Ian Campbell
On Fri, 2016-11-25 at 08:56 -0800, Ryan Tandy wrote:
> On Fri, Nov 25, 2016 at 04:24:31PM +0000, Ian Campbell wrote:
> > 
> > Is it possible that there are multiple variants of this one with
> > differing numbers of disks?
> > 
> > It's a little tricky to google for but all the LS-GL's I can see
> > _look_ like they are single disk (see [*] below). Or maybe the
> > naming is just confusing?
> 
> 
> [...snip...]

Thanks for all that.

> Mine has zero mods though, totally stock hardware-wise.

Oh, so the issue is not that you are missing a second disk, but rather
that the one disk you do have seems to not be detected? Your theory is
that it is actually only the second port which is connected to anything
in this hw?

Sorry, took me a while to grok what you were saying in your original
mail!

Ian.



Bug#845611: linux-image-4.8.0-1-marvell: hard drive not detected on LinkStation Pro (LS-GL)

2016-11-25 Thread Ian Campbell
On Fri, 2016-11-25 at 08:10 -0800, Ryan Tandy wrote:
> On Fri, Nov 25, 2016 at 07:42:37AM +0000, Ian Campbell wrote:
> > 
> > Assuming orion5x-linkstation.dtsi is correctly relating to your
> > platform you would appear to want one of:
> >    $ git grep orion5x-linkstation.dtsi arch/arm/boot/dts/*.dts
> >    arch/arm/boot/dts/orion5x-kuroboxpro.dts:#include "orion5x-
> > linkstation.dtsi"
> >    arch/arm/boot/dts/orion5x-linkstation-lsgl.dts:#include
> > "orion5x-linkstation.dtsi"
> >    arch/arm/boot/dts/orion5x-linkstation-lswtgl.dts:#include
> > "orion5x-linkstation.dtsi"
> > 
> > (Or something else not yet present in the kernel tree.)
> 
> Mine is an LS-GL, so orion5x-linkstation-lsgl.dts should be the one. 

Is it possible that there are multiple variants of this one with
differing numbers of disks?

It's a little tricky to google for but all the LS-GL's I can see _look_
like they are single disk (see [*] below). Or maybe the naming is just
confusing?

Ian.

[0] http://www.gkspk.com/view/techie/upgrade-hdd-buffalo-linkstation-ls-gl-nas/
[1] http://buffalo.nas-central.org/wiki/Category:LSPro reached from the
"LS-GL v1" or "v2" link at http://buffalo.nas-central.org/wiki/Main_Page
[2] http://buffalo.nas-central.org/wiki/Information/LSPROOverview which
is the result of searching for "LS-GL" on that site
[3] http://buffalo.nas-central.org/wiki/Disassemble_the_LS_Pro_v1/LS_Live_v1



Bug#845611: linux-image-4.8.0-1-marvell: hard drive not detected on LinkStation Pro (LS-GL)

2016-11-24 Thread Ian Campbell
On Thu, 2016-11-24 at 23:24 -0800, Ryan Tandy wrote:
> whereas the new orion5x-linkstation.dtsi contains this code:
> 
>  {
> status = "okay";
> nr-ports = <1>;
> };

A .dtsi is an include file, not the final thing for any given device,
for that you need a .dts (which is compiled into a .dtb).

Assuming orion5x-linkstation.dtsi is correctly relating to your
platform you would appear to want one of:
$ git grep orion5x-linkstation.dtsi arch/arm/boot/dts/*.dts
arch/arm/boot/dts/orion5x-kuroboxpro.dts:#include "orion5x-linkstation.dtsi"
arch/arm/boot/dts/orion5x-linkstation-lsgl.dts:#include "orion5x-linkstation.dtsi"
arch/arm/boot/dts/orion5x-linkstation-lswtgl.dts:#include "orion5x-linkstation.dtsi"

(Or something else not yet present in the kernel tree.)

Of the three above, orion5x-kuroboxpro.dts and
orion5x-linkstation-lswtgl.dts both set ports to 2 using:

 {
nr-ports = <2>;
};

which overrides the default from the dtsi. You mentioned
kurobox_pro-setup.c so perhaps orion5x-kuroboxpro.dts is the one you want?
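
To see at a glance which board files pull in the dtsi and whether they
override the port count, something along these lines might help. It is
only a sketch: dts_port_overrides is a name made up here (it is not part
of the kernel tree), and it assumes it is run from the top of a kernel
source tree unless another directory is given.

```shell
#!/bin/sh
# Sketch: list the .dts files in a directory that include a given .dtsi,
# together with any nr-ports value they set themselves.
# dts_port_overrides is a hypothetical helper, not part of the kernel tree.
dts_port_overrides() {
    dtsi=$1
    dir=${2:-arch/arm/boot/dts}
    # -l prints only the names of the files containing the #include line.
    grep -l -F "#include \"$dtsi\"" "$dir"/*.dts 2>/dev/null |
    while read -r f; do
        # Take the number between < and > on the first nr-ports line, if any.
        ports=$(sed -n 's/.*nr-ports *= *<\([0-9]*\)>.*/\1/p' "$f" | head -n 1)
        printf '%s nr-ports=%s\n' "$(basename "$f")" "${ports:-dtsi-default}"
    done
}
```

Called as "dts_port_overrides orion5x-linkstation.dtsi", a file that
prints "dtsi-default" is one that leaves the value from the include
untouched.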

Ian.



Bug#834505: arm64 boot failure with large physical memory range

2016-08-22 Thread Ian Campbell
On Mon, 2016-08-22 at 11:03 +0100, Leif Lindholm wrote:
> 
> > I thought there was a control bit on ARMv8 too which made it cause a
> > fault if the code loaded through, stored via, branched to etc an
> > address with bits set between the maximum physical address bit and the
> > bits architecturally reserved for tagging at the top end of the word,
> > but perhaps my memory has simply fabricated that out of thin air?
> 
> I don't remember if there's a dedicated bit for that, but certainly
> judicious use of TTBCR/TTBRn should be able to achieve the same.

If it's possible then that seems like a good thing to do to me, to
avoid surprises and to be consistent with other arches.

OTOH as I now understand from Ard's response the problem here is not
lack of masking, but rather too much masking.

Ian.



Bug#834505: arm64 boot failure with large physical memory range

2016-08-21 Thread Ian Campbell
On Sun, 2016-08-21 at 11:42 +0100, Leif Lindholm wrote:
> 
> You're not wrong, but unfortunately the ability to write semi-portable
> code left the planet over a decade ago. For clarification - the
> problem is not with regards to code written specifically for arm64 and
> not verified with different MMU-configurations, but with code written
> for x86 and never tested on anything else.

Wouldn't the equivalent scenario on x86 result in taking faults (#GP
IIRC) due to them being non-canonical addresses? (Which are basically
Intel's way of forbidding the use of bits above the physical address
size when registers are used in pointer-like ways, i.e. for
loads/stores or jump targets etc).

I thought there was a control bit on ARMv8 too which made it cause a
fault if the code loaded through, stored via, branched to etc an
address with bits set between the maximum physical address bit and the
bits architecturally reserved for tagging at the top end of the word,
but perhaps my memory has simply fabricated that out of thin air?

Ian.



Bug#833561: arm64: dtbs no longer installed in vendor subdir

2016-08-06 Thread Ian Campbell
On Sat, 2016-08-06 at 11:14 +0100, Ian Campbell wrote:
> On Fri, 2016-08-05 at 18:37 -0700, Martin Michlmayr wrote:
> > 
> > I ran dtbs_install manually to verify:
> > 
> > ARCH=arm64 make  -n -C debian/build/build_arm64_none_arm64 dtbs_install INSTALL_DTBS_PATH=/home/tbm/debian/linux-4.7~rc7/x
> > echo '  INSTALL arch/arm64/boot/dts/amd/amd-overdrive.dtb'; mkdir -p /home/tbm/debian/linux-4.7~rc7/x; cp arch/arm64/boot/dts/amd/amd-overdrive.dtb /home/tbm/debian/linux-4.7~rc7/x
> 
> With pristine upstream 4.7-rc7 I see the subdirs, must be something in
> our local patches?
> 
> $ make ARCH=arm64 defconfig dtbs dtbs_install INSTALL_DTBS_PATH=$HOME/tmp/dtbs -j12 CROSS_COMPILE=aarch64-linux-gnu- -s &&  find ~/tmp/dtbs | head -n 10
> /home/ijc/tmp/dtbs
> /home/ijc/tmp/dtbs/altera
> /home/ijc/tmp/dtbs/altera/socfpga_stratix10_socdk.dtb
> /home/ijc/tmp/dtbs/mediatek
> /home/ijc/tmp/dtbs/mediatek/mt6795-evb.dtb
> /home/ijc/tmp/dtbs/mediatek/mt8173-evb.dtb
> /home/ijc/tmp/dtbs/broadcom
> /home/ijc/tmp/dtbs/broadcom/ns2-svk.dtb
> /home/ijc/tmp/dtbs/broadcom/vulcan-eval.dtb
> /home/ijc/tmp/dtbs/amd

Things seem ok with the debian/4.7_rc7-1_exp1 tag in our git repo:

git reset --hard debian/4.7_rc7-1_exp1
git clean -fdqx debian
./debian/rules orig
./debian/rules debian/control
make -f debian/rules.gen setup_arm64
cd debian/build/build_arm64_none_arm64/
rm -rf ~/tmp/dtbs && make ARCH=arm64 dtbs dtbs_install INSTALL_DTBS_PATH=$HOME/tmp/dtbs CROSS_COMPILE=aarch64-linux-gnu- -s &&  find ~/tmp/dtbs | head -n 10

Results in:
/home/ijc/tmp/dtbs
/home/ijc/tmp/dtbs/amd
/home/ijc/tmp/dtbs/amd/amd-overdrive.dtb
/home/ijc/tmp/dtbs/amd/amd-overdrive-rev-b1.dtb
/home/ijc/tmp/dtbs/amd/amd-overdrive-rev-b0.dtb
/home/ijc/tmp/dtbs/amd/husky.dtb
/home/ijc/tmp/dtbs/nvidia
/home/ijc/tmp/dtbs/nvidia/tegra210-smaug.dtb
/home/ijc/tmp/dtbs/nvidia/tegra132-norrin.dtb
/home/ijc/tmp/dtbs/nvidia/tegra210-p2371-2180.dtb

Or similar to your original:

$ make -n -C debian/build/build_arm64_none_arm64/ dtbs_install INSTALL_DTBS_PATH=$HOME/tmp/dtbs V=1 | head -n 15
make: Entering directory '/local/scratch/ijc/development/debian/pkg-kernel/master.git/debian/build/build_arm64_none_arm64'
make -C /local/scratch/ijc/development/debian/pkg-kernel/master.git O=/local/scratch/ijc/development/debian/pkg-kernel/master.git/debian/build/build_arm64_none_arm64/. dtbs_install
make -C /local/scratch/ijc/development/debian/pkg-kernel/master.git/debian/build/build_arm64_none_arm64 KBUILD_SRC=/local/scratch/ijc/development/debian/pkg-kernel/master.git \
-f /local/scratch/ijc/development/debian/pkg-kernel/master.git/Makefile dtbs_install
/local/scratch/ijc/development/debian/pkg-kernel/master.git/Makefile:681: Cannot use CONFIG_CC_STACKPROTECTOR_STRONG: -fstack-protector-strong not supported by compiler
make -f /local/scratch/ijc/development/debian/pkg-kernel/master.git/scripts/Makefile.dtbinst obj=arch/arm64/boot/dts
mkdir -p /home/ijc/tmp/dtbs
make -f /local/scratch/ijc/development/debian/pkg-kernel/master.git/scripts/Makefile.dtbinst obj=arch/arm64/boot/dts/al
make[4]: Nothing to be done for '__dtbs_install'.
make -f /local/scratch/ijc/development/debian/pkg-kernel/master.git/scripts/Makefile.dtbinst obj=arch/arm64/boot/dts/altera
make[4]: Nothing to be done for '__dtbs_install'.
make -f /local/scratch/ijc/development/debian/pkg-kernel/master.git/scripts/Makefile.dtbinst obj=arch/arm64/boot/dts/amd
echo '  mkdir -p /home/ijc/tmp/dtbs/amd; cp arch/arm64/boot/dts/amd/amd-overdrive.dtb /home/ijc/tmp/dtbs/amd'; mkdir -p /home/ijc/tmp/dtbs/amd; cp arch/arm64/boot/dts/amd/amd-overdrive.dtb /home/ijc/tmp/dtbs/amd
echo '  mkdir -p /home/ijc/tmp/dtbs/amd; cp arch/arm64/boot/dts/amd/amd-overdrive-rev-b0.dtb /home/ijc/tmp/dtbs/amd'; mkdir -p /home/ijc/tmp/dtbs/amd; cp arch/arm64/boot/dts/amd/amd-overdrive-rev-b0.dtb /home/ijc/tmp/dtbs/amd
echo '  mkdir -p /home/ijc/tmp/dtbs/amd; cp arch/arm64/boot/dts/amd/amd-overdrive-rev-b1.dtb /home/ijc/tmp/dtbs/amd'; mkdir -p /home/ijc/tmp/dtbs/amd; cp arch/arm64/boot/dts/amd/amd-overdrive-rev-b1.dtb /home/ijc/tmp/dtbs/amd
echo '  mkdir -p /home/ijc/tmp/dtbs/amd; cp arch/arm64/boot/dts/amd/husky.dtb /home/ijc/tmp/dtbs/amd'; mkdir -p /home/ijc/tmp/dtbs/amd; cp arch/arm64/boot/dts/amd/husky.dtb /home/ijc/tmp/dtbs/amd

But indeed in
https://packages.debian.org/experimental/arm64/linux-image-4.7.0-rc7-arm64-unsigned/filelist
the subdirectory is missing.

I'm at a complete loss as to why you and the buildd have such different
results.
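
For comparing install trees, one quick check is to list any .dtb files
sitting directly in the top-level directory rather than in a vendor
subdirectory. A rough sketch follows; flat_dtbs is a name invented
here, and it relies on the -maxdepth option that GNU and BSD find
provide.

```shell
#!/bin/sh
# Sketch: print .dtb files installed directly under the given directory,
# i.e. ones that did not end up in a vendor subdirectory such as amd/.
# flat_dtbs is a hypothetical helper name.
flat_dtbs() {
    find "$1" -maxdepth 1 -type f -name '*.dtb' | sort
}
```

Running "flat_dtbs ~/tmp/dtbs" on a tree like the one above should print
nothing; any output would point at the flattened layout from the bug
report.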



Bug#833561: arm64: dtbs no longer installed in vendor subdir

2016-08-06 Thread Ian Campbell
On Fri, 2016-08-05 at 18:37 -0700, Martin Michlmayr wrote:
> I ran dtbs_install manually to verify:
> 
> ARCH=arm64 make  -n -C debian/build/build_arm64_none_arm64 dtbs_install INSTALL_DTBS_PATH=/home/tbm/debian/linux-4.7~rc7/x
> echo '  INSTALL arch/arm64/boot/dts/amd/amd-overdrive.dtb'; mkdir -p /home/tbm/debian/linux-4.7~rc7/x; cp arch/arm64/boot/dts/amd/amd-overdrive.dtb /home/tbm/debian/linux-4.7~rc7/x

With pristine upstream 4.7-rc7 I see the subdirs, must be something in
our local patches?

$ make ARCH=arm64 defconfig dtbs dtbs_install INSTALL_DTBS_PATH=$HOME/tmp/dtbs -j12 CROSS_COMPILE=aarch64-linux-gnu- -s &&  find ~/tmp/dtbs | head -n 10
/home/ijc/tmp/dtbs
/home/ijc/tmp/dtbs/altera
/home/ijc/tmp/dtbs/altera/socfpga_stratix10_socdk.dtb
/home/ijc/tmp/dtbs/mediatek
/home/ijc/tmp/dtbs/mediatek/mt6795-evb.dtb
/home/ijc/tmp/dtbs/mediatek/mt8173-evb.dtb
/home/ijc/tmp/dtbs/broadcom
/home/ijc/tmp/dtbs/broadcom/ns2-svk.dtb
/home/ijc/tmp/dtbs/broadcom/vulcan-eval.dtb
/home/ijc/tmp/dtbs/amd

Ian.



Bug#830268: linux: please make the build reproducible

2016-07-07 Thread Ian Campbell
On Thu, 2016-07-07 at 20:52 +0200, Reiner Herrmann wrote:
> While working on the "reproducible builds" effort [1], we have noticed
> that linux could not be built reproducibly.
> Since we started varying the shell used for /bin/sh (bash vs. dash),
> linux no longer builds reproducibly.

OOI what is the motivation for varying the build environment in this
way?

Obviously a package built with $SHELL should build reproducibly with
the same $SHELL, no matter which $SHELL is chosen, so long as it is
consistent. From the diff, though, that doesn't seem to be the goal
here; rather it is to build with $SHELL_A and then rebuild with a
different $SHELL_B.

I thought part of the reproducible builds effort included ensuring a
reproducible build environment too (through .buildinfo etc). Is
changing the shell different to changing the compiler or some library
build dep?

Ian.



Re: dtbs_install recursing on subdirs-y and dtbs-subdir leading to race?

2016-03-16 Thread Ian Campbell
On Wed, 2016-03-16 at 09:04 +0000, Russell King - ARM Linux wrote:
> On Wed, Mar 16, 2016 at 08:54:34AM +0000, Ian Campbell wrote:
> > Where it appears that multiple instances of __dtbs_install_prep have
> > been running in parallel in at least the apm and arm subdirectories of
> > arch/arm64/boot/dts, with both of them then racing in the
> > $(Q)if [ -d $(INSTALL_DTBS_PATH) ]; then mv $(INSTALL_DTBS_PATH) $(INSTALL_DTBS_PATH).old; fi
> > rule since apparently $(INSTALL_DTBS_PATH) existed during the "-d"
> > check but had gone by the time of the move.
> 
> I've already sent a patch several times to remove this line, I believe
> it's finally queued for this merge window.

Yes, as I said further down in my mail:

I understand that the mv bit of the rule in question is likely to be
removed quite soon[1] but I think the underlying race / extra recursion
still exists and might have other implications.

(where [1] was a link to your patch).

I still think it is unexpected (or at least unintended) that this rune
is run in all subdirectories.

Ian.



Bug#813881: linux-image-4.3.0-1-armmp install wrong dtb on Wandboard Quad Rev B1

2016-03-16 Thread Ian Campbell
On Mon, 2016-03-14 at 10:48 -0700, Vagrant Cascadian wrote:
> On 2016-03-14, Ian Campbell wrote:
> > On Sun, 2016-02-07 at 19:50 -0800, Vagrant Cascadian wrote:
> >> On 2016-02-06, Heinrich Schuchardt wrote:
> >> > Booting with u-boot-imx requires imx6q-wandboard-revb1.dtb.
> >> > linux-image-4.3.0-1-armmp installs imx6q-wandboard.dtb
> >> > leaving me with a system that will not boot.
> >> >
> >> > With imx6q-wandboard-revb1.dtb the system boots.
> ...
> >> When the revc was added, backwards compatibility was broken by renaming
> >> the revb .dtb file instead of keeping it and introducing the revc in a
> >> new .dtb... kind of hard to fix correctly now...
> 
> >> Adding support for flash-kernel to copy multiple, or even optionally all
> >> .dtb files could at least work around the issue.
> 
> > flash-kernel's DTB entry can reference a script to run which prints the
> > DTB filename to use, so if you can distinguish the variants by poking
> > at /sys etc (e.g the current sole user is kirkwood-qnap which looks at
> > properties of the PCI host bridge etc) then that might be an option?
> 
> I still think it would be better to copy multiple .dtb files, to make
> sure all variants are available. This also makes it possible to use the
> same SD card image on multiple wandboards.

That would be fine too.

> > Were any of these boards supported in Jessie?
> 
> In Jessie, they both work using the same .dtb provided by linux 3.16.x,
> although installing 4.x from jessie-backports on a wandboard rev B might
> cause issues.

OK, so we do need to worry about the upgrade path then.

> > If so then making upgrade work smoothly would be nice, but if not then
> > this might just be a case of Testing/Unstable users having
> > occasionally to manually fix things, but once this is done and the
> > correct DTB is in use flash-kernel should from then on DTRT and
> > Stretch will just work for fresh installs.
> 
> Upgrading u-boot is the tricky part, as we don't currently automatically
> upgrade u-boot (and it's a bit tricky to do so). Depending on which
> u-boot version is installed, u-boot will set fdtfile to a value that may
> not be correct depending on which combination of linux + flash-kernel +
> board variant is being booted.

That would imply that the "wrong" u-boot was running on the board,
wouldn't it? Does u-boot actually need updating or is "setenv fdtfile
...; saveenv" sufficient?

> I think this can partially be worked around by updating the wandboard
> bootscript to have fallbacks to /boot/dtb-$ver (like the u-boot-generic
> bootscript). Then the user can set the appropriate .dtb in
> /etc/flash-kernel/db.

With support for installing multiple dtb files we would perhaps want to
still have a notion of a "primary" DTB, i.e. the one linked to
/boot/dtb-$ver and if that is the case then using the script callout to
try and pick the most appropriate fallback would make sense to me.

Ian.



dtbs_install recursing on subdirs-y and dtbs-subdir leading to race?

2016-03-16 Thread Ian Campbell
Hello,

As part of an automated build of the Debian Linux kernel packages I
think we have observed a race (or at least some unexpected extra
recursion) in the handling of dtb-subdirs vs subdirs-y in
arch/arm64/boot/dts when using make -j.

Looking at the log at [0] and removing the unrelated stuff happening
due to other parallelism we see:

make[8]: Nothing to be done for '__dtbs_install'.
mv: cannot stat '/«PKGBUILDDIR»/debian/linux-image-4.5.0-rc7-arm64/usr/lib/linux-image-4.5.0-rc7-arm64': No such file or directory
mv: cannot stat '/«PKGBUILDDIR»/debian/linux-image-4.5.0-rc7-arm64/usr/lib/linux-image-4.5.0-rc7-arm64': No such file or directory
/«PKGBUILDDIR»/scripts/Makefile.dtbinst:26: recipe for target '__dtbs_install_prep' failed
make[8]: *** [__dtbs_install_prep] Error 1
/«PKGBUILDDIR»/scripts/Makefile.dtbinst:26: recipe for target '__dtbs_install_prep' failed
make[8]: *** [__dtbs_install_prep] Error 1
  INSTALL net/bridge/netfilter/ebtable_nat.ko
/«PKGBUILDDIR»/scripts/Makefile.dtbinst:46: recipe for target 'apm' failed
make[7]: *** [apm] Error 2
make[7]: *** Waiting for unfinished jobs
/«PKGBUILDDIR»/scripts/Makefile.dtbinst:46: recipe for target 'arm' failed
make[7]: *** [arm] Error 2
make[8]: Nothing to be done for '__dtbs_install'.
arch/arm64/Makefile:103: recipe for target 'dtbs_install' failed
make[6]: *** [dtbs_install] Error 2
Makefile:146: recipe for target 'sub-make' failed
make[5]: *** [sub-make] Error 2
Makefile:24: recipe for target '__sub-make' failed
make[4]: *** [__sub-make] Error 2
make[4]: Leaving directory '/«PKGBUILDDIR»/debian/build/build_arm64_none_arm64'
debian/rules.real:394: recipe for target 'install-image_arm64_none_arm64_dt' failed
make[3]: *** [install-image_arm64_none_arm64_dt] Error 2
make[3]: Leaving directory '/«PKGBUILDDIR»'
debian/rules.real:362: recipe for target 'install-image_arm64_none_arm64' failed
make[2]: *** [install-image_arm64_none_arm64] Error 2
make[2]: *** Waiting for unfinished jobs

Where it appears that multiple instances of __dtbs_install_prep have
been running in parallel in at least the apm and arm subdirectories of
arch/arm64/boot/dts, with both of them then racing in the
$(Q)if [ -d $(INSTALL_DTBS_PATH) ]; then mv $(INSTALL_DTBS_PATH) $(INSTALL_DTBS_PATH).old; fi
rule since apparently $(INSTALL_DTBS_PATH) existed during the "-d"
check but had gone by the time of the move. The build is in an
automated buildd pristine environment with INSTALL_DTBS_PATH pointing
to a brand new directory, so for $(INSTALL_DTBS_PATH) to exist at all
there must have been a third instance of __dtbs_install_prep earlier
which created it.
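
The suspected interleaving can be replayed by hand. What follows is
only an illustration of the check-then-move pattern in a scratch
directory, run in a fixed order rather than truly in parallel; it is not
the Kbuild rule itself and the variable names are made up.

```shell
#!/bin/sh
# Replay, in a fixed order, what two parallel copies of the
# "if [ -d X ]; then mv X X.old; fi" rule could do.
scratch=$(mktemp -d)
path="$scratch/dtbs"        # stands in for $(INSTALL_DTBS_PATH)
mkdir "$path"               # created by some earlier step

# Step 1: both instances run the -d check while the directory exists.
a_saw_dir=no; b_saw_dir=no
[ -d "$path" ] && a_saw_dir=yes
[ -d "$path" ] && b_saw_dir=yes

# Step 2: instance A moves the directory away.
a_status=0
[ "$a_saw_dir" = yes ] && { mv "$path" "$path.old" 2>/dev/null || a_status=$?; }

# Step 3: instance B acts on its stale check; the source is gone, so mv fails.
b_status=0
[ "$b_saw_dir" = yes ] && { mv "$path" "$path.old" 2>/dev/null || b_status=$?; }

echo "instance A mv status: $a_status"
echo "instance B mv status: $b_status"
rm -rf "$scratch"
```

Instance B's mv fails because the path it tested for has already been
renamed, which matches the "cannot stat" messages in the log.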

I understand that the mv bit of the rule in question is likely to be
removed quite soon[1] but I think the underlying race / extra recursion
still exists and might have other implications.

Ben and I have hypothesised that this is because
arch/arm64/boot/dts/Makefile has:

subdir-y:= $(dts-dirs)

which means that dtbs_install recurses twice, once due to the dts-dirs
handling in scripts/Makefile.dtbinst rules (via $(dtbsinst-dirs)) and
once again for the (generic) subdir-y handling. BTW many of the subdir
Makefiles have the same construct, I'm not sure why since they have no
sub-sub-dirs, although it seems more harmless in that context.

By my reading the __dtbs_install_prep rule is supposed to run once as
part of arch/*/boot/dts/Makefile and not as part of each of the
subdirectories.

I've experimented with removing the subdir-y assignment, but that seems
to break the dtbs and clean rules.

I'm not sure how else this can/should be fixed with Kbuild. Any ideas?

Thanks,
Ian.

[0] https://buildd.debian.org/status/fetch.php?pkg=linux&arch=arm64&ver=4.5~rc7-1~exp1&stamp=1457444794
[1] https://git.kernel.org/cgit/linux/kernel/git/mmarek/kbuild.git/commit/?id=5399eb9b39081d6d2fc1a13d4ea85f1ceb2c8b44



Bug#813881: linux-image-4.3.0-1-armmp install wrong dtb on Wandboard Quad Rev B1

2016-03-14 Thread Ian Campbell
On Sun, 2016-02-07 at 19:50 -0800, Vagrant Cascadian wrote:
> On 2016-02-06, Heinrich Schuchardt wrote:
> > Booting with u-boot-imx requires imx6q-wandboard-revb1.dtb.
> > linux-image-4.3.0-1-armmp installs imx6q-wandboard.dtb
> > leaving me with a system that will not boot.
> >
> > With imx6q-wandboard-revb1.dtb the system boots.
> 
> As you've noted, flash-kernel has no way of distinguishing which variant
> to support, though u-boot does. I've sometimes wondered whether u-boot
> should pass a boot argument for which .dtb to use...
> 
> To make matters worse, in older versions of the linux kernel, such as
> the 3.16.x in jessie, imx6q-wandboard.dtb may actually be for wandboard
> revb variants, not for the revc variants (although I have one of each,
> both running the same .dtb in jessie without obvious problem, though I
> don't make use of the wifi or bluetooth on either).
> 
> When the revc was added, backwards compatibility was broken by renaming
> the revb .dtb file instead of keeping it and introducing the revc in a
> new .dtb... kind of hard to fix correctly now...
> 
> Adding support for flash-kernel to copy multiple, or even optionally all
> .dtb files could at least work around the issue.


flash-kernel's DTB entry can reference a script to run which prints the
DTB filename to use, so if you can distinguish the variants by poking
at /sys etc (e.g the current sole user is kirkwood-qnap which looks at
properties of the PCI host bridge etc) then that might be an option?
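
As a rough illustration of such a callout, the function below maps a
board revision to one of the two file names mentioned in this bug. How
the revision would actually be detected is the open question, so here it
is simply passed in as an argument. Both the function name and the
revision string "b1" are assumptions made for the sketch, not something
flash-kernel or the wandboard defines.

```shell
#!/bin/sh
# Sketch of a DTB-selection callout: print the DTB file name to use.
# pick_wandboard_dtb and the revision strings are hypothetical; detecting
# the revision (e.g. from something under /sys) is left out on purpose.
pick_wandboard_dtb() {
    case "$1" in
        b1|B1) echo imx6q-wandboard-revb1.dtb ;;
        *)     echo imx6q-wandboard.dtb ;;
    esac
}
```

Falling back to the plain imx6q-wandboard.dtb for anything unrecognised
keeps the current behaviour for boards the script knows nothing about.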

Were any of these boards supported in Jessie? If so then making upgrade
work smoothly would be nice, but if not then this might just be a case
of Testing/Unstable users having occasionally to manually fix things,
but once this is done and the correct DTB is in use flash-kernel should
from then on DTRT and Stretch will just work for fresh installs.

Ian.



Bug#812540: Add ARCH_HISI for Lemaker HiKey support

2016-03-13 Thread Ian Campbell
On Sat, 2016-03-12 at 09:00 -0800, Vagrant Cascadian wrote:
> On 2016-03-12, Ian Campbell wrote:
> > On Fri, 2016-03-11 at 20:03 -0800, Vagrant Cascadian wrote:
> >> On 2016-03-10, Martin Michlmayr wrote:
> >> > * Ian Campbell <i...@debian.org> [2016-01-25 09:57]:
> >> Most drivers aren't even available in 4.4.x, and some aren't even in
> >> 4.5.x yet. From a brief glance, the dts files for
> >> arch/arm64/boot/dts/hisilicon/hi6220*, don't look to contain support for
> >> much more than the CPU, memory and serial...
> >> 
> >> 
> >> > This might be a good starting point:
> >> > https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1098644.html
> >> 
> >> Reading that patch, I'm guessing the following are available and
> >> possibly needed or desired, though possibly may require device-tree
> >> patches as well:
> >> available in 4.4.x:
> >> 
> >> CONFIG_ARCH_HISI=y
> >> CONFIG_POWER_RESET_HISI=y
> >> CONFIG_HISI_THERMAL=m
> >> CONFIG_MMC_DW=m
> >> CONFIG_MMC_DW_K3=m
> >> CONFIG_I2C_DESIGNWARE_PLATFORM=m
> >> 
> >> available in 4.5.x:
> >> 
> >> CONFIG_COMMON_RESET_HI6220=m
> >> CONFIG_PHY_HI6220_USB=m
> >> 
> >> linux-next tag next-20160311:
> >> 
> >> CONFIG_HI6220_MBOX=m
> >> CONFIG_REGULATOR_HI655X=m
> >> 
> >> 
> >> Will build a few test kernels and report back...
> >
> > Thanks, for things other than the ones listed above the ones I'd be
> > most curious about would be serial and networking (wifi only on this
> > platform IIRC). Both of those could possibly be supported already via
> > existing generic drivers, I've no idea...
> 
> Not sure about WIFI, but serial seems already enabled:
> 
>   CONFIG_SERIAL_AMBA_PL011=y
>   CONFIG_SERIAL_AMBA_PL011_CONSOLE=y

That makes sense, thanks for the info. Since MMC + serial in 4.4 is a
basically useful thing (and I'm hoping the WiFi is some preexisting
driver too) this seemed like a useful set of stuff to enable. So I have
done so in git.

Ian.



Bug#818059: linux-image-4.4.0-1-armmp-lpae: Missing usb-power-supply in DTB for OlinuXino-A20-LIME2

2016-03-13 Thread Ian Campbell
Control: tag -1 +upstream

On Sun, 2016-03-13 at 09:17 +0100, Michael Haas wrote:
> 
> I presume this would need to be fixed upstream in 
> arch/arm/boot/dts/sun7i-a20-olinuxino-lime2.dts
> but I don't know where to direct that specific bug report.

Please report it (or, better, send a patch) to the linux-sunxi mailing
list[0]. I'd also recommend copying this subset of the output of
get_maintainer.pl:

$ ./scripts/get_maintainer.pl -f arch/arm/boot/dts/sun7i-a20-olinuxino-lime2.dts
[...]
Maxime Ripard  (maintainer:ARM/Allwinner sunXi SoC support)
Chen-Yu Tsai  (maintainer:ARM/Allwinner sunXi SoC support)
[...]
linux-arm-ker...@lists.infradead.org (moderated list:ARM PORT)
[...]

Once it is on its way to mainline (i.e. in Maxime's tree would be
sufficient) please let us know and we can add it to the Debian kernel.

Ian.

[0] http://linux-sunxi.org/Mailing_list



Bug#812540: Add ARCH_HISI for Lemaker HiKey support

2016-03-12 Thread Ian Campbell
On Fri, 2016-03-11 at 20:03 -0800, Vagrant Cascadian wrote:
> On 2016-03-10, Martin Michlmayr wrote:
> > * Ian Campbell <i...@debian.org> [2016-01-25 09:57]:
> >> I suppose it will need more than just ARCH_HISI. Are you able to identify
> >> the full set of options (e.g. drivers and such) which are needed to make
> >> useful Lemaker support? If so I'd appreciate it (if not I can probably try
> >> and dig those out myself, but it might take me a while to get around to it).
> 
> Most drivers aren't even available in 4.4.x, and some aren't even in
> 4.5.x yet. From a brief glance, the dts files for
> arch/arm64/boot/dts/hisilicon/hi6220*, don't look to contain support for
> much more than the CPU, memory and serial...
> 
> 
> > This might be a good starting point:
> > https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg1098644.html
> 
> Reading that patch, I'm guessing the following are available and
> possibly needed or desired, though possibly may require device-tree
> patches as well:
> 
> available in 4.4.x:
> 
> CONFIG_ARCH_HISI=y
> CONFIG_POWER_RESET_HISI=y
> CONFIG_HISI_THERMAL=m
> CONFIG_MMC_DW=m
> CONFIG_MMC_DW_K3=m
> CONFIG_I2C_DESIGNWARE_PLATFORM=m
> 
> available in 4.5.x:
> 
> CONFIG_COMMON_RESET_HI6220=m
> CONFIG_PHY_HI6220_USB=m
> 
> linux-next tag next-20160311:
> 
> CONFIG_HI6220_MBOX=m
> CONFIG_REGULATOR_HI655X=m
> 
> 
> Will build a few test kernels and report back...

Thanks, for things other than the ones listed above the ones I'd be
most curious about would be serial and networking (wifi only on this
platform IIRC). Both of those could possibly be supported already via
existing generic drivers, I've no idea...

Ian.



Bug#817016: linux-image-4.3.0-1-amd64: ThinkPad X1 Carbon: Boot stalls at "intel_pstate: HWP enabled"

2016-03-07 Thread Ian Campbell
Control: forwarded -1 https://bugzilla.kernel.org/show_bug.cgi?id=110941

On Mon, 2016-03-07 at 07:46 +0000, Andy Smith wrote:
> Package: src:linux
> Version: 4.3.5-1
> Severity: normal
> 
> Dear Maintainer,
> 
> Booting this kernel (or the debian-installer latest daily) results in a blank
> screen. When removing the quiet option the boot is seen to stall after printing
> "intel_pstate: HWP enabled".
> 
> Here's a screenshot: http://i.imgur.com/cr5i72L.png
> 
> Searching around found me another report from a Yoga 260 owner with the same
> issue. They've got a Skylake i5 whereas I've got a Skylake i7. A suggested
> workaround was to use kernel parameter:
> 
>   intel_pstate=no_hwp
> 
> This allowed boot of the debian-installer and this installed kernel but I
> understand it disables many desirable power efficiency features.

FWIW I also tripped over this on an X1 Carbon.

Looks like it is being investigated at
https://bugzilla.kernel.org/show_bug.cgi?id=110941

Ian.



Re: Booting uncompressed kernel images

2016-02-05 Thread Ian Campbell
On Thu, 2016-02-04 at 15:40 -0200, Tiago Ilieve wrote:
> Sorry for the delay in my response. In the past couple days I was
> confirming with Oracle if my findings (using virt-what, as you
> suggested) were right and, indeed, they are supporting Xen HVM right
> now.

Great!

> 
> So, there's no need for an uncompressed/gzipped kernel anymore and the
> default one boots just fine. Although I'm still curious regarding the
> possibility of booting an uncompressed kernel on
> native/full-virtualization, I guess this does not make sense.

WRT virtualisation, not in general no.

If possible you really want to do the decompression in guest context, to
avoid issues with potentially malicious compressed binaries.

For native there generally isn't much point, but for x86 at least there is
also a bunch of necessary setup and gathering of information (e.g. from the
BIOS in 16 bit mode) which is done in the same preamble as where the
decompressor runs (more or less) and which you would need to replicate
before booting the uncompressed image's entry point -- it's really not
worth the effort in general.

> I'm really thankful for your support and inclination to help us on the
> matter.

No problem!

Ian.



Re: Booting uncompressed kernel images

2016-02-02 Thread Ian Campbell
On Mon, 2016-02-01 at 23:31 -0200, Tiago Ilieve wrote:
> > PS I have also found binwalk [2] useful when examining contents of
> > compressed kernel
> > apt-get install binwalk
> 
> Thanks for the tip - although I got a little bit surprised with so
> many dependencies in what should be a simple command-line utility.
> 
> Here's what I got from "binwalk /boot/vmlinuz-3.16.0-4-amd64":
> 
> DECIMAL       HEXADECIMAL     DESCRIPTION
> --------------------------------------------------------------------------------
> 0             0x0             Microsoft portable executable
> 18356         0x47B4          xz compressed data
> 3108600       0x2F6EF8        xz compressed data
> 
> Not sure about what bytes "0-18355" means. Maybe a false-positive?
> 
> If I run it with "-e", I get two files ("47B4.tar" and "2F6EF8.tar")
> that can't be decompressed with "tar"/"unxz".

What does file(1) say about them? For me I see:

_vmlinuz-3.16.0-4-amd64.extracted/2F6EF8.xz: XZ compressed data
_vmlinuz-3.16.0-4-amd64.extracted/47B4:  ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, BuildID[sha1]=2883400c6927fe339cdd2c321d3d154c472ef418, stripped
_vmlinuz-3.16.0-4-amd64.extracted/47B4.xz:   XZ compressed data

Do any of them match what you get out of extract-vmlinux?

_vmlinuz-3.16.0-4-amd64.extracted/47B4 looks to me to be an ELF file which
I would expect to be bootable as a Xen PV guest, it has the required ELF
notes etc.

Did you see my other replies on debian-kernel yesterday? There are some
questions there which it would be useful to know the answers to.

Ian.



Re: Booting uncompressed kernel images

2016-02-02 Thread Ian Campbell
On Tue, 2016-02-02 at 15:18 -0200, Tiago Ilieve wrote:
> Hi Ian,
> 
> On 2 February 2016 at 09:24, Ian Campbell <i...@debian.org> wrote:
> > Did you see my other replies on debian-kernel yesterday? There are some
> > questions there which it would be useful to know the answers to.
> 
> It turned out that I thought that I was subscribed to
> "debian-kernel@l.d.o", but I wasn't. This is solved now.
> 
> > What does file(1) say about them? For me I see:
> > 
> > _vmlinuz-3.16.0-4-amd64.extracted/2F6EF8.xz: XZ compressed data
> > _vmlinuz-3.16.0-4-amd64.extracted/47B4:  ELF 64-bit LSB executable,
> > x86-64, version 1 (SYSV), statically linked,
> > BuildID[sha1]=2883400c6927fe339cdd2c321d3d154c472ef418, stripped
> > _vmlinuz-3.16.0-4-amd64.extracted/47B4.xz:   XZ compressed data
> 
> Are you able to extract three files from it?

Yes, although I think/suspect that my 47B4 is created by binwalk
decompressing 47B4.xz as a convenience.

>  Here's what I got:
> 
> $ file _vmlinuz-3.16.0-4-amd64.extracted/*
> _vmlinuz-3.16.0-4-amd64.extracted/2F6EF8.tar: XZ compressed data
> _vmlinuz-3.16.0-4-amd64.extracted/47B4.tar:   XZ compressed data
> 
> Only two that I can't extract again.
> 
> > Do any of them match what you get out of extract-vmlinux?
> 
> $ ./extract-vmlinux /boot/vmlinuz-3.16.0-4-amd64 > vmlinux
> $ file vmlinux
> vmlinux: ELF 64-bit LSB executable, x86-64, version 1 (SYSV),
> statically linked,
> BuildID[sha1]=2883400c6927fe339cdd2c321d3d154c472ef418, stripped

This looks like a file which I would expect to be bootable as a Xen PV
guest. Using "readelf -n" should show lots of:

Displaying notes found at file offset 0x00716774 with length 0x01d8:
  Owner Data size   Description
  Xen  0x0006   Unknown note type: (0x0006)

Which would imply this.
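
If it helps, the check can be scripted rather than eyeballed. This is
only a sketch: has_xen_notes is a made-up helper name, and it simply
looks for a note line whose owner is "Xen" in whatever "readelf -n"
prints.

```shell
#!/bin/sh
# Sketch: read "readelf -n" output on stdin and report whether any note
# is owned by Xen. has_xen_notes is a hypothetical helper name.
has_xen_notes() {
    # Note lines start with whitespace followed by the owner name.
    if grep -q '^[[:space:]]*Xen[[:space:]]'; then
        echo yes
    else
        echo no
    fi
}
```

It would be used as "readelf -n vmlinux | has_xen_notes".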

> The SHA1 is the same.
> 
> > _vmlinuz-3.16.0-4-amd64.extracted/47B4 looks to me to be an ELF file which
> > I would expect to be bootable as a Xen PV guest, it has the required ELF
> > notes etc.
> 
> From your previous e-mails, looks like Grub2 can't boot it, opposite
> to a Xen PV guest, right?

It depends ;-).

Grub2 can support many different platforms, including "native PC" and "Xen
PV".
http://wiki.xen.org/wiki/PvGrub2#Background:_Introduction_to_Xen_PV_Bootloaders
has some background which I hope will be useful here (the second
part on actually using grub2 is not really relevant since you are at the
mercy of Oracle Cloud).

"Xen HVM" == "native PC" from a booting point of view, so in a Xen HVM
guest you would use the native grub, in a Xen PV guest it's possible that
you are using the "Xen PV" version of Grub2.

A native version of Grub will not be able to boot the raw ELF file
extracted from the vmlinuz, but it should be able to boot the vmlinuz
itself just fine.

VirtualBox == "native PC" too, I think it unlikely it would be able to
boot a raw ELF file.

> On 1 February 2016 at 08:06, Ian Campbell <i...@debian.org> wrote:
> 
> > Is this booting via pvgrub[1], if not then do you know how? What does the
> > grub.cfg stanza look like?
> 
> This is hard to answer properly. The image has "grub2" (the last
> 2.02~beta2-22+deb8u1 available on Jessie) installed and I can boot it
> on a local Xen installation as well on Oracle Cloud (using the custom
> kernel compressed with gzip).

Do you need the custom kernel on your local install?

Having grub2 installed in a PV guest does not necessarily mean it is being
used, since the first stage bootloader is provided by the host. The package
supporting Xen PV is "grub-xen"

If you are booting PV grub2 then perhaps you just need to add "insmod xzio"
to your grub.cfg?

>  What I tried is the same process
> (extract-vmlinux, generate initrd, update-grub) both on VirtualBox and
> Xen on Oracle Cloud. In VirtualBox, I got the error pasted on my first
> message.

I think you should ignore VirtualBox for the purposes of diagnosing what is
going on here, behaviour on VB tells us next to nothing about behaviour on
Xen.

>  In Oracle Cloud I have no access to the boot logs to confirm
> if it happens to be exactly the same.

You really need to figure out for sure if you are booting in an HVM or PV
guest, it makes a very large difference to what kernel features you
want/need. From there it should become pretty clear what bootloader is in
use.

If you can boot the guest in some way then virt-what ought to tell you with
reasonable certainty what you are running in.

Ian.



Re: Booting uncompressed kernel images

2016-02-01 Thread Ian Campbell
On Mon, 2016-02-01 at 06:03 -0200, Tiago Ilieve wrote:
> Hi,
> 
> I have a scenario[1] where the default Linux kernel compressed with XZ
> (from Jessie and up) cannot be booted. The first thing that I've tried
> was to uncompress it using "extract-linux"[2] and it didn't worked by
> the time, so I decided to rebuild the entire "linux-image-*" package
> changing "CONFIG_KERNEL_XZ=y" to "CONFIG_KERNEL_GZIP=y". This, of
> course, implies that it would be needed to recompile the kernel every
> time a new version of the package is released, which is an overkill
> for a such simple requirement.
> 
> Yesterday, Ben Hutchings itself suggested[3] me to write a hook that
> decompresses the kernel at package installation time, something which
> I find a great idea. The problem is that, again, I couldn't boot a
> machine (tried on VirtualBox and Xen) after uncompressing its
> "/boot/vmlinuz-*" using "extract-linux".

What does file(1) say about the resulting file? It should be a plain ELF
file of some sort.
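
For anyone without file(1) to hand, the same check can be roughed out in a few lines of Python. This is only a sketch, not something from this thread: the function name is made up, and it only looks at well-known magic numbers (the ELF header, the x86 boot protocol's 0xAA55 flag at offset 0x1FE and "HdrS" signature at 0x202, and the gzip and xz stream headers).

```python
def classify_kernel_image(data: bytes) -> str:
    """Guess the container format of a kernel image from its leading bytes.

    Hypothetical helper for illustration; file(1) does this far more thoroughly.
    """
    # A raw vmlinux payload starts with the ELF magic.
    if data[:4] == b"\x7fELF":
        return "elf"
    # x86 boot protocol: boot flag 0xAA55 (little endian) at 0x1FE and the
    # "HdrS" signature at 0x202 identify a bzImage style vmlinuz.
    if (len(data) >= 0x206
            and data[0x1FE:0x200] == b"\x55\xaa"
            and data[0x202:0x206] == b"HdrS"):
        return "bzImage"
    # Bare compressed streams, in case only the payload was cut out.
    if data[:2] == b"\x1f\x8b":
        return "gzip"
    if data[:6] == b"\xfd7zXZ\x00":
        return "xz"
    return "unknown"


def classify_kernel_file(path: str) -> str:
    """Read just enough of the file to classify it."""
    with open(path, "rb") as handle:
        return classify_kernel_image(handle.read(0x206))
```

A result of "elf" is what one would expect for the extracted file, and "bzImage" for the untouched /boot/vmlinuz-*.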

I wouldn't expect such a kernel to be bootable by very much other than Xen
PV. It won't work as native for sure, nor as Xen HVM.

>  I can generate an initrd file
> from this uncompressed image, "update-grub" detects it fine, but if I
> reboot it the following error appears:
> 
> Loading Linux 3.16.0-4-amd64 ...
> error: invalid magic number.
> Loading initial ramdisk ...
> error: you need to load the kernel first.

Is this booting via pvgrub[1], if not then do you know how? What does the
grub.cfg stanza look like?

> - Is "extract-linux" stripping some essential information (the script
> looks for an offset to start the decompression process) from the
> kernel image that is needed to boot it later? If so, is there a way to
> recover and insert it on the uncompressed image?

It should just be the raw ELF file payload from the bzImage. Xen requires
some particular ELF notes to be present, but I very much doubt
extract-linux would be stripping those (since that would involve diving
into the contents of the payload).

Ian.

[1] http://wiki.xen.org/wiki/PvGrub



> 
> Regards,
> Tiago.
> 
> [1]: https://lists.debian.org/debian-cloud/2016/01/msg00052.html
> [2]: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/plain/scripts/extract-vmlinux
> [3]: https://lists.debian.org/debian-cloud/2016/01/msg00060.html
> [4]: http://www.he1ix.org/2015/01/creating-a-xen-domu-on-debian-squeeze-6-0-6/
> [5]: http://noone.org/blog/English/Computer/Debian/Running%20a%20Sid%20DomU%20on%20a%20Squeeze%20Dom0.html
> 



Re: Booting uncompressed kernel images

2016-02-01 Thread Ian Campbell
On Mon, 2016-02-01 at 10:06 +, Ian Campbell wrote:
> On Mon, 2016-02-01 at 06:03 -0200, Tiago Ilieve wrote:
> > Hi,
> > 
> > I have a scenario[1] where the default Linux kernel compressed with XZ
> > (from Jessie and up) cannot be booted. The first thing that I've tried
> > was to uncompress it using "extract-linux"[2] and it didn't worked by
> > the time, so I decided to rebuild the entire "linux-image-*" package
> > changing "CONFIG_KERNEL_XZ=y" to "CONFIG_KERNEL_GZIP=y". This, of
> > course, implies that it would be needed to recompile the kernel every
> > time a new version of the package is released, which is an overkill
> > for a such simple requirement.
> > 
> > Yesterday, Ben Hutchings itself suggested[3] me to write a hook that
> > decompresses the kernel at package installation time, something which
> > I find a great idea. The problem is that, again, I couldn't boot a
> > machine (tried on VirtualBox and Xen) after uncompressing its
> > "/boot/vmlinuz-*" using "extract-linux".
> 
> What does file(1) say about the resulting file? It should be a plain ELF
> file of some sort.
> 
> I wouldn't expect such a kernel to be bootable by very much other than Xen
> PV. It won't work as native for sure, nor as Xen HVM.
> 
> >  I can generate an initrd file
> > from this uncompressed image, "update-grub" detects it fine, but if I
> > reboot it the following error appears:
> > 
> > Loading Linux 3.16.0-4-amd64 ...
> > error: invalid magic number.
> > Loading initial ramdisk ...
> > error: you need to load the kernel first.
> 
> Is this booting via pvgrub[1], if not then do you know how? What does the
> grub.cfg stanza look like?

Hrm, that "invalid magic number" message is from upstream grub2, native
code paths, AFAICT.

Were you trying to boot this kernel natively? Or are the guests on Oracle
HVM ones (in which case the compression shouldn't matter, since the kernel
native boot path will take care of it)?

> 
> > - Is "extract-linux" stripping some essential information (the script
> > looks for an offset to start the decompression process) from the
> > kernel image that is needed to boot it later? If so, is there a way to
> > recover and insert it on the uncompressed image?
> 
> It should just be the raw ELF file payload from the bzImage. Xen requires
> some particular ELF notes to be present, but I very much doubt
> extract-linux would be stripping those (since that would involve diving
> into the contents of the payload)
> 
> Ian.
> 
> [1] http://wiki.xen.org/wiki/PvGrub
> 
> 
> 
> > 
> > Regards,
> > Tiago.
> > 
> > [1]: https://lists.debian.org/debian-cloud/2016/01/msg00052.html
> > [2]: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/plain/scripts/extract-vmlinux
> > [3]: https://lists.debian.org/debian-cloud/2016/01/msg00060.html
> > [4]: http://www.he1ix.org/2015/01/creating-a-xen-domu-on-debian-squeeze-6-0-6/
> > [5]: http://noone.org/blog/English/Computer/Debian/Running%20a%20Sid%20DomU%20on%20a%20Squeeze%20Dom0.html
> > 
> 
> 



Bug#784688: Thousands of "xen:balloon: Cannot add additional memory (-17) messages" despite dom0 ballooning disabled

2016-02-01 Thread Ian Campbell
On Fri, 2016-01-29 at 15:59 +, Andy Smith wrote:
> Hi Ian,
> 
> On Fri, Jan 29, 2016 at 02:57:23PM +0000, Ian Campbell wrote:
> > I spent a bit of time investigating this, but sadly I'm not able to
> > reproduce the basis failure.
> 
> FWIW it was me who reported this with the packages in Debian stable
> (linux-image-3.16.0-4-amd64 3.16.7-ckt20-1+deb8u3,
> xen-hypervisor-4.4-amd64 4.4.1-9+deb8u3) when using
> "dom0_mem=1024M,max:1024M" on the hypervisor command line.
> 
> I must admit I found this bug when searching for the error message,
> and have only been seeing it printed a couple of times at guest
> shutdown, not thousands of times.
> 
> So if having it printed a couple of times isn't considered a bug,
> I'm sorry if I've led you astray here.

No worries, thanks for letting me know.

>  Might be worth finding a way
> to remove it anyway though; anyone having a problem is going to keep
> searching for it thinking it is relevant to their case.

Indeed. That might be tricky to arrange 100% reliably due to the way these
things work out in practice wrt ballooning, maxes and dynamic changes due
to use by backends. It would be an upstream thing in any case.

> At the moment I am avoiding seeing the message at all by running with
> only "dom0_mem=1024M" on the command line. What's the disadvantage
> of not having the "max:1024M" there?

I'm not 100% sure. It looks like it causes no max to be set (LONG_MAX is
the default), which I suppose would allow dom0 (from Xen's PoV, the kernel
might have its own limitations) to balloon to more than 1024M if it tried
to (which would explain why leaving it out works around this issue).
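
To make the difference between the two command lines concrete, here is a small Python sketch that splits a dom0_mem= value into its parts. It is an illustration only and not Xen's own parser: the function names are invented, it insists on an explicit unit suffix (an assumption made here to avoid guessing a default unit), and it ignores forms such as negative amounts.

```python
import re

# Binary multipliers for the suffixes accepted by this sketch.
_UNITS = {"B": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30, "T": 1 << 40}


def parse_size(text):
    """Convert e.g. '1024M' to bytes. A suffix is required in this sketch."""
    match = re.fullmatch(r"(\d+)([BKMGT])", text.strip(), re.IGNORECASE)
    if match is None:
        raise ValueError("unsupported size (explicit suffix needed): %r" % text)
    return int(match.group(1)) * _UNITS[match.group(2).upper()]


def parse_dom0_mem(value):
    """Split 'amount[,min:size][,max:size]' into a dict of byte counts.

    Keys that were not given stay None, which is the interesting case for
    'max': nothing on the command line caps dom0 in that situation.
    """
    result = {"amount": None, "min": None, "max": None}
    for part in value.split(","):
        key, sep, size = part.partition(":")
        if sep:
            if key not in ("min", "max"):
                raise ValueError("unknown dom0_mem key: %r" % key)
            result[key] = parse_size(size)
        else:
            result["amount"] = parse_size(part)
    return result


if __name__ == "__main__":
    print(parse_dom0_mem("1024M,max:1024M"))
    print(parse_dom0_mem("1024M"))
```

With "1024M,max:1024M" both the amount and the max come out as 1024 MiB, while plain "1024M" leaves the max unset.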

Ian.



Bug#784688: Thousands of "xen:balloon: Cannot add additional memory (-17) messages" despite dom0 ballooning disabled

2016-01-29 Thread Ian Campbell
On Wed, 2016-01-27 at 10:57 +, Ian Campbell wrote:
> 
> I'm still unable to spot what might have changed between
> 3.16.7-ckt20-1+deb8u2 and 4.3.3-5 though to explain it going away, which
> I'd still quite like to get to the bottom of in order to fix in Jessie.

I spent a bit of time investigating this, but sadly I'm not able to
reproduce the basic failure.

I've tried the combinations below and all are OK. Some of them produce 1 or
2 of the "-17" messages (I should have noted which but didn't, I think it
was most) but in no case did I see thousands of them.
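
For telling "1 or 2" apart from "thousands" without eyeballing dmesg, something like the following Python sketch would do. It is not part of the original testing: the helper names are invented, and the threshold of two messages per guest start/stop is an assumption taken from the rough expectation discussed in this bug.

```python
import re

# Matches the message from the bug title and captures the error code.
BALLOON_RE = re.compile(r"xen:balloon: Cannot add additional memory \((-?\d+)\)")


def count_balloon_errors(log_text, errno=-17):
    """Count balloon failures with the given error code in a chunk of log."""
    return sum(1 for match in BALLOON_RE.finditer(log_text)
               if int(match.group(1)) == errno)


def is_excessive(count, guest_events, per_event=2):
    """True if there are more messages than per_event per guest start/stop.

    per_event=2 is an assumed allowance, not a documented limit.
    """
    return count > guest_events * per_event


if __name__ == "__main__":
    sample = (
        "[   12.473778] xen:balloon: Cannot add additional memory (-17)\n"
        "[   21.673298] xen:balloon: Cannot add additional memory (-17)\n"
    )
    seen = count_balloon_errors(sample)
    print(seen, is_excessive(seen, guest_events=1))
```

The sample lines are the two quoted elsewhere in this bug log; in practice the text would come from dmesg or the kernel log file.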

Ian.

for i in $(seq 1 15) ; do xl reboot debian.guest.osstest ; sleep 10s; done

dom0_mem=2048M,max:2056M
L: 3.16.7-ckt20-1+deb8u3
X: 4.6.0-1+nmu1

=> OK

dom0_mem=2048M,max:2048M
L: 3.16.7-ckt20-1+deb8u3
X: 4.6.0-1+nmu1

=> OK

dom0_mem=2048M,max:2048M
L: 3.16.7-ckt9-3~deb8u1
X: 4.6.0-1+nmu1

=> OK

-

for i in $(seq 1 15) ; do xl create /etc/xen/debian.guest.osstest.cfg ; sleep 10s ;  xl shutdown -w debian.guest.osstest; sleep 5s; done

dom0_mem=2048M,max:2048M
L: 3.16.7-ckt9-3~deb8u1
X: 4.6.0-1+nmu1

=> OK

dom0_mem=2048M,max:2048M
L: 3.16.7-ckt9-3~deb8u1
X: 4.4.1-9+deb8u3

=> OK



Bug#784688: Thousands of "xen:balloon: Cannot add additional memory (-17) messages" despite dom0 ballooning disabled

2016-01-27 Thread Ian Campbell
On Tue, 2016-01-26 at 19:46 +0200, KSB wrote:
> > This is actually useful, because it shows that the issue occurs even with
> > Xen 4.6, which I think rules out a Xen side issue (otherwise we'd have had
> > lots more reports from 4.4 through to 4.6) and points to a kernel side
> > issue somewhere.
> > 
> > > But I checked logs more thoroughly and found it even on more recent
> > > kernels:
> > > 1) Lot of messages on 3.14-2-amd64 with xen-4.6, 13 domU's.
> > 
> > Just to be clear, "Lots" here means "hundreds or thousands"? I think it is
> > expected to see one or two around the time a VM is started or stopped, so
> > with 13 domUs a couple of dozen messages wouldn't seem out of line to me.
> > 
> pkg 3.14.15-2
> ~1600 from last dmesg cleanup which was 23h ago, but all of them 
> distributed in last 15h
> 
> 
> > > 2) 4.3.0-1-amd64 xen-4.6, only two messages shortly after boot, only 1
> > > domU running:
> > > [   12.473778] xen:balloon: Cannot add additional memory (-17)
> > > [   21.673298] xen:balloon: Cannot add additional memory (-17)
> > > uptime 17 days.
> > > 
> > > Previous on same machine was 4.2.0-1-amd64 with more (-17)'s
> > 
> > Was it running xen-4.6 when it was running 4.2.0 or was that also older?
> 
> 4.3.3-5 xen-4.6.0 and previous 4.2.6-1 xen-4.4.1

Thanks. And just to clarify, with Linux 4.2.6-1 xen-4.4.1 you were or were
not seeing this issue?

To summarise what I can tell from this bug log the following combinations
are/are not prone to this issue:

xen-??? xen-4.1 xen-4.4.1 4.4.1-9+deb8u3 xen-4.6.0
3.14.15-2 Y[1]

3.16.7-ckt7-1 N[1]
3.16.7-ckt9-3~deb8u1 Y[2]
3.16.7-ckt20-1+deb8u2 Y[3]

4.2.6-1 ?[1]
4.3.3-5 N N[1]
4.3.3-7 N[1]

[1] KSB
[2] ML (original report, Xen version unknown)
[3] AS (with dom0_mem=1024M,max:1024M, but not dom0_mem=1024M)

The N for xen-4.1 + linux-3.16.7-ckt7-1 (KSB's #4) seems anomalous. Perhaps
that version is susceptible but did not exhibit it during the span of the logs.

The ? for xen-4.4.1 + linux-4.2.6-1 is the "just to clarify" above.

In any case it does appear to correlate with the Linux version and not the Xen 
version, and it does appear to be fixed in 4.3.3-5, or possibly even 4.2.6-1.

I'm still unable to spot what might have changed between 3.16.7-ckt20-1+deb8u2
and 4.3.3-5 though to explain it going away, which I'd still quite like to get
to the bottom of in order to fix in Jessie.

Thanks,


Ian.



Bug#784688: Thousands of "xen:balloon: Cannot add additional memory (-17) messages" despite dom0 ballooning disabled

2016-01-26 Thread Ian Campbell
On Mon, 2016-01-25 at 20:36 +0200, KSB wrote:
> > Do you have a package version which you know to be good? How confident are
> > you that it is ok (sometimes the problem is intermittent)?
> > 
> > Lastly, is there any chance you upgraded the Xen packages at the same time?
> > I'm starting to wonder if maybe this is not a kernel issue.
> > 
> Sorry, but there is chance, sadly.

This is actually useful, because it shows that the issue occurs even with
Xen 4.6, which I think rules out a Xen side issue (otherwise we'd have had
lots more reports from 4.4 through to 4.6) and points to a kernel side
issue.

> But I checked logs more thoroughly and found it even on more recent
> kernels:
> 1) Lot of messages on 3.14-2-amd64 with xen-4.6, 13 domU's.

Just to be clear, "Lots" here means "hundreds or thousands"? I think it is
expected to see one or two around the time a VM is started or stopped, so
with 13 domUs a couple of dozen messages wouldn't seem out of line to me.

> 2) 4.3.0-1-amd64 xen-4.6, only two messages shortly after boot, only 1 
> domU running:
> [   12.473778] xen:balloon: Cannot add additional memory (-17)
> [   21.673298] xen:balloon: Cannot add additional memory (-17)
> uptime 17 days.
> 
> Previous on same machine was 4.2.0-1-amd64 with more (-17)'s

Was it running xen-4.6 when it was running 4.2.0 or was that also older?

Also 4.2.0-1-amd64 here (and all the other numbers you gave) is the ABI,
not the package version. The package version is available either from dpkg
or from /proc/version:

Linux version 4.1.0-2-amd64 (debian-kernel@lists.debian.org) (gcc version 4.9.3
(Debian 4.9.3-3) ) #1 SMP Debian 4.1.6-1 (2015-08-23)

  ABI:     4.1.0-2-amd64
  VERSION: 4.1.6-1
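
If it helps, pulling the two fields apart can be scripted. The Python below is a rough sketch rather than anything official: the function name is invented and the regular expression simply assumes the Debian style of banner shown above (ABI straight after "Linux version", package version after the last "Debian" following the "#N" build counter).

```python
import re

# ABI name follows "Linux version"; the package version follows the
# "Debian" that comes after the "#N" build counter.
_VERSION_RE = re.compile(r"^Linux version (\S+) .*#\d+ .*\bDebian (\S+) \(")


def split_proc_version(banner):
    """Return (abi, package_version) from a Debian /proc/version banner."""
    match = _VERSION_RE.match(banner.strip())
    if match is None:
        raise ValueError("not a recognised Debian kernel banner")
    return match.group(1), match.group(2)


if __name__ == "__main__":
    example = (
        "Linux version 4.1.0-2-amd64 (debian-kernel@lists.debian.org) "
        "(gcc version 4.9.3 (Debian 4.9.3-3) ) #1 SMP Debian 4.1.6-1 (2015-08-23)"
    )
    abi, package_version = split_proc_version(example)
    print("ABI:", abi)
    print("package version:", package_version)
```

On a live system the banner would be read from /proc/version instead of the example string.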

> 3) 4.3.0-1-amd64, one month, several reboots, average 4 domU's, and no 
> messages

Any idea which Xen?

> 4) 3.16.0-4-amd64, xen-4.1, 22 domU's, uptime 188 days, in last month I 
> see only
> Jan 7 14:12:08
> Jan 7 14:12:08
> Jan 7 14:12:08
> Jan 7 14:12:08
> Jan 7 14:27:47
> Jan 7 14:27:47
> Jan 7 14:27:47
> Jan 7 14:27:48
> and this is roughly the time last machine was created(started).
> 
> 
> 



Re: Testing versatile kernel on Raspberry Pi?

2016-01-26 Thread Ian Campbell
On Mon, 2016-01-25 at 17:59 +0100, Sebastian Reichel wrote:
> Hi,
> 
> On Mon, Jan 25, 2016 at 05:39:40PM +0100, Diederik de Haas wrote:
> > On Monday 25 January 2016 14:58:27 Ian Campbell wrote:
> > > > As I got the impression that support for the RPi was now present in
> > > > upstream and (therefor) the Debian kernels,
> > > 
> > > The "therefor" won't happen automatically, someone will need to file a
> > > wishlist bug asking for the relevant options to be enabled in the Debian
> > > kernel configuration.
> > > 
> > > For the RPi's with the newer CPU cores it makes clear sense to do that in
> > > the armhf/armmp kernel flavour (since it is the "multiplatform" flavour,
> > > and the only one we want to support).
> > 
> > I'll give it a try, but first have to learn about the armmp stuff.
> > 
> > Here is the default kernel configuration for the RPi 2: 
> > > https://github.com/raspberrypi/linux/blob/rpi-4.4.y/arch/arm/configs/bcm2709_defconfig
> > (which does not seem part of 
> > https://sources.debian.net/src/linux/4.4-1~exp1/arch/arm/configs/)
> 
> The kernel from the Raspberry Pi foundation and the mainline kernel
> have different config options, so that won't be of much use. Check
> mainline's arch/arm/configs/bcm2835_defconfig instead.

Also note that the Debian kernels don't use anything from arch/*/configs.

The configuration is built from snippets (global, per arch, per flavour,
etc) under debian/config. The armhf/armmp specific bits are
in debian/config/armhf/config.armmp and it is this file that I would expect
to need changes to support a new armhf platform.
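
As a mental model only, the layering can be pictured with the Python sketch below. It is not the real packaging tooling: the function names and the snippet contents are made up for illustration, and it assumes that a more specific snippet simply overrides a more general one.

```python
def parse_kconfig(text):
    """Parse kconfig style lines into {option: value}; None means 'not set'."""
    options = {}
    suffix = " is not set"
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            options[line[2:-len(suffix)]] = None
        elif line.startswith("CONFIG_") and "=" in line:
            name, _, value = line.partition("=")
            options[name] = value
    return options


def merge_snippets(*snippets):
    """Merge snippets in order, later (more specific) ones overriding earlier."""
    merged = {}
    for snippet in snippets:
        merged.update(parse_kconfig(snippet))
    return merged


if __name__ == "__main__":
    # Invented contents standing in for a global, an arch and a flavour snippet.
    global_snippet = "CONFIG_MODULES=y\n# CONFIG_SOC_DRA7XX is not set\n"
    arch_snippet = "CONFIG_ARM_GIC=y\n"
    flavour_snippet = "CONFIG_SOC_DRA7XX=y\n"
    for name, value in sorted(merge_snippets(global_snippet, arch_snippet,
                                             flavour_snippet).items()):
        print(name, "=", value)
```

The point is just that enabling a platform usually means adding lines to the most specific snippet that applies, here the armmp one.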

When making changes it is useful to use kconfigeditor from
http://anonscm.debian.org/cgit/kernel/kernel-team.git/ to re-normalise the
changes (just run kernel-team.git/utils/kconfigeditor2/process.py ).

It's also likely that things will need adding under
debian/installer/armhf/modules/ in order for the platform to work in the
installer environment.

There's more info on the Debian kernel packaging in the Debian Kernel
Handbook at
https://www.debian.org/doc/manuals/debian-kernel-handbook/index.html

Ian.



Re: Testing versatile kernel on Raspberry Pi?

2016-01-26 Thread Ian Campbell
On Mon, 2016-01-25 at 17:59 +0100, Sebastian Reichel wrote:
> Hi,
> 
> On Mon, Jan 25, 2016 at 05:39:40PM +0100, Diederik de Haas wrote:
> > On Monday 25 January 2016 14:58:27 Ian Campbell wrote:
> > > > As I got the impression that support for the RPi was now present in
> > > > upstream and (therefor) the Debian kernels,
> > > 
> > > The "therefor" won't happen automatically, someone will need to file a
> > > wishlist bug asking for the relevant options to be enabled in the Debian
> > > kernel configuration.
> > > 
> > > For the RPi's with the newer CPU cores it makes clear sense to do that in
> > > the armhf/armmp kernel flavour (since it is the "multiplatform" flavour,
> > > and the only one we want to support).
> > 
> > I'll give it a try, but first have to learn about the armmp stuff.
> > 
> > Here is the default kernel configuration for the RPi 2:
> > https://github.com/raspberrypi/linux/blob/rpi-4.4.y/arch/arm/configs/bcm2709_defconfig
> > (which does not seem part of
> > https://sources.debian.net/src/linux/4.4-1~exp1/arch/arm/configs/)
> 
> The kernel from the Raspberry Pi foundation and the mainline kernel
> have different config options, so that won't be of much use. Check
> mainline's arch/arm/configs/bcm2835_defconfig instead.
> 
> > > For the RPi's with the older cores it wouldn't seem to make much sense to
> > > enable it in the armel/versatile flavour (because I can't see why it fits
> > > there, despite folks apparently adding it there), but equally I'm not sure
> > > we want to be adding new flavours to armel (which is essentially on the
> > > downward slope of the support lifecycle at this stage). Perhaps others
> > > around here feel differently though.
> > 
> > Bummer for me as it won't solve the issue I hoped it would solve, but
> > I'll try other avenues for that. But still, thanks for clarifying :-)
> 
> Another option would be adding RPi1 support to the armhf armmp
> kernel. I guess the benefits of ARMv7 vs ARMv6 is neglectable for
> the kernel (no floating point operations and Thumb2 should stay
> disabled because of errata 430973 on some older Cortex-A8s [i.e.
> on N900])

The Debian armhf kernels do not have support for ARMv6 enabled. AIUI
moving to a v6+v7 capable kernel, other than muddying the waters WRT
what "armhf" means, would also mean falling back to ARMv6 features only,
missing out on ARMv7 additions like improvements to SMP barriers and
atomic operations. I don't think we'd want to do that.

IMHO RPi1 is best supported by Raspbian, or if people really want it in
Debian then by armel, but not by an armhf+armel hybrid which involves
supporting v6 in some way on armhf.

Ian.

>  and one could use armel userland + armmp kernel from armhf.
> 
> -- Sebastian



Bug#784688: Thousands of "xen:balloon: Cannot add additional memory (-17) messages" despite dom0 ballooning disabled

2016-01-26 Thread Ian Campbell
On Mon, 2016-01-25 at 20:36 +0200, KSB wrote:
> > Do you have a package version which you know to be good? How confident are
> > you that it is ok (sometimes the problem is intermittent)?
> > 
> > Lastly, is there any chance you upgraded the Xen packages at the same time?
> > I'm starting to wonder if maybe this is not a kernel issue.
> > 
> Sorry, but there is chance, sadly.

This is actually useful, because it shows that the issue occurs even with
Xen 4.6, which I think rules out a Xen side issue (otherwise we'd have had
lots more reports from 4.4 through to 4.6) and points to a kernel side
issue somewhere.

> But I checked logs more thoroughly and found it even on more recent
> kernels:
> 1) Lot of messages on 3.14-2-amd64 with xen-4.6, 13 domU's.

Just to be clear, "Lots" here means "hundreds or thousands"? I think it is
expected to see one or two around the time a VM is started or stopped, so
with 13 domUs a couple of dozen messages wouldn't seem out of line to me.

> 2) 4.3.0-1-amd64 xen-4.6, only two messages shortly after boot, only 1 
> domU running:
> [   12.473778] xen:balloon: Cannot add additional memory (-17)
> [   21.673298] xen:balloon: Cannot add additional memory (-17)
> uptime 17 days.
> 
> Previous on same machine was 4.2.0-1-amd64 with more (-17)'s

Was it running xen-4.6 when it was running 4.2.0 or was that also older?

Also 4.2.0-1-amd64 is the ABI, not the package version. The package
version is available either from dpkg or from /proc/version:

Linux version 4.1.0-2-amd64 (debian-kernel@lists.debian.org) (gcc version 4.9.3
(Debian 4.9.3-3) ) #1 SMP Debian 4.1.6-1 (2015-08-23)

  ABI:     4.1.0-2-amd64
  VERSION: 4.1.6-1

> 3) 4.3.0-1-amd64, one month, several reboots, average 4 domU's, and no 
> messages

Any idea which Xen?

> 4) 3.16.0-4-amd64, xen-4.1, 22 domU's, uptime 188 days, in last month I 
> see only
> Jan 7 14:12:08
> Jan 7 14:12:08
> Jan 7 14:12:08
> Jan 7 14:12:08
> Jan 7 14:27:47
> Jan 7 14:27:47
> Jan 7 14:27:47
> Jan 7 14:27:48
> and this is roughly the time last machine was created(started).
> 
> 
> 



Re: Testing versatile kernel on Raspberry Pi?

2016-01-25 Thread Ian Campbell
On Mon, 2016-01-25 at 15:46 +0100, Diederik de Haas wrote:
> > I wasn't aware that any of the RPi support (for any model) had gone
> > upstream.
> 
> It has taken a while, but it seems that major parts are now upstream-ed. 
> See the changelog mentioned earlier.

Great!

[...]
> As I got the impression that support for the RPi was now present in
> upstream and (therefor) the Debian kernels,

The "therefor" won't happen automatically, someone will need to file a
wishlist bug asking for the relevant options to be enabled in the Debian
kernel configuration.

For the RPi's with the newer CPU cores it makes clear sense to do that in
the armhf/armmp kernel flavour (since it is the "multiplatform" flavour,
and the only one we want to support).

For the RPi's with the older cores it wouldn't seem to make much sense to
enable it in the armel/versatile flavour (because I can't see why it fits
there, despite folks apparently adding it there), but equally I'm not sure
we want to be adding new flavours to armel (which is essentially on the
downward slope of the support lifecycle at this stage). Perhaps others
around here feel differently though.

Probably those two cases ought to be separate wishlist bugs for armhf vs
armel.

>  I wanted to offer my help in testing it, if needed.

Once someone does the work to enable them then testing would be valuable
(of course). Perhaps you can take on the former?

Ian.



Bug#784688: Thousands of "xen:balloon: Cannot add additional memory (-17) messages" despite dom0 ballooning disabled

2016-01-25 Thread Ian Campbell
On Fri, 2016-01-22 at 21:38 +0200, KSB wrote:
> Seen this behavior on earlier kernels (i.e. 3.14-2-amd64 pkg 3.14.15-2.) 
> and seems to be gone at least in 4.3

That's useful info thanks, I've been unable to pinpoint a culprit for this
for ages now.

Do you have a package version which you know to be good? How confident are
you that it is ok (sometimes the problem is intermittent)?

Lastly, is there any chance you upgraded the Xen packages at the same time?
I'm starting to wonder if maybe this is not a kernel issue.

Ian.



Re: Testing versatile kernel on Raspberry Pi?

2016-01-25 Thread Ian Campbell
On Sat, 2016-01-16 at 03:22 +0100, Diederik de Haas wrote:
> Hi!
> 
> I have a Raspberry Pi 1B, 1B+ and 2B and I'd love to test Debian's 4.4 kernel 
> for it. At least it is/was my understanding that the versatile kernel is 
> meant 
> for the Raspberry Pi. 

The versatile kernel is meant for ARM "Versatile" development boards, and it
also happens to be a reasonable platform emulated by QEMU.

It's not "versatile" in the sense of "adaptable".

I wasn't aware that any of the RPi support (for any model) had gone
upstream.

In any case for the RPi variants which use the older (non-armhf) processor
you are most likely better off with the Raspbian derived distro than Debian
armel.

For the variant with the newer CPU core which is armhf compatible we should
consider enabling support in the armhf kernels, for which support being in
the mainline kernel is a prerequisite.

Ian.



Re: Testing versatile kernel on Raspberry Pi?

2016-01-25 Thread Ian Campbell
On Mon, 2016-01-25 at 14:56 +, Ben Hutchings wrote:
> On Mon, 2016-01-25 at 15:46 +0100, Diederik de Haas wrote:
> > Thanks for your response :-)
> > 
> > On Monday 25 January 2016 13:23:20 Ian Campbell wrote:
> > > > I have a Raspberry Pi 1B, 1B+ and 2B and I'd love to test Debian's 4.4
> > > > kernel  for it. At least it is/was my understanding that the versatile
> > > > kernel is meant for the Raspberry Pi.
> > > 
> > > The versatile kernel is meant for ARM "Versatile" development boards,
> > > and it also happens to be a reasonable platform emulated by QEMU.
> > > 
> > > It's not "versatile" in the sense of "adaptable".
> > 
> > I know. A slightly adapted version of the versatile kernel is used to do
> > Pi simulation in QEMU and I've (also) created a repo for it:
> > https://github.com/diederikdehaas/raspbian-kernel (default branch is
> > kernel-3.18.x-qemu).
> > 
> > If you look at the changelog on linux (4.4~rc8-1~exp1) [1] you see the
> > Raspberry Pi 2 explicitly mentioned and also references to BCM2836 (=Pi 2),
> > BCM2835 (=Pi 1) and vc4 which stands for VideoCore4 which is the graphics
> > chip for both the Pi 1 and 2.
> > Further reports on /. and phoronix [2] suggested that (full?) support for
> > the Pi 1 and 2 was added to the upstream kernel and the changelog hinted at
> > that as well. (A more recent report on /. [3] indicates that kernel 4.5 is
> > more likely and it could be that it is primarily for the Pi 2.)
> [...]
> 
> The armmp kernel flavour should now support the BCM2836 and the Pi 2,

I missed this going in, thanks!

> but *not* the BCM2835.  Also, Debian's armhf port is built for ARMv7
> whereas the BCM2835 implements ARMv6.  Most of the peripherals are the
> same between these two chips, so the driver names include 'bcm2835'.
> 
> I haven't yet had confirmation that it actually does work.

Ian.



Bug#812540: Add ARCH_HISI for Lemaker HiKey support

2016-01-25 Thread Ian Campbell
On Sun, 2016-01-24 at 21:21 +0100, Kilian Krause wrote:
> Package: linux
> Version: 4.3.3-7
> Severity: wishlist
> Tags: d-i
> 
> Dear kernel maintainers,
> 
> while trying to get a d-i booted on a Lemaker HiKey, tbm pointed out
> that ARCH_HISI is not (yet) activated on linux.
> 
> Please enable it so we can add HiKey support to Debian.

I suppose it will need more than just ARCH_HISI. Are you able to identify
the full set of options (e.g. drivers and such) which are needed for useful
Lemaker HiKey support? If so I'd appreciate it (if not I can probably try
and dig those out myself, but it might take me a while to get around to it).

Lemaker HiKey is the 96boards thing, right? Good to finally see that
hitting upstream. Was it properly supported in 4.3.x or should this change
be made in experimental (currently 4.4.x)?

Thanks,
Ian.



Re: Xen XL blocked on linux-image-4.3.0.1-amd64 4.3.3-5

2016-01-25 Thread Ian Campbell
On Mon, 2016-01-25 at 10:53 +0100, Torben Schou Jensen wrote:
> Something with Xen is unstable since upgrade to kernel 4.3.0.1.

This is fixed in 4.3.3-7 I believe as bug #810472.

Ian.



Bug#812386: Please enable ARCH_QCOM on arm64

2016-01-24 Thread Ian Campbell
On Sat, 2016-01-23 at 12:00 -0800, Martin Michlmayr wrote:

> Good point.  Please enable these for generic QCOM support:

Done.

> I think this should be good enough for now.  I'll open a new bug once
> I have hardware if needed.

Ack.



Bug#812386: Please enable ARCH_QCOM on arm64

2016-01-23 Thread Ian Campbell
On Fri, 2016-01-22 at 20:57 -0800, Martin Michlmayr wrote:
> Package: linux
> Version: 4.4-1~exp1
> Severity: wishlist
> 
> Please enable ARCH_QCOM on arm64.  I believe the following options
> should be enabled:

Done in git. 

I suppose we will also want some modules added to the udebs? Based on
the diff to the .config I added
nic-modules:
+stmmac
+stmmac-platform
+dwmac-generic
+dwmac-ipq806x
sata-modules:
+phy-qcom-apq8064-sata
+phy-qcom-ipq806x-sata

I suspect there will be others (esp. phy drivers) which are needed in
practice. Will you give this a go and report back?

Also looking at the diff of the resulting config, I found a few new
options available which might be of interest on this platform:
+# CONFIG_KS8842 is not set
+# CONFIG_MFD_QCOM_RPM is not set
+CONFIG_REGULATOR=y => makes a lot of CONFIG_REGULATOR_* avail, at
least one of which might be relevant?

There was also a bunch of video and DRM stuff.

Ian.



Re: [PATCH 0/8] debian/linux: extended ARM64 support

2016-01-23 Thread Ian Campbell
On Tue, 2016-01-05 at 22:14 -0200, Ricardo Salveti wrote:
> This patch set extend the ARM64 support by including support for the AMD
> Overdrive (development platform using the AMD Seattle SoC), making it
> supported by d-i and including a basic set of config options that are now
> supported by upstream (common configs that are also enabled for x86).
> 
> Tested with AMD Overdrive B0.
> 
> Patches against the master branch.

All applied (will be 4.4-1~exp2 when uploaded). Thanks for the patches
and sorry for the delay applying.



Bug#810820: linux-image-4.3.0-1-amd64: XEN fails after 7 domU's are started with linux-image-4.3.0-1-amd64 (version 4.3.3-5)

2016-01-20 Thread Ian Campbell
Control: tag -1 +moreinfo

On Tue, 2016-01-12 at 17:00 +0200, Kaspars Bogdanovs wrote:
> 
> When booting XEN system (both 4.6 and 4.3) with kernel
> (linux-image-4.3.0-1-amd64, pkg version 4.3.3-5) when started 6 to 7 small
> domU's, error messages are thrown:

I think there is a reasonable chance that this is the same issue as
810472[0], which was fixed in kernel package version 4.3.3-6.

Please could you give that one a try?

If it doesn't help it would also be interesting to know if the issue
persists with the kernel from experimental (currently 4.4~rc8-1~exp1).

Ian.

[0] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=810472



Bug#809476: Linux 4.4-rc6 fails to boot on QNAP TS-109

2016-01-01 Thread Ian Campbell
On Thu, 2015-12-31 at 08:20 -0800, Martin Michlmayr wrote:
> * Martin Michlmayr  [2015-12-30 16:09]:
> > I guess the kernel is uncompressed and overwrites part of the ramdisk
> > located at 0x80.  I don't really get this part because
> > arch/arm/boot/Image is only 6.1 MB (but vmlinux is around 9 MB, even
> > on the kernel that works).
> 
> Actually, now that I think about it, this makes perfect sense since
> we're loading a 2 MB image to 0x8000 which is then presumably
> decompressed right after the image leading to another 6 MB.  So around
> 8 MB.  If it's too large, it would overwrite the ramdisk at 0x80.

FWIW, makes sense to me.
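
The arithmetic can be spelled out in a short Python sketch. Treat it as a back-of-the-envelope check under stated assumptions: the ramdisk address is shown only as "0x80" in the message above, so the 0x800000 (8 MiB) used here is a guess, the sizes are the rounded 2 MB and 6.1 MB figures from this thread taken as MiB, and the function names are invented.

```python
MIB = 1 << 20


def decompressed_end(load_addr, compressed_size, decompressed_size):
    """End address if the kernel is unpacked directly after the loaded image."""
    return load_addr + compressed_size + decompressed_size


def overlap_with_ramdisk(load_addr, compressed_size, decompressed_size,
                         ramdisk_addr):
    """Bytes by which the unpacked kernel runs into the ramdisk (0 if none)."""
    end = decompressed_end(load_addr, compressed_size, decompressed_size)
    return max(0, end - ramdisk_addr)


LOAD_ADDR = 0x8000        # load address quoted in the thread
RAMDISK_ADDR = 0x800000   # ASSUMPTION: the quoted address appears truncated

if __name__ == "__main__":
    end = decompressed_end(LOAD_ADDR, 2 * MIB, int(6.1 * MIB))
    overlap = overlap_with_ramdisk(LOAD_ADDR, 2 * MIB, int(6.1 * MIB),
                                   RAMDISK_ADDR)
    print("unpacked kernel ends at %#x" % end)
    print("overlap with ramdisk: %d bytes" % overlap)
```

With those numbers the unpacked kernel ends a little above the 8 MiB mark, which is consistent with the explanation above; a slightly smaller Image would fit.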

Ian.



Bug#809083: linux: armhf/armmp: Please enable support for Rockchip platforms

2016-01-01 Thread Ian Campbell
On Sat, 2015-12-26 at 18:26 -0800, Vagrant Cascadian wrote:
> +#drivers/crypto/Kconfig:
> +CONFIG_CRYPTO_DEV_ROCKCHIP=m

This one didn't seem to exist anywhere in the mainline tree.

The rest looks good to me, I intend to apply once I get a chance to
build test.

Ian.



Bug#807624: armhf: Please enable support for DRA7XX systems (Beagle-x15)

2016-01-01 Thread Ian Campbell
On Tue, 2015-12-15 at 16:04 -0800, Vagrant Cascadian wrote:

> Would you consider enabling it even though there are some notable
> features not yet working?

Seems like there is a sufficiently useful set of stuff which is
working, so I think we should.

The final change to the generated .config with your one liner is below.
I think we should probably enable CONFIG_DRA752_THERMAL
and CONFIG_VIDEO_TI_VPE? The former seems obvious so I'll include it; I'm
not so sure about the other option, so I will leave it for now.

Ian.

@@ -465,7 +465,7 @@
 CONFIG_SOC_OMAP5=y
 CONFIG_SOC_AM33XX=y
 # CONFIG_SOC_AM43XX is not set
-# CONFIG_SOC_DRA7XX is not set
+CONFIG_SOC_DRA7XX=y
 CONFIG_ARCH_OMAP2PLUS=y
 CONFIG_OMAP_INTERCONNECT_BARRIER=y
 
@@ -3695,6 +3695,7 @@
 # CONFIG_OMAP3_THERMAL is not set
 CONFIG_OMAP4_THERMAL=y
 CONFIG_OMAP5_THERMAL=y
+# CONFIG_DRA752_THERMAL is not set
 
 #
 # Samsung thermal drivers
@@ -4233,6 +4234,7 @@
 # CONFIG_VIDEO_SAMSUNG_S5P_MFC is not set
 # CONFIG_VIDEO_SAMSUNG_EXYNOS_GSC is not set
 # CONFIG_VIDEO_SH_VEU is not set
+# CONFIG_VIDEO_TI_VPE is not set
 CONFIG_V4L_TEST_DRIVERS=y
 CONFIG_VIDEO_VIVID=m
 CONFIG_VIDEO_VIVID_MAX_DEVS=64
@@ -6405,6 +6407,7 @@
 CONFIG_ARM_GIC=y
 CONFIG_ARM_GIC_V3=y
 CONFIG_OMAP_IRQCHIP=y
+CONFIG_IRQ_CROSSBAR=y
 # CONFIG_IPACK_BUS is not set
 CONFIG_ARCH_HAS_RESET_CONTROLLER=y
 CONFIG_RESET_CONTROLLER=y



Bug#797881: QNAP TS-219P II: qcontrol no longer works after upgrading to linux-image-4.1.0-0.bpo.1-kirkwood

2015-12-10 Thread Ian Campbell
On Thu, 2015-12-10 at 07:36 +, Ian Campbell wrote:
> On Wed, 2015-12-09 at 16:04 -0800, Martin Michlmayr wrote:
> > * Ian Campbell <i...@debian.org> [2015-10-04 14:04]:
> > > I suspect this is due to the device path for the input node changing
> > > from /dev/input/by-path/platform-gpio_keys-event to
> > > /dev/input/by-path/platform-gpio-keys-event. With the version of
> > > qcontrol in Jessie it won't even start if it can't find the device,
> > > even though it can do many of its core things without it (the node is
> > > for button input only).
> > > 
> > > This is fixed by qcontrol 0.5.4-4 in testing (both looking for old and
> > > new names, as well as not treating failure to find either as a
> > > catastrophe), but for Jessie you can just edit the path in
> > > /etc/qcontrol.conf.
> > > 
> > > If that works for you then it might be worth uploading an updated
> > > qcontrol to backports.
> > 
> > Ian, were you going to upload qcontrol to backports or did you want
> > someone else to do it?
> 
> I wasn't, but now I am. I've left a build going and will (hopefully)
> upload after breakfast.

Done (pending a successful dinstall run).

Cheers,
Ian.



Bug#797880: QNAP TS-219P II with linux-image-4.1.0-0.bpo.1-kirkwood "loses" one hard disk from the RAID while flashing initramfs, causing read-only remount and dpkg to fail

2015-12-10 Thread Ian Campbell
On Wed, 2015-12-09 at 16:02 -0800, Martin Michlmayr wrote:
> * Robert Schlabbach <robert.schlabb...@gmx.net> [2015-09-03 12:19]:
> > Package: linux-image-4.1.0-0.bpo.1-kirkwood
> > Version: 4.1.3-1~bpo8+1
> > 
> > Bad things happen when flash-kernel (3.45) flashes the initramfs
> > with this Linux kernel on my QNAP TS-219P II:
> 
> Ian Campbell added a workaround to flash-kernel 3.52 for this kernel
> issue.
> 
> Can you try if 3.52 works for you?  If so, I guess it makes sense to
> upload 3.52 to backports.

I've just uploaded 3.52~bpo8-1 (pending a successful dinstall run).

Cheers,
Ian.



Bug#797881: QNAP TS-219P II: qcontrol no longer works after upgrading to linux-image-4.1.0-0.bpo.1-kirkwood

2015-12-09 Thread Ian Campbell
On Wed, 2015-12-09 at 16:04 -0800, Martin Michlmayr wrote:
> * Ian Campbell <i...@debian.org> [2015-10-04 14:04]:
> > I suspect this is due to the device path for the input node changing
> > from /dev/input/by-path/platform-gpio_keys-event to
> > /dev/input/by-path/platform-gpio-keys-event. With the version of
> > qcontrol in Jessie it won't even start if it can't find the device,
> > even though it can do many of its core things without it (the node is
> > for button input only).
> > 
> > This is fixed by qcontrol 0.5.4-4 in testing (both looking for old and
> > new names, as well as not treating failure to find either as a
> > catastrophe), but for Jessie you can just edit the path in
> > /etc/qcontrol.conf.
> > 
> > If that works for you then it might be worth uploading an updated
> > qcontrol to backports.
> 
> Ian, were you going to upload qcontrol to backports or did you want
> someone else to do it?

I wasn't, but now I am. I've left a build going and will (hopefully)
upload after breakfast.

Ian.



Bug#805971: linux-image-3.16.0-4-amd64: [PATCH] Xen domU "unable to handle kernel NULL pointer dereference"

2015-11-24 Thread Ian Campbell
On Tue, 2015-11-24 at 13:18 +0100, Sebastian Pipping wrote:
> 
> Viktor Dukhovni published a patch on 2015-09-09 at
> http://lists.xenproject.org/archives/html/xen-users/2015-09/txtbaRgWqxpT4.txt ,
> already.  His patch also fixes the "only created %d queues" message:
> unpatched it is using the wanted number of queues (rather than the number
> of queues created), by mistake.
> 
> I'm hoping for an updated kernel package including Viktor's patch, soon.

This needs to be fixed in mainline (or at least in net(-next).git and well
on its way to mainline) before we can consider it for inclusion in the
Debian kernel.

I don't see any patches from Viktor in any of those trees, nor anything
which looks like a similar fix from someone else.

Ian.



Bug#805885: https://bugs.debian.org/src:linux gives "Internal Server Error"

2015-11-23 Thread Ian Campbell
Package: bugs.debian.org
X-Debbugs-Cc: debian-kernel@lists.debian.org

Hi,

[ I sent this to owner@, as requested by the error message, in
  <1446823766.23065.68.ca...@debian.org>
  at Fri, 06 Nov 2015 15:29:26 +, but having not heard back I'm now
  filing as a bug against bugs.d.o, I hope that is correct. ]

Trying to look at the list of src:linux bugs results in:

Internal Server Error

The server encountered an internal error or misconfiguration and was 
unable to complete 
your request.

Several times today, including right now at "Fri  6 Nov 15:28:57 GMT 2015".

There are quite a lot of bugs in src:linux, so it might be a CGI timeout?

Thanks,
Ian.



Bug#762634: initramfs-tools: [armhf] mounting rootfs on USB disk fails / some USB host controller drivers missing in initramfs

2015-11-12 Thread Ian Campbell
On Wed, 2015-11-11 at 18:46 -0600, Vagrant Cascadian wrote:
> On 2014-09-30, Ben Hutchings wrote:
> > On Tue, 2014-09-30 at 08:19 +0100, Ian Campbell wrote:
> > > On Fri, 2014-09-26 at 00:08 +0100, Ben Hutchings wrote:
> > > > However, at the moment initramfs-tools won't include PHY drivers
> > > > even in
> > > > that configuration.
> > > 
> > > I spent some time last week hunting for a sysfs link between a device
> > > and the phys which it is using, without success. Do you have any
> > > ideas?
> > 
> > I suspect they're not visible there yet.
> > 
> > I think you could include all PHY drivers (drivers/phy and
> > drivers/usb/phy) when MODULES=most, and only the currently loaded
> > drivers if MODULES=dep.
> > 
> > USB non-generic PHY drivers don't appear in the device model at all
> > (ugh!) so in the MODULES=dep case you may have to bodge it by checking
> > for modules with names beginning with "phy-" (check both /sys/module
> > and /lib/modules/$(uname -r)/modules.builtin).
> 
> Would definitely like to see this, with a recent install on a Wandboard
> Dual with a USB2 sata disk for the rootfs. It installed fine with
> jessie's debian-installer, but failed on initial boot.
> 
> I worked around it by adding to /etc/initramfs-tools/modules:
> 
>   ci_hdrc_imx
>   phy_mxs_usb
> 
> I haven't yet verified if only adding "phy_mxs_usb" instead of both will
> work.
> 
> Had a similar problem with an Odroid-XU4 install (which isn't yet
> supported by debian-installer), and worked around it similarly, although
> haven't narrowed down exactly which modules are needed, though I
> suspect one or both of:
> 
>   phy_exynos_usb2
>   phy_exynos5_usbdrd
> 
> 
> Something that pulls in all the phy-* modules would likely fix the issue
> in a generic way, rather than playing whack-a-mole with various phy
> types.

I think Ben essentially did this in #762042[0], which ends up adding any
phy-* which are currently loaded to the initramfs. That change appears to
be in v0.119 while Jessie has v0.120 so perhaps something else is going on.

ci-* isn't covered by this logic, so that might be it in the first case.
What is ci-*?
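
As a rough illustration of the "phy-*" selection discussed above (this is not the actual initramfs-tools hook, and the function names are invented for the example), the matching could look like the sketch below. It takes the loaded module names and the lines of modules.builtin as plain lists, and treats "-" and "_" as equivalent because module names are normalised that way.

```python
import os


def is_phy_module(name):
    """True for module names beginning with "phy-" (or "phy_")."""
    return name.replace("-", "_").startswith("phy_")


def phy_modules(loaded_names, builtin_paths):
    """Collect phy-* modules from loaded module names and modules.builtin lines.

    loaded_names:  names as listed under /sys/module
    builtin_paths: lines such as "kernel/drivers/phy/phy-foo.ko"
    """
    names = set(loaded_names)
    for path in builtin_paths:
        base = os.path.basename(path.strip())
        if base.endswith(".ko"):
            base = base[:-3]
        if base:
            names.add(base)
    return sorted({n.replace("-", "_") for n in names if is_phy_module(n)})


print(phy_modules(["ci_hdrc_imx", "phy_mxs_usb"], []))
```

With the two modules from the Wandboard workaround as input, only phy_mxs_usb is selected, and ci_hdrc_imx falls outside the pattern.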

Ian.

[0] 6ba2e937271770576fb88f4e16cb5adb3094517b in initramfs-tools.git



Bug#803159: linux: Enable DT support for armel/orion5x arch

2015-11-09 Thread Ian Campbell
On Sat, 2015-11-07 at 11:45 +0900, Roger Shimizu wrote:
> Patch appended, to avoid any misunderstanding.
>  - 0001 is both OK for sid and jessie
>  - 0002 is only necessary for sid, or other 4.x kernel series (e.g.
> jessie-backport)

Thanks, I'd already set a build going with your first patch and my variant
of your second patch for the UART thing. The build went fine and I have now
pushed the result to our git tree so it will be in the next upload. I don't
have any orion5x to test on.

My version of the UART change is below. With your variant (without the
"ICEDCC is not set") I would expect CONFIG_DEBUG_ICEDCC to be on by default
and, according to the Kconfig help text, for things not to work without a
debugger attached.

Did you try your version and find it worked?

Ian.

@@ -16,6 +16,14 @@ CONFIG_ATAGS_PROC=y
 CONFIG_VFP=y
 
 ##
+## file: arch/arm/Kconfig.debug
+##
+## choice: Kernel low-level debugging port
+# CONFIG_DEBUG_ICEDCC is not set
+CONFIG_DEBUG_LL_UART_8250=y
+## end choice
+
+##
 ## file: arch/arm/mach-imx/Kconfig
 ##
 # CONFIG_ARCH_MXC is not set



Bug#804079: linux-image-3.16.0-4-amd64: Kernel panic on Xen virtualisation in Debian

2015-11-06 Thread Ian Campbell
Control: found -1 3.16.7-ckt11-1+deb8u5
Control: notfound -1 3.16.7-ckt11-1

Thanks for your report.

On Wed, 2015-11-04 at 18:53 +0100, Jan Prunk wrote:
> Package: src:linux
> Version: 3.16.7-ckt11-1

From your text and the screenshot I think this should really be
+deb8u5. I've updated the bug metadata with the first lines.

> Severity: important
> 
> Dear Maintainer,


> The following kernel panic error appears at random in Xen
> virtualisation.

As in it has appeared randomly from time to time (i.e. more than once)
or you've had a single random instance?

> Please look at the error in screenshot attachment.
> It's a Debian 8, Kernel 3.16.7-ckt11-1+deb8u5, Xen 4.4.4-pre

The screenshot shows a fault at 0x812b6dad == memcpy+0xd,
called from ndisc_send_redirect+0x3bf.

Unfortunately disassembling memcpy from what I think is the correct dbg
package[0] results in:

Dump of assembler code for function memcpy:
   0x812b6da0 <+0>:     mov    %rdi,%rax
   0x812b6da3 <+3>:     cmp    $0x20,%rdx
   0x812b6da7 <+7>:     jb     0x812b6e27
   0x812b6da9 <+9>:     cmp    %dil,%sil
   0x812b6dac <+12>:    jl     0x812b6de3
   0x812b6dae <+14>:    sub    $0x20,%rdx
   0x812b6db2 <+18>:    sub    $0x20,%rdx

i.e. the faulting %rip (0x812b6dad) is not on an instruction
boundary (it would be in the middle of that jl instruction, which
cannot happen).

The call in ndisc_send_redirect disassembles sensibly and matches up
ok.

If I decode the faulting address as if it were on an instruction
boundary then I get:

(gdb) x/i 0x812b6dad
   0x812b6dad:  xor    $0x20ea8348,%eax

which isn't accessing RAM and therefore surely cannot fault.

The version you have given is corroborated by the screenshot and I am
pretty sure I have got the correct dbg package to match.

I suppose you haven't rebuilt the kernel or anything like that?

I don't like to put things down to "cosmic rays", but if this was a one
off then I'm struggling to think of anything else to explain what
appears to be a single bit error in %rip.
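
The reasoning above can be checked with a little arithmetic. The sketch below is only an illustration: it uses the addresses exactly as they appear (truncated) in the disassembly, and it assumes the usual x86-64 encodings for the two instructions involved (0x7c for "jl rel8", 48 83 ea 20 for "sub $0x20,%rdx"), which are not shown in the gdb output.

```python
faulting_rip = 0x812b6dad
jl_start = 0x812b6dac      # <+12>: jl  0x812b6de3
next_insn = 0x812b6dae     # <+14>: sub $0x20,%rdx
jl_target = 0x812b6de3


def bit_distance(a, b):
    """Number of bit positions in which two addresses differ."""
    return bin(a ^ b).count("1")


# ASSUMED encodings: jl rel8 is 0x7c followed by the displacement, and
# sub $0x20,%rdx is 48 83 ea 20.
rel8 = jl_target - next_insn
code = bytes([0x7C, rel8, 0x48, 0x83, 0xEA, 0x20])

# Bytes the CPU would fetch if it started one byte into the jl.
misaligned = code[faulting_rip - jl_start:]
opcode = misaligned[0]
imm32 = int.from_bytes(misaligned[1:5], "little")

print(bit_distance(faulting_rip, jl_start), hex(opcode), hex(imm32))
```

The faulting address is one bit away from the start of the jl, and the displacement byte followed by the next instruction's bytes decodes as opcode 0x35 (xor with a 32-bit immediate into %eax) with immediate 0x20ea8348, matching what gdb printed.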

At this point I would normally ask if you had run memtest86 etc on the
machine (i.e. if the RAM is known to be solid), but this seems to be a
register and not memory related.

> It's a production machine so not much detailed further testing can be
> provided in time.
> The information below (bugreport) is executed from a different
> machine, so the info provided below is not matching the original
> machine where the error appears !

FYI it is possible to run reportbug on a machine but get it to write
the report to a file for transfer and sending from another machine.

Ian.

[0] 
http://security.debian.org/debian-security/pool/updates/main/l/linux/linux-image-3.16.0-4-amd64-dbg_3.16.7-ckt11-1+deb8u5_amd64.deb
 => /usr/lib/debug/vmlinux-3.16.0-4-amd64



Bug#803159: linux: Enable DT support for armel/orion5x arch

2015-11-06 Thread Ian Campbell
On Fri, 2015-11-06 at 23:53 +0900, Roger Shimizu wrote:
> So I think it's now safe to turn on DEBUG_LL_UART_8250 and merge my
> patch.

Great, thanks for tracking that down. 

It's a bit odd that DEBUG_LL_UART_NONE fails (after all it should be
doing nothing) but given that this flavour supports a single SOC which
all have the same UART (AFAICT, it's on in upstream orion5x defconfig
at least) I think there's no reason not to do this.

When I tried this I also got CONFIG_DEBUG_ICEDCC coming on. From the
Kconfig help I don't think we want this at all ("Note that the system
will appear to hang during boot if there is nothing connected to read
from the DCC.") so I have disabled it. Please let me know if you think
we want this for some reason.

I'll kick off a local build with this in now and push once it's done.
(I'm away this w/e, so might be next week)

Ian.



Bug#799122: [Pkg-xen-devel] Bug#799122: xen-hypervisor-4.4-amd64: Networking of domUs stops working after a few minutes

2015-11-06 Thread Ian Campbell
On Thu, 2015-11-05 at 19:16 +0100, Arne Klein wrote:
> Thank you. The tested versions on dom0 and domU in which the problem 
> occurs are:
> 4.1.3-1~bpo8+1
> 3.16.7-ckt11-1+deb8u5

I've asked around upstream and it was suggested that "xen-netback:
require fewer guest Rx slots when not using GSO" which is in later
upstream might be relevant.

Please could you try 4.3~rc7-1~exp1 from experimental in dom0 and see
if that helps?

It was also suggested that the contents of /sys/kernel/debug/xen-netback
after the stall has occurred might contain useful information.

I forgot to ask earlier, but are you able to trigger this on demand,
and/or is there any commonality in the workloads running in the guests
which trigger this (i.e. particular network services which they run) or
their configurations (e.g. manually disabling offloads, perhaps)?

Ian.



Bug#799122: [Pkg-xen-devel] Bug#799122: xen-hypervisor-4.4-amd64: Networking of domUs stops working after a few minutes

2015-11-06 Thread Ian Campbell
On Fri, 2015-11-06 at 11:23 +, Ian Campbell wrote:

> Please could you try 4.3~rc7-1~exp1 from experimental in dom0 and see
> if that helps?

4.3-1~exp1 was just uploaded, so you may as well try that instead.

Thanks,
Ian.



Bug#803159: linux: Enable DT support for armel/orion5x arch

2015-11-05 Thread Ian Campbell
Control: tag -1 +moreinfo

On Fri, 2015-11-06 at 01:27 +0900, Roger Shimizu wrote:
> Dear Ian,
> 
> Thanks for your comments!
> 
> > Given this and the discussions
> > https://lists.debian.org/debian-boot/2015/10/msg00221.html
> > I wonder how useful it turns out to be to apply this patch.
> 
> 1. This patch at least works well on jessie kernel (3.16).

I'm afraid the only candidate place for this to be applied would be
sid, not in a stable release.

> 2. I'm still trying to figure out why it doesn't work on sid kernel.

OK, please let us know.

> 3. The reason to support DT for my orion5x device (LS-WTGL) is because
> there's no other way to support it. There were some patches to support
> the device in legacy way, but it's not merged into mainline kernel, so
> need to compile every time upstream get updated.

There's no doubt that if this device is to be supported it should be
via DT.

But if the kernel image is too big for this device then it may not be a
device which we can support at all.

> 4. For LS-WSGL, sorry I don't have it. Debian previously support it,
> but mainline kernel 4.3 has convert this device to DT, which means
> legacy code has been removed and you have to use DT to boot. I notice
> it's the same series product with my device, so I included it in my
> patch.

OK.

I may be confused regarding which of the two systems mentioned have
issues with the size of the kernel supported by the firmware.

If you can figure out why the LS-WTGL doesn't work and that can be
fixed (i.e. it turns out to be something other than the size) then I
see no problem with adding support for the LS-WTGL platform.

Ian.



Bug#803159: linux: Enable DT support for armel/orion5x arch

2015-11-05 Thread Ian Campbell
On Tue, 2015-10-27 at 23:20 +0900, Roger Shimizu wrote:
> Package: linux
> Severity: normal
> 
> Dear Maintainer,
> 
> There're some updates for armel/orion5x in Linux 4.3 kernel:
>   - Buffalo Linkstation LS-WTGL: DT support newly added
>   - Buffalo Linkstation LS-WSGL: converted to DT
> 
> But currently, armel/orion5x kernel doesn't support DT well.
> Here's the patch to enable DT support for armel/orion5x.
> 
> However, I met some issues during testing on my LS-WTGL box,
> and found it's because the kernel image size exceeded the 
> limit of u-boot.

Given this and the discussions
https://lists.debian.org/debian-boot/2015/10/msg00221.html
I wonder how useful it turns out to be to apply this patch.

I suppose even if the given platform doesn't work due to uboot
limitations the general direction of enabling DT seems sound so long as
it doesn't push us over the current size limit, but if there are no
supported platforms which use this functionality perhaps we should wait
until then.

So I'm in two minds. Do other pkg-kernel or ARM folks have any
thoughts?

>  After I revert commit b3b60bbdd13 [0], it 
> boot smoothly.
> 
> I hope this patch can be applied soon. Thank you!
> 
> Cheers,
> Roger
> 
> Reference [0]:
> https://anonscm.debian.org/cgit/kernel/linux.git/commit/debian/config/armel/config.orion5x?id=b3b60bbdd13a9702dbd8e00bd9b35d49b625df31



Bug#788315: same problem in stretch

2015-10-11 Thread Ian Campbell
On Sun, 2015-10-11 at 09:58 +0200, Heiko Ernst wrote:
> can someone add this bug to stretch ?

Please write another mail to 788...@bugs.debian.org with this
directive, before any other text (including quotes "On Sun... wrote"
etc):

Control: found -1 <version>

e.g. if you tested 4.1.6-1 then write:

Control: found -1 4.1.6-1

This will tell the BTS what is going on and it will then know that
Stretch is affected.

Be sure to use the kernel package version and not the ABI version (the
latter of which will look like 4.1.6-2-amd64 or something like that).

"-1" is a shorthand for whichever nnn...@bugs.debian.org the mail is
addressed to, you can also spell it out in full.

You can also use bts(1) from the devscripts package to do the same
thing. See https://www.debian.org/Bugs/ for more information on
manipulating bugs in the Debian BTS.

Ian.



Bug#797881: QNAP TS-219P II: qcontrol no longer works after upgrading to linux-image-4.1.0-0.bpo.1-kirkwood

2015-10-05 Thread Ian Campbell
On Sun, 2015-10-04 at 22:21 +0100, Ben Hutchings wrote:
> On Sun, 2015-10-04 at 14:04 +0100, Ian Campbell wrote:
> > On Thu, 2015-09-03 at 12:20 +0200, Robert Schlabbach wrote:
> > > Package: linux-image-4.1.0-0.bpo.1-kirkwood
> > > Version: 4.1.3-1~bpo8+1
> > >  
> > > After installing this Linux kernel on my QNAP TS-219P II, qcontrol no
> > > longer works:
> > >  
> > > 1. The status LED remains in red/green blink mode (as set by the boot
> > > loader). It should be set to solid green when the kernel is loaded.
> > > 2. The buzzer does not buzz. It should buzz when the kernel is loaded
> > > and when the kernel is shutting down.
> > >  
> > > Removing and reinstalling the qcontrol package did not help.
> > 
> > I suspect this is due to the device path for the input node changing
> > from /dev/input/by-path/platform-gpio_keys-event to
> > /dev/input/by-path/platform-gpio-keys-event. With the version of
> > qcontrol in Jessie it won't even start if it can't find the device,
> > even though it can do many of its core things without it (the node is
> > for button input only).
> 
> The change seems to have been in the other direction.

Right.

> > This is fixed by qcontrol 0.5.4-4 in testing (both looking for old and
> > new names, as well as not treating failure to find either as a
> > catastrophe), but for Jessie you can just edit the path in
> > /etc/qcontrol.conf.
> > 
> > If that works for you then it might be worth uploading an updated
> > qcontrol to backports.
> 
> I think the name change in the kernel should be reverted (not just in
> Debian, but upstream) since it broke existing userland.

I agree, I did mention this upstream at the time this was first
discovered[0] and the consensus seemed to be that this would be hard to fix
(or at least no one knew how).

> Presumably that would be:

I don't think DTB node names generally have any actual meaning and the
gpio_keys is the module name (which the kernel has normalised with
"tr - _", like it generally does), compared with the older board file based
stuff which was, I suppose, non-modular or otherwise hard coded somewhere.

It happened enough releases ago now that I think it is unlikely to change
back :-/. I probably should have chased harder at the time.
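
For userland that has to cope with both names, the approach described for the newer qcontrol (look for either name, and do not treat a missing node as fatal) can be sketched like this. It is an illustration only, not qcontrol's actual code; the function name and the injectable `exists` parameter are invented for the example.

```python
import os

# Both spellings of the input node path mentioned in this thread.
CANDIDATE_PATHS = [
    "/dev/input/by-path/platform-gpio-keys-event",
    "/dev/input/by-path/platform-gpio_keys-event",
]


def find_event_device(candidates=CANDIDATE_PATHS, exists=os.path.exists):
    """Return the first candidate path that exists, or None if none do.

    Returning None instead of raising lets the caller carry on without
    button input rather than refusing to start.
    """
    for path in candidates:
        if exists(path):
            return path
    return None


print(find_event_device())
```

On a machine without either node this simply prints None, which is the point: the caller can skip button handling and still drive the LEDs and buzzer.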

Ian.

[0] http://lists.infradead.org/pipermail/linux-arm-kernel/2014-January/223791.html
and some relevant replies:
http://lists.infradead.org/pipermail/linux-arm-kernel/2014-January/224933.html
http://lists.infradead.org/pipermail/linux-arm-kernel/2014-January/225917.html

> 
> --- a/arch/arm/boot/dts/kirkwood-ts219-6281.dts
> +++ b/arch/arm/boot/dts/kirkwood-ts219-6281.dts
> @@ -32,7 +32,7 @@
>   };
>   };
>  
> - gpio_keys {
> + gpio-keys {
>   compatible = "gpio-keys";
>   #address-cells = <1>;
>   #size-cells = <0>;
> --- a/arch/arm/boot/dts/kirkwood-ts219-6282.dts
> +++ b/arch/arm/boot/dts/kirkwood-ts219-6282.dts
> @@ -42,7 +42,7 @@
>   };
>   };
>  
> - gpio_keys {
> + gpio-keys {
>   compatible = "gpio-keys";
>   #address-cells = <1>;
>   #size-cells = <0>;
> --- END ---
> 
> Ben.
> 



Bug#800085: linux-image-4.1.0-2-armmp: The ethernet fails to come up on my Allwinner A10 device.

2015-10-04 Thread Ian Campbell
On Sun, 2015-09-27 at 02:55 +0200, Gianluca Renzi wrote:
> Sorry but I don't agree with you. Supposing the allwinner A10 device
> need some voltage regulator drivers to be enabled in the kernel to
> have some devices in a working state, this not hurts some other
> devices that have some other configuration setting different in
> hardware layout. They are defined in the device tree for this
> particular machine, vendor or somewhere else. So the drivers for the
> regulator can be safely inserted into the A10 Debian kernel and you
> will don't break down anything.

Ben said, and I agree, that nothing in the set of modules suggested in
the URL enables any kind of real regulator. So while everything you say
is true, it is of no relevance here I'm afraid.

If you know of a specific option for a _physical_ regulator device (or
other physical bit of h/w) which should be enabled and/or loaded to fix
this issue then please let us know what it is.

Thanks,
Ian.



Bug#800085: linux-image-4.1.0-2-armmp: The ethernet fails to come up on my Allwinner A10 device.

2015-10-04 Thread Ian Campbell
On Sat, 2015-09-26 at 18:04 +0100, Ben Hutchings wrote:
> Control: tag -1 - newcomer
> Control: tag -1 moreinfo
> 
> On Sat, 2015-09-26 at 12:10 -0400, Matthew Schneider wrote:
> > Package: src:linux
> > Version: 4.1.6-1
> > Severity: important
> > Tags: newcomer
> > 
> > Dear Maintainer,
> > 
> >* What led up to the situation?
> > Attempted to install the mainline Debian kernel for the first time on
> > this system.
> >* What was the outcome of this action?
> > The ethernet did not come up. 
> > 
> > The problem seems to be outlined and solved here: 
> > 
> > http://comments.gmane.org/gmane.comp.hardware.netbook.arm.sunxi/17147
> >  
> > It would seem that an additional module needs to be included for it
> > to work correctly.
> [...]
> 
> None of the extra modules that were enabled as a 'solution' have
> anything to do with this hardware (or any real hardware).  So I don't
> think we have a real solution.

I agree, the URL suggests:
> CONFIG_REGULATOR_DEBUG=y
> CONFIG_REGULATOR_VIRTUAL_CONSUMER=y
> CONFIG_REGULATOR_USERSPACE_CONSUMER=y
> CONFIG_PWM_SUN4I=y

Of which the first is a debug option, the second two are "virtual" (in
that they don't relate to any real hardware) and the final one, while
related to h/w, is not h/w which should have anything to do with
Ethernet.

My best guess is that one of these has some sort of knock-on dependency
which causes something else to be enabled. I'm going to investigate
that possibility now... In the meantime:

> Please send the boot log from Linux 4.1 (or 4.2, which just landed in
> unstable).

Yes, please.

It would also be useful/interesting to know which Allwinner A10 device
this was with.

Ian.



Bug#799853: linux-image-2.6.32-5-xen-amd64: Xen kernel BUG: unable to handle kernel paging request

2015-10-04 Thread Ian Campbell
On Wed, 2015-09-23 at 11:51 +0200, Zdeněk Bělehrádek wrote:
> Package: linux-2.6
> Version: 2.6.32-48squeeze10

I'm afraid that with Squeeze being old-old-stable at this point your
best bet is going to be to upgrade at least the kernel if not the whole
distro to something > Squeeze.

I'd suggest starting with 2.6.32-48squeeze14 from the lts effort, if
that doesn't help (which I think is most likely going to be the case)
then 3.2.68-1+deb7u3~bpo60+1 from the o-o-bpo might be a good bet.
That's assuming you cannot upgrade the entire system to Wheezy or even
Jessie, which would be best of course.

Ian.


> Severity: important
> 
> 
> We have several virtual servers running under Xen, and two of them
> crash every few hours to days. Crash times are quite random, we have
> seen two crashes just about 2 minutes apart, and also few days went
> without crashing.
> 
> The crashing servers are used as mailservers, and run several
> instances of Exim, each listenning on different loopback and public
> IP address. Our customer uses it to send bulk e-mails, so there are
> long intervals of inactivity. We do have more of these serevrs, only
> two of them are crashing.
> 
> I checked the core dump with the crash utility, and it always hits
> kernel BUG: unable to handle kernel paging request, always in the
> same function and with the same backtrace. The crash is always
> triggered by collectd process. We tried to update kernel to latest
> version, and it had no effect. 
> 
> The hypervisor is xen-hypervisor-4.4-amd64 from Debian Jessie, the
> Dom0 is also Jessie. There is enough RAM in physical HW to support
> all the guests and some more.
> 
> I censored hostnames and IP addresses to protect the innocent.
> 
> -- Dmesg from crashed guest:
> 
> [12694.749508] BUG: unable to handle kernel paging request at
> 880002c49500
> [12694.750086] IP: [] inet_diag_dump+0x39f/0x78f
> [inet_diag]
> [12694.750690] PGD 1002067 PUD 1006067 PMD 3a9f067 PTE 0
> [12694.751300] Oops:  [#1] SMP 
> [12694.751932] last sysfs file: /sys/devices/vbd
> -2049/block/xvda1/uevent
> [12694.752007] CPU 0 
> [12694.752007] Modules linked in: tcp_diag inet_diag loop snd_pcm
> snd_timer snd soundcore snd_page_alloc evdev pcspkr joydev ext3 jbd
> mbcache dm_mod raid10 raid456 async_raid6_recov async_pq raid6_pq
> async_xor xor async_memcpy async_tx raid1 raid0 multipath linear
> md_mod xen_blkfront xen_netfront
> [12694.752007] Pid: 892, comm: collectd Not tainted 2.6.32-5-xen
> -amd64 #1 
> [12694.752007] RIP: e030:[]  []
> inet_diag_dump+0x39f/0x78f [inet_diag]
> [12694.752007] RSP: e02b:8800fcc31a88  EFLAGS: 00010246
> [12694.752007] RAX: 880002c49500 RBX: 8800fc558c70 RCX:
> 
> [12694.752007] RDX: 8800fef1cdc0 RSI: 8800fc558c60 RDI:
> 8800fd848148
> [12694.752007] RBP: 880002c3ef00 R08: 8800fc558000 R09:
> 
> [12694.752007] R10: 7f08762daeb0 R11: 8127be52 R12:
> 8800fc558c60
> [12694.752007] R13: 8800fdd8ea20 R14: 8800fd848000 R15:
> 8800fc558c60
> [12694.752007] FS:  7f08762db700() GS:880003add000()
> knlGS:
> [12694.752007] CS:  e033 DS:  ES:  CR0: 8005003b
> [12694.752007] CR2: 880002c49500 CR3: fd023000 CR4:
> 2660
> [12694.752007] DR0:  DR1:  DR2:
> 
> [12694.752007] DR3:  DR6: 0ff0 DR7:
> 0400
> [12694.752007] Process collectd (pid: 892, threadinfo
> 8800fcc3, task 8800fc902350)
> [12694.752007] Stack:
> [12694.752007]  81255a22 8800fc558000 0004
> 880002b14f00
> [12694.752007] <0> 001c 816e1f80 001c00d0
> 
> [12694.752007] <0> 880002a5a810 816e1d80 00d0
> 0074
> [12694.752007] Call Trace:
> [12694.752007]  [] ? sock_rmalloc+0x29/0x86
> [12694.752007]  [] ? netlink_dump+0x54/0x16c
> [12694.752007]  [] ? netlink_recvmsg+0x1a6/0x2c0
> [12694.752007]  [] ?
> hrtimer_try_to_cancel+0x3a/0x43
> [12694.752007]  [] ? sock_recvmsg+0xa6/0xbe
> [12694.752007]  [] ?
> autoremove_wake_function+0x0/0x2e
> [12694.752007]  [] ?
> autoremove_wake_function+0x0/0x2e
> [12694.752007]  [] ? verify_iovec+0x52/0xa2
> [12694.752007]  [] ? verify_iovec+0x52/0xa2
> [12694.752007]  [] ? sys_recvmsg+0x1b7/0x2cc
> [12694.752007]  [] ? sk_prot_alloc+0x79/0x12f
> [12694.752007]  [] ? sock_attach_fd+0x91/0xbf
> [12694.752007]  [] ? fd_install+0x2e/0x5a
> [12694.752007]  [] ? sock_map_fd+0x57/0x64
> [12694.752007]  [] ? system_call_fastpath+0x16/0x1b
> [12694.752007] Code: 34 4c 89 f7 c7 43 3c 00 00 00 00 e8 88 b8 0d e1
> c7 43 44 00 00 00 00 89 43 40 41 80 7c 24 10 0a 75 35 0f b7 45 38 48
> 8d 44 05 00 <48> 8b 10 49 89 54 24 18 48 8b 40 08 49 89 44 24 20 0f
> b7 45 38 
> [12694.752007] RIP  [] inet_diag_dump+0x39f/0x78f
> [inet_diag]
> [12694.752007]  RSP 
> [12694.752007] CR2: 880002c49500
> 

Bug#800085: linux-image-4.1.0-2-armmp: The ethernet fails to come up on my Allwinner A10 device.

2015-10-04 Thread Ian Campbell
On Sun, 2015-10-04 at 11:17 +0100, Ian Campbell wrote:

> I agree, the URL suggests:
> > CONFIG_REGULATOR_DEBUG=y
> > CONFIG_REGULATOR_VIRTUAL_CONSUMER=y
> > CONFIG_REGULATOR_USERSPACE_CONSUMER=y
> > CONFIG_PWM_SUN4I=y
> 
[...]
> My best guess is that one of these has some sort of knock-on dependency
> which causes something else to be enabled. I'm going to investigate
> that possibility now...

Nope, it turns out PWM_SUN4I is already on in our config and the other
three are independent and turn on nothing else, which was basically as
expected since they are all virtual.

PWM_SUN4I is a module, so it might be worth modprobing that (module
will be pwm-sun4i it seems) to ensure it is loaded and see if that
helps. I don't think it will though.

>  In the meantime:
> 
> > Please send the boot log from Linux 4.1 (or 4.2, which just landed in
> > unstable).
> 
> Yes, please.
> 
> It would also be useful/interesting to know which Allwinner A10 device
> this was with.

... and the .config for your working self-built kernel would be useful
for comparison too. As well as a dmesg from both the Debian kernel and
that one.

Given the self-built+working kernel was 3.16 based and the Debian one
was 4.1 it may just be that 4.1 has a bug with the Ethernet, so trying
4.2 would indeed be wise too.
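
For the .config comparison, something along these lines would do. It is a minimal sketch with invented helper names, and it only understands the two line forms that appear in this thread ("CONFIG_FOO=value" and "# CONFIG_FOO is not set").

```python
import re

_NOT_SET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map option name to value; "is not set" lines are recorded as "n"."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = _NOT_SET.match(line)
        if match:
            options[match.group(1)] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def diff_configs(old, new):
    """Options whose value differs, as {name: (old_value, new_value)}."""
    names = sorted(set(old) | set(new))
    return {n: (old.get(n), new.get(n)) for n in names if old.get(n) != new.get(n)}


debian = parse_config("CONFIG_PWM_SUN4I=m\n# CONFIG_REGULATOR_DEBUG is not set\n")
custom = parse_config("CONFIG_PWM_SUN4I=y\nCONFIG_REGULATOR_DEBUG=y\n")
for name, (before, after) in diff_configs(debian, custom).items():
    print(name, before, "->", after)
```

The two inline config strings are made-up samples; in practice the inputs would be the Debian config and the .config from the working self-built kernel.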

Ian.


