Raising NR_CPUS for arm64

2017-04-14 Thread Ben Hutchings
Following a request to raise the maximum supported number of CPUs in
the kernel (NR_CPUS) for s390x, I reviewed the current settings for
other architectures.  Changing this value changes the kernel module
ABI, which is disruptive and is rarely done in stable updates.  Raising
NR_CPUS has a small cost wherever the kernel manipulates sets of CPU
IDs (cpumask_t).  So we want it to be high enough for all supported
systems but not much higher.
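
As a rough illustration of that cost: cpumask_t is a bitmap with one bit
per possible CPU, rounded up to whole longs, so every mask grows with
NR_CPUS. The sketch below assumes a 64-bit kernel and ignores
CONFIG_CPUMASK_OFFSTACK; the helper name cpumask_bytes is made up for
this example and is not a kernel function.

```python
# Illustrative only: size of a cpumask bitmap for a given NR_CPUS.
# Assumption: 64-bit longs, as on arm64 and s390x.
BITS_PER_LONG = 64


def cpumask_bytes(nr_cpus, bits_per_long=BITS_PER_LONG):
    """Bytes needed for one bit per possible CPU, rounded up to whole longs."""
    longs = (nr_cpus + bits_per_long - 1) // bits_per_long
    return longs * (bits_per_long // 8)


if __name__ == "__main__":
    for nr_cpus in (64, 256, 4096):
        print(f"NR_CPUS={nr_cpus}: {cpumask_bytes(nr_cpus)} bytes per cpumask")
```

Every copy, comparison, or iteration over such a mask touches that many
bytes, which is why a much larger limit than needed is not free.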

For arm64, this value can range between 2 and 4096, and is currently
set to the default of 64.  I think this is high enough for all
*currently* supported systems.  However, there are at least 3 chip
families due to be released in the next year or so that appear to
support configurations with >64 cores:

- Macom X-Gene 3 "Skylark"
  status: sampling?
  max cores/package: 32
  max packages: 8? (not sure if cache-coherent for more than 2 packages)
  max cores: 256?
- Cavium ThunderX2
  status: sampling?
  max cores/package: 54
  max packages: 2
  max cores: 108
- Qualcomm Centriq 2400
  status: sampling
  max cores/package: 48
  max packages: 2? (2-way motherboard exists, but don't know if more possible)
  max cores: 96?

Should I raise NR_CPUS to, say, 256, in anticipation that support for
these chips may be backported?  (Some support for the Centriq 2400 is
already in mainline Linux.)
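
The arithmetic behind the list above can be sketched as follows. The
figures marked "?" are taken at face value, and rounding up to a power
of two is only a convention assumed for this example; NR_CPUS itself
can take any value in the permitted range. The names used here are
hypothetical.

```python
# Illustrative only: total cores per system from the figures quoted above.
# (cores per package, max packages); uncertain values are used as listed.
CHIPS = {
    "X-Gene 3": (32, 8),
    "ThunderX2": (54, 2),
    "Centriq 2400": (48, 2),
}


def total_cores(cores_per_package, packages):
    """Maximum core count for a fully populated system."""
    return cores_per_package * packages


def next_power_of_two(n):
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


if __name__ == "__main__":
    totals = {name: total_cores(*spec) for name, spec in CHIPS.items()}
    for name, cores in totals.items():
        print(f"{name}: {cores} cores, fits NR_CPUS={next_power_of_two(cores)}")
    print("largest:", max(totals.values()))
```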

Ben.

-- 
Ben Hutchings
Larkinson's Law: All laws are basically false.





Bug#738063: nfs-kernel-server: option to disable NFSv4 in /etc/default/nfs-kernel-server not working properly

2017-04-14 Thread Ben Hutchings
On Fri, 2017-04-14 at 20:19 -0700, Nye Liu wrote:
> Is anyone maintaining this package?

Daniel Pocock has been doing a fair amount of work on it.

> Is it possible to put this patch into an NMU?

This bug report has severity 'normal', which does not justify an NMU. 
Also, as we are approaching the release of Debian 9 'stretch', even
updates by a maintainer should only fix important bugs.

Ben.

-- 
Ben Hutchings
Larkinson's Law: All laws are basically false.





Bug#738063: nfs-kernel-server: option to disable NFSv4 in /etc/default/nfs-kernel-server not working properly

2017-04-14 Thread Nye Liu
Is anyone maintaining this package?

Is it possible to put this patch into an NMU?



Bug#860335: linux: [armhf] ahci_mvebu module is missing from sata-modules udeb

2017-04-14 Thread Ben Hutchings
On Fri, 2017-04-14 at 13:06 -0400, Robert Edmonds wrote:
> Package: linux
> Version: 4.9.18-1
> Severity: normal
> Tags: d-i patch
> 
> Hi,
> 
> The sata-modules udeb on armhf is missing the "ahci_mvebu" module. This
> prevents, e.g., installing Debian to an mSATA SSD installed in a Marvell
> Armada 385 SoC based system like the Turris Omnia.
> 
> I was able to complete an install using the d-i stretch rc3 release
> after manually copying ahci_mvebu.ko to the running installer
> environment and modprobe'ing it, so I think the attached patch will fix
> this problem. (See #860286 for the installation report.)

Thanks for the patch, but I prefer to specify modules in a more general
way rather than building up long lists.  What I think we should do here
is to add all the ahci* modules to sata-modules (instead of just ahci).
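
To show what such a pattern would pick up, here is a small sketch using
shell-style matching against the module names that appear in the patch
context, plus ahci itself. It only illustrates the selection; it does
not claim to reproduce the syntax or behaviour of the tool that builds
the udebs, and the function name select_modules is made up.

```python
# Illustrative only: which module names a wildcard like "ahci*" selects.
from fnmatch import fnmatchcase

# Names taken from the patch context; this is not the full module list.
MODULES = [
    "ahci",
    "ahci_platform",
    "ahci_imx",
    "ahci_sunxi",
    "ahci_tegra",
    "ahci_mvebu",
    "sata_highbank",
]


def select_modules(names, pattern):
    """Return the names matching a shell-style pattern, in input order."""
    return [name for name in names if fnmatchcase(name, pattern)]


if __name__ == "__main__":
    print(select_modules(MODULES, "ahci*"))
```

With a pattern, a new driver such as ahci_mvebu is covered without
another edit to the list.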

Ben.

-- 
Ben Hutchings
I'm not a reverse psychological virus.  Please don't copy me into your
sig.





Processed: severity of 860335 is important

2017-04-14 Thread Debian Bug Tracking System
Processing commands for cont...@bugs.debian.org:

> severity 860335 important
Bug #860335 [linux] linux: [armhf] ahci_mvebu module is missing from sata-modules udeb
Severity set to 'important' from 'normal'
> thanks
Stopping processing here.

Please contact me if you need assistance.
-- 
860335: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=860335
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems



Bug#860335: linux: [armhf] ahci_mvebu module is missing from sata-modules udeb

2017-04-14 Thread Robert Edmonds
Package: linux
Version: 4.9.18-1
Severity: normal
Tags: d-i patch

Hi,

The sata-modules udeb on armhf is missing the "ahci_mvebu" module. This
prevents, e.g., installing Debian to an mSATA SSD installed in a Marvell
Armada 385 SoC based system like the Turris Omnia.

I was able to complete an install using the d-i stretch rc3 release
after manually copying ahci_mvebu.ko to the running installer
environment and modprobe'ing it, so I think the attached patch will fix
this problem. (See #860286 for the installation report.)

Thanks!

-- 
Robert Edmonds
edmo...@debian.org
From efa1c34a255f9ead2dd3591bb076b5b9d9c05110 Mon Sep 17 00:00:00 2001
From: Robert Edmonds 
Date: Fri, 14 Apr 2017 12:55:13 -0400
Subject: [PATCH] [armhf] sata-modules: Add module required for Turris Omnia
 mSATA

---
 debian/installer/armhf/modules/armhf-armmp/sata-modules | 1 +
 1 file changed, 1 insertion(+)

diff --git a/debian/installer/armhf/modules/armhf-armmp/sata-modules b/debian/installer/armhf/modules/armhf-armmp/sata-modules
index 70d5e3674..a1b457370 100644
--- a/debian/installer/armhf/modules/armhf-armmp/sata-modules
+++ b/debian/installer/armhf/modules/armhf-armmp/sata-modules
@@ -3,6 +3,7 @@ ahci_platform
 ahci_imx
 ahci_sunxi
 ahci_tegra
+ahci_mvebu
 sata_highbank
 
 # SATA PHYs
-- 
2.11.0



Bug#860236: xen pv domU crash with 3.16 kernel and xen 4.8

2017-04-14 Thread Vincent Legout
On Fri, Apr 14, 2017 at 09:15:58AM +0200, Vincent Legout wrote :
> On Thu, Apr 13, 2017 at 11:41:37PM +0100, Ben Hutchings wrote :
> > Control: tag -1 moreinfo
> > 
> > On Thu, 2017-04-13 at 11:18 +0200, Vincent Legout wrote:
> > > Package: src:linux
> > > Version: 3.16.39-1+deb8u2
> > > Severity: normal
> > > 
> > > Hi,
> > > 
> > > A xen jessie domU crashes around 5 minutes after the boot with the
> > > attached backtrace (at every boot). dom0 is also a Debian jessie running
> > > Xen 4.8.
> > > 
> > > It only happens when the guest is in pv mode, it works fine with pvhvm.
> > > 
> > > It also crashes with older 3.16 kernels and 4.0.2-1, but not with
> > > 4.2.1-1 (last 2 kernels from snapshot.debian.org).
> > > 
> > > # uname -a
> > > 3.16.0-4-amd64 #1 SMP Debian 3.16.39-1+deb8u2 (2017-03-07) x86_64 GNU/Linux
> > 
> > From the crash log:
> > 
> > > [  300.632389] CPU: 0 PID: 0 Comm: swapper/0 Tainted: GW 
> > > 3.16.0-4-amd64 #1 Debian 3.16.39-1+deb8u2
> > 
> > This indicates there was an earlier WARNING message; what was that?
> 
> Thanks for the answer.
> 
> I got this WARNING after I increased verbosity in the command line:

The WARNING and BUG disappear if "maxvcpus" is disabled in the guest
configuration (which prevents adding or removing vcpus).

Could CPU hotplug be buggy in 3.16, with Xen triggering the bug after 5
minutes even without any 'xl vcpu-set'?

With "maxvcpus" set larger than "vcpus", xl vcpu-set seems to work most
of the time (between 1 and 16 vcpus), but after several tries, I got the
attached trace.

Vincent
[   62.000210] BUG: unable to handle kernel NULL pointer dereference at 
0008
[   62.000229] IP: [] migrate_timer_list+0x3b/0xc0
[   62.000246] PGD 0 
[   62.000251] Oops: 0002 [#1] SMP 
[   62.000261] Modules linked in: x86_pkg_temp_thermal thermal_sys intel_rapl 
coretemp crc32_pclmul evdev aesni_intel aes_x86_64 lrw gf128mul glue_helper 
ablk_helper pcspkr cryptd autofs4 ext4 crc16 mbcache jbd2 xen_netfront 
xen_blkfront crct10dif_pclmul crct10dif_common crc32c_intel
[   62.000306] CPU: 9 PID: 89 Comm: xenwatch Not tainted 3.16.0-4-amd64 #1 
Debian 3.16.39-1+deb8u2
[   62.000318] task: 88003d597370 ti: 88003d598000 task.ti: 
88003d598000
[   62.000326] RIP: e030:[]  [] 
migrate_timer_list+0x3b/0xc0
[   62.000338] RSP: e02b:88003d59bd70  EFLAGS: 00010087
[   62.000344] RAX: dead0200 RBX:  RCX: 223a
[   62.000351] RDX:  RSI: 88003f96ca00 RDI: 88003daac000
[   62.000357] RBP: 88003f96ca00 R08: 4000 R09: fff8
[   62.000364] R10:  R11:  R12: 88003e3a5430
[   62.000375] R13: 88003daac000 R14: 818e2fa0 R15: 88003e3a5030
[   62.000387] FS:  () GS:88003f92() 
knlGS:
[   62.000395] CS:  e033 DS:  ES:  CR0: 80050033
[   62.000401] CR2: 0008 CR3: 01813000 CR4: 00042660
[   62.000408] Stack:
[   62.000411]   88003daac000 88003e3a5c30 
88003e3a5830
[   62.000422]  88003e3a5430 81075188 88003e3a4000 
fff2
[   62.000432]  8184c1a0  0007 
0001
[   62.000443] Call Trace:
[   62.000455]  [] ? timer_cpu_notify+0xf8/0x2e0
[   62.000465]  [] ? notifier_call_chain+0x4e/0x70
[   62.000478]  [] ? cpu_notify+0x1f/0x40
[   62.000486]  [] ? cpu_notify_nofail+0xa/0x20
[   62.000499]  [] ? _cpu_down+0x17b/0x290
[   62.000512]  [] ? unregister_xenbus_watch+0x210/0x210
[   62.000520]  [] ? cpu_down+0x2d/0x40
[   62.000530]  [] ? handle_vcpu_hotplug_event+0xa7/0xd0
[   62.000538]  [] ? xenwatch_thread+0x92/0x130
[   62.000550]  [] ? prepare_to_wait_event+0xf0/0xf0
[   62.000565]  [] ? kthread+0xbd/0xe0
[   62.000572]  [] ? kthread_create_on_node+0x180/0x180
[   62.000586]  [] ? ret_from_fork+0x58/0x90
[   62.000594]  [] ? kthread_create_on_node+0x180/0x180
[   62.000600] Code: 49 89 fd 41 54 49 89 f4 55 53 48 8b 2e 48 39 ee 74 4a 66 
0f 1f 44 00 00 0f 1f 44 00 00 48 8b 45 08 48 8b 55 00 48 89 ee 4c 89 ef <48> 89 
42 08 48 89 10 48 b8 00 02 00 00 00 00 ad de 48 89 45 08 
[   62.000680] RIP  [] migrate_timer_list+0x3b/0xc0
[   62.000692]  RSP 
[   62.000696] CR2: 0008
[   62.000703] ---[ end trace b62387850d17f99e ]---
[   84.492006] INFO: rcu_sched detected stalls on CPUs/tasks: { 2 8 9} 
(detected by 4, t=5255 jiffies, g=614, c=613, q=59)
[   84.492039] sending NMI to all CPUs:
[   63.481417] NMI backtrace for cpu 0
[   63.481417] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G  D   
3.16.0-4-amd64 #1 Debian 3.16.39-1+deb8u2
[   63.481417] task: 8181a460 ti: 8180 task.ti: 
8180
[   63.481417] RIP: e030:[]  [] 
_raw_spin_lock+0x28/0x30
[   63.481417] RSP: e02b:88003f803b58  EFLAGS: 0093
[   63.481417] RAX: 0198 RBX: 88003af9d3d8 RCX: 019b
[   63.481417] 

Bug#860236: xen pv domU crash with 3.16 kernel and xen 4.8

2017-04-14 Thread Vincent Legout
On Thu, Apr 13, 2017 at 11:41:37PM +0100, Ben Hutchings wrote :
> Control: tag -1 moreinfo
> 
> On Thu, 2017-04-13 at 11:18 +0200, Vincent Legout wrote:
> > Package: src:linux
> > Version: 3.16.39-1+deb8u2
> > Severity: normal
> > 
> > Hi,
> > 
> > A xen jessie domU crashes around 5 minutes after the boot with the
> > attached backtrace (at every boot). dom0 is also a Debian jessie running
> > Xen 4.8.
> > 
> > It only happens when the guest is in pv mode, it works fine with pvhvm.
> > 
> > It also crashes with older 3.16 kernels and 4.0.2-1, but not with
> > 4.2.1-1 (last 2 kernels from snapshot.debian.org).
> > 
> > # uname -a
> > 3.16.0-4-amd64 #1 SMP Debian 3.16.39-1+deb8u2 (2017-03-07) x86_64 GNU/Linux
> 
> From the crash log:
> 
> > [  300.632389] CPU: 0 PID: 0 Comm: swapper/0 Tainted: GW 
> > 3.16.0-4-amd64 #1 Debian 3.16.39-1+deb8u2
> 
> This indicates there was an earlier WARNING message; what was that?

Thanks for the answer.

I got this WARNING after I increased verbosity in the command line:

[  300.636063] ------------[ cut here ]------------
[  300.636102] WARNING: CPU: 0 PID: 0 at 
/build/linux-GSgHvp/linux-3.16.39/arch/x86/kernel/cpu/mcheck/mce.c:1307 
mce_timer_fn+0x132/0x140()
[  300.636116] Modules linked in: x86_pkg_temp_thermal thermal_sys intel_rapl 
coretemp crc32_pclmul evdev aesni_intel aes_x86_64 lrw gf128mul glue_helper 
ablk_helper pcspkr cryptd autofs4 ext4 crc16 mbcache jbd2 xen_netfront 
xen_blkfront crct10dif_pclmul crct10dif_common crc32c_intel
[  300.636167] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.16.0-4-amd64 #1 
Debian 3.16.39-1+deb8u2
[  300.636178]   81514c81  
0009
[  300.636188]  81068867 88003f80ca00 88003f9eca00 
0100
[  300.636199]  81038a30 000f 81038b62 
81a66e00
[  300.636211] Call Trace:
[  300.636216][] ? dump_stack+0x5d/0x78
[  300.636242]  [] ? warn_slowpath_common+0x77/0x90
[  300.636250]  [] ? mce_cpu_restart+0x40/0x40
[  300.636257]  [] ? mce_timer_fn+0x132/0x140
[  300.636267]  [] ? call_timer_fn+0x31/0x140
[  300.636274]  [] ? mce_cpu_restart+0x40/0x40
[  300.636284]  [] ? run_timer_softirq+0x1e9/0x2f0
[  300.636292]  [] ? __do_softirq+0xf1/0x2d0
[  300.636299]  [] ? irq_exit+0x95/0xa0
[  300.636309]  [] ? xen_evtchn_do_upcall+0x35/0x50
[  300.636319]  [] ? xen_do_hypervisor_callback+0x1e/0x30
[  300.636324][] ? xen_hypercall_sched_op+0xc/0x20
[  300.636339]  [] ? xen_hypercall_sched_op+0xc/0x20
[  300.636349]  [] ? xen_safe_halt+0xc/0x20
[  300.636360]  [] ? default_idle+0x19/0xd0
[  300.636370]  [] ? cpu_startup_entry+0x374/0x470
[  300.636384]  [] ? start_kernel+0x497/0x4a2
[  300.636392]  [] ? set_init_arg+0x4e/0x4e
[  300.636400]  [] ? xen_start_kernel+0x569/0x573
[  300.636413] ---[ end trace 7131ef713ca84161 ]---

The same BUG as before then follows. It always happens after 300 seconds.

Vincent

