[Kernel-packages] [Bug 2059316] Re: backport arm64 THP improvements from 6.9

2024-05-01 Thread Ian May
** Also affects: linux-nvidia (Ubuntu)
   Importance: Undecided
   Status: New

** Also affects: linux (Ubuntu Noble)
   Importance: Undecided
   Status: New

** Also affects: linux-nvidia (Ubuntu Noble)
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2059316

Title:
  backport arm64 THP improvements from 6.9

Status in linux package in Ubuntu:
  In Progress
Status in linux-nvidia package in Ubuntu:
  New
Status in linux source package in Noble:
  New
Status in linux-nvidia source package in Noble:
  New

Bug description:
  Initial support for multi-size THP landed upstream in v6.8. In the 6.9
  merge window, two other series landed that show significant
  performance improvements on arm64:

  mm/memory: optimize fork() with PTE-mapped THP
    https://lkml.iu.edu/hypermail/linux/kernel/2401.3/02766.html

  Transparent Contiguous PTEs for User Mappings:
   https://lwn.net/Articles/962330/

  On an Ampere AltraMax system w/ 4K page size, kernel builds in a tmpfs
  are reduced from 6m30s to 5m17s, a ~19% improvement.
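
The ~19% figure above can be checked with a few lines of arithmetic. This is only a sketch: the two timings come from the description, while the helper name and variable names are made up for illustration.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert an 'XmYs' style timing into seconds."""
    return minutes * 60 + seconds


before = to_seconds(6, 30)  # kernel build time without the backport
after = to_seconds(5, 17)   # kernel build time with the backport

# Fraction of the original wall-clock time that was saved.
reduction = (before - after) / before

print(f"build time reduced by {reduction:.1%}")
```

Note that this is the reduction in wall-clock time relative to the old build time; expressed as a throughput ratio (before / after) the same data gives a somewhat larger number.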

  It has been reported that this can have a *10x* improvement for
  certain GPU workloads on ARM:

  https://lwn.net/Articles/954094/

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2059316/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 2062380] Re: Using a 6.8 kernel 'modprobe nvidia' hangs on Quanta Grace Hopper

2024-04-18 Thread Ian May
** Summary changed:

- Using a 6.8 kernel modprobe nvidia hangs on Grace Hopper
+ Using a 6.8 kernel 'modprobe nvidia' hangs on Quanta Grace Hopper

** Also affects: nvidia-graphics-drivers-535-server (Ubuntu)
   Importance: Undecided
   Status: New

** Changed in: nvidia-graphics-drivers-535-server (Ubuntu)
   Status: New => Confirmed

** Changed in: nvidia-graphics-drivers-550-server (Ubuntu)
   Status: New => Confirmed

** Description changed:

  Using both -generic and -nvidia 6.8 kernels I'm seeing a hang when I
  load the nvidia driver.
+ 
+ $ sudo dmidecode -t 0
+ # dmidecode 3.5
+ Getting SMBIOS data from sysfs.
+ SMBIOS 3.6.0 present.
+ # SMBIOS implementations newer than version 3.5.0 are not
+ # fully supported by this version of dmidecode.
+ 
+ Handle 0x0001, DMI type 0, 26 bytes
+ BIOS Information
+   Vendor: NVIDIA
+   Version: 01.02.01
+   Release Date: 20240207
+   ROM Size: 64 MB
+   Characteristics:
+   PCI is supported
+   PNP is supported
+   BIOS is upgradeable
+   BIOS shadowing is allowed
+   Boot from CD is supported
+   Selectable boot is supported
+   Serial services are supported (int 14h)
+   ACPI is supported
+   Targeted content distribution is supported
+   UEFI is supported
+   Firmware Revision: 0.0
  
  [  382.938326] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
  [  382.946075] rcu: 53-...0: (4 ticks this GP) idle=1c2c/1/0x4000 softirq=4866/4868 fqs=14124
  [  382.955683] rcu:          hardirqs   softirqs   csw/system
  [  382.961378] rcu:  number:        0          0            0
  [  382.967071] rcu: cputime:        0          0            0   ==> 30026(ms)
  [  382.974189] rcu: (detected by 52, t=60034 jiffies, g=24469, q=1199 ncpus=72)
  [  392.982095] rcu: rcu_preempt kthread starved for 9994 jiffies! g24469 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=31
  [  392.992769] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior
  
- 
  After seeing this, I enabled kdump and set kernel.panic_on_rcu_stall = 1
  
  KDUMP INFO
  WARNING: cpu 54: cannot find NT_PRSTATUS note
-   KERNEL: /usr/lib/debug/boot/vmlinux-6.8.0-1004-nvidia-64k  [TAINTED]
- DUMPFILE: /var/crash/202404172139/dump.202404172139  [PARTIAL DUMP]
- CPUS: 72
- DATE: Wed Apr 17 21:39:13 UTC 2024
-   UPTIME: 00:06:10
+   KERNEL: /usr/lib/debug/boot/vmlinux-6.8.0-1004-nvidia-64k  [TAINTED]
+ DUMPFILE: /var/crash/202404172139/dump.202404172139  [PARTIAL DUMP]
+ CPUS: 72
+ DATE: Wed Apr 17 21:39:13 UTC 2024
+   UPTIME: 00:06:10
  LOAD AVERAGE: 0.68, 0.63, 0.28
-TASKS: 854
- NODENAME: hinyari
-  RELEASE: 6.8.0-1005-nvidia-64k
-  VERSION: #5-Ubuntu SMP PREEMPT_DYNAMIC Wed Apr 17 11:26:46 UTC 2024
-  MACHINE: aarch64  (unknown Mhz)
-   MEMORY: 479.7 GB
-PANIC: "Kernel panic - not syncing: RCU Stall"
-  PID: 0
-  COMMAND: "swapper/21"
- TASK: 82026880  (1 of 72)  [THREAD_INFO: 82026880]
-  CPU: 21
-STATE: TASK_RUNNING (PANIC)
+    TASKS: 854
+ NODENAME: hinyari
+  RELEASE: 6.8.0-1005-nvidia-64k
+  VERSION: #5-Ubuntu SMP PREEMPT_DYNAMIC Wed Apr 17 11:26:46 UTC 2024
+  MACHINE: aarch64  (unknown Mhz)
+   MEMORY: 479.7 GB
+    PANIC: "Kernel panic - not syncing: RCU Stall"
+  PID: 0
+  COMMAND: "swapper/21"
+ TASK: 82026880  (1 of 72)  [THREAD_INFO: 82026880]
+  CPU: 21
+    STATE: TASK_RUNNING (PANIC)
  
  [  300.313144] nvidia: loading out-of-tree module taints kernel.
  [  300.313153] nvidia: module verification failed: signature and/or required key missing - tainting kernel
  [  300.316694] nvidia-nvlink: Nvlink Core is being initialized, major device number 506
- [  300.316699] 
+ [  300.316699]
  [  360.323454] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
  [  360.331206] rcu: 54-...0: (24 ticks this GP) idle=742c/1/0x4000 softirq=4931/4933 fqs=13148
  [  360.340903] rcu:          hardirqs   softirqs   csw/system
  [  360.346597] rcu:  number:        0          0            0
  [  360.352291] rcu: cputime:        0          0            0   ==> 30031(ms)
  [  360.359408] rcu: (detected by 21, t=60038 jiffies, g=25009, q=1123 ncpus=72)
  [  360.366704] Sending NMI from CPU 21 to CPUs 54:
  [  370.367310] rcu: rcu_preempt kthread starved for 9993 jiffies! g25009 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=31
  [  370.377983] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
  [  370.387322] rcu: RCU grace-period kthread stack dump:
  [  370.392482] task:rcu_preempt state:I stack:0 pid:17 tgid:17 ppid:2 flags:0x0008
  [  370.392488] Call trace:
  [ 

[Kernel-packages] [Bug 2062380] [NEW] Using a 6.8 kernel modprobe nvidia hangs on Grace Hopper

2024-04-18 Thread Ian May
Public bug reported:

Using both -generic and -nvidia 6.8 kernels I'm seeing a hang when I
load the nvidia driver.

[  382.938326] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[  382.946075] rcu: 53-...0: (4 ticks this GP) idle=1c2c/1/0x4000 softirq=4866/4868 fqs=14124
[  382.955683] rcu:          hardirqs   softirqs   csw/system
[  382.961378] rcu:  number:        0          0            0
[  382.967071] rcu: cputime:        0          0            0   ==> 30026(ms)
[  382.974189] rcu: (detected by 52, t=60034 jiffies, g=24469, q=1199 ncpus=72)
[  392.982095] rcu: rcu_preempt kthread starved for 9994 jiffies! g24469 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=31
[  392.992769] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior
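
For readers unfamiliar with jiffies, the counts in the stall report can be turned into seconds with a small sketch like the one below. The tick rate is an assumption: the report does not state CONFIG_HZ, and HZ = 1000 is used here only because it makes the "starved for 9994 jiffies" figure roughly agree with the ~10 s gap between the two log timestamps. The function and variable names are made up for illustration.

```python
HZ = 1000  # assumption: CONFIG_HZ is not stated anywhere in the report


def jiffies_to_seconds(jiffies: int, hz: int = HZ) -> float:
    """Convert a jiffies count into seconds for a given tick rate."""
    return jiffies / hz


stall_s = jiffies_to_seconds(60034)   # t=60034 jiffies in the "detected by" line
starved_s = jiffies_to_seconds(9994)  # "kthread starved for 9994 jiffies"

# Timestamps copied from the "detected by 52" and "kthread starved" log lines.
log_gap_s = 392.982095 - 382.974189

print(f"stall duration    ~{stall_s:.1f} s")
print(f"kthread starved   ~{starved_s:.1f} s")
print(f"gap between lines ~{log_gap_s:.1f} s")
```

With a different tick rate the same counts would map to different durations, so treat the output as an estimate rather than a measurement.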


After seeing this, I enabled kdump and set kernel.panic_on_rcu_stall = 1

KDUMP INFO
WARNING: cpu 54: cannot find NT_PRSTATUS note
      KERNEL: /usr/lib/debug/boot/vmlinux-6.8.0-1004-nvidia-64k  [TAINTED]
    DUMPFILE: /var/crash/202404172139/dump.202404172139  [PARTIAL DUMP]
        CPUS: 72
        DATE: Wed Apr 17 21:39:13 UTC 2024
      UPTIME: 00:06:10
LOAD AVERAGE: 0.68, 0.63, 0.28
       TASKS: 854
    NODENAME: hinyari
     RELEASE: 6.8.0-1005-nvidia-64k
     VERSION: #5-Ubuntu SMP PREEMPT_DYNAMIC Wed Apr 17 11:26:46 UTC 2024
     MACHINE: aarch64  (unknown Mhz)
      MEMORY: 479.7 GB
       PANIC: "Kernel panic - not syncing: RCU Stall"
         PID: 0
     COMMAND: "swapper/21"
        TASK: 82026880  (1 of 72)  [THREAD_INFO: 82026880]
         CPU: 21
       STATE: TASK_RUNNING (PANIC)

[  300.313144] nvidia: loading out-of-tree module taints kernel.
[  300.313153] nvidia: module verification failed: signature and/or required key missing - tainting kernel
[  300.316694] nvidia-nvlink: Nvlink Core is being initialized, major device number 506
[  300.316699]
[  360.323454] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
[  360.331206] rcu: 54-...0: (24 ticks this GP) idle=742c/1/0x4000 softirq=4931/4933 fqs=13148
[  360.340903] rcu:          hardirqs   softirqs   csw/system
[  360.346597] rcu:  number:        0          0            0
[  360.352291] rcu: cputime:        0          0            0   ==> 30031(ms)
[  360.359408] rcu: (detected by 21, t=60038 jiffies, g=25009, q=1123 ncpus=72)
[  360.366704] Sending NMI from CPU 21 to CPUs 54:
[  370.367310] rcu: rcu_preempt kthread starved for 9993 jiffies! g25009 f0x0 RCU_GP_DOING_FQS(6) ->state=0x0 ->cpu=31
[  370.377983] rcu: Unless rcu_preempt kthread gets sufficient CPU time, OOM is now expected behavior.
[  370.387322] rcu: RCU grace-period kthread stack dump:
[  370.392482] task:rcu_preempt state:I stack:0 pid:17 tgid:17 ppid:2 flags:0x0008
[  370.392488] Call trace:
[  370.392489]  __switch_to+0xd0/0x118
[  370.392499]  __schedule+0x2a8/0x7b0
[  370.392501]  schedule+0x40/0x168
[  370.392502]  schedule_timeout+0xac/0x1e0
[  370.392505]  rcu_gp_fqs_loop+0x128/0x508
[  370.392512]  rcu_gp_kthread+0x150/0x188
[  370.392514]  kthread+0xf8/0x110
[  370.392519]  ret_from_fork+0x10/0x20
[  370.392524] rcu: Stack dump where RCU GP kthread last ran:
[  370.398128] Sending NMI from CPU 21 to CPUs 31:
[  370.398131] NMI backtrace for cpu 31
[  370.398136] CPU: 31 PID: 0 Comm: swapper/31 Kdump: loaded Tainted: G           OE      6.8.0-1005-nvidia-64k #5-Ubuntu
[  370.398139] Hardware name:  /P3880, BIOS 01.02.01 20240207
[  370.398140] pstate: 6349 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
[  370.398142] pc : cpuidle_enter_state+0xd8/0x790
[  370.398150] lr : cpuidle_enter_state+0xcc/0x790
[  370.398153] sp : 800081eefd70
[  370.398154] x29: 800081eefd70 x28:  x27: 
[  370.398157] x26:  x25: 00563d67e4e0 x24: 
[  370.398160] x23: a0a1445699f8 x22:  x21: 00563d72ece0
[  370.398162] x20: a0a144569a10 x19: 8fa4a800 x18: 800081f00030
[  370.398165] x17:  x16:  x15: ac8c73b08db0
[  370.398168] x14:  x13:  x12: 
[  370.398170] x11:  x10: 2da0fbe3d5e8c649 x9 : a0a1424fd244
[  370.398173] x8 : 820559b8 x7 :  x6 : 
[  370.398175] x5 :  x4 :  x3 : 
[  370.398178] x2 :  x1 :  x0 : 
[  370.398181] Call trace:
[  370.398183]  cpuidle_enter_state+0xd8/0x790
[  370.398185]  cpuidle_enter+0x44/0x78
[  370.398195]  cpuidle_idle_call+0x15c/0x210
[  370.398202]  do_idle+0xb0/0x130
[  370.398204]  cpu_startup_entry+0x40/0x50
[  370.398206]  secondary_start_kernel+0xec/0x130
[  370.398211]  __secondary_switched+0xc0/0xc8
[  370.399132] Kernel panic - not syncing: RCU Stall
[  370.403938] CPU: 21 PID: 0 Comm: 

[Kernel-packages] [Bug 2055712] Re: Pull-request to address bug in mm/page_alloc.c

2024-04-02 Thread Ian May
** Tags added: verification-done-jammy

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-nvidia-6.5 in Ubuntu.
https://bugs.launchpad.net/bugs/2055712

Title:
  Pull-request to address bug in mm/page_alloc.c

Status in linux-nvidia-6.5 package in Ubuntu:
  Fix Released

Bug description:
  
  The current calculation of min_free_kbytes only uses ZONE_DMA and
  ZONE_NORMAL pages, but the ZONE_MOVABLE zone->_watermark[WMARK_MIN] will
  also take a share of min_free_kbytes. This causes the min watermark of
  ZONE_NORMAL to be too small in the presence of ZONE_MOVABLE.

  __GFP_HIGH and PF_MEMALLOC allocations usually don't need movable zone
  pages, so just like ZONE_HIGHMEM, cap pages_min to a small value in
  __setup_per_zone_wmarks().

  On my testing machine with 16GB of memory (transparent hugepage is turned
  off by default, and movablecore=12G is configured), the following is
  comparative test data for watermark_min:

                    no patch    add patch
  ZONE_DMA                 1            8
  ZONE_DMA32             151          709
  ZONE_NORMAL            233         1113
  ZONE_MOVABLE          1434          128
  min_free_kbytes       7288         7326
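
The effect described above can be illustrated with a simplified model of how min_free_kbytes is shared out between zones. This is a sketch only, not the kernel code: the zone sizes are hypothetical round numbers (not the reporter's machine), 4 KiB pages are assumed, and the clamp bounds of 32 and 128 pages reflect my reading of __setup_per_zone_wmarks() and should be treated as assumptions.

```python
PAGE_KB = 4            # assumed page size in KiB
SWAP_CLUSTER_MAX = 32  # assumed lower clamp for capped zones, in pages
CAP_PAGES = 128        # assumed upper clamp for capped zones, in pages


def min_watermarks(zones, min_free_kbytes, patched):
    """Return a per-zone min watermark (in pages) under a simplified model.

    zones maps zone name -> managed pages. With patched=True, ZONE_MOVABLE
    is excluded from the proportional split and capped to a small value.
    """
    pages_min = min_free_kbytes // PAGE_KB

    def is_capped(name):
        return patched and name == "ZONE_MOVABLE"

    # Only zones that take part in the proportional split are summed here.
    lowmem_pages = sum(p for n, p in zones.items() if not is_capped(n))

    marks = {}
    for name, managed in zones.items():
        if is_capped(name):
            marks[name] = max(SWAP_CLUSTER_MAX, min(managed // 1024, CAP_PAGES))
        else:
            marks[name] = pages_min * managed // lowmem_pages
    return marks


# Hypothetical zone sizes in pages, loosely shaped like a 16GB machine
# booted with movablecore=12G.
zones = {
    "ZONE_DMA": 3_000,
    "ZONE_DMA32": 400_000,
    "ZONE_NORMAL": 600_000,
    "ZONE_MOVABLE": 3_145_728,
}

before = min_watermarks(zones, 7288, patched=False)
after = min_watermarks(zones, 7288, patched=True)

for name in zones:
    print(f"{name:<14} {before[name]:>6} {after[name]:>6}")
```

In this model the movable zone absorbs most of the budget before the change, and the lowmem zones get it back once the movable zone is capped, which is the same shape as the table above.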

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-6.5/+bug/2055712/+subscriptions




[Kernel-packages] [Bug 2055712] Re: Pull-request to address bug in mm/page_alloc.c

2024-04-02 Thread Ian May
** Changed in: linux-nvidia-6.5 (Ubuntu)
   Status: New => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-nvidia-6.5 in Ubuntu.
https://bugs.launchpad.net/bugs/2055712

Title:
  Pull-request to address bug in mm/page_alloc.c

Status in linux-nvidia-6.5 package in Ubuntu:
  Fix Released

Bug description:
  
  The current calculation of min_free_kbytes only uses ZONE_DMA and
  ZONE_NORMAL pages, but the ZONE_MOVABLE zone->_watermark[WMARK_MIN] will
  also take a share of min_free_kbytes. This causes the min watermark of
  ZONE_NORMAL to be too small in the presence of ZONE_MOVABLE.

  __GFP_HIGH and PF_MEMALLOC allocations usually don't need movable zone
  pages, so just like ZONE_HIGHMEM, cap pages_min to a small value in
  __setup_per_zone_wmarks().

  On my testing machine with 16GB of memory (transparent hugepage is turned
  off by default, and movablecore=12G is configured), the following is
  comparative test data for watermark_min:

                    no patch    add patch
  ZONE_DMA                 1            8
  ZONE_DMA32             151          709
  ZONE_NORMAL            233         1113
  ZONE_MOVABLE          1434          128
  min_free_kbytes       7288         7326

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-6.5/+bug/2055712/+subscriptions




[Kernel-packages] [Bug 2059150] Re: jammy/linux-nvidia-6.5: 6.5.0-1014.14 - Boot failure on Quanta Grace/Hopper

2024-03-26 Thread Ian May
Upgrading the BIOS firmware resolves the failure.

$ sudo dmidecode -t 0
# dmidecode 3.3
Getting SMBIOS data from sysfs.
SMBIOS 3.6.0 present.
# SMBIOS implementations newer than version 3.5.0 are not
# fully supported by this version of dmidecode.

Handle 0x0001, DMI type 0, 26 bytes
BIOS Information
Vendor: NVIDIA
Version: 01.02.01
Release Date: 20240207
ROM Size: 64 MB
Characteristics:
PCI is supported
PNP is supported
BIOS is upgradeable
BIOS shadowing is allowed
Boot from CD is supported
Selectable boot is supported
Serial services are supported (int 14h)
ACPI is supported
Targeted content distribution is supported
UEFI is supported
Firmware Revision: 0.0


** Changed in: linux-nvidia-6.5 (Ubuntu)
   Status: New => Invalid

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-nvidia-6.5 in Ubuntu.
https://bugs.launchpad.net/bugs/2059150

Title:
  jammy/linux-nvidia-6.5: 6.5.0-1014.14 - Boot failure on Quanta
  Grace/Hopper

Status in linux-nvidia-6.5 package in Ubuntu:
  Invalid

Bug description:
  Output from BMC SOL console:

  Unhandled Exception from EL2
  x0 = 0x11f210305619
  x1 = 0x
  x2 = 0x
  x3 = 0x
  x4 = 0x5f972493
  x5 = 0x
  x6 = 0x
  x7 = 0x
  x8 = 0x
  x9 = 0xa0e0a03e7d6c
  x10= 0x
  x11= 0x
  x12= 0x
  x13= 0x
  x14= 0x
  x15= 0x
  x16= 0x
  x17= 0x
  x18= 0x
  x19= 0xf0f18080
  x20= 0x80009e86f6a0
  x21= 0x80009e86f720
  x22= 0x07a5a0e0a03e7d6c
  x23= 0x
  x24= 0xa0e0a3348aa0
  x25= 0xa0e0a2990008
  x26= 0xa0e0a2990008
  x27= 0xa0e04b4f5748
  x28= 0x80009e86f710
  x29= 0x80008000fe00
  x30= 0xa0e0a03e7d6c
  scr_el3= 0x0407073d
  sctlr_el3  = 0x30cd183f
  cptr_el3   = 0x00100100
  tcr_el3= 0x80853510
  daif   = 0x02c0
  mair_el3   = 0x004404ff
  spsr_el3   = 0x034000c9
  elr_el3= 0xa0e04b4f58b4
  ttbr0_el3  = 0x0078734a5001
  esr_el3= 0x622c5c1f
  far_el3= 0x9446dd42099e8148
  spsr_el1   = 0x
  elr_el1= 0x
  spsr_abt   = 0x
  spsr_und   = 0x
  spsr_irq   = 0x
  spsr_fiq   = 0x
  sctlr_el1  = 0x30d00980
  actlr_el1  = 0x
  cpacr_el1  = 0x0030
  csselr_el1 = 0x0002
  sp_el1 = 0x
  esr_el1= 0x
  ttbr0_el1  = 0x
  ttbr1_el1  = 0x
  mair_el1   = 0x
  amair_el1  = 0x
  tcr_el1= 0x
  tpidr_el1  = 0x
  tpidr_el0  = 0x8000
  tpidrro_el0= 0x
  par_el1= 0x0800
  mpidr_el1  = 0x8102
  afsr0_el1  = 0x
  afsr1_el1  = 0x
  contextidr_el1 = 0x
  vbar_el1   = 0x
  cntp_ctl_el0   = 0x
  cntp_cval_el0  = 0x0012ec91c420
  cntv_ctl_el0   = 0x
  cntv_cval_el0  = 0x
  cntkctl_el1= 0x
  sp_el0 = 0x0078732cf4f0
  isr_el1= 0x0040
  cpuectlr_el1   = 0x4000340340003000
  gicd_ispendr regs (Offsets 0x200 - 0x278)
   Offset:value
  0200:   0xUnhandled Exception in EL3.
  x30= 0x0078732c4384
  x0 = 0x
  x1 = 0x0078732cb7d8
  x2 = 0x0018
  x3 = 0x0078732b1720
  x4 = 0x
  x5 = 0x003c
  x6 = 0x0078732c9109
  x7 = 0x22000204
  x8 = 0x4000340340003000
  x9 = 0x
  x10= 0x
  x11= 0x0012ec91c420
  x12= 0x
  x13= 0x
  x14= 0x
  

[Kernel-packages] [Bug 2059150] [NEW] jammy/linux-nvidia-6.5: 6.5.0-1014.14 - Boot failure on Quanta Grace/Hopper

2024-03-26 Thread Ian May
Public bug reported:

Output from BMC SOL console:

Unhandled Exception from EL2
x0 = 0x11f210305619
x1 = 0x
x2 = 0x
x3 = 0x
x4 = 0x5f972493
x5 = 0x
x6 = 0x
x7 = 0x
x8 = 0x
x9 = 0xa0e0a03e7d6c
x10= 0x
x11= 0x
x12= 0x
x13= 0x
x14= 0x
x15= 0x
x16= 0x
x17= 0x
x18= 0x
x19= 0xf0f18080
x20= 0x80009e86f6a0
x21= 0x80009e86f720
x22= 0x07a5a0e0a03e7d6c
x23= 0x
x24= 0xa0e0a3348aa0
x25= 0xa0e0a2990008
x26= 0xa0e0a2990008
x27= 0xa0e04b4f5748
x28= 0x80009e86f710
x29= 0x80008000fe00
x30= 0xa0e0a03e7d6c
scr_el3= 0x0407073d
sctlr_el3  = 0x30cd183f
cptr_el3   = 0x00100100
tcr_el3= 0x80853510
daif   = 0x02c0
mair_el3   = 0x004404ff
spsr_el3   = 0x034000c9
elr_el3= 0xa0e04b4f58b4
ttbr0_el3  = 0x0078734a5001
esr_el3= 0x622c5c1f
far_el3= 0x9446dd42099e8148
spsr_el1   = 0x
elr_el1= 0x
spsr_abt   = 0x
spsr_und   = 0x
spsr_irq   = 0x
spsr_fiq   = 0x
sctlr_el1  = 0x30d00980
actlr_el1  = 0x
cpacr_el1  = 0x0030
csselr_el1 = 0x0002
sp_el1 = 0x
esr_el1= 0x
ttbr0_el1  = 0x
ttbr1_el1  = 0x
mair_el1   = 0x
amair_el1  = 0x
tcr_el1= 0x
tpidr_el1  = 0x
tpidr_el0  = 0x8000
tpidrro_el0= 0x
par_el1= 0x0800
mpidr_el1  = 0x8102
afsr0_el1  = 0x
afsr1_el1  = 0x
contextidr_el1 = 0x
vbar_el1   = 0x
cntp_ctl_el0   = 0x
cntp_cval_el0  = 0x0012ec91c420
cntv_ctl_el0   = 0x
cntv_cval_el0  = 0x
cntkctl_el1= 0x
sp_el0 = 0x0078732cf4f0
isr_el1= 0x0040
cpuectlr_el1   = 0x4000340340003000
gicd_ispendr regs (Offsets 0x200 - 0x278)
 Offset:value
0200:   0xUnhandled Exception in EL3.
x30= 0x0078732c4384
x0 = 0x
x1 = 0x0078732cb7d8
x2 = 0x0018
x3 = 0x0078732b1720
x4 = 0x
x5 = 0x003c
x6 = 0x0078732c9109
x7 = 0x22000204
x8 = 0x4000340340003000
x9 = 0x
x10= 0x
x11= 0x0012ec91c420
x12= 0x
x13= 0x
x14= 0x
x15= 0x0078732cf4f0
x16= 0x2200
x17= 0x0018
x18= 0x0407073d
x19= 0x007873386440
x20= 0x80009e86f6a0
x21= 0x80009e86f720
x22= 0x07a5a0e0a03e7d6c
x23= 0x
x24= 0xa0e0a3348aa0
x25= 0xa0e0a2990008
x26= 0xa0e0a2990008
x27= 0xa0e04b4f5748
x28= 0x80009e86f710
x29= 0x80008000fe00
scr_el3= 0x0407073d
sctlr_el3  = 0x30cd183f
cptr_el3   = 0x00100100
tcr_el3= 0x80853510
daif   = 0x03c0
mair_el3   = 0x004404ff
spsr_el3   = 0x834002cd
elr_el3= 0x0078732b0af4
ttbr0_el3  = 0x0078734a5001
esr_el3= 0xbe11
far_el3= 0x9446dd42099e8148
spsr_el1   = 0x
elr_el1= 0x
spsr_abt   = 0x
spsr_und   = 0x
spsr_irq   = 0x
spsr_fiq   = 0x
sctlr_el1  = 0x30d00980
actlr_el1  = 0x
cpacr_el1  = 0x0030
csselr_el1 = 0x0002
sp_el1 = 0x
esr_el1= 0x
ttbr0_el1  = 0x
ttbr1_el1  = 0x
mair_el1   = 

[Kernel-packages] [Bug 2049537] Re: Pull request for: peer-memory, ACPI thermal issues and coresight etm4x issues

2024-01-17 Thread Ian May
** Changed in: linux-nvidia-6.5 (Ubuntu)
   Status: New => Fix Committed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-nvidia-6.5 in Ubuntu.
https://bugs.launchpad.net/bugs/2049537

Title:
  Pull request for: peer-memory, ACPI thermal issues and coresight
  etm4x issues

Status in linux-nvidia-6.5 package in Ubuntu:
  Fix Committed

Bug description:
  * Add support for "Thermal fast Sampling Period (_TFP)" for passive cooling.
  * Finer grained CPU throttling.
  * The peer_memory_client scheme allows a driver to register with the ib_umem
    system that it has the ability to understand user virtual address ranges
    that are not compatible with get_user_pages(), for instance VMAs created
    with io_remap_pfn_range() or other driver-special VMAs.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-6.5/+bug/2049537/+subscriptions




[Kernel-packages] [Bug 2048815] Re: Pull request to address TPM SPI devices

2024-01-11 Thread Ian May
** Changed in: linux-nvidia-6.5 (Ubuntu)
   Status: New => Fix Committed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-nvidia-6.5 in Ubuntu.
https://bugs.launchpad.net/bugs/2048815

Title:
  Pull request to address TPM SPI devices

Status in linux-nvidia-6.5 package in Ubuntu:
  Fix Committed

Bug description:
  TPM devices may insert a wait state on the last clock cycle of the ADDR
  phase. For SPI controllers that support full-duplex transfers, this can
  be detected in software by reading the MISO line. For SPI controllers
  that only support half-duplex transfers, such as the Tegra QSPI, it is
  not possible to detect the wait signal from software.
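
The full-duplex case described above can be sketched as a small simulation. Everything here is illustrative: FakeTpmBus, send_header, and the header bytes are hypothetical, and the convention that bit 0 of the byte sampled on MISO signals "ready" (0 meaning a wait state) is my understanding of TPM SPI flow control, stated here as an assumption rather than taken from the bug report.

```python
class FakeTpmBus:
    """Hypothetical full-duplex SPI bus with a TPM that stalls for a while.

    wait_bytes is how many single-byte polls the fake TPM needs before it
    signals that it is ready; 0 means no wait state is inserted.
    """

    def __init__(self, wait_bytes: int):
        self.wait_bytes = wait_bytes

    def transfer(self, tx: bytes) -> bytes:
        """Clock out tx and return the MISO bytes sampled at the same time."""
        if len(tx) == 4:  # 4-byte header (command plus address)
            last = 0x01 if self.wait_bytes == 0 else 0x00
            return bytes([0x00, 0x00, 0x00, last])
        # Single-byte poll issued while the TPM holds the wait state.
        self.wait_bytes -= 1
        return bytes([0x01 if self.wait_bytes == 0 else 0x00])


def send_header(bus: FakeTpmBus, header: bytes, max_polls: int = 50) -> int:
    """Send the header and return how many extra polls the wait state cost."""
    miso = bus.transfer(header)
    if miso[3] & 0x01:
        return 0  # ready on the last clock of the ADDR phase: no wait state
    for polls in range(1, max_polls + 1):
        if bus.transfer(b"\x00")[0] & 0x01:
            return polls
    raise TimeoutError("TPM did not release the wait state")


HEADER = bytes([0x80, 0xD4, 0x00, 0x00])  # illustrative header bytes only

print(send_header(FakeTpmBus(0), HEADER))
print(send_header(FakeTpmBus(3), HEADER))
```

The sketch depends on transfer() returning what was sampled on MISO while the header was still being clocked out. A half-duplex controller has no such return path during the transmit phase, which is why the description says the wait signal cannot be detected from software there.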

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-6.5/+bug/2048815/+subscriptions




[Kernel-packages] [Bug 2048966] Re: Fix soft lockup triggered by arm_smmu_mm_invalidate_range

2024-01-11 Thread Ian May
** Changed in: linux-nvidia-6.5 (Ubuntu)
   Status: New => Fix Committed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-nvidia-6.5 in Ubuntu.
https://bugs.launchpad.net/bugs/2048966

Title:
  Fix soft lockup triggered by arm_smmu_mm_invalidate_range

Status in linux-nvidia-6.5 package in Ubuntu:
  Fix Committed

Bug description:
  [Problem]

  When running an SVA case, the following soft lockup is triggered:
  
  watchdog: BUG: soft lockup - CPU#244 stuck for 26s!
  pstate: 8349 (Nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
  pc : arm_smmu_cmdq_issue_cmdlist+0x178/0xa50
  lr : arm_smmu_cmdq_issue_cmdlist+0x150/0xa50
  sp : 8000d83ef290
  x29: 8000d83ef290 x28: 3b9aca00 x27: 
  x26: 8000d83ef3c0 x25: da86c0812194a0e8 x24: 
  x23: 0040 x22: 8000d83ef340 x21: c63980c0
  x20: 0001 x19: c6398080 x18: 
  x17:  x16:  x15: 3000b4a3bbb0
  x14: 3000b4a30888 x13: 3000b4a3cf60 x12: 
  x11:  x10:  x9 : c08120e4d6bc
  x8 :  x7 :  x6 : 00048cfa
  x5 :  x4 : 0001 x3 : 000a
  x2 : 8000 x1 :  x0 : 0001
  Call trace:
   arm_smmu_cmdq_issue_cmdlist+0x178/0xa50
   __arm_smmu_tlb_inv_range+0x118/0x254
   arm_smmu_tlb_inv_range_asid+0x6c/0x130
   arm_smmu_mm_invalidate_range+0xa0/0xa4
   __mmu_notifier_invalidate_range_end+0x88/0x120
   unmap_vmas+0x194/0x1e0
   unmap_region+0xb4/0x144
   do_mas_align_munmap+0x290/0x490
   do_mas_munmap+0xbc/0x124
   __vm_munmap+0xa8/0x19c
   __arm64_sys_munmap+0x28/0x50
   invoke_syscall+0x78/0x11c
   el0_svc_common.constprop.0+0x58/0x1c0
   do_el0_svc+0x34/0x60
   el0_svc+0x2c/0xd4
   el0t_64_sync_handler+0x114/0x140
   el0t_64_sync+0x1a4/0x1a8
  

  
  [Fix]

  Backport the following upstream stable patch:
  d5afb4b47e13161b3f33904d45110f9e6463bad6

  Link:
  https://lore.kernel.org/r/20230920052257.8615-1-nicol...@nvidia.com

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-6.5/+bug/2048966/+subscriptions




[Kernel-packages] [Bug 2048966] Re: Fix soft lockup triggered by arm_smmu_mm_invalidate_range

2024-01-11 Thread Ian May
** Description changed:

+ [Problem]
+ 
  When running an SVA case, the following soft lockup is triggered:
- 
- watchdog: BUG: soft lockup - CPU#244 stuck for 26s!
- pstate: 8349 (Nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
- pc : arm_smmu_cmdq_issue_cmdlist+0x178/0xa50
- lr : arm_smmu_cmdq_issue_cmdlist+0x150/0xa50
- sp : 8000d83ef290
- x29: 8000d83ef290 x28: 3b9aca00 x27: 
- x26: 8000d83ef3c0 x25: da86c0812194a0e8 x24: 
- x23: 0040 x22: 8000d83ef340 x21: c63980c0
- x20: 0001 x19: c6398080 x18: 
- x17:  x16:  x15: 3000b4a3bbb0
- x14: 3000b4a30888 x13: 3000b4a3cf60 x12: 
- x11:  x10:  x9 : c08120e4d6bc
- x8 :  x7 :  x6 : 00048cfa
- x5 :  x4 : 0001 x3 : 000a
- x2 : 8000 x1 :  x0 : 0001
- Call trace:
-  arm_smmu_cmdq_issue_cmdlist+0x178/0xa50
-  __arm_smmu_tlb_inv_range+0x118/0x254
-  arm_smmu_tlb_inv_range_asid+0x6c/0x130
-  arm_smmu_mm_invalidate_range+0xa0/0xa4
-  __mmu_notifier_invalidate_range_end+0x88/0x120
-  unmap_vmas+0x194/0x1e0
-  unmap_region+0xb4/0x144
-  do_mas_align_munmap+0x290/0x490
-  do_mas_munmap+0xbc/0x124
-  __vm_munmap+0xa8/0x19c
-  __arm64_sys_munmap+0x28/0x50
-  invoke_syscall+0x78/0x11c
-  el0_svc_common.constprop.0+0x58/0x1c0
-  do_el0_svc+0x34/0x60
-  el0_svc+0x2c/0xd4
-  el0t_64_sync_handler+0x114/0x140
-  el0t_64_sync+0x1a4/0x1a8
- 
+ 
+ watchdog: BUG: soft lockup - CPU#244 stuck for 26s!
+ pstate: 8349 (Nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
+ pc : arm_smmu_cmdq_issue_cmdlist+0x178/0xa50
+ lr : arm_smmu_cmdq_issue_cmdlist+0x150/0xa50
+ sp : 8000d83ef290
+ x29: 8000d83ef290 x28: 3b9aca00 x27: 
+ x26: 8000d83ef3c0 x25: da86c0812194a0e8 x24: 
+ x23: 0040 x22: 8000d83ef340 x21: c63980c0
+ x20: 0001 x19: c6398080 x18: 
+ x17:  x16:  x15: 3000b4a3bbb0
+ x14: 3000b4a30888 x13: 3000b4a3cf60 x12: 
+ x11:  x10:  x9 : c08120e4d6bc
+ x8 :  x7 :  x6 : 00048cfa
+ x5 :  x4 : 0001 x3 : 000a
+ x2 : 8000 x1 :  x0 : 0001
+ Call trace:
+  arm_smmu_cmdq_issue_cmdlist+0x178/0xa50
+  __arm_smmu_tlb_inv_range+0x118/0x254
+  arm_smmu_tlb_inv_range_asid+0x6c/0x130
+  arm_smmu_mm_invalidate_range+0xa0/0xa4
+  __mmu_notifier_invalidate_range_end+0x88/0x120
+  unmap_vmas+0x194/0x1e0
+  unmap_region+0xb4/0x144
+  do_mas_align_munmap+0x290/0x490
+  do_mas_munmap+0xbc/0x124
+  __vm_munmap+0xa8/0x19c
+  __arm64_sys_munmap+0x28/0x50
+  invoke_syscall+0x78/0x11c
+  el0_svc_common.constprop.0+0x58/0x1c0
+  do_el0_svc+0x34/0x60
+  el0_svc+0x2c/0xd4
+  el0t_64_sync_handler+0x114/0x140
+  el0t_64_sync+0x1a4/0x1a8
+ 
+ 
+ 
+ [Fix]
+ 
+ backport the following upstream stable patch
+ d5afb4b47e13161b3f33904d45110f9e6463bad6
+ 
+ Link:
+ https://lore.kernel.org/r/20230920052257.8615-1-nicol...@nvidia.com

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-nvidia-6.5 in Ubuntu.
https://bugs.launchpad.net/bugs/2048966

Title:
  Fix soft lockup triggered by arm_smmu_mm_invalidate_range

Status in linux-nvidia-6.5 package in Ubuntu:
  New

Bug description:
  [Problem]

  When running an SVA case, the following soft lockup is triggered:
  
  watchdog: BUG: soft lockup - CPU#244 stuck for 26s!
  pstate: 8349 (Nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
  pc : arm_smmu_cmdq_issue_cmdlist+0x178/0xa50
  lr : arm_smmu_cmdq_issue_cmdlist+0x150/0xa50
  sp : 8000d83ef290
  x29: 8000d83ef290 x28: 3b9aca00 x27: 
  x26: 8000d83ef3c0 x25: da86c0812194a0e8 x24: 
  x23: 0040 x22: 8000d83ef340 x21: c63980c0
  x20: 0001 x19: c6398080 x18: 
  x17:  x16:  x15: 3000b4a3bbb0
  x14: 3000b4a30888 x13: 

[Kernel-packages] [Bug 2048966] [NEW] Fix soft lockup triggered by arm_smmu_mm_invalidate_range

2024-01-10 Thread Ian May
Public bug reported:

When running an SVA case, the following soft lockup is triggered:

watchdog: BUG: soft lockup - CPU#244 stuck for 26s!
pstate: 8349 (Nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
pc : arm_smmu_cmdq_issue_cmdlist+0x178/0xa50
lr : arm_smmu_cmdq_issue_cmdlist+0x150/0xa50
sp : 8000d83ef290
x29: 8000d83ef290 x28: 3b9aca00 x27: 
x26: 8000d83ef3c0 x25: da86c0812194a0e8 x24: 
x23: 0040 x22: 8000d83ef340 x21: c63980c0
x20: 0001 x19: c6398080 x18: 
x17:  x16:  x15: 3000b4a3bbb0
x14: 3000b4a30888 x13: 3000b4a3cf60 x12: 
x11:  x10:  x9 : c08120e4d6bc
x8 :  x7 :  x6 : 00048cfa
x5 :  x4 : 0001 x3 : 000a
x2 : 8000 x1 :  x0 : 0001
Call trace:
 arm_smmu_cmdq_issue_cmdlist+0x178/0xa50
 __arm_smmu_tlb_inv_range+0x118/0x254
 arm_smmu_tlb_inv_range_asid+0x6c/0x130
 arm_smmu_mm_invalidate_range+0xa0/0xa4
 __mmu_notifier_invalidate_range_end+0x88/0x120
 unmap_vmas+0x194/0x1e0
 unmap_region+0xb4/0x144
 do_mas_align_munmap+0x290/0x490
 do_mas_munmap+0xbc/0x124
 __vm_munmap+0xa8/0x19c
 __arm64_sys_munmap+0x28/0x50
 invoke_syscall+0x78/0x11c
 el0_svc_common.constprop.0+0x58/0x1c0
 do_el0_svc+0x34/0x60
 el0_svc+0x2c/0xd4
 el0t_64_sync_handler+0x114/0x140
 el0t_64_sync+0x1a4/0x1a8


** Affects: linux-nvidia-6.5 (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-nvidia-6.5 in Ubuntu.
https://bugs.launchpad.net/bugs/2048966

Title:
  Fix soft lockup triggered by arm_smmu_mm_invalidate_range

Status in linux-nvidia-6.5 package in Ubuntu:
  New

Bug description:
  When running an SVA case, the following soft lockup is triggered:
  
  watchdog: BUG: soft lockup - CPU#244 stuck for 26s!
  pstate: 8349 (Nzcv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--)
  pc : arm_smmu_cmdq_issue_cmdlist+0x178/0xa50
  lr : arm_smmu_cmdq_issue_cmdlist+0x150/0xa50
  sp : 8000d83ef290
  x29: 8000d83ef290 x28: 3b9aca00 x27: 
  x26: 8000d83ef3c0 x25: da86c0812194a0e8 x24: 
  x23: 0040 x22: 8000d83ef340 x21: c63980c0
  x20: 0001 x19: c6398080 x18: 
  x17:  x16:  x15: 3000b4a3bbb0
  x14: 3000b4a30888 x13: 3000b4a3cf60 x12: 
  x11:  x10:  x9 : c08120e4d6bc
  x8 :  x7 :  x6 : 00048cfa
  x5 :  x4 : 0001 x3 : 000a
  x2 : 8000 x1 :  x0 : 0001
  Call trace:
   arm_smmu_cmdq_issue_cmdlist+0x178/0xa50
   __arm_smmu_tlb_inv_range+0x118/0x254
   arm_smmu_tlb_inv_range_asid+0x6c/0x130
   arm_smmu_mm_invalidate_range+0xa0/0xa4
   __mmu_notifier_invalidate_range_end+0x88/0x120
   unmap_vmas+0x194/0x1e0
   unmap_region+0xb4/0x144
   do_mas_align_munmap+0x290/0x490
   do_mas_munmap+0xbc/0x124
   __vm_munmap+0xa8/0x19c
   __arm64_sys_munmap+0x28/0x50
   invoke_syscall+0x78/0x11c
   el0_svc_common.constprop.0+0x58/0x1c0
   do_el0_svc+0x34/0x60
   el0_svc+0x2c/0xd4
   el0t_64_sync_handler+0x114/0x140
   el0t_64_sync+0x1a4/0x1a8
  

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-6.5/+bug/2048966/+subscriptions




[Kernel-packages] [Bug 2042697] Re: Pull request to address thermal core issues

2023-12-04 Thread Ian May
** Changed in: linux-nvidia-6.2 (Ubuntu)
   Status: New => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-nvidia-6.2 in Ubuntu.
https://bugs.launchpad.net/bugs/2042697

Title:
  Pull request to address thermal core issues

Status in linux-nvidia-6.2 package in Ubuntu:
  Fix Released

Bug description:
  The Grace development team has not been testing the 6.2 Ubuntu kernel
  but instead a newer kernel. When they run their thermal tests on a 6.2
  kernel they are running into failures. Investigations have turned up
  several missing kernel patches. These patches are clean cherry-picks
  and have been tested and confirmed to fix the thermal issues we are
  seeing.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-6.2/+bug/2042697/+subscriptions




[Kernel-packages] [Bug 2040526] Re: Backport RDMA DMABUF

2023-11-14 Thread Ian May
** Changed in: linux (Ubuntu)
   Status: Incomplete => Won't Fix

** No longer affects: linux (Ubuntu Jammy)

** No longer affects: linux-nvidia (Ubuntu Jammy)

** Changed in: linux-nvidia (Ubuntu)
   Status: New => Fix Committed

** Changed in: linux-nvidia (Ubuntu)
 Assignee: (unassigned) => Ian May (ian-may)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2040526

Title:
  Backport RDMA DMABUF

Status in linux package in Ubuntu:
  Won't Fix
Status in linux-nvidia package in Ubuntu:
  Fix Committed

Bug description:
  SRU Justification:

  [Impact]

  * From Nvidia:

  "We are working on a high performance networking solution with real
  customers. That solution is being developed using the Ubuntu 22.04 LTS
  distro release and the distro kernel (lowlatency flavour). This
  “dma_buf” patchset consists of upstreamed patches that allow buffers
  to be shared between drivers thus enhancing performance while reducing
  copying of data.

  Our team is currently engaged in the development of a high-performance
  networking solution tailored to meet the demands of real-world
  customers. This cutting-edge solution is being crafted on the
  foundation of Ubuntu 22.04 LTS, utilizing the distribution's kernel,
  specifically the lowlatency flavor.

  At the heart of our innovation lies the transformative "dma_buf"
  patchset, comprising a series of patches that have been integrated
  into the upstream kernel in 5.16 and 5.17. These patches introduce a
  groundbreaking capability: enabling the seamless sharing of buffers
  among various drivers. This not only bolsters the solution's
  performance but also minimizes the need for data copying, effectively
  enhancing efficiency across the board.

  The new functionality is isolated such that existing user will not
  execute these new code paths."

  Upstream Reference

  https://lore.kernel.org/all/20211012120903.96933-1-galpr...@amazon.com/
  https://lore.kernel.org/all/0-v1-bd147097458e+ede-umem_dmabuf_...@nvidia.com/
  The patch "[PATCH 1/4] net/mlx5: Add IFC bits for mkey ATS" is already
  in Jammy and was included in:
  "2a1e6097e9b9 UBUNTU: SAUCE: RDMA/core: Updated ib_peer_memory"

  [Test Plan]

  * Testing instructions are outlined in the SF case. The backport has
  been tested on in-house hardware and externally by Nvidia.

  [Where problems could occur?]

  * This introduces new code paths so regression potential should be
  low.

  [Other Info]

  * SF#00370664

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2040526/+subscriptions




[Kernel-packages] [Bug 2040526] Re: Backport RDMA DMABUF

2023-11-14 Thread Ian May
** Package changed: linux-nvidia (Ubuntu) => linux (Ubuntu)

** Also affects: linux-nvidia (Ubuntu)
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2040526

Title:
  Backport RDMA DMABUF

Status in linux package in Ubuntu:
  Won't Fix
Status in linux-nvidia package in Ubuntu:
  Fix Committed

Bug description:
  SRU Justification:

  [Impact]

  * From Nvidia:

  "We are working on a high performance networking solution with real
  customers. That solution is being developed using the Ubuntu 22.04 LTS
  distro release and the distro kernel (lowlatency flavour). This
  “dma_buf” patchset consists of upstreamed patches that allow buffers
  to be shared between drivers thus enhancing performance while reducing
  copying of data.

  Our team is currently engaged in the development of a high-performance
  networking solution tailored to meet the demands of real-world
  customers. This cutting-edge solution is being crafted on the
  foundation of Ubuntu 22.04 LTS, utilizing the distribution's kernel,
  specifically the lowlatency flavor.

  At the heart of our innovation lies the transformative "dma_buf"
  patchset, comprising a series of patches that have been integrated
  into the upstream kernel in 5.16 and 5.17. These patches introduce a
  groundbreaking capability: enabling the seamless sharing of buffers
  among various drivers. This not only bolsters the solution's
  performance but also minimizes the need for data copying, effectively
  enhancing efficiency across the board.

  The new functionality is isolated such that existing user will not
  execute these new code paths."

  Upstream Reference

  https://lore.kernel.org/all/20211012120903.96933-1-galpr...@amazon.com/
  https://lore.kernel.org/all/0-v1-bd147097458e+ede-umem_dmabuf_...@nvidia.com/
  The patch "[PATCH 1/4] net/mlx5: Add IFC bits for mkey ATS" is already
  in Jammy and was included in:
  "2a1e6097e9b9 UBUNTU: SAUCE: RDMA/core: Updated ib_peer_memory"

  [Test Plan]

  * Testing instructions are outlined in the SF case. The backport has
  been tested on in-house hardware and externally by Nvidia.

  [Where problems could occur?]

  * This introduces new code paths so regression potential should be
  low.

  [Other Info]

  * SF#00370664

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2040526/+subscriptions




[Kernel-packages] [Bug 2040526] Re: Backport RDMA DMABUF

2023-11-14 Thread Ian May
** Package changed: linux (Ubuntu) => linux-nvidia (Ubuntu)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-nvidia in Ubuntu.
https://bugs.launchpad.net/bugs/2040526

Title:
  Backport RDMA DMABUF

Status in linux-nvidia package in Ubuntu:
  Incomplete
Status in linux-nvidia source package in Jammy:
  Incomplete

Bug description:
  SRU Justification:

  [Impact]

  * From Nvidia:

  "We are working on a high performance networking solution with real
  customers. That solution is being developed using the Ubuntu 22.04 LTS
  distro release and the distro kernel (lowlatency flavour). This
  “dma_buf” patchset consists of upstreamed patches that allow buffers
  to be shared between drivers thus enhancing performance while reducing
  copying of data.

  Our team is currently engaged in the development of a high-performance
  networking solution tailored to meet the demands of real-world
  customers. This cutting-edge solution is being crafted on the
  foundation of Ubuntu 22.04 LTS, utilizing the distribution's kernel,
  specifically the lowlatency flavor.

  At the heart of our innovation lies the transformative "dma_buf"
  patchset, comprising a series of patches that have been integrated
  into the upstream kernel in 5.16 and 5.17. These patches introduce a
  groundbreaking capability: enabling the seamless sharing of buffers
  among various drivers. This not only bolsters the solution's
  performance but also minimizes the need for data copying, effectively
  enhancing efficiency across the board.

  The new functionality is isolated such that existing user will not
  execute these new code paths."

  Upstream Reference

  https://lore.kernel.org/all/20211012120903.96933-1-galpr...@amazon.com/
  https://lore.kernel.org/all/0-v1-bd147097458e+ede-umem_dmabuf_...@nvidia.com/
  The patch "[PATCH 1/4] net/mlx5: Add IFC bits for mkey ATS" is already
  in Jammy and was included in:
  "2a1e6097e9b9 UBUNTU: SAUCE: RDMA/core: Updated ib_peer_memory"

  [Test Plan]

  * Testing instructions are outlined in the SF case. The backport has
  been tested on in-house hardware and externally by Nvidia.

  [Where problems could occur?]

  * This introduces new code paths so regression potential should be
  low.

  [Other Info]

  * SF#00370664

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-nvidia/+bug/2040526/+subscriptions




[Kernel-packages] [Bug 2043059] Re: Installation errors out when installing in a chroot

2023-11-10 Thread Ian May
I don't appear to have access to the image file used in the reproducer:
http://bright-dev.nvidia.com/base-distributions/x86_64/dgx-os/dgx-os-6.1-trd4/DGXOS-6.1.0-DGX-H100.tar.gz

So instead I'm reproducing with the following image:
https://cloud-images.ubuntu.com/jammy/20231027/jammy-server-cloudimg-amd64.tar.gz

The error indicates to me that mkinitramfs can't find the root device.
If I don't bind mount /dev into my image, I'm able to recreate the error
with both linux-generic and linux-nvidia. With the host /dev mounted
into the chroot, both kernels are able to call mkinitramfs successfully.

Can you confirm that 'cm-chroot-sw-img' is mounting /dev?
mount | grep /cm/images/dgx-h100-image/dev

If we are lucky and it happens not to be mounted, could you try the following:
sudo mount --bind /dev /cm/images/dgx-h100-image/dev
sudo chroot /cm/images/dgx-h100-image
/etc/kernel/postinst.d/kdump-tools 5.15.0-1040-nvidia

If /dev is correctly mounted and the problem persists, I'll probably
need a way to get that image tarball to investigate further.
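
The /dev check above can be sketched as a small script, assuming a POSIX
shell with awk available. The names dev_is_mounted, check_image_dev and
MOUNTS_FILE are hypothetical and exist only for this illustration; the
image path is the one from this report. The sketch only reports whether
<image>/dev is listed as a mount point and prints the bind-mount command;
it does not run it.

```shell
#!/bin/sh
# Sketch only: report whether <image>/dev is a mount point.
# MOUNTS_FILE, dev_is_mounted and check_image_dev are illustrative names,
# not part of cm-chroot-sw-img or any other tool.

MOUNTS_FILE="${MOUNTS_FILE:-/proc/mounts}"

# Return 0 if "$1/dev" appears as a mount point (second field) in
# $MOUNTS_FILE, non-zero otherwise.
dev_is_mounted() {
    awk -v target="$1/dev" \
        '$2 == target { found = 1 } END { exit !found }' "$MOUNTS_FILE"
}

# Print the state of <image>/dev and, if it is not mounted, the command
# that would bind mount the host /dev into the image.
check_image_dev() {
    if dev_is_mounted "$1"; then
        echo "$1/dev is mounted"
    else
        echo "$1/dev is not mounted; try: sudo mount --bind /dev $1/dev"
    fi
}

# Example (run by hand):
#   check_image_dev /cm/images/dgx-h100-image
```

Reading the mount table instead of running mount | grep keeps the match
exact: only an entry whose mount point is precisely <image>/dev counts,
not a longer path that happens to contain it.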

Thanks,
Ian

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-nvidia in Ubuntu.
https://bugs.launchpad.net/bugs/2043059

Title:
  Installation errors out when installing in a chroot

Status in linux-nvidia package in Ubuntu:
  New

Bug description:
  Processing triggers for linux-image-5.15.0-1040-nvidia (5.15.0-1040.40) ...
  /etc/kernel/postinst.d/dkms:
   * dkms: running auto installation service for kernel 5.15.0-1040-nvidia
 ...done.
  /etc/kernel/postinst.d/initramfs-tools:
  update-initramfs: Generating /boot/initrd.img-5.15.0-1040-nvidia
  cryptsetup: WARNING: Couldn't determine root device
  W: Couldn't identify type of root file system for fsck hook
  cp: cannot stat '/etc/iscsi/initiatorname.iscsi': No such file or directory
  /etc/kernel/postinst.d/kdump-tools:
  kdump-tools: Generating /var/lib/kdump/initrd.img-5.15.0-1040-nvidia
  mkinitramfs: failed to determine device for /
  mkinitramfs: workaround is MODULES=most, check:
  grep -r MODULES /var/lib/kdump/initramfs-tools

  Error please report bug on initramfs-tools
  Include the output of 'mount' and 'cat /proc/mounts'
  update-initramfs: failed for /var/lib/kdump/initrd.img-5.15.0-1040-nvidia with 1.
  run-parts: /etc/kernel/postinst.d/kdump-tools exited with return code 1
  dpkg: error processing package linux-image-5.15.0-1040-nvidia (--configure):
   installed linux-image-5.15.0-1040-nvidia package post-installation script subprocess returned error exit status 1
  Errors were encountered while processing:
   linux-image-5.15.0-1040-nvidia
  E: Sub-process /usr/bin/dpkg returned an error code (1)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-nvidia/+bug/2043059/+subscriptions




[Kernel-packages] [Bug 2040526] Re: Backport RDMA DMABUF

2023-10-25 Thread Ian May
** Changed in: linux (Ubuntu)
   Status: Incomplete => New

** Description changed:

  SRU Justification:
  
  [Impact]
  
- From Nvidia:
+ *From Nvidia:
  
  "We are working on a high performance networking solution with real
  customers. That solution is being developed using the Ubuntu 22.04 LTS
  distro release and the distro kernel (lowlatency flavour). This
  “dma_buf” patchset consists of upstreamed patches that allow buffers to
  be shared between drivers thus enhancing performance while reducing
  copying of data.
  
  Our team is currently engaged in the development of a high-performance
  networking solution tailored to meet the demands of real-world
  customers. This cutting-edge solution is being crafted on the foundation
  of Ubuntu 22.04 LTS, utilizing the distribution's kernel, specifically
  the lowlatency flavor.
  
  At the heart of our innovation lies the transformative "dma_buf"
  patchset, comprising a series of patches that have been integrated into
  the upstream kernel in 5.16 and 5.17. These patches introduce a
  groundbreaking capability: enabling the seamless sharing of buffers
  among various drivers. This not only bolsters the solution's performance
  but also minimizes the need for data copying, effectively enhancing
  efficiency across the board.
  
  The new functionality is isolated such that existing user will not
  execute these new code paths."
  
  Upstream Reference
  
  https://lore.kernel.org/all/20211012120903.96933-1-galpr...@amazon.com/
  https://lore.kernel.org/all/0-v1-bd147097458e+ede-umem_dmabuf_...@nvidia.com/
  The patch "[PATCH 1/4] net/mlx5: Add IFC bits for mkey ATS" is already in 
Jammy and was included in:
  "2a1e6097e9b9 UBUNTU: SAUCE: RDMA/core: Updated ib_peer_memory"
  
  [Test Plan]
  
- Testing instructions are outlined in the SF case and has been tested on
+ *Testing instructions are outlined in the SF case and has been tested on
  in house hardware and externally by Nvidia.
  
  [Where problems could occur?]
  
- This introduces new code paths so regression potential should be low.
+ *This introduces new code paths so regression potential should be low.
  
  [Other Info]
- SF#00370664
+ 
+ *SF#00370664

** Description changed:

  SRU Justification:
  
  [Impact]
  
- *From Nvidia:
+ * From Nvidia:
  
  "We are working on a high performance networking solution with real
  customers. That solution is being developed using the Ubuntu 22.04 LTS
  distro release and the distro kernel (lowlatency flavour). This
  “dma_buf” patchset consists of upstreamed patches that allow buffers to
  be shared between drivers thus enhancing performance while reducing
  copying of data.
  
  Our team is currently engaged in the development of a high-performance
  networking solution tailored to meet the demands of real-world
  customers. This cutting-edge solution is being crafted on the foundation
  of Ubuntu 22.04 LTS, utilizing the distribution's kernel, specifically
  the lowlatency flavor.
  
  At the heart of our innovation lies the transformative "dma_buf"
  patchset, comprising a series of patches that have been integrated into
  the upstream kernel in 5.16 and 5.17. These patches introduce a
  groundbreaking capability: enabling the seamless sharing of buffers
  among various drivers. This not only bolsters the solution's performance
  but also minimizes the need for data copying, effectively enhancing
  efficiency across the board.
  
  The new functionality is isolated such that existing user will not
  execute these new code paths."
  
  Upstream Reference
  
  https://lore.kernel.org/all/20211012120903.96933-1-galpr...@amazon.com/
  https://lore.kernel.org/all/0-v1-bd147097458e+ede-umem_dmabuf_...@nvidia.com/
  The patch "[PATCH 1/4] net/mlx5: Add IFC bits for mkey ATS" is already in 
Jammy and was included in:
  "2a1e6097e9b9 UBUNTU: SAUCE: RDMA/core: Updated ib_peer_memory"
  
  [Test Plan]
  
- *Testing instructions are outlined in the SF case and has been tested on
- in house hardware and externally by Nvidia.
+ * Testing instructions are outlined in the SF case and has been tested
+ on in house hardware and externally by Nvidia.
  
  [Where problems could occur?]
  
- *This introduces new code paths so regression potential should be low.
+ * This introduces new code paths so regression potential should be low.
  
  [Other Info]
  
- *SF#00370664
+ * SF#00370664

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2040526

Title:
  Backport RDMA DMABUF

Status in linux package in Ubuntu:
  New
Status in linux source package in Jammy:
  New

Bug description:
  SRU Justification:

  [Impact]

  * From Nvidia:

  "We are working on a high performance networking solution with real
  customers. That solution is being developed using the Ubuntu 22.04 LTS
  distro release and the distro kernel (lowlatency flavour). This
  “dma_buf” patchset consists 

[Kernel-packages] [Bug 2040526] Re: Backport RDMA DMABUF

2023-10-25 Thread Ian May
** Description changed:

  SRU Justification:
  
  [Impact]
  
  From Nvidia:
  
  "We are working on a high performance networking solution with real
  customers. That solution is being developed using the Ubuntu 22.04 LTS
  distro release and the distro kernel (lowlatency flavour). This
  “dma_buf” patchset consists of upstreamed patches that allow buffers to
  be shared between drivers thus enhancing performance while reducing
  copying of data.
  
  Our team is currently engaged in the development of a high-performance
  networking solution tailored to meet the demands of real-world
  customers. This cutting-edge solution is being crafted on the foundation
  of Ubuntu 22.04 LTS, utilizing the distribution's kernel, specifically
  the lowlatency flavor.
  
  At the heart of our innovation lies the transformative "dma_buf"
  patchset, comprising a series of patches that have been integrated into
  the upstream kernel in 5.16 and 5.17. These patches introduce a
  groundbreaking capability: enabling the seamless sharing of buffers
  among various drivers. This not only bolsters the solution's performance
  but also minimizes the need for data copying, effectively enhancing
  efficiency across the board.
  
  The new functionality is isolated such that existing user will not
  execute these new code paths."
  
  Upstream Reference
  
  https://lore.kernel.org/all/20211012120903.96933-1-galpr...@amazon.com/
  https://lore.kernel.org/all/0-v1-bd147097458e+ede-umem_dmabuf_...@nvidia.com/
- The patch "[PATCH 1/4] net/mlx5: Add IFC bits for mkey ATS" is already in 
Jammy and was included in: 
+ The patch "[PATCH 1/4] net/mlx5: Add IFC bits for mkey ATS" is already in 
Jammy and was included in:
  "2a1e6097e9b9 UBUNTU: SAUCE: RDMA/core: Updated ib_peer_memory"
  
  [Test Plan]
  
  Testing instructions are outlined in the SF case and has been tested on
- local hardware and also by Nvidia.
+ in house hardware and externally by Nvidia.
  
  [Where problems could occur?]
  
  This introduces new code paths so regression potential should be low.
  
  [Other Info]
  SF#00370664

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2040526

Title:
  Backport RDMA DMABUF

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Jammy:
  New

Bug description:
  SRU Justification:

  [Impact]

  From Nvidia:

  "We are working on a high performance networking solution with real
  customers. That solution is being developed using the Ubuntu 22.04 LTS
  distro release and the distro kernel (lowlatency flavour). This
  “dma_buf” patchset consists of upstreamed patches that allow buffers
  to be shared between drivers thus enhancing performance while reducing
  copying of data.

  Our team is currently engaged in the development of a high-performance
  networking solution tailored to meet the demands of real-world
  customers. This cutting-edge solution is being crafted on the
  foundation of Ubuntu 22.04 LTS, utilizing the distribution's kernel,
  specifically the lowlatency flavor.

  At the heart of our innovation lies the transformative "dma_buf"
  patchset, comprising a series of patches that have been integrated
  into the upstream kernel in 5.16 and 5.17. These patches introduce a
  groundbreaking capability: enabling the seamless sharing of buffers
  among various drivers. This not only bolsters the solution's
  performance but also minimizes the need for data copying, effectively
  enhancing efficiency across the board.

  The new functionality is isolated such that existing user will not
  execute these new code paths."

  Upstream Reference

  https://lore.kernel.org/all/20211012120903.96933-1-galpr...@amazon.com/
  https://lore.kernel.org/all/0-v1-bd147097458e+ede-umem_dmabuf_...@nvidia.com/
  The patch "[PATCH 1/4] net/mlx5: Add IFC bits for mkey ATS" is already
  in Jammy and was included in:
  "2a1e6097e9b9 UBUNTU: SAUCE: RDMA/core: Updated ib_peer_memory"

  [Test Plan]

  Testing instructions are outlined in the SF case. The backport has
  been tested on in-house hardware and externally by Nvidia.

  [Where problems could occur?]

  This introduces new code paths so regression potential should be low.

  [Other Info]
  SF#00370664

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2040526/+subscriptions




[Kernel-packages] [Bug 2040526] Re: Backport RDMA DMABUF

2023-10-25 Thread Ian May
** Description changed:

  SRU Justification:
  
  [Impact]
  
  From Nvidia:
  
  "We are working on a high performance networking solution with real
  customers. That solution is being developed using the Ubuntu 22.04 LTS
  distro release and the distro kernel (lowlatency flavour). This
  “dma_buf” patchset consists of upstreamed patches that allow buffers to
  be shared between drivers thus enhancing performance while reducing
  copying of data.
  
  Our team is currently engaged in the development of a high-performance
  networking solution tailored to meet the demands of real-world
  customers. This cutting-edge solution is being crafted on the foundation
  of Ubuntu 22.04 LTS, utilizing the distribution's kernel, specifically
  the lowlatency flavor.
  
  At the heart of our innovation lies the transformative "dma_buf"
  patchset, comprising a series of patches that have been integrated into
  the upstream kernel in 5.16 and 5.17. These patches introduce a
  groundbreaking capability: enabling the seamless sharing of buffers
  among various drivers. This not only bolsters the solution's performance
  but also minimizes the need for data copying, effectively enhancing
  efficiency across the board.
  
  The new functionality is isolated such that existing user will not
  execute these new code paths."
  
  Upstream Reference
+ 
  https://lore.kernel.org/all/20211012120903.96933-1-galpr...@amazon.com/
  https://lore.kernel.org/all/0-v1-bd147097458e+ede-umem_dmabuf_...@nvidia.com/
+ The patch "[PATCH 1/4] net/mlx5: Add IFC bits for mkey ATS" is already in 
Jammy and was included in: 
+ "2a1e6097e9b9 UBUNTU: SAUCE: RDMA/core: Updated ib_peer_memory"
  
  [Test Plan]
  
  Testing instructions are outlined in the SF case and has been tested on
  local hardware and also by Nvidia.
  
  [Where problems could occur?]
  
  This introduces new code paths so regression potential should be low.
  
  [Other Info]
  SF#00370664

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2040526

Title:
  Backport RDMA DMABUF

Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Jammy:
  New

Bug description:
  SRU Justification:

  [Impact]

  From Nvidia:

  "We are working on a high performance networking solution with real
  customers. That solution is being developed using the Ubuntu 22.04 LTS
  distro release and the distro kernel (lowlatency flavour). This
  “dma_buf” patchset consists of upstreamed patches that allow buffers
  to be shared between drivers thus enhancing performance while reducing
  copying of data.

  Our team is currently engaged in the development of a high-performance
  networking solution tailored to meet the demands of real-world
  customers. This cutting-edge solution is being crafted on the
  foundation of Ubuntu 22.04 LTS, utilizing the distribution's kernel,
  specifically the lowlatency flavor.

  At the heart of our innovation lies the transformative "dma_buf"
  patchset, comprising a series of patches that have been integrated
  into the upstream kernel in 5.16 and 5.17. These patches introduce a
  groundbreaking capability: enabling the seamless sharing of buffers
  among various drivers. This not only bolsters the solution's
  performance but also minimizes the need for data copying, effectively
  enhancing efficiency across the board.

  The new functionality is isolated such that existing user will not
  execute these new code paths."

  Upstream Reference

  https://lore.kernel.org/all/20211012120903.96933-1-galpr...@amazon.com/
  https://lore.kernel.org/all/0-v1-bd147097458e+ede-umem_dmabuf_...@nvidia.com/
  The patch "[PATCH 1/4] net/mlx5: Add IFC bits for mkey ATS" is already
  in Jammy and was included in:
  "2a1e6097e9b9 UBUNTU: SAUCE: RDMA/core: Updated ib_peer_memory"

  [Test Plan]

  Testing instructions are outlined in the SF case. The backport has
  been tested on local hardware and also by Nvidia.

  [Where problems could occur?]

  This introduces new code paths so regression potential should be low.

  [Other Info]
  SF#00370664

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2040526/+subscriptions




[Kernel-packages] [Bug 2040526] Re: Backport DMABUF functionality

2023-10-25 Thread Ian May
** Description changed:

  SRU Justification:
  
  [Impact]
  
- Backport RDMA DMABUF functionality
+ Backport RDMA DMABUF
  
- Nvidia is working on a high performance networking solution with real
+ From Nvidia:
+ 
+ "We are working on a high performance networking solution with real
  customers. That solution is being developed using the Ubuntu 22.04 LTS
  distro release and the distro kernel (lowlatency flavour). This
  “dma_buf” patchset consists of upstreamed patches that allow buffers to
  be shared between drivers thus enhancing performance while reducing
  copying of data.
  
  Our team is currently engaged in the development of a high-performance
  networking solution tailored to meet the demands of real-world
  customers. This cutting-edge solution is being crafted on the foundation
  of Ubuntu 22.04 LTS, utilizing the distribution's kernel, specifically
  the lowlatency flavor.
  
  At the heart of our innovation lies the transformative "dma_buf"
  patchset, comprising a series of patches that have been integrated into
  the upstream kernel in 5.16 and 5.17. These patches introduce a
  groundbreaking capability: enabling the seamless sharing of buffers
  among various drivers. This not only bolsters the solution's performance
  but also minimizes the need for data copying, effectively enhancing
  efficiency across the board.
  
  The new functionality is isolated such that existing user will not
- execute these new code paths.
+ execute these new code paths."
  
- * First 3 patches adds a new api to the RDMA subsystem that allows drivers to 
get a pinned dmabuf memory
- region without requiring an implementation of the move_notify callback.
- 
+ Upstream Reference
  https://lore.kernel.org/all/20211012120903.96933-1-galpr...@amazon.com/
- 
- * The remaining patches add support for DMABUF when creating a devx umem. 
devx umems
- are quite similar to MR's execpt they cannot be revoked, so this uses the 
- dmabuf pinned memory flow. Several mlx5dv flows require umem and cannot
- work with MR. 
- 
- https://lore.kernel.org/all/0-v1-bd147097458e+ede-
- umem_dmabuf_...@nvidia.com/
+ https://lore.kernel.org/all/0-v1-bd147097458e+ede-umem_dmabuf_...@nvidia.com/
  
  [Test Plan]
  
- SW Configuration:
- • Download CUDA 12.2 run file 
(https://developer.nvidia.com/cuda-downloads?target_os=Linux_arch=x86_64=Ubuntu_version=20.04_type=runfile_local)
- • Install using kernel-open i.e. #sh ./cuda_12.2.2_535.104.05_linux.run 
-m=kernel-open
- • Clone perftest from https://github.com/linux-rdma/perftest.
- • cd perftest
- • export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64:$LD_LIBRARY_PATH
- • export LIBRARY_PATH=/usr/local/cuda-12.2/lib64:$LIBRARY_PATH
- • run: ./autogen.sh ; ./configure CUDA_H_PATH=/usr/local/cuda/include/cuda.h; 
make
- 
- # Start Server
- $ ./ib_write_bw -d mlx5_2 -F --use_cuda=0 --use_cuda_dmabuf
- 
- #Start Client
- $ ./ib_write_bw -d mlx5_3 -F --use_cuda=1 --use_cuda_dmabuf localhost
+ Testing instructions are outlined in the SF case and has been tested on
+ local hardware and also by Nvidia.
  
  [Where problems could occur?]
+ 
+ This introduces new code paths so regression potential should be low.
+ 
+ [Other Info]
+ SF#00370664

** Description changed:

  SRU Justification:
  
  [Impact]
- 
- Backport RDMA DMABUF
  
  From Nvidia:
  
  "We are working on a high performance networking solution with real
  customers. That solution is being developed using the Ubuntu 22.04 LTS
  distro release and the distro kernel (lowlatency flavour). This
  “dma_buf” patchset consists of upstreamed patches that allow buffers to
  be shared between drivers thus enhancing performance while reducing
  copying of data.
  
  Our team is currently engaged in the development of a high-performance
  networking solution tailored to meet the demands of real-world
  customers. This cutting-edge solution is being crafted on the foundation
  of Ubuntu 22.04 LTS, utilizing the distribution's kernel, specifically
  the lowlatency flavor.
  
  At the heart of our innovation lies the transformative "dma_buf"
  patchset, comprising a series of patches that have been integrated into
  the upstream kernel in 5.16 and 5.17. These patches introduce a
  groundbreaking capability: enabling the seamless sharing of buffers
  among various drivers. This not only bolsters the solution's performance
  but also minimizes the need for data copying, effectively enhancing
  efficiency across the board.
  
  The new functionality is isolated such that existing user will not
  execute these new code paths."
  
  Upstream Reference
  https://lore.kernel.org/all/20211012120903.96933-1-galpr...@amazon.com/
  https://lore.kernel.org/all/0-v1-bd147097458e+ede-umem_dmabuf_...@nvidia.com/
  
  [Test Plan]
  
  Testing instructions are outlined in the SF case and has been tested on
  local hardware and also by Nvidia.
  
  [Where problems could occur?]
  
  This introduces new code paths so regression potential should be low.
  
  [Other 

[Kernel-packages] [Bug 2040526] [NEW] Backport DMABUF functionality

2023-10-25 Thread Ian May
Public bug reported:

SRU Justification:

[Impact]

Backport RDMA DMABUF functionality

Nvidia is working on a high performance networking solution with real
customers. That solution is being developed using the Ubuntu 22.04 LTS
distro release and the distro kernel (lowlatency flavour). This
“dma_buf” patchset consists of upstreamed patches that allow buffers to
be shared between drivers thus enhancing performance while reducing
copying of data.

Our team is currently engaged in the development of a high-performance
networking solution tailored to meet the demands of real-world
customers. This cutting-edge solution is being crafted on the foundation
of Ubuntu 22.04 LTS, utilizing the distribution's kernel, specifically
the lowlatency flavor.

At the heart of our innovation lies the transformative "dma_buf"
patchset, comprising a series of patches that have been integrated into
the upstream kernel in 5.16 and 5.17. These patches introduce a
groundbreaking capability: enabling the seamless sharing of buffers
among various drivers. This not only bolsters the solution's performance
but also minimizes the need for data copying, effectively enhancing
efficiency across the board.

The new functionality is isolated such that existing users will not
execute these new code paths.

* The first 3 patches add a new API to the RDMA subsystem that allows drivers
to get a pinned dmabuf memory region without requiring an implementation of
the move_notify callback.

https://lore.kernel.org/all/20211012120903.96933-1-galpr...@amazon.com/

* The remaining patches add support for DMABUF when creating a devx umem.
devx umems are quite similar to MRs except they cannot be revoked, so this
uses the dmabuf pinned memory flow. Several mlx5dv flows require umem and
cannot work with MR.

https://lore.kernel.org/all/0-v1-bd147097458e+ede-umem_dmabuf_...@nvidia.com/

[Test Plan]

SW Configuration:
• Download the CUDA 12.2 run file
  (https://developer.nvidia.com/cuda-downloads?target_os=Linux&target_arch=x86_64&Distribution=Ubuntu&target_version=20.04&target_type=runfile_local)
• Install using kernel-open, i.e. # sh ./cuda_12.2.2_535.104.05_linux.run -m=kernel-open
• Clone perftest from https://github.com/linux-rdma/perftest
• cd perftest
• export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64:$LD_LIBRARY_PATH
• export LIBRARY_PATH=/usr/local/cuda-12.2/lib64:$LIBRARY_PATH
• Run: ./autogen.sh; ./configure CUDA_H_PATH=/usr/local/cuda/include/cuda.h; make

# Start Server
$ ./ib_write_bw -d mlx5_2 -F --use_cuda=0 --use_cuda_dmabuf

#Start Client
$ ./ib_write_bw -d mlx5_3 -F --use_cuda=1 --use_cuda_dmabuf localhost
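
Before starting the server and client, it may help to confirm that the
perftest build actually picked up CUDA DMA-BUF support. The sketch below is
not part of the original test plan. It assumes that ib_write_bw lists
--use_cuda_dmabuf in its --help output only when that support was compiled
in; has_flag is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the command's --help output mentions the flag.
# Usage: has_flag <flag> <command> [args...]
has_flag() {
    flag=$1
    shift
    # -F: fixed-string match; "--" stops grep from parsing the flag as an option.
    "$@" --help 2>&1 | grep -qF -- "$flag"
}

# Only probe the binary if it has been built in the current directory.
if [ -x ./ib_write_bw ]; then
    if has_flag --use_cuda_dmabuf ./ib_write_bw; then
        echo "ib_write_bw: --use_cuda_dmabuf is available"
    else
        echo "ib_write_bw: --use_cuda_dmabuf not listed; check the configure step"
    fi
fi
```

If the flag is not listed, rerunning ./configure with the CUDA_H_PATH shown
above and checking its output is a reasonable first step.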

[Where problems could occur?]

** Affects: linux (Ubuntu)
 Importance: Undecided
 Status: Incomplete

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2040526

Title:
  Backport DMABUF functionality

Status in linux package in Ubuntu:
  Incomplete

Bug description:
  SRU Justification:

  [Impact]

  Backport RDMA DMABUF functionality

  Nvidia is working on a high performance networking solution with real
  customers. That solution is being developed using the Ubuntu 22.04 LTS
  distro release and the distro kernel (lowlatency flavour). This
  “dma_buf” patchset consists of upstreamed patches that allow buffers
  to be shared between drivers thus enhancing performance while reducing
  copying of data.

  Our team is currently engaged in the development of a high-performance
  networking solution tailored to meet the demands of real-world
  customers. This cutting-edge solution is being crafted on the
  foundation of Ubuntu 22.04 LTS, utilizing the distribution's kernel,
  specifically the lowlatency flavor.

  At the heart of our innovation lies the transformative "dma_buf"
  patchset, comprising a series of patches that have been integrated
  into the upstream kernel in 5.16 and 5.17. These patches introduce a
  groundbreaking capability: enabling the seamless sharing of buffers
  among various drivers. This not only bolsters the solution's
  performance but also minimizes the need for data copying, effectively
  enhancing efficiency across the board.

  The new functionality is isolated such that existing users will not
  execute these new code paths.

  * The first 3 patches add a new API to the RDMA subsystem that allows drivers
  to get a pinned dmabuf memory region without requiring an implementation of
  the move_notify callback.

  https://lore.kernel.org/all/20211012120903.96933-1-galpr...@amazon.com/

  * The remaining patches add support for DMABUF when creating a devx umem.
  devx umems are quite similar to MRs except they cannot be revoked, so this
  uses the dmabuf pinned memory flow. Several mlx5dv flows require umem and
  cannot work with MR.

  https://lore.kernel.org/all/0-v1-bd147097458e+ede-umem_dmabuf_...@nvidia.com/

  [Test Plan]

  SW Configuration:
  • Download CUDA 12.2 run file 

[Kernel-packages] [Bug 2038099] Re: Enable building and signing of the nvidia-fs out-of-tree kernel module.

2023-10-10 Thread Ian May
** Also affects: linux-nvidia-6.2 (Ubuntu Jammy)
   Importance: Undecided
   Status: New

** Changed in: linux-nvidia-6.2 (Ubuntu)
   Status: New => Fix Committed

** Changed in: linux-nvidia-6.2 (Ubuntu Jammy)
   Status: New => Fix Committed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-nvidia-6.2 in Ubuntu.
https://bugs.launchpad.net/bugs/2038099

Title:
  Enable building and signing of the nvidia-fs out-of-tree kernel
  module.

Status in linux-nvidia-6.2 package in Ubuntu:
  Fix Committed
Status in linux-nvidia-6.2 source package in Jammy:
  Fix Committed

Bug description:
  [Issue]

  The nvidia-fs kernel module is a must-have for Nvidia optimized
  kernels. There is now a version that is compatible with the Grace
  processor. Integrate the changes necessary to build and sign this
  out-of-tree kernel module.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-6.2/+bug/2038099/+subscriptions




[Kernel-packages] [Bug 2033685] Re: Pull-request to address ARM CoreSight PMU issues

2023-10-10 Thread Ian May
** Changed in: linux-nvidia-6.2 (Ubuntu Jammy)
   Status: New => Fix Committed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-nvidia-6.2 in Ubuntu.
https://bugs.launchpad.net/bugs/2033685

Title:
  Pull-request to address ARM CoreSight PMU issues

Status in linux-nvidia-6.2 package in Ubuntu:
  Fix Committed
Status in linux-nvidia-6.2 source package in Jammy:
  Fix Committed

Bug description:
  [issue]

  This patch set addresses several CoreSight PMU issues. These are all
  upstream patches.

  
  Commit Summary

  2940a5e perf: arm_cspmu: Fix variable dereference warning
  06f6951 perf: arm_cspmu: Set irq affinitiy only if overflow interrupt is used
  292771d perf/arm_cspmu: Fix event attribute type
  6992931 ACPI/APMT: Don't register invalid resource
  48f4b92 perf/arm_cspmu: Clean up ACPI dependency
  7da1852 perf/arm_cspmu: Decouple APMT dependency
  d3d56a4 perf: arm_cspmu: Add missing MODULE_DEVICE_TABLE

  File Changes (4 files)

  M drivers/acpi/arm64/apmt.c (10)
  M drivers/perf/arm_cspmu/Kconfig (3)
  M drivers/perf/arm_cspmu/arm_cspmu.c (95)
  M drivers/perf/arm_cspmu/arm_cspmu.h (5)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-6.2/+bug/2033685/+subscriptions




[Kernel-packages] [Bug 2037688] Re: Pull-request to address TPM bypass issue

2023-10-10 Thread Ian May
** Changed in: linux-nvidia-6.2 (Ubuntu Jammy)
   Status: New => Fix Committed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-nvidia-6.2 in Ubuntu.
https://bugs.launchpad.net/bugs/2037688

Title:
  Pull-request to address TPM bypass issue

Status in linux-nvidia-6.2 package in Ubuntu:
  Fix Committed
Status in linux-nvidia-6.2 source package in Jammy:
  Fix Committed

Bug description:
  NVIDIA: [Config]: Ensure the TPM is available before IMA
  initializes

  Set the following configs:

CONFIG_SPI_TEGRA210_QUAD=y
CONFIG_TCG_TIS_SPI=y

  On Grace systems, the IMA driver emits the following log:

ima: No TPM chip found, activating TPM-bypass!

  This occurs because the IMA driver initializes before we are able to
  detect the TPM. This will always be the case when the drivers required to
  communicate with the TPM, spi_tegra210_quad and tpm_tis_spi, are built as
  modules.

  Having these drivers as built-ins ensures that the TPM is available before
  the IMA driver initializes.
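
One way to confirm the setting on an installed kernel is to inspect its
config file. The following is a minimal sketch rather than part of the
original report. It assumes the config is available at
/boot/config-$(uname -r), the usual Ubuntu location; check_builtin is a
hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: report whether each option is built in (=y), modular
# (=m), or unset in the given kernel config file. Returns non-zero if any
# option is not built in.
# Usage: check_builtin <config-file> <CONFIG_OPTION>...
check_builtin() {
    config_file=$1
    shift
    result=0
    for opt in "$@"; do
        if grep -q "^${opt}=y\$" "$config_file"; then
            echo "${opt}: built-in"
        elif grep -q "^${opt}=m\$" "$config_file"; then
            echo "${opt}: module (IMA may initialize before the TPM is detected)"
            result=1
        else
            echo "${opt}: not set"
            result=1
        fi
    done
    return $result
}

# Assumed config location; the check is skipped if the file is not readable.
cfg="/boot/config-$(uname -r)"
if [ -r "$cfg" ]; then
    check_builtin "$cfg" CONFIG_SPI_TEGRA210_QUAD CONFIG_TCG_TIS_SPI || true
fi
```

On an affected system, searching the boot log for the "No TPM chip found"
message quoted above is a complementary check.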

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-6.2/+bug/2037688/+subscriptions




[Kernel-packages] [Bug 2033685] Re: Pull-request to address ARM CoreSight PMU issues

2023-10-10 Thread Ian May
** Also affects: linux-nvidia-6.2 (Ubuntu Jammy)
   Importance: Undecided
   Status: New

** Changed in: linux-nvidia-6.2 (Ubuntu)
   Status: New => Fix Committed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-nvidia-6.2 in Ubuntu.
https://bugs.launchpad.net/bugs/2033685

Title:
  Pull-request to address ARM CoreSight PMU issues

Status in linux-nvidia-6.2 package in Ubuntu:
  Fix Committed
Status in linux-nvidia-6.2 source package in Jammy:
  New

Bug description:
  [issue]

  This patch set addresses several CoreSight PMU issues. These are all
  upstream patches.

  
  Commit Summary

  2940a5e perf: arm_cspmu: Fix variable dereference warning
  06f6951 perf: arm_cspmu: Set irq affinitiy only if overflow interrupt is used
  292771d perf/arm_cspmu: Fix event attribute type
  6992931 ACPI/APMT: Don't register invalid resource
  48f4b92 perf/arm_cspmu: Clean up ACPI dependency
  7da1852 perf/arm_cspmu: Decouple APMT dependency
  d3d56a4 perf: arm_cspmu: Add missing MODULE_DEVICE_TABLE

  File Changes (4 files)

  M drivers/acpi/arm64/apmt.c (10)
  M drivers/perf/arm_cspmu/Kconfig (3)
  M drivers/perf/arm_cspmu/arm_cspmu.c (95)
  M drivers/perf/arm_cspmu/arm_cspmu.h (5)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-6.2/+bug/2033685/+subscriptions




[Kernel-packages] [Bug 2037688] Re: Pull-request to address TPM bypass issue

2023-10-10 Thread Ian May
** Also affects: linux-nvidia-6.2 (Ubuntu Jammy)
   Importance: Undecided
   Status: New

** Changed in: linux-nvidia-6.2 (Ubuntu)
   Status: New => Fix Committed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-nvidia-6.2 in Ubuntu.
https://bugs.launchpad.net/bugs/2037688

Title:
  Pull-request to address TPM bypass issue

Status in linux-nvidia-6.2 package in Ubuntu:
  Fix Committed
Status in linux-nvidia-6.2 source package in Jammy:
  New

Bug description:
  NVIDIA: [Config]: Ensure the TPM is available before IMA
  initializes

  Set the following configs:

CONFIG_SPI_TEGRA210_QUAD=y
CONFIG_TCG_TIS_SPI=y

  On Grace systems, the IMA driver emits the following log:

ima: No TPM chip found, activating TPM-bypass!

  This occurs because the IMA driver initializes before we are able to
  detect the TPM. This will always be the case when the drivers required to
  communicate with the TPM, spi_tegra210_quad and tpm_tis_spi, are built as
  modules.

  Having these drivers as built-ins ensures that the TPM is available before
  the IMA driver initializes.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-6.2/+bug/2037688/+subscriptions




[Kernel-packages] [Bug 2026776] Re: arm64+ast2600: No Output from BMC's VGA port

2023-08-30 Thread Ian May
** Changed in: linux (Ubuntu Jammy)
   Status: New => Fix Committed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-hwe-6.2 in Ubuntu.
https://bugs.launchpad.net/bugs/2026776

Title:
  arm64+ast2600: No Output from BMC's VGA port

Status in linux package in Ubuntu:
  Triaged
Status in linux-hwe-5.19 package in Ubuntu:
  Won't Fix
Status in linux-hwe-6.2 package in Ubuntu:
  Fix Committed
Status in linux source package in Jammy:
  Fix Committed
Status in linux-hwe-5.19 source package in Jammy:
  Won't Fix
Status in linux-hwe-6.2 source package in Jammy:
  Fix Committed
Status in linux source package in Lunar:
  Fix Committed

Bug description:
  SRU Justification:

  [ Impact ]

  On systems that have the following combination of hardware

  1) arm64 CPU 
  2) ASPEED AST2600 BMC: https://www.aspeedtech.com/server_ast2600/

  No output when connecting a display to the BMC's VGA port.

  [ Fix ]

  For AST2500+ MMIO should be enabled by default.

  [ Test Plan ]

  Test on targeted hardware to make sure BMC is displaying output.

  [ Where problems could occur ]

  Not aware of any potential problems, but any should be confined to
  ASPEED AST2500+ hardware.

  [ Other Info ]

  Patch is already in jammy/nvidia-5.19 and jammy/nvidia-6.2 and has
  been tested with affected BMC.


  
  [Issue]

  On systems that have the following combination of hardware...:

  1) arm64 CPU
  2) ASPEED AST2600 BMC: https://www.aspeedtech.com/server_ast2600/

  .. we see no output when connecting a display to the BMC's VGA port.

  Upon further investigation, we see that applying the following patch
  fixes this issue:
  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/gpu/drm/ast?h=v6.4-rc6&id=4327a6137ed43a091d900b1ac833345d60f32228

  [Action]

  Please apply the following two backports to the appropriate Ubuntu HWE
  kernels:

  
https://github.com/NVIDIA-BaseOS-6/linux-nvidia-5.19/commit/055c9ec3739d7df1179db3ba054b00f3dd684560
  
https://github.com/NVIDIA-BaseOS-6/linux-nvidia-6.2/commit/8ab3253c6a59eee3424fe0c60b1fc6dc9f2d73b7

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2026776/+subscriptions




[Kernel-packages] [Bug 2026776] Re: arm64+ast2600: No Output from BMC's VGA port

2023-08-29 Thread Ian May
** Also affects: linux (Ubuntu Jammy)
   Importance: Undecided
   Status: New

** Also affects: linux-hwe-5.19 (Ubuntu Jammy)
   Importance: Undecided
   Status: New

** Also affects: linux-hwe-6.2 (Ubuntu Jammy)
   Importance: Undecided
   Status: New

** Changed in: linux-hwe-5.19 (Ubuntu Jammy)
   Status: New => Fix Committed

** Changed in: linux-hwe-6.2 (Ubuntu Jammy)
   Status: New => Fix Committed

** Changed in: linux-hwe-5.19 (Ubuntu Jammy)
   Status: Fix Committed => Won't Fix

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-hwe-6.2 in Ubuntu.
https://bugs.launchpad.net/bugs/2026776

Title:
  arm64+ast2600: No Output from BMC's VGA port

Status in linux package in Ubuntu:
  Triaged
Status in linux-hwe-5.19 package in Ubuntu:
  Won't Fix
Status in linux-hwe-6.2 package in Ubuntu:
  Fix Committed
Status in linux source package in Jammy:
  New
Status in linux-hwe-5.19 source package in Jammy:
  Won't Fix
Status in linux-hwe-6.2 source package in Jammy:
  Fix Committed
Status in linux source package in Lunar:
  Fix Committed

Bug description:
  SRU Justification:

  [ Impact ]

  On systems that have the following combination of hardware

  1) arm64 CPU 
  2) ASPEED AST2600 BMC: https://www.aspeedtech.com/server_ast2600/

  No output when connecting a display to the BMC's VGA port.

  [ Fix ]

  For AST2500+ MMIO should be enabled by default.

  [ Test Plan ]

  Test on targeted hardware to make sure BMC is displaying output.

  [ Where problems could occur ]

  Not aware of any potential problems, but any should be confined to
  ASPEED AST2500+ hardware.

  [ Other Info ]

  Patch is already in jammy/nvidia-5.19 and jammy/nvidia-6.2 and has
  been tested with affected BMC.


  
  [Issue]

  On systems that have the following combination of hardware...:

  1) arm64 CPU
  2) ASPEED AST2600 BMC: https://www.aspeedtech.com/server_ast2600/

  .. we see no output when connecting a display to the BMC's VGA port.

  Upon further investigation, we see that applying the following patch
  fixes this issue:
  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/gpu/drm/ast?h=v6.4-rc6&id=4327a6137ed43a091d900b1ac833345d60f32228

  [Action]

  Please apply the following two backports to the appropriate Ubuntu HWE
  kernels:

  
https://github.com/NVIDIA-BaseOS-6/linux-nvidia-5.19/commit/055c9ec3739d7df1179db3ba054b00f3dd684560
  
https://github.com/NVIDIA-BaseOS-6/linux-nvidia-6.2/commit/8ab3253c6a59eee3424fe0c60b1fc6dc9f2d73b7

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2026776/+subscriptions




[Kernel-packages] [Bug 2026776] Re: arm64+ast2600: No Output from BMC's VGA port

2023-08-29 Thread Ian May
** Changed in: linux-hwe-6.2 (Ubuntu)
   Status: New => Incomplete

** Changed in: linux-hwe-6.2 (Ubuntu)
   Status: Incomplete => Fix Committed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-hwe-6.2 in Ubuntu.
https://bugs.launchpad.net/bugs/2026776

Title:
  arm64+ast2600: No Output from BMC's VGA port

Status in linux package in Ubuntu:
  Triaged
Status in linux-hwe-5.19 package in Ubuntu:
  Won't Fix
Status in linux-hwe-6.2 package in Ubuntu:
  Fix Committed
Status in linux source package in Lunar:
  Fix Committed

Bug description:
  SRU Justification:

  [ Impact ]

  On systems that have the following combination of hardware

  1) arm64 CPU 
  2) ASPEED AST2600 BMC: https://www.aspeedtech.com/server_ast2600/

  No output when connecting a display to the BMC's VGA port.

  [ Fix ]

  For AST2500+ MMIO should be enabled by default.

  [ Test Plan ]

  Test on targeted hardware to make sure BMC is displaying output.

  [ Where problems could occur ]

  Not aware of any potential problems, but any should be confined to
  ASPEED AST2500+ hardware.

  [ Other Info ]

  Patch is already in jammy/nvidia-5.19 and jammy/nvidia-6.2 and has
  been tested with affected BMC.


  
  [Issue]

  On systems that have the following combination of hardware...:

  1) arm64 CPU
  2) ASPEED AST2600 BMC: https://www.aspeedtech.com/server_ast2600/

  .. we see no output when connecting a display to the BMC's VGA port.

  Upon further investigation, we see that applying the following patch
  fixes this issue:
  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/gpu/drm/ast?h=v6.4-rc6&id=4327a6137ed43a091d900b1ac833345d60f32228

  [Action]

  Please apply the following two backports to the appropriate Ubuntu HWE
  kernels:

  
https://github.com/NVIDIA-BaseOS-6/linux-nvidia-5.19/commit/055c9ec3739d7df1179db3ba054b00f3dd684560
  
https://github.com/NVIDIA-BaseOS-6/linux-nvidia-6.2/commit/8ab3253c6a59eee3424fe0c60b1fc6dc9f2d73b7

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2026776/+subscriptions




[Kernel-packages] [Bug 1982519] Re: GDS: Add NFS patches to optimized kernel

2023-08-28 Thread Ian May
** Changed in: linux-nvidia-5.19 (Ubuntu Jammy)
   Status: New => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-nvidia in Ubuntu.
https://bugs.launchpad.net/bugs/1982519

Title:
  GDS: Add NFS patches to optimized kernel

Status in linux-nvidia package in Ubuntu:
  New
Status in linux-nvidia-5.19 package in Ubuntu:
  New
Status in linux-nvidia-6.2 package in Ubuntu:
  New
Status in linux-nvidia source package in Jammy:
  Fix Released
Status in linux-nvidia-5.19 source package in Jammy:
  Fix Released
Status in linux-nvidia-6.2 source package in Jammy:
  Fix Released

Bug description:
   [Impact]
  Adding these changes will enable GDS functionality in the NFS drivers.

   [Fix]
  This is not a fix but a new feature being added to the NFS driver.

   [Test]
  
  Tested the NFS driver on an HPE system as I did not have a setup with BASEOS6.
   1) Installed the 5.15.39 kernel on the system (this is the kernel version
      that the optimized kernel is currently on).
   2) Downloaded the optimized kernel.
   3) Applied the patches to the optimized kernel.
   4) Replaced the NFS modules on the system with the ones built on the
      optimized kernel.
   5) Ran GDS and compat mode tests on an NFS mount with the patched NFS
      driver. All tests went fine.
   
  Attaching the results

  Compat mode tests
  ==
  **
  API Tests, : 72 /  72 tests passed
  **
  Testsuite : 211 / 211 tests passed
  done tests:Thu Jul 21 08:27:58 PM UTC 2022

  GDS mode tests
  ==
  **
  NVFS IOCTL negative Tests, : 23 /  23 tests passed
  **
  Testsuite : 249 / 249 tests passed
  End: nvidia-fs:
  GDS Version: 1.4.0.31
  NVFS statistics(ver: 4.0)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-nvidia/+bug/1982519/+subscriptions




[Kernel-packages] [Bug 1982519] Re: GDS: Add NFS patches to optimized kernel

2023-08-28 Thread Ian May
** Changed in: linux-nvidia-6.2 (Ubuntu Jammy)
   Status: New => Fix Released

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-nvidia in Ubuntu.
https://bugs.launchpad.net/bugs/1982519

Title:
  GDS: Add NFS patches to optimized kernel

Status in linux-nvidia package in Ubuntu:
  New
Status in linux-nvidia-5.19 package in Ubuntu:
  New
Status in linux-nvidia-6.2 package in Ubuntu:
  New
Status in linux-nvidia source package in Jammy:
  Fix Released
Status in linux-nvidia-5.19 source package in Jammy:
  New
Status in linux-nvidia-6.2 source package in Jammy:
  Fix Released

Bug description:
   [Impact]
  Adding these changes will enable GDS functionality in the NFS drivers.

   [Fix]
  This is not a fix but a new feature being added to the NFS driver.

   [Test]
  
  Tested the NFS driver on an HPE system as I did not have a setup with BASEOS6.
   1) Installed the 5.15.39 kernel on the system (this is the kernel version
      that the optimized kernel is currently on).
   2) Downloaded the optimized kernel.
   3) Applied the patches to the optimized kernel.
   4) Replaced the NFS modules on the system with the ones built on the
      optimized kernel.
   5) Ran GDS and compat mode tests on an NFS mount with the patched NFS
      driver. All tests went fine.
   
  Attaching the results

  Compat mode tests
  ==
  **
  API Tests, : 72 /  72 tests passed
  **
  Testsuite : 211 / 211 tests passed
  done tests:Thu Jul 21 08:27:58 PM UTC 2022

  GDS mode tests
  ==
  **
  NVFS IOCTL negative Tests, : 23 /  23 tests passed
  **
  Testsuite : 249 / 249 tests passed
  End: nvidia-fs:
  GDS Version: 1.4.0.31
  NVFS statistics(ver: 4.0)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-nvidia/+bug/1982519/+subscriptions




[Kernel-packages] [Bug 2026776] Re: arm64+ast2600: No Output from BMC's VGA port

2023-07-20 Thread Ian May
** Also affects: linux (Ubuntu Lunar)
   Importance: Undecided
   Status: New

** Also affects: linux-hwe-5.19 (Ubuntu Lunar)
   Importance: Undecided
   Status: New

** Also affects: linux-hwe-6.2 (Ubuntu Lunar)
   Importance: Undecided
   Status: New

** No longer affects: linux-hwe-5.19 (Ubuntu Lunar)

** No longer affects: linux-hwe-6.2 (Ubuntu Lunar)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2026776

Title:
  arm64+ast2600: No Output from BMC's VGA port

Status in linux package in Ubuntu:
  New
Status in linux-hwe-5.19 package in Ubuntu:
  Won't Fix
Status in linux-hwe-6.2 package in Ubuntu:
  New
Status in linux source package in Lunar:
  New

Bug description:
  SRU Justification:

  [ Impact ]

  On systems that have the following combination of hardware

  1) arm64 CPU 
  2) ASPEED AST2600 BMC: https://www.aspeedtech.com/server_ast2600/

  No output when connecting a display to the BMC's VGA port.

  [ Fix ]

  For AST2500+ MMIO should be enabled by default.

  [ Test Plan ]

  Test on targeted hardware to make sure BMC is displaying output.

  [ Where problems could occur ]

  Not aware of any potential problems, but any should be confined to
  ASPEED AST2500+ hardware.

  [ Other Info ]

  Patch is already in jammy/nvidia-5.19 and jammy/nvidia-6.2 and has
  been tested with affected BMC.


  
  [Issue]

  On systems that have the following combination of hardware...:

  1) arm64 CPU
  2) ASPEED AST2600 BMC: https://www.aspeedtech.com/server_ast2600/

  .. we see no output when connecting a display to the BMC's VGA port.

  Upon further investigation, we see that applying the following patch
  fixes this issue:
  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/gpu/drm/ast?h=v6.4-rc6&id=4327a6137ed43a091d900b1ac833345d60f32228

  [Action]

  Please apply the following two backports to the appropriate Ubuntu HWE
  kernels:

  
https://github.com/NVIDIA-BaseOS-6/linux-nvidia-5.19/commit/055c9ec3739d7df1179db3ba054b00f3dd684560
  
https://github.com/NVIDIA-BaseOS-6/linux-nvidia-6.2/commit/8ab3253c6a59eee3424fe0c60b1fc6dc9f2d73b7

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2026776/+subscriptions




[Kernel-packages] [Bug 2026776] Re: arm64+ast2600: No Output from BMC's VGA port

2023-07-19 Thread Ian May
With Kinetic going EOL, there will be no further SRU updates for
linux-hwe-5.19.

** Description changed:

+ SRU Justification:
+ 
+ [ Impact ]
+ 
+ On systems that have the following combination of hardware
+ 
+ 1) arm64 CPU 
+ 2) ASPEED AST2600 BMC: https://www.aspeedtech.com/server_ast2600/
+ 
+ No output when connecting a display to the BMC's VGA port.
+ 
+ [ Fix ]
+ 
+ For AST2500+ MMIO should be enabled by default.
+ 
+ [ Test Plan ]
+ 
+ Test on targeted hardware to make sure BMC is displaying output.
+ 
+ [ Where problems could occur ]
+ 
+ Not aware of any potential problems, but any should be confined to
+ ASPEED AST2500+ hardware.
+ 
+ [ Other Info ]
+ 
+ Patch is already in jammy/nvidia-5.19 and jammy/nvidia-6.2 and has been
+ tested with affected BMC.
+ 
+ 
+ 
  [Issue]
  
  On systems that have the following combination of hardware...:
  
  1) arm64 CPU
  2) ASPEED AST2600 BMC: https://www.aspeedtech.com/server_ast2600/
  
  .. we see no output when connecting a display to the BMC's VGA port.
  
  Upon further investigation, we see that applying the following patch
  fixes this issue:
  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/gpu/drm/ast?h=v6.4-rc6&id=4327a6137ed43a091d900b1ac833345d60f32228
  
  [Action]
  
  Please apply the following two backports to the appropriate Ubuntu HWE
  kernels:
  
  
https://github.com/NVIDIA-BaseOS-6/linux-nvidia-5.19/commit/055c9ec3739d7df1179db3ba054b00f3dd684560
  
https://github.com/NVIDIA-BaseOS-6/linux-nvidia-6.2/commit/8ab3253c6a59eee3424fe0c60b1fc6dc9f2d73b7

** Changed in: linux-hwe-5.19 (Ubuntu)
   Status: New => Won't Fix

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/2026776

Title:
  arm64+ast2600: No Output from BMC's VGA port

Status in linux package in Ubuntu:
  New
Status in linux-hwe-5.19 package in Ubuntu:
  Won't Fix
Status in linux-hwe-6.2 package in Ubuntu:
  New

Bug description:
  SRU Justification:

  [ Impact ]

  On systems that have the following combination of hardware

  1) arm64 CPU 
  2) ASPEED AST2600 BMC: https://www.aspeedtech.com/server_ast2600/

  No output when connecting a display to the BMC's VGA port.

  [ Fix ]

  For AST2500+ MMIO should be enabled by default.

  [ Test Plan ]

  Test on targeted hardware to make sure BMC is displaying output.

  [ Where problems could occur ]

  Not aware of any potential problems, but any should be confined to
  ASPEED AST2500+ hardware.

  [ Other Info ]

  Patch is already in jammy/nvidia-5.19 and jammy/nvidia-6.2 and has
  been tested with affected BMC.


  
  [Issue]

  On systems that have the following combination of hardware...:

  1) arm64 CPU
  2) ASPEED AST2600 BMC: https://www.aspeedtech.com/server_ast2600/

  .. we see no output when connecting a display to the BMC's VGA port.

  Upon further investigation, we see that applying the following patch
  fixes this issue:
  
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/gpu/drm/ast?h=v6.4-rc6&id=4327a6137ed43a091d900b1ac833345d60f32228

  [Action]

  Please apply the following two backports to the appropriate Ubuntu HWE
  kernels:

  
https://github.com/NVIDIA-BaseOS-6/linux-nvidia-5.19/commit/055c9ec3739d7df1179db3ba054b00f3dd684560
  
https://github.com/NVIDIA-BaseOS-6/linux-nvidia-6.2/commit/8ab3253c6a59eee3424fe0c60b1fc6dc9f2d73b7

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2026776/+subscriptions




[Kernel-packages] [Bug 2019240] Re: Pull-request to address a number of enablement issues for Orin platforms

2023-05-11 Thread Ian May
Changing Package to linux-nvidia-tegra

** Package changed: linux-nvidia (Ubuntu) => linux-nvidia-tegra (Ubuntu)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-nvidia-tegra in Ubuntu.
https://bugs.launchpad.net/bugs/2019240

Title:
  Pull-request to address a number of enablement issues for Orin
  platforms

Status in linux-nvidia-tegra package in Ubuntu:
  New

Bug description:
  [impact]
  This patch set addresses a wide variety of bugs and missing features for
  NVIDIA Orin platforms.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-tegra/+bug/2019240/+subscriptions




[Kernel-packages] [Bug 1999082] [NEW] linux-modules-nvidia-510-server fails to install

2022-12-07 Thread Ian May
Public bug reported:

$ sudo apt install linux-modules-nvidia-510-server-$(uname -r)
Reading package lists... Done
Building dependency tree   
Reading state information... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
 linux-modules-nvidia-510-server-5.4.0-135-generic : Depends: linux-signatures-nvidia-5.4.0-135-generic (= 5.4.0-135.152) but 5.4.0-135.152+1 is to be installed
                                                     Depends: nvidia-kernel-common-510-server (<= 510.85.02-1) but it is not going to be installed
                                                     Depends: nvidia-kernel-common-510-server (>= 510.85.02) but it is not going to be installed
E: Unable to correct problems, you have held broken packages.


$ lsb_release -rd
Description:	Ubuntu 20.04.5 LTS
Release:	20.04
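
The first unmet dependency is an exact-version pin: the modules package
requires linux-signatures-nvidia at exactly 5.4.0-135.152, while apt's
candidate is 5.4.0-135.152+1. A minimal sketch for confirming this with
dpkg's own version comparison follows; `satisfies_exact` is a hypothetical
helper written for this example, not an apt or dpkg command.

```shell
#!/bin/sh
# Hypothetical helper: report whether a candidate version satisfies an
# exact "(= X)" dependency, using dpkg's version comparison rules.
satisfies_exact() {
    # $1 = version apt wants to install, $2 = version the dependency pins
    if dpkg --compare-versions "$1" eq "$2"; then
        echo "satisfied"
    else
        echo "unsatisfied"
    fi
}

# Values copied from the apt output above.
satisfies_exact "5.4.0-135.152+1" "5.4.0-135.152"

# Where apt is available, show which versions the archive currently offers
# for the packages named in the error.
if command -v apt-cache >/dev/null 2>&1; then
    apt-cache policy "linux-signatures-nvidia-$(uname -r)" \
        nvidia-kernel-common-510-server || true
fi
```

The `apt-cache policy` output lists the installed and candidate versions,
which helps show which side of the version mismatch needs to be updated.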

** Affects: nvidia-graphics-drivers-510-server (Ubuntu)
 Importance: Undecided
 Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to nvidia-graphics-drivers-510-server in
Ubuntu.
https://bugs.launchpad.net/bugs/1999082

Title:
  linux-modules-nvidia-510-server fails to install

Status in nvidia-graphics-drivers-510-server package in Ubuntu:
  New

Bug description:
  $ sudo apt install linux-modules-nvidia-510-server-$(uname -r)
  Reading package lists... Done
  Building dependency tree   
  Reading state information... Done
  Some packages could not be installed. This may mean that you have
  requested an impossible situation or if you are using the unstable
  distribution that some required packages have not yet been created
  or been moved out of Incoming.
  The following information may help to resolve the situation:

  The following packages have unmet dependencies:
   linux-modules-nvidia-510-server-5.4.0-135-generic : Depends: linux-signatures-nvidia-5.4.0-135-generic (= 5.4.0-135.152) but 5.4.0-135.152+1 is to be installed
                                                       Depends: nvidia-kernel-common-510-server (<= 510.85.02-1) but it is not going to be installed
                                                       Depends: nvidia-kernel-common-510-server (>= 510.85.02) but it is not going to be installed
  E: Unable to correct problems, you have held broken packages.

  
  $ lsb_release -rd
  Description:  Ubuntu 20.04.5 LTS
  Release:  20.04

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/nvidia-graphics-drivers-510-server/+bug/1999082/+subscriptions




[Kernel-packages] [Bug 1975509] Re: Update to the 510.73.08 ERD NVIDIA driver series in Bionic, Focal, Impish, Jammy, and Kinetic

2022-06-14 Thread Ian May
** Changed in: fabric-manager-510 (Ubuntu Bionic)
   Status: New => Fix Committed

** Changed in: fabric-manager-510 (Ubuntu Focal)
   Status: New => Fix Committed

** Changed in: fabric-manager-510 (Ubuntu Impish)
   Status: New => Fix Committed

** Changed in: fabric-manager-510 (Ubuntu Jammy)
   Status: New => Fix Committed

** Changed in: fabric-manager-510 (Ubuntu Kinetic)
   Status: New => Fix Committed

** Changed in: libnvidia-nscq-510 (Ubuntu Bionic)
   Status: New => Fix Committed

** Changed in: libnvidia-nscq-510 (Ubuntu Focal)
   Status: New => Fix Committed

** Changed in: libnvidia-nscq-510 (Ubuntu Impish)
   Status: New => Fix Committed

** Changed in: libnvidia-nscq-510 (Ubuntu Jammy)
   Status: New => Fix Committed

** Changed in: libnvidia-nscq-510 (Ubuntu Kinetic)
   Status: New => Fix Committed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-restricted-modules in Ubuntu.
https://bugs.launchpad.net/bugs/1975509

Title:
  Update to the 510.73.08 ERD NVIDIA driver series in Bionic, Focal,
  Impish, Jammy, and Kinetic

Status in fabric-manager-510 package in Ubuntu:
  Fix Committed
Status in libnvidia-nscq-510 package in Ubuntu:
  Fix Committed
Status in linux-restricted-modules package in Ubuntu:
  Confirmed
Status in nvidia-graphics-drivers-510-server package in Ubuntu:
  Fix Committed
Status in fabric-manager-510 source package in Bionic:
  Fix Committed
Status in libnvidia-nscq-510 source package in Bionic:
  Fix Committed
Status in linux-restricted-modules source package in Bionic:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Bionic:
  Fix Released
Status in fabric-manager-510 source package in Focal:
  Fix Committed
Status in libnvidia-nscq-510 source package in Focal:
  Fix Committed
Status in linux-restricted-modules source package in Focal:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Focal:
  Fix Released
Status in fabric-manager-510 source package in Impish:
  Fix Committed
Status in libnvidia-nscq-510 source package in Impish:
  Fix Committed
Status in linux-restricted-modules source package in Impish:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Impish:
  Fix Released
Status in fabric-manager-510 source package in Jammy:
  Fix Committed
Status in libnvidia-nscq-510 source package in Jammy:
  Fix Committed
Status in linux-restricted-modules source package in Jammy:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Jammy:
  Fix Released
Status in fabric-manager-510 source package in Kinetic:
  Fix Committed
Status in libnvidia-nscq-510 source package in Kinetic:
  Fix Committed
Status in linux-restricted-modules source package in Kinetic:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Kinetic:
  Fix Committed

Bug description:
  [Impact]
  These releases provide both bug fixes and new features, and we would like to
  make sure all of our users have access to these improvements.

  See the changelog entry below for a full list of changes and bugs.

  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates

  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

  The QA team that executed the tests will be in charge of attaching the
  artifacts and console output of the appropriate run to the bug. nVidia
  maintainers team members will not mark ‘verification-done’ until this
  has happened.

  [Regression Potential]
  In order to mitigate the regression potential, the results of the
  aforementioned system level tests are attached to this bug.

  [Discussion]

  [Changelog]

  === 510 kinetic/jammy/impish/focal/bionic ===

    * New upstream release (LP: #1975509):
      - When calculating the address of grid barrier allocated for a CUDA
        stream, there was an off-by-one error. The address calculation is
        corrected in this release.
      - An issue that caused an AC cycle test to fail with "AssertionError:
        NVLink links with inappropriate status found" is resolved.
      - An issue that caused NX 11 to become nonresponsive during a graphics
        operation is resolved.
      - Linking issues were observed when using libnvfm.so. Now and other
        depend tools use dynamic linking with libstdc++ and libgcc.
      - An intermittent error CUDA_ERROR_NVLINK_UNCORRECTABLE caused by some
        non-fatal nvlink interrupts is resolved.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/fabric-manager-510/+bug/1975509/+subscriptions




[Kernel-packages] [Bug 1976425] Re: Release of nvidia-graphics-drivers UDA/ERD 515.48.07 for Bionic, Focal, Impish, Jammy, Kinetic

2022-06-09 Thread Ian May
** Summary changed:

- Release of nvidia-graphics-drivers 515.48.07 for Bionic, Focal, Impish, Jammy, Kinetic
+ Release of nvidia-graphics-drivers UDA/ERD 515.48.07 for Bionic, Focal, Impish, Jammy, Kinetic

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-restricted-modules in Ubuntu.
https://bugs.launchpad.net/bugs/1976425

Title:
  Release of nvidia-graphics-drivers UDA/ERD 515.48.07 for Bionic,
  Focal, Impish, Jammy, Kinetic

Status in linux-restricted-modules package in Ubuntu:
  New
Status in linux-restricted-modules source package in Bionic:
  New
Status in linux-restricted-modules source package in Focal:
  New
Status in linux-restricted-modules source package in Impish:
  New
Status in linux-restricted-modules source package in Jammy:
  New
Status in linux-restricted-modules source package in Kinetic:
  New

Bug description:
  [Impact]
  These releases provide both bug fixes and new features, and we would like to
  make sure all of our users have access to these improvements.

  See the changelog entry below for a full list of changes and bugs.

  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates

  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

  The QA team that executed the tests will be in charge of attaching the
  artifacts and console output of the appropriate run to the bug. Nvidia
  maintainers team members will not mark ‘verification-done’ until this
  has happened.

  [Regression Potential]
  In order to mitigate the regression potential, the results of the
  aforementioned system level tests are attached to this bug.

  
  [Changelog]


  RELEASE HIGHLIGHTS

  Published the source code to a variant of the NVIDIA Linux kernel modules
  dual-licensed as MIT/GPLv2. The source is available here:
  https://github.com/NVIDIA/open-gpu-kernel-modules
  and will be updated with each driver release. Please see the "Open Linux
  Kernel Modules" chapter in the README for details.

  Added support for the VK_EXT_external_memory_dma_buf and
  VK_EXT_image_drm_format_modifier Vulkan extensions. To use this
  functionality, the nvidia-drm kernel module must be loaded with DRM
  KMS mode setting enabled. See the DRM KMS section of the README for
  guidance on enabling mode setting.
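
As a rough illustration of enabling mode setting, the nvidia-drm module
accepts a `modeset=1` parameter, which can be set from a modprobe
configuration file. The sketch below assumes the file name
nvidia-drm-modeset.conf and writes to a scratch directory unless
MODPROBE_DIR is set (both are choices made for this example); the README
remains the authoritative reference.

```shell
#!/bin/sh
# The directory is overridable so the sketch can be tried without root.
# On a real system it would be /etc/modprobe.d, written as root.
MODPROBE_DIR="${MODPROBE_DIR:-$(mktemp -d)}"
conf="$MODPROBE_DIR/nvidia-drm-modeset.conf"   # file name is an assumption

# Ask modprobe to load nvidia-drm with DRM KMS mode setting enabled.
printf 'options nvidia-drm modeset=1\n' > "$conf"
cat "$conf"

# Once the module has been reloaded (or after a reboot), the live value
# can be read back from sysfs if the module is present.
if [ -r /sys/module/nvidia_drm/parameters/modeset ]; then
    cat /sys/module/nvidia_drm/parameters/modeset
fi
```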

  Changed nvidia-suspend.service, nvidia-resume.service, and
  nvidia-hibernate.service to use WantedBy= rather than RequiredBy=
  dependencies for systemd-suspend.service and systemd-hibernate.service.
  This avoids a problem where suspend or hibernate fails if the NVIDIA
  driver is uninstalled without disabling these services first.
  See https://github.com/systemd/systemd/issues/21991
  If these services were manually enabled, it may be necessary to update
  their dependencies by running:

    sudo systemctl reenable nvidia-suspend.service nvidia-resume.service nvidia-hibernate.service

  Interlaced modes are now disabled when active stereo is enabled.

  NVIDIA X Server Settings will now display the quit confirmation dialog
  automatically only if there are pending changes that need to be
  manually saved. The corresponding configuration option to control the
  appearance of the quit dialog was thus also removed.

  Removed the warning message about mismatches between the compiler used
  to build the Linux kernel and the compiler used to build the NVIDIA
  kernel modules from nvidia-installer. Modern compilers are less likely
  to cause problems when this type of mismatch occurs, and it has become
  common in many distributions to build the Linux kernel with a
  different compiler than the default system compiler.

  Updated nvidia-installer to skip test-loading the kernel modules on
  systems where no supported NVIDIA GPUs are detected.

  Updated nvidia-installer to avoid a race condition which could cause the
  kernel module test load to fail due to udev automatically loading kernel
  modules left over from an existing NVIDIA driver installation. This
  failure resulted in an installation error message "Kernel module load
  error: File exists".

  Updated the RTD3 Video Memory Utilization Threshold
  (NVreg_DynamicPowerManagementVideoMemoryThreshold) maximum value from
  200 MB to 1024 MB.
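
For illustration, the threshold is a parameter of the nvidia kernel module
and, per the change above, now accepts values up to 1024 (MB). The helper
below is a hypothetical sketch that range-checks a value and prints the
corresponding modprobe options line; where that line is stored (for
example a file under /etc/modprobe.d/) is left to the administrator.

```shell
#!/bin/sh
# Hypothetical helper: validate a threshold in MB against the new 1024 MB
# maximum and print the matching modprobe "options" line.
rtd3_threshold_line() {
    mb="$1"
    case "$mb" in
        ''|*[!0-9]*)
            echo "not a non-negative integer: $mb" >&2
            return 1
            ;;
    esac
    if [ "$mb" -gt 1024 ]; then
        echo "above the 1024 MB maximum: $mb" >&2
        return 1
    fi
    printf 'options nvidia NVreg_DynamicPowerManagementVideoMemoryThreshold=%s\n' "$mb"
}

# Example: a value that was rejected under the old 200 MB maximum.
rtd3_threshold_line 512
```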

  Improved performance of GLX and Vulkan applications running in gamescope.

  Added a "kernelopen" feature tag to the supported-gpus.json file, to
  indicate which GPUs are compatible with open-gpu-kernel-modules.

  Improved Vulkan swapchain creation failure reporting. Applications can
  use the VK_EXT_debug_utils extension to receive additional information
  when an error was encountered in vkCreateSwapchainKHR().

  Added a new configuration option for NVIDIA NGX to allow disabling the DSO
  signature check. See the "NGX" chapter of the README for more information.

  Fixed an issue where HDMI audio output was not working in some cases,
  especially

[Kernel-packages] [Bug 1976425] [NEW] Release of nvidia-graphics-drivers 515.48.07 for Bionic, Focal, Impish, Jammy, Kinetic

2022-05-31 Thread Ian May
Public bug reported:

[Impact]
These releases provide both bug fixes and new features, and we would like to 
make sure all of our users have access to these improvements.

See the changelog entry below for a full list of changes and bugs.

[Test Case]
The following development and SRU process was followed:
https://wiki.ubuntu.com/NVidiaUpdates

Certification test suite must pass on a range of hardware:
https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

The QA team that executed the tests will be in charge of attaching the
artifacts and console output of the appropriate run to the bug. Nvidia
maintainers team members will not mark ‘verification-done’ until this
has happened.

[Regression Potential]
In order to mitigate the regression potential, the results of the
aforementioned system level tests are attached to this bug.


[Changelog]


RELEASE HIGHLIGHTS

Published the source code to a variant of the NVIDIA Linux kernel modules
dual-licensed as MIT/GPLv2. The source is available here:
https://github.com/NVIDIA/open-gpu-kernel-modules
and will be updated with each driver release. Please see the "Open Linux
Kernel Modules" chapter in the README for details.

Added support for the VK_EXT_external_memory_dma_buf and
VK_EXT_image_drm_format_modifier Vulkan extensions. To use this
functionality, the nvidia-drm kernel module must be loaded with DRM KMS
mode setting enabled. See the DRM KMS section of the README for guidance
on enabling mode setting.

Changed nvidia-suspend.service, nvidia-resume.service, and
nvidia-hibernate.service to use WantedBy= rather than RequiredBy=
dependencies for systemd-suspend.service and systemd-hibernate.service.
This avoids a problem where suspend or hibernate fails if the NVIDIA
driver is uninstalled without disabling these services first.
See https://github.com/systemd/systemd/issues/21991
If these services were manually enabled, it may be necessary to update
their dependencies by running:

  sudo systemctl reenable nvidia-suspend.service nvidia-resume.service nvidia-hibernate.service

Interlaced modes are now disabled when active stereo is enabled.

NVIDIA X Server Settings will now display the quit confirmation dialog
automatically only if there are pending changes that need to be manually
saved. The corresponding configuration option to control the appearance
of the quit dialog was thus also removed.

Removed the warning message about mismatches between the compiler used
to build the Linux kernel and the compiler used to build the NVIDIA
kernel modules from nvidia-installer. Modern compilers are less likely
to cause problems when this type of mismatch occurs, and it has become
common in many distributions to build the Linux kernel with a different
compiler than the default system compiler.

Updated nvidia-installer to skip test-loading the kernel modules on systems
where no supported NVIDIA GPUs are detected.

Updated nvidia-installer to avoid a race condition which could cause the kernel
module test load to fail due to udev automatically loading kernel modules left
over from an existing NVIDIA driver installation. This failure resulted in an
installation error message "Kernel module load error: File exists".

Updated the RTD3 Video Memory Utilization Threshold
(NVreg_DynamicPowerManagementVideoMemoryThreshold) maximum value from
200 MB to 1024 MB.

Improved performance of GLX and Vulkan applications running in gamescope.

Added a "kernelopen" feature tag to the supported-gpus.json file, to indicate
which GPUs are compatible with open-gpu-kernel-modules.

Improved Vulkan swapchain creation failure reporting. Applications can use the
VK_EXT_debug_utils extension to receive additional information when an error
was encountered in vkCreateSwapchainKHR().

Added a new configuration option for NVIDIA NGX to allow disabling the DSO
signature check. See the "NGX" chapter of the README for more information.

Fixed an issue where HDMI audio output was not working in some cases,
especially with high display refresh rates (120Hz, 100Hz, etc.) using Fixed
Rate Link (FRL) transmission mode.

** Affects: linux-restricted-modules (Ubuntu)
 Importance: Undecided
 Status: New

** Affects: linux-restricted-modules (Ubuntu Bionic)
 Importance: Undecided
 Status: New

** Affects: linux-restricted-modules (Ubuntu Focal)
 Importance: Undecided
 Status: New

** Affects: linux-restricted-modules (Ubuntu Impish)
 Importance: Undecided
 Status: New

** Affects: linux-restricted-modules (Ubuntu Jammy)
 Importance: Undecided
 Status: New

** Affects: linux-restricted-modules (Ubuntu Kinetic)
 Importance: Undecided
 Status: New

** Description changed:

+ RELEASE HIGHLIGHTS
+ 
  Published the source code to a variant of the NVIDIA Linux kernel modules 
dual-licensed as MIT/GPLv2. The source is available here:
  https://github.com/NVIDIA/open-gpu-kernel-modules
  and will be updated each driver release. Please 

[Kernel-packages] [Bug 1975509] Re: Update to the 510.73.08 ERD NVIDIA driver series in Bionic, Focal, Impish, Jammy, and Kinetic

2022-05-25 Thread Ian May
** Also affects: fabric-manager-510 (Ubuntu)
   Importance: Undecided
   Status: New

** Also affects: libnvidia-nscq-510 (Ubuntu)
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-restricted-modules in Ubuntu.
https://bugs.launchpad.net/bugs/1975509

Title:
  Update to the 510.73.08 ERD NVIDIA driver series in Bionic, Focal,
  Impish, Jammy, and Kinetic

Status in fabric-manager-510 package in Ubuntu:
  New
Status in libnvidia-nscq-510 package in Ubuntu:
  New
Status in linux-restricted-modules package in Ubuntu:
  Confirmed
Status in nvidia-graphics-drivers-510-server package in Ubuntu:
  Fix Committed
Status in fabric-manager-510 source package in Bionic:
  New
Status in libnvidia-nscq-510 source package in Bionic:
  New
Status in linux-restricted-modules source package in Bionic:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Bionic:
  Fix Committed
Status in fabric-manager-510 source package in Focal:
  New
Status in libnvidia-nscq-510 source package in Focal:
  New
Status in linux-restricted-modules source package in Focal:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Focal:
  Fix Committed
Status in fabric-manager-510 source package in Impish:
  New
Status in libnvidia-nscq-510 source package in Impish:
  New
Status in linux-restricted-modules source package in Impish:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Impish:
  Fix Committed
Status in fabric-manager-510 source package in Jammy:
  New
Status in libnvidia-nscq-510 source package in Jammy:
  New
Status in linux-restricted-modules source package in Jammy:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Jammy:
  Fix Committed
Status in fabric-manager-510 source package in Kinetic:
  New
Status in libnvidia-nscq-510 source package in Kinetic:
  New
Status in linux-restricted-modules source package in Kinetic:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Kinetic:
  Fix Committed

Bug description:
  [Impact]
  These releases provide both bug fixes and new features, and we would like to
  make sure all of our users have access to these improvements.

  See the changelog entry below for a full list of changes and bugs.

  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates

  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

  The QA team that executed the tests will be in charge of attaching the
  artifacts and console output of the appropriate run to the bug. nVidia
  maintainers team members will not mark ‘verification-done’ until this
  has happened.

  [Regression Potential]
  In order to mitigate the regression potential, the results of the
  aforementioned system level tests are attached to this bug.

  [Discussion]

  [Changelog]

  === 510 kinetic/jammy/impish/focal/bionic ===

    * New upstream release (LP: #1975509):
      - When calculating the address of grid barrier allocated for a CUDA
        stream, there was an off-by-one error. The address calculation is
        corrected in this release.
      - An issue that caused an AC cycle test to fail with "AssertionError:
        NVLink links with inappropriate status found" is resolved.
      - An issue that caused NX 11 to become nonresponsive during a graphics
        operation is resolved.
      - Linking issues were observed when using libnvfm.so. Now and other
        depend tools use dynamic linking with libstdc++ and libgcc.
      - An intermittent error CUDA_ERROR_NVLINK_UNCORRECTABLE caused by some
        non-fatal nvlink interrupts is resolved.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/fabric-manager-510/+bug/1975509/+subscriptions




[Kernel-packages] [Bug 1975509] Re: Update to the 510.73.08 ERD NVIDIA driver series in Bionic, Focal, Impish, Jammy, and Kinetic

2022-05-23 Thread Ian May
** Changed in: nvidia-graphics-drivers-510-server (Ubuntu Bionic)
   Status: Confirmed => In Progress

** Changed in: nvidia-graphics-drivers-510-server (Ubuntu Focal)
   Status: Confirmed => In Progress

** Changed in: nvidia-graphics-drivers-510-server (Ubuntu Impish)
   Status: Confirmed => In Progress

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-restricted-modules in Ubuntu.
https://bugs.launchpad.net/bugs/1975509

Title:
  Update to the 510.73.08 ERD NVIDIA driver series in Bionic, Focal,
  Impish, Jammy, and Kinetic

Status in linux-restricted-modules package in Ubuntu:
  Confirmed
Status in nvidia-graphics-drivers-510-server package in Ubuntu:
  Confirmed
Status in linux-restricted-modules source package in Bionic:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Bionic:
  In Progress
Status in linux-restricted-modules source package in Focal:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Focal:
  In Progress
Status in linux-restricted-modules source package in Impish:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Impish:
  In Progress
Status in linux-restricted-modules source package in Jammy:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Jammy:
  Confirmed
Status in linux-restricted-modules source package in Kinetic:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Kinetic:
  Confirmed

Bug description:
  [Impact]
  These releases provide both bug fixes and new features, and we would like to
  make sure all of our users have access to these improvements.

  See the changelog entry below for a full list of changes and bugs.

  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates

  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

  The QA team that executed the tests will be in charge of attaching the
  artifacts and console output of the appropriate run to the bug. nVidia
  maintainers team members will not mark ‘verification-done’ until this
  has happened.

  [Regression Potential]
  In order to mitigate the regression potential, the results of the
  aforementioned system level tests are attached to this bug.

  [Discussion]

  [Changelog]

  === 510 kinetic/jammy/impish/focal/bionic ===

    * New upstream release (LP: #1975509):
      - When calculating the address of grid barrier allocated for a CUDA
        stream, there was an off-by-one error. The address calculation is
        corrected in this release.
      - An issue that caused an AC cycle test to fail with "AssertionError:
        NVLink links with inappropriate status found" is resolved.
      - An issue that caused NX 11 to become nonresponsive during a graphics
        operation is resolved.
      - Linking issues were observed when using libnvfm.so. Now and other
        depend tools use dynamic linking with libstdc++ and libgcc.
      - An intermittent error CUDA_ERROR_NVLINK_UNCORRECTABLE caused by some
        non-fatal nvlink interrupts is resolved.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-restricted-modules/+bug/1975509/+subscriptions




[Kernel-packages] [Bug 1975509] Re: Update to the 510.73.08 ERD NVIDIA driver series in Bionic, Focal, Impish, Jammy, and Kinetic

2022-05-23 Thread Ian May
** Description changed:

  [Impact]
  These releases provide both bug fixes and new features, and we would like to
  make sure all of our users have access to these improvements.
  
  See the changelog entry below for a full list of changes and bugs.
  
  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates
  
  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu
  
  The QA team that executed the tests will be in charge of attaching the
  artifacts and console output of the appropriate run to the bug. nVidia
  maintainers team members will not mark ‘verification-done’ until this
  has happened.
  
  [Regression Potential]
  In order to mitigate the regression potential, the results of the
  aforementioned system level tests are attached to this bug.
  
  [Discussion]
  
  [Changelog]
  
+ === 510 kinetic/jammy/impish/focal/bionic ===
  
- When calculating the address of grid barrier allocated for a CUDA
- stream, there was an off-by-one error. The address calculation is
- corrected in this release.
- 
- An issue that caused an AC cycle test to fail with "AssertionError: NVLink 
links with inappropriate status found" is resolved.
- An issue that caused NX 11 to become nonresponsive during a graphics 
operation is resolved.
- 
- Linking issues were observed when using libnvfm.so. Now and other depend 
tools use dynamic linking with libstdc++ and libgcc.
- An intermittent error CUDA_ERROR_NVLINK_UNCORRECTABLE caused by some 
non-fatal nvlink interrupts is resolved.
+   * New upstream release (LP: #1975509):
+ - When calculating the address of grid barrier allocated for a CUDA 
+   stream, there was an off-by-one error. The address calculation is 
+   corrected in this release.
+ - An issue that caused an AC cycle test to fail with "AssertionError: 
+   NVLink links with inappropriate status found" is resolved.
+ - An issue that caused NX 11 to become nonresponsive during a graphics 
+   operation is resolved.
+ - Linking issues were observed when using libnvfm.so. Now and other 
+   depend tools use dynamic linking with libstdc++ and libgcc.
+ - An intermittent error CUDA_ERROR_NVLINK_UNCORRECTABLE caused by some
+   non-fatal nvlink interrupts is resolved.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-restricted-modules in Ubuntu.
https://bugs.launchpad.net/bugs/1975509

Title:
  Update to the 510.73.08 ERD NVIDIA driver series in Bionic, Focal,
  Impish, Jammy, and Kinetic

Status in linux-restricted-modules package in Ubuntu:
  Confirmed
Status in nvidia-graphics-drivers-510-server package in Ubuntu:
  Confirmed
Status in linux-restricted-modules source package in Bionic:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Bionic:
  Confirmed
Status in linux-restricted-modules source package in Focal:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Focal:
  Confirmed
Status in linux-restricted-modules source package in Impish:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Impish:
  Confirmed
Status in linux-restricted-modules source package in Jammy:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Jammy:
  Confirmed
Status in linux-restricted-modules source package in Kinetic:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Kinetic:
  Confirmed

Bug description:
  [Impact]
  These releases provide both bug fixes and new features, and we would like to
  make sure all of our users have access to these improvements.

  See the changelog entry below for a full list of changes and bugs.

  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates

  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

  The QA team that executed the tests will be in charge of attaching the
  artifacts and console output of the appropriate run to the bug. nVidia
  maintainers team members will not mark ‘verification-done’ until this
  has happened.

  [Regression Potential]
  In order to mitigate the regression potential, the results of the
  aforementioned system level tests are attached to this bug.

  [Discussion]

  [Changelog]

  === 510 kinetic/jammy/impish/focal/bionic ===

    * New upstream release (LP: #1975509):
      - When calculating the address of grid barrier allocated for a CUDA
        stream, there was an off-by-one error. The address calculation is
        corrected in this release.
      - An issue that caused an AC cycle test to fail with "AssertionError:
        NVLink links with inappropriate status found" is resolved.
      - An issue that caused NX 11 to become nonresponsive during a graphics

[Kernel-packages] [Bug 1975509] Re: Update to the 510.73.08 ERD NVIDIA driver series in Bionic, Focal, Impish, Jammy, and Kinetic

2022-05-23 Thread Ian May
** Changed in: nvidia-graphics-drivers-510-server (Ubuntu Bionic)
 Assignee: (unassigned) => Ian May (ian-may)

** Changed in: nvidia-graphics-drivers-510-server (Ubuntu Focal)
 Assignee: (unassigned) => Ian May (ian-may)

** Changed in: nvidia-graphics-drivers-510-server (Ubuntu Impish)
 Assignee: (unassigned) => Ian May (ian-may)

** Changed in: nvidia-graphics-drivers-510-server (Ubuntu Jammy)
 Assignee: (unassigned) => Ian May (ian-may)

** Changed in: nvidia-graphics-drivers-510-server (Ubuntu Kinetic)
 Assignee: (unassigned) => Ian May (ian-may)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-restricted-modules in Ubuntu.
https://bugs.launchpad.net/bugs/1975509

Title:
  Update to the 510.73.08 ERD NVIDIA driver series in Bionic, Focal,
  Impish, Jammy, and Kinetic

Status in linux-restricted-modules package in Ubuntu:
  Confirmed
Status in nvidia-graphics-drivers-510-server package in Ubuntu:
  Confirmed
Status in linux-restricted-modules source package in Bionic:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Bionic:
  Confirmed
Status in linux-restricted-modules source package in Focal:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Focal:
  Confirmed
Status in linux-restricted-modules source package in Impish:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Impish:
  Confirmed
Status in linux-restricted-modules source package in Jammy:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Jammy:
  Confirmed
Status in linux-restricted-modules source package in Kinetic:
  Confirmed
Status in nvidia-graphics-drivers-510-server source package in Kinetic:
  Confirmed

Bug description:
  [Impact]
  These releases provide both bug fixes and new features, and we would like to
  make sure all of our users have access to these improvements.

  See the changelog entry below for a full list of changes and bugs.

  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates

  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

  The QA team that executed the tests will be in charge of attaching the
  artifacts and console output of the appropriate run to the bug. nVidia
  maintainers team members will not mark ‘verification-done’ until this
  has happened.

  [Regression Potential]
  In order to mitigate the regression potential, the results of the
  aforementioned system level tests are attached to this bug.

  [Discussion]

  [Changelog]


  When calculating the address of grid barrier allocated for a CUDA
  stream, there was an off-by-one error. The address calculation is
  corrected in this release.

  An issue that caused an AC cycle test to fail with "AssertionError:
  NVLink links with inappropriate status found" is resolved.

  An issue that caused NX 11 to become nonresponsive during a graphics
  operation is resolved.

  Linking issues were observed when using libnvfm.so. Now and other
  dependent tools use dynamic linking with libstdc++ and libgcc.

  An intermittent error CUDA_ERROR_NVLINK_UNCORRECTABLE caused by some
  non-fatal NVLink interrupts is resolved.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-restricted-modules/+bug/1975509/+subscriptions



[Kernel-packages] [Bug 1975509] [NEW] Update to the 510.73.08 ERD NVIDIA driver series in Bionic, Focal, Impish, Jammy, and Kinetic

2022-05-23 Thread Ian May
Public bug reported:

[Impact]
These releases provide both bug fixes and new features, and we would like to
make sure all of our users have access to these improvements.

See the changelog entry below for a full list of changes and bugs.

[Test Case]
The following development and SRU process was followed:
https://wiki.ubuntu.com/NVidiaUpdates

Certification test suite must pass on a range of hardware:
https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

The QA team that executed the tests will be in charge of attaching the
artifacts and console output of the appropriate run to the bug. nVidia
maintainers team members will not mark ‘verification-done’ until this
has happened.

[Regression Potential]
In order to mitigate the regression potential, the results of the
aforementioned system level tests are attached to this bug.

[Discussion]

[Changelog]


When calculating the address of grid barrier allocated for a CUDA
stream, there was an off-by-one error. The address calculation is
corrected in this release.

An issue that caused an AC cycle test to fail with "AssertionError:
NVLink links with inappropriate status found" is resolved.

An issue that caused NX 11 to become nonresponsive during a graphics
operation is resolved.

Linking issues were observed when using libnvfm.so. Now and other
dependent tools use dynamic linking with libstdc++ and libgcc.

An intermittent error CUDA_ERROR_NVLINK_UNCORRECTABLE caused by some
non-fatal NVLink interrupts is resolved.

** Affects: linux-restricted-modules (Ubuntu)
 Importance: Undecided
 Status: Confirmed

** Affects: nvidia-graphics-drivers-510-server (Ubuntu)
 Importance: Undecided
 Status: Confirmed

** Affects: linux-restricted-modules (Ubuntu Bionic)
 Importance: Undecided
 Status: Confirmed

** Affects: nvidia-graphics-drivers-510-server (Ubuntu Bionic)
 Importance: Undecided
 Status: Confirmed

** Affects: linux-restricted-modules (Ubuntu Focal)
 Importance: Undecided
 Status: Confirmed

** Affects: nvidia-graphics-drivers-510-server (Ubuntu Focal)
 Importance: Undecided
 Status: Confirmed

** Affects: linux-restricted-modules (Ubuntu Impish)
 Importance: Undecided
 Status: Confirmed

** Affects: nvidia-graphics-drivers-510-server (Ubuntu Impish)
 Importance: Undecided
 Status: Confirmed

** Affects: linux-restricted-modules (Ubuntu Jammy)
 Importance: Undecided
 Status: Confirmed

** Affects: nvidia-graphics-drivers-510-server (Ubuntu Jammy)
 Importance: Undecided
 Status: Confirmed

** Affects: linux-restricted-modules (Ubuntu Kinetic)
 Importance: Undecided
 Status: Confirmed

** Affects: nvidia-graphics-drivers-510-server (Ubuntu Kinetic)
 Importance: Undecided
 Status: Confirmed

** Also affects: nvidia-graphics-drivers-510-server (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Also affects: nvidia-graphics-drivers-510-server (Ubuntu Jammy)
   Importance: Undecided
   Status: New

** Also affects: nvidia-graphics-drivers-510-server (Ubuntu Impish)
   Importance: Undecided
   Status: New

** Also affects: nvidia-graphics-drivers-510-server (Ubuntu Kinetic)
   Importance: Undecided
   Status: New

** Also affects: nvidia-graphics-drivers-510-server (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Also affects: linux-restricted-modules (Ubuntu)
   Importance: Undecided
   Status: New

** Changed in: linux-restricted-modules (Ubuntu Bionic)
   Status: New => Confirmed

** Changed in: linux-restricted-modules (Ubuntu Focal)
   Status: New => Confirmed

** Changed in: linux-restricted-modules (Ubuntu Impish)
   Status: New => Confirmed

** Changed in: linux-restricted-modules (Ubuntu Jammy)
   Status: New => Confirmed

** Changed in: linux-restricted-modules (Ubuntu Kinetic)
   Status: New => Confirmed

** Changed in: nvidia-graphics-drivers-510-server (Ubuntu Bionic)
   Status: New => Confirmed

** Changed in: nvidia-graphics-drivers-510-server (Ubuntu Focal)
   Status: New => Confirmed

** Changed in: nvidia-graphics-drivers-510-server (Ubuntu Impish)
   Status: New => Confirmed

** Changed in: nvidia-graphics-drivers-510-server (Ubuntu Jammy)
   Status: New => Confirmed

** Changed in: nvidia-graphics-drivers-510-server (Ubuntu Kinetic)
   Status: New => Confirmed

** No longer affects: linux-restricted-modules (Ubuntu Kinetic)

** No longer affects: nvidia-graphics-drivers-510-server (Ubuntu Kinetic)

** Also affects: linux-restricted-modules (Ubuntu Kinetic)
   Importance: Undecided
   Status: Confirmed

** Also affects: nvidia-graphics-drivers-510-server (Ubuntu Kinetic)
   Importance: Undecided
   Status: Confirmed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to nvidia-graphics-drivers-510-server in
Ubuntu.

[Kernel-packages] [Bug 1970798] Re: 32 GT/s PCI link speeds reporting "Unknown speed" in sysfs

2022-05-03 Thread Ian May
** Description changed:

  SRU Justification
  
  [Impact]
  
  NVIDIA Collective Communication Library software uses sysfs to report
  performance statistics.  Users have reported entries showing "Unknown
  speed" when they should be reporting "32 GT/s".
  
  Example:
  ""
  
  PCIe 5.0 supports 32 GT/s and is available in the 5.4 kernel, but the
  patches for properly reporting speeds in sysfs are missing.  The
- following upstream patches add the reporting capability.
+ following upstream v5.7 patches add the reporting capability.
  
- https://lore.kernel.org/linux-
- pci/20200229030706.17835-1-helg...@kernel.org/
+ PCI ML submission
+ https://lore.kernel.org/linux-pci/20200229030706.17835-1-helg...@kernel.org/
+ 
+ Upstream Patches
+ https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=9cb3985af63555810bb07de50acdf4170771451d
+ 
+ https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e56faff57f0b39661093c00e0262d4ab9088830e
+ 
+ https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=6348a34dcb98d8e285685a205f2a601817fa2d38
+ 
+ https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=757bfaa2c3515803dde9a6728bbf8c8a3c5f098a
+ 
  
  [Test Plan]
  
  Testing these speeds requires special hardware. A Test kernel with these
  patches applied was provided to the customer and they confirmed the
  proper numbers are reported.
  
  [Where problems could occur]
  
  Changes are for reporting info so chance of problems should be low.  If
  a problem did occur it would be with sysfs or pcie driver misreporting
  speeds.
  
  [Other]
  
  SF-00333784

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1970798

Title:
  32 GT/s PCI link speeds reporting "Unknown speed" in sysfs

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Focal:
  In Progress

Bug description:
  SRU Justification

  [Impact]

  NVIDIA Collective Communications Library (NCCL) software uses sysfs to
  report performance statistics.  Users have reported entries showing
  "Unknown speed" when they should be reporting "32 GT/s".

  Example:
  ""

  PCIe 5.0 supports 32 GT/s and is available in the 5.4 kernel, but the
  patches for properly reporting speeds in sysfs are missing.  The
  following upstream v5.7 patches add the reporting capability.

  PCI ML submission
  https://lore.kernel.org/linux-pci/20200229030706.17835-1-helg...@kernel.org/

  Upstream Patches
  https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=9cb3985af63555810bb07de50acdf4170771451d

  https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=e56faff57f0b39661093c00e0262d4ab9088830e

  https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=6348a34dcb98d8e285685a205f2a601817fa2d38

  https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=757bfaa2c3515803dde9a6728bbf8c8a3c5f098a

  
  [Test Plan]

  Testing these speeds requires special hardware. A test kernel with
  these patches applied was provided to the customer, who confirmed that
  the proper speeds are reported.
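
  A minimal sketch for checking a system before and after installing the
  test kernel, assuming the standard sysfs attributes current_link_speed
  and max_link_speed under /sys/bus/pci/devices. The helper names are
  illustrative and not part of any existing tool; the script only flags
  attributes whose text starts with "Unknown".

```python
#!/usr/bin/env python3
"""List PCI devices whose sysfs link-speed attributes read "Unknown ..."."""
from pathlib import Path

# Standard sysfs location for PCI devices; passed as a parameter so the
# function can also be pointed at a test directory.
SYSFS_PCI = Path("/sys/bus/pci/devices")
ATTRS = ("current_link_speed", "max_link_speed")


def read_attr(dev: Path, name: str):
    """Return the stripped attribute text, or None if it cannot be read."""
    try:
        return (dev / name).read_text().strip()
    except OSError:
        # Attribute missing or unreadable (for example, not a PCIe device).
        return None


def find_unknown_speeds(root: Path = SYSFS_PCI):
    """Return (device, attribute, value) tuples that report an unknown speed."""
    hits = []
    if not root.is_dir():
        return hits
    for dev in sorted(root.iterdir()):
        for attr in ATTRS:
            value = read_attr(dev, attr)
            if value is not None and value.startswith("Unknown"):
                hits.append((dev.name, attr, value))
    return hits


if __name__ == "__main__":
    for name, attr, value in find_unknown_speeds():
        print(f"{name}: {attr} = {value}")
```

  A hit only means the running kernel could not name the speed for that
  device; it does not by itself prove that the link runs at 32 GT/s.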

  [Where problems could occur]

  The changes only affect reporting, so the chance of problems should be
  low. If a problem did occur, it would be sysfs or the PCIe driver
  misreporting speeds.

  [Other]

  SF-00333784

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1970798/+subscriptions



[Kernel-packages] [Bug 1970798] Re: 32 GT/s PCI link speeds reporting "Unknown speed" in sysfs

2022-05-03 Thread Ian May
** Description changed:

  SRU Justification
  
  [Impact]
  
  NVIDIA Collective Communication Library software uses sysfs to report
  performance statistics.  Users have reported entries showing "Unknown
  speed" when they should be reporting "32 GT/s".
  
  Example:
  ""
  
  PCIe 5.0 supports 32 GT/s and is available in the 5.4 kernel, but the
  patches for properly reporting speeds in sysfs are missing.  The
  following upstream patches add the reporting capability.
  
  https://lore.kernel.org/linux-
  pci/20200229030706.17835-1-helg...@kernel.org/
  
  [Test Plan]
  
  Testing these speeds requires special hardware. A Test kernel with these
  patches applied was provided to the customer and they confirmed the
  proper numbers are reported.
  
  [Where problems could occur]
  
  Changes are for reporting info so chance of problems should be low.  If
- a problem did occur it would be with sysfs or pcie driver.
+ a problem did occur it would be with sysfs or pcie driver misreporting
+ speeds.
  
  [Other]
  
  SF-00333784

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1970798

Title:
  32 GT/s PCI link speeds reporting "Unknown speed" in sysfs

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Focal:
  In Progress

Bug description:
  SRU Justification

  [Impact]

  NVIDIA Collective Communications Library (NCCL) software uses sysfs to
  report performance statistics.  Users have reported entries showing
  "Unknown speed" when they should be reporting "32 GT/s".

  Example:
  ""

  PCIe 5.0 supports 32 GT/s and is available in the 5.4 kernel, but the
  patches for properly reporting speeds in sysfs are missing.  The
  following upstream patches add the reporting capability.

  https://lore.kernel.org/linux-pci/20200229030706.17835-1-helg...@kernel.org/
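
  To illustrate why a missing entry surfaces as "Unknown speed", here is
  a simplified model of decoding the "Current Link Speed" field (bits 3:0
  of the PCIe Link Status register) into a string. This is a sketch, not
  the kernel's code: the table and function names are hypothetical, and
  the exact strings printed by a given kernel version may differ.

```python
# Simplified model only; this is not the kernel's implementation.
# Keys are PCIe Link Status "Current Link Speed" field values (bits 3:0).
SPEED_STRINGS_BEFORE = {
    0x1: "2.5 GT/s",
    0x2: "5 GT/s",
    0x3: "8 GT/s",
    0x4: "16 GT/s",
}

# The backport is modelled here as adding the 32 GT/s entry.
SPEED_STRINGS_AFTER = {**SPEED_STRINGS_BEFORE, 0x5: "32 GT/s"}


def link_speed_string(lnksta: int, table: dict) -> str:
    """Decode the speed field of a Link Status value using the given table."""
    current_speed = lnksta & 0xF  # keep only bits 3:0
    return table.get(current_speed, "Unknown speed")


# A link running at 32 GT/s reports field value 0x5:
# link_speed_string(0x5, SPEED_STRINGS_BEFORE) returns "Unknown speed"
# link_speed_string(0x5, SPEED_STRINGS_AFTER) returns "32 GT/s"
```

  The hardware negotiates the same speed in both cases; only the string
  shown to userspace changes.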

  [Test Plan]

  Testing these speeds requires special hardware. A test kernel with
  these patches applied was provided to the customer, who confirmed that
  the proper speeds are reported.

  [Where problems could occur]

  The changes only affect reporting, so the chance of problems should be
  low. If a problem did occur, it would be sysfs or the PCIe driver
  misreporting speeds.

  [Other]

  SF-00333784

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1970798/+subscriptions



[Kernel-packages] [Bug 1970798] Re: 32 GT/s PCI link speeds reporting "Unknown speed" in sysfs

2022-05-02 Thread Ian May
** Description changed:

  SRU Justification
  
  [Impact]
  
  NVIDIA Collective Communication Library software uses sysfs to report
  performance statistics.  Users have reported entries showing "Unknown
  speed" when they should be reporting "32 GT/s".
  
  Example:
  ""
  
  PCIe 5.0 supports 32 GT/s and is available in the 5.4 kernel, but the
  patches for properly reporting speeds in sysfs are missing.  The
  following upstream patches add the reporting capability.
  
  https://lore.kernel.org/linux-
  pci/20200229030706.17835-1-helg...@kernel.org/
  
  [Test Plan]
  
  Testing these speeds requires special hardware. A Test kernel with these
  patches applied was provided to the customer and they confirmed the
  proper numbers are reported.
  
  [Where problems could occur]
  
  Changes are for reporting info so chance of problems should be low.  If
  a problem did occur it would be with sysfs or pcie driver.
+ 
+ [Other]
+ SF00333784

** Description changed:

  SRU Justification
  
  [Impact]
  
  NVIDIA Collective Communication Library software uses sysfs to report
  performance statistics.  Users have reported entries showing "Unknown
  speed" when they should be reporting "32 GT/s".
  
  Example:
  ""
  
  PCIe 5.0 supports 32 GT/s and is available in the 5.4 kernel, but the
  patches for properly reporting speeds in sysfs are missing.  The
  following upstream patches add the reporting capability.
  
  https://lore.kernel.org/linux-
  pci/20200229030706.17835-1-helg...@kernel.org/
  
  [Test Plan]
  
  Testing these speeds requires special hardware. A Test kernel with these
  patches applied was provided to the customer and they confirmed the
  proper numbers are reported.
  
  [Where problems could occur]
  
  Changes are for reporting info so chance of problems should be low.  If
  a problem did occur it would be with sysfs or pcie driver.
  
  [Other]
+ 
  SF00333784

** Description changed:

  SRU Justification
  
  [Impact]
  
  NVIDIA Collective Communication Library software uses sysfs to report
  performance statistics.  Users have reported entries showing "Unknown
  speed" when they should be reporting "32 GT/s".
  
  Example:
  ""
  
  PCIe 5.0 supports 32 GT/s and is available in the 5.4 kernel, but the
  patches for properly reporting speeds in sysfs are missing.  The
  following upstream patches add the reporting capability.
  
  https://lore.kernel.org/linux-
  pci/20200229030706.17835-1-helg...@kernel.org/
  
  [Test Plan]
  
  Testing these speeds requires special hardware. A Test kernel with these
  patches applied was provided to the customer and they confirmed the
  proper numbers are reported.
  
  [Where problems could occur]
  
  Changes are for reporting info so chance of problems should be low.  If
  a problem did occur it would be with sysfs or pcie driver.
  
  [Other]
  
- SF00333784
+ SF-00333784

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1970798

Title:
  32 GT/s PCI link speeds reporting "Unknown speed" in sysfs

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Focal:
  In Progress

Bug description:
  SRU Justification

  [Impact]

  NVIDIA Collective Communications Library (NCCL) software uses sysfs to
  report performance statistics.  Users have reported entries showing
  "Unknown speed" when they should be reporting "32 GT/s".

  Example:
  ""

  PCIe 5.0 supports 32 GT/s and is available in the 5.4 kernel, but the
  patches for properly reporting speeds in sysfs are missing.  The
  following upstream patches add the reporting capability.

  https://lore.kernel.org/linux-pci/20200229030706.17835-1-helg...@kernel.org/

  [Test Plan]

  Testing these speeds requires special hardware. A test kernel with
  these patches applied was provided to the customer, who confirmed that
  the proper speeds are reported.

  [Where problems could occur]

  The changes only affect reporting, so the chance of problems should be
  low. If a problem did occur, it would be with sysfs or the PCIe driver.

  [Other]

  SF-00333784

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1970798/+subscriptions



[Kernel-packages] [Bug 1970798] Re: 32 GT/s PCI link speeds reporting "Unknown speed" in sysfs

2022-05-02 Thread Ian May
** Description changed:

- Our NCCL software uses the sysfs to populate the attached topo.xml file.
- Several of the entries should report "32 GT/s", but they're saying
- "Unknown speed" instead. For instance:
+ [Impact]
  
- 
+ NVIDIA Collective Communication Library software uses sysfs to report
+ performance statistics.  Users have reported entries showing "Unknown
+ speed" when they should be reporting "32 GT/s".
  
- The 5.4 kernel is missing the following commit:
- https://lore.kernel.org/all/1581937984-40353-2-git-send-email-
- yangyic...@hisilicon.com/
+ Example:
+ ""
+ 
+ PCIe 5.0 which supports 32 GT/s is available in the 5.4 kernel, but the
+ patches for properly reporting speeds in sysfs are missing.  The
+ following upstream patches add the reporting capability.
+ 
+ https://lore.kernel.org/linux-
+ pci/20200229030706.17835-1-helg...@kernel.org/
+ 
+ 
+ [Test Plan]
+ 
+ Testing these speeds requires special hardware. A Test kernel with these
+ patches applied was provided to the customer and they confirmed the
+ proper numbers are reported.
+ 
+ 
+ [Where problems could occur]
+ 
+ Changes are for reporting info so chance of problems should be low.  If
+ a problem did occur it would be with sysfs or pcie driver.

** Changed in: linux (Ubuntu Focal)
   Status: Incomplete => In Progress

** Changed in: linux (Ubuntu)
   Status: Incomplete => In Progress

** Changed in: linux (Ubuntu)
   Importance: Undecided => High

** Changed in: linux (Ubuntu Focal)
   Importance: Undecided => High

** Description changed:

+ SRU Justification
+ 
  [Impact]
  
  NVIDIA Collective Communication Library software uses sysfs to report
  performance statistics.  Users have reported entries showing "Unknown
  speed" when they should be reporting "32 GT/s".
  
  Example:
  ""
  
  PCIe 5.0 which supports 32 GT/s is available in the 5.4 kernel, but the
  patches for properly reporting speeds in sysfs are missing.  The
  following upstream patches add the reporting capability.
  
  https://lore.kernel.org/linux-
  pci/20200229030706.17835-1-helg...@kernel.org/
  
- 
  [Test Plan]
  
  Testing these speeds requires special hardware. A Test kernel with these
  patches applied was provided to the customer and they confirmed the
  proper numbers are reported.
  
- 
  [Where problems could occur]
  
  Changes are for reporting info so chance of problems should be low.  If
  a problem did occur it would be with sysfs or pcie driver.

** Description changed:

  SRU Justification
  
  [Impact]
  
  NVIDIA Collective Communication Library software uses sysfs to report
  performance statistics.  Users have reported entries showing "Unknown
  speed" when they should be reporting "32 GT/s".
  
  Example:
  ""
  
- PCIe 5.0 which supports 32 GT/s is available in the 5.4 kernel, but the
+ PCIe 5.0 supports 32 GT/s and is available in the 5.4 kernel, but the
  patches for properly reporting speeds in sysfs are missing.  The
  following upstream patches add the reporting capability.
  
  https://lore.kernel.org/linux-
  pci/20200229030706.17835-1-helg...@kernel.org/
  
  [Test Plan]
  
  Testing these speeds requires special hardware. A Test kernel with these
  patches applied was provided to the customer and they confirmed the
  proper numbers are reported.
  
  [Where problems could occur]
  
  Changes are for reporting info so chance of problems should be low.  If
  a problem did occur it would be with sysfs or pcie driver.

** Changed in: linux (Ubuntu)
   Importance: High => Medium

** Changed in: linux (Ubuntu Focal)
   Importance: High => Medium

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1970798

Title:
  32 GT/s PCI link speeds reporting "Unknown speed" in sysfs

Status in linux package in Ubuntu:
  In Progress
Status in linux source package in Focal:
  In Progress

Bug description:
  SRU Justification

  [Impact]

  NVIDIA Collective Communications Library (NCCL) software uses sysfs to
  report performance statistics.  Users have reported entries showing
  "Unknown speed" when they should be reporting "32 GT/s".

  Example:
  ""

  PCIe 5.0 supports 32 GT/s and is available in the 5.4 kernel, but the
  patches for properly reporting speeds in sysfs are missing.  The
  following upstream patches add the reporting capability.

  https://lore.kernel.org/linux-pci/20200229030706.17835-1-helg...@kernel.org/

  [Test Plan]

  Testing these speeds requires special hardware. A test kernel with
  these patches applied was provided to the customer, who confirmed that
  the proper speeds are reported.

  [Where problems could occur]

  The changes only affect reporting, so the chance of problems should be
  low. If a problem did occur, it would be with sysfs or the PCIe driver.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1970798/+subscriptions



[Kernel-packages] [Bug 1970798] [NEW] 32 GT/s PCI link speeds reporting "Unknown speed" in sysfs

2022-04-28 Thread Ian May
Public bug reported:

Our NCCL software uses the sysfs to populate the attached topo.xml file.
Several of the entries should report "32 GT/s", but they're saying
"Unknown speed" instead. For instance:



The 5.4 kernel is missing the following commit:
https://lore.kernel.org/all/1581937984-40353-2-git-send-email-yangyic...@hisilicon.com/

** Affects: linux (Ubuntu)
 Importance: Undecided
     Assignee: Ian May (ian-may)
 Status: New

** Affects: linux (Ubuntu Focal)
 Importance: Undecided
     Assignee: Ian May (ian-may)
 Status: New

** Also affects: linux (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Changed in: linux (Ubuntu)
 Assignee: (unassigned) => Ian May (ian-may)

** Changed in: linux (Ubuntu Focal)
 Assignee: (unassigned) => Ian May (ian-may)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1970798

Title:
  32 GT/s PCI link speeds reporting "Unknown speed" in sysfs

Status in linux package in Ubuntu:
  New
Status in linux source package in Focal:
  New

Bug description:
  Our NCCL software uses the sysfs to populate the attached topo.xml
  file. Several of the entries should report "32 GT/s", but they're
  saying "Unknown speed" instead. For instance:

  

  The 5.4 kernel is missing the following commit:
  https://lore.kernel.org/all/1581937984-40353-2-git-send-email-yangyic...@hisilicon.com/
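
  A rough sketch of how a topology file could be scanned for the bad
  entries. It assumes the file is XML whose elements carry `link_speed`
  and `busid` attributes; those names, the helper name, and the SAMPLE
  string are assumptions made for illustration and are not taken from
  the attached topo.xml.

```python
import xml.etree.ElementTree as ET


def unknown_speed_entries(xml_text: str, attr: str = "link_speed"):
    """Return (tag, busid) for every element whose speed attribute is unknown."""
    root = ET.fromstring(xml_text)
    return [
        (elem.tag, elem.get("busid", "?"))
        for elem in root.iter()
        if elem.get(attr, "").startswith("Unknown")
    ]


# Hypothetical sample, invented for this sketch; NOT the attached file.
SAMPLE = """<system>
  <pci busid="0000:01:00.0" link_speed="Unknown speed" link_width="16"/>
  <pci busid="0000:02:00.0" link_speed="16 GT/s" link_width="16"/>
</system>"""

if __name__ == "__main__":
    for tag, busid in unknown_speed_entries(SAMPLE):
        print(f"{tag} {busid}: unknown link speed")
```

  Running the same check against a file generated on a patched kernel
  should return an empty list if the speeds are reported correctly.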

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1970798/+subscriptions



[Kernel-packages] [Bug 1970451] Re: Update to the 510.68.02 UDA NVIDIA driver series in Bionic, Focal, Impish, and Jammy

2022-04-28 Thread Ian May
** Changed in: nvidia-graphics-drivers-510 (Ubuntu Bionic)
   Status: Confirmed => Fix Committed

** Changed in: nvidia-graphics-drivers-510 (Ubuntu Focal)
   Status: Confirmed => Fix Committed

** Changed in: nvidia-graphics-drivers-510 (Ubuntu Impish)
   Status: Confirmed => Fix Committed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-restricted-modules in Ubuntu.
https://bugs.launchpad.net/bugs/1970451

Title:
  Update to the 510.68.02 UDA NVIDIA driver series in Bionic, Focal,
  Impish, and Jammy

Status in linux-restricted-modules package in Ubuntu:
  Confirmed
Status in nvidia-graphics-drivers-510 package in Ubuntu:
  Confirmed
Status in linux-restricted-modules source package in Bionic:
  Confirmed
Status in nvidia-graphics-drivers-510 source package in Bionic:
  Fix Committed
Status in linux-restricted-modules source package in Focal:
  Confirmed
Status in nvidia-graphics-drivers-510 source package in Focal:
  Fix Committed
Status in linux-restricted-modules source package in Impish:
  Confirmed
Status in nvidia-graphics-drivers-510 source package in Impish:
  Fix Committed
Status in linux-restricted-modules source package in Jammy:
  Confirmed
Status in nvidia-graphics-drivers-510 source package in Jammy:
  Confirmed

Bug description:
  [Impact]
  These releases provide both bug fixes and new features, and we would like to
  make sure all of our users have access to these improvements.

  See the changelog entry below for a full list of changes and bugs.

  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates

  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

  The QA team that executed the tests will be in charge of attaching the
  artifacts and console output of the appropriate run to the bug. nVidia
  maintainers team members will not mark ‘verification-done’ until this
  has happened.

  [Regression Potential]
  In order to mitigate the regression potential, the results of the
  aforementioned system level tests are attached to this bug.

  [Discussion]

  [Changelog]

  === 510 jammy/impish/focal/bionic ===

    * New upstream release (LP: #1970451):
      - Fixed an issue where NvFBC was requesting Vulkan 1.0 while using
        Vulkan 1.1 core features. This caused NvFBC to fail to initialize
        with Vulkan loader versions 1.3.204 or newer.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-restricted-modules/+bug/1970451/+subscriptions



[Kernel-packages] [Bug 1970451] Re: Update to the 510.68.02 UDA NVIDIA driver series in Bionic, Focal, Impish, and Jammy

2022-04-26 Thread Ian May
** Description changed:

  [Impact]
  These releases provide both bug fixes and new features, and we would like to
  make sure all of our users have access to these improvements.
  
  See the changelog entry below for a full list of changes and bugs.
  
  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates
  
  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu
  
  The QA team that executed the tests will be in charge of attaching the
  artifacts and console output of the appropriate run to the bug. nVidia
  maintainers team members will not mark ‘verification-done’ until this
  has happened.
  
  [Regression Potential]
  In order to mitigate the regression potential, the results of the
  aforementioned system level tests are attached to this bug.
  
  [Discussion]
+ 
+ [Changelog]
+ 
+ === 510 jammy/impish/focal/bionic ===
+ 
+   * New upstream release (LP: #1970451):
+ - Fixed an issue where NvFBC was requesting Vulkan 1.0 while using
+   Vulkan 1.1 core features. This caused NvFBC to fail to initialize
+   with Vulkan loader versions 1.3.204 or newer.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-restricted-modules in Ubuntu.
https://bugs.launchpad.net/bugs/1970451

Title:
  Update to the 510.68.02 UDA NVIDIA driver series in Bionic, Focal,
  Impish, and Jammy

Status in linux-restricted-modules package in Ubuntu:
  New
Status in nvidia-graphics-drivers-510 package in Ubuntu:
  New
Status in linux-restricted-modules source package in Bionic:
  New
Status in nvidia-graphics-drivers-510 source package in Bionic:
  New
Status in linux-restricted-modules source package in Focal:
  New
Status in nvidia-graphics-drivers-510 source package in Focal:
  New
Status in linux-restricted-modules source package in Impish:
  New
Status in nvidia-graphics-drivers-510 source package in Impish:
  New
Status in linux-restricted-modules source package in Jammy:
  New
Status in nvidia-graphics-drivers-510 source package in Jammy:
  New

Bug description:
  [Impact]
  These releases provide both bug fixes and new features, and we would like to
  make sure all of our users have access to these improvements.

  See the changelog entry below for a full list of changes and bugs.

  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates

  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

  The QA team that executed the tests will be in charge of attaching the
  artifacts and console output of the appropriate run to the bug. nVidia
  maintainers team members will not mark ‘verification-done’ until this
  has happened.

  [Regression Potential]
  In order to mitigate the regression potential, the results of the
  aforementioned system level tests are attached to this bug.

  [Discussion]

  [Changelog]

  === 510 jammy/impish/focal/bionic ===

    * New upstream release (LP: #1970451):
      - Fixed an issue where NvFBC was requesting Vulkan 1.0 while using
        Vulkan 1.1 core features. This caused NvFBC to fail to initialize
        with Vulkan loader versions 1.3.204 or newer.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-restricted-modules/+bug/1970451/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1970451] Re: Update to the 510.68.02 UDA NVIDIA driver series in Bionic, Focal, Impish, and Jammy

2022-04-26 Thread Ian May
** Also affects: linux-restricted-modules (Ubuntu Impish)
   Importance: Undecided
   Status: New

** Also affects: nvidia-graphics-drivers-510 (Ubuntu Impish)
   Importance: Undecided
   Status: New

** Also affects: linux-restricted-modules (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Also affects: nvidia-graphics-drivers-510 (Ubuntu Focal)
   Importance: Undecided
   Status: New

** Also affects: linux-restricted-modules (Ubuntu Jammy)
   Importance: Undecided
   Status: New

** Also affects: nvidia-graphics-drivers-510 (Ubuntu Jammy)
   Importance: Undecided
   Status: New

** Also affects: linux-restricted-modules (Ubuntu Bionic)
   Importance: Undecided
   Status: New

** Also affects: nvidia-graphics-drivers-510 (Ubuntu Bionic)
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-restricted-modules in Ubuntu.
https://bugs.launchpad.net/bugs/1970451

Title:
  Update to the 510.68.02 UDA NVIDIA driver series in Bionic, Focal,
  Impish, and Jammy

Status in linux-restricted-modules package in Ubuntu:
  New
Status in nvidia-graphics-drivers-510 package in Ubuntu:
  New
Status in linux-restricted-modules source package in Bionic:
  New
Status in nvidia-graphics-drivers-510 source package in Bionic:
  New
Status in linux-restricted-modules source package in Focal:
  New
Status in nvidia-graphics-drivers-510 source package in Focal:
  New
Status in linux-restricted-modules source package in Impish:
  New
Status in nvidia-graphics-drivers-510 source package in Impish:
  New
Status in linux-restricted-modules source package in Jammy:
  New
Status in nvidia-graphics-drivers-510 source package in Jammy:
  New

Bug description:
  [Impact]
  These releases provide both bug fixes and new features, and we would like to
  make sure all of our users have access to these improvements.

  See the changelog entry below for a full list of changes and bugs.

  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates

  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

  The QA team that executed the tests will be in charge of attaching the
  artifacts and console output of the appropriate run to the bug. nVidia
  maintainers team members will not mark ‘verification-done’ until this
  has happened.

  [Regression Potential]
  In order to mitigate the regression potential, the results of the
  aforementioned system level tests are attached to this bug.

  [Discussion]

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-restricted-modules/+bug/1970451/+subscriptions




[Kernel-packages] [Bug 1970451] [NEW] Update to the 510.68.02 UDA NVIDIA driver series in Bionic, Focal, Impish, and Jammy

2022-04-26 Thread Ian May
Public bug reported:

[Impact]
These releases provide both bug fixes and new features, and we would like to
make sure all of our users have access to these improvements.

See the changelog entry below for a full list of changes and bugs.

[Test Case]
The following development and SRU process was followed:
https://wiki.ubuntu.com/NVidiaUpdates

Certification test suite must pass on a range of hardware:
https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

The QA team that executed the tests will be in charge of attaching the
artifacts and console output of the appropriate run to the bug. nVidia
maintainers team members will not mark ‘verification-done’ until this
has happened.

[Regression Potential]
In order to mitigate the regression potential, the results of the
aforementioned system level tests are attached to this bug.

[Discussion]

** Affects: linux-restricted-modules (Ubuntu)
 Importance: Undecided
 Assignee: Ian May (ian-may)
 Status: New

** Affects: nvidia-graphics-drivers-510 (Ubuntu)
 Importance: Undecided
 Assignee: Ian May (ian-may)
 Status: New

** Package changed: ubuntu => linux-restricted-modules (Ubuntu)

** Changed in: linux-restricted-modules (Ubuntu)
 Assignee: (unassigned) => Ian May (ian-may)

** Also affects: nvidia-graphics-drivers-510 (Ubuntu)
   Importance: Undecided
   Status: New

** Changed in: nvidia-graphics-drivers-510 (Ubuntu)
 Assignee: (unassigned) => Ian May (ian-may)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-restricted-modules in Ubuntu.
https://bugs.launchpad.net/bugs/1970451

Title:
  Update to the 510.68.02 UDA NVIDIA driver series in Bionic, Focal,
  Impish, and Jammy

Status in linux-restricted-modules package in Ubuntu:
  New
Status in nvidia-graphics-drivers-510 package in Ubuntu:
  New

Bug description:
  [Impact]
  These releases provide both bug fixes and new features, and we would like to
  make sure all of our users have access to these improvements.

  See the changelog entry below for a full list of changes and bugs.

  [Test Case]
  The following development and SRU process was followed:
  https://wiki.ubuntu.com/NVidiaUpdates

  Certification test suite must pass on a range of hardware:
  https://git.launchpad.net/plainbox-provider-sru/tree/units/sru.pxu

  The QA team that executed the tests will be in charge of attaching the
  artifacts and console output of the appropriate run to the bug. nVidia
  maintainers team members will not mark ‘verification-done’ until this
  has happened.

  [Regression Potential]
  In order to mitigate the regression potential, the results of the
  aforementioned system level tests are attached to this bug.

  [Discussion]

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-restricted-modules/+bug/1970451/+subscriptions




[Kernel-packages] [Bug 1959216] Re: linux-azure: CONFIG_FB_EFI=y

2022-02-18 Thread Ian May
wget https://launchpad.net/~canonical-kernel-team/+archive/ubuntu/ppa/+files/linux-buildinfo-5.13.0-1014-azure_5.13.0-1014.16_amd64.deb
dpkg -x linux-buildinfo-5.13.0-1014-azure_5.13.0-1014.16_amd64.deb .
grep CONFIG_FB_EFI ./usr/lib/linux/5.13.0-1014-azure/config 
CONFIG_FB_EFI=y
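
The manual check above could be wrapped in a small helper. This is only a minimal sketch and was not part of the original verification: the function name config_has is hypothetical, and the path in the usage comment is the one from the extracted buildinfo package shown above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if OPTION is set to VALUE in a kernel
# config file.
# Usage: config_has CONFIG_FILE OPTION VALUE
config_has() {
    # -x matches the whole line, -F treats the pattern as a fixed
    # string, -q suppresses output so only the exit status is used.
    grep -qxF "$2=$3" "$1"
}

# Example (path taken from the transcript above):
# config_has ./usr/lib/linux/5.13.0-1014-azure/config CONFIG_FB_EFI y && echo ok
```

Matching the whole line avoids false positives from commented-out entries such as "# CONFIG_FB_EFI is not set".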

** Tags removed: verification-needed-impish
** Tags added: verification-done-impish

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-azure-5.13 in Ubuntu.
https://bugs.launchpad.net/bugs/1959216

Title:
  linux-azure: CONFIG_FB_EFI=y

Status in linux-azure package in Ubuntu:
  Fix Committed
Status in linux-azure-4.15 package in Ubuntu:
  Invalid
Status in linux-azure-5.11 package in Ubuntu:
  Invalid
Status in linux-azure-5.13 package in Ubuntu:
  Invalid
Status in linux-azure source package in Bionic:
  Invalid
Status in linux-azure-4.15 source package in Bionic:
  Fix Committed
Status in linux-azure-5.11 source package in Bionic:
  Invalid
Status in linux-azure-5.13 source package in Bionic:
  Invalid
Status in linux-azure source package in Focal:
  Fix Committed
Status in linux-azure-4.15 source package in Focal:
  Invalid
Status in linux-azure-5.11 source package in Focal:
  Fix Committed
Status in linux-azure-5.13 source package in Focal:
  Fix Committed
Status in linux-azure source package in Impish:
  Fix Committed
Status in linux-azure-4.15 source package in Impish:
  Invalid
Status in linux-azure-5.11 source package in Impish:
  Invalid
Status in linux-azure-5.13 source package in Impish:
  Invalid
Status in linux-azure source package in Jammy:
  Fix Committed
Status in linux-azure-4.15 source package in Jammy:
  Invalid
Status in linux-azure-5.11 source package in Jammy:
  Invalid
Status in linux-azure-5.13 source package in Jammy:
  Invalid

Bug description:
  SRU Justification

  [Impact]

  Secure boot instances of linux-azure require an EFI framebuffer in
  some cases in order for the VM to boot.

  The issue was noticed in Ubuntu 18.04 linux-azure kernel, but actually
  exists in the latest mainline kernel. The issue happens when the below
  conditions are met:

  1. hyperv_pci is built into the kernel and hyperv_fb is not, i.e.,
     hyperv_pci loads before hyperv_fb loads.
  2. CONFIG_FB_EFI is not defined, i.e., the efifb driver is not used.

  Here is how the bug happens:

  1. The Linux VM starts, and vmbus_reserve_fb() reserves the VRAM
     [base=0xf800, length=8MB].
  2. hyperv-pci loads, gets MMIO [base=0xf880, length=8KB] as the bridge
     config window, and may get some other 64-bit MMIO ranges, and some
     32-bit MMIO ranges (if needed).
  3. hyperv-fb loads, gets MMIO [base=0xf800, length=8MB or a different
     length], and sets screen_info.lfb_base = 0.
  4. The VM panics.
  5. The kdump kernel starts to run, and vmbus_reserve_fb() does not
     reserve [base=0xf800, length=8MB] because lfb_base == 0.
  6. hyperv-pci loads and gets [base=0xf800, length=8KB], and the host
     PCI VSP driver rejects this address as the bridge config window.

  The crux of the problem is that Linux vmbus driver itself is unable to
  detect the VRAM base/length (it looks like a video BIOS call is needed
  to get this info and such a BIOS call is inappropriate or impossible
  in hv_vmbus) and has to rely on screen_info.lfb_base (which is set by
  grub or the kdump/kexec tool and can be reset to zero by
  hyperv_fb/drm).

  Solution: Enable CONFIG_FB_EFI=y

  [Test Case]

  Microsoft tested. This config is also enabled on the master branch.

  [Where things could go wrong]

  VMs on certain instance types could fail to boot.

  [Other Info]

  SF: #00327005

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-azure/+bug/1959216/+subscriptions




[Kernel-packages] [Bug 1960871] Re: linux-modules-extra-* fails to install due to dependency on unsigned package

2022-02-15 Thread Ian May
Fix sent to the ML and has been applied:
https://lists.ubuntu.com/archives/kernel-team/2022-February/128100.html

** Changed in: linux-aws (Ubuntu)
   Status: New => Fix Committed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1960871

Title:
  linux-modules-extra-* fails to install due to dependency on unsigned
  package

Status in linux-aws package in Ubuntu:
  Fix Committed

Bug description:
  Several SRU tests are failing the test setup due to failure to install
  the modules-extra package:

  * Command:
  yes "" | DEBIAN_FRONTEND=noninteractive apt-get install --yes --force-yes
  automake bison build-essential byacc flex git keyutils libacl1-dev
  libaio-dev libcap-dev libmm-dev libnuma-dev libsctp-dev libselinux1-dev
  libssl-dev libtirpc-dev pkg-config quota xfslibs-dev xfsprogs gcc
  linux-modules-extra-4.15.0-1120-aws
  Exit status: 100
  Duration: 0.908210039139

  stdout:
  Reading package lists...
  Building dependency tree...
  Reading state information...
  xfsprogs is already the newest version (4.9.0+nmu1ubuntu2).
  xfsprogs set to manually installed.
  git is already the newest version (1:2.17.1-1ubuntu0.9).
  git set to manually installed.
  Some packages could not be installed. This may mean that you have
  requested an impossible situation or if you are using the unstable
  distribution that some required packages have not yet been created
  or been moved out of Incoming.
  The following information may help to resolve the situation:

  The following packages have unmet dependencies:
   linux-modules-extra-4.15.0-1120-aws : Depends: linux-image-unsigned-4.15.0-1120-aws but it is not going to be installed
  stderr:
  W: --force-yes is deprecated, use one of the options starting with --allow instead.
  E: Unable to correct problems, you have held broken packages.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1960871/+subscriptions




[Kernel-packages] [Bug 1871015] Re: test_vxlan_under_vrf.sh in net from ubuntu_kernel_selftests failed with H (Check VM connectivity through VXLAN (underlay in the default VRF) [FAIL])

2022-01-26 Thread Ian May
Found also on 2022.01.03/impish/linux-aws: 5.13.0-1012.13

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1871015

Title:
  test_vxlan_under_vrf.sh in net from ubuntu_kernel_selftests failed
  with H (Check VM connectivity through VXLAN (underlay in the default
  VRF) [FAIL])

Status in ubuntu-kernel-tests:
  New
Status in linux package in Ubuntu:
  Incomplete
Status in linux source package in Hirsute:
  Confirmed

Bug description:
  Issue found with GCP 5.3.0-1017.18~18.04.1

   # selftests: net: test_vxlan_under_vrf.sh
   # Checking HV connectivity [ OK ]
   # Check VM connectivity through VXLAN (underlay in the default VRF) [FAIL]
   not ok 25 selftests: net: test_vxlan_under_vrf.sh # exit=1

  
  The failure is different from bug 1837348

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1871015/+subscriptions




[Kernel-packages] [Bug 1923104] Re: Include Infiniband Peer Memory interface

2022-01-24 Thread Ian May
Tested on Focal 5.4.0-97.110, confirmed inbox peer memory interface is
working.

** Tags removed: verification-needed-focal
** Tags added: verification-done-focal

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1923104

Title:
  Include Infiniband Peer Memory interface

Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Focal:
  Fix Committed

Bug description:
  The peer_memory_client scheme allows a driver to register with the
  ib_umem system that it has the ability to understand user virtual
  address ranges that are not compatible with get_user_pages(), for
  instance VMAs created with io_remap_pfn_range() or other
  driver-special VMAs.

  For ranges the interface understands, it can provide a DMA-mapped
  sg_table for use by the ib_umem, allowing user virtual ranges that
  cannot be supported by get_user_pages() to be used as umems for RDMA.

  This is designed to preserve the kABI: no functions or structures are
  changed, only new symbols are added:

   ib_register_peer_memory_client
   ib_unregister_peer_memory_client
   ib_umem_activate_invalidation_notifier
   ib_umem_get_peer

  And a bitfield in struct ib_umem uses more bits.

  This interface is compatible with the two out-of-tree GPU drivers:
   https://github.com/RadeonOpenCompute/ROCK-Kernel-Driver/blob/master/drivers/gpu/drm/amd/amdkfd/kfd_peerdirect.c
   https://github.com/Mellanox/nv_peer_memory/blob/master/nv_peer_mem.c
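
One way to check that the four symbols listed in the description are present on a running kernel is to look them up in a kallsyms-format listing. The following is a minimal sketch under stated assumptions, not the procedure used for the verification in this thread: the helper name missing_peer_mem_symbols is hypothetical, and it assumes the symbols are visible in /proc/kallsyms once the RDMA modules that provide them are loaded.

```shell
#!/bin/sh
# Symbols listed in the bug description as newly added.
PEER_MEM_SYMBOLS="ib_register_peer_memory_client
ib_unregister_peer_memory_client
ib_umem_activate_invalidation_notifier
ib_umem_get_peer"

# Hypothetical helper: print every expected symbol that is missing from
# a kallsyms-format file (address, type, name, optional [module]).
# Returns non-zero if any symbol is missing.
missing_peer_mem_symbols() {
    symfile=$1
    status=0
    for sym in $PEER_MEM_SYMBOLS; do
        # Field 3 of each kallsyms line is the symbol name.
        if ! awk -v s="$sym" '$3 == s { found = 1 } END { exit !found }' "$symfile"; then
            echo "$sym"
            status=1
        fi
    done
    return $status
}

# Example (assumption: the providing modules are already loaded):
# missing_peer_mem_symbols /proc/kallsyms || echo "peer memory symbols missing"
```

Comparing field 3 exactly, rather than grepping for a substring, keeps ib_umem_get_peer from matching longer symbol names that share the prefix.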

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1923104/+subscriptions




[Kernel-packages] [Bug 1958534] Re: building of linux-signed package failing on arm64

2022-01-20 Thread Ian May
Patches have been applied and bionic/linux-signed-aws now builds
successfully

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-signed-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1958534

Title:
  building of linux-signed package failing on arm64

Status in linux-signed-aws package in Ubuntu:
  Fix Committed
Status in linux-signed-aws source package in Bionic:
  Fix Committed

Bug description:
  dpkg-buildpackage
  -

  dpkg-buildpackage: info: source package linux-signed-aws
  dpkg-buildpackage: info: source version 4.15.0-1119.126
  dpkg-buildpackage: info: source distribution bionic
   dpkg-source --before-build linux-signed-aws-4.15.0
  dpkg-buildpackage: info: host architecture arm64
  dpkg-source: info: using options from linux-signed-aws-4.15.0/debian/source/options: --diff-ignore --tar-ignore
   fakeroot debian/rules clean
  sed debian/control  \
-e "s/@ABI@/4.15.0-1119/g"  \
-e "s/@UNSIGNED_SRC_PACKAGE@/linux-aws/g"   \
-e "s/@UNSIGNED_SRC_VERSION@/4.15.0-1119.126/g" \
-e 's/@SRCPKGNAME@/linux-signed-aws/g'  \
-e 's/@HEADERS_COMMON@/linux-aws-headers-4.15.0-1119/g' \
-e 's/@HEADERS_ARCH@/linux-headers-4.15.0-1119-aws/g'
  rm -rf ./4.15.0-1119.126 UNSIGNED SIGNED
  rm -f debian/linux-image-*.install\
debian/linux-image-*.preinst\
debian/linux-image-*.prerm  \
debian/linux-image-*.postinst   \
debian/linux-image-*.postrm
  rm -f debian/kernel-signed-image-*.install
  dh clean
 dh_clean
   debian/rules build-arch
  dh build-arch
 dh_update_autotools_config -a
 debian/rules override_dh_auto_build
  make[1]: Entering directory '/<>'
  ./download-signed "linux-headers-4.15.0-1119-aws" "4.15.0-1119.126" "linux-aws"
  Downloading http://ppa.launchpad.net/canonical-kernel-team/ppa/ubuntu/dists/bionic/main/signed/linux-aws-arm64/4.15.0-1119.126/SHA256SUMS ... found
  Downloading http://ppa.launchpad.net/canonical-kernel-team/ppa/ubuntu/dists/bionic/main/signed/linux-aws-arm64/4.15.0-1119.126/signed.tar.gz ... found
  Extracting 4.15.0-1119.126 ...
  Extracting 4.15.0-1119.126/control ...
  Extracting 4.15.0-1119.126/control/options ...
  mkdir SIGNED
  ( \
cd "4.15.0-1119.126" || exit 1; \
for s in *.efi.signed; do   \
[ ! -f "$s" ] && continue;  \
base=$(echo "$s" | sed -e 's/.efi.signed//');   \
(   \
vars="${base}.efi.vars";\
[ -f "$vars" ] && . "./$vars";  \
if [ "$GZIP" = "1" ]; then  \
gzip -9 "$s";   \
mv "${s}.gz" "$s";  \
fi; \
);  \
chmod 600 "$s"; \
ln "$s" "../SIGNED/$base";  \
done;   \
for s in *.opal.sig; do \
[ ! -f "$s" ] && continue;  \
chmod 600 "$s"; \
base=$(echo "$s" | sed -e 's/.opal.sig//'); \
cat "$base.opal" "$s" >"../SIGNED/$base";   \
done;   \
for s in *.sipl.sig; do \
[ ! -f "$s" ] && continue;  \
base=$(echo "$s" | sed -e 's/.sipl.sig//'); \
cat "$base.sipl" "$s" >"../SIGNED/$base";   \
chmod 600 "../SIGNED/$base";\
done\
  )
  make[1]: Leaving directory '/<>'
   fakeroot debian/rules binary-arch
  dh binary-arch
 dh_testroot -a
 dh_prep -a
 debian/rules override_dh_auto_install
  make[1]: Entering directory '/<>'
  for signed in "SIGNED"/*; do  \
flavour=$(echo "$signed" | sed -e "s@.*-4.15.0-1119-@@"); \
instfile=$(echo "$signed" | sed -e "s@[^/]*/@@" -e "s@-4.15.0-1119-.*@@");

[Kernel-packages] [Bug 1958534] Re: building of linux-signed package failing on arm64

2022-01-20 Thread Ian May
Patches sent to the ML

https://lists.ubuntu.com/archives/kernel-team/2022-January/127253.html
https://lists.ubuntu.com/archives/kernel-team/2022-January/127251.html

** Changed in: linux-signed-aws (Ubuntu)
 Assignee: (unassigned) => Ian May (ian-may)

** Changed in: linux-signed-aws (Ubuntu Bionic)
 Assignee: (unassigned) => Ian May (ian-may)

** Changed in: linux-signed-aws (Ubuntu)
   Status: New => Fix Committed

** Changed in: linux-signed-aws (Ubuntu Bionic)
   Status: New => Fix Committed

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-signed-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1958534

Title:
  building of linux-signed package failing on arm64

Status in linux-signed-aws package in Ubuntu:
  Fix Committed
Status in linux-signed-aws source package in Bionic:
  Fix Committed

Bug description:
  dpkg-buildpackage
  -

  dpkg-buildpackage: info: source package linux-signed-aws
  dpkg-buildpackage: info: source version 4.15.0-1119.126
  dpkg-buildpackage: info: source distribution bionic
   dpkg-source --before-build linux-signed-aws-4.15.0
  dpkg-buildpackage: info: host architecture arm64
  dpkg-source: info: using options from linux-signed-aws-4.15.0/debian/source/options: --diff-ignore --tar-ignore
   fakeroot debian/rules clean
  sed debian/control  \
-e "s/@ABI@/4.15.0-1119/g"  \
-e "s/@UNSIGNED_SRC_PACKAGE@/linux-aws/g"   \
-e "s/@UNSIGNED_SRC_VERSION@/4.15.0-1119.126/g" \
-e 's/@SRCPKGNAME@/linux-signed-aws/g'  \
-e 's/@HEADERS_COMMON@/linux-aws-headers-4.15.0-1119/g' \
-e 's/@HEADERS_ARCH@/linux-headers-4.15.0-1119-aws/g'
  rm -rf ./4.15.0-1119.126 UNSIGNED SIGNED
  rm -f debian/linux-image-*.install\
debian/linux-image-*.preinst\
debian/linux-image-*.prerm  \
debian/linux-image-*.postinst   \
debian/linux-image-*.postrm
  rm -f debian/kernel-signed-image-*.install
  dh clean
 dh_clean
   debian/rules build-arch
  dh build-arch
 dh_update_autotools_config -a
 debian/rules override_dh_auto_build
  make[1]: Entering directory '/<>'
  ./download-signed "linux-headers-4.15.0-1119-aws" "4.15.0-1119.126" "linux-aws"
  Downloading http://ppa.launchpad.net/canonical-kernel-team/ppa/ubuntu/dists/bionic/main/signed/linux-aws-arm64/4.15.0-1119.126/SHA256SUMS ... found
  Downloading http://ppa.launchpad.net/canonical-kernel-team/ppa/ubuntu/dists/bionic/main/signed/linux-aws-arm64/4.15.0-1119.126/signed.tar.gz ... found
  Extracting 4.15.0-1119.126 ...
  Extracting 4.15.0-1119.126/control ...
  Extracting 4.15.0-1119.126/control/options ...
  mkdir SIGNED
  ( \
cd "4.15.0-1119.126" || exit 1; \
for s in *.efi.signed; do   \
[ ! -f "$s" ] && continue;  \
base=$(echo "$s" | sed -e 's/.efi.signed//');   \
(   \
vars="${base}.efi.vars";\
[ -f "$vars" ] && . "./$vars";  \
if [ "$GZIP" = "1" ]; then  \
gzip -9 "$s";   \
mv "${s}.gz" "$s";  \
fi; \
);  \
chmod 600 "$s"; \
ln "$s" "../SIGNED/$base";  \
done;   \
for s in *.opal.sig; do \
[ ! -f "$s" ] && continue;  \
chmod 600 "$s"; \
base=$(echo "$s" | sed -e 's/.opal.sig//'); \
cat "$base.opal" "$s" >"../SIGNED/$base";   \
done;   \
for s in *.sipl.sig; do \
[ ! -f "$s" ] && continue;   

[Kernel-packages] [Bug 1958534] Re: building of linux-signed package failing on arm64

2022-01-20 Thread Ian May
This can be resolved by applying the following patches that were added
for arm64 signed support in Disco

UBUNTU: [Packaging] remove handoff check for uefi signing
UBUNTU: [Packaging] decompress gzipped efi images in signing tarball

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-signed-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1958534

Title:
  building of linux-signed package failing on arm64

Status in linux-signed-aws package in Ubuntu:
  Fix Committed
Status in linux-signed-aws source package in Bionic:
  Fix Committed

Bug description:
  dpkg-buildpackage
  -

  dpkg-buildpackage: info: source package linux-signed-aws
  dpkg-buildpackage: info: source version 4.15.0-1119.126
  dpkg-buildpackage: info: source distribution bionic
   dpkg-source --before-build linux-signed-aws-4.15.0
  dpkg-buildpackage: info: host architecture arm64
  dpkg-source: info: using options from linux-signed-aws-4.15.0/debian/source/options: --diff-ignore --tar-ignore
   fakeroot debian/rules clean
  sed debian/control  \
-e "s/@ABI@/4.15.0-1119/g"  \
-e "s/@UNSIGNED_SRC_PACKAGE@/linux-aws/g"   \
-e "s/@UNSIGNED_SRC_VERSION@/4.15.0-1119.126/g" \
-e 's/@SRCPKGNAME@/linux-signed-aws/g'  \
-e 's/@HEADERS_COMMON@/linux-aws-headers-4.15.0-1119/g' \
-e 's/@HEADERS_ARCH@/linux-headers-4.15.0-1119-aws/g'
  rm -rf ./4.15.0-1119.126 UNSIGNED SIGNED
  rm -f debian/linux-image-*.install\
debian/linux-image-*.preinst\
debian/linux-image-*.prerm  \
debian/linux-image-*.postinst   \
debian/linux-image-*.postrm
  rm -f debian/kernel-signed-image-*.install
  dh clean
 dh_clean
   debian/rules build-arch
  dh build-arch
 dh_update_autotools_config -a
 debian/rules override_dh_auto_build
  make[1]: Entering directory '/<>'
  ./download-signed "linux-headers-4.15.0-1119-aws" "4.15.0-1119.126" "linux-aws"
  Downloading http://ppa.launchpad.net/canonical-kernel-team/ppa/ubuntu/dists/bionic/main/signed/linux-aws-arm64/4.15.0-1119.126/SHA256SUMS ... found
  Downloading http://ppa.launchpad.net/canonical-kernel-team/ppa/ubuntu/dists/bionic/main/signed/linux-aws-arm64/4.15.0-1119.126/signed.tar.gz ... found
  Extracting 4.15.0-1119.126 ...
  Extracting 4.15.0-1119.126/control ...
  Extracting 4.15.0-1119.126/control/options ...
  mkdir SIGNED
  ( \
cd "4.15.0-1119.126" || exit 1; \
for s in *.efi.signed; do   \
[ ! -f "$s" ] && continue;  \
base=$(echo "$s" | sed -e 's/.efi.signed//');   \
(   \
vars="${base}.efi.vars";\
[ -f "$vars" ] && . "./$vars";  \
if [ "$GZIP" = "1" ]; then  \
gzip -9 "$s";   \
mv "${s}.gz" "$s";  \
fi; \
);  \
chmod 600 "$s"; \
ln "$s" "../SIGNED/$base";  \
done;   \
for s in *.opal.sig; do \
[ ! -f "$s" ] && continue;  \
chmod 600 "$s"; \
base=$(echo "$s" | sed -e 's/.opal.sig//'); \
cat "$base.opal" "$s" >"../SIGNED/$base";   \
done;   \
for s in *.sipl.sig; do \
[ ! -f "$s" ] && continue;  \
base=$(echo "$s" | sed -e 's/.sipl.sig//'); \
cat "$base.sipl" "$s" >"../SIGNED/$base";   \
chmod 600 "../SIGNED/$base";\
done\
  )
  make[1]: Leaving directory '/<>'
   fakeroot debian/rules binary-arch
  dh binary-arch
 dh_testroot -a
 dh_prep -a
 debian/rules override_dh_auto_install
  make[1]: Entering directory '/<>'
  for signed in "SIGNED"/*; do  \
flavour=$(echo 

[Kernel-packages] [Bug 1958534] [NEW] building of linux-signed package failing on arm64

2022-01-20 Thread Ian May
ot;; \
\
package="kernel-signed-image-$verflav-di";  \
echo "$package: adding $signed";\
echo "$signed boot" >>"debian/$package.install";\
\
package="linux-image-$verflav"; \
echo "$package: adding $signed";\
echo "$signed boot" >>"debian/$package.install";\
\
./generate-depends linux-image-unsigned-$verflav 4.15.0-1119.126 \
linux-image-$verflav \
>>"debian/linux-image-$verflav.substvars"; \
\
for which in postinst postrm preinst prerm; do \
template="debian/templates/image.$which.in"; \
script="debian/$package.$which"; \
sed -e "s/@abiname@/4.15.0-1119/g" \
-e "s/@localversion@/-$flavour/g" \
-e "s/@image-stem@/$instfile/g" \
<"$template" >"$script"; \
done; \
echo "interest linux-update-4.15.0-1119-$flavour" \
>"debian/$package.triggers"; \
done
kernel-signed-image-4.15.0-1119-SIGNED/*-di: adding SIGNED/*
/bin/sh: 8: cannot create debian/kernel-signed-image-4.15.0-1119-SIGNED/*-di.install: Directory nonexistent
linux-image-4.15.0-1119-SIGNED/*: adding SIGNED/*
/bin/sh: 12: cannot create debian/linux-image-4.15.0-1119-SIGNED/*.install: Directory nonexistent
/bin/sh: 14: cannot create debian/linux-image-4.15.0-1119-SIGNED/*.substvars: Directory nonexistent
/bin/sh: 21: cannot create debian/linux-image-4.15.0-1119-SIGNED/*.postinst: Directory nonexistent
/bin/sh: 21: cannot create debian/linux-image-4.15.0-1119-SIGNED/*.postrm: Directory nonexistent
/bin/sh: 21: cannot create debian/linux-image-4.15.0-1119-SIGNED/*.preinst: Directory nonexistent
/bin/sh: 21: cannot create debian/linux-image-4.15.0-1119-SIGNED/*.prerm: Directory nonexistent
/bin/sh: 26: cannot create debian/linux-image-4.15.0-1119-SIGNED/*.triggers: Directory nonexistent
debian/rules:81: recipe for target 'override_dh_auto_install' failed
make[1]: *** [override_dh_auto_install] Error 2
make[1]: Leaving directory '/<>'
debian/rules:45: recipe for target 'binary-arch' failed
make: *** [binary-arch] Error 2
dpkg-buildpackage: error: fakeroot debian/rules binary-arch subprocess returned exit status 2
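
The 'adding SIGNED/*' lines in the log above suggest that the "SIGNED"/* pattern matched no files, so /bin/sh left the pattern unexpanded and the loop ran once on the literal string. A minimal sketch of guarding such a loop, reusing the same kind of existence test the extraction loops earlier in the log apply; list_signed is a hypothetical helper for illustration and is not part of debian/rules:

```shell
# Hypothetical helper: print each entry of a directory, skipping the
# unexpanded pattern that POSIX sh yields when the glob matches nothing.
list_signed() {
    dir=$1
    for signed in "$dir"/*; do
        # With no matches, $signed is the literal string "$dir/*".
        [ -e "$signed" ] || continue
        echo "$signed"
    done
}
```

Called on an empty directory the helper prints nothing, so no package names would be derived from the unexpanded pattern.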

** Affects: linux-signed-aws (Ubuntu)
 Importance: Undecided
 Assignee: Ian May (ian-may)
 Status: Fix Committed

** Affects: linux-signed-aws (Ubuntu Bionic)
 Importance: Undecided
 Assignee: Ian May (ian-may)
 Status: Fix Committed

** Also affects: linux-signed-aws (Ubuntu Bionic)
   Importance: Undecided
   Status: New

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-signed-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1958534

Title:
  building of linux-signed package failing on arm64

Status in linux-signed-aws package in Ubuntu:
  Fix Committed
Status in linux-signed-aws source package in Bionic:
  Fix Committed

Bug description:
  dpkg-buildpackage
  -----------------

  dpkg-buildpackage: info: source package linux-signed-aws
  dpkg-buildpackage: info: source version 4.15.0-1119.126
  dpkg-buildpackage: info: source distribution bionic
   dpkg-source --before-build linux-signed-aws-4.15.0
  dpkg-buildpackage: info: host architecture arm64
  dpkg-source: info: using options from linux-signed-aws-4.15.0/debian/source/options: --diff-ignore --tar-ignore
   fakeroot debian/rules clean
  sed debian/control  \
-e "s/@ABI@/4.15.0-1119/g"  \
-e "s/@UNSIGNED_SRC_PACKAGE@/linux-aws/g"   \
-e "s/@UNSIGNED_SRC_VERSION@/4.15.0-1119.126/g" \
-e 's/@SRCPKGNAME@/linux-signed-aws/g'  \
-e 's/@HEADERS_COMMON@/linux-aws-headers-4.15.0-1119/g' \
-e 's/@HEADERS_ARCH@/linux-headers-4.15.0-1119-aws/g'
  rm -rf ./4.15.0-1119.126 UNSIGNED SIGNED
  rm -f debian/linux-image-*.ins

[Kernel-packages] [Bug 1949532] Re: ubuntu_ltp_controllers tests failing on Impish

2021-11-22 Thread Ian May
** Tags added: aws azures sru-20211108

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1949532

Title:
  ubuntu_ltp_controllers tests failing on Impish

Status in ubuntu-kernel-tests:
  New
Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Almost half of the ubuntu_ltp_controllers tests are failing with the
  general pattern 'cgroup_name already mounted or mount point busy'.

  e.g.
  mount: /dev/cgroup: ltp_cgroup already mounted or mount point busy.
  cgroup_fj_function2_memory 1 TBROK: mount -t cgroup -o memory ltp_cgroup /dev/cgroup failed

  From investigation it seems the issue could be related to the
  transition to cgroup-v2. There has been talk on the ltp mailing list
  that these tests could break one of these days because of the
  transition. Switching to cgroup-v2, likely due to a systemd update,
  could cause these tests to break due to different mount and cgroup
  hierarchy semantics.

  I could only reproduce a subset of the new failures we are seeing, but
  after setting systemd.unified_cgroup_hierarchy=0 on the kernel command
  line, which sets cgroup back to v1, a lot of the failures I could
  reproduce went away.
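
  To check which hierarchy a test machine actually booted with, the
  filesystem type of /sys/fs/cgroup can be inspected. A minimal sketch,
  assuming GNU stat and the usual systemd mount layout; cgroup_mode is a
  hypothetical helper and not part of LTP or ubuntu-kernel-tests:

```shell
# Hypothetical helper: map the filesystem type reported for
# /sys/fs/cgroup to a cgroup mode.
cgroup_mode() {
    case "$1" in
        cgroup2fs) echo "v2 (unified)" ;;   # unified hierarchy mounted
        tmpfs)     echo "v1 or hybrid" ;;   # per-controller v1 mounts below a tmpfs
        *)         echo "unknown" ;;
    esac
}

# Typical use on a running system (GNU stat):
#   cgroup_mode "$(stat -fc %T /sys/fs/cgroup)"
```

  With systemd.unified_cgroup_hierarchy=0 on the kernel command line,
  the helper would be expected to report "v1 or hybrid".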

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1949532/+subscriptions


-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp


[Kernel-packages] [Bug 1949532] Re: ubuntu_ltp_controllers tests failing on Impish

2021-11-22 Thread Ian May
Found on impish/linux-azure: 5.13.0-1008.9


[Kernel-packages] [Bug 1949532] Re: ubuntu_ltp_controllers tests failing on Impish

2021-11-22 Thread Ian May
Found on impish/linux-aws: 5.13.0-1007.8


[Kernel-packages] [Bug 1946149] Re: Bionic/linux-aws Boot failure downgrading from Bionic/linux-aws-5.4 on r5.metal

2021-10-14 Thread Ian May
As I was bisecting the commits, I was attempting to take advantage of
parallelism. While my test kernel was building I would deploy a clean
AWS r5.metal instance.  I started seeing test kernels boot that I
wouldn't expect to boot.  So I decided as a sanity test, I would deploy
an r5.metal instance, let it sit idle for 20 minutes and then install
the known problematic 4.15.0-1113-aws kernel.  Sure enough it booted
fine.  Tried the same thing again with letting it sit idle 20 mins and
it worked again.  So this does appear to be a race condition.  I think
this also explains some of the erratic test results I've seen while
looking at this bug.  Fortunately the console output gave us some
definitive proof as to where the problem was occurring.

With that being said, it appears I have found the offending commits.

PCI/MSI: Enforce that MSI-X table entry is masked for update
PCI/MSI: Enforce MSI[X] entry updates to be visible

https://git.launchpad.net/~canonical-kernel/ubuntu/+source/linux-aws/+git/bionic/commit/?id=27571f5ea1dd074924b41a455c50dc2278e8c2b7

https://git.launchpad.net/~canonical-kernel/ubuntu/+source/linux-aws/+git/bionic/commit/?id=2478f358c2b35fea04e005447ce99ad8dc53fd5d

More specifically the hang is introduced by 'PCI/MSI: Enforce that MSI-X
table entry is masked for update', but it isn't a clean revert without
reverting the other commit.  So for a quick test confirmation I reverted
both.

I have not had a chance to determine why these commits are causing the
problem, but with these reverted in a test build on top of
4.15.0-1113-aws, I can migrate from 5.4 to 4.15 as soon as the instance
is available.  I've done at least 6 attempts now and all have passed,
while the same steps without the reverts have all hung (unless I wait 20
mins).
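
Because the hang is intermittent, a single passing boot proves little, which is why the attempts above were repeated. A minimal sketch of tallying failures over repeated attempts; count_failures and the command passed to it are hypothetical, and in practice each attempt would be a fresh r5.metal deployment followed by the kernel install and reboot rather than a local command:

```shell
# Hypothetical helper: count_failures N CMD [ARGS...]
# Runs CMD N times and prints how many of the runs failed.
count_failures() {
    attempts=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # A non-zero exit status counts as one failed attempt.
        "$@" || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}
```

Comparing the tally for the reverted build against the tally for the unmodified build is what separates a race from a deterministic failure.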

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1946149

Title:
  Bionic/linux-aws Boot failure downgrading from Bionic/linux-aws-5.4 on
  r5.metal

Status in linux-aws package in Ubuntu:
  New

Bug description:
  When creating an r5.metal instance on AWS, the default kernel is
  bionic/linux-aws-5.4 (5.4.0-1056-aws). When changing to
  bionic/linux-aws (4.15.0-1113-aws), the machine fails to boot the 4.15
  kernel.

  If I remove these patches, the instance correctly boots the 4.15
  kernel:

  https://lists.ubuntu.com/archives/kernel-team/2021-September/123963.html

  With that being said, after successfully updating to a 4.15 kernel
  without those patches applied, I can then upgrade to a 4.15 kernel
  with the above patches included, and the instance will boot properly.

  This problem only appears on metal instances, which use NVMe instead
  of XVDA devices.

  AWS instances also use the 'discard' mount option with ext4, so I
  thought there could be a race condition between ext4 discard and
  journal flush.  I removed 'discard' from the mount options and
  rebooted the 5.4 kernel prior to installing the 4.15 kernel, but the
  instance still wouldn't boot after installing the 4.15 kernel.

  I have been unable to capture a stack trace using
  'aws get-console-output'. After enabling kdump I was unable to
  replicate the failure, so there must be some sort of race with ext4
  and/or nvme.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1946149/+subscriptions


[Kernel-packages] [Bug 1946149] Re: Bionic/linux-aws Boot failure downgrading from Bionic/linux-aws-5.4 on r5.metal

2021-10-14 Thread Ian May
Hi Mauricio,

Thanks for getting this info.  This is very helpful!  I see a few
potential patches between 4.15.0-159.167 and 4.15.0-160.168 that could
be related to the hang.  This will help greatly with the bisect.

Ian


[Kernel-packages] [Bug 1830585] Re: cpuset_memory_spread from controllers test suite in LTP failed (hog the memory on the unexpected node)

2021-10-13 Thread Ian May
Found on bionic/linux-oracle-5.4: 5.4.0-1056.60~18.04.1 -
BM.Standard2.52

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-azure in Ubuntu.
https://bugs.launchpad.net/bugs/1830585

Title:
  cpuset_memory_spread from controllers test suite in LTP failed (hog
  the memory on the unexpected node)

Status in ubuntu-kernel-tests:
  Triaged
Status in linux package in Ubuntu:
  Confirmed
Status in linux-azure package in Ubuntu:
  New
Status in linux source package in Bionic:
  New
Status in linux-azure source package in Bionic:
  New
Status in linux source package in Disco:
  Won't Fix
Status in linux-azure source package in Disco:
  Won't Fix
Status in linux source package in Focal:
  New
Status in linux-azure source package in Focal:
  New
Status in linux source package in Hirsute:
  New
Status in linux-azure source package in Hirsute:
  New

Bug description:
  Test failed with:
  cpuset_memory_spread 7 TFAIL: hog the memory on the unexpected node(FilePages_For_Nodes(KB): _0: 2276
  _1: 102428, Expect Nodes: 1).

  <<<test_start>>>
  tag=cpuset_memory_spread stime=1558937747
  cmdline="   cpuset_memory_spread_testset.sh"
  contacts=""
  analysis=exit
  <<<test_output>>>
  100+0 records in
  100+0 records out
  104857600 bytes (105 MB, 100 MiB) copied, 0.0993112 s, 1.1 GB/s
  cpuset_memory_spread 1 TPASS: Cpuset memory spread page test succeeded.
  cpuset_memory_spread 3 TPASS: Cpuset memory spread page test succeeded.
  cpuset_memory_spread 5 TPASS: Cpuset memory spread page test succeeded.
  cpuset_memory_spread 7 TFAIL: hog the memory on the unexpected node(FilePages_For_Nodes(KB): _0: 2276
  _1: 102428, Expect Nodes: 1).
  cpuset_memory_spread 9 TPASS: Cpuset memory spread page test succeeded.
  cpuset_memory_spread 11 TPASS: Cpuset memory spread page test succeeded.
  cpuset_memory_spread 13 TPASS: Cpuset memory spread page test succeeded.
  <<<execution_status>>>
  initiation_status="ok"
  duration=10 termination_type=exited termination_id=1 corefile=no
  cutime=364 cstime=383
  <<<test_end>>>

  ProblemType: Bug
  DistroRelease: Ubuntu 19.04
  Package: linux-image-5.0.0-15-generic 5.0.0-15.16
  ProcVersionSignature: Ubuntu 5.0.0-15.16-generic 5.0.6
  Uname: Linux 5.0.0-15-generic x86_64
  AlsaDevices:
   total 0
   crw-rw 1 root audio 116,  1 May 27 05:39 seq
   crw-rw 1 root audio 116, 33 May 27 05:39 timer
  AplayDevices: Error: [Errno 2] No such file or directory: 'aplay': 'aplay'
  ApportVersion: 2.20.10-0ubuntu27
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory: 'arecord': 'arecord'
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  Date: Mon May 27 06:16:49 2019
  IwConfig: Error: [Errno 2] No such file or directory: 'iwconfig': 'iwconfig'
  MachineType: HP ProLiant DL360 Gen9
  PciMultimedia:

  ProcFB: 0 mgadrmfb
  ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-5.0.0-15-generic root=UUID=6422cfdd-2a69-4c0b-9784-6809a77ab980 ro
  RelatedPackageVersions:
   linux-restricted-modules-5.0.0-15-generic N/A
   linux-backports-modules-5.0.0-15-generic  N/A
   linux-firmware                            1.178.1
  RfKill: Error: [Errno 2] No such file or directory: 'rfkill': 'rfkill'
  SourcePackage: linux
  UpgradeStatus: No upgrade log present (probably fresh install)
  dmi.bios.date: 04/25/2017
  dmi.bios.vendor: HP
  dmi.bios.version: P89
  dmi.board.name: ProLiant DL360 Gen9
  dmi.board.vendor: HP
  dmi.chassis.type: 23
  dmi.chassis.vendor: HP
  dmi.modalias: dmi:bvnHP:bvrP89:bd04/25/2017:svnHP:pnProLiantDL360Gen9:pvr:rvnHP:rnProLiantDL360Gen9:rvr:cvnHP:ct23:cvr:
  dmi.product.family: ProLiant
  dmi.product.name: ProLiant DL360 Gen9
  dmi.product.sku: 780020-S01
  dmi.sys.vendor: HP

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1830585/+subscriptions


[Kernel-packages] [Bug 1876687] Re: function traceon/off triggers in ftace from ubuntu_kernel_selftests failed on B/F

2021-10-13 Thread Ian May
Found on bionic/linux-gcp-fips: 4.15.0-2020.22 - n1-highcpu-4

** Tags added: gcp sru-20210927

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1876687

Title:
  function traceon/off triggers in ftace from ubuntu_kernel_selftests
  failed on B/F

Status in ubuntu-kernel-tests:
  New
Status in linux package in Ubuntu:
  Incomplete

Bug description:
  Issue found on Focal 5.4.0-29.33 with node amaura (passed on rizzo,
  rizzo failed with other failures)

  # [27] ftrace - test for function traceon/off triggers [FAIL]

  Need to retest on amaura to check if this is just a glitch.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1876687/+subscriptions


[Kernel-packages] [Bug 1946149] Re: Bionic/linux-aws Boot failure downgrading from Bionic/linux-aws-5.4 on r5.metal

2021-10-08 Thread Ian May
Mauricio,

Interesting update. I agree that we need more info as to what the state
is when the instance won't boot after switching to the new 4.15 kernel.
I'll check with my team in the morning and see if we can get additional
info from AWS.

I was trying a few more scenarios this evening, the first being the most
interesting.

Scenario 1
start with 5.4.0-1056-aws
install 5.4.0-1058-aws
reboot
confirm 5.4.0-1058-aws booted
reboot AGAIN
install 4.15.0-1113-aws
reboot
machine booted 4.15.0-1113-aws successfully

Scenario 2
start with 5.4.0-1056-aws
install 4.15.0-1112-aws
reboot
install 4.15.0-1113-aws
reboot
confirmed 4.15.0-1113-aws booted
then booted back into 5.4.0-1056-aws
removed 4.15.0-1112-aws and 4.15.0-1113-aws
rebooted again for good measure
confirmed still running 5.4.0-1056-aws
installed 4.15.0-1113-aws
rebooted
4.15.0-1113-aws successfully loaded
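
The "confirm ... booted" steps in both scenarios amount to comparing the output of uname -r with the release that was just installed. A minimal sketch of that check; confirm_kernel is a hypothetical helper, and its optional second argument exists only so the comparison can be exercised without rebooting:

```shell
# Hypothetical helper: confirm_kernel EXPECTED [RUNNING]
# Succeeds if the running kernel release equals EXPECTED.
# RUNNING defaults to the output of `uname -r`.
confirm_kernel() {
    expected=$1
    running=${2:-$(uname -r)}
    if [ "$running" = "$expected" ]; then
        echo "booted $running"
    else
        echo "expected $expected, running $running" >&2
        return 1
    fi
}
```

After each reboot in the scenarios above, the call would be along the lines of: confirm_kernel 4.15.0-1113-aws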


[Kernel-packages] [Bug 1946149] Re: Bionic/linux-aws Boot failure downgrading from Bionic/linux-aws-5.4 on r5.metal

2021-10-07 Thread Ian May
Just want to add an update.  I haven't been able to replicate
successfully booting 4.15.0-1113-aws from 5.4.0-1058-aws, so I'm
questioning whether I made a mistake the time I thought it was
successful.


[Kernel-packages] [Bug 1946149] Re: Bionic/linux-aws Boot failure downgrading from Bionic/linux-aws-5.4 on r5.metal

2021-10-07 Thread Ian May
Thanks for the in-depth update Mauricio!  Is there any investigation
you'd like me to specifically target?


[Kernel-packages] [Bug 1946149] Re: Bionic/linux-aws Boot failure downgrading from Bionic/linux-aws-5.4 on r5.metal

2021-10-06 Thread Ian May
** Description changed:

  When creating an r5.metal instance on AWS, the default kernel is
  bionic/linux-aws-5.4(5.4.0-1056-aws), when changing to bionic/linux-
  aws(4.15.0-1113-aws) the machine fails to boot the 4.15 kernel.
  
  If I remove these patches the instance correctly boots the 4.15 kernel
  
  https://lists.ubuntu.com/archives/kernel-team/2021-September/123963.html
  
  With that being said, after successfully updating to the 4.15 without
  those patches applied, I can then upgrade to a 4.15 kernel with the
  above patches included, and the instance will boot properly.
  
  This problem only appears on metal instances, which uses NVME instead of
  XVDA devices.
  
  AWS instances also use the 'discard' mount option with ext4, thought
  maybe there could be a race condition between ext4 discard and journal
- flush.  Removed 'discard' mount and rebooted 5.4 kernel prior to 4.15
- kernel installation, but still wouldn't boot after installing the 4.15
- kernel.
+ flush.  Removed 'discard' from mount options and rebooted 5.4 kernel
+ prior to 4.15 kernel installation, but still wouldn't boot after
+ installing the 4.15 kernel.
  
  I have been unable to capture a stack trace using 'aws get-console-
  output'. After enabling kdump I was unable to replicate the failure. So
  there must be some sort of race with either ext4 and/or nvme.


[Kernel-packages] [Bug 1946149] Re: Bionic/linux-aws Boot failure downgrading from Bionic/linux-aws-5.4 on r5.metal

2021-10-06 Thread Ian May
Confirmed it does work to first upgrade bionic/linux-5.4 from
5.4.0-1056-aws to 5.4.0-1058-aws and then update to 4.15.0-1113-aws.


[Kernel-packages] [Bug 1946149] Re: Bionic/linux-aws Boot failure downgrading from Bionic/linux-aws-5.4 on r5.metal

2021-10-05 Thread Ian May
** Description changed:

  When creating an r5.metal instance on AWS, the default kernel is
  bionic/linux-aws-5.4(5.4.0-1056-aws), when changing to bionic/linux-
- aws(4.15.0-1113-aws) the machine fails to boot 4.15 kernel.
+ aws(4.15.0-1113-aws) the machine fails to boot the 4.15 kernel.
  
  If I remove these patches the instance correctly boots the 4.15 kernel
  
  https://lists.ubuntu.com/archives/kernel-team/2021-September/123963.html
  
- But after successfully updating to the 4.15 without those patches
- applied, I can then upgrade to a 4.15 kernel with the above patches
- included, and the instance will boot properly.
+ With that being said, after successfully updating to the 4.15 without
+ those patches applied, I can then upgrade to a 4.15 kernel with the
+ above patches included, and the instance will boot properly.
  
  This problem only appears on metal instances, which uses NVME instead of
  XVDA devices.
  
  AWS instances also use the 'discard' mount option with ext4, thought
  maybe there could be a race condition between ext4 discard and journal
  flush.  Removed 'discard' mount and rebooted 5.4 kernel prior to 4.15
  kernel installation, but still wouldn't boot after installing the 4.15
  kernel.
  
  I have been unable to capture a stack trace using 'aws get-console-
  output'. After enabling kdump I was unable to replicate the failure. So
  there must be some sort of race with either ext4 and/or nvme.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1946149

Title:
  Bionic/linux-aws Boot failure downgrading from Bionic/linux-aws-5.4 on
  r5.metal

Status in linux-aws package in Ubuntu:
  New

Bug description:
  When creating an r5.metal instance on AWS, the default kernel is
  bionic/linux-aws-5.4 (5.4.0-1056-aws). After changing to
  bionic/linux-aws (4.15.0-1113-aws), the machine fails to boot the
  4.15 kernel.

  If I remove these patches, the instance correctly boots the 4.15
  kernel:

  https://lists.ubuntu.com/archives/kernel-
  team/2021-September/123963.html

  That said, after successfully updating to a 4.15 kernel without
  those patches applied, I can then upgrade to a 4.15 kernel with the
  above patches included, and the instance boots properly.

  This problem only appears on metal instances, which use NVMe instead
  of XVDA devices.

  AWS instances also use the 'discard' mount option with ext4, so I
  thought there might be a race condition between ext4 discard and
  journal flush. I removed the 'discard' mount option and rebooted the
  5.4 kernel prior to installing the 4.15 kernel, but the instance
  still wouldn't boot after the 4.15 kernel was installed.

  I have been unable to capture a stack trace using 'aws
  get-console-output'. After enabling kdump I was unable to replicate
  the failure, which suggests some sort of race involving ext4 and/or
  NVMe.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1946149/+subscriptions


[Kernel-packages] [Bug 1946149] Re: Bionic/linux-aws Boot failure downgrading from Bionic/linux-aws-5.4 on r5.metal

2021-10-05 Thread Ian May
** Description changed:

  When creating an r5.metal instance on AWS, the default kernel is
  bionic/linux-aws-5.4(5.4.0-1056-aws), when changing to bionic/linux-
  aws(4.15.0-1113-aws) the machine fails to boot 4.15 kernel.
  
  If I remove these patches the instance correctly boots the 4.15 kernel
  
  https://lists.ubuntu.com/archives/kernel-team/2021-September/123963.html
  
  But after successfully updating to the 4.15 without those patches
  applied, I can then upgrade to a test kernel with the above patches
  included, and the instance will boot properly.
  
  This problem only appears on metal instances, which uses NVME instead of
  XVDA devices.
  
  AWS instances also use the 'discard' mount option with ext4, thought
  maybe there could be a race condition between ext4 discard and journal
  flush.  Removed 'discard' mount and rebooted 5.4 kernel prior to 4.15
  kernel installation, but still wouldn't boot.
+ 
+ I have been unable to capture a stack trace using 'aws get-console-
+ output'. I enabled kdump and was unable to replicate the failure. So
+ there must be some sort of race with either ext4 and/or nvme.

** Description changed:

  When creating an r5.metal instance on AWS, the default kernel is
  bionic/linux-aws-5.4(5.4.0-1056-aws), when changing to bionic/linux-
  aws(4.15.0-1113-aws) the machine fails to boot 4.15 kernel.
  
  If I remove these patches the instance correctly boots the 4.15 kernel
  
  https://lists.ubuntu.com/archives/kernel-team/2021-September/123963.html
  
  But after successfully updating to the 4.15 without those patches
- applied, I can then upgrade to a test kernel with the above patches
+ applied, I can then upgrade to a 4.15 kernel with the above patches
  included, and the instance will boot properly.
  
  This problem only appears on metal instances, which uses NVME instead of
  XVDA devices.
  
  AWS instances also use the 'discard' mount option with ext4, thought
  maybe there could be a race condition between ext4 discard and journal
  flush.  Removed 'discard' mount and rebooted 5.4 kernel prior to 4.15
  kernel installation, but still wouldn't boot.
  
  I have been unable to capture a stack trace using 'aws get-console-
  output'. I enabled kdump and was unable to replicate the failure. So
  there must be some sort of race with either ext4 and/or nvme.

** Description changed:

  When creating an r5.metal instance on AWS, the default kernel is
  bionic/linux-aws-5.4(5.4.0-1056-aws), when changing to bionic/linux-
  aws(4.15.0-1113-aws) the machine fails to boot 4.15 kernel.
  
  If I remove these patches the instance correctly boots the 4.15 kernel
  
  https://lists.ubuntu.com/archives/kernel-team/2021-September/123963.html
  
  But after successfully updating to the 4.15 without those patches
  applied, I can then upgrade to a 4.15 kernel with the above patches
  included, and the instance will boot properly.
  
  This problem only appears on metal instances, which uses NVME instead of
  XVDA devices.
  
  AWS instances also use the 'discard' mount option with ext4, thought
  maybe there could be a race condition between ext4 discard and journal
  flush.  Removed 'discard' mount and rebooted 5.4 kernel prior to 4.15
- kernel installation, but still wouldn't boot.
+ kernel installation, but still wouldn't boot after installing the 4.15
+ kernel.
  
  I have been unable to capture a stack trace using 'aws get-console-
- output'. I enabled kdump and was unable to replicate the failure. So
+ output'. After enabling kdump I was unable to replicate the failure. So
  there must be some sort of race with either ext4 and/or nvme.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1946149

Title:
  Bionic/linux-aws Boot failure downgrading from Bionic/linux-aws-5.4 on
  r5.metal

Status in linux-aws package in Ubuntu:
  New

Bug description:
  When creating an r5.metal instance on AWS, the default kernel is
  bionic/linux-aws-5.4(5.4.0-1056-aws), when changing to bionic/linux-
  aws(4.15.0-1113-aws) the machine fails to boot 4.15 kernel.

  If I remove these patches the instance correctly boots the 4.15 kernel

  https://lists.ubuntu.com/archives/kernel-
  team/2021-September/123963.html

  But after successfully updating to the 4.15 without those patches
  applied, I can then upgrade to a 4.15 kernel with the above patches
  included, and the instance will boot properly.

  This problem only appears on metal instances, which uses NVME instead
  of XVDA devices.

  AWS instances also use the 'discard' mount option with ext4, thought
  maybe there could be a race condition between ext4 discard and journal
  flush.  Removed 'discard' mount and rebooted 5.4 kernel prior to 4.15
  kernel installation, but still wouldn't boot after installing the 4.15
  kernel.

  I have been unable to capture a stack trace using 'aws get-console-
  output'. After enabling 

[Kernel-packages] [Bug 1946149] Re: Bionic/linux-aws Boot failure downgrading from Bionic/linux-aws-5.4 on r5.metal

2021-10-05 Thread Ian May
Have been unable to capture a stack trace using 'aws get-console-
output'. Enabled kdump and was unable to replicate the failed boot,
which makes this feel like a race condition with NVME.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1946149

Title:
  Bionic/linux-aws Boot failure downgrading from Bionic/linux-aws-5.4 on
  r5.metal

Status in linux-aws package in Ubuntu:
  New

Bug description:
  When creating an r5.metal instance on AWS, the default kernel is
  bionic/linux-aws-5.4(5.4.0-1056-aws), when changing to bionic/linux-
  aws(4.15.0-1113-aws) the machine fails to boot 4.15 kernel.

  If I remove these patches the instance correctly boots the 4.15 kernel

  https://lists.ubuntu.com/archives/kernel-
  team/2021-September/123963.html

  But after successfully updating to the 4.15 without those patches
  applied, I can then upgrade to a test kernel with the above patches
  included, and the instance will boot properly.

  This problem only appears on metal instances, which uses NVME instead
  of XVDA devices.

  AWS instances also use the 'discard' mount option with ext4, thought
  maybe there could be a race condition between ext4 discard and journal
  flush.  Removed 'discard' mount and rebooted 5.4 kernel prior to 4.15
  kernel installation, but still wouldn't boot.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1946149/+subscriptions


[Kernel-packages] [Bug 1946149] Re: Bionic/linux-aws Boot failure downgrading from Bionic/linux-aws-5.4 on r5.metal

2021-10-05 Thread Ian May
** Description changed:

  When creating an r5.metal instance on AWS, the default kernel is
  bionic/linux-aws-5.4(5.4.0-1056-aws), when changing to bionic/linux-
  aws(4.15.0-1113-aws) the machine fails to boot 4.15 kernel.
+ 
+ If I remove these patches the instance correctly boots the 4.15 kernel
+ 
+ https://lists.ubuntu.com/archives/kernel-team/2021-September/123963.html
+ 
+ But after successfully updating to the 4.15 without those patches
+ applied, I can then upgrade to a test kernel with the above patches
+ included, and the instance will boot properly.
+ 
+ This problem only appears on metal instances, which uses NVME instead of
+ XVDA devices.
+ 
+ AWS instances also use the 'discard' mount option with ext4, thought
+ maybe there could be a race condition between ext4 discard and journal
+ flush.  Removed 'discard' mount and rebooted 5.4 kernel prior to 4.15
+ kernel installation, but still wouldn't boot.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1946149

Title:
  Bionic/linux-aws Boot failure downgrading from Bionic/linux-aws-5.4 on
  r5.metal

Status in linux-aws package in Ubuntu:
  New

Bug description:
  When creating an r5.metal instance on AWS, the default kernel is
  bionic/linux-aws-5.4(5.4.0-1056-aws), when changing to bionic/linux-
  aws(4.15.0-1113-aws) the machine fails to boot 4.15 kernel.

  If I remove these patches the instance correctly boots the 4.15 kernel

  https://lists.ubuntu.com/archives/kernel-
  team/2021-September/123963.html

  But after successfully updating to the 4.15 without those patches
  applied, I can then upgrade to a test kernel with the above patches
  included, and the instance will boot properly.

  This problem only appears on metal instances, which uses NVME instead
  of XVDA devices.

  AWS instances also use the 'discard' mount option with ext4, thought
  maybe there could be a race condition between ext4 discard and journal
  flush.  Removed 'discard' mount and rebooted 5.4 kernel prior to 4.15
  kernel installation, but still wouldn't boot.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1946149/+subscriptions


[Kernel-packages] [Bug 1946149] [NEW] Bionic/linux-aws Boot failure downgrading from Bionic/linux-aws-5.4 on r5.metal

2021-10-05 Thread Ian May
Public bug reported:

When creating an r5.metal instance on AWS, the default kernel is
bionic/linux-aws-5.4 (5.4.0-1056-aws). After changing to
bionic/linux-aws (4.15.0-1113-aws), the machine fails to boot the
4.15 kernel.

** Affects: linux-aws (Ubuntu)
 Importance: Undecided
 Status: New

** Package changed: ubuntu => linux-aws (Ubuntu)

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1946149

Title:
  Bionic/linux-aws Boot failure downgrading from Bionic/linux-aws-5.4 on
  r5.metal

Status in linux-aws package in Ubuntu:
  New

Bug description:
  When creating an r5.metal instance on AWS, the default kernel is
  bionic/linux-aws-5.4(5.4.0-1056-aws), when changing to bionic/linux-
  aws(4.15.0-1113-aws) the machine fails to boot 4.15 kernel.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1946149/+subscriptions


[Kernel-packages] [Bug 1931325] Re: cfs_bandwidth01 in sched from ubuntu_ltp_stable failed on B-4.15

2021-08-31 Thread Ian May
Found on: bionic/linux-aws: 4.15.0-.118

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1931325

Title:
  cfs_bandwidth01 in sched from ubuntu_ltp_stable failed on B-4.15

Status in ubuntu-kernel-tests:
  Confirmed
Status in linux package in Ubuntu:
  Fix Released
Status in linux source package in Bionic:
  Confirmed

Bug description:
  [Impact]
  Test case cfs_bandwidth01 in LTP sched test suite is a reproducer
  of a CFS unthrottle_cfs_rq() issue (fe61468b2cbc2b sched/fair: Fix
  enqueue_task_fair warning).

  This test triggers a warning on our 4.15 kernel:
   LTP: starting cfs_bandwidth01 (cfs_bandwidth01 -i 5)
   ------------[ cut here ]------------
   rq->tmp_alone_branch != &rq->leaf_cfs_rq_list
   WARNING: CPU: 0 PID: 0 at /build/linux-fYK9kF/linux-4.15.0/kernel/sched/fair.c:393 unthrottle_cfs_rq+0x16f/0x200
   Modules linked in: input_leds joydev serio_raw mac_hid qemu_fw_cfg kvm_intel kvm irqbypass sch_fq_codel binfmt_misc ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear cirrus ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm psmouse virtio_blk pata_acpi floppy virtio_net i2c_piix4
   CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-144-generic #148-Ubuntu
   Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
   RIP: 0010:unthrottle_cfs_rq+0x16f/0x200
   RSP: 0018:989ebfc03e80 EFLAGS: 00010082
   RAX:  RBX: 989eb4c6ac00 RCX: 
   RDX: 0005 RSI: acb63c4d RDI: 0046
   RBP: 989ebfc03ea8 R08: 00af39e61b33 R09: acb63c20
   R10:  R11: 0001 R12: 989eb57fe400
   R13: 989ebfc21900 R14: 0001 R15: 0001
   FS: () GS:989ebfc0() knlGS:
   CS: 0010 DS:  ES:  CR0: 80050033
   CR2: 55593258d618 CR3: 7a044000 CR4: 06f0
   DR0:  DR1:  DR2: 
   DR3:  DR6: fffe0ff0 DR7: 0400
   Call Trace:
   
   distribute_cfs_runtime+0xc3/0x110
   sched_cfs_period_timer+0xff/0x220
   ? sched_cfs_slack_timer+0xd0/0xd0
   __hrtimer_run_queues+0xdf/0x230
   hrtimer_interrupt+0xa0/0x1d0
   smp_apic_timer_interrupt+0x6f/0x140
   apic_timer_interrupt+0x90/0xa0
   
   RIP: 0010:native_safe_halt+0x12/0x20
   RSP: 0018:ac603e28 EFLAGS: 0246 ORIG_RAX: ff11
   RAX: abbc9280 RBX:  RCX: 
   RDX:  RSI:  RDI: 
   RBP: ac603e28 R08: 00af39850067 R09: 989e73749d00
   R10:  R11: 7fff R12: 
   R13:  R14:  R15: 
   ? __sched_text_end+0x1/0x1
   default_idle+0x20/0x100
   arch_cpu_idle+0x15/0x20
   default_idle_call+0x23/0x30
   do_idle+0x172/0x1f0
   cpu_startup_entry+0x73/0x80
   rest_init+0xae/0xb0
   start_kernel+0x4dc/0x500
   x86_64_start_reservations+0x24/0x26
   x86_64_start_kernel+0x74/0x77
   secondary_startup_64+0xa5/0xb0
   Code: 50 09 00 00 49 39 85 60 09 00 00 74 68 80 3d 3a 6e 54 01 00 75 5f 31 db 48 c7 c7 c0 3d 2d ac c6 05 28 6e 54 01 01 e8 11 36 fc ff <0f> 0b 48 85 db 74 43 49 8b 85 78 09 00 00 49 39 85 70 09 00 00
   ---[ end trace b6b9a70bc2945c0c ]---

  [Fix]
  Based on the test case description, we will need these fixes:
    * fe61468b2cbc2b sched/fair: Fix enqueue_task_fair warning
    * b34cb07dde7c23 sched/fair: Fix enqueue_task_fair() warning some more
    * 39f23ce07b9355 sched/fair: Fix unthrottle_cfs_rq() for leaf_cfs_rq list
    * 6d4d22468dae3d sched/fair: Reorder enqueue/dequeue_task_fair path
    * 5ab297bab98431 sched/fair: Fix reordering of enqueue/dequeue_task_fair()

  Backport needed for Bionic since we're missing some new variables /
  coding style changes introduced in the following commits (and their
  corresponding fixes):
    * 97fb7a0a8944bd sched: Clean up and harmonize the coding style of the scheduler code base
    * 9f68395333ad7f sched/pelt: Add a new runnable average signal
    * 6212437f0f6043 sched/fair: Fix runnable_avg for throttled cfs
    * 43e9f7f231e40e sched/fair: Start tracking SCHED_IDLE tasks count in cfs_rq

  I have also searched the upstream tree for any other commits that
  claim to fix these, but didn't find any.

  [Test]
  Test kernel can be found here:
  https://people.canonical.com/~phlin/kernel/lp-1931325-cfs_bandwidth01/

  With these patches applied, the test passes without triggering this
  warning.
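
  To check a saved kernel log for this warning after a test run,
  something like the following sketch could be used. The function
  name log_has_cfs_warning and the idea of saving dmesg output to a
  file are assumptions for illustration, not part of the original
  test procedure. It looks for a "WARNING: CPU:" line with
  unthrottle_cfs_rq on that line or one of the next two, which covers
  both the wrapped and unwrapped forms of the trace.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): succeed if the log file
# contains a "WARNING: CPU:" line with unthrottle_cfs_rq on the same line
# or on one of the two lines that follow it.
# Usage: log_has_cfs_warning LOG_FILE
log_has_cfs_warning() {
    log_file=$1
    awk '
        /WARNING: CPU:/ { window = 3 }
        window > 0 {
            if (/unthrottle_cfs_rq/) found = 1
            window--
        }
        END { exit(found ? 0 : 1) }
    ' "$log_file"
}
```

  For example, after running cfs_bandwidth01, save the log with
  "dmesg > dmesg.txt" and call "log_has_cfs_warning dmesg.txt"; a
  non-zero exit status means the warning was not found.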

  

[Kernel-packages] [Bug 1940261] Re: ubuntu_seccomp 11-basic-basic_errors failure on X/oracle

2021-08-31 Thread Ian May
Found on: bionic/linux-aws: 4.15.0-.118

** Tags added: aws

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-oracle in Ubuntu.
https://bugs.launchpad.net/bugs/1940261

Title:
  ubuntu_seccomp 11-basic-basic_errors failure on X/oracle

Status in ubuntu-kernel-tests:
  New
Status in linux package in Ubuntu:
  Incomplete
Status in linux-oracle package in Ubuntu:
  New
Status in linux source package in Xenial:
  Incomplete
Status in linux-oracle source package in Xenial:
  New
Status in linux source package in Bionic:
  Incomplete
Status in linux-oracle source package in Bionic:
  New

Bug description:
  Xenial/Oracle 4.15.0-1079.87~16.04.1 fails 11-basic-basic_errors test
  from ubuntu_seccomp on all Oracle cloud instances:

   batch name: 11-basic-basic_errors
   test mode:  c
   test type:  basic
  Test 11-basic-basic_errors%%001-1 result:   FAILURE 11-basic-basic_errors rc=255

  Base kernel bionic/linux-oracle/4.15.0-1079.87 is OK.
  Previous cycle (xenial/linux-oracle/4.15.0-1077.85~16.04.1) is OK, so this
  looks like a regression.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu-kernel-tests/+bug/1940261/+subscriptions


[Kernel-packages] [Bug 1932065] Re: Upstream v5.9 introduced 'module' patches that removed exported symbols

2021-07-09 Thread Ian May
** Description changed:

  SRU Justification:
  
  [Impact]
  
  * The following patches removed an exported symbol that will cause
  potential disruption and breakage for customers
  
   modules: inherit TAINT_PROPRIETARY_MODULE
   modules: return licensing information from find_symbol
   modules: rename the licence field in struct symsearch to license
   modules: unexport __module_address
   modules: unexport __module_text_address
   modules: mark each_symbol_section static
   modules: mark find_symbol static
   modules: mark ref_module static
  
  [Fix]
  
  * Temporarily revert as SAUCE patches to allow customers time to make
  necessary changes to support eventual patch changes.
  
  [Test Plan]
  
- * none
+ * Check symbols on running kernel
+  sudo grep -e ' ref_module' -e ' find_symbol' -e ' each_symbol_section$' -e ' __module_address' -e ' __module_text_address' /proc/kallsyms
+ 
+ * Check symbols on all installed kernels
+  sudo grep -e ' ref_module' -e ' find_symbol' -e ' each_symbol_section$' -e ' __module_address' -e ' __module_text_address' /boot/System.map-*
  
  [Where problems could occur]
  
  * The new functionality provided by patches will be removed, since we
  aren't removing existing functionality the risk should be low.

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1932065

Title:
  Upstream v5.9 introduced 'module' patches that removed exported
  symbols

Status in linux package in Ubuntu:
  New
Status in linux source package in Bionic:
  Fix Committed
Status in linux source package in Focal:
  Fix Released
Status in linux source package in Groovy:
  Fix Committed

Bug description:
  SRU Justification:

  [Impact]

  * The following patches removed exported symbols, which could cause
  disruption and breakage for customers:

   modules: inherit TAINT_PROPRIETARY_MODULE
   modules: return licensing information from find_symbol
   modules: rename the licence field in struct symsearch to license
   modules: unexport __module_address
   modules: unexport __module_text_address
   modules: mark each_symbol_section static
   modules: mark find_symbol static
   modules: mark ref_module static

  [Fix]

  * Temporarily revert these changes as SAUCE patches to give customers
  time to make the changes needed to support them once they are
  eventually applied.

  [Test Plan]

  * Check symbols on running kernel
   sudo grep -e ' ref_module' -e ' find_symbol' -e ' each_symbol_section$' -e ' __module_address' -e ' __module_text_address' /proc/kallsyms

  * Check symbols on all installed kernels
   sudo grep -e ' ref_module' -e ' find_symbol' -e ' each_symbol_section$' -e ' __module_address' -e ' __module_text_address' /boot/System.map-*
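
  The two checks above can also be wrapped in a small per-symbol
  report, sketched below. The function name check_module_symbols is
  hypothetical, and unlike the grep commands it matches each name
  exactly against the third field of the symbol table, so treat it as
  an illustration under that assumption and not as a replacement for
  the test plan.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): for a symbol table in
# "address type name" format (/proc/kallsyms or a System.map file), print
# whether each symbol from the test plan is present. Returns non-zero if
# any symbol is missing.
# Usage: check_module_symbols SYMBOL_TABLE
check_module_symbols() {
    symtab=$1
    missing=0
    for sym in ref_module find_symbol each_symbol_section \
               __module_address __module_text_address; do
        # Match the symbol name exactly against the third field.
        if awk -v s="$sym" '$3 == s { found = 1 } END { exit(found ? 0 : 1) }' "$symtab"; then
            echo "$sym present"
        else
            echo "$sym missing"
            missing=1
        fi
    done
    return "$missing"
}
```

  For example, "sudo sh -c '. ./check.sh; check_module_symbols
  /proc/kallsyms'" would report on the running kernel, assuming the
  function has been saved to a file named check.sh.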

  [Where problems could occur]

  * The new functionality provided by these patches will be removed.
  Since we aren't removing existing functionality, the risk should be
  low.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1932065/+subscriptions
