Public bug reported:

While validating FlexRAN on Ubuntu 20.04 with the low-latency kernel, one of our long-run stability tests failed. We are looking for your comments and support in understanding the problem. At this point we believe some additional platform/OS-level fine-tuning is required for the FlexRAN application to run stably. As you know, the FlexRAN L1 application is very sensitive to jitter and latency, so we tried to make sure the FlexRAN worker threads get the highest priority, isolated cores, IRQ isolation, etc.
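As a rough way to check whether the IRQ isolation mentioned above actually holds during a run, two snapshots of /proc/interrupts can be compared for the worker cores. The sketch below is only an illustration: the function names `parse_interrupts` and `interrupts_on` are hypothetical helpers, not part of FlexRAN or any Ubuntu tool, and the parser assumes the usual /proc/interrupts layout (a header row of CPU names followed by one row per interrupt source).

```python
def parse_interrupts(text):
    """Parse /proc/interrupts text into {irq_name: {cpu_index: count}}."""
    lines = text.strip().splitlines()
    # Header row looks like "CPU0 CPU1 ..."; keep the numeric part of each name.
    cpus = [int(name[3:]) for name in lines[0].split()]
    table = {}
    for line in lines[1:]:
        name, _, rest = line.partition(":")
        counts = []
        # Per-CPU counters come first; stop at the first non-numeric field
        # (controller name, trigger type, device name, ...).
        for field in rest.split()[:len(cpus)]:
            if not field.isdigit():
                break
            counts.append(int(field))
        table[name.strip()] = dict(zip(cpus, counts))
    return table


def interrupts_on(before, after, watched):
    """Return {irq_name: {cpu: delta}} for counters that grew on watched CPUs."""
    hits = {}
    for irq, counts in after.items():
        for cpu in watched:
            delta = counts.get(cpu, 0) - before.get(irq, {}).get(cpu, 0)
            if delta > 0:
                hits.setdefault(irq, {})[cpu] = delta
    return hits
```

On the test machine, one could read /proc/interrupts twice some seconds apart while the L1 application is running, parse both snapshots, and pass the worker cores as `watched`. Some local timer (LOC) activity may still appear on nohz_full cores, so a nonzero delta is a hint to investigate rather than proof of the cause.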
With the low-latency kernel, the FlexRAN short-term timer mode tests pass. But when we run the oRAN mode test, the FlexRAN L1 application crashes with an indication that one of the worker threads was not able to complete its processing in the given time. This test is meant to run for a long duration (continuously until interrupted); the goal is to check for at least 12 hours of stability. The failure was observed at random times: ~2hr, ~4hr, ~45min… Our initial thought is that some interrupt was raised and pre-empted the worker core for a longer period of time.

Attached is the dmesg output.

Machine config is:

uname -a:
  Linux flexran-ubuntu 5.4.0-77-lowlatency #86-Ubuntu SMP PREEMPT Thu Jun 17 03:26:36 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Kernel command line:
  # cat /proc/cmdline
  BOOT_IMAGE=/vmlinuz-5.4.0-77-lowlatency root=/dev/mapper/ubuntu--vg-ubuntu--lv ro maybe-ubiquity intel_iommu=on iommu=pt usbcore.autosuspend=-1 selinux=0 enforcing=0 nmi_watchdog=0 softlockup_panic=0 audit=0 cgroup_disable=memory intel_pstate=disable mce=off hugepagesz=1G hugepages=40 hugepagesz=2M hugepages=0 default_hugepagesz=1G kthread_cpus=0,20 irqaffinity=0,20 nohz=on nosoftlockup nohz_full=1-19,21-39 rcu_nocbs=1-19,21-39 skew_tick=1 isolcpus=1-19,21-39
  root@flexran-ubuntu:flexran-21.03#

Application core mask:
  0x1F0001F0
  Cores: 4,5,6,7,8,24,25,26,27,28

CPU info:
  # lscpu
  Architecture:                    x86_64
  CPU op-mode(s):                  32-bit, 64-bit
  Byte Order:                      Little Endian
  Address sizes:                   46 bits physical, 48 bits virtual
  CPU(s):                          40
  On-line CPU(s) list:             0-39
  Thread(s) per core:              2
  Core(s) per socket:              20
  Socket(s):                       1
  NUMA node(s):                    1
  Vendor ID:                       GenuineIntel
  CPU family:                      6
  Model:                           85
  Model name:                      Intel(R) Xeon(R) Gold 6248 CPU @ 2.50GHz
  Stepping:                        7
  CPU MHz:                         2500.301
  BogoMIPS:                        3200.00
  Virtualization:                  VT-x
  L1d cache:                       640 KiB
  L1i cache:                       640 KiB
  L2 cache:                        20 MiB
  L3 cache:                        27.5 MiB
  NUMA node0 CPU(s):               0-39
  Vulnerability Itlb multihit:     KVM: Mitigation: Split huge pages
  Vulnerability L1tf:              Not affected
  Vulnerability Mds:               Not affected
  Vulnerability Meltdown:          Not affected
  Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
  Vulnerability Spectre v1:        Mitigation; usercopy/swapgs barriers and __user pointer sanitization
  Vulnerability Spectre v2:        Mitigation; Enhanced IBRS, IBPB conditional, RSB filling
  Vulnerability Srbds:             Not affected
  Vulnerability Tsx async abort:   Mitigation; TSX disabled
  Flags:                           fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid cqm mpx rdt_a avx512f avx512dq rdseed adx smap clflushopt clwb intel_pt avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts pku ospke avx512_vnni md_clear flush_l1d arch_capabilities

irqbalance service is off:
  sysv-rc-conf --level 12345 irqbalance off

The FlexRAN application (L1 and testmac) runs in a bare-metal environment:
  No VM or container

** Affects: linux-lowlatency (Ubuntu)
     Importance: Undecided
         Status: New

** Attachment added: "dmesg output"
   https://bugs.launchpad.net/bugs/1938580/+attachment/5514894/+files/dmesg_output.txt
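The application core mask and the isolation parameters listed above can be cross-checked mechanically. The following is a minimal sketch, not an official tool: the helper names are made up for illustration, `CMDLINE` is only the relevant subset of the reported kernel command line, and `parse_cpu_list` handles just the simple "a-b,c" list form used in this report.

```python
def cores_from_mask(mask):
    """Return the CPU indices whose bits are set in an integer core mask."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def parse_cpu_list(spec):
    """Expand a simple CPU list such as '1-19,21-39' into a set of ints."""
    cpus = set()
    for part in spec.split(","):
        first, _, last = part.partition("-")
        cpus.update(range(int(first), int(last or first) + 1))
    return cpus


def cmdline_params(cmdline):
    """Collect key=value boot parameters (a repeated key keeps its last value)."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep:
            params[key] = value
    return params


# Values copied from the report; CMDLINE is a subset of /proc/cmdline.
APP_MASK = 0x1F0001F0
CMDLINE = ("kthread_cpus=0,20 irqaffinity=0,20 nohz_full=1-19,21-39 "
           "rcu_nocbs=1-19,21-39 isolcpus=1-19,21-39")

workers = cores_from_mask(APP_MASK)
params = cmdline_params(CMDLINE)
print("worker cores:", workers)

# Every worker core should be covered by each isolation-related parameter.
for key in ("isolcpus", "nohz_full", "rcu_nocbs"):
    missing = sorted(set(workers) - parse_cpu_list(params[key]))
    print(key, "covers all worker cores" if not missing else f"misses {missing}")

# Worker cores should not overlap the CPUs that are left to handle IRQs.
overlap = sorted(set(workers) & parse_cpu_list(params["irqaffinity"]))
print("overlap with irqaffinity:", overlap)
```

For the values in this report, the mask decodes to cores 4-8 and 24-28, all inside the isolated range 1-19,21-39 and disjoint from the housekeeping CPUs 0 and 20, so the boot parameters themselves look consistent with the intended isolation.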
https://bugs.launchpad.net/bugs/1938580

Title:
  Ubuntu low-latency kernel 20.04 fails stability tests