[Group.of.nepali.translators] [Bug 1819437] Re: transient mon<->osd connectivity HEALTH_WARN events don't self clear in 13.2.4
** Changed in: ceph (Ubuntu Eoan)
       Status: In Progress => Fix Released

-- 
You received this bug notification because you are a member of नेपाली भाषा समायोजकहरुको समूह, which is subscribed to Xenial.
Matching subscriptions: Ubuntu 16.04 Bugs
https://bugs.launchpad.net/bugs/1819437

Title:
  transient mon<->osd connectivity HEALTH_WARN events don't self clear in 13.2.4

Status in ceph package in Ubuntu:
  Fix Released
Status in ceph source package in Xenial:
  Invalid
Status in ceph source package in Bionic:
  In Progress
Status in ceph source package in Eoan:
  Fix Released
Status in ceph source package in Focal:
  Fix Released

Bug description:
  In a recently juju-deployed 13.2.4 ceph cluster (as part of an
  OpenStack Rocky deploy) we experienced a non-clearing HEALTH_WARN
  event that appeared to be associated with a short planned network
  outage, but did not clear without human intervention:

      health: HEALTH_WARN
              6 slow ops, oldest one blocked for 112899 sec, daemons [mon.shinx,mon.sliggoo] have slow ops.

  We can correlate this back to a known network event, but all OSDs are
  up and the cluster otherwise looks healthy:

  ubuntu@juju-df624b-4-lxd-14:~$ sudo ceph osd tree
  ID  CLASS WEIGHT  TYPE NAME        STATUS REWEIGHT PRI-AFF
   -1       7.64076 root default
  -13       0.90970     host happiny
    8   hdd 0.90970         osd.8        up      1.0     1.0
   -5       0.90970     host jynx
    9   hdd 0.90970         osd.9        up      1.0     1.0
   -3       1.63739     host piplup
    0   hdd 0.81870         osd.0        up      1.0     1.0
    3   hdd 0.81870         osd.3        up      1.0     1.0
   -9       1.63739     host raichu
    5   hdd 0.81870         osd.5        up      1.0     1.0
    6   hdd 0.81870         osd.6        up      1.0     1.0
  -11       0.90919     host shinx
    7   hdd 0.90919         osd.7        up      1.0     1.0
   -7       1.63739     host sliggoo
    1   hdd 0.81870         osd.1        up      1.0     1.0
    4   hdd 0.81870         osd.4        up      1.0     1.0

  ubuntu@shinx:~$ sudo ceph daemon mon.shinx ops
  {
      "ops": [
          {
              "description": "osd_failure(failed timeout osd.0 10.48.2.158:6804/211414 for 31sec e911 v911)",
              "initiated_at": "2019-03-07 00:40:43.282823",
              "age": 113953.696205,
              "duration": 113953.696225,
              "type_data": {
                  "events": [
                      { "time": "2019-03-07 00:40:43.282823", "event": "initiated" },
                      { "time": "2019-03-07 00:40:43.282823", "event": "header_read" },
                      { "time": "0.00", "event": "throttled" },
                      { "time": "0.00", "event": "all_read" },
                      { "time": "0.00", "event": "dispatched" },
                      { "time": "2019-03-07 00:40:43.283360", "event": "mon:_ms_dispatch" },
                      { "time": "2019-03-07 00:40:43.283360", "event": "mon:dispatch_op" },
                      { "time": "2019-03-07 00:40:43.283360", "event": "psvc:dispatch" },
                      { "time": "2019-03-07 00:40:43.283370", "event": "osdmap:preprocess_query" },
                      { "time": "2019-03-07 00:40:43.283371", "event": "osdmap:preprocess_failure" },
                      { "time": "2019-03-07 00:40:43.283386", "event": "osdmap:prepare_update" },
                      { "time": "2019-03-07 00:40:43.283386", "event": "osdmap:prepare_failure" }
                  ],
                  "info": {
                      "seq": 48576937,
                      "src_is_mon": false,
                      "source": "osd.8 10.48.2.206:6800/1226277",
                      "forwarded_to_leader": false
                  }
              }
          },
          {
              "description": "osd_failure(failed timeout osd.3
[Group.of.nepali.translators] [Bug 1697501] Re: ksh segfault on job_chksave () after it receive a SIGCHLD (Signal 17)
** Changed in: ksh (Debian)
       Status: New => Fix Released

https://bugs.launchpad.net/bugs/1697501

Title:
  ksh segfault on job_chksave () after it receive a SIGCHLD (Signal 17)

Status in ksh package in Ubuntu:
  Fix Released
Status in ksh source package in Trusty:
  Fix Released
Status in ksh source package in Xenial:
  Fix Released
Status in ksh source package in Yakkety:
  Fix Released
Status in ksh source package in Zesty:
  Fix Released
Status in ksh source package in Artful:
  Fix Released
Status in ksh package in Debian:
  Fix Released

Bug description:
  [Impact]

   * Compiler optimization dropped parts of the ksh job locking
  mechanism from the binary code. As a consequence, ksh could terminate
  unexpectedly with a segmentation fault after it received the SIGCHLD
  signal.

  [Test Case]

   * Unfortunately, there is no clear and easy way to reproduce the
  segfault.

   * The original reporter of this bug can, however, randomly reproduce
  the problem using an in-house ksh script that only works inside his
  infrastructure, run as follows:

  "ksh "

  Once in a while ksh will then segfault as follows:

  (gdb) bt
  #0 job_chksave (pid=pid@entry=19003) at /build/ksh-6IEHIC/ksh-93u+20120801/src/cmd/ksh93/sh/jobs.c:1948
  #1 0x004282ab in job_reap (sig=17) at /build/ksh-6IEHIC/ksh-93u+20120801/src/cmd/ksh93/sh/jobs.c:428
  #2 ...

  [Regression Potential]

   * Regression risk: low/none expected. The package has been
  intensively tested by a user who runs over 18M ksh scripts a day on
  each of their clusters.

   * Secondly, I doubt ksh has much traction nowadays, so if a
  regression occurs, it will most likely be limited to a small number of
  users IMHO. For instance, the bug was reported 3 years ago for Red
  Hat, and we, Ubuntu, only heard about this same situation for the
  first time a few weeks ago.
   * The fix was written by RH and has been proven to work for them for
  the last 3 years. Note that the RH fix has never been merged upstream
  (ksh is an unmaintained project) and has possibly never been proposed
  upstream (to be verified).

   * A test package including the RH fix has been intensively tested and
  verified (pre-SRU) by an affected user, with positive feedback, using
  a reproducer that segfaults without the RH patch.

   * Test package (pre-SRU) feedback:
  https://bugs.launchpad.net/ubuntu/xenial/+source/ksh/+bug/1697501/comments/7

  [Other Info]

   * The ksh project is unmaintained nowadays
  [https://github.com/att/ast], thus no new development is made upstream
  nor in Debian.

   * Details about the RH bug:
  --
  - https://bugzilla.redhat.com/show_bug.cgi?id=1123467
  - https://bugzilla.redhat.com/show_bug.cgi?id=1112306
  - https://access.redhat.com/solutions/1253243
  - http://rhn.redhat.com/errata/RHBA-2014-1015.html

  # ksh.spec
  Fri Jul 25 2014 Michal Hlavinka - 20120801-10.8
  - job locking mechanism did not survive compiler optimization (#1123467)

  # patch
  - ksh-20120801-locking.patch
  --

   * Debian bug:
  https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=867181

  [Original Description]

  # gdb
  [New LWP 3882]
  Core was generated by `/bin/ksh .ksh'.
  Program terminated with signal SIGSEGV, Segmentation fault.
  #0 job_chksave (pid=pid@entry=19385) at /build/ksh-6IEHIC/ksh-93u+20120801/src/cmd/ksh93/sh/jobs.c:1948
  1948 if(jp->pid==pid)

  (gdb) p *jp
  Cannot access memory at address 0xb

  (gdb) p *jp->pid
  Cannot access memory at address 0x13

  (gdb) p pid
  $2 = 19385

  (gdb) p *jpold
  $1 = {next = 0xb, pid = -604008960, exitval = 11124}

  The struct was corrupted at some point: the values of the next, pid
  and exitval members are not valid data.
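  The addresses in the gdb session above line up with a simple offset
  calculation, sketched below in Python. This is only one reading of the
  output, not a confirmed root cause: it assumes that jp had taken the
  value of jpold->next (0xb) and that the pid member sits 8 bytes into
  the struct, which matches the 0x8(%rdx) operand of the faulting
  instruction quoted in this report. The names JPOLD_NEXT, PID_OFFSET,
  FAULT_ADDRESS and member_address are illustrative only.

```python
# Values copied from the gdb session in this report.
JPOLD_NEXT = 0xB       # jpold->next; also the address gdb could not read for *jp
FAULT_ADDRESS = 0x13   # address gdb could not read when evaluating jp->pid

# Assumption: offset of the `pid` member inside the struct, taken from the
# 0x8(%rdx) operand of the faulting cmp instruction.
PID_OFFSET = 0x8


def member_address(base: int, offset: int) -> int:
    """Address of a struct member given the struct's base address."""
    return base + offset


if __name__ == "__main__":
    computed = member_address(JPOLD_NEXT, PID_OFFSET)
    # 0xb + 0x8 == 0x13, the address reported by gdb.
    print(hex(computed), computed == FAULT_ADDRESS)
```

  If the assumption holds, the crash is an ordinary dereference of a
  garbage list pointer rather than a fault inside a valid entry.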
  # assembly code
  => 0x00427159 <+41>: cmp %edi,0x8(%rdx)

  (gdb) p $edi ## pid variable
  $1 = 19385

  (gdb) p *($rdx + 8) ## jp->pid struct
  Cannot access memory at address 0x13

  --

  ksh is segfaulting because it cannot access the struct "jp" ($rdx) and
  therefore cannot dereference the struct member "jp->pid" ($rdx + 8) at
  src/cmd/ksh93/sh/jobs.c:1948, where it checks whether jp->pid is equal
  to the pid ($edi) variable.

  I have looked at the upstream GitHub repository "att/ast" and at some
  patches here and there, and nothing seems to apply. Note that the
  project seems unmaintained nowadays.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/ksh/+bug/1697501/+subscriptions

___
Mailing list: https://launchpad.net/~group.of.nepali.translators
Post to     : group.of.nepali.translators@lists.launchpad.net
Unsubscribe : https://launchpad.net/~group.of.nepali.translators
More help   : https://help.launchpad.net/ListHelp
[Group.of.nepali.translators] [Bug 1616123] Re: rpc-svcgssd.service uses incorrrect variable SVCGSSDARGS
** Changed in: nfs-utils (Debian)
       Status: Fix Committed => Fix Released

https://bugs.launchpad.net/bugs/1616123

Title:
  rpc-svcgssd.service uses incorrrect variable SVCGSSDARGS

Status in nfs-utils package in Ubuntu:
  Fix Released
Status in nfs-utils source package in Xenial:
  Fix Released
Status in nfs-utils source package in Bionic:
  Fix Released
Status in nfs-utils source package in Cosmic:
  Fix Released
Status in nfs-utils package in Debian:
  Fix Released

Bug description:
  [Impact]

  Command line options set for rpc.svcgssd in the
  /etc/default/nfs-kernel-server file are not passed on to the service
  and are silently ignored.

  [Test Case]

  * In a VM (LXD won't work), install nfs-server and a kerberos server. Use "EXAMPLE.LOCAL" for the realm, and "localhost" for the servers, when prompted:
  sudo apt install nfs-server krb5-kdc krb5-user krb5-admin-server

  * Create the EXAMPLE.LOCAL realm. Use any password you want for the database master key, it won't be requested again:
  sudo krb5_newrealm

  * Create a principal for the nfs service:
  sudo kadmin.local -q "addprinc -randkey nfs/$(hostname -f)"

  * Extract the key into the system-wide keytab:
  sudo kadmin.local -q "ktadd -k /etc/krb5.keytab nfs/$(hostname -f)"

  * Edit /etc/default/nfs-common and enable gssd:
  NEED_GSSD=y

  * Edit /etc/default/nfs-kernel-server and add an option to RPCSVCGSSDOPTS:
  RPCSVCGSSDOPTS="-v"

  * Restart nfs-server:
  sudo systemctl restart nfs-server

  * On xenial, you also have to restart nfs-config:
  sudo systemctl restart nfs-config

  * Verify that /run/sysconfig/nfs-utils has the option we added above:
  $ cat /run/sysconfig/nfs-utils
  PIPEFS_MOUNTPOINT=/run/rpc_pipefs
  RPCNFSDARGS=" 8"
  RPCMOUNTDARGS="--manage-gids"
  STATDARGS=""
  RPCSVCGSSDARGS="-v"

  * Verify the running rpc.svcgssd process.
  Without the fix, it won't have the "-v" option:
  ps axw|grep svcgssd|grep -v grep
   4285 ? Ss 0:00 /usr/sbin/rpc.svcgssd

  With the fix, right after installing the updated packages, the option we added to /etc/default/nfs-kernel-server will show up:
  ps axw|grep svcgssd|grep -v grep
   5656 ? Ss 0:00 /usr/sbin/rpc.svcgssd -v

  [Regression Potential]

  This is an old bug, and whoever was affected by it has probably worked
  around the problem by now. I tried to cope with one such scenario by
  not just renaming the variable we export, but exporting the correct
  one in addition to the old incorrect one, but that's it. I hope this,
  and the explanation added to the shell script wrapper nfs-utils.sh, is
  enough to help people with corner cases.

  [Other Info]

  This patch was accepted in Debian:
  https://salsa.debian.org/debian/nfs-utils/merge_requests/2

  [Original Description]

  In /etc/default/nfs-kernel-server you can specify parameters for rpc.svcgssd:

  # Options for rpc.svcgssd.
  RPCSVCGSSDOPTS="-n"

  But the variable is named incorrectly in /lib/systemd/system/rpc-svcgssd.service:

  ExecStart=/usr/sbin/rpc.svcgssd $SVCGSSDARGS

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/nfs-utils/+bug/1616123/+subscriptions
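The variable mismatch described in this report can be illustrated with a short Python sketch. It is a deliberately simplified model, not how systemd or nfs-utils.sh are implemented: it assumes that an unset variable in ExecStart expands to nothing, and the helper names parse_env_file and expand_exec_start are hypothetical. The environment file contents and the ExecStart line are copied from the report.

```python
import re
import shlex

# Contents of /run/sysconfig/nfs-utils as quoted in the test case.
ENV_FILE = '''
PIPEFS_MOUNTPOINT=/run/rpc_pipefs
RPCNFSDARGS=" 8"
RPCMOUNTDARGS="--manage-gids"
STATDARGS=""
RPCSVCGSSDARGS="-v"
'''

# ExecStart line quoted in the original description.
EXEC_START = "/usr/sbin/rpc.svcgssd $SVCGSSDARGS"


def parse_env_file(text):
    """Parse simple KEY=value lines, removing shell-style quoting."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        parts = shlex.split(value)
        env[key] = parts[0] if parts else ""
    return env


def expand_exec_start(command, env):
    """Simplified model: $NAME becomes its value, or nothing when unset."""
    expanded = re.sub(r"\$(\w+)", lambda m: env.get(m.group(1), ""), command)
    return expanded.split()


if __name__ == "__main__":
    env = parse_env_file(ENV_FILE)
    # Only RPCSVCGSSDARGS is defined, so $SVCGSSDARGS expands to nothing.
    print(expand_exec_start(EXEC_START, env))
    # Exporting the value under the name the unit reads makes "-v" appear.
    env["SVCGSSDARGS"] = env["RPCSVCGSSDARGS"]
    print(expand_exec_start(EXEC_START, env))
```

Under these assumptions the first expansion yields only the binary path and the second adds "-v", matching the two ps listings in the test case.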
[Group.of.nepali.translators] [Bug 1870660] Re: xenial/linux: 4.4.0-178.208 -proposed tracker
4.4.0-178.208 - lowlatency
Regression test CMPL, RTB.

58 / 59 tests were run, missing: ubuntu_xfstests_ext4
Issue to note in amd64:
  ubuntu_kvm_unit_tests - apic (bug 1748103) debug (bug 1821906) vmx (bug 1821394) vmx_host_state_area (bug 1866585) vmx_intr_window_test (bug 1866586) vmx_nm_test (bug 1866587) vmx_nmi_window_test (bug 1866588) vmx_pending_event_test (bug 1866591)
  ubuntu_ltp_syscalls - add_key05 (bug 1869644) btrfs fill_fs test in fallocate06 (bug 1866323) fanotify06 (bug 1833028) fanotify10 (bug 1802454) kill11 (bug 1865965)
  ubuntu_unionmount_ovlfs - failed with the latest code in upstream (bug 1854298)
  ubuntu_xfstests_btrfs - no scratch drive for the test
  ubuntu_xfstests_xfs - no scratch drive for the test

50 / 54 tests were run, missing: ubuntu_kvm_unit_tests, ubuntu_xfstests_btrfs, ubuntu_xfstests_ext4, ubuntu_xfstests_xfs
Issue to note in i386:
  ubuntu_ltp_syscalls - add_key05 (bug 1869644) btrfs fill_fs test in fallocate06 (bug 1866323) fanotify06 (bug 1833028) fanotify10 (bug 1802454) kill11 (bug 1865965)
  ubuntu_unionmount_ovlfs - failed with the latest code in upstream (bug 1854298)

4.4.0-178.208 - generic
Regression test CMPL, RTB.

58 / 59 tests were run, missing: ubuntu_xfstests_ext4
Issue to note in amd64:
  ubuntu_kvm_unit_tests - apic (bug 1748103) debug (bug 1821906) vmx (bug 1821394) vmx_host_state_area (bug 1866585) vmx_intr_window_test (bug 1866586) vmx_nm_test (bug 1866587) vmx_nmi_window_test (bug 1866588) vmx_pending_event_test (bug 1866591)
  ubuntu_ltp_syscalls - add_key05 (bug 1869644) btrfs fill_fs test in fallocate06 (bug 1866323) fanotify06 (bug 1833028) fanotify10 (bug 1802454) kill11 (bug 1865965)
  ubuntu_unionmount_ovlfs - failed with the latest code in upstream (bug 1854298)
  ubuntu_xfstests_btrfs - no scratch drive for the test
  ubuntu_xfstests_xfs - no scratch drive for the test

51 / 54 tests were run, missing: ubuntu_xfstests_btrfs, ubuntu_xfstests_ext4, ubuntu_xfstests_xfs
Issue to note in arm64:
  hwclock - issue for HP m400 (bug 1716603)
  ubuntu_kernel_selftests - cpu-hotplug failed on moonshot (bug 1809701)
  ubuntu_kvm_smoke_test - unable to create KVM with uvtool (bug 1749427)
  ubuntu_kvm_unit_tests - gicv2-mmio on X-ARM64 (bug 1828165) gicv2-mmio-3p (bug 1828027) gicv2-mmio-up (bug 1828026) pmu on ms10-34-mcdivittB0-kernel (bug 1751000)
  ubuntu_ltp_syscalls - add_key05 (bug 1869644) btrfs fill_fs test in fallocate06 (bug 1866323) fanotify06 (bug 1833028) fanotify10 (bug 1802454) kill11 (bug 1865965)
  ubuntu_unionmount_ovlfs - failed with commit dc24a45a upstream (bug 1854298)

50 / 54 tests were run, missing: ubuntu_kvm_unit_tests, ubuntu_xfstests_btrfs, ubuntu_xfstests_ext4, ubuntu_xfstests_xfs
Issue to note in i386:
  ubuntu_ltp_syscalls - add_key05 (bug 1869644) btrfs fill_fs test in fallocate06 (bug 1866323) fanotify06 (bug 1833028) fanotify10 (bug 1802454) kill11 (bug 1865965)
  ubuntu_unionmount_ovlfs - failed with the latest code in upstream (bug 1854298)

52 / 55 tests were run, missing: ubuntu_xfstests_btrfs, ubuntu_xfstests_ext4, ubuntu_xfstests_xfs
Issue to note in ppc64le (P8):
  ubuntu_btrfs_kernel_fixes - Unable to mount a btrfs filesystem smaller than 320M on Xenial P8 (bug 1813863)
  ubuntu_fan_smoke_test - Failed to fetch file from http://ports.ubuntu.com (bug 1864140)
  ubuntu_ltp_syscalls - add_key05 (bug 1869644) copy_file_range01, fallocate04, fanotify13, fanotify14, fanotify15, fdatasync03, fgetxattr01, fremovexattr01, fremovexattr02, fsetxattr01, fsync01, fsync04, lremovexattr01, msync04, preadv03, preadv03_64, preadv203, preadv203_64, pwritev03, pwritev03_64, pwritev03, pwritev03_64, setxattr01, sync03, syncfs01 (bug 1842270) btrfs fill_fs test in fallocate06 (bug 1866323) fanotify06 (bug 1833028) fanotify10 (bug 1802454) move_pages12 (bug 1831043) kill11 (bug 1865965)
  ubuntu_seccomp - 36-sim-ipc_syscalls, 37-sim-ipc_syscalls_be failed on s390x / PowerPC (bug 1850904)
  ubuntu_unionmount_ovlfs - failed with the latest code in upstream (bug 1854298)

Issue to note in s390x (KVM):
  libhugetlbfs - failed 5 (Address is not hugepage, Heap not on hugepages) killed by signal 1 bad config 1
  ubuntu_bpf_jit - 4 failures reported for X s390x (bug 1768452)
  ubuntu_kernel_selftests - test_bpf in net (bug 1768452)
  ubuntu_kvm_smoke_test - uvtool issue (bug 1729854)
  ubuntu_kvm_unit_tests - test should be skipped for X s390x KVM
  ubuntu_ltp_syscalls - add_key05 (bug 1869644) btrfs fill_fs test in fallocate06 (bug 1866323) fanotify06 (bug 1833028) fanotify10 (bug 1802454) kill11 (bug 1865965)
  ubuntu_unionmount_ovlfs - failed with the latest code in upstream (bug 1854298)

Issue to note in s390x (Ubuntu on LPAR):
  libhugetlbfs - failed 5 (Address is not hugepage, Heap not on hugepages) killed by signal 1 bad config 1
  ubuntu_bpf_jit - 4 failures reported for X s390x (bug 1768452)
  ubuntu_kernel_selftests - test_bpf in net (bug 1768452)
  ubuntu_kvm_smoke_test - uvtool