Bug#898165: Regression in [v2] nfs: Fix ugly referral attributes ?
> On May 17, 2018, at 5:48 PM, Moritz Schlarb wrote:
>
> Hi Chuck,
>
> On 17.05.2018 16:15, Chuck Lever wrote:
>
>> Just a shot in the dark: Wondering if v3.16 needs
>>
>> commit ea96d1ecbe4fcb1df487d99309d3157b4ff5fc02
>> Author: Anna Schumaker
>> AuthorDate: Fri Apr 3 14:35:59 2015 -0400
>> Commit: Trond Myklebust
>> CommitDate: Thu Apr 23 14:43:54 2015 -0400
>>
>>     nfs: Fetch MOUNTED_ON_FILEID when updating an inode
>
> Gosh, it seems you're right!
> When I take that patch and apply it, the referrals are being followed again!
>
> Thanks for your idea!
> Now how do we make sure it gets applied soonish?

Hi Moritz-

Anyone (including you or your distributor) can request that this patch be applied to the upstream v3.16 LTS branch. Since you have confirmed that it applies and fixes your problem, it would be appropriate to report that along with your backport request.

--
Chuck Lever
Bug#899044: Oops: 0000 [#1] SMP in skb_release_data, openvswitch related
Package: src:linux
Version: 4.9.88-1

Hi,

I'm observing the attached errors on machines that are Xen dom0 and run the latest Debian Stretch 4.9 kernel as the dom0 kernel. The errors have happened a few times in the last few weeks. They started after upgrading the machines from Jessie with the 3.16 kernel to Stretch with the 4.9 kernel.

For networking between domUs and the outside world, we use openvswitch. After such an error happens:

* The number of "flows" in the kernel quickly rises to the limit, 1, as seen in the output of ovs-dpctl show.
* Network traffic that should flow through the openvswitch bridge starts disappearing in a seemingly random way.
* The memory usage of the userspace ovs-vswitchd starts growing quickly.
* Many of the ovs commands, such as those to add or remove an interface or bridge, hang.

After a restart of the openvswitch-switch service, and fixing up a bunch of configuration of connected interfaces, functionality is restored.

While most of the symptoms seem related to the userspace openvswitch processes, the cause of it all seems to be in the kernel, possibly while the userspace ovs-vswitchd process is receiving a network packet?

Sadly I do not know how to reproduce this, except for just waiting until it happens again. Please advise what else I could provide to help resolve this issue.
Thanks,
Regards,

--
Hans van Kranenburg

May 4 08:23:03 altair kernel: [83978.662075] BUG: unable to handle kernel paging request at 0003001f
May 4 08:23:03 altair kernel: [83978.665887] IP: [] skb_release_data+0x8d/0x110
May 4 08:23:03 altair kernel: [83978.669837] PGD 0
May 4 08:23:03 altair kernel: [83978.669882]
May 4 08:23:03 altair kernel: [83978.673589] Oops: [#1] SMP
May 4 08:23:03 altair kernel: [83978.677281] Modules linked in: cls_u32 sch_ingress act_mirred sch_fq_codel ifb xt_mark sch_htb xt_physdev br_netfilter bridge stp llc xen_netback xen_blkback algif_skcipher af_alg dm_service_time binfmt_misc xen_gntdev xen_evtchn openvswitch nf_nat_ipv6 libcrc32c xenfs xen_privcmd ip6t_REJECT nf_reject_ipv6 nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_filter ip6table_mangle ip6table_raw ip6_tables ipt_REJECT nf_reject_ipv4 xt_tcpudp xt_owner xt_multiport xt_conntrack iptable_filter iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_raw dm_crypt intel_powerclamp crct10dif_pclmul crc32_pclmul iTCO_wdt iTCO_vendor_support ghash_clmulni_intel pcspkr serio_raw joydev evdev amdkfd radeon ttm drm_kms_helper drm i2c_algo_bit lpc_ich mfd_core i7core_edac hpilo
May 4 08:23:03 altair kernel: [83978.701936] sg ipmi_si hpwdt edac_core ipmi_msghandler acpi_power_meter button shpchp dm_multipath dm_mod scsi_dh_rdac scsi_dh_emc scsi_dh_alua ib_iser rdma_cm iw_cm ib_cm ib_core configfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 ext4 crc16 jbd2 fscrypto ecb mbcache btrfs crc32c_generic xor raid6_pq mlx4_en ptp pps_core hid_generic usbhid hid sd_mod crc32c_intel aesni_intel aes_x86_64 glue_helper lrw gf128mul ablk_helper cryptd psmouse ehci_pci uhci_hcd ehci_hcd usbcore usb_common hpsa scsi_transport_sas bnx2 mlx4_core devlink scsi_mod thermal
May 4 08:23:03 altair kernel: [83978.724406] CPU: 1 PID: 1486 Comm: revalidator7 Not tainted 4.9.0-6-amd64 #1 Debian 4.9.88-1
May 4 08:23:03 altair kernel: [83978.729139] Hardware name: HP ProLiant DL360 G7, BIOS P68 08/16/2015
May 4 08:23:03 altair kernel: [83978.733958] task: 880119e1ee80 task.stack: c90042764000
May 4 08:23:03 altair kernel: [83978.738724] RIP: e030:[] [] skb_release_data+0x8d/0x110
May 4 08:23:03 altair kernel: [83978.743560] RSP: e02b:c90042767c78 EFLAGS: 00010206
May 4 08:23:03 altair kernel: [83978.748352] RAX: 0050 RBX: 0002 RCX: 81ce0f40
May 4 08:23:03 altair kernel: [83978.753116] RDX: RSI: 8800cc998900 RDI: 8800cc998900
May 4 08:23:03 altair kernel: [83978.757867] RBP: 8800cc998900 R08: 880123c0 R09: 88011f22
May 4 08:23:03 altair kernel: [83978.762598] R10: 8800cc998900 R11: 880119e10280 R12: 0002
May 4 08:23:03 altair kernel: [83978.767321] R13: 88011f227ec0 R14: 88011dea2800 R15:
May 4 08:23:03 altair kernel: [83978.772000] FS: 7fc1656cc700() GS:88012824() knlGS:
May 4 08:23:03 altair kernel: [83978.776671] CS: e033 DS: ES: CR0: 80050033
May 4 08:23:03 altair kernel: [83978.781355] CR2: 0003001f CR3: 0001212b1000 CR4: 2660
May 4 08:23:03 altair kernel: [83978.786135] Stack:
May 4 08:23:03 altair kernel: [83978.790841] 880120a28000 8800cc998900 c90042767ec0 7ea4
May 4 08:23:03 altair kernel: [83978.795898] 814f6267 880120a28000 8800cc998900 814fcc91
May 4 08:23:03 altair kernel: [83978.800806] 880120a28000 8153f2df c900
Bug#899027: Calling sendfile(2) on sparse files on tmpfs allocates space for them
Package: src:linux
Version: 4.16.5-1
Severity: normal

Hello,

the code below, if run on a normal file system, prints:

After creation: 0/1048576
After read: 0/1048576

but if run on a tmpfs it prints:

After creation: 0/1048576
After read: 1048576/1048576

This unexpected allocation happens only when using sendfile(2), not when reading the file with read(2).

I observed this behaviour on a range of kernels, from CentOS 7's 3.10.0 to the 4.16 currently running on my machine.

This is the code to reproduce it:

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/sendfile.h>

int main()
{
    /* Create a 1 MiB sparse file: ftruncate extends it without writing data. */
    int fd = open("test", O_WRONLY | O_CREAT, 0644);
    if (fd == -1)
    {
        perror("cannot open for writing");
        return 1;
    }
    if (ftruncate(fd, 1024*1024) == -1)
    {
        perror("cannot ftruncate");
        return 1;
    }
    close(fd);

    /* Report allocated bytes (st_blocks is in 512-byte units) vs. file size. */
    struct stat st;
    if (stat("test", &st) == -1)
    {
        perror("cannot stat after creation");
        return 1;
    }
    fprintf(stdout, "After creation: %lld/%lld\n",
            (long long)st.st_blocks * 512, (long long)st.st_size);

    fd = open("test", O_RDONLY);
    if (fd == -1)
    {
        perror("cannot open for reading");
        return 1;
    }
    int out = open("/dev/null", O_WRONLY);
    if (out == -1)
    {
        perror("cannot open /dev/null");
        return 1;
    }

    /* Send the whole file to /dev/null with sendfile(2). */
    off_t offset = 0;
    ssize_t res = sendfile(out, fd, &offset, st.st_size);
    if (res == -1)
    {
        perror("cannot sendfile");
        return 1;
    }
    if (res != st.st_size)
        fprintf(stderr, "warning: partial sendfile\n");

    /* Report allocation again: on tmpfs the file is now fully allocated. */
    if (stat("test", &st) == -1)
    {
        perror("cannot stat after read");
        return 1;
    }
    fprintf(stdout, "After read: %lld/%lld\n",
            (long long)st.st_blocks * 512, (long long)st.st_size);

    close(out);
    close(fd);
    unlink("test");
    return 0;
}

Thanks,

Enrico

-- Package-specific info:
** Version:
Linux version 4.16.0-1-amd64 (debian-kernel@lists.debian.org) (gcc version 7.3.0 (Debian 7.3.0-17)) #1 SMP Debian 4.16.5-1 (2018-04-29)

** Command line:
BOOT_IMAGE=/vmlinuz-4.16.0-1-amd64 root=/dev/mapper/ploma--vg-root ro quiet

** Tainted: O (4096)
 * Out-of-tree module has been loaded.

** Kernel log: Unable to read kernel log; any relevant messages should be attached ** Model information sys_vendor: HP product_name: HP EliteBook x360 1020 G2 product_version: chassis_vendor: HP chassis_version: bios_vendor: HP bios_version: P90 Ver. 01.01 board_vendor: HP board_name: 8300 board_version: KBC Version 56.3A ** Loaded modules: fuse ctr ccm sd_mod sg cdc_ether usbnet r8152 mii uas usb_storage scsi_mod pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) rfcomm bnep wacom usbhid hid_multitouch arc4 hid_generic binfmt_misc nls_ascii nls_cp437 vfat fat iwlmvm mac80211 snd_hda_codec_hdmi snd_hda_codec_generic snd_soc_skl wmi_bmof hp_wmi snd_soc_skl_ipc snd_hda_ext_core iwlwifi snd_soc_sst_dsp snd_soc_sst_ipc snd_soc_acpi snd_soc_core snd_compress cfg80211 intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp btusb kvm_intel btrtl btbcm btintel kvm bluetooth irqbypass intel_cstate intel_uncore intel_rapl_perf drbg ansi_cprng uvcvideo ecdh_generic efi_pstore videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 rfkill videobuf2_common joydev evdev snd_hda_intel videodev media serio_raw snd_hda_codec efivars pcspkr snd_hda_core snd_hwdep snd_pcm i915 snd_timer snd iTCO_wdt tpm_crb iTCO_vendor_support soundcore drm_kms_helper shpchp idma64 drm intel_lpss_pci processor_thermal_device intel_lpss i2c_algo_bit intel_pch_thermal intel_soc_dts_iosf tpm_tis battery wmi soc_button_array int3403_thermal intel_vbtn int340x_thermal_zone tpm_tis_core intel_hid tpm hp_wireless rng_core int3400_thermal sparse_keymap acpi_thermal_rel video acpi_pad ac button parport_pc ppdev lp parport efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 fscrypto ecb btrfs zstd_decompress zstd_compress xxhash algif_skcipher af_alg dm_crypt dm_mod raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear md_mod crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel pcbc aesni_intel 
aes_x86_64 crypto_simd glue_helper cryptd nvme xhci_pci psmouse xhci_hcd nvme_core i2c_i801 intel_ish_ipc usbcore usb_common intel_ishtp i2c_hid thermal hid ** PCI devices: 00:00.0 Host bridge [0600]: Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers [8086:5904] (rev 02) Subsystem: Hewlett-Packard Company Xeon E3-1200 v6/7th Gen Core Processor Host Bridge/DRAM Registers [103c:8300] Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx- Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- SERR- 00:02.0 VGA compatible controller [0300]: Intel Corporation HD Graphics 620 [8086:5916] (rev 02) (prog-if 00 [VGA controller]) Subsystem: Hewlett-Packard Company HD Graphics 620 [103c:8300] Control: I/O+ Mem+
Processed: severity of 898165 is serious
Processing commands for cont...@bugs.debian.org:

> severity 898165 serious
Bug #898165 [src:linux] linux-image-3.16.0-6-amd64: can't mount NFS shares via nfs referrals
Severity set to 'serious' from 'grave'
> thanks
Stopping processing here.

Please contact me if you need assistance.
--
898165: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=898165
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems
Processed: severity of 898165 is serious
Processing commands for cont...@bugs.debian.org:

> # regression in stable
> severity 898165 serious
Bug #898165 [src:linux] linux-image-3.16.0-6-amd64: can't mount NFS shares via nfs referrals
Ignoring request to change severity of Bug 898165 to the same value.
> thanks
Stopping processing here.

Please contact me if you need assistance.
--
898165: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=898165
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems
Bug#898165: Regression in [v2] nfs: Fix ugly referral attributes ?
Control: tags -1 + upstream patch
Control: severity -1 grave
Control: summary -1 0
Control: outlook -1 0

3.16.54 introduced a regression by including "nfs: Fix ugly referral attributes" but not "nfs: Fetch MOUNTED_ON_FILEID when updating an inode".

Please include that other patch too, so that NFS referrals work again.

The required patch is
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ea96d1ecbe4fcb1df487d99309d3157b4ff5fc02

Best regards,
Moritz

On 17.05.2018 16:15, Chuck Lever wrote:
> Just a shot in the dark: Wondering if v3.16 needs
>
> commit ea96d1ecbe4fcb1df487d99309d3157b4ff5fc02
> Author: Anna Schumaker
> AuthorDate: Fri Apr 3 14:35:59 2015 -0400
> Commit: Trond Myklebust
> CommitDate: Thu Apr 23 14:43:54 2015 -0400
>
>     nfs: Fetch MOUNTED_ON_FILEID when updating an inode
Processed: Re: Regression in [v2] nfs: Fix ugly referral attributes ?
Processing control commands:

> tags -1 + upstream patch
Bug #898165 [src:linux] linux-image-3.16.0-6-amd64: can't mount NFS shares via nfs referrals
Ignoring request to alter tags of bug #898165 to the same tags previously set
> severity -1 grave
Bug #898165 [src:linux] linux-image-3.16.0-6-amd64: can't mount NFS shares via nfs referrals
Severity set to 'grave' from 'important'
> summary -1 0
Summary recorded from message bug 898165 message 80
> outlook -1 0
Outlook recorded from message bug 898165 message 82

--
898165: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=898165
Debian Bug Tracking System
Contact ow...@bugs.debian.org with problems