Bug#900821: linux-image-4.9.0-6-amd64: apache reads wrong data over cifs filesystems served by samba
Hi,

I can confirm Lindfors' findings. Testing different LTS kernels without `EnableMMAP` and `EnableSendfile`:

5.4 - broken
5.10 - works
5.15 - works
Bug#900821: linux-image-4.9.0-6-amd64: apache reads wrong data over cifs filesystems served by samba
Hi,

just as a random exercise to learn debbisect I tried it against this bug. I hope this might be useful, or not :)

Early on I noticed that the issue can demonstrate itself in two different ways:

1) wget fails with "200 No headers, assuming HTTP/0.9". The original steps to reproduce included "2>/dev/null" which was hiding this. This is apparently bug https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=930574

2) kernel reports a bug in dmesg and sets /proc/sys/kernel/tainted to a non-zero value. This is the case I started to investigate.

Using

$ debbisect --qemu=defaults --depends=linux-image-amd64 --verbose --cache=./cache 2019-06-01 2022-06-05 ./script2.sh

I see that the issue occurs with 4.19.0-5-amd64 but does not occur anymore with 5.2.0-2-amd64.

The helper scripts I used are below. Note that I had to negate the "good" status to find the version that fixes the bug instead of finding the version that introduces it.

script2.sh

#!/bin/sh
echo "script2.sh starting as $(whoami)"
ssh_config="$1"
scp -F "$ssh_config" script3.sh qemu:
ssh -F "$ssh_config" qemu ./script3.sh

script3.sh

#!/bin/sh
set -exu
echo "script3.sh starting as $(whoami)"
env APT_LISTCHANGES_FRONTEND=none DEBIAN_FRONTEND=noninteractive apt-get -o Dpkg::Options::="--force-confold" -o Dpkg::Options::="--force-confdef" install -y -q samba apache2 cifs-utils curl
tee -a /etc/samba/smb.conf
Bug#900821: linux-image-4.9.0-6-amd64: apache reads wrong data over cifs filesystems served by samba
Hi, by default, apache uses mmap, so probably mmap is broken on cifs. An alternate workaround should be to set EnableMMAP off in the apache config. Cheers, Stefan
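The workaround suggested above could be applied through a small drop-in configuration file. What follows is only a minimal sketch, not a tested fix: the function name `write_mmap_workaround` and the file name `cifs-mmap-workaround.conf` are made up for illustration, and nothing in this thread confirms that disabling mmap on its own avoids the corruption.

```shell
#!/bin/sh
# Hypothetical helper (name and file name are assumptions, not from the thread):
# write a drop-in Apache config that turns off memory-mapped file delivery.
# $1 is the directory to write into; it is assumed to exist and be writable.
write_mmap_workaround() {
    conf_dir="$1"
    cat > "$conf_dir/cifs-mmap-workaround.conf" <<'EOF'
# Workaround sketch for Debian bug #900821: do not mmap served files.
EnableMMAP Off
EOF
}
```

On a Debian system one would presumably call it as `write_mmap_workaround /etc/apache2/conf-available`, then run `a2enconf cifs-mmap-workaround` and reload apache2. Taking the target directory as a parameter keeps the sketch easy to try out in a scratch directory first.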
Bug#900821: linux-image-4.9.0-6-amd64: apache reads wrong data over cifs filesystems served by samba
Hi,

I am facing this problem with Debian stable, but with a kernel from backports:

ii linux-image-4.19.0-0.bpo.1-amd64-unsigned 4.19.12-1~bpo9+1 amd64 Linux 4.19 for 64-bit PCs

Linux version 4.19.0-0.bpo.1-amd64 (debian-ker...@lists.debian.org) (gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1)) #1 SMP Debian 4.19.12-1~bpo9+1 (2018-12-30)

The problem appears about twice a day while copying data via rsync to a software RAID disk array.

Jan 15 22:31:48 kk-router kernel: [ 6406.089208] general protection fault: [#1] SMP PTI
Jan 15 22:31:48 kk-router kernel: [ 6406.089330] CPU: 2 PID: 1183 Comm: sshd Tainted: GE 4.19.0-0.bpo.1-amd64 #1 Debian 4.19.12-1~bpo9+1
Jan 15 22:31:48 kk-router kernel: [ 6406.089598] Hardware name: Gigabyte Technology Co., Ltd. Default string/J3455N-D3H, BIOS F2 03/07/2017
Jan 15 22:31:48 kk-router kernel: [ 6406.089751] RIP: 0010:__check_object_size+0x7b/0x1a0
Jan 15 22:31:48 kk-router kernel: [ 6406.089834] Code: 00 00 80 48 2b 15 5d 0e c7 00 48 01 c2 48 c1 ea 0c 48 c1 e2 06 48 03 15 3b 0e c7 00 48 8b 42 08 48 8d 48 ff a8 01 48 0f 45 d1 <48> 8b 4a 08 48 8d 41 ff 83 e1 01 48 0f 44 c2 48 8b 00 f6 c4 01 75
Jan 15 22:31:48 kk-router kernel: [ 6406.090115] RSP: 0018:96f9415dfc58 EFLAGS: 00010202
Jan 15 22:31:48 kk-router kernel: [ 6406.090202] RAX: efffd06949355a01 RBX: 896a0d56cc02 RCX: efffd06949355a00
Jan 15 22:31:48 kk-router kernel: [ 6406.090314] RDX: efffd06949355a00 RSI: 896a3fff RDI: 896a8d56cc02
Jan 15 22:31:48 kk-router kernel: [ 6406.090426] RBP: 05a8 R08: 05a8 R09: 05a8
Jan 15 22:31:48 kk-router kernel: [ 6406.090538] R10: R11: R12: 0001
Jan 15 22:31:48 kk-router kernel: [ 6406.090650] R13: 896a0d56d1aa R14: 05a8 R15: 896a0d56cc02
Jan 15 22:31:48 kk-router kernel: [ 6406.090764] FS: 7ff8efe39d40() GS:896a37b0() knlGS:
Jan 15 22:31:48 kk-router kernel: [ 6406.090890] CS: 0010 DS: ES: CR0: 80050033
Jan 15 22:31:48 kk-router kernel: [ 6406.090982] CR2: 7fadf5efbfb0 CR3: 000265ebc000 CR4: 003406e0
Jan 15 22:31:48 kk-router kernel: [ 6406.091095] Call Trace:
Jan 15 22:31:48 kk-router kernel: [ 6406.091152] skb_copy_datagram_iter+0x75/0x260
Jan 15 22:31:48 kk-router kernel: [ 6406.091232] tcp_recvmsg+0x72b/0xca0
Jan 15 22:31:48 kk-router kernel: [ 6406.091300] ? aa_sk_perm+0x44/0x130
Jan 15 22:31:48 kk-router kernel: [ 6406.091366] inet_recvmsg+0x5b/0xd0
Jan 15 22:31:48 kk-router kernel: [ 6406.091430] sock_read_iter+0x94/0xf0
Jan 15 22:31:48 kk-router kernel: [ 6406.091498] new_sync_read+0xfa/0x160
Jan 15 22:31:48 kk-router kernel: [ 6406.091565] vfs_read+0x91/0x130
Jan 15 22:31:48 kk-router kernel: [ 6406.091624] ksys_read+0x52/0xc0
Jan 15 22:31:48 kk-router kernel: [ 6406.091685] do_syscall_64+0x55/0x110
Jan 15 22:31:48 kk-router kernel: [ 6406.091752] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Jan 15 22:31:48 kk-router kernel: [ 6406.091837] RIP: 0033:0x7ff8edfad6d0
Jan 15 22:31:48 kk-router kernel: [ 6406.091900] Code: b6 fe ff ff 48 8d 3d 17 be 08 00 48 83 ec 08 e8 06 db 01 00 66 0f 1f 44 00 00 83 3d 39 30 2c 00 00 75 10 b8 00 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 de 9b 01 00 48 89 04 24
Jan 15 22:31:48 kk-router kernel: [ 6406.092180] RSP: 002b:7ffe060775a8 EFLAGS: 0246 ORIG_RAX:
Jan 15 22:31:48 kk-router kernel: [ 6406.092301] RAX: ffda RBX: 0003 RCX: 7ff8edfad6d0
Jan 15 22:31:48 kk-router kernel: [ 6406.092413] RDX: 4000 RSI: 7ffe060775b0 RDI: 0003
Jan 15 22:31:48 kk-router kernel: [ 6406.092525] RBP: 5572caec33e0 R08: R09: 4500
Jan 15 22:31:48 kk-router kernel: [ 6406.092637] R10: 7ffe0607b530 R11: 0246 R12:
Jan 15 22:31:48 kk-router kernel: [ 6406.092748] R13: 7ffe0607b63f R14: 5572c925cb67 R15: 0003
Jan 15 22:31:48 kk-router kernel: [ 6406.092862] Modules linked in: snd_hda_codec_hdmi(E) snd_hda_codec_realtek(E) snd_hda_codec_generic(E) nls_ascii(E) intel_rapl(E) nls_cp437(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) vfat(E) coretemp(E) fat(E) efi_pstore(E) kvm(E) irqbypass(E) snd_soc_skl(E) snd_soc_skl_ipc(E) snd_soc_sst_ipc(E) crct10dif_pclmul(E) snd_soc_sst_dsp(E) snd_hda_ext_core(E) ppdev(E) snd_soc_acpi_intel_match(E) snd_soc_acpi(E) crc32_pclmul(E) i915(E) snd_soc_core(E) snd_compress(E) snd_hda_intel(E) ghash_clmulni_intel(E) intel_cstate(E) drm_kms_helper(E) intel_rapl_perf(E) snd_hda_codec(E) drm(E) evdev(E) snd_hda_core(E) i2c_algo_bit(E) snd_hwdep(E) efivars(E) pcspkr(E) snd_pcm(E) lpc_ich(E) snd_timer(E) snd(E) mei_me(E) soundcore(E) mei(E) sg(E) button(E) parport_pc(E) parport(E) video(E) pcc_cpufreq(E) nfsd(E) auth_rpcgss(E)
Jan 15 22:31:48 kk-router kernel: [ 6406.094025] nfs_acl(E) lockd(E) grace(E) sunrpc(E) efivarfs(E) ip_tables(E)
Bug#900821: linux-image-4.9.0-6-amd64: apache reads wrong data over cifs filesystems served by samba
Hi Santiago,

On Wed, Aug 29, 2018 at 12:03:31PM +0200, Santiago Garcia Mantinan wrote:
> Hi!
>
> I have rechecked everything again.
>
> Salvatore, I'm testing on an up to date buster running kernel 4.17.17-1 and
> I still see the kernel warning messages and the downloads are breaking and
> wget still shows this kind of messages:
> 2018-08-29 13:45:31 (122 MB/s) - Read error at byte 1056768/6538880 (Bad
> address). Retrying.

Please disregard my comment about not being able to reproduce it with newer kernels. I may have made an error in my test environment; I would probably need to retest.

Salvatore
Bug#900821: linux-image-4.9.0-6-amd64: apache reads wrong data over cifs filesystems served by samba
Hi!

I have rechecked everything again.

Salvatore, I'm testing on an up to date buster running kernel 4.17.17-1 and I still see the kernel warning messages and the downloads are breaking and wget still shows this kind of messages:

2018-08-29 13:45:31 (122 MB/s) - Read error at byte 1056768/6538880 (Bad address). Retrying.

So I see no progress with newer versions or anything like that. I don't know what the differences between your setup and mine are; maybe it is the file length?

What seems to work OK is the workaround of setting EnableSendfile to on. This avoids the original problem I had found on Stretch and also the problems I later found on buster with the kernel warnings and broken downloads.

Hope this helps.

Regards.
--
Manty/BestiaTester -> http://manty.net
Bug#900821: linux-image-4.9.0-6-amd64: apache reads wrong data over cifs filesystems served by samba
Control: found -1 4.9.110-1
Control: tags -1 + confirmed

Hi

The issue seems to be still present in 4.9.110-1, but I have trouble reproducing it on a sid system running 4.17.8-1. So this might give us some indication of a possible fix.

Regards,
Salvatore
Bug#900821: linux-image-4.9.0-6-amd64: apache reads wrong data over cifs filesystems served by samba
Package: src:linux
Version: 4.9.82-1+deb9u3

Dear Maintainer,

while we were investigating a problem, we found that we are affected by this bug. We found some information you might find useful:

Besides using an older kernel, another workaround is to set the EnableSendFile directive to on.
http://httpd.apache.org/docs/2.4/mod/core.html#EnableSendfile

Without this setting, about 1/3 of the results are correct. Among the wrong results, a few are served more than once.

$ for i in {1..100} ; do wget -O- http://localhost/file | md5sum ; done | sort | uniq -c | sort -r
27 ca1014c32a2074b10b94461c1aa121d0 -
4 01e4b0e9bc097f6494e93ea03c67 -
2 77a39ffc2021e92e43d7b0320aa24d31 -
2 3ea2b0369a4357b7343304760c9a604b -
1 fe2918ea421b449f66c70472b4becfb5 -
1 fc5a29070cb70371ecb235e56d372d9d -
...

Regards.
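The reproduction loop in this message can be wrapped in a small reusable helper. This is only a sketch under assumptions: the function name `tally_runs` is made up here, and it sorts the counts numerically (`sort -rn`) rather than with the plain `sort -r` used above. It deliberately leaves stderr alone, so download errors stay visible.

```shell
#!/bin/sh
# Hypothetical helper: tally_runs N CMD [ARGS...]
# Runs CMD N times, hashes its stdout each time, and prints how often
# each distinct MD5 digest occurred, most frequent first.
tally_runs() {
    runs="$1"
    shift
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Hash whatever the command wrote to stdout in this run.
        "$@" | md5sum
        i=$((i + 1))
    done | sort | uniq -c | sort -rn
}
```

For the case discussed here it could be called as `tally_runs 100 wget -q -O- http://localhost/file`. A healthy setup would be expected to print a single line, since every download should hash to the same digest.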
Bug#900821: linux-image-4.9.0-6-amd64: apache reads wrong data over cifs filesystems served by samba
Package: src:linux
Version: 4.9.88-1+deb9u1
Severity: important

Dear Maintainer,

I've found that when you mount a filesystem being served by samba on a host running apache and serve the files on this filesystem over apache, you'll get garbage mixed with the file content. This means that you get the right length but the file's content gets corrupted.

This only happens when serving the files from samba; if you serve them from Windows the problem doesn't appear.

I have found this problem in a pure Debian stable installation (Stretch), but I have tested this on a pure testing (Buster) installation with even worse results: the download breaks and the kernel shows this:

[ 649.547840] WARNING: CPU: 6 PID: 1573 at /build/linux-43CEzF/linux-4.16.12/lib/iov_iter.c:695 copy_page_to_iter+0x1dd/0x2f0
[ 649.547844] Modules linked in: cmac arc4 md4 nls_utf8 cifs ccm dns_resolver fscache amd64_edac_mod edac_mce_amd radeon ccp rng_core joydev kvm sg evdev ttm k10temp drm_kms_helper serio_raw pcspkr shpchp drm irqbypass i2c_algo_bit hpilo hpwdt ipmi_si ipmi_devintf button ipmi_msghandler ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic fscrypto ecb crypto_simd cryptd glue_helper aes_x86_64 hid_generic usbhid hid sd_mod ohci_pci qla2xxx hpsa nvme_fc scsi_transport_fc scsi_transport_sas psmouse uhci_hcd ohci_hcd ehci_pci nvme_fabrics ehci_hcd scsi_mod nvme_core usbcore bnx2 i2c_piix4 usb_common
[ 649.547943] CPU: 6 PID: 1573 Comm: wget Tainted: GW 4.16.0-2-amd64 #1 Debian 4.16.12-1
[ 649.547945] Hardware name: HP ProLiant BL465c G6 , BIOS A13 12/08/2009
[ 649.547953] RIP: 0010:copy_page_to_iter+0x1dd/0x2f0
[ 649.547956] RSP: 0018:ad6602defc58 EFLAGS: 00010297
[ 649.547960] RAX: 8000 RBX: d65a085b1000 RCX: 0003
[ 649.547963] RDX: 8075 RSI: 017fffc08000 RDI: 085b1000
[ 649.547965] RBP: 148b R08: 2000 R09: 9ca6e457cd24
[ 649.547968] R10: 9ca6e20df8e8 R11: 548b R12: ad6602defdf0
[ 649.547970] R13: 6bea R14: 0040 R15: 0001
[ 649.547974] FS: 7f0978403780() GS:9ca6e7cc() knlGS:
[ 649.547977] CS: 0010 DS: ES: CR0: 80050033
[ 649.547980] CR2: 5592e97b3078 CR3: 000223f6c000 CR4: 06e0
[ 649.547983] Call Trace:
[ 649.548001] skb_copy_datagram_iter+0x175/0x280
[ 649.548010] tcp_recvmsg+0x279/0xb90
[ 649.548019] ? set_fd_set+0x38/0x50
[ 649.548024] ? core_sys_select+0x2a4/0x2d0
[ 649.548032] inet_recvmsg+0x58/0xd0
[ 649.548038] sock_read_iter+0x94/0xf0
[ 649.548047] new_sync_read+0xe9/0x140
[ 649.548060] vfs_read+0x89/0x130
[ 649.548066] SyS_read+0x52/0xc0
[ 649.548075] do_syscall_64+0x6c/0x130
[ 649.548082] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[ 649.548089] RIP: 0033:0x7f0976eb7061
[ 649.548091] RSP: 002b:7ffec8800db8 EFLAGS: 0246 ORIG_RAX:
[ 649.548095] RAX: ffda RBX: 0003 RCX: 7f0976eb7061
[ 649.548097] RDX: 2000 RSI: 5592e97afc70 RDI: 0003
[ 649.548100] RBP: 0010113b R08: 7ffec8800cd0 R09: 7f0978403780
[ 649.548102] R10: R11: 0246 R12: 2000
[ 649.548105] R13: 5592e97afc70 R14: R15: 5592e97b1c80
[ 649.548108] Code: ff ff 48 89 c5 41 83 ae 28 0a 00 00 01 48 83 c4 10 48 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 0f b6 49 69 48 d3 e0 e9 a6 fe ff ff <0f> 0b 31 ed eb dc 85 c9 0f 84 ad 00 00 00 31 ed eb d0 4d 01 f5
[ 649.548180] ---[ end trace 5c988a789d68247f ]---

Doing several md5sums of the files directly on the cifs filesystem will always result in the same md5, as does dd if=file|md5sum; however, wget http://localhost/file -O -|md5sum will result in a different checksum each time.

The same tests run on the same Stretch machine with Jessie's kernel work OK.

Like I've said, I've been able to replicate this on standard Stretch and Buster configs. These are the steps to replicate...
install:
apt-get install samba apache2 cifs-utils

add to smb.conf to create a ftp share and then: service smbd reload
[ftp]
writable = no
locking = no
path = /srv/ftp
public = yes
browseable = no

generate a file to be served:
dd if=/dev/zero of=/srv/ftp/100Mzero bs=1024k count=100

mount the share on the web directory to serve it:
mount.cifs //localhost/ftp /var/www/html/

test the local access of the cifs:
md5sum /srv/ftp/100Mzero
2f282b84e7e608d5852449ed940bfc51 /srv/ftp/100Mzero

Access the file over apache:
wget http://localhost/100Mzero -O - 2>/dev/null|md5sum
2b0ac997ed705924db55cf5f45ad3c88 -

Like I said, changing to a Jessie kernel this works OK; changing to a Buster 4.16 kernel or testing on a full Buster setup gives a similar problem, but the HTTP transfer is interrupted and the kernel shows the message above. Also serving the