Memory corruption (redzone overwritten) names_cache?

2019-09-12 Thread Jakub Jankowski

Hi,

We're getting some random memory corruption on an AWS EC2 instance with 
4.19.x kernels. I've tried 4.19.19 and 4.19.52, but the results below are 
from the most recent (4.19.72). For debugging I enabled 
KASAN+slub_debug, but TBH, I can't make heads or tails of these reports.


Without slub_debug, the host reboots within a couple of minutes of uptime. 
With slub_debug it survives a bit longer, but eventually all sorts of 
issues manifest (including reboots; ps not being able to read some 
processes' /proc/<pid>/cmdline while /proc/<pid>/stack shows 
acct_collect()->down_read(); etc.).


Upon multiple tests, the slab I most often saw pop up as the first one 
detected as corrupted was names_cache.
What is really weird is that multiple times I saw the redzone being 
overwritten with the same content, which looks like part of 'sessions.py' 
from Python's 'requests' module.
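
For context on what the report below is saying: with slub_debug=Z, SLUB
surrounds each live object with a redzone filled with 0xcc and checks it
when the object is freed, so "Redzone overwritten" means something wrote
past the end of a slab object (or scribbled on freed memory). A
hypothetical module like the sketch below (in no way the actual culprit
here) would produce the same kind of report:

/* Hypothetical illustration only -- not the code behind this report.
 * With slub_debug=Z, each object is followed by a 0xcc redzone; writing
 * past the object's end clobbers it, and SLUB's check at kfree() time
 * prints "Redzone overwritten" (KASAN would flag the write itself). */
#include <linux/module.h>
#include <linux/slab.h>
#include <linux/string.h>

static int __init redzone_demo_init(void)
{
	char *buf = kmalloc(32, GFP_KERNEL);

	if (!buf)
		return -ENOMEM;

	memset(buf, 'n', 40);	/* 8 bytes past the object hit the redzone */
	kfree(buf);		/* slub_debug detects and reports it here */
	return 0;
}

static void __exit redzone_demo_exit(void)
{
}

module_init(redzone_demo_init);
module_exit(redzone_demo_exit);
MODULE_LICENSE("GPL");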


Any debugging hints would be greatly appreciated.


Command line: BOOT_IMAGE=(hd0,msdos2)/vmlinuz ro root=/dev/xvda5 console=tty0 
console=ttyS0,9600n8 crashkernel=512M-2G:64M,2G-:128M kmemleak=on 
slub_debug=FZPU slub_nomerge
(...)
[  262.957418] =============================================================================
[  262.957423] BUG vm_area_struct (Tainted: GB  O ): Redzone overwritten
[  262.957424] -----------------------------------------------------------------------------

[  262.957427] INFO: 0xb91cc681-0x98bd5238. First byte 0x6e instead of 0xcc
[  262.957433] INFO: Allocated in vm_area_dup+0x1e/0x180 age=6117 cpu=0 pid=8187
[  262.957438]  kmem_cache_alloc+0x1a4/0x1d0
[  262.957439]  vm_area_dup+0x1e/0x180
[  262.957441]  copy_process.part.4+0x2fa9/0x6cd0
[  262.957443]  _do_fork+0x151/0x7a0
[  262.957446]  do_syscall_64+0x9b/0x290
[  262.957452]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  262.957455] INFO: Freed in qlist_free_all+0x37/0xd0 age=7431 cpu=0 pid=8521
[  262.957457]  quarantine_reduce+0x1a2/0x210
[  262.957458]  kasan_kmalloc+0x95/0xc0
[  262.957460]  kmem_cache_alloc+0xc6/0x1d0
[  262.957463]  getname_flags+0xba/0x510
[  262.957465]  user_path_at_empty+0x1d/0x40
[  262.957468]  vfs_statx+0xb9/0x140
[  262.957470]  __se_sys_newstat+0x7c/0xd0
[  262.957472]  do_syscall_64+0x9b/0x290
[  262.957474]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  262.957476] INFO: Slab 0xca532806 objects=30 used=24 fp=0x6ce6da86 flags=0x2008101
[  262.957477] INFO: Object 0x5eb7e26b @offset=8 fp=0xac807fa7

[  262.957480] Redzone b91cc681: 6e 73 2e 70 79 5c 22 2c                          ns.py\",
[  262.957482] Object 5eb7e26b: 20 6c 69 6e 65 20 36 34 36 2c 20 69 6e 20 73 65   line 646, in se
[  262.957484] Object 7d5d4673: 6e 64 5c 6e 20 20 20 20 72 20 3d 20 61 64 61 70  nd\n    r = adap
[  262.957485] Object a3cf6db1: 74 65 72 2e 73 65 6e 64 28 72 65 71 75 65 73 74  ter.send(request
[  262.957487] Object d8b14cdd: 2c 20 2a 2a 6b 77 61 72 00 00 00 00 00 00 00 00  , **kwar........
[  262.957489] Object 5eca0928: 40 97 5a 73 83 88 ff ff 25 00 00 00 00 00 00 80  @.Zs....%.......
[  262.957491] Object 592ffbd7: 71 00 00 00 00 00 00 00 e0 c8 22 6d 83 88 ff ff  q........."m....
[  262.957492] Object 84c88ae5: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[  262.957494] Object ea6d1cb3: 83 00 00 00 00 00 00 00 80 c0 fd 5a 83 88 ff ff  ...........Z....
[  262.957495] Object a236617c: 80 c0 fd 5a 83 88 ff ff 00 00 00 00 00 00 00 00  ...Z............
[  262.957497] Object 91c7956c: 00 3a 94 b0 ff ff ff ff 75 00 00 00 00 00 00 00  .:......u.......
[  262.957499] Object 216cef35: c0 85 cc 6a 83 88 ff ff 00 00 00 00 00 00 00 00  ...j............
[  262.957500] Object e0fd506c: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[  262.957502] Redzone f5906e86: cc cc cc cc cc cc cc cc                          ........
[  262.957503] Padding 53d79574: 5a 5a 5a 5a 5a 5a 5a 5a                          ZZZZZZZZ
[  262.957507] CPU: 3 PID: 11769 Comm: ps Kdump: loaded Tainted: GB  O  4.19.72_3upstreamdbg #1
[  262.957508] Hardware name: Xen HVM domU, BIOS 4.2.amazon 08/24/2006
[  262.957509] Call Trace:
[  262.957516]  dump_stack+0x9a/0xf0
[  262.957519]  check_bytes_and_report.cold.24+0x3f/0x6b
[  262.957521]  check_object+0x17c/0x280
[  262.957524]  free_debug_processing+0x105/0x2a0
[  262.957526]  ? qlist_free_all+0x37/0xd0
[  262.957527]  ? qlist_free_all+0x37/0xd0
[  262.957529]  __slab_free+0x218/0x3b0
[  262.957533]  ? __free_pages_ok+0x62f/0x840
[  262.957536]  ? _raw_spin_unlock_irqrestore+0x2b/0x40
[  262.957537]  ? qlist_free_all+0x37/0xd0
[  262.957541]  ? trace_hardirqs_on+0x35/0x140
[  262.957543]  ? qlist_free_all+0x37/0xd0
[  262.957544]  qlist_free_all+0x4c/0xd0
[  262.957546]  quarantine_reduce+0x1a2/0x210
[  262.957549]  ? getname_flags+0xba/0x510
[  262.957550]  kasan_kmalloc+0x95/0xc0
[  262.957553]  ? getname

Re: [PATCH AUTOSEL 4.19 04/42] netfilter: conntrack: always store window size un-scaled

2019-08-14 Thread Jakub Jankowski

On 2019-08-14, Reindl Harald wrote:


that's still not in 5.2.8


It will make its way into the next 5.2.x release, as it is now in the pending 
queue: 
https://git.kernel.org/pub/scm/linux/kernel/git/stable/stable-queue.git/tree/queue-5.2


Regards,
 Jakub.




without the exception and "nf_conntrack_tcp_timeout_max_retrans = 60", a
VNC-over-SSH session with the VNC viewer in the background freezes
within 60 seconds

---
IPV4 TABLE MANGLE (STATEFUL PRE-NAT/FILTER)
---
Chain PREROUTING (policy ACCEPT 100 packets, 9437 bytes)
num   pkts bytes target prot opt in      out  source      destination
1     6526 3892K ACCEPT all  --  *       *    0.0.0.0/0   0.0.0.0/0    ctstate RELATED,ESTABLISHED
2      125  6264 ACCEPT all  --  lo      *    0.0.0.0/0   0.0.0.0/0
3       64  4952 ACCEPT all  --  vmnet8  *    0.0.0.0/0   0.0.0.0/0
4      140       DROP   all  --  *       *    0.0.0.0/0   0.0.0.0/0    ctstate INVALID

 Forwarded Message 
Subject: [PATCH AUTOSEL 5.2 07/76] netfilter: conntrack: always store
window size un-scaled

On 08.08.19 at 11:02, Thomas Jarosch wrote:

Hello everyone,

You wrote on Fri, Aug 02, 2019 at 09:22:24AM -0400:

From: Florian Westphal 

[ Upstream commit 959b69ef57db00cb33e9c4777400ae7183ebddd3 ]

Jakub Jankowski reported the following oddity:

After 3 way handshake completes, timeout of new connection is set to
max_retrans (300s) instead of established (5 days).

shortened excerpt from pcap provided:
25.070622 IP (flags [DF], proto TCP (6), length 52)
10.8.5.4.1025 > 10.8.1.2.80: Flags [S], seq 11, win 64240, [wscale 8]
26.070462 IP (flags [DF], proto TCP (6), length 48)
10.8.1.2.80 > 10.8.5.4.1025: Flags [S.], seq 82, ack 12, win 65535, [wscale 3]
27.070449 IP (flags [DF], proto TCP (6), length 40)
10.8.5.4.1025 > 10.8.1.2.80: Flags [.], ack 83, win 512, length 0

Turns out the last_win is of u16 type, but we store the scaled value:
512 << 8 (== 0x20000) becomes 0 window.
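
As a standalone illustration of that truncation (plain C with made-up
variable names, not the conntrack code itself):

/* Illustration only: a window of 512 with wscale 8 scales to 0x20000,
 * which does not fit into a 16-bit field and is stored as 0, so the
 * tracker later sees a "zero window". */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint32_t win = 512, wscale = 8;	/* values from the pcap excerpt above */
	uint16_t last_win;		/* 16 bits wide, like the field described above */

	last_win = (uint16_t)(win << wscale);
	printf("scaled: 0x%x, stored in u16: 0x%x\n",
	       (unsigned)(win << wscale), (unsigned)last_win);
	/* prints: scaled: 0x20000, stored in u16: 0x0 */
	return 0;
}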

The Fixes tag is not correct, as the bug has existed forever, but
without that change all this can cause is mistaking a
window update (to-nonzero-from-zero) for a retransmit.

Fixes: fbcd253d2448b8 ("netfilter: conntrack: lower timeout to RETRANS seconds if window is 0")
Reported-by: Jakub Jankowski 
Tested-by: Jakub Jankowski 
Signed-off-by: Florian Westphal 
Acked-by: Jozsef Kadlecsik 
Signed-off-by: Pablo Neira Ayuso 
Signed-off-by: Sasha Levin 


Also:
Tested-by: Thomas Jarosch 

;)

We've hit the issue with the wrong conntrack timeout at two different sites;
long-lived connections to a SAP server over IPSec VPN were constantly dropping.

For us this was a regression after updating from kernel 3.14 to 4.19.
Yesterday I applied the patch to kernel 4.19.57 and the problem is fixed.

The issue was extra hard to debug, as we could only boot the new kernel
for twenty minutes in the evening on these production systems.

The stable kernel patch from last Friday came right on time. I was just
about to replay the TCP connection with tcpreplay, so this saved
me from another week of debugging. Thanks everyone!




--
Jakub Jankowski|sha...@toxcorp.com|https://toxcorp.com/


Re: [PATCH] proc: Fix uninitialized byte read in get_mm_cmdline()

2019-07-12 Thread Jakub Jankowski

On 2019-07-12, Alexey Izbyshev wrote:


On 7/12/19 8:46 PM, Alexey Dobriyan wrote:

The proper fix to all /proc/*/cmdline problems is to revert

f5b65348fd77839b50e79bc0a5e536832ea52d8d
proc: fix missing final NUL in get_mm_cmdline() rewrite

5ab8271899658042fabc5ae7e6a99066a210bc0e
fs/proc: simplify and clarify get_mm_cmdline() function


Should this be interpreted as an actual suggestion to revert the patches,
fix the conflicts, test and submit them, or is this more like thinking out
loud? In the former case, will it be OK for long term branches?

get_mm_cmdline() does seem easier for me to read before 5ab8271899658042.
But it also has different semantics in corner cases, for example:

- If there is no NUL at arg_end-1, it reads only the first string in
the combined arg/env block, and doesn't terminate it with NUL.

- If there is any problem with access_remote_vm() or copy_to_user(),
it returns -EFAULT even if some data were copied to userspace.

On the other hand, 5ab8271899658042 was merged not too long ago (about a year),
so it's possible that the current semantics isn't heavily relied upon.


I posted this (corner?) case ~3 months ago; unfortunately, it wasn't picked 
up by anyone: https://lkml.org/lkml/2019/4/5/825

You can treat it as another datapoint in this discussion.


Regards,
 Jakub

--
Jakub Jankowski|sha...@toxcorp.com|https://toxcorp.com/


get_mm_cmdline and userspace (Perl) changing argv0

2019-04-05 Thread Jakub Jankowski
Starting with 4.18, we noticed that changing our own command line in Perl by setting 
$0 leads to (somewhat) surprising results.


As long as we're setting $0 to something longer than the initial argv0, 
everything works (almost) as expected: reading from /proc/$$/cmdline 
yields what we set $0 to, followed by a single NULL byte.


However, when setting $0 to something shorter, doing the same yields what 
we set $0 to, then a single NULL byte, then thousands (depending on the 
size of the environment) of ' ' (0x20) characters, and then a single NULL.


$ for i in $(seq 1 10); do
   perl -e '$0 = "1234567890"x'"$i"'; print `cat /proc/$$/cmdline`;' | wc -c;
  done
3291
3291
3291
3291
3291
3291
71
81
91
101
$

This leads to weird output of commands like "ps axufww", where a 
daemonized Perl script that had changed its $0 shows up as a long, mostly 
empty line on the process list.


We've seen this with both Perl 5.20 and 5.28, and Perl's code handling 
changes to $0 hasn't really changed in that time: 
https://perl5.git.perl.org/perl.git/blob/HEAD:/mg.c#l2673
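
For reference, this is roughly the setproctitle-style approach Perl appears
to use for $0 (a minimal sketch with a made-up function name and parameters,
not the actual mg.c code): the new title overwrites the original argv[0]
area and the rest of the contiguous argv/environ block is padded with
spaces, so a kernel that reports the whole [arg_start, arg_end) range shows
"title\0<spaces>\0", which matches the wc -c output above.

#include <string.h>

/* Hypothetical sketch, not Perl's mg.c: write the new title over the
 * original argv[0] area of size 'avail' (argv strings plus environment),
 * then pad the remainder with spaces and keep a final '\0'.
 * Assumes avail >= 2. */
static void set_title(char *argv0, size_t avail, const char *title)
{
	size_t len = strlen(title);

	if (len > avail - 2)		/* keep room for padding and '\0' */
		len = avail - 2;

	memcpy(argv0, title, len);
	argv0[len] = '\0';
	memset(argv0 + len + 1, ' ', avail - len - 2);	/* space padding */
	argv0[avail - 1] = '\0';
}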


Reverting commit 5ab8271899658042fabc5ae7e6a99066a210bc0e ("fs/proc: 
simplify and clarify get_mm_cmdline() function") in the kernel, however, seems 
to restore the previous behavior (no NULL byte at the end):


$ for i in $(seq 1 10); do
   perl -e '$0 = "1234567890"x'"$i"'; print `cat /proc/$$/cmdline`;' | wc -c;
  done
10
20
30
40
50
60
70
80
90
100
$

Has anybody else seen / been bothered by this?



Regards,
 Jakub.

--
Jakub Jankowski|sha...@toxcorp.com|https://toxcorp.com/
GPG: FCBF F03D 9ADB B768 8B92 BB52 0341 9037 A875 942D